Final weeks! I finished up the results regarding the number of discrepancies in our corpus and created some visualizations of the data using R. These graphs and charts were used on the poster that we created for the Simmons Undergraduate Research Symposium. The poster was well received and I am grateful that I had the opportunity to present our project. It was really exciting to show all that we had learned this year.
This week I met again with BJ, a professor at Simmons, to discuss my code for the remaining tags. I had encountered a problem where the code used to find the total number of discrepancies (vs. the total number of patients with discrepancies) was producing a value that did not make sense. With his help I was able to debug the program and it is now working!
This week I will continue to work on finding discrepancies in the other tags in our corpus: hyperglycemia, hyperlipidemia, CAD, family history, and obesity. We also began to work on a final paper outlining the research we completed this year.
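The difference between the two counts can be sketched in a few lines of Python. This is only an illustration, not the code from the project: the `(patient_id, record_id)` pairs, the identifiers, and the `summarize` helper are all hypothetical, and it assumes each discrepancy found is stored as one pair.

```python
from collections import Counter

# Hypothetical data: one (patient_id, record_id) pair per discrepancy found.
discrepancies = [
    ("patient_01", "rec_a"),
    ("patient_01", "rec_b"),
    ("patient_02", "rec_c"),
    ("patient_01", "rec_d"),
]

def summarize(discrepancies):
    """Return (total discrepancies, patients with at least one, per-patient counts)."""
    per_patient = Counter(patient_id for patient_id, _ in discrepancies)
    total = sum(per_patient.values())   # every discrepancy counts once
    patients = len(per_patient)         # each patient counts once
    return total, patients, per_patient

total, patients, per_patient = summarize(discrepancies)
print(f"{total} discrepancies across {patients} patients")
```

Keeping the per-patient counts around makes it easy to sanity-check the total: it can never be smaller than the number of patients with discrepancies.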
This past Wednesday, Stephanie, Rebecca, and I were able to present our work during the Simmons Undergraduate Research poster session! It was a lot of fun. Simmons has a strong focus on health and life sciences, so our work was well received. Many professors and students who stopped by also seemed to learn a lot about what computer science can entail and how it can be beneficial across disciplines and professions.
There were two other CS-based research posters that I saw, and additionally I was able to see the other Simmons CREU team present their work during one of the panels! It was great to see the culmination of the other research that's been taking place this year, and the project seemed really interesting.
Our poster was accepted for the student poster session at the upcoming Tapia Celebration of Diversity in Computing, so we're thrilled to be able to go (even if it might only be Rebecca!).
Overall, this has been an enjoyable, albeit sometimes challenging, year of research. I personally am grateful to have the opportunity to investigate these clinical narratives more closely, and I am actually looking into being a participant in the i2b2 challenge itself next time it rolls around! I would like to thank Professor Amber Stubbs as well as CREU and SLIS for supporting this project and being generally fantastic.
The Simmons Undergraduate Research Conference is approaching, so we've been finalizing some of the results from the discrepancy searches; I've been working with Stephanie and Rebecca to make a poster so that we can present the work this coming Wednesday. The Undergraduate Research Conference gives students at Simmons the chance to present their work during a poster session, and several dozen students present every year.
However, there are generally very few posters about computer science projects, so we hope to contribute somewhat to the visibility of the CS program here at Simmons. In the past, professors and students alike have been interested in CS research, and presenting in these contexts gives the Simmons community a better understanding of what computer science can contribute, both within the field and across disciplines.
So, for next week, we'll finish up the visual materials (poster, data visualization, etc.), and make sure we're set to explain our work by Wednesday!
Over the past few weeks, I've been working on a number of things. I had to alter some of the CSVs I created, because my original files were counting symptoms as mentions of a condition. Analyzing symptoms would be interesting as well, but at this stage we are only looking at direct mentions of medical conditions. To simplify things I created a program that will populate any CSV based on the tag you input.
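A program like this could look roughly like the sketch below. It is not the actual script: the annotation dictionaries, the `indicator` field with its `"mention"` and `"symptom"` values, and the `write_tag_csv` function are assumptions made for illustration. The idea is simply to keep direct mentions of one tag and skip symptoms.

```python
import csv
import io

# Hypothetical annotations; the field names and values are assumptions.
annotations = [
    {"record_id": "rec_a", "tag": "OBESE", "indicator": "mention"},
    {"record_id": "rec_a", "tag": "CAD", "indicator": "symptom"},
    {"record_id": "rec_b", "tag": "CAD", "indicator": "mention"},
    {"record_id": "rec_c", "tag": "CAD", "indicator": "mention"},
]

def write_tag_csv(annotations, tag, out):
    """Write direct mentions of `tag` to the file-like object `out` as CSV.

    Symptom annotations are skipped. Returns the number of rows written.
    """
    rows = [a for a in annotations
            if a["tag"] == tag and a["indicator"] == "mention"]
    writer = csv.writer(out)
    writer.writerow(["record_id", "tag"])  # header row
    for a in rows:
        writer.writerow([a["record_id"], a["tag"]])
    return len(rows)

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
count = write_tag_csv(annotations, "CAD", buffer)
print(buffer.getvalue())
```

To write an actual file, the buffer could be swapped for `open("cad.csv", "w", newline="")`, where the file name is again just an example.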
We recently submitted a proposal for the undergraduate research symposium here at Simmons. Since we're nearing the end of the semester, we've also begun writing a final paper on our project.
I've been working on primarily the same project throughout this time. Trying to find the total number of discrepancies in smoking status was pretty tricky, but I was able to find this number after some research and a lot of perseverance (yay!). I also took some time to learn how to use knitr, a package in R that allows me to create HTML documents so I can easily share my code and results and reduce the risk of error. As a group we also submitted a proposal to the undergraduate research conference at our university.
This past week, I've been dealing with a strange Python issue that arose during a different project -- I still haven't resolved the problem (which is related to my path! consternation), but I've been working on some markups for the timeline layout I'm trying to develop. Stay tuned!
Along the way, I'm working on code that will automatically detect discrepancies in the narratives -- that part is a little slower going, but that's why it's a long project. Onward!
This past week, I created more CSVs for mentions of CAD, hypertension, hyperlipidemia, and obesity. These CSVs are based on whether or not each condition is mentioned at all, so there are only two possible options (mentioned or not mentioned). But in many cases, we have more information than that about the conditions or related events. Over this next week, which is spring break, I'll be looking into how to organize and analyze this more complex information.
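As a rough picture of the mentioned / not mentioned layout, here is a small Python sketch. The record IDs, the `mentions` dictionary, and the `mention_rows` helper are hypothetical, and the real CSVs may be organized differently (for example, one file per condition).

```python
CONDITIONS = ["CAD", "hypertension", "hyperlipidemia", "obesity"]

# Hypothetical input: the set of conditions mentioned in each record.
mentions = {
    "rec_a": {"CAD", "obesity"},
    "rec_b": set(),
    "rec_c": {"hypertension"},
}

def mention_rows(mentions, conditions):
    """Build one row per record with a 1/0 flag for each condition."""
    rows = []
    for record_id in sorted(mentions):
        row = {"record_id": record_id}
        for condition in conditions:
            row[condition] = 1 if condition in mentions[record_id] else 0
        rows.append(row)
    return rows

rows = mention_rows(mentions, CONDITIONS)
for row in rows:
    print(row)
```

A 1/0 flag is all this layout can hold, which is why the richer information about each condition needs a different structure.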