Clinical Narratives

2015-2016 CREU examining clinical narratives

Final weeks! I finished up the results on the number of discrepancies in our corpus and created some visualizations of the data using R. These graphs and charts went on the poster that we created for the Simmons Undergraduate Research Symposium. The poster was well received, and I am grateful that I had the opportunity to present our project. It was really exciting to show all that we had learned this year.

This week I met again with BJ, a professor at Simmons, to discuss my code for the remaining Tags. I had encountered a problem where the code used to find the total number of discrepancies (vs the total number of patients with discrepancies) was producing a value that did not make sense. With his help I was able to debug the program and it is now working!
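To show why the two numbers can differ, here is a minimal Python sketch. It is not the actual project code, and it assumes a definition the post does not spell out: a "discrepancy" is counted whenever a patient's recorded value (for example, smoking status) differs between two consecutive records. The function name and the input format are hypothetical.

```python
from collections import defaultdict


def count_discrepancies(records):
    """Return (total discrepancies, number of patients with at least one).

    `records` is an iterable of (patient_id, value) pairs, assumed to be in
    chronological order for each patient. A discrepancy is ASSUMED to be a
    change in value between two consecutive records of the same patient.
    """
    by_patient = defaultdict(list)
    for patient_id, value in records:
        by_patient[patient_id].append(value)

    # Count value changes between neighbouring records, per patient.
    per_patient = {
        pid: sum(1 for a, b in zip(vals, vals[1:]) if a != b)
        for pid, vals in by_patient.items()
    }

    total = sum(per_patient.values())
    patients_with = sum(1 for n in per_patient.values() if n > 0)
    return total, patients_with


# Hypothetical example data: three patients, one of whom flips twice.
example = [
    ("p1", "current"), ("p1", "never"), ("p1", "current"),
    ("p2", "past"), ("p2", "past"),
    ("p3", "never"), ("p3", "current"),
]
total, patients_with = count_discrepancies(example)
```

Under this assumed definition, a single patient can contribute several discrepancies, so the total can exceed the number of affected patients but should never be smaller than it. That makes a handy sanity check when a result "does not make sense".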


This past Wednesday, Stephanie, Rebecca, and I were able to present our work during the Simmons Undergraduate Research poster session! It was a lot of fun. Simmons has a strong focus on health and life sciences, so our work was well received. Many professors and students who stopped by also seemed to learn a lot about what computer science can entail and how it can be beneficial across disciplines and professions.

I saw two other CS-based research posters, and I was also able to see the other Simmons CREU team present their work during one of the panels! It was great to see the culmination of the other research that's been taking place this year, and the project seemed really interesting.

Our poster was accepted for the student poster session at the upcoming Tapia Celebration of Diversity in Computing, so we're thrilled to be able to go (even if it might only be Rebecca!).

Overall, this has been an enjoyable, albeit sometimes challenging, year of research. I personally am grateful to have the opportunity to investigate these clinical narratives more closely, and I am actually looking into being a participant in the i2b2 challenge itself next time it rolls around! I would like to thank Professor Amber Stubbs as well as CREU and SLIS for supporting this project and being generally fantastic.


The Simmons Undergraduate Research Conference is approaching, so we've been finalizing some of the results from the discrepancy searches; I've been working with Stephanie and Rebecca to make a poster so that we can present the work this coming Wednesday. The Undergraduate Research Conference gives students at Simmons the chance to present their work during a poster session, and several dozen students present every year.

However, there are generally very few posters about computer science projects, so we hope to contribute somewhat to the visibility of the CS program here at Simmons. In the past, professors and students alike have been interested in CS research, and presenting in these contexts gives the Simmons community a better understanding of what computer science can contribute, both within the field and across disciplines.

So, for next week, we'll finish up the visual materials (poster, data visualization, etc.), and make sure we're set to explain our work by Wednesday!

Over the past few weeks, I've been working on a number of things. I had to alter some of the CSVs I created, because my original files were counting symptoms as mentions of a condition. Analyzing symptoms would be interesting as well, but at this stage we are only looking at direct mentions of medical conditions. To simplify things I created a program that will populate any CSV based on the tag you input.
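A rough Python sketch of that idea follows. It is an illustration rather than the program itself, and the data layout is an assumption: each annotation is a dictionary with hypothetical keys `patient_id`, `tag`, and `indicator`, where `indicator` is `"mention"` for a direct mention of the condition and something else (such as `"symptom"`) otherwise.

```python
import csv
import io


def write_tag_csv(annotations, tag, out):
    """Write one row per patient saying whether `tag` is directly mentioned.

    `annotations` is a list of dicts with the ASSUMED keys 'patient_id',
    'tag' and 'indicator'. Only annotations whose indicator is 'mention'
    count; symptoms and other indicators are ignored.
    """
    patients = sorted({a["patient_id"] for a in annotations})
    mentioned = {
        a["patient_id"]
        for a in annotations
        if a["tag"] == tag and a["indicator"] == "mention"
    }

    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["patient_id", tag])
    for pid in patients:
        status = "mentioned" if pid in mentioned else "not mentioned"
        writer.writerow([pid, status])


# Hypothetical annotations: p2 only has a symptom, so it is not counted.
annotations = [
    {"patient_id": "p1", "tag": "CAD", "indicator": "mention"},
    {"patient_id": "p2", "tag": "CAD", "indicator": "symptom"},
    {"patient_id": "p3", "tag": "HYPERTENSION", "indicator": "mention"},
]

buffer = io.StringIO()
write_tag_csv(annotations, "CAD", buffer)
cad_csv = buffer.getvalue()
```

Passing the tag as a parameter means the same function can produce the file for any condition, and the symptom filter lives in one place instead of being repeated per file.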

I also designed a wireframe of the timeline visualization that Katie has been working on. I'm not very familiar with JavaScript, but I've been looking at some web tutorials so that I can hopefully contribute to that aspect of our project as well.

We recently submitted a proposal for the undergraduate research symposium here at Simmons. Since we're nearing the end of the semester, we've also begun writing a final paper on our project.

I've been working on primarily the same project throughout this time. Trying to find the total number of discrepancies in smoking status was pretty tricky, but I was able to find this number after some research and much perseverance (yay!). I also took some time to learn how to use knitr, a package in R that allows me to create HTML documents so I can easily share my code and results and reduce the risk of error. As a group we also submitted a proposal to the undergraduate research conference at our university.

I've worked on temporal ordering (particularly with smoking), but since visual timelines are a pretty reliable way of spotting discrepancies by eye, I've spent the last couple of weeks looking into how best to approach solid data visualization. I'm going to start by brushing up on my JavaScript skills so that I can create interactive browser-based visualizations. They won't be live until I'm sure that generating live timelines (with no names or other identifying information, but even so) doesn't raise any privacy issues (or violate HIPAA, of course).

Along the way, I'm working on code that will automatically detect discrepancies in the narratives -- that part is a little slower going, but that's why it's a long project. Onward!

This past week, I created more CSVs for mentions of CAD, hypertension, hyperlipidemia, and obesity. These CSVs are based on whether or not each condition is mentioned at all, so there are only two possible options (mentioned or not mentioned). But in many cases, we have more information than that about the conditions or related events. Over this next week, which is spring break, I'll be looking into how to organize and analyze this more complex information.