Author Archives: sittig

This past Wednesday, Stephanie, Rebecca, and I were able to present our work during the Simmons Undergraduate Research poster session! It was a lot of fun. Simmons has a strong focus on health and life sciences, so our work was well-received, and additionally, many professors and students who stopped by seemed to learn a lot about what computer science can entail, and how it can be beneficial across disciplines and professions.

There were two other CS-based research posters that I saw, and additionally I was able to see the other Simmons CREU team present their work during one of the panels! It was great to see the culmination of the other research that's been taking place this year, and the project seemed really interesting.

Our poster was accepted for the student poster session at the upcoming Tapia Celebration of Diversity in Computing, so we're thrilled to be able to go (even if it might only be Rebecca!).

Overall, this has been an enjoyable, albeit sometimes challenging, year of research. I personally am grateful to have the opportunity to investigate these clinical narratives more closely, and I am actually looking into being a participant in the i2b2 challenge itself next time it rolls around! I would like to thank Professor Amber Stubbs as well as CREU and SLIS for supporting this project and being generally fantastic.


The Simmons Undergraduate Research Conference is approaching, so we've been finalizing some of the results from the discrepancy searches; I've been working with Stephanie and Rebecca to make a poster so that we can present the work this coming Wednesday. The Undergraduate Research Conference gives students at Simmons the chance to present their work during a poster session, and several dozen students present every year.

However, there are generally very few posters about computer science projects, so we hope to contribute somewhat to the visibility of the CS program here at Simmons. In the past, professors and students alike have been interested in CS research, and presenting in these contexts gives the Simmons community a better understanding of what computer science can contribute, both within the field and across disciplines.

So, for next week, we'll finish up the visual materials (poster, data visualization, etc.), and make sure we're set to explain our work by Wednesday!

I've worked on temporal ordering (particularly with smoking), but since looking at a visual timeline is a pretty reliable way of spotting discrepancies, I've spent the last couple of weeks looking into how best to approach solid data visualization. I'm going to start by brushing up on my JavaScript skills so that I can create interactive browser-based visualizations. They won't be live until I'm sure that publishing the timelines (even with no names or other identifying information) doesn't raise any privacy issues or violate HIPAA.

Along the way, I'm working on code that will automatically detect discrepancies in the narratives -- that part is a little slower going, but that's why it's a long project. Onward!

Over the last couple of weeks, I extended a little bit of what I was working on (ordering temporal annotations with regard to other treatments). Although initially I used (time, description) tuples to track changes in medication/smoking, I recently switched to a custom object approach (i.e., a medication object, a smoking history object, etc.) and now sort the objects by time, which a simple function can extract from each one.
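The switch from tuples to objects sorted by time could look roughly like the sketch below. It is only an illustration: the class names, the `when` field, and the sample events are hypothetical and are not taken from the project's code or data.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical event classes; the project's real objects may look different.
@dataclass
class MedicationEvent:
    when: date
    name: str


@dataclass
class SmokingEvent:
    when: date
    status: str


def event_time(event):
    """Extract the time used for ordering, whatever the event type is."""
    return event.when


# Made-up sample events, deliberately out of order.
events = [
    SmokingEvent(date(2010, 5, 1), "current smoker"),
    MedicationEvent(date(2009, 3, 15), "metformin"),
    SmokingEvent(date(2012, 1, 10), "past smoker"),
]

# Mixed event types sort together because the key only looks at the time.
timeline = sorted(events, key=event_time)

for event in timeline:
    print(event)
```

Compared with bare tuples, each object can carry its own type-specific fields (a medication name, a smoking status) while still sorting on a shared notion of time.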

This upcoming week is Simmons' spring break, so I'm hoping to spend some extra time working on not only finding temporal relationships between medications/smoking, etc, but also on visualizations of these in a more concrete timeline fashion, so we don't have to make timelines manually.

Although the code I wrote over the last few weeks was primarily in script format, I put some time in over the weekend to create Python classes in order to analyze medication history more effectively. Building off of my previous Medication class, I introduced a time attribute, so that I can keep track of when the medication was being taken (before/during/after DCT).
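A minimal sketch of a Medication class with a time attribute is shown below. The constructor signature, the `DctTime` enum, and the helper `taken_at` are assumptions made for illustration; only the before/during/after DCT distinction comes from the post above.

```python
from enum import Enum


class DctTime(Enum):
    """When a medication was taken, relative to DCT."""
    BEFORE = "before DCT"
    DURING = "during DCT"
    AFTER = "after DCT"


class Medication:
    # Hypothetical signature; the actual class may store more fields.
    def __init__(self, name, time):
        self.name = name
        # Accept either a DctTime member or its string value;
        # an unknown string raises ValueError.
        self.time = DctTime(time)

    def __repr__(self):
        return f"Medication({self.name!r}, {self.time.value!r})"


def taken_at(medications, time):
    """Return the names of medications recorded for the given time."""
    return [m.name for m in medications if m.time is time]


# Made-up sample data.
history = [
    Medication("metformin", "before DCT"),
    Medication("insulin", "during DCT"),
    Medication("aspirin", DctTime.BEFORE),
]

print(taken_at(history, DctTime.BEFORE))
```

Using an enum rather than free-form strings means a misspelled time value fails loudly at construction instead of silently dropping out of later comparisons.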

Last week, Simmons closed the school due to snow on Monday, so we didn't meet; instead, I worked on extracting medication data from tags. I've been using linked lists to keep track of which medications are mentioned when.

This week, I ran into a couple of issues with my code, but I've been working to resolve them, and have also been reading more about methods for addressing temporal issues in code. More updates to follow!

Over winter break, I spent a few days reading through the articles published last year detailing the outcomes of the various tracks of the 2014 i2b2 challenge, which is where the data we're currently investigating came from in the first place. The articles I read primarily concerned risk factor identification and heart disease prediction systems, since those are closely relevant to our team's work.

In particular, a few of these papers explicitly mention the impact of missing (i.e., unstated, undocumented) risk factors on developed systems which rely on tagged risk factor metadata; developing methods not only to identify missing risk factors but also to create workarounds seems to be an area of clear research opportunity/necessity.

Our first week back, we wrote the rest of our proposal for the 2016 Tapia Celebration student poster session and submitted it on Friday (thanks very much to Professor Stubbs for her help!). In the meantime, we will move on to more computational pursuits.

The code I was working on is going well, but not complete yet; testing my own code has shown a few inconsistencies, which I've addressed as necessary.

This past week, on Dec. 8, we presented an overview of our project to a small research group here at Simmons, and a lot of the questions that were asked were helpful, particularly with respect to areas of inquiry we haven't considered explicitly. (Some were out of the scope of the project, but still good to keep in mind.)

I've been working on finding discrepancies in diabetes mentions in the clinical narratives, and additionally Rebecca has been working to address discrepancies in patients' smoking history. Before the end of the semester, our goal is to report some preliminary findings with regard to discrepancies in clinical narratives.

In the past couple of weeks, I've attempted to combine all of the Python classes I've written in order to extract data from XML files, store them in Patient objects, and assess discrepancies in the clinical narratives.
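As a rough illustration of that pipeline, the sketch below parses one annotated record with Python's standard xml.etree.ElementTree module and stores it in a Patient object. The XML layout, the tag and attribute names, and the Patient interface are all assumptions made for the example; the real corpus files and classes may be organized differently.

```python
import xml.etree.ElementTree as ET

# Made-up record in an assumed layout: narrative text plus a list of tags.
SAMPLE = """<root>
  <TEXT>Patient reports taking metformin.</TEXT>
  <TAGS>
    <MEDICATION text="metformin" time="during DCT"/>
    <DIABETES text="diabetes" time="before DCT"/>
  </TAGS>
</root>"""


class Patient:
    """Hypothetical container for all of one patient's records."""

    def __init__(self, patient_id):
        self.patient_id = patient_id
        self.records = []  # one dict per clinical narrative

    def add_record(self, xml_string):
        root = ET.fromstring(xml_string)
        tags_node = root.find("TAGS")
        tags = []
        if tags_node is not None:
            # Keep each annotation as (tag name, attribute dict).
            tags = [(child.tag, dict(child.attrib)) for child in tags_node]
        self.records.append({
            "text": root.findtext("TEXT", default=""),
            "tags": tags,
        })


patient = Patient("example-patient")
patient.add_record(SAMPLE)
print(patient.records[0]["tags"])
```

Keeping the raw tags on the Patient object leaves the discrepancy checks free to run later without re-reading the XML.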

My part of discrepancy-seeking is attempting to identify places in the records where mentions of diabetes are inconsistent or lacking; even though all of the records in the corpus were selected based on patients' diabetes status, many of the clinical narratives do not mention whether or not the patient has diabetes.

The first step will be to gather statistics regarding whether or not a given patient has any explicit reference to diabetes, including via medication mentions, physical state, etc.
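One way to gather these statistics is sketched below. It assumes each patient's records have already been reduced to lists of (tag name, attributes) pairs; the tag names, the `text` attribute, and the short medication list are illustrative assumptions, not the project's actual criteria.

```python
# Illustrative only: a real list of diabetes medications would be much longer.
DIABETES_MEDICATIONS = {"metformin", "insulin"}


def has_diabetes_reference(tags):
    """True if any tag is an explicit diabetes mention or a diabetes medication."""
    for tag_name, attrs in tags:
        if tag_name == "DIABETES":
            return True
        if (tag_name == "MEDICATION"
                and attrs.get("text", "").lower() in DIABETES_MEDICATIONS):
            return True
    return False


def summarize(patients):
    """patients maps a patient id to a list of records (each a list of tags)."""
    without = [
        pid for pid, records in patients.items()
        if not any(has_diabetes_reference(tags) for tags in records)
    ]
    return {
        "total": len(patients),
        "with_reference": len(patients) - len(without),
        "without_reference": sorted(without),
    }


# Made-up sample data.
patients = {
    "p1": [[("DIABETES", {"text": "diabetes"})], []],
    "p2": [[("MEDICATION", {"text": "Metformin"})]],
    "p3": [[("SMOKER", {"status": "current"})]],
}

print(summarize(patients))
```

Patients listed under `without_reference` would be the first candidates for a closer manual look.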