CREU

This week we all agreed to read a corpus entitled "Creation of a new longitudinal corpus of clinical narratives". We also began looking at a CSV file of predicted and actual values for patient age and gender, and will soon be doing statistical analysis of this data (how often the predicted values differed from the actual ones, etc.). Lastly, we are working on creating a "Person" object in Python; this code will help us extract the specific data that we are interested in analyzing.
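Here is a rough sketch of how a "Person" object and the predicted-versus-actual comparison might fit together. The column names (`patient_id`, `actual_age`, and so on), the helper functions, and the sample rows are all made up for illustration; our real CSV file may be laid out differently.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Person:
    """One patient row; every field name here is hypothetical."""
    patient_id: str
    actual_age: int
    predicted_age: int
    actual_gender: str
    predicted_gender: str


def load_people(handle):
    """Build Person objects from a CSV file-like object with a header row."""
    people = []
    for row in csv.DictReader(handle):
        people.append(Person(
            patient_id=row["patient_id"],
            actual_age=int(row["actual_age"]),
            predicted_age=int(row["predicted_age"]),
            actual_gender=row["actual_gender"].strip().upper(),
            predicted_gender=row["predicted_gender"].strip().upper(),
        ))
    return people


def mismatch_rate(people, field):
    """Fraction of people whose predicted value differs from the actual one."""
    if not people:
        return 0.0
    misses = sum(
        1 for p in people
        if getattr(p, "predicted_" + field) != getattr(p, "actual_" + field)
    )
    return misses / len(people)


# Made-up sample data, for illustration only (not real patient data).
SAMPLE = """patient_id,actual_age,predicted_age,actual_gender,predicted_gender
p1,54,54,F,F
p2,61,59,M,M
p3,47,47,F,M
p4,70,70,M,M
"""

people = load_people(io.StringIO(SAMPLE))
age_rate = mismatch_rate(people, "age")
gender_rate = mismatch_rate(people, "gender")
print("age mismatch rate:", age_rate)
print("gender mismatch rate:", gender_rate)
```

To run this against a real file, `load_people` could be given an open file handle instead of the in-memory sample.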

We met again this week to discuss our next steps with the data. We’ve spent time familiarizing ourselves with the data and discussing our plan of attack for the rest of the project. We’ve all been working on completing our CITI certification, and we’ve begun discussing which programming language would be best to use. We need to write code to import the narratives and begin our analysis.

After spending some time writing a program to read in our patient files, we've begun working with regular expressions in Python. First, we worked through many exercises to reacquaint ourselves with regular expressions in general. Then, we worked on implementing these regular expressions in Python. As a simple test, I wrote a program to pull the main text out of an XML file. The next step will be to retrieve more specific data and store it in a useful format.
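For the curious, pulling the main text out of a file with a regular expression might look something like the sketch below. It assumes the narrative sits inside a `<TEXT>` element, which is a guess made for illustration; the tag name, the function name, and the sample record are all hypothetical.

```python
import re

# Assumption: the narrative is wrapped in <TEXT>...</TEXT>.
# re.DOTALL lets "." match newlines, since the text spans several lines;
# the lazy ".*?" stops at the first closing tag.
TEXT_PATTERN = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)


def extract_main_text(xml_string):
    """Return the stripped contents of the first <TEXT> element, or None."""
    match = TEXT_PATTERN.search(xml_string)
    if match is None:
        return None
    return match.group(1).strip()


# Made-up record, for illustration only.
SAMPLE_XML = """<?xml version="1.0"?>
<record id="example">
  <TEXT>
Patient seen for follow-up.
No new complaints.
  </TEXT>
</record>
"""

main_text = extract_main_text(SAMPLE_XML)
print(main_text)
```

A regular expression is fine for a quick test like this one. For nested or irregular markup, Python's built-in `xml.etree.ElementTree` parser would likely be the sturdier choice.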

Greetings!

We've spent some time making sure that our CITI certifications are all in order so that we are able to handle the patient files. We've also spent time brushing up on our Python skills, since we've decided this is the programming language we are going to use for this project. Lastly, we wrote some code to import the patient files into Python.
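A minimal sketch of what importing a folder of files can look like is below. The function name, the `*.xml` pattern, and the demo files are assumptions for illustration, not our actual code or data.

```python
import tempfile
from pathlib import Path


def read_patient_files(directory, pattern="*.xml"):
    """Return a dict mapping each matching file name to its full text."""
    records = {}
    for path in sorted(Path(directory).glob(pattern)):
        records[path.name] = path.read_text(encoding="utf-8")
    return records


# Demo with throwaway files in a temporary folder (no real patient data).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "record_a.xml").write_text("<TEXT>first</TEXT>", encoding="utf-8")
    Path(tmp, "record_b.xml").write_text("<TEXT>second</TEXT>", encoding="utf-8")
    Path(tmp, "notes.txt").write_text("not a record", encoding="utf-8")
    demo_records = read_patient_files(tmp)

print(sorted(demo_records))
```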

Stay tuned for more updates!

Now that we've all gotten our CITI certifications sorted, we'll be programming soon! We've decided to use Python for this project. I'm excited for this, because I haven't used Python since my first semester here at Simmons. We've also looked at several records and discussed how we will be working with and analyzing the data.  For now, we are working with a small set of clinical narratives. Our first step will be to write a simple program to read in our files. Hopefully we will have a private server set up soon where we will be able to work with the full range of records.

The CREU Clinical Narratives project is officially underway! We're going to be spending our year studying patterns in clinical narratives using a natural language processing (NLP) approach.

This week, we met for the first time since last semester and began discussing data dissemination among the members of our student research team. Stephanie, Rebecca, and Katie signed the data use agreement, and we began the requisite CITI training for dealing with medical record data. We also discussed what our first steps will be once we've completed the CITI training, so we'll get started with that very soon.

Next week, we'll meet again and determine the next few steps of our project. We're all very excited -- stay tuned for more updates from the CREU crew!