This week I got to use R to analyze the CSV file. I ran some descriptive analyses on it and wrote code to produce a few graphs. I ran into an issue with how the file is organized, however. I am collaborating with a professor from the statistics department to write code that analyzes the data by patient rather than by patient visit. In the meantime, I was able to compare our corpus data to national averages for age and gender.
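To make the by-patient idea concrete, here is a minimal sketch in Python (not the R code we are actually writing). It assumes one row per visit and keeps only each patient's earliest visit. The column names `patient_id`, `visit_date`, `age`, and `gender` and the sample rows are made up for illustration; the real file may be laid out differently.

```python
import csv
import io

# Made-up sample rows; the column names are assumptions, not the real file layout.
SAMPLE_VISITS = """patient_id,visit_date,age,gender
p1,2010-01-05,61,F
p1,2010-03-12,61,F
p2,2010-02-20,47,M
p1,2011-06-01,62,F
p2,2010-09-09,47,M
"""


def one_row_per_patient(csv_file):
    """Collapse visit-level rows to one row per patient (the earliest visit)."""
    first_visit = {}
    for row in csv.DictReader(csv_file):
        pid = row["patient_id"]
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if pid not in first_visit or row["visit_date"] < first_visit[pid]["visit_date"]:
            first_visit[pid] = row
    return list(first_visit.values())


patients = one_row_per_patient(io.StringIO(SAMPLE_VISITS))
print(len(patients), "patients from the sample visits")
```

Keeping the earliest visit is only one possible choice; averaging across visits or keeping the latest one would be just as easy to swap in.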
We now have access to a server, and since I’ve never used one before, I spent time teaching myself how to navigate it using some online tutorials. We also ran some initial analysis on the CSV file that contains the patients' ages and genders. We noticed some discrepancies, which we will be analyzing further.
We’re still having a few issues getting our server up and running, but it should be happening soon. Meanwhile, we spent some time reading papers related to our research, and I brushed up on my R skills.
This week we split up the work among the three of us. I focused on writing code that reads in a CSV file of patient information. Later I will use this file and file reader to run statistical analysis on the expected versus reported ages of the patients.
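A reader along these lines might look like the sketch below. It is not our actual code: the column names `patient_id`, `expected_age`, and `reported_age` are assumptions, and the two sample rows are invented so the example runs without the real file.

```python
import csv
import io


def read_patient_csv(csv_file):
    """Read patient rows and convert the age columns to integers."""
    records = []
    for row in csv.DictReader(csv_file):
        records.append({
            "patient_id": row["patient_id"],
            "expected_age": int(row["expected_age"]),
            "reported_age": int(row["reported_age"]),
        })
    return records


# Invented sample data standing in for the real CSV file.
sample = io.StringIO(
    "patient_id,expected_age,reported_age\n"
    "p1,61,61\n"
    "p2,47,45\n"
)
records = read_patient_csv(sample)

# Difference between reported and expected age for each patient.
age_gaps = [r["reported_age"] - r["expected_age"] for r in records]
print(age_gaps)
```

With a real file, `open("some_file.csv", newline="")` would take the place of the `io.StringIO` object.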
This week Rebecca and I worked together on creating person and patient objects in Python. It had been a while since I had done this in Python, so I took some time to refamiliarize myself with it. Our code is pretty basic: it takes in user input (name, smoking status, diabetes status, etc.) and assigns it to a person/patient object. Next week we will rework the code so that it takes in this information from the tags in our patient files.
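For anyone curious, here is a rough sketch of the kind of objects described above. It is a simplified illustration rather than our exact code; the class layout, the field names, and the helper functions `parse_yes_no` and `patient_from_answers` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str


@dataclass
class Patient(Person):
    # Field names are illustrative; the real objects may track more attributes.
    smoker: bool = False
    diabetic: bool = False


def parse_yes_no(answer):
    """Treat 'y' or 'yes' (any case, surrounding spaces ignored) as True."""
    return answer.strip().lower() in ("y", "yes")


def patient_from_answers(name, smoker_answer, diabetic_answer):
    """Build a Patient from raw text answers, such as those typed by a user."""
    return Patient(
        name=name.strip(),
        smoker=parse_yes_no(smoker_answer),
        diabetic=parse_yes_no(diabetic_answer),
    )


example = patient_from_answers("Jane Doe", "yes", "n")
print(example)
```

In an interactive script, the three arguments would come from calls to `input()`. Keeping the parsing in a separate function should make it easier to switch the source from user input to the tags in the patient files later.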
This week we all agreed to read a paper entitled "Creation of a new longitudinal corpus of clinical narratives". We also began looking at a CSV file of predicted and actual values for patient age and gender, and will soon be doing statistical analysis of this data (how often the predicted values differ from the actual ones, etc.). Lastly, we are working on creating a "Person" object in Python. This code will help us extract the specific data that we are interested in analyzing.
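As a rough idea of what "how often the predicted differed from the actual" could mean in code, here is a minimal sketch. The function name `mismatch_rate` and the sample pairs are invented for illustration and do not come from our data.

```python
def mismatch_rate(pairs):
    """Return the fraction of (predicted, actual) pairs that disagree."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no pairs to compare")
    mismatches = sum(1 for predicted, actual in pairs if predicted != actual)
    return mismatches / len(pairs)


# Invented (predicted, actual) pairs, not real patient data.
gender_pairs = [("F", "F"), ("M", "F"), ("M", "M"), ("F", "F")]
age_pairs = [(61, 61), (47, 45), (30, 30), (52, 50)]

gender_rate = mismatch_rate(gender_pairs)
age_rate = mismatch_rate(age_pairs)
print(gender_rate, age_rate)
```

An exact-match comparison is probably too strict for age, where an error of a year or two may be acceptable, so a tolerance could be added later.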
We met again this week to discuss our next steps with the data. We’ve spent time familiarizing ourselves with the data and discussing our plan of attack for the rest of the project. We’ve all been working on completing our CITI certification, and we have begun discussing which programming language would be best for this project. We need to write code to import the narratives and begin our analysis.
We've spent some time making sure that our CITI certifications are all in order so that we are able to handle the patient files. We've also spent time brushing up on our Python skills, since we've decided this is the programming language we are going to use for this project. Lastly, we wrote some code to import the patient files into Python.
Stay tuned for more updates!