During week 15, I worked on extracting diabetes mentions from each medical record and writing them to a CSV file. I formatted the file to suit Stephanie, who will analyze it in R. It took me a while to work out all the bugs, but I had an accurate CSV by the end of the week. Just skimming the file, I could see that a surprising number of records never mention that the patient has diabetes.
Now that I've developed the script to go through each record, extract diabetes mentions, and write them to a CSV, doing the same with other tags will be much easier. I also created a CSV file detailing each patient's smoking status. Though this file was relatively simple for me to create, it will be harder to analyze, since a patient's smoking status can change over time.
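As a rough illustration of this kind of script (not the actual code), the sketch below assumes each record has already been parsed into a list of (tag, mention text) pairs. The record layout, the tag names `DIABETES` and `SMOKER`, the sample data, and the function name `write_tag_mentions` are all assumptions made for the example. Records with no mention of the tag still get a row with an empty mention, so the gaps stay visible in the CSV.

```python
import csv
import io


def write_tag_mentions(records, tag, out):
    """Write one CSV row per mention of `tag` in each record.

    `records` maps a record ID to a list of (tag, mention_text) pairs.
    This layout is hypothetical; the real records may look different.
    """
    writer = csv.writer(out)
    writer.writerow(["record_id", "tag", "mention"])
    for record_id, annotations in records.items():
        mentions = [text for name, text in annotations if name == tag]
        if not mentions:
            # Keep a row for records that never mention the tag.
            writer.writerow([record_id, tag, ""])
        for text in mentions:
            writer.writerow([record_id, tag, text])


# Made-up sample data, for illustration only.
records = {
    "rec-001": [("DIABETES", "type 2 diabetes mellitus"), ("SMOKER", "never smoked")],
    "rec-002": [("SMOKER", "quit in 2005")],
}

buffer = io.StringIO()  # stands in for an open CSV file
write_tag_mentions(records, "DIABETES", buffer)
csv_text = buffer.getvalue()
```

Because the tag is a parameter, the same function could be reused for smoking status or family history by passing a different tag name. Writing one row per mention also leaves room for a date column later, which may help with statuses that change over time.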
Currently, I'm repeating this process to create CSV files for family history and other tags.