This year, the annual Linguamatics Text Mining Summit, which took place Oct 12-14 in Newport, RI, introduced the first-ever I2E Healthcare Hackathon. With a rapidly expanding healthcare user base and continued interest from our pharmaceutical clients in real-world data, we realized the value of running a session specifically targeting the mining of electronic health records. Our aim was to have an in-depth exercise that closely reflected the types of data and issues that our customers and our own team encounter. Linguamatics has been involved in a wide variety of complex and valuable customer projects, allowing us to develop best-practice querying strategies and data processing techniques. The Hackathon was a great opportunity for us to share these best practices with the wider Linguamatics and text mining community. The session was set up as a competition in a similar style to the i2b2 NLP challenges, with teams formed from the 23 attendees, a mixture of experienced and new users from pharma and healthcare groups.
The challenge: extract key data elements (disease terms, medications and dosages, and lab values) from a gold standard set of medical transcripts. Going further, the teams also needed to identify whether disease terms related to the patient, related to family history, or were negative statements indicating a disease was not present. The context in which the disease terms occur was therefore of major importance. Contestants were encouraged to use the various disease ontologies provided with I2E to identify key disease concepts and to submit the normalized disease code as part of their results, enabling automated assessment of the results. The teams were also able to use region fields associated with the different sections of the transcripts to focus their queries on the appropriate area of the document.
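To illustrate why normalized codes make automated assessment possible, here is a minimal sketch of how submitted results could be compared with a gold standard using precision, recall and F1. It is not the scoring method used at the Hackathon: the `Annotation` record, the `score` function, the context labels and the disease codes are all hypothetical placeholders chosen for the example.

```python
from typing import Iterable, NamedTuple, Tuple


class Annotation(NamedTuple):
    """One extracted disease mention (hypothetical record layout)."""
    doc_id: str   # which transcript the mention came from
    code: str     # normalized disease code (placeholder values below)
    context: str  # assumed labels: "patient", "family_history", "negated"


def score(gold: Iterable[Annotation],
          submitted: Iterable[Annotation]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact matching on all fields."""
    gold_set = set(gold)
    submitted_set = set(submitted)
    true_positives = len(gold_set & submitted_set)

    precision = true_positives / len(submitted_set) if submitted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example data: codes such as "DIS:0001" are placeholders, not real codes.
gold = [
    Annotation("doc1", "DIS:0001", "patient"),
    Annotation("doc1", "DIS:0002", "family_history"),
    Annotation("doc2", "DIS:0003", "negated"),
    Annotation("doc2", "DIS:0001", "patient"),
]
submitted = [
    Annotation("doc1", "DIS:0001", "patient"),
    Annotation("doc2", "DIS:0003", "negated"),
    Annotation("doc1", "DIS:0002", "patient"),  # right code, wrong context
]

precision, recall, f1 = score(gold, submitted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because a match requires the code and the context to agree, a team that finds the right disease but assigns it to the patient rather than to family history gains no credit for that mention, which mirrors the importance of context described above.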
The session lasted the entire day and although everyone was exhausted by the end, the feedback from all was extremely positive. For example:
- “It was exactly what I hoped it would be”
- “It was really good to be able to just write queries for the day without needing to worry about data formatting and indexing”
- “Even though I’m a new user I learned a lot about I2E because the challenge closely matches my work”
- “It was fun, educational and challenging. I enjoyed working in a team, the networking and guidance from I2E staff. I'd be very eager to do it again.”
- “The hackathon provided invaluable real-world experience, hands-on training and use of the tool”
We had eight teams taking part in the challenge and working hard through the day. The teams were so committed to the task that it was very difficult to persuade them to break for refreshments and lunch.
We announced the winners of the competition the following day during the main conference sessions; the honors went to the team from Pentavere Research Group. All the teams performed well, with some focusing on specific types of data and others attempting all of the different data types. The top three rankings and special mentions are as follows:
- #1 Pentavere Research Group
- #2 SCPFBMS
- #3 Team melting pot
- Best negated disease: Team Matt and Enlai
- Best medication: Jeremalika
- Best dosage: JNH_WX_LOS
- Best labs: TracySamRyan
After such a positive response we are certainly planning a similar session for next year’s Text Mining Summit in beautiful Cape Cod! We will incorporate the excellent feedback from participants, which includes considering a smaller task, balancing teams across experienced and novice users, and offering more opportunities to tackle similar challenges. Thanks to everyone who took part and helped organize, especially James, David, Himanshu, Sharon, Erin, Paul and Mira.
We look forward to seeing you next year!