Precise Medication Extraction using Agile Text Mining

Shivade C, Cormack J, Milward D

Proc 5th Int Workshop Health Text Mining Information Analysis (Louhi), EACL. 2014 Apr; pp.75–79



Agile text mining is widely used for commercial applications in the pharmaceutical industry. It can be applied without building an annotated training corpus, so it is well suited to novel or one-off extraction tasks. In this work, we investigated how efficiently it could be adapted to healthcare extraction tasks such as medication extraction.

The aim was to identify medication names and their associated dosage, route of administration, frequency, duration, and reason, as specified in the 2009 i2b2 medication challenge. Queries were constructed based on 696 discharge summaries available as training data. Performance was measured on a test dataset of 251 unseen documents. F1-scores were calculated by comparing system annotations against the ground truth provided for the test data.

Despite the short time spent adapting the system to this task, it achieved high precision and reasonable recall (precision of 0.92, recall of 0.715).

It would have ranked fourth among the original challenge participants on the basis of its F-score of 0.805 for phrase-level horizontal evaluation. This shows that agile text mining is an effective approach to information extraction that can yield highly accurate results.
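
As a quick arithmetic check, the reported F-score matches the standard F1 definition, the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f1_score` is an assumption, and the challenge's own evaluation script may aggregate counts differently before computing the final score.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract: precision 0.92, recall 0.715.
reported_f1 = f1_score(0.92, 0.715)
print(round(reported_f1, 3))  # 0.805
```

The gap between precision and recall pulls the harmonic mean toward the lower value, which is why an F-score of 0.805 sits closer to the recall of 0.715 than a simple average would suggest.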