Agios uses Natural Language Processing in Clinical Safety

January 29, 2019

Linguamatics NLP platform enables rapid adverse event understanding from clinical trials

Identifying serious adverse events (SAEs) during clinical trials is a critical part of patient monitoring, and Agios wanted to enable a more rapid response to SAEs. SAE report forms can arrive in image or PDF format, and manual extraction of the key patient data is slow and error-prone. Agios developed a workflow to process these forms, using the Linguamatics NLP platform to extract all relevant patient data. The workflow steps included:

  • OCR of the image SAE reports to render the data accessible
  • Indexing all documents with ontologies such as MeSH, MedDRA, WHO Drugs to normalize and code the data attributes
  • Using Linguamatics NLP platform queries to extract study drug, concomitant medications, adverse events, date of onset, lab test results and other key patient attributes 
  • Loading the data into a clinical safety database for rapid access
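As a rough illustration of the normalization step, the sketch below maps free-text adverse event mentions to coded preferred terms and collects them into one structured record per patient. It is not the Linguamatics API or Agios's implementation: the `TOY_ONTOLOGY` table, its `DEMO-` identifiers, the `SaeRecord` class and the `normalize_terms` function are all hypothetical, and the codes are not real MedDRA codes.

```python
from dataclasses import dataclass, field

# Hypothetical synonym table standing in for an ontology lookup.
# The IDs are invented for illustration and are NOT real MedDRA codes.
TOY_ONTOLOGY = {
    "fever": ("DEMO-001", "Pyrexia"),
    "pyrexia": ("DEMO-001", "Pyrexia"),
    "shortness of breath": ("DEMO-002", "Dyspnoea"),
    "dyspnea": ("DEMO-002", "Dyspnoea"),
}


@dataclass
class SaeRecord:
    """One structured record per SAE report (illustrative fields only)."""
    patient_id: str
    study_drug: str
    adverse_events: list = field(default_factory=list)
    unmapped_terms: list = field(default_factory=list)


def normalize_terms(patient_id, study_drug, raw_terms):
    """Map raw extracted terms to coded terms; keep unmapped ones for review."""
    record = SaeRecord(patient_id, study_drug)
    for term in raw_terms:
        # Lower-case and collapse whitespace so OCR spacing noise still matches.
        key = " ".join(term.lower().split())
        coded = TOY_ONTOLOGY.get(key)
        if coded is None:
            record.unmapped_terms.append(term)
        elif coded not in record.adverse_events:
            record.adverse_events.append(coded)
    return record


record = normalize_terms(
    "P001", "AG120", ["Fever", "pyrexia", "Shortness  of breath", "rash"]
)
print(record.adverse_events)
print(record.unmapped_terms)
```

Keeping unmapped terms in their own list, rather than dropping them, leaves a trail for manual review before the record is loaded into a safety database.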

Identification of at-risk patients with network visualisations

A specific clinical example explored the risk of a rare, potentially life-threatening adverse event, Differentiation Syndrome (DS), in patients on a clinical trial of Agios’s IDH1 inhibitor AG120. DS is a complication of first-line chemotherapy in some acute promyelocytic leukemia (APL) patients, and it can be fatal if not recognized in time and treated aggressively.

Using the Linguamatics text mining workflow they had developed, the Agios team were able to highlight and cluster MedDRA terms associated with DS across the patient pool in the ongoing clinical trial. They were able to characterize which adverse events (potential DS symptoms) are most likely to co-occur with DS in the patient cohort, which events appear in only some cases, and which subsets of patients might be more at risk from DS than others. The extracted data was visualized as networks in Cytoscape, enabling clinicians to explore the patterns of symptoms across patients and, critically, to identify those at risk.
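One simple way to frame the co-occurrence question is to ask, among patients with a reported DS event, what fraction also have each other adverse event. The sketch below computes that under stated assumptions: the function `cooccurrence_with` and the three-patient data set are invented for illustration, and the source does not say which statistic the Agios team actually used.

```python
from collections import Counter


def cooccurrence_with(target, patient_events):
    """Return {term: fraction of target-positive patients who also report term}.

    patient_events maps a patient ID to the set of coded adverse event terms
    extracted for that patient.
    """
    with_target = [ev for ev in patient_events.values() if target in ev]
    if not with_target:
        return {}
    counts = Counter(
        term for ev in with_target for term in set(ev) if term != target
    )
    return {term: n / len(with_target) for term, n in counts.items()}


# Made-up example data, not trial results.
patients = {
    "P1": {"Differentiation syndrome", "Pyrexia", "Dyspnoea"},
    "P2": {"Differentiation syndrome", "Pyrexia"},
    "P3": {"Pyrexia"},
}
rates = cooccurrence_with("Differentiation syndrome", patients)
for term, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {rate:.2f}")
```

Terms with a fraction near 1 co-occur with the target in almost every case, while lower fractions correspond to events that appear in only some cases.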


Linguamatics I2E has been used at Agios for 10 years. Stuart Murray (Director, Informatics) said that one reason for using I2E is speed: getting decision support as quickly and as comprehensively as possible. “We’ve used I2E from very early exploratory research to discover targets for our pipeline through to pre-clinical development looking for safety signals, and now most recently for pharmacovigilance to understand what is going on in our clinical trials.”

The clinical safety team had not previously been familiar with this type of network analysis, but found the insights it provided highly valuable and has decided to apply it in other projects.


Network analysis of the I2E data to identify possible patients with potentially fatal Differentiation Syndrome (DS). Clinicians defined primary and secondary diagnostic criteria for DS. I2E extracted and structured the AEs for all patients from the SAE report forms. This allows visualisation of any patient (blue node) with either primary (red) or secondary (green) symptoms. Patients most at risk (in the centre of the Primary Diagnostic Criteria cluster) can be rapidly identified for appropriate intervention or care.
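A patient-symptom network of this kind can be sketched as a list of edges from each patient to each matching criterion, labelled primary or secondary. The example below builds those edges, ranks patients by how many primary criteria they match, and writes tab-delimited lines in the style of Cytoscape's SIF format (source, interaction, target). The function names, the criteria labels and the patient data are placeholders, since the source does not list the actual diagnostic criteria, and counting primary criteria is only an assumed stand-in for being "in the centre" of the cluster.

```python
def build_edges(patient_events, primary, secondary):
    """Return (patient, kind, event) edges for events matching either criteria set."""
    edges = []
    for patient, events in sorted(patient_events.items()):
        for event in sorted(events):
            if event in primary:
                edges.append((patient, "primary", event))
            elif event in secondary:
                edges.append((patient, "secondary", event))
    return edges


def rank_by_primary(edges):
    """Rank patients by number of primary criteria matched, highest first."""
    counts = {}
    for patient, kind, _ in edges:
        counts.setdefault(patient, 0)
        if kind == "primary":
            counts[patient] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def to_sif(edges):
    """Format edges as tab-delimited 'source, interaction, target' lines."""
    return "\n".join("\t".join(edge) for edge in edges)


# Placeholder criteria and patients, invented for illustration only.
primary = {"criterion_A", "criterion_B"}
secondary = {"criterion_X", "criterion_Y"}
patients = {
    "P1": {"criterion_A", "criterion_B", "criterion_X"},
    "P2": {"criterion_A", "criterion_X", "unrelated_event"},
    "P3": {"criterion_Y"},
}

edges = build_edges(patients, primary, secondary)
print(rank_by_primary(edges))
print(to_sif(edges))
```

The tab delimiter is used so that terms containing spaces stay intact when the edge list is imported into a network tool.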
