Attendees at Linguamatics Spring Text Mining Conference

What can YOU do with NLP and Text Mining?

It seems like there’s not much you can’t do, if you are as ingenious as our customers!  Want to understand your patients better? Use NLP. Want to visualise the world of chemical safety? Use NLP. Want to look at real world data for adverse events? Use NLP!

These were some of the topics presented by healthcare and life science customers at our Spring Text Mining Conference in Cambridge, UK (#LMSpring17). Attendees from the pharmaceutical industry, biotech, healthcare, academia, and partner vendor companies came together for hands-on workshops, round table discussions, and of course, excellent presentations and talks.

Huntsman Cancer Institute: Speeding Patient Care Alerts

Samir Courdy, Chief Research Information Officer and Director of Research Informatics at the Huntsman Cancer Institute, kicked off the day with a talk on “Navigating the Quagmire of Clinical Data in Free Text Reports”. The team at Huntsman are using NLP to structure data that is currently available only in free text and capture it in their Clinical Cancer Research (CCR) database. This framework reduces the cost and workload of manual curation, and enables more effective identification of disease group patients. Queries developed at HCI have been shared with City of Hope cancer treatment and research center. Samir highlighted the importance and advantages of such community sharing of NLP resources for healthcare practitioners and for patients.

Figure: The workflow used at Huntsman Cancer Institute to create structured data attributes from clinical notes and pathology and radiology reports, and to map the results into the HCI Clinical Cancer Registry. This workflow speeds patient care alerts.


AstraZeneca: Comparing FDA Drug Labels and PatientsLikeMe Real World Evidence

James Loudon-Griffiths, Clinical Information Scientist at AstraZeneca, shared “Stories from Clinical Informatics at AstraZeneca”, a selection of projects using NLP to answer challenging questions within late stage drug development. James described research comparing real world data on the adverse event of nausea with clinical data on the same set of drugs. The data sources used were diverse – patient reported outcomes from PatientsLikeMe, and clinical trial summary data from FDA Drug Labels. The team used NLP to structure the data from FDA Drug Labels, making statistical analyses of the two data sets possible. James discussed the overall correlation, and also some of the discrepancies in nausea reporting between the clinical setting and the real world, some of which could be due to dosing and usage differences.

Mundipharma: Ensuring IDMP Compliance in Regulatory Operations

Moving to a very different pharma realm, regulatory informatics, Jon Sanford, Head of Regulatory Operations, and Will Hayes, Director, IT Business Transformation, from Mundipharma Research presented on “IDMP and Compliance - Using Text Mining to Support Regulatory Workflows”. They described a successful pilot project that used I2E to extract structured data elements for IDMP Iteration 1 from the unstructured text and tables within Summary of Product Characteristics (SmPC) documents. In the pilot, the team evaluated accuracy against a manually extracted gold standard data set, and generally rated the outcome as “excellent” (90%+ accuracy). The Mundipharma team is now productising this workflow for all Mundipharma products, and expanding to non-English language SmPCs.

CRUK: Mining Pathology Reports to Deliver Enhanced Precision Medicine Care

Back in healthcare, Helen Pitman, CRUK, gave a fascinating talk on “Extracting Attributes from Pathology Records in the CRUK Stratified Medicine Programme 1”. In a double-act with Paul Milligan, Senior Product Manager at Linguamatics, Helen shared how NLP fits into their precision medicine strategy, which aims to tailor treatment and care to individual patients based on the molecular make-up of their disease. Their project used I2E in workflows to tackle over 100 pathology-related data items, from over 10,000 patients and six different tumour indications, across eight clinical sites in the UK. Helen said, “It was possible to automate the entire process from document input to results generation, eliminating the possibility of errors being manually introduced into the process”.

Roche: Integrated Chemistry Workflows to Answer Drug Discovery Research Questions

Daniel Stoffler, Senior Principal Scientist, and Raul Rodriguez-Esteban, Senior Scientist, from Roche Pharma Research and Early Development (pRED) presented the final customer talk of the day, “Artemis – a text mining tool for Chemists”. Artemis is a web-based front-end to I2E, developed to help drug project teams easily ask questions about what is known or published about chemicals and their associations with targets, safety, or toxicity information. The workflow integrates the text-mined results with other internal data (such as molecular properties) and the output is visualised in Spotfire, helping users to answer questions such as:

  • “Extract from publications any chemicals affecting targets in the area of Eye Diseases”
  • “Show me which targets play a role in the eye disease Cataract”
  • “Let me filter out unwanted compounds”
  • “Now show me the chemical diversity”

This talk beautifully illustrated one of the take-home messages from the day. Text mining, integrated into enterprise systems and workflows, provides a hugely valuable tool to solve a variety of challenges related to information buried in text – whether related to population health, stratified medicine, regulatory compliance, understanding signals from real world evidence, or drilling into the possible risk liabilities of compounds in drug development.

So, while the sun didn’t shine as much as last year, it didn’t snow, and the lively discussions and events kept us all warm. The conference provided several days filled with food for thought on the value and power of text mining for a variety of applications within pharma, healthcare, and the broader life sciences. Thank you to everyone who contributed, and we hope to see you all at our other events across the year.

And, if you want to understand what YOU could do with NLP, please contact us!


Figure: An illustration of the challenges facing CRUK in extracting and standardising data elements for stratified medicine; this shows just some of the ways physicians described a particular drug combination in some of the electronic medical records.