February 4, 2016 was World Cancer Day, and February is National Cancer Prevention Month. Throughout this month, individuals and groups worldwide are writing about the importance of taking steps to reduce your own risk of cancer, and about the importance of cancer research at a clinical level.

Linguamatics are one of the pioneers in investing in Natural Language Processing (NLP) text mining technology to improve patient outcomes and cancer care. We have been working in healthcare for over 10 years, and recently announced a collaboration with Cancer Research UK to improve the characterization of cancer patient data for precision medicine.

NLP is growing rapidly in healthcare: it is used not only for research, but is also now in widespread use for computer-aided coding and computer-aided document improvement. Simon Beaulah, our Director of Healthcare Strategy, has published a white paper on 9 ways Natural Language Processing is being used by scientists to improve our (actionable) understanding of cancer. The paper highlights how applying NLP in these nine areas can have a significant impact on cancer care.


Faster, better, cheaper... how often have we heard these words, in the context of any process along the long path of drug development? There are a myriad of solutions that can help at different stages, enabling more comprehensive target assessment, more rapid lead optimization, and so on.  One of the most expensive parts of the drug development process is clinical trials, with bottlenecks including access to knowledge for site selection, patient populations, principal investigators and key opinion leaders. 

Researchers naturally look to utilize information from current and past trials, but manually extracting the relevant information can be resource-intensive, repetitive and, therefore, prone to errors. Time is money, so reducing costs and errors is critical.

One of our customers, Merck, use Linguamatics I2E for text analytics over public domain clinical trial data, to improve clinical trial site selection. 

One example of the benefits of text analytics is a site selection project for Merck's Experimental Medicine division (EMS). The team needed to locate a clinical trial site able to conduct gastric bypass trials and to measure gut peptides before and after surgery. The ideal site needed to meet more than a dozen different criteria, and finding one using the public domain search interface to ClinicalTrials.gov would have been hugely time-consuming.
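To make the idea of matching sites against many characteristics concrete, here is a minimal sketch in Python. It is not how I2E works and is not taken from the Merck project: the `TrialRecord` structure, the site names and the feature labels are all hypothetical, and it assumes the characteristics have already been extracted from trial records as simple labels. It only shows how sites could then be ranked by how many required characteristics their trials cover.

```python
from dataclasses import dataclass, field


@dataclass
class TrialRecord:
    """Hypothetical, simplified view of one already-extracted trial record."""
    site: str
    features: set = field(default_factory=set)


def rank_sites(records, required):
    """Rank sites by how many required characteristics their trials cover."""
    coverage = {}
    for rec in records:
        # Pool evidence across all trials run at the same site.
        coverage.setdefault(rec.site, set()).update(rec.features & required)
    # Best coverage first; ties broken alphabetically by site name.
    return sorted(
        ((site, len(found)) for site, found in coverage.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


# Illustrative labels only; a real project would use over a dozen criteria.
required = {"gastric bypass", "gut peptide measurement", "pre- and post-surgery sampling"}
records = [
    TrialRecord("Site A", {"gastric bypass", "gut peptide measurement"}),
    TrialRecord("Site A", {"pre- and post-surgery sampling"}),
    TrialRecord("Site B", {"gastric bypass"}),
]
ranking = rank_sites(records, required)
print(ranking)  # [('Site A', 3), ('Site B', 1)]
```

The hard part in practice is the extraction step that produces the labels from free text, which is what a text analytics tool is for; the ranking itself is simple once that is done.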


Linguamatics is pleased to announce the latest release of its award-winning natural language processing (NLP)-based text mining and analytics platform, I2E 4.4.

This latest release expands the range of online content access available through I2E OnDemand to include FDA AERS data, from the US Food and Drug Administration’s Adverse Event Reporting System.
 

Software enhancements

I2E 4.4 also adds a number of important software enhancements, including an NLP plugin framework to support non-English languages, enhanced capabilities for viewing chemical structures, better extraction of information from tables, and a new human-readable query language.

FDA AERS is typically used to monitor and discover safety issues in drugs released for public use. This new addition to Linguamatics’ cloud-based I2E OnDemand platform allows users to immediately start mining this valuable safety data source without the overhead of downloading, processing and maintaining the information themselves.

The availability of a new multi-language plug-in framework in I2E 4.4 builds on the theme of this release to extend text mining to a wider range of content.

Text miners can now analyze documents written in a new language by plugging in an appropriate third-party language module.
 

Extended support for PowerPoint, Word and Excel

I2E 4.4 also delivers a number of further improvements, such as extended support for Microsoft PowerPoint, Word and Excel documents. This allows efficient review of document repositories, including extraction of information from tables.
 


A new speaker has just been announced for the annual Text Mining Conference hosted in Cambridge, UK. This annual conference has been running for over 10 years and features text analytics use cases, particularly across pharma and life science. Information professionals from top 100 pharma and life science organizations gather to share insight and best practice, and to discuss the future of text mining technology.

Eleanor Yelland will be presenting on: I2E in mental health: Analysis of online transcripts used in cognitive behavioural therapy.

Eleanor is a PhD student in the Division of Psychiatry at University College London. Her PhD is a partnership with Linguamatics and Ieso Digital Health, who provide text-based online cognitive behavioural therapy.

The project focuses on the language used within treatment sessions, and on how text mining methods can make best use of it to understand and improve treatment provision. The work primarily involves identifying potentially relevant linguistic characteristics, measuring them, and building statistical models of their relationship with therapy outcome scores.

This adds to a world-class list of speakers across pharma and healthcare who will be presenting at the conference, including:

Jonathan Hartmann, Georgetown University Medical Center: Evolution of I2E to improve patient care

Thierry Breyette, Novo Nordisk: Generating Actionable Insights from Real World Data

Cassie Gregson, AstraZeneca: Application of Text Mining to Clinical Research


Reading some of the FDA blogs reviewing 2015, I was interested to see that "for the second consecutive year, [the FDA] approved more drugs to treat rare diseases than any previous year in our history." This is great news for the patients affected by these rare or orphan diseases, and there is of course potential to apply such drugs, and the knowledge around these diseases, across the wider population and in broader healthcare.

Text analytics can play a part in developing better understanding around the biology of these rare diseases. There's a great example of this application of text mining from Madhusudan Natarajan at Shire Pharmaceuticals. Shire develops and provides healthcare in the areas of behavioural health, gastrointestinal conditions, rare diseases, and regenerative medicine, and Madhu has presented his research using text analytics to uncover disease severity and genotype-phenotype associations for Hunter Syndrome (also known as Mucopolysaccharidosis II).

We recently hosted a webinar with Madhu. In this webinar, he illustrates some of the challenges of R&D for orphan diseases, particularly around text mining for mutation and variant patterns, which can be reported in so many different ways in the literature.

Webinar: A systematic examination of gene-disease associations through text mining approaches
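
To illustrate why variant mentions are hard to mine, the sketch below normalizes a few common ways of writing the same protein substitution into a single one-letter form. It is a minimal Python example under stated assumptions, not the method used in the webinar or in I2E: the function name `normalize_variants` and the example sentence are illustrative, and the pattern covers only simple amino acid substitutions. Real literature also contains DNA-level changes, deletions, frameshifts and free-text descriptions, and a pattern this simple will also match some non-variant tokens that happen to look like a letter, a number and a letter.

```python
import re

# Standard three-letter to one-letter amino acid codes.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
AA3 = "|".join(AA3_TO_1)                  # Ala|Arg|...
AA1 = "".join(sorted(AA3_TO_1.values()))  # ACDEFGHIKLMNPQRSTVWY

# Matches "p.Arg468Gln", "Arg468Gln", "p.R468Q" or "R468Q".
PATTERN = re.compile(
    rf"\b(?:p\.)?(?:(?P<ref3>{AA3})(?P<pos3>\d+)(?P<alt3>{AA3})"
    rf"|(?P<ref1>[{AA1}])(?P<pos1>\d+)(?P<alt1>[{AA1}]))\b"
)


def normalize_variants(text):
    """Return simple protein substitutions in text in one-letter form, e.g. 'R468Q'."""
    found = []
    for m in PATTERN.finditer(text):
        if m.group("ref3"):
            ref, pos, alt = AA3_TO_1[m.group("ref3")], m.group("pos3"), AA3_TO_1[m.group("alt3")]
        else:
            ref, pos, alt = m.group("ref1"), m.group("pos1"), m.group("alt1")
        found.append(f"{ref}{pos}{alt}")
    return found


# Illustrative sentence, not a quotation from any paper.
text = "Patients with p.Arg468Gln were severe; R468Q and Arg468Gln appear elsewhere."
print(normalize_variants(text))  # ['R468Q', 'R468Q', 'R468Q']
```

Collapsing the three spellings into one form is what allows mentions from different papers to be counted together when looking for genotype-phenotype associations.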