Posts from October 2018

This month, over 100 life science and healthcare informatics professionals met at the Linguamatics Text Mining Summit 2018 in Portsmouth, NH.

Attendees from multiple pharma companies presented valuable new use cases on how they are using Linguamatics I2E’s Natural Language Processing (NLP)-based AI technology to solve big data challenges from bench to bedside – mining unstructured real world data for rapid reporting of patient trends; discovery of new therapeutic indications of drug targets; developing novel biologics; and supporting risk management and drug safety.

Presenters from healthcare shared how they unlock insights by mining Electronic Health Records (EHRs) in some of the most innovative areas in healthcare today – including real world evidence for clinical outcomes; streamlining prior authorization and medical review workflows; and identifying clinical care gaps.


A key requirement in drug development – and increasingly in precision/personalized medicine and pharmacogenomics – is a comprehensive understanding of the genetic associations for the disease of interest. For a multiple sclerosis (MS) biomarker discovery project, Sanofi wanted to annotate the association of human leukocyte antigen (HLA) alleles and haplotypes with diseases and drug hypersensitivity, as the HLA genotype is responsible for some 30% of the risk of MS and participates in almost every aspect of the disease.

HLA alleles have been associated with multiple autoimmune diseases, various types of cancer, infectious disease, and drug adverse events, but there are no known resources that systematically annotate these associations.

Developing a Comprehensive Catalog of Disease Annotations using Natural Language Processing (NLP)-based Text Analytics

Sanofi identified more than 400 HLA alleles through a whole exome sequencing-based HLA typing and analysis workflow. These potential candidate biomarkers were not annotated in any database. Sanofi then used the Linguamatics I2E NLP solution to analyse and search the literature to annotate the association of the identified HLA alleles with diseases and drug hypersensitivity.

Sanofi linguistically processed and indexed a literature corpus of 25 million PubMed abstracts and 4 million full text journal articles with I2E text analytics, using an internally developed HLA gene ontology, alongside Linguamatics I2E’s dictionary of relationship verbs (e.g. causes, leads to, results in) and Diseases ontology. This identified HLA alleles and haplotypes and their relationships with diseases and drug hypersensitivity.
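To make the idea concrete, here is a minimal Python sketch of dictionary-driven relationship extraction. It is not I2E and does not reflect Sanofi's actual queries: the regular expression, the short verb and disease lists, and the function name `extract_associations` are all hypothetical stand-ins for the HLA ontology, relationship-verb dictionary, and Diseases ontology described above, and the sample sentence is illustrative rather than taken from the corpus.

```python
import re

# Hypothetical toy vocabularies standing in for curated ontologies.
# The pattern matches allele names such as HLA-B*57:01 or HLA-DRB1*15:01.
HLA_ALLELE = re.compile(r"HLA-[A-Z]+[0-9]*\*\d{2}(?::\d{2})*")
RELATION_VERBS = ["is associated with", "causes", "leads to", "results in"]
DISEASES = ["multiple sclerosis", "narcolepsy", "abacavir hypersensitivity"]


def extract_associations(sentence):
    """Return (allele, verb, disease) triples whose terms appear in that order."""
    lowered = sentence.lower()
    triples = []
    for allele_match in HLA_ALLELE.finditer(sentence):
        for verb in RELATION_VERBS:
            # Only consider a verb that occurs after the allele mention.
            verb_pos = lowered.find(verb, allele_match.end())
            if verb_pos == -1:
                continue
            for disease in DISEASES:
                # Only consider a disease that occurs after the verb.
                if lowered.find(disease, verb_pos + len(verb)) != -1:
                    triples.append((allele_match.group(), verb, disease))
    return triples


if __name__ == "__main__":
    example = "HLA-DRB1*15:01 is associated with multiple sclerosis."
    for triple in extract_associations(example):
        print(triple)
```

A real NLP platform would add linguistic processing (sentence parsing, synonym and negation handling) that simple string matching like this cannot provide; the sketch only shows how three vocabularies can combine into allele–relationship–disease triples.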


Pondering DNA at The Eagle, Cambridge

I was recently privileged to have a pint of Guinness at the Eagle in Cambridge with some colleagues after work in our U.K. office. The Eagle has a notable place in the history of DNA. Two things came to mind: 1) What is it with the number 51 and controversies? Area 51 and Photo 51 both bring up their own issues...and if we combine the two it would get really interesting...Alien DNA. Now that would be a good pub conversation! Especially here at the Eagle, where James Watson and Francis Crick theorized DNA’s helical structure. And 2) I wish Rosalind Franklin could have lived to see how things are evolving in precision medicine.


Use case presentations and training workshops will highlight innovative uses of Linguamatics NLP-based AI technology

Boston - October 9, 2018 - Linguamatics, the leading natural language processing (NLP) text analytics provider, today announced that multiple top-tier biopharma firms and healthcare organizations will share best practices and present a variety of use cases at the upcoming Linguamatics Fall Text Mining Summit. At the October 15-17, 2018 Summit in New Castle, N.H., Linguamatics clients, staff, and other members of the text mining technology community will exchange ideas, network, discuss industry trends, and explore the latest NLP text mining enhancements.  

Featured presenters include representatives from Agios Pharmaceuticals, Atrius Health, Bristol-Myers Squibb, Eli Lilly, GlaxoSmithKline, Mercy, Novo Nordisk, Regeneron Pharmaceuticals, Sanofi, and Secure Exchange Solutions. Use case presentations and training workshops will highlight the wide range of ways organizations are leveraging I2E, Linguamatics’ powerful NLP-based AI technology, to extract actionable insights from the huge amount of unstructured data available in healthcare and the life sciences.

Key use case presentations cover a diverse range of application areas including: