Pondering DNA at The Eagle, Cambridge

I was recently privileged to have a pint of Guinness at the Eagle in Cambridge with some colleagues after work in our U.K. office. The Eagle holds a significant place in the history of DNA. Two things came to mind: 1) What is it with the number 51 and controversies? Area 51 and Photo 51 both bring up their own issues... and if we combined the two, it would get really interesting: alien DNA. Now that would be a good pub conversation! Especially here at the Eagle, where James Watson and Francis Crick theorized DNA’s helical structure. And 2) I wish Rosalind Franklin could have lived to see how things are evolving in precision medicine.


Use case presentations and training workshops will highlight innovative uses of Linguamatics NLP-based AI technology

Boston - October 9, 2018 - Linguamatics, the leading natural language processing (NLP) text analytics provider, today announced that multiple top-tier biopharma firms and healthcare organizations will share best practices and present a variety of use cases at the upcoming Linguamatics Fall Text Mining Summit. At the October 15-17, 2018 Summit in New Castle, N.H., Linguamatics clients, staff, and other members of the text mining technology community will exchange ideas, network, discuss industry trends, and explore the latest NLP text mining enhancements.  

Featured presenters include representatives from Agios Pharmaceuticals, Atrius Health, Bristol-Myers Squibb, Eli Lilly, GlaxoSmithKline, Mercy, Novo Nordisk, Regeneron Pharmaceuticals, Sanofi, and Secure Exchange Solutions. Use case presentations and training workshops will highlight the wide range of ways organizations are leveraging I2E, Linguamatics’ powerful NLP-based AI technology, to extract actionable insights from the huge amount of unstructured data available in healthcare and the life sciences.

Key use case presentations cover a diverse range of application areas including:


Integrated NLP data management solution speeds clinical data abstraction and curation for disease registries, clinical research and quality reporting


There surely can’t be anyone in the pharma industry who hasn’t heard the story of thalidomide. The disaster that followed thalidomide’s release onto the market in 1959 triggered a wave of regulatory changes to ensure reliable evidence of drug safety, efficacy and chemical purity before a new drug is released onto the market.

While failure of clinical efficacy is the major cause of drug attrition, a poor safety profile is also a major factor in failure of drugs in development, at all stages from initial lead candidate through preclinical and clinical development to post-marketing surveillance. In order to ensure the safety of drugs on the market, rigorous testing is carried out throughout the pipeline, and can be categorised into preclinical safety/toxicology in animal models, clinical safety in human subjects, and then post-market pharmacovigilance, to look for safety signals across a wide patient population (see schematic below).

At every stage, critical data is being both generated and sought from unstructured text: internal safety reports, scientific literature, individual case safety reports, clinical investigator brochures, patient forums, social media and conference abstracts. Intelligent search across these hundreds of thousands of pages can provide the information needed for key decision support. Many of our customers are using the power of Linguamatics I2E’s Natural Language Processing (NLP) solution to transform the unstructured text into actionable structured data that can be rapidly visualized and analyzed, at every stage through the safety lifecycle of a drug.


Our latest version of I2E includes improvements that make it easier to integrate the tool into your organization and process your internal documents, as well as the usual usability enhancements and under-the-hood modifications.

I2E 5.3.1 supports Single Sign-On (SSO) by connecting with federated authentication systems such as ADFS and Shibboleth. This means that you can be authenticated in one system and then seamlessly log into the I2E client without being prompted for your credentials. If you are not already logged in via another system, I2E will initiate the login process via a redirect to a special web page.

We’ve improved the hit highlighting in our Excel results format (figure 1): terms from your search use the same colors for each column in your results, and the colors are consistent across the I2E Query Editor, HTML results and highlighted cache documents.

Figure 1. Excel results show color-coded terms in the Hit column

A good example of continual improvement in I2E is the Class Chooser. In recent releases, we have increased search speed and added as-you-type class suggestions and, in I2E 5.3.1, we’ve added information for each class match to show which ontology the term is from (figure 2). This helps you quickly review your results and find the correct match(es), particularly when you’re using a term that could occur in different ontologies.