Posts from August 2016

Wilmington, DE – The Pistoia Alliance, an organization dedicated to improving global life sciences R&D, has expanded its membership with a number of new members, including both large multinationals and start-ups.

The new members include Accenture, Linguamatics, Novaseek, Repositive, Agrimetrics and Daniel Taylor. Existing members upgrading to the new Startup membership category this quarter include KNIME, Scitegrity, Databiology, The Hyve, Binocular Vision, BioVariance, and Promeditec. This takes the membership of the Pistoia Alliance to over 80 globally, including many of the world’s biggest pharmaceutical companies, many of the most innovative start-ups, and companies and organizations that support the life sciences sector.

Dr. Steve Arlington, Pistoia Alliance President, said: “I am delighted to see that the Pistoia Alliance continues to show strong growth across all segments in life sciences, and we continue to attract a broad range of members. At the same time, we are also changing how we operate. Our challenge is to promote and encourage pre-competitive collaboration between our members, to benefit our members and ultimately accelerate the delivery of new drugs, devices and services to enhance performance within the sector. The Pistoia Alliance is well placed to help life sciences tackle many of its challenges and through our new strategy and the continued support of our members we will continue to support the global life sciences industry.”


Until recently, the use of natural language processing (NLP) in healthcare has been primarily limited to research efforts and population health within academic medical centers. However, with the proliferation of unstructured data from electronic medical records, providers are now seeking to harness the potential of their data and considering a variety of use cases for NLP technology.[1] That’s the conclusion of a recent KLAS report entitled “Natural Language Processing: Glimpses into the Future of Unstructured Data Mining.”

The report includes insights from 58 provider organizations and examines the various ways providers are currently leveraging NLP technology, as well as some of the use cases poised for wider adoption. Coding and documentation applications represent the broadest use of NLP engines. But it is clear providers have a growing interest in NLP solutions that advance their population health initiatives. An increasingly popular use case involves applications that use NLP to mine unstructured data within patient populations and include predictive analytics to identify at-risk patient populations.

A few of the major findings from KLAS’s report are summarized below.

How is NLP being used today?


We are always enthused to read about new ways to utilize text mining in the drug discovery and development process, and very much enjoyed the recent paper by Heinemann et al., “Reflection of successful anticancer drug development processes in the literature”. In this study, the researchers develop tools that allow the prediction of the approval or failure of a targeted cancer drug, using models based on information mined from MEDLINE abstracts, along with a slew of other quantitative metadata (e.g. MeSH headings, author counts, fraction of authors with industry affiliation, and more). 

I2E, Linguamatics' text mining platform, enabled the researchers to systematically identify all MEDLINE abstracts containing both the protein target and the specific disease indication of a known set of successfully approved or failed cancer therapeutics; for example, abstracts containing both Her2 and breast cancer, or c-Kit and gastrointestinal stromal tumor (GIST). I2E enables the use of large vocabularies or ontologies of genes and diseases to extract key information, and the researchers used I2E for the rapid retrieval of publications containing any one of the many synonyms of a protein target or indication.
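To make the idea of synonym-based co-occurrence retrieval concrete, here is a minimal Python sketch. It is not I2E's API or the authors' implementation: it uses plain regular expressions, and the synonym lists and function names (`compile_terms`, `cooccurring_abstracts`) are hypothetical, chosen only for illustration. A real vocabulary or ontology would hold far more synonyms per concept.

```python
import re

# Hypothetical, heavily abbreviated synonym lists for one target-indication pair.
TARGET_SYNONYMS = ["HER2", "ERBB2", "HER-2/neu"]
INDICATION_SYNONYMS = ["breast cancer", "breast carcinoma", "mammary carcinoma"]


def compile_terms(synonyms):
    """Build one case-insensitive pattern that matches any synonym as a whole term."""
    # Longest synonyms first, so a longer term is preferred over a shorter prefix.
    ordered = sorted(synonyms, key=len, reverse=True)
    alternatives = "|".join(re.escape(s) for s in ordered)
    # The lookarounds stop matches inside longer words (e.g. "HER2" in "HER2X").
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


def cooccurring_abstracts(abstracts, target_terms, indication_terms):
    """Return the abstracts that mention both a target synonym and an indication synonym."""
    target_re = compile_terms(target_terms)
    indication_re = compile_terms(indication_terms)
    return [
        text
        for text in abstracts
        if target_re.search(text) and indication_re.search(text)
    ]
```

Calling `cooccurring_abstracts(abstracts, TARGET_SYNONYMS, INDICATION_SYNONYMS)` on a list of abstract strings keeps only those that mention both concepts. Simple string matching like this ignores ambiguity and context, which is one reason a dedicated text mining platform with curated ontologies is used in practice.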

The researchers found that, from 9 years before FDA approval, the set of approved target-indication pairs showed a significantly higher publication count than the pairs that eventually failed.

Taking the study further, they applied machine learning classifiers and found that the extracted data features could be used to predict success or failure of target-indication pairs, and hence, approved or failed drugs. They conclude: