Blog

Natural Language Processing (NLP) and text analytics experts from pharmaceutical and biotech companies, healthcare providers and payers gathered on August 26th to discuss the latest industry trends and to hear product news and case studies from Linguamatics.

The keynote presentation from Dr Gabriel Escobar was the highlight of the event, covering a rehospitalization prediction project that the Kaiser Permanente Division of Research has been working on in collaboration with Linguamatics.

The predictive model has been developed using a cohort of approximately 400,000 patients and combines scores from structured clinical data with factors derived from unstructured data using I2E.

Key factors that could affect a patient’s likelihood of rehospitalization are trapped in text; these include ambulatory status, social support network and functional status. I2E queries enabled KP to extract these factors and use them to indicate the accuracy of the structured data’s predictive score.
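
To make the idea of combining a structured score with text-derived factors concrete, here is a minimal sketch. It is not the Kaiser Permanente model: the factor names, the weights and the `adjusted_risk` function are all hypothetical, and shifting the log-odds is only one simple way such factors could be folded into an existing score.

```python
import math

# Hypothetical weights on the log-odds scale, chosen only for illustration.
TEXT_FACTOR_WEIGHTS = {
    "non_ambulatory": 0.6,
    "lives_alone": 0.4,
    "needs_help_with_daily_activities": 0.5,
}


def adjusted_risk(structured_risk, text_factors):
    """Shift a structured-data risk estimate using factors found in free text.

    structured_risk: probability strictly between 0 and 1 from structured data.
    text_factors: iterable of factor names extracted from clinical notes.
    """
    if not 0.0 < structured_risk < 1.0:
        raise ValueError("structured_risk must be strictly between 0 and 1")
    # Convert the probability to log-odds so that factor weights add linearly.
    log_odds = math.log(structured_risk / (1.0 - structured_risk))
    for factor in text_factors:
        # Unknown factors contribute nothing.
        log_odds += TEXT_FACTOR_WEIGHTS.get(factor, 0.0)
    # Convert back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


baseline = adjusted_risk(0.2, [])
with_text = adjusted_risk(0.2, ["non_ambulatory", "lives_alone"])
print(f"structured only: {baseline:.3f}, with text factors: {with_text:.3f}")
```

In a real project the weights would be estimated from the cohort rather than set by hand.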

Leading the use of I2E in healthcare, Linguamatics showed how cancer centers are working together to develop queries for pathology reports, for mining medical literature and for predicting pneumonia from radiology reports. They also demonstrated a prototype application to match patients to clinical trials and a cohort selection tool using semantic tagging of patient narratives in the Apache Solr search engine.


(Cambridge, UK & Boston, USA - 2 September 2014) - Today, Linguamatics I2E Semantic Enrichment has been selected as a KMWorld 2014 Trend-Setting Product.

I2E Semantic Enrichment, the newest offering from Linguamatics, is used within an existing enterprise search deployment to enrich the current search metadata to make information more discoverable and provide more relevant search results.

I2E Semantic Enrichment uses Linguamatics’ text analytics platform, I2E, to bring powerful semantic and natural language processing (NLP) technology to enterprise search applications, particularly within life sciences and healthcare.

The system scans millions of documents to identify and mark up semantic entities such as genes, drugs, diseases, organizations, authors, patient characteristics and lifestyle factors, plus other relevant concepts and relationships.

Enterprise search engines consume this enriched metadata to provide a faster, more effective search for users. This results in a richer search experience, increased findability and improved speed to insight, enabling users to be more productive and spend less time on search.
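
As a rough illustration of what "enriched metadata" can look like, the sketch below tags a document against a tiny vocabulary and returns the entities found, grouped by type. It is not how I2E works internally: the `VOCABULARY` table and the `enrich` function are hypothetical, and simple dictionary matching stands in for full NLP.

```python
import re

# Hypothetical mini-vocabulary; a real deployment would use large curated ontologies.
VOCABULARY = {
    "drug": ["warfarin", "aspirin"],
    "disease": ["pneumonia", "renal cell carcinoma"],
    "gene": ["BRCA1", "TP53"],
}


def enrich(text):
    """Return a metadata dict mapping entity type to the sorted terms found in text."""
    metadata = {}
    for entity_type, terms in VOCABULARY.items():
        found = set()
        for term in terms:
            # Whole-word, case-insensitive match on the vocabulary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(term)
        if found:
            metadata[entity_type] = sorted(found)
    return metadata


doc = "Patient with renal cell carcinoma was started on Warfarin."
print(enrich(doc))
```

A search engine could index the returned dictionary as extra fields alongside the document text, so that a query for a drug or disease can filter on those fields rather than rely on free-text matching alone.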

“I2E Semantic Enrichment was selected by the panel because it demonstrates innovative use of natural language processing-based text analytics to address the issue of enterprise search findability, bringing better search results to the most important stakeholder – the customer,” says Hugh McKellar, KMWorld Editor-in-Chief.

Dr. Phil Hastings, SVP Sales and Marketing at Linguamatics, comments: “We’re honored to receive this recognition from KMWorld.”


Since the human genome was published in 2001, we have been talking about the potential application of this knowledge to personalized medicine, and in the last couple of years, we seem at last to be approaching this goal.

A better understanding of the molecular basis of diseases is key to development of personalized medicine across pharmaceutical R&D, as was discussed last year by Janet Woodcock, Director of the FDA’s Center for Drug Evaluation and Research (CDER).

FDA CDER has been urging adoption of pharmacogenomics strategies and pursuit of targeted therapies for a variety of reasons. These include the potential for decreasing the variability of response, improving safety, and increasing the size of treatment effect, by stratifying patient populations.

Pharmacogenomics is the study of the role an individual’s genome plays in drug response, which can vary from adverse drug reactions to lack of therapeutic efficacy. With the recent explosion in sequence data from next-generation sequencing (NGS) technologies, one of the bottlenecks in applying genomic variation data to understanding disease is access to annotation.

From NGS workflows, scientists can quickly identify long lists of candidate genes that differ between two conditions (case-control, or family hierarchies, for example). Gene annotations are essential to interpret these gene lists and to discover fundamental properties like gene function and disease relevance.
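
The annotation step can be pictured as a lookup from each candidate gene to what is known about it. The sketch below is illustrative only: the `GENE_ANNOTATIONS` table is a hand-written stand-in for a real annotation source, the `annotate` function is hypothetical, and `GENE_X` is a made-up placeholder for a gene with no annotation.

```python
# Hand-written stand-in for a real annotation resource (assumption for illustration).
GENE_ANNOTATIONS = {
    "BRCA1": {
        "function": "DNA damage repair",
        "diseases": ["hereditary breast and ovarian cancer"],
    },
    "CFTR": {
        "function": "chloride ion channel",
        "diseases": ["cystic fibrosis"],
    },
    "TP53": {
        "function": "tumor suppressor",
        "diseases": ["Li-Fraumeni syndrome"],
    },
}


def annotate(candidate_genes):
    """Split a candidate gene list into annotated records and unannotated symbols."""
    annotated = {}
    unannotated = []
    for gene in candidate_genes:
        record = GENE_ANNOTATIONS.get(gene.upper())
        if record is None:
            # No annotation available: this is the bottleneck described above.
            unannotated.append(gene)
        else:
            annotated[gene] = record
    return annotated, unannotated


# "GENE_X" is a hypothetical placeholder symbol.
annotated, unannotated = annotate(["BRCA1", "CFTR", "GENE_X"])
for gene, record in annotated.items():
    print(gene, "|", record["function"], "|", ", ".join(record["diseases"]))
print("No annotation found for:", unannotated)
```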


Rehospitalization is a serious problem in medicine.

Medical aspects are complicated by end-of-life care issues as well as a regulatory environment in which hospitals can experience financial penalties for "excess" rehospitalization rates. Existing rehospitalization predictive models, most of which are based on administrative data, have poor statistical performance, as do models that employ limited physiologic data.

At Linguamatics' upcoming seminar in San Francisco, Dr. Escobar will present work on a new rehospitalization model that employs data from a comprehensive electronic medical record and which could be instantiated in real time.

He will also present a "road map" explaining how data from natural language processing can be incorporated into this model, and will discuss future strategies for instantiating NLP engines in routine clinical operations.

Dr. Escobar is a research scientist at the Kaiser Permanente Division of Research in Oakland as well as being the Regional Director for Hospital Operations Research for Kaiser Permanente Northern California.

An expert on risk adjustment and predictive modeling, Dr. Escobar has published over 130 peer-reviewed articles and is currently in the middle of deploying a real-time early warning system for deterioration outside the intensive care unit at two Kaiser Permanente hospitals.


IBM Watson gets a lot of attention in the medical field for trying to take capabilities that were demonstrated on the Jeopardy TV show and apply that cognitive reasoning to clinical care.

The complexities of disease, combined with the mass of medical literature and clinical guidelines, make this high-dimensional problem an appropriate challenge for an industrial powerhouse.

However, it should not be underestimated what can be achieved using sophisticated Natural Language Processing (NLP) for information retrieval in clinical decision support.

One of my favourite customer stories in recent years concerns our work with medical librarian Jonathan Hartmann from Dahlgren Memorial Library, the health sciences library at Georgetown University.

Jonathan’s role is to support the teams on the hospital’s paediatrics and internal medicine units on rounds at the Georgetown University Medical Center with access to the latest medical insights and publications relating to the current patient.

For example, should a patient with metastatic renal cell carcinoma be given warfarin (an anticoagulant) for stroke prevention? Using his iPad at the bedside, Jonathan was able to quickly find journal articles that indicated cancer treatments and potentially cancer spread can indeed increase the risk of stroke.

You can read more about the story here.

From a technical perspective the use of NLP in this scenario is well hidden, as it should be, and simply ensures that the right information is provided to assist in clinical decision making.