I2E Natural Language Processing advances research and care delivery by mining clinical insights from unstructured patient data

Cambridge, UK & Boston, USA – June 22nd, 2017 – Market-leading Natural Language Processing (NLP) text analytics provider Linguamatics today announced the implementation of the Linguamatics Health enterprise NLP platform, powered by I2E, at the University of Pennsylvania Health System for the extraction of actionable insights from unstructured patient data.

“We look forward to working with Penn Medicine to help them unlock valuable insights from clinical notes in order to advance research initiatives and enhance the delivery of care,” said Simon Beaulah, senior director of healthcare at Linguamatics. “Our growing community of academic medical centers across the country have deployed the Linguamatics Health platform, and are taking advantage of its ease of use, powerful NLP capabilities, rapid query development and successful integration with enterprise systems. Our platform is particularly well-suited for this environment because it empowers organizations to work independently, and get the data they want without requiring extensive services.”


How do you ensure your healthcare company outshines the competition with so many choices out there? There’s an app for that! Well, no, not yet; at least there wasn’t at the time I wrote this blog (I double-checked). There is, however, the National Committee for Quality Assurance (although it has no app, it does have a very informative Twitter account).

The committee’s mission is to help continually ensure quality in health care across all parties involved. For insurance companies, it uses the Healthcare Effectiveness Data and Information Set (HEDIS), as it is “one of the most widely used sets of health care performance measures in the United States.”[1] So rather than trying to compare two things that only sound similar, such as ‘pineapples to apples’, people now have a true method of payer comparison.


HEDIS consists of a set of measures around patient care and service. Measures range from simple documentation of an adult body mass index (BMI), a calculation involving only height and weight, to the more complicated documentation of comprehensive diabetes care.
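To illustrate how simple the BMI end of that range is, here is a minimal Python sketch of the standard BMI formula (weight in kilograms divided by the square of height in metres). The function name and the input check are illustrative assumptions; this is not how HEDIS or any particular platform documents the measure.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI: weight in kilograms divided by height in metres squared.

    Hypothetical helper for illustration only; not part of any HEDIS tooling.
    """
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)


if __name__ == "__main__":
    # Example: an adult weighing 70 kg who is 1.75 m tall.
    print(round(body_mass_index(70.0, 1.75), 1))  # prints 22.9
```

A measure such as comprehensive diabetes care, by contrast, cannot be reduced to a single formula like this.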


Pfizer improves Patent Search 10-fold with Linguamatics I2E

Intellectual property is critical in the drug discovery process. Before initiating any new project, it is important to understand the patent landscape around the disease area, check whether there is freedom to operate, and assess patentability. The business case for a project's commercial viability must cover not just the biology, such as “is there unmet medical need?”, but also “what is the IP position?”.

Streamlining patent research with natural language processing (NLP) text mining

So, scientists and researchers need to be able to access the information on genes and diseases in patents. But patents can be hundreds of pages long and contain complex constructions of information and interconnected facts. Manual patent research is a time-consuming and costly process. More and more pharma companies, such as Pfizer, are looking to NLP text mining to keep up to date with their patent literature.

Pfizer researchers use the Linguamatics Life Science Platform, powered by I2E, to find patents relating to specific diseases. The results feed a database to visualize gene targets, invention type, competitor organizations and overall patent “relevancy”.


Pentavere Research Group of Toronto, Canada, was developing a platform to provide health insights from Real-World Evidence (RWE). Pentavere’s aim is to improve healthcare efficiency by allowing life science companies and healthcare providers to understand the impact of clinical decisions made in the primary care setting.

The company’s proprietary platform, daRWEn™, uses digitized, de-identified, and aggregated health information, but much of the valuable data that it wanted to include was locked inside free-form text, making it difficult to extract. Pentavere soon realized that it needed to incorporate natural language processing (NLP) capabilities into its platform in order to access these RWE insights. To achieve this in a timely and efficient manner, it chose to integrate the Linguamatics I2E NLP solution into daRWEn™.

Why Linguamatics? There were several important factors, including:


Text Mining Platform I2E features in Best Practices Final and as a Best of Show Award Contender; Linguamatics CTO David Milward a Featured Speaker

Cambridge, UK & Boston, USA – May 22, 2017 – Leading Natural Language Processing (NLP) text analytics provider Linguamatics today announced plans to highlight the latest version of its text mining platform at this week’s Bio-IT World Conference & Expo in Boston. Bio-IT World has named Linguamatics I2E 5.0 a contender for the Best of Show Award, and Linguamatics’ customer Pentavere Research Group a Best Practices finalist.

The Best of Show Awards showcase exceptional innovation in technologies used by life science professionals. As a Best of Show Award contender, Linguamatics is also eligible for the Bio-IT World People’s Choice Award, chosen by votes from the Bio-IT World Community. Voting for the People’s Choice Award is open from 5 pm ET Tuesday May 23 through 1 pm ET on Wednesday May 24.

Bio-IT World also chose Linguamatics' customer Pentavere Research Group as a Best Practices finalist, based on their work using I2E to mine unstructured data for real-world evidence to improve health outcomes. Best Practices finalists are recognized for their outstanding examples of technology innovation, from basic R&D to translational medicine. Pentavere deployed I2E to effectively mine unstructured EHR data, expediting delivery of their product daRWEn™ to the Real World Evidence market.