As medicinal chemists strive to fill the pipeline with the best possible novel compounds, they require efficient access to the ever-expanding mass of existing information and knowledge about compounds, targets, and diseases and how they are related. Much of this information is buried in published journal articles, patents, reports, and internal document repositories. Posing chemical compound-, target-, and disease-centered questions to extract and organize the data in order to explore these relationships is laborious, time-consuming, and potentially error prone. Locating chemical structural information is especially challenging when chemicals in the literature are described by many different names: technical, trivial, proprietary, nonproprietary, generic, or trade names.
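To see why many-named chemicals complicate search, consider a minimal dictionary-based name normalization sketch. The synonym table below is purely illustrative (real chemically-aware tools use curated vocabularies and structure-based matching), and the compound names are examples, not drawn from the Artemis project.

```python
# Minimal sketch of dictionary-based chemical name normalization.
# The synonym table is illustrative, not a real vocabulary.
SYNONYMS = {
    "acetylsalicylic acid": "aspirin",   # technical name
    "2-acetoxybenzoic acid": "aspirin",  # systematic name
    "asa": "aspirin",                    # abbreviation
    "paracetamol": "acetaminophen",      # generic name (UK)
    "tylenol": "acetaminophen",          # trade name
}

def normalize(name: str) -> str:
    """Map any known synonym to a canonical compound name."""
    key = name.strip().lower()
    return SYNONYMS.get(key, key)

print(normalize("Acetylsalicylic Acid"))  # -> aspirin
print(normalize("Tylenol"))               # -> acetaminophen
```

With such a mapping, a query for one name can retrieve documents that mention any of its synonyms.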

Roche pRED decided to address this problem and equip their medicinal chemists with a chemically-aware text mining tool (Artemis) that would remove the need for manual searches and data-wrangling, and present the data in a user- and analytics-friendly environment for further exploration. Daniel Stoffler and Raul Rodriguez-Esteban of Roche presented this work in their talk "ARTEMIS - A Text Mining Tool for Chemists" at the Linguamatics Spring Text Mining Conference in 2017.


Linguamatics is pleased to congratulate US healthcare system Mercy on their recent award at the 12th Gateway to Innovation conference. Mercy won the Innovative IT Project of the Year Award for using the Linguamatics I2E Natural Language Processing (NLP) solution to extract clinical analytics insights from their Electronic Health Record (EHR) notes for cardiac patients.

Mercy Technical Services provides contract research services for medical device and pharmaceutical clients to support the use of real-world evidence (RWE) in Food and Drug Administration submissions. The award recognizes a project that demonstrates value or impact to the organization by solving a business problem or by addressing a specific strategic objective for the company.

NLP used to Extract Real World Evidence from EHRs

As a large health system with a mature and consolidated Epic EHR system, Mercy has a significant data set of patient treatments and outcomes. A multitude of information is documented in the EHR, such as lists of specific symptoms, diagnoses derived from echocardiogram reports, and certain benchmarking classifications. Since typically 80% of this information is unstructured text, many valuable clinical insights are unavailable in discrete fields, and vital patient information can remain inaccessible at the point of clinical decision-making.

NLP text mining platforms like Linguamatics I2E extract information from unstructured text-based EHRs and transform it into actionable insights that can be placed into a dataset and analyzed.
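As a simple illustration of this idea (not the I2E implementation, which uses far more sophisticated linguistic processing), the sketch below pulls two structured cardiac facts out of synthetic free-text note snippets and lands them in a tabular dataset. The notes, patterns, and field names are all hypothetical.

```python
import re

# Toy EHR note snippets (synthetic, not real patient data).
notes = [
    "Patient reports dyspnea on exertion. Echo shows EF 35%. NYHA class III.",
    "No chest pain. Echocardiogram: EF 55%. NYHA class I.",
]

# Simple patterns for two clinical facts buried in free text:
# ejection fraction and NYHA heart failure class.
ef_pat = re.compile(r"EF\s*(\d{1,2})\s*%")
nyha_pat = re.compile(r"NYHA class\s*(IV|I{1,3})")

rows = []
for i, note in enumerate(notes, start=1):
    ef = ef_pat.search(note)
    nyha = nyha_pat.search(note)
    rows.append({
        "note_id": i,
        "ejection_fraction": int(ef.group(1)) if ef else None,
        "nyha_class": nyha.group(1) if nyha else None,
    })

# The extracted facts now live in a tabular dataset ready for analysis.
for row in rows:
    print(row)
```

Once the facts are in discrete fields, standard analytics (cohort selection, outcome comparisons) can be applied to what was previously narrative text.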


Collaboration addresses major bottlenecks in prior authorization and medical review

Cambridge, England and ROCKVILLE, Md. — June 26th, 2018 — Linguamatics, the leading natural language processing (NLP) text analytics provider, and Secure Exchange Solutions (SES), a market leader in enabling the secure exchange of health information, today announced the selection of Linguamatics Health as the NLP platform for SES SPOT, a solution that, when combined with SES Fetch, streamlines clinical information exchange and automates the review process.

Inefficient medical review processes (either before or after submitting claims) are major contributors to rising costs in healthcare systems. SES SPOT was developed to evaluate clinical information so that patients receive appropriate care rapidly, helping to control costs and improve outcomes while reducing manual effort for providers and for public and private health plans, and offering the opportunity to save substantial time and money.

SES SPOT includes Linguamatics I2E, providing Artificial Intelligence (AI) capabilities to extract information from both free text and codified data in an electronic medical record, compare the extracted data with guidelines, and return evidence, recommendations and an audit trail to automate or semi-automate the approval of claims.
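The extract-compare-decide loop can be sketched in a few lines. This is a hedged illustration only, not the SES SPOT implementation: the field names, the guideline rule, and the thresholds are all hypothetical.

```python
# Illustrative sketch: compare extracted clinical facts against a guideline
# rule and return a decision with an audit trail. Field names and the
# guideline itself are hypothetical, not from SES SPOT.
def review_request(extracted: dict) -> dict:
    evidence = []
    # Hypothetical guideline: advanced imaging requires 6+ weeks of
    # conservative therapy unless a red-flag finding is documented.
    if extracted.get("red_flag"):
        evidence.append("Red-flag finding documented: guideline met.")
        decision = "approved"
    elif extracted.get("conservative_therapy_weeks", 0) >= 6:
        evidence.append("Conservative therapy >= 6 weeks: guideline met.")
        decision = "approved"
    else:
        evidence.append("Guideline criteria not met: route to manual review.")
        decision = "manual_review"
    # The evidence list doubles as an audit trail for the decision.
    return {"decision": decision, "audit_trail": evidence}

print(review_request({"conservative_therapy_weeks": 8}))
print(review_request({}))
```

The key design point is that every automated decision carries its supporting evidence, so a human reviewer can always see why a request was approved or deferred.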


Innovative Artificial Intelligence and Machine Learning Technologies Can Improve Pharma R&D, Reduce Costs and Benefit Patients

The Pharma industry is constantly searching for more effective, more efficient tools and technologies to improve the drug discovery process. The statistics are well-known, and make gloomy reading: it takes 10-15 years to develop a new drug, at a cost of up to $1 billion. There is currently vigorous discussion over whether new tools and technologies can significantly impact these metrics. Big data, blockchain, artificial intelligence (AI) and machine learning (ML) are much talked-about as holding the key to digital transformation of drug discovery.

At the Bio-IT World Conference & Expo last month, many of these themes were explored. Across the dozen or so session tracks, there were talks and workshops to share information and best practices on how scientists in biotech, pharma, academic institutes and vendor companies are applying AI and ML to a variety of use cases, such as models for adaptive clinical trials, imaging analytics (e.g. for pathology or clinical sample data), lead design, QSAR, analyzing data streams from mobile monitoring devices, and more. Some snippets from the talks include:


Exome Sequencing in Rare Disease Research

Exome sequencing has become a very common tool in research on rare genetic diseases. The starting point is usually a family in which several members share the same symptoms of an uncharacterized disease, presumably caused by a genetic factor. Once the exomes of a few affected individuals and their healthy parents are sequenced, the data is ready to be analyzed, with the aim of finding the variant responsible for the disease phenotype. Many robust analysis tools and pipelines have been developed and adopted over the last decade or so. A typical analysis includes quality control filtration, alignment and variant calling, which eventually yields a list of candidate variants (from tens to many hundreds), which are then filtered to keep only the relevant ones.

Filtering Candidate Variants Demands Evidence

The next step, unravelling any significant biological associations for these gene variants, can be challenging.

Many approaches and criteria have been applied to this variant-filtering step, and different tools and software packages implement them. For example, variants whose genotype is inconsistent with the disease phenotype across samples are excluded, as are variants that represent normal variability in the population regardless of disease (e.g. common SNPs). Predicted significant alteration of protein structure or function based on the sequence variation is also often used as a criterion, as is actual evidence of clinical effects (using tools such as PolyPhen and ClinVar, respectively).
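The criteria above can be combined into a simple filter. This is a hedged sketch: the variant records, score names, and thresholds below are illustrative stand-ins, not the output format or cutoffs of any specific tool.

```python
# Illustrative variant filter combining the criteria described above.
# Records, score names, and thresholds are hypothetical.
variants = [
    # Common population variant: should be excluded.
    {"gene": "GENE_A", "pop_freq": 0.30,   "segregates": True,  "damage_score": 0.10, "clinvar": None},
    # Rare, segregates with disease, predicted damaging and reported pathogenic: keep.
    {"gene": "GENE_B", "pop_freq": 0.0001, "segregates": True,  "damage_score": 0.97, "clinvar": "pathogenic"},
    # Genotype inconsistent with phenotype across samples: exclude.
    {"gene": "GENE_C", "pop_freq": 0.0002, "segregates": False, "damage_score": 0.95, "clinvar": None},
]

def keep(v: dict) -> bool:
    # Exclude common population variants (e.g. known SNPs).
    if v["pop_freq"] > 0.01:
        return False
    # Exclude variants that do not segregate with the phenotype.
    if not v["segregates"]:
        return False
    # Require predicted protein damage or reported clinical effect.
    return v["damage_score"] >= 0.9 or v["clinvar"] == "pathogenic"

candidates = [v["gene"] for v in variants if keep(v)]
print(candidates)  # only GENE_B survives all three filters
```

Real pipelines apply these filters over VCF files with annotation databases, but the logical structure, a conjunction of exclusion and evidence criteria, is the same.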