Pentavere Research Group of Toronto, Canada, was developing a platform to provide health insights from Real-World Evidence (RWE). Pentavere’s aim is to improve healthcare efficiency by allowing life science companies and healthcare providers to understand the impact of clinical decisions made in the primary care setting.

The company’s proprietary platform, daRWEn™, uses digitized, de-identified, and aggregated health information, but much of the valuable data that it wanted to include was locked inside free-form text, making it difficult to extract. Pentavere soon realized that it needed to incorporate natural language processing (NLP) capabilities into its platform in order to access these RWE insights. To achieve this in a timely and efficient manner, it chose to integrate the Linguamatics I2E NLP solution into daRWEn™.

Why Linguamatics? There were several important factors, including:


There’s a lot of buzz in the healthcare community at the moment surrounding the use of artificial intelligence with machine learning for pattern identification, decision-making, and outcome prediction. The availability of high-quality data for training algorithms is vital to machine learning’s success - but a lot of this information is tied up in unstructured clinical notes. Natural language processing (NLP) is the key to extracting the “good stuff” from this vast trove of unstructured text. Combining that “good stuff” with already structured data helps healthcare providers to understand the patterns and trends in data via machine learning - and thereby enhance care, reduce costs, and improve population health.

Which type of NLP software is best?

The first question that healthcare users must ask themselves is “Which type of NLP software best suits my needs?”

Statistical NLP systems require example data to identify patterns in new data. The examples may come from dictionaries or ontologies, or they might need to be manually annotated by a clinician, which can be an extremely laborious task and a costly one for the institution.

Meanwhile, most rule-based NLP systems require a specialist to define the language rules or patterns that represent certain healthcare concepts. This approach can make them more accurate, but they will be limited to the patterns that the specialist has thought of.
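To make the limitation of rule-based systems concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular NLP product works: the rule, the `matches_rule` helper, and the sample notes are all hypothetical. The rule encodes a few phrasings a specialist might anticipate for "patient smokes", and the third note expresses the same concept in wording the rule does not cover.

```python
import re

# Hypothetical rule: phrasings a specialist anticipated for "patient smokes".
SMOKER_RULE = re.compile(
    r"\b(current smoker|smokes \d+ (?:cigarettes|packs)|tobacco use)\b",
    re.IGNORECASE,
)


def matches_rule(note: str) -> bool:
    """Return True if the note contains one of the anticipated phrasings."""
    return SMOKER_RULE.search(note) is not None


# Made-up clinical note fragments, for illustration only.
notes = [
    "Patient is a current smoker.",
    "Smokes 10 cigarettes daily.",
    "Pt admits to a pack-a-day habit.",  # same concept, unanticipated wording
]

results = [matches_rule(note) for note in notes]
for note, hit in zip(notes, results):
    print(f"{hit!s:5}  {note}")
```

The first two notes match because their wording was anticipated. The third is missed even though a human reader would treat it the same way, which is exactly the gap described above.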


HIMSS 17

Information Technology AND Healthcare? Why on Earth would you combine such incompatible career fields?

I can’t tell you how many times I was questioned about this in the past. Early on in my career, no one ever told me that combining my Computer Operations training in the Air Force with my decision to pursue medicine was actually a good idea. In fact, it was quite the opposite. And yet - this year I can give about 45,000 more reasons (the number of attendees at HIMSS 2017 [1]) why that path led to a promising merged career field after all.

The “missing link” career - people divided by a common career field.


Risk stratification has, so far, been biased toward structured data due to accessibility issues. As long-term member wellness increases in importance, it is the insights trapped in unstructured data that will become the differentiator in a changing and competitive market. The payers who are able to characterize member groups at a fundamentally more detailed level will have the advantage of population insight over those who struggle to do so.

Data sources that are increasing in scale and availability include electronic health record (EHR) data in Continuity of Care Document (CCD) format from providers, OCR notes about members, and nurses’ notes.

How can payers make effective use of unstructured data to stratify populations more effectively when much of their infrastructure is tied to structured data? Sources of unstructured data contain significantly more detail about members but are much more varied.

Here at Linguamatics Health, our Clinical NLP specialists understand the urgency and complexity of bringing together data sources, both structured and unstructured, in a workflow that quickly gets you to the insights you need.


Ever find an acute problem, such as a fracture, that still shows on the Problem List but healed months ago? Or perhaps the Problem List shows a case of bronchitis that may have been transient or may actually be Chronic Obstructive Pulmonary Disease (COPD)? After all, a diagnosis of COPD rests on a combination of symptoms and test results. How many clinicians find the spare time to go back through the EHR and work out whether a patient has been “coughing with excessive sputum nearly everyday for at least 3 months of the year, for 2 years in a row” [1]?

But fixing the problem list can be time-consuming and complicated. Isn’t there an alternative (better) way?

Many organizations believe that, for deriving an accurate picture of their population’s health, medication lists can be just as good as Problem Lists. What if you find a patient taking an atypical antipsychotic medication who has no corresponding diagnosis on their Problem List? Can we just assume a mental health diagnosis? After all, this conclusion seems logical. Or is it? Is it an oversight on their Problem List, or is the medication prescribed for an off-label reason? A 2011 report from the Agency for Healthcare Research and Quality (AHRQ) described off-label uses of atypical antipsychotic medications, including anxiety, ADHD, behavioral disturbances of dementia and severe geriatric agitation, MDD, eating disorders, insomnia, OCD, PTSD, personality disorders, substance abuse, and Tourette's syndrome [2].

Therefore, can we really make assumptions?
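One way to picture the safer alternative is a check that flags the mismatch for review instead of inferring a diagnosis. The Python sketch below is a hedged illustration only. The drug set lists a few generic atypical antipsychotic names and is not exhaustive, and the `SUPPORTING_DIAGNOSES` set and the `review_status` function are hypothetical names invented for this example, not part of any real system.

```python
# A few generic atypical antipsychotic names (illustrative, not exhaustive).
ATYPICAL_ANTIPSYCHOTICS = {"quetiapine", "olanzapine", "risperidone", "aripiprazole"}

# Hypothetical Problem List terms that would account for such a prescription.
SUPPORTING_DIAGNOSES = {"schizophrenia", "bipolar disorder"}


def review_status(medications, problem_list):
    """Classify a patient record without assuming an undocumented diagnosis."""
    meds = {m.lower() for m in medications}
    problems = {p.lower() for p in problem_list}
    if not meds & ATYPICAL_ANTIPSYCHOTICS:
        return "no atypical antipsychotic"
    if problems & SUPPORTING_DIAGNOSES:
        return "diagnosis documented"
    # Could be a Problem List oversight or an off-label use: a person decides.
    return "needs review"


print(review_status(["Quetiapine"], ["Insomnia"]))
print(review_status(["Olanzapine"], ["Schizophrenia"]))
```

The point of the design is the third outcome: a medication without a matching diagnosis is routed to a reviewer, because the record alone cannot distinguish an oversight from an off-label prescription.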