The transition to new value-based payment models is spurring provider demand for technologies that enhance patient care and minimize safety risks, and in turn reduce costs. Of particular interest are tools to help providers predict the likelihood of potentially avoidable outcomes, such as a hospital readmission, pulmonary nodules turning cancerous or the contraction of sepsis.

According to a recent Linguamatics survey, most hospital CMIOs support the use of predictive models to improve the quality of care. In addition, CMIOs believe that these models can be enhanced with the use of Natural Language Processing (NLP) to access insightful data from unstructured chart notes.

Important Applications of Clinical NLP

The advent of accountable care, meaningful use, and the triple aim is creating an unprecedented demand for insightful patient data. Though structured data reveals valuable information, some 80% of EHR data resides in an unstructured narrative format. Furthermore, of the 1.2 billion clinical documents produced in the US each year, 60% of the valuable information exists in unstructured narrative documents that are largely inaccessible for data mining and quality measurement.

To gain better insight into patient data, providers might be inclined to expand their use of templates to capture discrete observations. Unfortunately, when purely coded templates take the place of free-text narratives, the resulting documentation often fails to capture subtle circumstances of a patient’s story. Frequently the patient narrative is the most effective means of communicating detailed information between healthcare professionals.

What alternatives do providers have for preserving the patient narrative while at the same time gaining additional insights from a patient’s complete medical record? One option is to tap into the power of Natural Language Processing (NLP) technology.

It was great to see our paper on the i2b2 NLP challenge from last year published recently. The challenge looked at extraction of Coronary Artery Disease risk factors from unstructured patient data provided by the Research Patient Data Repository of Partners Healthcare. We had worked through previous i2b2 challenges, such as smoking cessation, only after those competitions had closed, so this time we wanted to actively participate in the 2014 NLP challenge and see how we compared against other NLP groups in the competition. Linguamatics work with many academic medical centers and cancer centers and view collaboration as a key component of our customer relationships. As such, we wanted to share our success or failure with our peers and show how a commercial system can tackle these areas.

The i2b2 training set consisted of 790 annotated documents relating to 178 patients, which we decided to divide into training (70%) and development (30%) sets. The test set contained 514 documents from 118 patients. Contestants were set this task: extract CAD risk factors such as specific diseases (e.g. diabetes), medications, family history of CAD and lab results; also take into account when tests were carried out or whether a disease diagnosis was in the past or current.
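As a rough illustration of the 70/30 division described above, here is a minimal Python sketch. It is not the code used for the challenge. The post does not say whether the split was made by document or by patient; this sketch assumes a patient-level split, so that all documents for one patient land in the same set. The function name `split_by_patient`, the random seed, and the document and patient IDs are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_patient(doc_patient_pairs, train_fraction=0.7, seed=0):
    """Split documents into training and development sets.

    Assumption: the split is made per patient, so no patient has
    documents in both sets.
    """
    # Group document IDs under the patient they belong to.
    docs_by_patient = defaultdict(list)
    for doc_id, patient_id in doc_patient_pairs:
        docs_by_patient[patient_id].append(doc_id)

    # Shuffle patients reproducibly, then cut at the requested fraction.
    patients = sorted(docs_by_patient)
    random.Random(seed).shuffle(patients)
    n_train = round(len(patients) * train_fraction)

    train_docs = [d for p in patients[:n_train] for d in docs_by_patient[p]]
    dev_docs = [d for p in patients[n_train:] for d in docs_by_patient[p]]
    return train_docs, dev_docs


# Hypothetical IDs shaped like the training set: 790 documents, 178 patients.
pairs = [(f"doc{i}", f"patient{i % 178}") for i in range(790)]
train_docs, dev_docs = split_by_patient(pairs)
print(len(train_docs), len(dev_docs))
```

Splitting by patient means the document counts will only approximate 70% and 30%, because patients have different numbers of documents. A document-level split would hit the percentages exactly but could place notes from the same patient on both sides.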

Our team’s results were excellent and, at 91.7% Micro F-Score, were competitive with the best system in this challenge. I2E, being a rule-based system, was well suited to the challenge compared to machine learning systems because:

Interest in big data in healthcare is expanding rapidly with the explosion in genomic data and adoption of electronic health records (EHR) resulting from the Affordable Care Act.

This data holds the promise of improved insights into patient outcomes, treatment effectiveness, patient satisfaction and population risk, which is why it is receiving so much attention.

Considerable focus is on how to integrate structured data within your organization, for example, to gain insights from lab data and disease coding, but this is just the first step.

A large proportion of healthcare data is still in an unstructured format, represented as documents, reports and images that hold significant levels of detailed data on patients that is not captured, or is poorly captured, in structured data. This unstructured text from pathology, radiology and patient narratives captures the entire patient journey and is critical to understanding patient populations, assessing clinical risk and providing a better understanding of disease.

However, the format of the data poses significant challenges to its application and often results in laborious manual extraction to turn it into structured disease codes or specific data sets such as cancer registries. These manual processes are not scalable for the level of discrete data required for analytics and outcomes analysis, but how can this be addressed?

Natural Language Processing (NLP) is a hot topic in healthcare.

At this year's AMIA Annual Symposium in Washington DC, we brought the discussion on clinical NLP to a roundtable held on Monday lunchtime and were also invited to the AMIA NLP workgroup to present some real-life use cases in clinical NLP.

However, as much as we like sharing what we're doing, we were keen to know what other people think when it comes to how NLP can transform patient care, today and in the future.

So that’s what we did – we asked peers at the AMIA conference that question (How can NLP transform patient care?) as part of a contest with an incentive of an iPad Mini and $50 Starbucks voucher for 1st and 2nd place respectively.

More than a third of entries identified mining the unstructured, free text narrative of a medical record to be crucial to the transformation of patient care. Unsurprising really, if you consider that around 80% of data in an electronic health record is unstructured and the only real way to get this information into a useable format is using NLP.

But what was interesting was the difference in how to use this data. Ideas included providing better patient information, using the extracted coded concepts in clinical decision support, and retrieving full patient cohorts.

It was a tough contest to judge, but the winning entry came from Edgar Chou at Drexel University College of Medicine. He had a few ideas, but the one we thought was most interesting, with the potential to have the greatest impact on patient care, was around the payer care mix.