NLP and making sense of data in a precision medicine world

July 21 2016

During his January 2015 State of the Union speech, President Obama announced details of his administration’s Precision Medicine Initiative, which promises to accelerate the development of tools and therapies that are customized to individual patients. Precision medicine focuses on disease treatment and prevention and considers the variability in genes, environment, and lifestyle between individual patients.

Precision medicine takes into account healthcare’s relatively minor role in impacting a patient’s overall health and well-being, compared to the larger roles of genetics, health behaviors, and social and environmental factors. The precision medicine approach thus requires that providers have access to a wealth of patient-specific data. Thanks to advancements in genetic testing and new technologies, such as patient portals and remote monitoring devices, a wide variety of patient data is now readily available. Unfortunately, clinicians may have difficulty extracting data that is clinically relevant because much of the information is stored in an unstructured format.

Consider how a physician would glean information from a paper medical chart prior to EMRs. To understand a patient’s complete health status, the doctor would search through pages and pages of notes - obviously a time-consuming and error-prone task.

With EMRs, a physician can review coded disease lists and summary dashboards to get a high-level overview of a patient’s health. In the precision medicine world, however, clinicians need a fuller picture of a patient’s health, including insights into potential health risk factors, lifestyle behaviors, and social history.

For example, a broad disease code would allow a provider to identify a patient with chronic obstructive pulmonary disease (COPD). However, in order to customize the patient’s therapy, a physician must also understand the patient’s genetic makeup, environment, and lifestyle factors. Precision medicine requires insight into such factors as the patient’s on-going exposure to secondhand smoke, previous exposure to environmental pollutants, and genetic predispositions. These details are not easily captured in a structured format, and are therefore not easily extracted. Not unlike the physician using paper charts, a provider would have to read through a patient’s complete electronic chart to discern the most relevant details. Alternatively, by utilizing an advanced technology such as Natural Language Processing (NLP), the clinician could quickly uncover these nuances and do so on a much wider scale.

The healthcare industry has leveraged NLP technologies for many years, though today’s NLP solutions are far more powerful than earlier applications that were primarily designed for research and computer-aided coding. Unlike early NLP systems, the latest NLP tools enable open and flexible development of queries and are not as reliant on expensive, clinically-annotated data sets.

In a precision medicine environment, NLP can filter and extract clinically-relevant information from unstructured patient notes and do so in a timely manner without the need for manual review. For example, providers can leverage NLP to extract discrete values of Left Ventricle Ejection Fraction from progress notes, or to identify a patient’s cancer stage from a pathology report. While EHRs commonly have fields for such clinically-relevant details, records may include major gaps because data is not consistently entered.

NLP also enables providers to analyze unstructured patient data in real time to support clinical decision-making. By deploying sophisticated predictive clinical models, providers can connect a patient’s health status with environmental and lifestyle factors to estimate a patient’s likelihood of 30-day hospital readmission or medication non-compliance. Providers can then proactively take measures to adjust a patient’s therapies based on their support network, ambulatory status, and living conditions.

As the Precision Medicine Initiative continues to gain steam, NLP will be an essential tool to help providers make sense of the growing volume of relevant data and advance their efforts to achieve optimal patient health and outcomes.