Population Health Management and Analytics

To enhance population health management and analytics and identify the needs of patients, healthcare companies must extract insights from the unstructured text in EHRs. NLP is the ideal technology to extract insights from this data.

  1. Evolution of Healthcare Payment Models
  2. Challenges of Real-Time & Big Data for Population Health
  3. Population Health Management and NLP
  4. Advantages of Linguamatics NLP in Population Health

Population Health Management and Analytics

Modern population health management is the culmination of several initiatives over the past thirty years, starting with evidence-based medicine in the 1990s. Patient-centered care and value-based medicine built on this approach by adding the quality and value of medical intervention from a patient’s perspective. More recently, precision medicine has shifted the focus still further by connecting genetic details with the environmental and lifestyle factors that affect the health of individuals.

Healthcare payment models increasingly reflect this emphasis on quality and value, fueling demand for more comprehensive insights into population health. As a result, the provider and payer markets within healthcare are in the midst of a huge transformation, with 60% of commercial plans now linking payments to value.

Healthcare organizations have traditionally relied on the structured data in Electronic Health Records (EHRs) and insurance claims to analyze the health of patient populations or make clinical decisions. Structured data is valuable, but an estimated 70% of the clinical data stored in EHRs is in "unstructured" form and therefore difficult to analyze. However, this unstructured data contains a wealth of clinical information:

  • Encounter based: clinician narratives and nurse notes
  • Procedure/situation reports: pathology, radiology and discharge reports
  • Patient narratives: patient-reported information (PRI) and patient-reported outcomes (PRO).

To enhance population health analytics and identify the care needs of individuals, providers and payers must extract insights from the data stored in this unstructured text. Social factors, lifestyle choices and living conditions also play a major part in assessing the clinical risk for populations and individuals. In addition, ensuring that the patient’s problem list is consistent with their notes is vital when managing complex disease comorbidities.

The demand for population health management and real-time patient surveillance is increasing. To unlock value from unstructured text, organizations will require advanced Natural Language Processing (NLP) technologies such as the Linguamatics NLP platform.

The Evolution of Healthcare Payment Models


In the past, healthcare has relied upon fee-for-service compensation models to pay providers. Today, the government and private payers are shifting to alternative pay-for-value models that offer those same providers financial incentives for proactively monitoring the health of their patients, achieving quality clinical outcomes and controlling the cost of care.

To meet their quality and performance objectives, providers must now analyze vast amounts of population health data. With more access to the data in Electronic Health Records (EHR), payers are also offering care coordination services to improve the overall health of their members. As with providers, payers are targeting high-risk populations and extending opportunities for education and support.

Predictive Modeling in Healthcare Populations

Prediction and prevention go hand-in-hand in effective population health management. However, before healthcare organizations can implement pre-emptive care programs, they must first identify the relative risk of patient populations based on a variety of clinical, socio-economic and lifestyle factors.

As healthcare organizations develop more sophisticated capabilities, they can move from basic descriptive analytics towards predictive analytics, in which payers and providers estimate the likelihood of a future outcome based on patterns in the historical data. With this approach, clinical, financial and administrative staff can receive alerts about potential events before they occur and make more informed decisions about how to proceed with treatment.

Organizations that can identify individuals with elevated risks of developing chronic conditions as early in the disease’s progression as possible have the best chance of helping patients avoid long-term health problems that are costly and difficult to treat.

For example, healthcare populations often include a small percentage of highest-risk patients. This percentage often accounts for the largest percentage of healthcare costs.

Once healthcare organizations have stratified populations based on their relative health, they can start evidence-based care plans to improve outcomes for the at-risk individuals and install preventive programs for the healthier patients.
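
The stratification step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a validated risk model: the `stratify` function, the tier names and the 5% and 20% cut-offs are all assumptions chosen for the example, and the risk scores are synthetic.

```python
from math import ceil

def stratify(risk_scores, high_pct=5, rising_pct=20):
    """Assign each patient to a tier by rank of a (hypothetical) risk score.

    risk_scores: dict mapping patient ID -> numeric score (higher = riskier).
    high_pct / rising_pct: assumed percentages of the population, not
    clinical standards.
    """
    ranked = sorted(risk_scores, key=risk_scores.get, reverse=True)
    n = len(ranked)
    n_high = ceil(n * high_pct / 100)
    n_rising = ceil(n * rising_pct / 100)
    tiers = {}
    for position, patient_id in enumerate(ranked):
        if position < n_high:
            tiers[patient_id] = "high"      # candidates for intensive care plans
        elif position < n_high + n_rising:
            tiers[patient_id] = "rising"    # candidates for early intervention
        else:
            tiers[patient_id] = "low"       # candidates for preventive programs
    return tiers

# 20 synthetic patients with scores 1..20
scores = {f"P{i:02d}": i for i in range(1, 21)}
tiers = stratify(scores)
```

In practice the score itself would come from a model that combines structured data with features extracted from unstructured text; the ranking logic here only shows how tiers might be assigned once such a score exists.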

By following best-practice protocols, payers and providers can help patients avoid costly complications and hospitalization. Analysis of population health data also helps organizations to assess how a particular value-based care plan will impact their bottom line.

Challenges to Population Health Analytics

For both providers and payers, population health analytics is often made more difficult by the heterogeneous nature of patient-related data.

The ability to automatically extract precise data from unstructured text is invaluable for organizations participating in value-based payment models. By leveraging NLP, providers can look at both the structured and unstructured data for a complete patient population outlook. They can then identify and extract specific details to assess risk or improve overall population health management.

Similarly, they can assess critical details on individual patients related to lifestyle choices such as smoking and alcohol consumption, and gain insights into a patient’s living arrangements, access to care and mobility status.

Healthcare is moving away from a reimbursement model that rewards procedures to one that rewards quality and outcomes. No longer will health care be about how many patients you can see, how many tests and procedures you can order, or how much you can charge for these things. Instead, it will be about costs and patient outcomes: quicker recoveries, fewer readmissions, lower infection rates, and fewer medical errors, to name a few. In other words, it will be about value.

Toby Cosgrove, Harvard Business Review.

The Challenges of Real-Time and Big Data for Population Health

Traditionally, healthcare organizations have relied upon Electronic Health Records (EHR) and claims data to analyze patient populations and the health of individuals. Claims data and EHRs have been an adequate source of data for population health analytics in the past. However, the demand for detailed, actionable information has escalated.

Patient Engagement Portals

Today, patient engagement portals such as Accenture’s Intelligent Patient, Allscripts FollowMyHealth, Epic’s MyChart and Athenahealth offer a host of web-based tools allowing patients to play an active role in their own healthcare. The data available to patients may include their lab results, physician notes, discharge summaries, immunizations and overall health history.

Research shows that when patients are able to see their own health data, they take ownership of their health and are better prepared to interact with their providers about their care. In addition, as patient engagement and interaction with their own medical records grows, there will be more electronic communication about progress, lifestyle changes, medication adherence and adverse events.

New sources of patient insights are growing, but often patient-reported information is in an unstructured format.

The Role of Natural Language Processing (NLP)

Providers can use NLP to review this new patient-reported data, and gain insights on everything from mental state and fall risk to firearms access. Payers can leverage NLP to analyze member-supplied data, including sources such as online chats between patients and nurses. NLP can even be used to review social media posts and provide relevant insights about exercise routines, diet and social behavior.

At the same time, the volume of available data is increasing and so is the need to analyze unstructured patient data in real-time. By deploying sophisticated, predictive clinical models, providers can identify which patients are at higher risk for medication non-compliance or 30-day hospital re-admission. Social determinants of health (SDoH) such as details on a patient’s social support network, ambulatory status and living conditions can be mined from discharge summaries and analyzed alongside relevant lab and diagnostic details.

Established technology for mining healthcare information is focused on structured data. So, NLP and associated results must integrate with these systems for actionable insights. Source documents are in an EHR, file system, data warehouse or Hadoop data lake and these systems are often the destination for NLP results. A data warehouse is often the single source of truth for population stratification algorithms running in R, Python or SAS. There is also expanding use of Cloudera HIVE as an analytics environment as Big Data tools continue to be applied in this area.

Source documents need to be loaded into an NLP engine for analysis, either at population scale or in real time as new patient-related documents are generated. This means that NLP systems must support both large-scale batch processing and real-time analysis. Web Service APIs are key to this type of integration and should ideally fit a Service Oriented Architecture, providing flexibility, failover and recovery capabilities in production environments.

The output from the NLP engine is usually loaded into a data warehouse via standard ETL processes and aligned with existing structured data via MRN or other patient/member identifiers. Once the structured and unstructured sources are combined, models can be developed and run against all the relevant features.
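
The alignment step described above can be illustrated with a small, standard-library-only sketch. The record layout, the field names (`mrn`, `feature`, `value`) and the `merge_by_mrn` helper are hypothetical and chosen for the example; a production pipeline would normally do this inside the warehouse's ETL tooling rather than in application code.

```python
def merge_by_mrn(structured_rows, nlp_rows):
    """Combine structured records with NLP-derived features on a shared MRN."""
    # Start from the structured data, keyed by MRN.
    merged = {row["mrn"]: dict(row) for row in structured_rows}
    for row in nlp_rows:
        # NLP output may hold several findings per patient; fold each one in.
        # Patients seen only in the NLP output still get a record.
        record = merged.setdefault(row["mrn"], {"mrn": row["mrn"]})
        record[row["feature"]] = row["value"]
    return merged

# Synthetic data for illustration only
structured = [
    {"mrn": "1001", "age": 74, "sex": "M"},
    {"mrn": "1002", "age": 58, "sex": "F"},
]
nlp_output = [
    {"mrn": "1001", "feature": "ejection_fraction", "value": 50},
    {"mrn": "1001", "feature": "smoking_status", "value": "non-smoker"},
    {"mrn": "1003", "feature": "lives_alone", "value": True},
]
combined = merge_by_mrn(structured, nlp_output)
```

Each merged record then carries both the structured fields and the extracted features, which is the form a stratification or predictive model would consume.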

Population Health Management and Natural Language Processing

NLP technologies are used to extract structured information from unstructured patient-related documentation. For example, providers can leverage NLP to extract discrete values of left ventricular ejection fraction from an echocardiogram, or a patient’s cancer stage from a pathology report.
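
To make the idea of extracting a discrete value concrete, here is a deliberately simple sketch using a regular expression. It is not how Linguamatics NLP works and is far less robust than a real NLP engine (it ignores negation, context and most phrasing variants); the pattern and the `extract_ef` helper are assumptions made for illustration.

```python
import re

# Simple pattern: "LVEF", "EF" or "ejection fraction", up to 30 characters of
# filler text, then a percentage or a range such as "55-60%".
EF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection fraction)\b[^0-9%]{0,30}?"
    r"(\d{1,2})(?:\s*(?:-|to)\s*(\d{1,2}))?\s*%",
    re.IGNORECASE,
)

def extract_ef(text):
    """Return (low, high) percentages for the first EF mention, or None."""
    match = EF_PATTERN.search(text)
    if match is None:
        return None
    low = int(match.group(1))
    # A single value such as "35%" is reported as the range (35, 35).
    high = int(match.group(2)) if match.group(2) else low
    return low, high

report = "The left ventricular ejection fraction is estimated at 55-60%."
ef_range = extract_ef(report)
```

The returned pair can be stored as two numeric columns, which is the kind of structured field that downstream analytics can filter and aggregate.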

While Electronic Health Records (EHR) have fields for such clinically relevant details, records may include major gaps as data is not consistently entered, which impacts clinical care and outcomes analysis.

Applying NLP enables the capture of information from unstructured patient data in a timely manner and facilitates its use for analytical purposes. Unlike earlier systems, the latest NLP tools such as Linguamatics NLP enable open and flexible development of queries, and are not as reliant on expensive data sets manually annotated by clinicians. Interest in this field is expanding as noted in the recent KLAS report: Natural language processing: Glimpses into the future of unstructured data mining (April 2016).

Broadening the Scope of Population Health Analytics

Population health is about people and the ways in which they are both unique and the same. To get a full picture of an individual's health, you need more data than is required to analyze a patient’s current clinical status.

Only 20% of an individual’s health status is associated with their clinical care; other major factors that contribute to their status include health behaviors, social and economic factors and physical environment. Lifestyle choices such as tobacco, alcohol and drug use can all be extracted from unstructured text using NLP, as can sexual activity, diet and exercise.

360 degree view of patients from structured and unstructured information

Unlocking Insights within Electronic Health Records

Third-party organizations can provide reporting on economic factors impacting population health, but providers and payers are also using NLP to mine their own internal data for Social Determinants of Health such as social isolation, food insecurity and ambulatory status.

NLP can also surface further relevant information, such as a patient’s environment, housing and language preferences. Because it is able to unlock critical details from unstructured text, NLP is a powerful tool for organizations as they manage the health of their patient populations.

Case Studies

Case Study: Identifying Drug and Lifestyle Conflicts

NLP's ability to analyze unstructured data enables both payers and providers to build a more complete picture of each patient.

For example, using traditional (structured) claims data, an individual patient might be categorized using only the following information:

patient with missing information

  • Age: 74
  • Gender: Male
  • Suffered a heart attack and pacemaker fitted
  • Hospitalized with DVT
  • Plavix

Information in the claims data shows that the patient was hospitalized with DVT and has been prescribed Plavix, a blood-thinning agent. However, it does not list any aspects of the patient's health or lifestyle that may conflict with this prescription.

Using NLP it's possible to analyze unstructured text in the patient's clinical notes and patient reported information. This data yields three items of information:

Patient full picture using unstructured and structured data

  • the patient's use of fish oil supplements
  • red wine consumption
  • wife recently deceased

The first two items are relevant to the prescription of a blood-thinning drug like Plavix. The last is of major concern and would flag the person as potentially needing more support. These insights and others shown below are extracted using NLP and provide a deeper understanding of the person to improve their care.

From his Clinical Notes:

  • Ejection fraction: 50
  • BMI: 22
  • A1C: 6
  • No shortness of breath
  • Takes fish oil supplements

Social and Lifestyle Data:

  • Non-smoker
  • Red wine drinker
  • Wife recently deceased
  • Lives with sister-in-law
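
Once findings like these are available as structured values, simple rules can turn them into review flags. The sketch below is illustrative only: the two lookup sets mirror the items called out in this case study and are assumptions for the example, not a clinical interaction database, and `review_flags` is a hypothetical helper.

```python
# Findings the case study treats as relevant to a blood-thinning drug.
BLEEDING_RISK_FACTORS = {"fish oil supplements", "red wine"}
# Findings the case study treats as a sign that more support may be needed.
SUPPORT_FLAGS = {"spouse recently deceased"}

def review_flags(medications, nlp_findings):
    """Return human-readable flags for a care team to review."""
    findings = set(nlp_findings)
    flags = []
    if "Plavix" in medications:
        for finding in sorted(BLEEDING_RISK_FACTORS & findings):
            flags.append(f"Review Plavix with: {finding}")
    for finding in sorted(SUPPORT_FLAGS & findings):
        flags.append(f"Consider additional support: {finding}")
    return flags

flags = review_flags(
    medications=["Plavix"],
    nlp_findings=[
        "fish oil supplements",
        "red wine",
        "spouse recently deceased",
        "non-smoker",
    ],
)
```

The flags are prompts for clinical review, not decisions; the value of NLP here is that the underlying findings would otherwise stay buried in notes.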

Case Study: Analyzing the Risk of Type 2 Diabetes in a Patient Population

Consider an Accountable Care Organization (ACO) that wants to assess the risk of type 2 diabetes in its patient population. An analysis of structured data can reveal risk factors associated with weight, race and age, but might miss risk factors that are noted in physicians’ narratives.

Using NLP, the ACO could identify the prevalence of other known risk factors, such as limited access to healthy foods, barriers to physical activity, high stress levels and social isolation.

Case Study: Population-level Cohort Selection

How can NLP help population-level cohort selection? A good example is the CMS code for lung cancer screening, which targets 55–77 year-olds who are current or past smokers, have no lung cancer diagnosis and have more than 30 pack-years of smoking.

While some of these details may be captured in modern EHRs, certain critical risk factors such as smoking pack years are typically most accurate in unstructured text. NLP can extract these factors to derive a much deeper understanding of clinical risk. People who meet the criteria are invited for CT screening to identify early signs of lung cancer.
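
The cohort criteria described above translate naturally into a filter over combined structured and NLP-derived fields. This is a minimal sketch under stated assumptions: the field names and the `eligible_for_screening` function are hypothetical, the thresholds are taken from the text and exposed as parameters because screening criteria are revised over time, and the pack-year value is assumed to have been extracted from notes by NLP.

```python
def eligible_for_screening(patient, min_age=55, max_age=77, min_pack_years=30):
    """Apply the lung cancer screening criteria as described in the text.

    The pack-year threshold is treated here as an inclusive minimum;
    change the comparison if the rule should be strictly 'more than'.
    """
    return (
        min_age <= patient["age"] <= max_age
        and patient["smoking_status"] in {"current", "former"}
        and not patient["lung_cancer_dx"]
        and patient["pack_years"] >= min_pack_years
    )

# Synthetic patients for illustration only
patients = [
    {"id": "A", "age": 60, "smoking_status": "former", "lung_cancer_dx": False, "pack_years": 40},
    {"id": "B", "age": 50, "smoking_status": "current", "lung_cancer_dx": False, "pack_years": 45},
    {"id": "C", "age": 65, "smoking_status": "never", "lung_cancer_dx": False, "pack_years": 0},
    {"id": "D", "age": 70, "smoking_status": "current", "lung_cancer_dx": True, "pack_years": 50},
]
cohort = [p["id"] for p in patients if eligible_for_screening(p)]
```

Age and diagnosis usually come from structured fields, while smoking history is the part most likely to depend on text extraction.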

Case Study: Supporting Value-Based Care at Atrius Health

ACOs need access to clinical data to meet reporting requirements and facilitate quality care initiatives. Critical patient information is often stored in narrative form in Electronic Health Records (EHR). Like many healthcare organizations, Atrius Health had difficulty obtaining certain information for quality metric reporting, accurate clinical documentation, and safety-net initiatives.

Using Linguamatics NLP, Atrius Health created queries to extract clinical data from free-text fields within clinician progress notes and clinical reports. For example, Atrius Health now queries unstructured echo reports to analyze cardiac function and identify high-risk heart failure patients.


Advantages of Linguamatics NLP in Population Health

Data Discovery and Exploration across Populations

Linguamatics NLP can translate unstructured text into discrete data fields by identifying the key concepts and their relationships in healthcare documentation. It can identify disease severity concepts such as TNM cancer stage, patient ambulatory status, and ejection fraction. I2E then provides this data as structured fields.

Linguamatics NLP can analyze millions of patients together and characterize how concepts are represented in patient documentation. This reduces reliance on manual chart review and allows algorithms to be tailored to new data sets.


Feature Extraction for Risk Stratification and Predictive Models

Risk stratification can be biased toward structured data because that data is easier to access. As interest in long-term patient/member wellness increases, harnessing the insights trapped in unstructured data will become the differentiator in a changing and competitive market.

The providers and payers who are able to characterize patient/member groups at a more detailed level will have the advantage of population insight over those who struggle to do so.


Linguamatics NLP: a Highly Configurable NLP Solution

The NLP solution offers a spectrum of capabilities that customers can apply to extract insights from the patient/member related data. I2E users can:

  • Develop new algorithms
  • Change existing algorithms
  • Create and add ontologies
  • Incorporate new data sources.


Linguamatics NLP unlocks data to improve patient safety, quality, and reporting

Despite the U.S. health system having made progress in recent years, patient safety remains a challenge that healthcare organizations must prioritize. Additionally, adverse medication events cause more and more injuries and deaths each year, with a substantial cost for HCOs.

Under pressure to find a solution and improve quality measures, HCOs are now turning to Augmented Intelligence (AI) technologies, and especially NLP, to make more sense of data and use its full potential. NLP workflows can help reduce the likelihood of human error and improve patient safety. Findings are then transformed into structured data to simplify chart review and speed the identification of high-risk patients.


