
Could Natural Language Processing be Useful in HEDIS® Reporting?

Author: Matthew Flores MS, RRT, CHCA  

Before we assess whether Natural Language Processing (NLP) could benefit HEDIS® reporting, it is important to look at the history of HEDIS as well as some of the information surrounding trends in quality reporting from a regulatory and operational standpoint to put the question into perspective.

The Setting

The Healthcare Effectiveness Data and Information Set (HEDIS) is an important set of healthcare quality indicators developed and administered by the National Committee for Quality Assurance (NCQA) with the goal of advancing the triple aim in healthcare. It does so by measuring care provision at the payer level, which has historically relied heavily on claims and other administrative data as the primary means for measuring clinical activities.

When HEDIS started, administrative (e.g. claims) data was the primary type of clinical information most health plans received for their patients. Over time, Hybrid measures were added using Medical Record Review (MRR) to bridge the gap of information not received in administrative data for some measures. HEDIS evolved to incorporate supplemental data from various other data sources such as immunization registries and eventually EHRs.

Meaningful Use, a provision of the HITECH Act, expanded the implementation of EHRs and promoted reporting of eCQMs at the provider level. This significantly increased the collection of electronic clinical data with the potential to be used in HEDIS and helped to establish some new standards in quality reporting. As standards for clinical data and digital reporting continue to mature, the Office of the National Coordinator for Health Information Technology (ONC) and the Centers for Medicare & Medicaid Services (CMS) have now upped the ante with the Promoting Interoperability (PI) program, and with ambitious proposed rules aimed at further improvements to clinical data access, cost, and quality (the triple aim of clinical data?).

The Digital Evolution

NCQA recently published a memo, "The NCQA Data Measures Roadmap" (2019), announcing its intent to begin an evolution toward digital reporting. In the new paradigm, NCQA proposes to reduce the burden of reporting by using digital information collected in EHRs and other sources during “the normal course of patient care” (para. 1). This effort is intended to improve efficiency, reduce cost, and make measurement scalable across domains of care. In fact, the memo mentions NLP as a possible solution to the challenges of this evolution.

An important element of full digital reporting is that it does not include a medical record review component. While this may eventually improve the efficiency of information retrieval, it removes an important current source of clinical events for reporting. For example, Colorectal Cancer Screening (COL) was included in the HEDIS 2020 Public Comment as a proposed measure for digital conversion and optional parallel reporting. COL is currently a Hybrid measure (and a Medicare STAR measure) with a significant portion of compliant hits coming from MRR.

According to Advent's internal benchmark calculations, many Hybrid measures experience much higher lift from MRR than Colorectal Cancer Screening, with more than a third of hits for some STAR measures coming from MRR. Upon conversion to digital format or replacement with a new digital measure, the loss of hits attained from MRR could have a significant impact on rates for current Hybrid measures. The fiscal implications could be profound for plans earning incentives or participating in pay-for-performance arrangements with regulatory authorities and insurance sponsors.

The popular sentiment is that digital measures will establish a new performance baseline, but in reality, there will likely be significant pressure from quality professionals, executives, and other industry experts to reach comparable measurement rates without the MRR component. Some of the difference may be made up by improved interoperability, which increases the volume of EHR data, and by the addition of SNOMED and RxNorm codes (which have been added to the HEDIS Adjustments Value Set and are scheduled to be in traditional Value Sets for HEDIS 2020). Even so, much of the content of EHRs is unstructured and will not be accessible without intervention.

Application of NLP in HEDIS


NLP and other machine learning algorithms have the ability to derive information from unstructured data, and bridge much of the performance gap between human review and structured data collection. When mature, they may perform better, faster, and for a fraction of the cost of manual review. Adopters of NLP will be able to invest limited human resources in training/refining these algorithms and performing quality assurance over-read. Organizations can then effectively leverage human clinical expertise to accomplish more diverse reviews, while decreasing the monotony of the review task for critical clinical experts.

Efficient Data Collection

Much of the criticism surrounding EHR implementation involves provider abrasion. Clinicians have fundamentally altered their workflow to capture key data elements in structured fields for quality reporting, shifting the focus of their attention from the patient to their EHR systems. A recent Fortune magazine article indicates that doctors spend on average nearly half of their time each workday on their EHRs. Many facilities and practices have added scribes to the care team to ameliorate this impact. Many providers continue to document in freeform progress and SOAP notes. Some encounters are simply not conducive to structured formats. NLP is an ideal utility to capture data from unstructured text and introduce more flexibility into documentation requirements without losing information for quality reporting.

Enabling NCQA’s Outcome-Based Quality Measures

One of NCQA’s goals is to move from process-oriented measures to outcome-based measures. In an outcome-based paradigm, the measurement point may shift to a subjective impression from a structured process. The patient presents with symptoms of Recursive Suplebnia (my favorite mythological ailment) and is provided the standard course of treatment. That process is what we are measuring now, but in many cases the standard course is not effective.

We want to know if the treatment was effective, so we often attempt to quantify symptomatology on a common scale. Scales and assessment tools are notoriously difficult to implement in clinical workflow and capture with standard fields, and they are often very time-consuming and specialized. How many scales are providers expected to learn, administer, and document effectively? How much time do they have to execute them properly with other pressures restricting their time and attention? Many outcomes are ultimately documented with simple but effective subjective observations such as “patient experienced relief of symptoms under current regimen” or “treatment resulted in unacceptable side-effects and must be adjusted”. Outcomes-based measurement would be strongly supported by NLP’s ability to parse and interpret these subjective touchpoints in the care process.

Social Determinants Collection

One of the prevailing themes in recent literature and thought leadership has been social determinants of health. We have come to realize that many if not most health problems stem from social causes. Poverty, access, heredity, geography, and consumption habits all contribute to health disparities and trends. There are very few structured fields within the EHR to capture many of these social elements. NLP, in conjunction with increasingly prevalent association rule mining, sequential pattern mining, or predictive analytics, could help to utilize many social observations and to correlate social causes with poor health outcomes. Those relationships can then be used to improve targeted interventions.

While it may be some time before we have social determinant-based measures for logistical and ethical reasons, there is a great opportunity for the healthcare system to use NLP to collect anonymized or targeted social information. Insights from social data could help with network development (query: identify encounters with complaints regarding wait or travel time), preventive identification of likely illness (via harvest of health complaints or habits from unstructured notes), social initiatives such as housing aid (query: find encounters with expression of housing insecurity), or Non-Emergency Medical Transportation services (query: identify encounters with mention of missed appointments due to transportation scarcity).
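The queries described above can be sketched in a few lines of Python. This is a minimal, rule-based illustration rather than a real NLP pipeline: the phrase patterns, the label names, and the `tag_social_needs` helper are all assumptions made for the example, and a production system would more likely rely on trained models and validated vocabularies.

```python
import re

# Hypothetical phrase patterns for illustration only. A real system would use
# trained NLP models and validated vocabularies, not this short list.
SOCIAL_PATTERNS = {
    "housing_insecurity": re.compile(r"\b(homeless|evicted|eviction)\b", re.I),
    "transportation": re.compile(r"\b(no ride|no transportation|lack of transportation)\b", re.I),
    "wait_or_travel_time": re.compile(r"\b(long wait|long drive|travel time)\b", re.I),
}


def tag_social_needs(note):
    """Return the set of social-need labels whose pattern appears in the note."""
    return {label for label, pattern in SOCIAL_PATTERNS.items() if pattern.search(note)}


notes = [
    "Pt missed appointment last month because she had no ride to clinic.",
    "Patient reports being evicted in March, currently staying with a friend.",
    "Follow-up for hypertension. No complaints.",
]

for note in notes:
    print(sorted(tag_social_needs(note)), "<-", note)
```

Even a crude tagger like this shows the shape of the output a plan would want: a small set of labels per encounter that can be counted, mapped by geography, or routed to an outreach team.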

With that information, health plans can develop prevention and outreach strategies for those likely to have poor outcomes. Interventions can then be applied at the individual, geographic, and/or other custom levels. By addressing the root causes of poor outcomes more effectively, health plans can ultimately achieve quality performance gains at lower cost than providing avoidable responsive care.

NLP’s Role in Data Quality and Completeness

One of the themes suffusing the transition to digital reporting is Master Data Management. What does the organization do to ensure the consistency, integrity, and availability of data? Data quality is of paramount importance to the HEDIS Compliance Audit™ and accurate reporting of HEDIS measures. EHRs have long been considered standard sources of data, meaning they are relatively consistent in terms of stability, human manipulation, and reliability. In reality, experience reveals that they are rife with common errors invalidating many potentially useful records.

Extra-structural Records. In our role as HEDIS auditors, we often see result fields with a “see comment” notation and the result written in another field. Measures often require a numeric value to parse and cannot read “see comment”. However, NLP can collect that comment and write a useful record. It can also read the comment that states, “test ordered, but patient did not show for the appointment” and delete the record. Documentation of events in notes or comments is not currently useful for reporting, but it could be if brought into structured fields with NLP.
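As a rough sketch of that idea, the snippet below resolves a result that defers to a comment field. The two-field record layout, the regular expressions, and the `resolve_result` function are hypothetical, and the number matching is deliberately naive (a date or dosage in the comment would confuse it), so treat it as an illustration of the logic rather than a usable parser.

```python
import re
from typing import Optional

# Hypothetical layout: a result field plus a free-text comment field.
# A standalone number that is not embedded in a word such as "A1c".
NUMBER = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)(?!\w)")
# Phrases suggesting the test was ordered but never completed (illustrative list).
NOT_PERFORMED = re.compile(r"\b(did not show|no[- ]show|cancell?ed|not performed|refused)\b", re.I)


def resolve_result(result: str, comment: str) -> Optional[float]:
    """Return a numeric value, falling back to the comment when the result defers to it.

    Returns None when there is no usable value, which signals that the
    record should be dropped rather than reported.
    """
    direct = NUMBER.fullmatch(result.strip())
    if direct:
        return float(direct.group(1))
    if NOT_PERFORMED.search(comment):
        return None  # ordered but never completed: not a reportable event
    found = NUMBER.search(comment)
    return float(found.group(1)) if found else None


print(resolve_result("7.2", ""))
print(resolve_result("see comment", "result 8.1 per outside lab"))
print(resolve_result("see comment", "test ordered, but patient did not show for the appointment"))
```

Checking for a "not performed" phrase before looking for a number is a deliberate choice here: a comment explaining a no-show may still contain digits, and those should not be promoted to a result.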

Deduplication. One of the most common problems with current EHRs is duplicate records. A test is ordered during an encounter; the patient sees an ancillary specialist two days later for the technical component of the test; the ancillary provider reads the test results a few days later for the professional interpretation; the result is sent to the ordering PCP, where it is read into the EHR; and the PCP results it a few days later. Each of these five steps has a date and a potential touchpoint that may be interpreted as a record by the EHR. NLP has the potential to help identify and choose the proper event for reporting.
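Once each touchpoint has been labeled with what it represents (the part where NLP would help), choosing one event per episode is straightforward. The sketch below assumes the labels already exist. The `Touchpoint` type, the 30-day window, and the priority ranking that prefers the date the test was performed are all illustrative assumptions; the measure specification determines which date actually counts.

```python
from dataclasses import dataclass
from datetime import date

# Assumed preference order for which touchpoint best represents the test.
# Illustrative only; the measure specification governs the real choice.
PRIORITY = {"performed": 0, "interpreted": 1, "resulted": 2, "received": 3, "ordered": 4}


@dataclass(frozen=True)
class Touchpoint:
    patient_id: str
    test: str
    kind: str  # one of the PRIORITY keys, e.g. as labeled by an NLP step
    on: date


def pick_reportable(touchpoints, window_days=30):
    """Group touchpoints for the same patient and test that fall within
    window_days of the first one, and keep the highest-priority touchpoint."""
    chosen = []
    cluster = []
    for tp in sorted(touchpoints, key=lambda t: (t.patient_id, t.test, t.on)):
        same_episode = bool(cluster) and (
            tp.patient_id == cluster[0].patient_id
            and tp.test == cluster[0].test
            and (tp.on - cluster[0].on).days <= window_days
        )
        if cluster and not same_episode:
            chosen.append(min(cluster, key=lambda t: PRIORITY[t.kind]))
            cluster = []
        cluster.append(tp)
    if cluster:
        chosen.append(min(cluster, key=lambda t: PRIORITY[t.kind]))
    return chosen


touchpoints = [
    Touchpoint("A", "screening_test", "ordered", date(2019, 3, 1)),
    Touchpoint("A", "screening_test", "performed", date(2019, 3, 3)),
    Touchpoint("A", "screening_test", "interpreted", date(2019, 3, 6)),
    Touchpoint("A", "screening_test", "received", date(2019, 3, 8)),
    Touchpoint("A", "screening_test", "resulted", date(2019, 3, 11)),
    Touchpoint("A", "screening_test", "ordered", date(2020, 3, 2)),
]

for tp in pick_reportable(touchpoints):
    print(tp.patient_id, tp.test, tp.kind, tp.on)
```

The five touchpoints from March 2019 collapse to the single performed event, while the order a year later is kept as its own episode.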

Event Identification. A critical element of quality reporting is event identification. Events are captured for eligible population inclusion or exclusion and for numerator compliance. When represented as distinct occurrences, such as a code or encounter, they are marked for measure calculation accordingly. However, many encounters occur outside of primary care environments, or are never submitted on claims due to capitated payment arrangements. Ideally, they should be submitted as encounter data, but for various reasons, these data are often never sent. Also, many events occurred prior to the advent of EHRs. Records of patient history are often not conveyed in structured fields, but can be parsed for the purpose by NLP (e.g., a surgical history of mastectomy 20 years ago is beyond the reach of claims or structured data, but can be collected by NLP for exclusion from the HEDIS Breast Cancer Screening (BCS) measure).
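The mastectomy example can be sketched as follows. This is a simplified illustration under stated assumptions: the `find_mastectomy_history` function, its regular expressions, and the crude negation check are hypothetical, and the output is only a candidate event for human over-read. Whether a given procedure actually qualifies as an exclusion is defined by the measure specification, not by this code.

```python
import re
from typing import Optional

MASTECTOMY = re.compile(r"\bmastectomy\b", re.I)
YEARS_AGO = re.compile(r"(\d{1,2})\s+years?\s+ago", re.I)
# Very rough negation check (illustrative): a negating word shortly before the term.
NEGATION = re.compile(r"\b(no|denies|without)\b[^.]{0,40}\bmastectomy\b", re.I)


def find_mastectomy_history(note: str, note_year: int) -> Optional[dict]:
    """Return a candidate exclusion event found in free-text history, or None."""
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        if not MASTECTOMY.search(sentence) or NEGATION.search(sentence):
            continue
        ago = YEARS_AGO.search(sentence)
        return {
            "event": "mastectomy",
            # Approximate year derived from an "N years ago" phrase, if present.
            "approx_year": note_year - int(ago.group(1)) if ago else None,
            "evidence": sentence.strip(),
        }
    return None


print(find_mastectomy_history("Surgical history: mastectomy 20 years ago. No recurrence.", 2019))
print(find_mastectomy_history("Patient denies prior mastectomy.", 2019))
```

Keeping the matching sentence as evidence alongside the extracted event gives reviewers something concrete to verify during quality assurance over-read.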


So, in response to the question of whether NLP could add value to HEDIS reporting, the opportunities are clear. As the modality is developed for application to HEDIS, many additional opportunities for implementation will likely present themselves. The level of accuracy attained with the technology will be of critical importance since there is such a high standard for assessing bias within the audit of reported HEDIS data. Due diligence will be required to optimize NLP for the HEDIS space since it is such an incredibly risk-averse domain. However, the capabilities of the platform are quite well-suited to current strategic momentum in the field.

HEDIS® is a registered trademark of the National Committee for Quality Assurance (NCQA).

NCQA HEDIS Compliance Audit™ is a trademark of the National Committee for Quality Assurance (NCQA).
