Using NLP for pneumonia prediction in the ICU

It’s always good to see NLP being used in clinical care. A recent story about Microsoft and the University of Washington in Seattle using NLP for pneumonia detection in the ICU is a good example.

The project, called deCIPHER, uses a combination of Microsoft linguistics and machine learning to assess clinical information from electronic medical records and derive a diagnosis.

The system was trained on a cohort of 100 patients who had already been diagnosed with pneumonia, using a machine learning framework to build a predictive model from extracted clinical factors. It correctly identified 84% of pneumonia-positive patients, and the team is assessing whether to incorporate the model into an ICU dashboard.

Last year Kaiser Permanente also published a paper on pneumonia diagnosis in the ICU setting based on chest radiograph reports, using Linguamatics I2E for information extraction and applying machine learning to the resulting clinical factors.

From a total of 194,615 ICU reports, Kaiser Permanente empirically developed a lexicon to categorize pneumonia-relevant terms and uncertainty profiles.

The team encoded lexicon items into unique queries within Linguamatics I2E and designed an algorithm to assign automated interpretations (‘positive’, ‘possible’, or ‘negative’) based on each report’s query profile. Performance was assessed in a sample of 2,466 chest radiograph reports interpreted by physician consensus and in two ICU patient subgroups including those admitted for pneumonia and for rheumatologic/endocrine diagnoses.
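To make the idea of a query profile concrete, here is a minimal Python sketch of the general approach: match lexicon terms in a report, count the matches per category, and map the resulting profile to an interpretation. The lexicon terms, category names, and the three rules are all hypothetical and chosen only for illustration. They are not the study's lexicon or its twenty-rule algorithm, and the sketch does not use I2E.

```python
import re

# Hypothetical lexicon: terms and categories are illustrative assumptions,
# not the lexicon developed in the study.
LEXICON = {
    "assertion": ["pneumonia", "consolidation", "infiltrate"],
    "uncertainty": ["possible", "cannot exclude", "may represent"],
    "negation": ["no evidence of", "without", "clear lungs"],
}


def query_profile(report: str) -> dict:
    """Count how many lexicon terms in each category appear in the report."""
    text = report.lower()
    return {
        category: sum(
            1
            for term in terms
            if re.search(r"\b" + re.escape(term) + r"\b", text)
        )
        for category, terms in LEXICON.items()
    }


def interpret(report: str) -> str:
    """Map a query profile to 'positive', 'possible', or 'negative'.

    Three toy rules, applied in order (an assumption for this sketch):
    negation or no finding wins, then uncertainty, otherwise positive.
    """
    profile = query_profile(report)
    if profile["assertion"] == 0 or profile["negation"] > 0:
        return "negative"
    if profile["uncertainty"] > 0:
        return "possible"
    return "positive"


reports = [
    "No evidence of pneumonia.",
    "Possible infiltrate in the left base.",
    "Right lower lobe consolidation consistent with pneumonia.",
]
for report in reports:
    print(interpret(report), "<-", report)
```

A real system would need far more careful handling of negation scope and uncertainty than this whole-report matching, which is presumably part of why the study's lexicon was developed empirically from a large report set.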

The algorithm, comprising twenty rules and probability steps, assigned interpretations to reports based on query profiles. In the validation set, the algorithm had 92.7% sensitivity, 91.1% specificity, 93.3% positive predictive value, and 90.3% negative predictive value for differentiating ‘negative’ from ‘positive’/‘possible’ reports.
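For readers less familiar with these four measures, the following Python sketch shows how each is derived from a confusion matrix in which ‘positive’ and ‘possible’ reports are grouped as the positive class. The counts are made-up placeholders for illustration only and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives that were flagged
        "specificity": tn / (tn + fp),  # share of true negatives that were cleared
        "ppv": tp / (tp + fp),          # share of flagged reports that were truly positive
        "npv": tn / (tn + fn),          # share of cleared reports that were truly negative
    }


# Hypothetical counts, NOT the study's data.
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity describe the algorithm itself, while the predictive values also depend on how common pneumonia-relevant findings are in the reports being assessed.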

In the ICU subgroups, the algorithm also demonstrated good performance, misclassifying few reports (5.8%).

These results are encouraging for the use of NLP in clinical care and also demonstrate the capabilities of I2E in this area.

Also of interest is that the queries were configured by the lead researcher, a cardiologist with no previous NLP experience.

The ability to configure I2E queries visually makes NLP accessible to non-programmers, allowing much wider application of the technology.