Automated identification of pneumonia in chest radiograph reports in critically ill patients

Liu V, Clark MP, Mendoza M, Saket R, Gardner MN, Turk BJ, Escobar GJ.

BMC Med Inform Decis Mak. 2013 Aug; 13:90

PMID: 23947340

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765332/

Abstract

BACKGROUND:
Prior studies demonstrate the suitability of natural language processing (NLP) for identifying pneumonia in chest radiograph (CXR) reports; however, few evaluate this approach in intensive care unit (ICU) patients.

METHODS:
From a total of 194,615 ICU reports, we empirically developed a lexicon to categorize pneumonia-relevant terms and uncertainty profiles. We encoded lexicon items into unique queries within an NLP software application and designed an algorithm to assign automated interpretations ('positive', 'possible', or 'negative') based on each report's query profile. We evaluated algorithm performance in a sample of 2,466 CXR reports interpreted by physician consensus and in two ICU patient subgroups: those admitted for pneumonia and those admitted for rheumatologic/endocrine diagnoses.
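
As a rough illustration of the approach described above, the following sketch maps a report to a query profile and then to one of the three interpretations. Everything in it is an assumption made for illustration: the term lists, the cue lists, the function names `query_profile` and `interpret`, and the three-way rule are hypothetical and far simpler than the study's lexicon and rule set, which the abstract does not enumerate.

```python
import re

# Hypothetical mini-lexicon for illustration only; the study's actual
# lexicon items and queries are not listed in the abstract.
PNEUMONIA_TERMS = ["pneumonia", "consolidation", "infiltrate"]
NEGATION_CUES = ["no", "without", "negative for"]
UNCERTAINTY_CUES = ["possible", "cannot exclude", "may represent", "versus"]


def _has_cue(sentence, cues):
    """Return True if any cue appears in the sentence as a whole word or phrase."""
    return any(re.search(rf"\b{re.escape(cue)}\b", sentence) for cue in cues)


def query_profile(report):
    """Count sentences that mention a pneumonia term, grouped by how it is used."""
    profile = {"asserted": 0, "uncertain": 0, "negated": 0}
    for sentence in re.split(r"[.;\n]+", report.lower()):
        # Prefix match so that plurals such as "infiltrates" are also caught.
        if not any(re.search(rf"\b{re.escape(t)}", sentence) for t in PNEUMONIA_TERMS):
            continue
        if _has_cue(sentence, NEGATION_CUES):
            profile["negated"] += 1
        elif _has_cue(sentence, UNCERTAINTY_CUES):
            profile["uncertain"] += 1
        else:
            profile["asserted"] += 1
    return profile


def interpret(profile):
    """Toy rule: an asserted mention wins, then an uncertain one, else negative."""
    if profile["asserted"] > 0:
        return "positive"
    if profile["uncertain"] > 0:
        return "possible"
    return "negative"


if __name__ == "__main__":
    examples = [
        "Right lower lobe consolidation consistent with pneumonia.",
        "Left basilar opacity, cannot exclude pneumonia.",
        "No focal consolidation or effusion.",
    ]
    for text in examples:
        print(interpret(query_profile(text)), "<-", text)
```

Sentence-level matching is a deliberate simplification here: it keeps a negation or uncertainty cue from one sentence from affecting a finding stated in another.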

RESULTS:
Most reports were deemed 'negative' (51.8%) by physician consensus. Many were 'possible' (41.7%); only 6.5% were 'positive' for pneumonia. The lexicon included 105 terms and uncertainty profiles that were encoded into 31 NLP queries. Queries identified 534,322 'hits' in the full sample, with 2.7 ± 2.6 'hits' per report. An algorithm comprising twenty rules and probability steps assigned interpretations to reports based on query profiles. In the validation set, the algorithm had 92.7% sensitivity, 91.1% specificity, 93.3% positive predictive value, and 90.3% negative predictive value for differentiating 'negative' from 'positive'/'possible' reports. In the ICU subgroups, the algorithm also demonstrated good performance, misclassifying few reports (5.8%).
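
The four performance measures reported above follow from a two-by-two table once 'positive' and 'possible' are grouped together against 'negative'. The sketch below shows that calculation under stated assumptions: the function names and the label lists in the usage example are hypothetical, and the counts are not the study's data.

```python
def is_pneumonia_like(label):
    """Collapse the three interpretations into a binary outcome."""
    return label in ("positive", "possible")


def confusion_counts(reference, predicted):
    """Count true/false positives and negatives against the reference labels."""
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        ref_pos, pred_pos = is_pneumonia_like(ref), is_pneumonia_like(pred)
        if ref_pos and pred_pos:
            tp += 1
        elif pred_pos:
            fp += 1
        elif ref_pos:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Made-up labels for illustration; not drawn from the study.
    reference = ["positive", "possible", "negative", "negative", "possible"]
    predicted = ["possible", "negative", "negative", "positive", "possible"]
    print(diagnostic_metrics(*confusion_counts(reference, predicted)))
```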

CONCLUSIONS:
Many CXR reports in ICU patients demonstrate frank uncertainty regarding a pneumonia diagnosis. This electronic tool shows promise for assigning automated interpretations to CXR reports by leveraging both terms and uncertainty profiles.