Developing Timely Insights on Comparative Effectiveness Research with a Text Mining Pipeline

Chang M, Chang M, Reed JZ, Milward D, Xu JJ, Cornell WD

Drug Discov Today. 2016 Mar; 21(3):473-80

PMID: 26854423


Comparative effectiveness research (CER) provides evidence for the relative effectiveness and risks of different treatment options and informs decisions made by healthcare providers, payers, and pharmaceutical companies.

CER data come from retrospective analyses as well as prospective clinical trials. Here, we describe the development of a text-mining pipeline based on natural language processing (NLP) that extracts key information from three different trial data sources: NIH ClinicalTrials.gov, WHO International Clinical Trials Registry Platform (ICTRP), and Citeline Trialtrove. The pipeline leverages tailored terminologies to produce an integrated and structured output, capturing any trials in which pharmaceutical products of interest are compared with another therapy.
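
The core idea of the pipeline, normalizing intervention names against a terminology and then keeping trials that pair a product of interest with another therapy, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not describe the NLP components or the vocabularies, so the `TERMINOLOGY` table, the `Trial` record, the function names, the drug names, and the trial identifiers below are all hypothetical placeholders, and a plain dictionary lookup stands in for the NLP-based extraction.

```python
from dataclasses import dataclass

# Hypothetical synonym table standing in for the "tailored terminologies".
# All drug names here are made up for illustration.
TERMINOLOGY = {
    "drug-a": "drug-a",
    "branda": "drug-a",   # assumed brand name mapped to its generic form
    "drug-b": "drug-b",
    "brandb": "drug-b",
}


@dataclass(frozen=True)
class Trial:
    source: str            # e.g. "NIH", "ICTRP", "Trialtrove"
    trial_id: str          # placeholder identifier, not a real registry ID
    interventions: tuple   # raw intervention names as found in the record


def normalize(name):
    """Map a raw intervention name to its preferred term (lowercased)."""
    key = name.strip().lower()
    return TERMINOLOGY.get(key, key)


def find_comparative_trials(trials, products_of_interest):
    """Return one structured row per (trial, product of interest) pair
    in which the product appears alongside at least one other therapy."""
    targets = {normalize(p) for p in products_of_interest}
    rows = []
    for trial in trials:
        drugs = {normalize(i) for i in trial.interventions}
        for product in sorted(drugs & targets):
            comparators = sorted(drugs - {product})
            if comparators:  # skip single-arm records with no comparator
                rows.append({
                    "source": trial.source,
                    "trial_id": trial.trial_id,
                    "product": product,
                    "comparators": comparators,
                })
    return rows


# Made-up records from the three source types named in the text.
trials = [
    Trial("NIH", "EX-1", ("BrandA", "BrandB")),
    Trial("ICTRP", "EX-2", ("Drug-A",)),
    Trial("Trialtrove", "EX-3", ("drug-b ", "Drug-C")),
]

for row in find_comparative_trials(trials, ["Drug-A"]):
    print(row)
```

In this sketch only the first record is reported, because it is the only one in which the product of interest appears together with a second therapy. A real system would also need to deduplicate trials registered in more than one source, which is omitted here.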

The timely information alerts generated by this system provide the earliest and most complete picture of emerging clinical research.