Identifying Domain Experts through Text-Mining Medline
Eric Su emphasized the importance of thought leaders and domain experts to the pharmaceutical industry, where they complement in-house knowledge and plug its gaps. Previous manual methods of finding them, which relied on personal knowledge and contacts, were too limited, so Eli Lilly decided instead to locate and rank experts by applying the Linguamatics NLP platform to text-mine Medline.
What superficially seems a simple data extraction task (find and extract papers that describe diseases or drugs of interest, then list and rank the authors by number of publications) is made much more complicated by the variability in how personal, institutional, drug, and disease names are written, so disambiguation is a huge challenge. Eric described in detail how they wrote Lua code to construct NLP queries of Medline via the easy-to-use I2E Pro interface. The queries took advantage of Linguamatics’ various ontologies (e.g. institutions, diseases) to help overcome the disambiguation problem, and the code then extracted, correctly formatted, and ranked the output for use by scientists and researchers.
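To make the ranking step and the name-variability problem concrete, here is a minimal Python sketch. It is not the Lua code or the I2E queries described in the talk, and it does not use any Linguamatics API. The record layout, the `normalize_author` and `rank_authors` helpers, and the sample papers are all hypothetical, and the normalization rule (surname plus first initial) is a deliberately crude assumption.

```python
import re
import unicodedata
from collections import Counter


def normalize_author(name: str) -> str:
    """Collapse a name variant to a 'surname initial' key (hypothetical rule)."""
    # Strip accents so that 'Müller' and 'Muller' map to the same key.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    if "," in ascii_name:
        # "Surname, Given" form.
        surname, given = [part.strip() for part in ascii_name.split(",", 1)]
    else:
        tokens = ascii_name.split()
        if len(tokens) > 1 and tokens[-1].isupper() and len(tokens[-1]) <= 3:
            # "Surname Initials" form, e.g. "Smith JA".
            surname, given = " ".join(tokens[:-1]), tokens[-1]
        else:
            # Assume "Given Surname" form, e.g. "John A. Smith".
            surname, given = tokens[-1], " ".join(tokens[:-1])
    surname = re.sub(r"[^A-Za-z\- ]", "", surname).lower()
    initial = re.sub(r"[^A-Za-z]", "", given)[:1].lower()
    return f"{surname} {initial}".strip()


def rank_authors(papers):
    """Rank normalized authors by the number of papers they appear on."""
    counts = Counter()
    for paper in papers:
        # Count each author at most once per paper.
        counts.update({normalize_author(a) for a in paper["authors"]})
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample records, not real Medline data.
papers = [
    {"pmid": "1", "authors": ["Smith, John A.", "Müller, K"]},
    {"pmid": "2", "authors": ["Smith JA", "Lee, Min"]},
    {"pmid": "3", "authors": ["John A. Smith", "Muller K"]},
]

for author, n_papers in rank_authors(papers):
    print(author, n_papers)
```

In this sketch the three spellings of Smith collapse to a single key, which is the easy half of the problem. The same rule would also merge two different people who share a surname and first initial, which is one reason institution and disease ontologies are needed to disambiguate authors properly.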