
While Natural Language Processing (NLP)-based text mining has become a widely used technology within pharma, biotech and healthcare organizations, some still view NLP use as esoteric, only for experts. At AbbVie, however, the use of NLP has been democratized for researchers with the provision of broad-access web portals.
One project illustrating this approach to broadening access to NLP was presented earlier this year at a Linguamatics NLP seminar. Abhik Seal, from the data science team at AbbVie, described an innovative web portal developed to give pharmacometricians and chemists within the Clinical Pharmacology and Pharmacometrics (CPPM) group a more effective search for pharmacokinetic and pharmacodynamic (PK/PD) parameters. Manual search of scientific abstracts and full-text papers is typically slow and laborious, particularly when extracting key PK/PD numerics and units, such as drug concentrations, exposures, efficacies and dosages.
The platform Seal’s team developed is known as PharMine (Figure 1). They have implemented a workflow (Figure 2) that takes in Medline abstracts and uses a suite of NLP queries to extract key information including:
- Disease background, disease overview, clinical endpoints, therapies, etc.
- Drug name, route of administration, dosing
- Whether the study is preclinical or clinical, and where possible, animal model or patient/population demographics
- A range of PK/PD parameters and metrics
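To make the idea of extracting numerics and units concrete, here is a minimal sketch in Python. It is not how PharMine or I2E works internally: it swaps in a plain regular expression for the ontology-driven NLP queries described in this post. The parameter list, the unit list and the names `PKValue`, `PK_PATTERN` and `extract_pk_values` are all assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class PKValue(NamedTuple):
    """One extracted parameter with its numeric value and unit."""
    parameter: str
    value: float
    unit: str


# Hypothetical, deliberately tiny vocabularies; a production system would use
# curated ontologies and linguistic rules rather than a hand-written regex.
PK_PATTERN = re.compile(
    r"\b(?P<param>Cmax|Tmax|AUC|t1/2|CL)"        # parameter name
    r"\s*(?:was|of|=|:)?\s*"                      # optional connecting word
    r"(?P<value>\d+(?:\.\d+)?)\s*"                # numeric value
    r"(?P<unit>(?:ng|ug|mg)(?:\*h)?/m?L|L/h|h)"   # unit
    r"(?![A-Za-z])"                               # unit must not run into a longer word
)


def extract_pk_values(text: str) -> List[PKValue]:
    """Return every parameter/value/unit triple the pattern finds in text."""
    return [
        PKValue(m.group("param"), float(m.group("value")), m.group("unit"))
        for m in PK_PATTERN.finditer(text)
    ]


# Made-up sentence, not taken from any real abstract.
sentence = (
    "After a 100 mg oral dose, Cmax was 250 ng/mL and Tmax was 2.5 h; "
    "AUC = 1800 ng*h/mL."
)
for hit in extract_pk_values(sentence):
    print(hit.parameter, hit.value, hit.unit)
```

A regex like this breaks down quickly on real abstracts (ranges, standard deviations, units written in many styles), which is part of why a dedicated NLP platform with semantic and linguistic resources is used instead.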
The web frontend, built with standard tools such as R Shiny and D3, enables a non-expert user to enter a search term, for example a disease name. They can then rapidly see the resulting text-mined data, visually represented with histograms, word clouds and other tools for filtering and pivoting the data. The text-mined results bring together all the data points the end user needs, so they do not have to read each abstract or full-text paper in turn. PharMine has received good feedback from the scientist end-users. The ability to rapidly differentiate clinical studies from preclinical ones saves significant time. Getting back key data elements such as dose and population, rather than just a few lines of text or a whole document to read (as with Google Scholar or PubMed), is proving very valuable for the CPPM group.
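The filtering and pivoting described above can be pictured with a small Python sketch. The real portal is built with R Shiny and D3; this only illustrates the kind of operation behind a clinical/preclinical filter or a histogram count. The `Record` fields, the helper names and the example rows are hypothetical placeholders, not PharMine's data model or real results.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """One text-mined row; the field names are assumptions for illustration."""
    doc_id: str
    drug: str
    study_type: str  # "clinical" or "preclinical"
    dose_mg: float


def filter_by_study_type(records: List[Record], study_type: str) -> List[Record]:
    """Keep only the rows from the requested kind of study."""
    return [r for r in records if r.study_type == study_type]


def count_by(records: List[Record], field: str) -> Counter:
    """Count rows per distinct value of a field, e.g. to feed a histogram."""
    return Counter(getattr(r, field) for r in records)


# Placeholder rows, invented purely to exercise the functions.
EXAMPLE_RECORDS = [
    Record("doc-1", "drug_a", "clinical", 100.0),
    Record("doc-2", "drug_a", "preclinical", 5.0),
    Record("doc-3", "drug_b", "clinical", 50.0),
]

clinical_rows = filter_by_study_type(EXAMPLE_RECORDS, "clinical")
print([r.doc_id for r in clinical_rows])
print(count_by(EXAMPLE_RECORDS, "drug"))
```

Keeping each extracted data point as a structured row is what makes this kind of one-click filtering possible, compared with scanning the text of each search hit.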
Understanding the landscape of published PK/PD data in scientific papers brings significant benefit for internal research. Recently the platform's capabilities have been extended to search gene-disease relations, tox-related data, biomarker identification and PPI data. Democratizing AI technologies such as NLP-based text mining is a goal that many companies are striving for, and this innovative platform is showing value to early R&D at AbbVie.
Fig. 1 A screenshot of the PharMine web portal showing one view of a search for data relating to Chronic Lymphocytic Leukaemia (CLL). Users can initiate a search with a drug or disease; see the basic counts for journals and abstracts; filter using word clouds (or other graphical representations); and see the tabulated list of relevant parameters for either clinical or non-clinical studies. The development team is working to tackle some new challenges, such as tables in full-text papers, with the agile workflow enabling rapid query engineering and optimization.
Fig. 2 Workflow schematic for extraction by text mining of PK/PD parameters and other relevant data. PubMed abstracts and full-text papers from PubMed Central are fed into the Linguamatics NLP solution, I2E, and queries are run automatically to extract diseases, drugs and PK/PD parameters with appropriate context, using both semantic (e.g. disease ontology, Clarivate drug ontology) and linguistic technologies.
