How ready are you for IDMP?

IDMP (Identification of Medicinal Products) is a set of international standards developed by ISO that will become mandatory in Europe in a phased approach from 2018, and is expected to be adopted by the FDA and other regulators worldwide over the next few years. As with any new regulatory change, it is valuable to hear about others' experiences and, ideally, to understand and learn from industry best practice.

Joining the IRISS Forum is the best way of keeping track of IDMP. I joined IRISS this year - it is an excellent source for up-to-date IDMP information and also valuable input from industry experts (such as Andrew Marr, Vada Perkins, and others).

The IRISS (Implementation of Regulatory Information Submission Standards) Forum was created to address the need for a single central forum for open and broad stakeholder discussion of evolving standards, user requirements, and the practical, global implementation issues of these standards, for the mutual benefit of industry, government agencies, and ultimately public health.

IRISS recently (September 2016) surveyed its members, both pharmaceutical companies and vendors, on the current state of readiness for IDMP compliance. The companies that took part in the industry survey covered a wide range of organizational sizes, from small companies (those with fewer than 100 EU authorizations, or fewer than 10 active ingredients) to larger ones (more than 5,000 authorizations, or more than 250 active ingredients). Over 80% of the survey participants had a global reach.

Drug safety and pharmacovigilance are critical aspects of drug development. To understand and monitor potential risks for pharmaceuticals, researchers use many different strategies to uncover evidence of real-world reports of adverse events and patient-reported outcomes.

At the upcoming Linguamatics Text Mining Summit, there are three talks on text mining strategies that improve our understanding of drug-related adverse reactions.

Nina Mian from AstraZeneca will present research on text mining adverse event data both from FDA drug labels (derived from clinical trial data), and also from real world data from PatientsLikeMe. Eric Lewis from GSK will discuss applications of I2E for clinical safety and pharmacovigilance – particularly the problems of identifying potential “new signals” and distinguishing signal from noise. And Stuart Murray from Agios will present workflows for automated identification of potential drug safety events.

These talks, from industry specialists, demonstrate the value of text mining to access and understand the complex world of drug safety and safety signals.


We are always enthused to read about new ways to utilize text mining in the drug discovery and development process, and very much enjoyed the recent paper by Heinemann et al., “Reflection of successful anticancer drug development processes in the literature”. In this study, the researchers develop tools that allow the prediction of the approval or failure of a targeted cancer drug, using models based on information mined from MEDLINE abstracts, along with a slew of other quantitative metadata (e.g. MeSH headings, author counts, fraction of authors with industry affiliation, and more). 

I2E, Linguamatics' text mining platform, enabled the researchers to systematically identify all MEDLINE abstracts containing both the protein target and the specific disease indication of a known set of approved or failed cancer therapeutics; for example, abstracts containing both Her2 and breast cancer, or c-Kit and gastrointestinal stromal tumor (GIST). I2E enables the use of large vocabularies or ontologies of genes and diseases to extract key information, and the researchers used I2E for the rapid retrieval of publications containing any one of the many synonyms of a protein target or indication.
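To illustrate the idea, here is a minimal sketch of synonym-expanded co-occurrence retrieval. This is not I2E or the paper's actual pipeline; the synonym lists and abstracts are toy placeholders, and a real system would use full ontologies and linguistic processing rather than regular expressions.

```python
# Hypothetical sketch: find abstracts mentioning both a target and an
# indication, expanding each concept to its known synonyms.
import re

# Toy ontology fragments (illustrative, not complete vocabularies).
TARGET_SYNONYMS = {
    "HER2": ["her2", "erbb2", "neu", "her-2/neu"],
    "KIT": ["c-kit", "kit", "cd117"],
}
INDICATION_SYNONYMS = {
    "breast cancer": ["breast cancer", "breast carcinoma", "mammary carcinoma"],
    "GIST": ["gastrointestinal stromal tumor", "gist"],
}

def mentions(text, synonyms):
    """True if any synonym appears as a whole word/phrase in the text."""
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(s) + r"\b", lowered)
               for s in synonyms)

def cooccurring_abstracts(abstracts, target, indication):
    """Return abstracts that mention both the target and the indication."""
    return [
        a for a in abstracts
        if mentions(a, TARGET_SYNONYMS[target])
        and mentions(a, INDICATION_SYNONYMS[indication])
    ]

abstracts = [
    "ERBB2 (HER2) amplification drives a subset of breast carcinoma.",
    "Imatinib inhibits c-KIT in gastrointestinal stromal tumor (GIST).",
    "A review of EGFR signalling in lung adenocarcinoma.",
]
hits = cooccurring_abstracts(abstracts, "HER2", "breast cancer")
print(len(hits))  # 1
```

Note how the first abstract is retrieved even though it uses "ERBB2" and "breast carcinoma" rather than the canonical names: synonym expansion is what makes the retrieval systematic.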

The researchers found that the set of approved target-indication pairs showed a significantly higher publication count, from 9 years before FDA approval, compared to the eventually-failing pairs. 

Taking the study further, they applied machine learning classifiers and found that the extracted data features could be used to predict success or failure of target-indication pairs, and hence, approved or failed drugs. They conclude:
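As a rough illustration of the classification step, here is a tiny nearest-centroid classifier over publication-derived features. This is a stand-in, not the paper's actual model or data: the feature values are invented, and the real study used richer features and proper machine learning classifiers with cross-validation.

```python
# Illustrative sketch only: classify target-indication pairs as "approved"
# or "failed" from literature-derived features. All data values invented.

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Features per pair: [publication count, mean authors per paper,
#                     fraction of authors with industry affiliation]
approved = [[120, 6.5, 0.30], [95, 5.8, 0.25]]
failed = [[8, 3.1, 0.02], [15, 4.0, 0.05]]

c_approved, c_failed = centroid(approved), centroid(failed)

def predict(features):
    """Label a new target-indication pair by its nearest class centroid."""
    return ("approved"
            if distance(features, c_approved) < distance(features, c_failed)
            else "failed")

print(predict([100, 6.0, 0.28]))  # approved
print(predict([12, 3.5, 0.03]))   # failed
```

The intuition matches the paper's finding: pairs that eventually succeed tend to sit in a different region of this feature space (more publications, more industry involvement) years before the regulatory decision.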


Drug safety is, of course, a prime focus of anyone in the pharmaceutical and biotech industries. The goal of any drug R&D project is to bring to market a safe and efficacious drug – oh yes, and ideally, create the latest blockbuster!

For monitoring drug safety, there are many tools and solutions. We were pleased to see the inclusion of Linguamatics I2E in a recent paper from the FDA on the "Use of data mining at the Food and Drug Administration" (Duggirala et al., 2016, J Am Med Inform Assoc).

This FDA review covers a very broad range of text and data mining approaches, across both FDA databases (e.g. MAUDE, VAERS) and external data such as Medline, clinical study data, and social media.

Specifically, the FDA describe the use of I2E “to study clinical safety based on chemical structure information contained in medical literature. Linguamatics I2E enables custom searches using natural language processing to interpret unstructured text. The ability to predict the clinical safety of a drug based on chemical structures is becoming increasingly important, especially when adequate safety data are absent or equivocal.”


I attended a Big Data in Pharma conference recently, and very much liked a quote from Sir Muir Gray, cited by one of the speakers: "In the nineteenth century health was transformed by clean, clear water. In the twenty-first century, health will be transformed by clean, clear knowledge."

This was part of a series of discussions and round tables on how we, within the pharma industry, can best use big data, both current and legacy, to inform decisions for the discovery, development and delivery of new healthcare therapeutics. Data integration, breaking down data silos to create data assets, data interoperability, and the use of ontologies and NLP were all themes presented, with the aim of enabling researchers and scientists to have a clean, clear view of all the appropriate knowledge for actionable decisions across the drug development pipeline.

A new publication describes how text analytics can provide one of the tools for that data interoperability ecosystem, to create a clean, clear view. McEntire et al. describe a system that combines Pipeline Pilot workflow tools, Linguamatics I2E NLP linguistics and semantics, and visualization dashboards to integrate information from key public domain sources, such as MEDLINE, OMIM, ClinicalTrials.gov, NIH grants, patents, and news feeds, as well as internal content sources.
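The integration step can be pictured as grouping heterogeneous records from all of these sources under a shared topic key, so that one dashboard view pulls everything together. The sketch below is a simplification under invented record fields and placeholder identifiers, not the McEntire et al. system itself.

```python
# Minimal sketch: merge records from several sources into one view per
# topic. Source names follow the sources mentioned above; the record
# structure and identifiers are invented placeholders.
from collections import defaultdict

records = [
    {"source": "MEDLINE", "topic": "PCSK9", "item": "abstract A"},
    {"source": "ClinicalTrials.gov", "topic": "PCSK9", "item": "trial B"},
    {"source": "OMIM", "topic": "PCSK9", "item": "disease entry C"},
    {"source": "MEDLINE", "topic": "EGFR", "item": "abstract D"},
]

def integrate(records):
    """Group heterogeneous source records under a shared topic key."""
    view = defaultdict(lambda: defaultdict(list))
    for r in records:
        view[r["topic"]][r["source"]].append(r["item"])
    return view

view = integrate(records)
print(sorted(view["PCSK9"]))  # ['ClinicalTrials.gov', 'MEDLINE', 'OMIM']
```

In practice the hard work is upstream of this grouping: NLP and ontologies are needed to normalize entity names so that records from different sources land under the same topic key at all.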