Data-driven NLP plus machine learning equals better drug discovery insights

AI Siblings: NLP and Machine Learning for better drug discovery

October 13, 2017

There’s growing interest within the biopharmaceutical community in using machine learning to solve challenges across the drug-discovery pipeline. High-quality training data is vital to machine learning success, but much of this information is tied up in unstructured or semi-structured text sources. Natural language processing (NLP) is the key to extracting the wealth of data hidden in that text, and Linguamatics’ customers have been finding out first-hand what this approach can do for them.

Using Linguamatics I2E NLP text mining:

  • Eli Lilly researchers mine adverse event data to identify potential new uses for existing drugs.
  • A top-10 pharma company processes and interprets unstructured “voice of the customer” call feeds, categorizing them and using the results to help build predictive models.
  • Roche and Humboldt University of Berlin identified MEDLINE abstracts that mention both the protein target and the specific disease indication of a known set of cancer therapeutics, then applied machine learning to predict the success or failure of drugs in Phase II or III trials with high accuracy.

Read the full “Data-driven NLP plus machine learning” application note to find out more about how NLP can support effective machine learning projects.