Posts from May 2016

I attended a Big Data in Pharma conference recently, and very much liked a quote from Sir Muir Gray, cited by one of the speakers: "In the nineteenth century health was transformed by clean, clear water. In the twenty-first century, health will be transformed by clean clear knowledge."  

This was part of a series of discussions and round tables on how we, within the Pharma industry, can best use big data, both current and legacy, to inform decisions for the discovery, development and delivery of new healthcare therapeutics. Data integration, breaking down data silos to create data assets, data interoperability, and the use of ontologies and NLP were all themes presented, with the aim of giving researchers and scientists a clean, clear view of all the appropriate knowledge for actionable decisions across the drug development pipeline.

A new publication describes how text analytics can provide one of the tools for that data interoperability ecosystem, to create a clean, clear view. McEntire et al. describe a system that combines Pipeline Pilot workflow tools, Linguamatics I2E NLP linguistics and semantics, and visualization dashboards to integrate information from key public domain sources, such as MEDLINE, OMIM, ClinicalTrials.gov, NIH grants, patents and news feeds, as well as internal content sources.


What if physicians could offer patients access to a potentially life-preserving test, but could not easily identify which of their patients were eligible?

That is the exact situation many providers have found themselves in since Medicare announced it would begin covering lung cancer screening for patients meeting a certain set of criteria.

In a decision memo published in February 2015, CMS agreed to make Medicare coverage available for low dose computed tomography (LDCT) lung cancer screening for eligible patients. Patients who are between ages 55 and 77, asymptomatic, either current smokers or former smokers who quit within the last 15 years, and have a tobacco smoking history of at least 30 pack-years can now qualify for an annual preventive screening.
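The coverage criteria above amount to a simple boolean rule. As a minimal sketch (the function and field names here are illustrative, not taken from CMS or any EHR system), the eligibility check could look like this:

```python
# Sketch of the CMS LDCT screening eligibility criteria described above.
# Function and parameter names are illustrative, not from CMS.

def ldct_eligible(age, asymptomatic, current_smoker,
                  years_since_quit, pack_years):
    """Return True if a patient meets the 2015 CMS criteria for
    annual low dose CT lung cancer screening."""
    # Current smokers qualify on smoking status; former smokers only
    # if they quit within the last 15 years.
    smoking_status_ok = current_smoker or (
        years_since_quit is not None and years_since_quit <= 15)
    return (55 <= age <= 77
            and asymptomatic
            and smoking_status_ok
            and pack_years >= 30)

# A 60-year-old asymptomatic former smoker who quit 10 years ago
# with a 40 pack-year history qualifies:
print(ldct_eligible(60, True, False, 10, 40))   # True
# Quit 20 years ago -> no longer eligible:
print(ldct_eligible(60, True, False, 20, 40))   # False
```

The hard part for providers, as the post goes on to note, is not evaluating this rule but extracting inputs like pack-year history from free-text records in the first place.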

CMS added the coverage after determining there was sufficient evidence that LDCT procedures were cost-effective for high-risk populations. A study based on National Lung Screening Trial data, for example, found that 12,000 deaths a year could be avoided if high-risk patients underwent an LDCT scan. Lung cancer is currently the leading cause of cancer-related death among both men and women in the US.


Linguamatics hosted our Spring Text Mining Conference in Cambridge last week (#LMSpring16). Attendees from the pharmaceutical industry, biotech, healthcare, personal consumer care, crop science, academia, and partner vendor companies came together for hands-on workshops, round table discussions, and of course, some excellent presentations and talks. 

The talks kicked off with a presentation by Thierry Breyette, Novo Nordisk, who described three different projects where text mining provided significant value from real world data. Thierry took the RAND Corporation definition: "Real-world data (RWD) is an umbrella term for different types of data that are not collected in conventional randomised controlled trials. RWD comes from various sources and includes patient data, data from clinicians, hospital data, data from payers and social data."

At Novo Nordisk they have gained business impact by text mining a variety of sources, including: social media, to find digital opinion leaders; transcripts of conversations between medical liaisons and healthcare professionals, for trends around clinical insights; and patient and caregiver ethnographic data, to see patterns in patient sentiment and compliance.