Blog

Natural Language Processing: Standing on the shoulders of giants

Many of us know the joys and sorrows of research. Weeks, months and years can pass, developing hypotheses, working in the lab or clinic, analyzing results, sometimes going back to square one, but then writing the paper, and finally, seeing the final version published and in print. The intent is that your research is shared, discussed, re-used, so that others can build on it, “standing on the shoulders of giants,” as Isaac Newton famously said.

Traditionally, getting information out of written papers for re-use has been a manual process: individuals read, review and extract the key facts from tens or hundreds of papers by hand, in order to summarize the most up-to-date research in a field or understand the landscape of information around a particular research topic. Over the past few decades, Artificial Intelligence (AI) tools such as Natural Language Processing (NLP) have evolved that can hugely speed up and improve this data extraction. NLP solutions enable researchers to access information from huge volumes of scientific abstracts and literature, by developing strategies and rules that drill deep into the literature for hidden nuggets or, more broadly, plough the landscape for the desired information.

To give you a couple of examples, I’ll share two use cases, both published recently, that use the Linguamatics NLP platform across published literature, enabling researchers to benefit from years of previous research.


It Takes a Village to Raise Modern Medicine

Learning from the past

“It takes a village to raise a child” is a popular old African proverb that, in my opinion, has a lot of merit. Now that single-parent families, divorced families, and other non-traditional parenting units and methods are part of the mainstream, it’s still very important for nurturing and development to come from many different influences, especially those that are closest. I also believe this old proverb applies not just to childrearing but to other areas, such as how we work together and adopt new methods to make healthcare better.


It is well known that the drug discovery and development process is lengthy, expensive and prone to failure. Starting from the selection of a novel target in discovery, through the multiple steps to regulatory approval, the overall probability of success is less than 1%.

One factor is that the majority of diseases are multifaceted, so the challenge is to identify the patient populations most likely to respond to specific interventions. A stratified approach has proven beneficial in a number of cancers and genetic diseases, and pharmaceutical companies have a strong interest in understanding how to find these sub-populations of patients, to ensure that the most appropriate therapies are tested in clinical trials and applied in broader clinical use.

The ultimate aim of a stratified approach to medicine is to enable healthcare professionals to provide the “right treatment, for the right person, at the right dose, at the right time,” and there are many ongoing research initiatives (governmental, private, public) to develop the appropriate knowledge and models.


Linguamatics NLP-Based Phenome Extraction from the EHR

On October 24, Benjamin Darbro, MD, PhD, Associate Professor of Pediatrics, Stead Family Department of Pediatrics at the University of Iowa, and Alyssa Hahn, doctoral student in the Interdisciplinary Graduate Program in Genetics at the University of Iowa, will present the webinar “The Use of Natural Language Processing to Improve Phenotype Extraction for Precision Medicine.”

How does NLP support precision medicine and improve patient clinical care?

Precision medicine focuses on disease treatment and prevention, taking into account the variability in genes, environment, and lifestyle between individual patients. In order to understand the best treatment pathway for a particular patient or group of patients, it is important to be able to access and analyze detailed information from the medical records of patients, and ideally broader aspects beyond their medical history.


The Global Alliance for Genomics and Health (GA4GH) estimates that more than 60 million patients will have their genome sequenced in some healthcare-related scenario by 2025.

The Million Veteran Program is one such example, in which a research database will be assembled to anonymously study conditions such as diabetes and cancer, as well as military-related illnesses like post-traumatic stress disorder (PTSD). As of July 2019, more than 770,000 veterans had contributed blood samples and health data. Just think about the ways this precision medicine initiative changes how we could approach clinical care! How did this revolutionary change in medicine get started?

Evolution from peas to Precision Medicine

I don’t believe that Johann Gregor Mendel would have had any idea that his work with pea plants could evolve into a vast area of medicine with the potential to make such a revolutionary impact on healthcare. It all started in 1854, when Mendel began his work on the transmission of hereditary traits. “Why peas?”, you may ask. Peas have numerous distinct varieties, and new generations can be produced quickly and easily. Of course, as with any new theory, Mendel’s work met with skepticism and controversy, and it was not until many years after his death that he was crowned with the title “the father of modern genetics.”