Rare diseases and NLP

Rare disease and the most vulnerable population: how can NLP help?

Zebras -vs- Horses

Watching the development of a newborn unfold is both exciting and terrifying. As a physician and now the parent of a newborn, I can say with certainty that the logical side of me struggles when diagnosing my own child. My baby wasn’t even a day old when I was already convinced that she might have Hirschsprung disease, a bowel condition that can keep a newborn from passing stool. Later I learned the nurse had changed her diaper (during my brief nap, or rather collapse, from pure exhaustion) and forgot to mention that her intestines were indeed doing their job. As you progress you learn, as in medical school, to assume the more common problems first and to recognize when you should go down the less common diagnostic route. A blocked tear duct in babies can look scary but is relatively common (1 in 25) and often clears with non-invasive methods; babies really do cry relentlessly sometimes and this is completely ‘normal’. You learn to recognize when it’s simply the trials and tribulations of a growing child, and when it’s not. In medicine this is referred to as the Zebras -vs- Horses phenomenon: look for the most common diagnosis first, not the rarest.

But sometimes it is a zebra…

There are indeed rare diseases, aka zebras. With zebras, you generally know where to find them: say, on a safari or at your local zoo. Rare diseases are not so reliable. They often remain hidden and aren’t found in predictable places, unless you know of a family whose members are carriers for a certain disease. Otherwise, you need to know where to look, and how. Rare diseases are often diagnosed at a critical moment when all other diagnoses have been ruled out (sometimes multiple times). In my opinion this is too late; the rare disease should already have been among the differentials. But how do you accomplish such a task? Do you simply have someone manually comb through patients’ charts trying to see when this rarity actually appears? That’s not a feasible solution: it’s time to bring in Artificial Intelligence.

How can Augmented Intelligence using NLP help?

Faster identification of patients with rare diseases is key. You never know where a mention of clinical traits might turn up. A parent may have sent the pediatrician a patient portal message describing signs and symptoms that worried them, and that message could be a vital clue in piecing together a diagnosis much faster. Humans can miss crucial information, such as a one-time mention of something many years ago. To capture this more reliably, organizations can look to Augmented Intelligence (AI) techniques using Natural Language Processing (NLP). Key information relating to signs and symptoms is found in both structured data (i.e. discrete fields) and unstructured data (e.g. free-text content), and in order to identify people with rare disease you need both. NLP can be applied across a multitude of data sources to make sense of essential unstructured information. With the right data, analytics can provide a clearer picture not only for individuals but for whole disease populations, allowing for the creation or enhancement of rare disease registries to further help the cause.
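To make the idea of combining structured and unstructured data a little more concrete, here is a minimal sketch in Python. It is not a real clinical NLP pipeline: it uses plain keyword patterns where a production system would use proper NLP (handling negation, misspellings, synonyms and context), and every field name, phrase and threshold in it is a hypothetical placeholder rather than a validated screening rule.

```python
import re
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    patient_id: str
    structured: dict = field(default_factory=dict)  # discrete fields from the chart
    notes: list = field(default_factory=list)       # free text: notes, portal messages


# Hypothetical free-text phrases, for illustration only (not clinically validated).
TEXT_PATTERNS = {
    "enlarged liver or spleen": re.compile(
        r"\b(hepatosplenomegaly|enlarged (liver|spleen))\b", re.IGNORECASE
    ),
    "recurrent ear infections": re.compile(
        r"\b(recurrent|frequent) ear infections?\b", re.IGNORECASE
    ),
    "joint stiffness": re.compile(
        r"\b(stiff joints?|joint stiffness)\b", re.IGNORECASE
    ),
}

# Hypothetical structured fields and the values that count as a clue.
STRUCTURED_FLAGS = {
    "hearing_screen": "fail",
    "hernia_repair": True,
}


def collect_clues(record):
    """Gather clues from both the discrete fields and the free text."""
    clues = set()
    for name, expected in STRUCTURED_FLAGS.items():
        if record.structured.get(name) == expected:
            clues.add(f"structured:{name}")
    for note in record.notes:
        for label, pattern in TEXT_PATTERNS.items():
            # Naive match: a negated phrase ("no enlarged liver") would also hit.
            if pattern.search(note):
                clues.add(f"text:{label}")
    return clues


def flag_for_review(records, min_clues=2):
    """Return IDs of patients with enough distinct clues to merit a human look."""
    return [r.patient_id for r in records if len(collect_clues(r)) >= min_clues]


if __name__ == "__main__":
    records = [
        PatientRecord(
            "A",
            {"hearing_screen": "fail"},
            ["Mom reports frequent ear infections since last year."],
        ),
        PatientRecord("B", {"hearing_screen": "pass"}, ["Well child visit, no concerns."]),
    ]
    print(flag_for_review(records))
```

In this toy example patient A is flagged only because one structured clue and one free-text clue are counted together, which is the point of using both sources. The output is a list of charts for a clinician to review, not a diagnosis.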

Here are two use case webinars detailing how NLP is helping children with rare disease:

  • A Systematic Examination of Gene-Disease Associations Through Text Mining Approaches for Hunter Syndrome. Madhusudan Natarajan discusses the value of NLP text mining for assessing disease severity and genotype-phenotype associations in Hunter syndrome. This rare disease, also known as Mucopolysaccharidosis II, is caused by an X-linked deficiency in iduronate-2-sulfatase. In this disease, chains of sugar molecules used to build connective tissues accumulate in organs and tissues over time, which can cause damage that affects physical and mental development and abilities.
     
  • The Use of Natural Language Processing to Improve Phenotype Extraction for Precision Medicine at the University of Iowa. Benjamin Darbro, MD, PhD, Associate Professor of Pediatrics, Stead Family Department of Pediatrics at the University of Iowa, and Alyssa Hahn, doctoral student in the Interdisciplinary Graduate Program in Genetics at the University of Iowa, present their NLP efforts to identify clinical phenotypes in infants.

Protecting and serving the rare disease population

There are many valiant initiatives underway to connect patients and advocate for individuals in the area of rare diseases. We can all do our part, as individuals and as organizations. You can learn more about rare disease efforts through organizations such as the Rare Action Network (RAN). You can also find guidance on creating rare disease registries from the National Center for Advancing Translational Sciences (NCATS) Rare Diseases Registry Program (RaDaR). Let’s all do our best to help advance rare disease initiatives to rapidly identify the “zebras”. That way we can improve healthcare outcomes for the most fragile members of our human population.