Effective Rule-based NLP Enables Machine Learning and Clinical Insights

May 18 2017

There’s a lot of buzz in the healthcare community at the moment surrounding the use of artificial intelligence with machine learning for pattern identification, decision-making, and outcome prediction. The availability of high-quality data for training algorithms is vital to machine learning’s success - but a lot of this information is tied up in unstructured clinical notes. Natural language processing (NLP) is the key to extracting the “good stuff” from this vast trove of unstructured text. Combining that “good stuff” with already structured data helps healthcare providers to understand the patterns and trends in data via machine learning - and thereby enhance care, reduce costs, and improve population health.

Which type of NLP software is best?

The first question that healthcare users must ask themselves is “Which type of NLP software best suits my needs?”

Statistical NLP systems learn from example data in order to identify patterns in new data. The examples may come from dictionaries or ontologies, or they might need to be manually annotated by a clinician, which can be an extremely laborious and costly task for the institution.

Meanwhile, most rule-based NLP systems require a specialist to define the language rules or patterns that represent particular healthcare concepts. This approach can make them more accurate, but they are limited to the patterns that the specialist has thought of.
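To make the trade-off concrete, here is a minimal sketch of a generic rule-based extractor written in plain Python. It is not I2E and does not reflect how I2E works internally; the rule list, the labels, and the `smoking_status` function are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical hand-written rules: each maps a surface pattern to a concept label.
# The first matching rule wins, so more specific patterns are listed first.
SMOKING_RULES = [
    (re.compile(r"\b(?:denies|never)\s+(?:smoking|smoked)\b", re.I), "non_smoker"),
    (re.compile(r"\b(?:quit|stopped)\s+smoking\b|\bformer smoker\b", re.I), "former_smoker"),
    (re.compile(r"\bsmokes\b|\bcurrent smoker\b", re.I), "current_smoker"),
]


def smoking_status(note: str) -> str:
    """Return the label of the first rule that matches, or 'unknown'."""
    for pattern, label in SMOKING_RULES:
        if pattern.search(note):
            return label
    return "unknown"


print(smoking_status("Patient denies smoking."))
print(smoking_status("She quit smoking in 2010."))
print(smoking_status("Uses nicotine patches daily."))
```

The last note falls through to "unknown" because no rule anticipates that wording, which is the limitation described above: the extractor only covers what its author thought to write down.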

Linguamatics I2E: a ground-breaking approach to rule-based NLP

Linguamatics I2E, however, has an innovative approach to rule-based NLP, which makes it easier to create, edit, and reuse patterns, even when the user is not an NLP specialist. I2E can dramatically accelerate the development of new machine learning algorithms and free up resources, giving machine learning projects a much greater chance of success. How does it do this?

  • No need for annotated data - I2E can produce training data for machine learning much more quickly and with much less clinical input. Clinical analysts can evaluate samples of query results and compare query runs in a fraction of the time that manual annotation of clinical records would take. Customers can prepare data in days instead of spending months on manual review. They can also include any prior knowledge they may have of the types of linguistic construction that might be present in the raw data, avoiding training “gaps.”
  • Flexible - I2E can be easily and quickly modified, and new features (for example lifestyle, a quality measure, or a pathology attribute) can be added.
  • Transferable - Using I2E to incorporate human knowledge as rules means that new data can be added to models with minimal effort, making it easier to apply models trained at one site to another site.
  • Rapidly add new keywords and phrases - I2E simplifies the harvesting of new keywords and phrases, and uses existing terminologies, to speed up the creation of features and shorten model development time. This rapid bootstrapping of new ontologies is a key feature of I2E.
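As a rough illustration of the general idea behind these points (rule output becoming machine learning features alongside structured data), the following sketch uses plain Python regular expressions. It is not I2E's query language or API; `FEATURE_RULES`, `add_rule`, and `build_row` are hypothetical names, and the patterns are deliberately simplistic (for instance, they do not handle negation).

```python
import re
from typing import Dict

# Hypothetical rule set: feature name -> pattern that signals the feature in a note.
FEATURE_RULES: Dict[str, "re.Pattern[str]"] = {
    "mentions_smoking": re.compile(r"\bsmok(?:es|er|ing)\b", re.I),
    "mentions_dyspnea": re.compile(r"\b(?:dyspnea|shortness of breath)\b", re.I),
}


def add_rule(name: str, regex: str) -> None:
    """Register a new text-derived feature without touching existing ones."""
    FEATURE_RULES[name] = re.compile(regex, re.I)


def build_row(structured: Dict[str, float], note: str) -> Dict[str, float]:
    """Combine already structured fields with binary features found by the rules."""
    row = dict(structured)
    for name, pattern in FEATURE_RULES.items():
        row[name] = 1.0 if pattern.search(note) else 0.0
    return row


# Adding a lifestyle feature is a single rule, not a new annotation effort.
add_rule("mentions_alcohol", r"\balcohol\b")

row = build_row({"age": 64.0}, "Current smoker with shortness of breath.")
print(row)
```

Each row produced this way could then be passed to whichever machine learning library a project already uses; the point of the sketch is only that adding or editing a rule changes the feature set immediately.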

All of this means that I2E plus machine learning is truly the best of both worlds, allowing the healthcare community to build better machine learning models to understand and address problems in patient care.

To learn more, access our application note on 'Effective rule-based NLP enables Machine Learning and clinical insights' or contact us.