“Sometimes I think we’re becoming more of a data analytics company than anything else”

Humana Chief Medical Officer Roy Beveridge, M.D.

Why are Top Payers investing in Analytics?

In a Fierce Healthcare interview, Humana highlighted its long-term commitment to analytics. Humana’s focus on Medicare Advantage means that it is increasingly entering data partnerships with hospitals to provide insights and support through population health tools. What fascinates me is that this type of partnership would never have been possible in a fee-for-service world, and it reflects the commitment to move toward value-based care.

Population Health Analytics

Population health tools in this space need to work with the heterogeneous data sets payers receive, and they are used to characterize and support members. Many groups I have spoken to stratify high-risk individuals in these data sets; for example, smokers with heart disease are identified, and payers give them guidance on quitting and incentives for exercise programs. I especially admire the way value-based providers and payers are working together so that advice on high-risk individuals can be given directly to clinicians.
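To make the stratification idea concrete, here is a minimal Python sketch of flagging smokers with heart disease in a member list. The `Member` record, its fields, and the sample data are all hypothetical; real payer data is far messier and would need cleaning and coding before a rule like this could run.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    # Hypothetical member record, invented for illustration only.
    member_id: str
    smoker: bool
    conditions: set[str] = field(default_factory=set)


def high_risk_smokers(members):
    """Return the IDs of members who smoke and have heart disease on record."""
    return [
        m.member_id
        for m in members
        if m.smoker and "heart disease" in m.conditions
    ]


# Made-up sample data.
members = [
    Member("A1", smoker=True, conditions={"heart disease"}),
    Member("B2", smoker=False, conditions={"heart disease"}),
    Member("C3", smoker=True, conditions={"asthma"}),
]
print(high_risk_smokers(members))
```

In practice the flagged list would feed an outreach step, such as the guidance and incentives described above.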


I always like the Fall, the "season of mists and mellow fruitfulness". It was the Autumn Equinox recently, and here in the UK, we are enjoying that lovely balance of weather at the turning of the seasons, the slow change from summer to fall, autumn fruits ripening, nights starting to draw in.

And of course this means our thoughts turn to… the Linguamatics Text Mining Summit! Held on the East Coast, this is an opportunity for our text mining community, across both healthcare and pharma industries, to come together. Attendees share best practice, get some hands-on training, and listen to talks on how others are finding real value from their textual data, using the power of NLP-based text mining.

This year, we will be in New Castle, New Hampshire, and the main talks will be on Tuesday 2nd October and Wednesday 3rd October. As always we have a balance of talks from our life science and healthcare customers, and from Linguamatics presenters, providing updates on current and future developments and plans. Our customer speakers encompass a wide range of use cases, spanning drug discovery and development, and into clinical delivery of therapeutics and better patient care.


Linguamatics I2E 5.1 focuses on further increasing the power and scale of querying, while optimizing the user experience of query building.

I2E 5.1 enriches and expands on the capabilities introduced in I2E 5.0, which made a big splash in NLP text mining technology.  

I2E 5.1 addresses the increasing variety of representations of the same concept in big data by finding more matches for terms in a document: variations in accented characters, spelling errors, and OCR artefacts are taken into consideration when matching. This ‘fuzzy matching’ returns more search results, increasing recall and accuracy.
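As a rough illustration of the general idea, and not of how I2E implements it, the sketch below strips accents and then accepts tokens within a similarity threshold of the search term. It uses only the Python standard library; the function names, the 0.8 cutoff, and the sample tokens are assumptions chosen for this example.

```python
import difflib
import unicodedata


def normalize(term):
    """Lower-case a term and strip accents (for example 'ö' becomes 'o')."""
    decomposed = unicodedata.normalize("NFKD", term.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fuzzy_find(term, tokens, cutoff=0.8):
    """Return tokens whose normalized form is similar enough to the term.

    The cutoff is an arbitrary illustrative threshold, not an I2E setting.
    """
    target = normalize(term)
    return [
        tok
        for tok in tokens
        if difflib.SequenceMatcher(None, target, normalize(tok)).ratio() >= cutoff
    ]


# One accented spelling, one plain spelling, one typo, one unrelated word.
tokens = ["Sjögren", "Sjogren", "Sjogrem", "syndrome"]
print(fuzzy_find("sjogren", tokens))
```

Lowering the cutoff catches more misspellings but also lets in more unrelated words, which is the usual trade-off between recall and precision.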

One customer commented: ‘I am really looking forward to I2E 5.1’s spelling correction…you don’t realize how much you can miss in your search results because of typos and spelling mistakes.’

Data normalization in I2E, a key feature for tackling big data’s increasing variety, is now easier to use. Regardless of how the original document is written, you can define your numeric ranges in a different unit; for example, you can filter in pounds (as an upper or lower threshold or as a range) and display the results in kilograms.
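The pounds-to-kilograms example can be sketched in a few lines of Python. This is only an illustration of the normalization idea under simple assumptions: measurements arrive as `(value, unit)` pairs, only `kg` and `lb` are handled, and the function names are hypothetical rather than part of I2E.

```python
KG_PER_LB = 0.45359237  # exact, by definition of the international pound


def to_kg(value, unit):
    """Convert a weight to kilograms; only 'kg' and 'lb' are supported here."""
    if unit == "kg":
        return float(value)
    if unit == "lb":
        return value * KG_PER_LB
    raise ValueError(f"unsupported unit: {unit}")


def filter_by_pounds(measurements, low_lb, high_lb):
    """Keep weights inside a range given in pounds; report them in kilograms."""
    low_kg, high_kg = low_lb * KG_PER_LB, high_lb * KG_PER_LB
    results = []
    for value, unit in measurements:
        kg = to_kg(value, unit)
        if low_kg <= kg <= high_kg:
            results.append(round(kg, 1))
    return results


# Made-up weights as they might appear in source documents, in mixed units.
measurements = [(150, "lb"), (70, "kg"), (95, "kg"), (210, "lb")]
print(filter_by_pounds(measurements, 140, 200))
```

Converting everything to one internal unit first means the filter and the display unit can be chosen independently of how the source document was written.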

I2E 5.1 introduces an integrated view of your query and a way of dragging queries around the editor, making it easier to design, tune and maintain your searches.


The 2017 Text Mining Summit (New Castle, New Hampshire, October 2-4) will be your first opportunity to take part in our new I2E Certificate Program.  The Level 1 Query User Certificate will be open to those who have just taken the “Introduction to I2E” hands-on workshops provided at the TMS, as well as more established users, who have taken the “Introduction to I2E” training on previous occasions. See the TMS Workshop Selection Guide for more details. It’s free to join in as part of your TMS registration.

Completing the different levels of the Certificate Program will allow you to validate, extend and improve your I2E skills. The Query User Certificate will focus on using and editing basic queries and Resource queries to:

  • Create simple queries with different constraints, morphological variants, preferred terms and alternative lists

  • Use classes to improve recall and precision of queries with linguistic classes, ontologies, and pattern ontologies

  • Work with results by using limits, output formats and displays

  • Use Resource queries to answer common questions

Those taking the Query User Certificate at the TMS will have access to:

  • In-class instruction

  • Practical, hands-on experience with I2E

  • Open question sessions with I2E Experts

  • A set of learning objectives

  • Learning materials, including

    • Tutorial booklets


IDMP compliance will require Market Authorization Holders to submit and maintain a broad range of data elements about medicinal products with the EMA. Currently, 70% of these data elements exist in unstructured text, hidden in multiple document formats, styles, and languages. These data play a key role in the core operation of a pharmaceutical company, and are re-used for multiple purposes across the business.

The challenge for such companies is to find a quick, accurate, and affordable way to search, extract, standardize, and structure the 300–2,000 data elements required per product for IDMP compliance.

Linguamatics NLP extracts IDMP data elements

Mundipharma Research Limited implemented a pilot project using I2E, Linguamatics’ natural language processing-based text mining solution, to find, highlight and extract data elements for Iteration 1 from unstructured documents such as the EMA Summary of Product Characteristics (SmPC).

I2E queries were developed to extract the individual data elements using standard and customized ontologies, as well as linguistic features of SmPCs. Accuracy was evaluated against a ‘gold standard’ data set that had been manually extracted by an independent expert. Find out more about I2E's Extract Transform Load (ETL) solutions here.
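For readers unfamiliar with gold-standard evaluation, the sketch below shows one common way to score extracted data elements against a manually curated set, using precision, recall, and F1. The post does not say which metrics the pilot used, so treat this as an assumption; the field names and values are invented for illustration.

```python
def score(extracted, gold):
    """Compare extracted (field, value) pairs with a gold standard set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data elements, not taken from any real SmPC.
gold = {
    ("strength", "10 mg"),
    ("route", "oral"),
    ("form", "tablet"),
    ("shelf_life", "36 months"),
}
extracted = {
    ("strength", "10 mg"),
    ("route", "oral"),
    ("form", "capsule"),
}

precision, recall, f1 = score(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here precision measures how many extracted elements were correct, while recall measures how many of the gold standard elements were found.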

Jon Sanford, Head of Regulatory Information Management and Operations at Mundipharma Research: “We were really impressed when we saw the accuracy with which I2E had been able to extract data elements from the documents”.