Posts from June 2014

Part of the I2E Enterprise installation is the Sample Web GUI — a Smart Query interface written as a web application that allows users to run smart queries using only their browser.

The Smart Query interface

A neat trick that it performs is on-the-fly class matching: start typing in a word and the server starts to suggest terms in your dictionary that would match. So a search for “psor” will suggest Psoriasis, Psoriatic Arthritis, etc.

Accepting the suggestion will then populate the search with that class rather than the word. The autosuggestion, dropdowns and tooltips are very nice from the user experience perspective, but today’s post will concentrate on the class match itself – how can a search for “psoriasis” retrieve a class match?

There is a two-part answer to that question – the first part is quite easy to answer and the second part is (only slightly) more complicated. So, let’s start with the first part.

Using the query parameters “search”, “pt” or “synonym”

Class matching is a synchronous operation in I2E: a query parameter specifies the input, and the server returns the matches as a list of classes. Because of this, you can try it directly in your web browser. The general form of the URL is (omitting the protocol, server name and port for brevity):

/api;type=class/pathto/myindex/?search=psoriasis
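The same URL can be assembled in a script. What follows is a minimal sketch, not part of any I2E client library: the host, port and function name are made up for illustration, and only the path shape and the three query parameter names come from the text above. It builds the URL and leaves the request itself, and the format of the response, to the reader.

```python
from urllib.parse import urlencode

# Query parameters named in the heading above.
ALLOWED_PARAMS = ("search", "pt", "synonym")


def class_match_url(base_url, index_path, term, param="search"):
    """Build a class-match URL of the general form shown above.

    base_url is a hypothetical protocol/host/port, e.g. "http://i2e.example.com:8080".
    index_path is the path to the index, e.g. "pathto/myindex".
    """
    if param not in ALLOWED_PARAMS:
        raise ValueError("param must be one of %s" % (ALLOWED_PARAMS,))
    path = "/api;type=class/" + index_path.strip("/") + "/"
    # urlencode escapes spaces and other special characters in the term.
    return base_url.rstrip("/") + path + "?" + urlencode({param: term})


if __name__ == "__main__":
    # i2e.example.com:8080 is a placeholder, not a real server.
    print(class_match_url("http://i2e.example.com:8080", "pathto/myindex", "psoriasis"))
```

Using urlencode rather than string concatenation means that multi-word terms such as "psoriatic arthritis" are escaped correctly.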


It’s funny, isn’t it? Search at home just works. You’re looking for a holiday, train times, a particular recipe or the answer to your kid’s homework.

You sit down and type your keywords into your search engine. Milliseconds later, results appear, and the one you’re looking for is usually among the first. You click on it and voilà! You have what you were looking for.

But search at work doesn’t seem to be as effective. Maybe you are looking for information internally. You know it exists but you’re not quite sure where. The information lies across silos and it’s a mix of structured and unstructured.

As a scientist, it’s important for you to easily find information hidden in memos, project plans, meeting minutes, study reports, literature, etc. You type a keyword search into your enterprise search engine.

A list of documents comes back but none of them look like the one you want. You feel like you’re wasting your time. Sound familiar?

You’re not alone. At least, that is what recent surveys and conferences on enterprise search have revealed. According to a recent report from Findwise, 64% of organizations say it’s difficult to find information within their organization. Why?

  • Poor search functionality
  • Inconsistencies in how information is tagged
  • People don’t know where to look or what to look for

So how can we address this? Well, there’s already been talk of using text analytics to improve enterprise search.


Today, Linguamatics launches I2E Semantic Enrichment to provide increased return on investment in enterprise search systems and radically improve speed to insight.

I2E Semantic Enrichment is used within an existing enterprise search deployment to enrich the current data, make it more discoverable and provide more relevant search results.

The software scans millions of documents to identify and mark up semantic entities such as genes, drugs, diseases, organizations, authors and other relevant concepts and relationships. Enterprise search engines consume this enriched metadata to provide a faster, more effective search for users.

I2E uses natural language processing (NLP) technology to find concepts in the right context, combined with a range of other strategies including application of ontologies, taxonomies, thesauri, rule-based pattern matching and disambiguation based on context. This allows enterprise search engines to gain a better understanding of documents in order to provide a richer search experience and increase findability, which enables users to spend less time on search.

Synonyms allow the user to find all relevant results, not just those containing the exact word. I2E also provides rich, multi-level facets for the search engine to help the user filter down to the most relevant results across the areas of interest (e.g. disease or drug class).
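To make the idea of synonym-aware enrichment concrete, here is a toy sketch. It is not how I2E works internally, which combines NLP with ontologies and disambiguation. It is a plain dictionary lookup, with a made-up thesaurus and function name, that shows how a synonym in the text can be mapped to a preferred term and a facet that a search engine could then index.

```python
import re

# Hypothetical toy thesaurus: surface form -> (preferred term, facet).
THESAURUS = {
    "psoriasis": ("Psoriasis", "disease"),
    "psoriatic arthritis": ("Psoriatic Arthritis", "disease"),
    "methotrexate": ("Methotrexate", "drug"),
    "mtx": ("Methotrexate", "drug"),
}


def enrich(text):
    """Return sorted (facet, preferred term) pairs for thesaurus entries found in text."""
    found = set()
    for surface, (preferred, facet) in THESAURUS.items():
        # Whole-word, case-insensitive match on the surface form.
        if re.search(r"\b" + re.escape(surface) + r"\b", text, flags=re.IGNORECASE):
            found.add((facet, preferred))
    return sorted(found)


if __name__ == "__main__":
    doc = "Patient with psoriatic arthritis was started on MTX."
    print(enrich(doc))
```

With metadata like this attached to the document, a search for "methotrexate" could return it even though the text contains only the abbreviation "MTX".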

Linguamatics Executive Chairman John M. Brimacombe commented, "It’s no secret that enterprise search hasn’t quite lived up to expectations, with users struggling to find the information they need and organizations dissatisfied with their search solutions.


It’s always good to see NLP being used in clinical care. A recent story about Microsoft and the University of Washington in Seattle using NLP for pneumonia detection in the ICU is a good example of this.

The project, called deCIPHER, uses a combination of Microsoft linguistics and machine learning to assess clinical information from electronic medical records and derive a diagnosis.

The system was trained on a cohort of 100 patients who had already been diagnosed with pneumonia, using a machine learning framework to build a predictive model from the extracted clinical factors. The system correctly identified 84% of positive patients, and the team is assessing whether to incorporate the model into an ICU dashboard.

Last year, Kaiser Permanente also published a paper on pneumonia diagnosis in the ICU from chest radiograph reports, using Linguamatics I2E for information extraction and then applying machine learning to the resulting clinical factors.

From a total of 194,615 ICU reports, Kaiser Permanente empirically developed a lexicon to categorize pneumonia-relevant terms and uncertainty profiles.


New product release allows tens of thousands of enterprise search users to benefit from the power of Linguamatics’ market-leading technology.

(Cambridge, England and Boston, USA – June 19, 2014) Today, Linguamatics launches I2E Semantic Enrichment to provide increased return on investment in enterprise search systems and radically improve speed to insight.
