How to protect and develop your enterprise search investment with text analytics

June 24 2014

It’s funny, isn’t it? Search at home just works. You’re looking for a holiday, train times, a particular recipe or the answer to your kid’s homework.

You sit down and type your keywords into your search engine. Milliseconds later, results appear, and the one you’re looking for is usually among the first. You click on it and voilà! You have what you were looking for.

But search at work doesn’t seem to be as effective. Maybe you are looking for information internally. You know it exists but you’re not quite sure where. The information is spread across silos and is a mix of structured and unstructured content.

As a scientist, you need to be able to easily find information hidden in memos, project plans, meeting minutes, study reports, literature and so on. You type a keyword search into your enterprise search engine.

A list of documents comes back but none of them look like the one you want. You feel like you’re wasting your time. Sound familiar?

You’re not alone. At least, that is what recent surveys and conferences on enterprise search have revealed. According to a recent report from Findwise, 64% of organizations say it’s difficult to find information within their organization. Why?

  • Poor search functionality
  • Inconsistencies in how information is tagged
  • People don’t know where to look or what to look for

So how can we address this? Well, there’s already been talk of using text analytics to improve enterprise search.

Text analytics, also referred to as text mining, allows users to go beyond keyword search to interpret the meaning of text in documents. While text analytics solutions have existed for some years now, more recently they’ve been working in harmony with enterprise search to improve the quality of results and make information more discoverable.

Let me give you an example. For over 10 years, Linguamatics I2E has been mining data and content such as scientific literature, patents, clinical trials data, news feeds, electronic health records, social media and proprietary content, working with 17 of the top 20 pharmaceutical companies to improve and power their knowledge discovery. Meanwhile, organizations have been deploying enterprise search engines to search internally.

Dissatisfied with its search solution and already familiar with I2E from other areas, a top 20 pharma company wanted to see if the power of I2E’s text analytics could be applied to its enterprise search system.

A proof of concept was proposed using Microsoft SharePoint. The organization did some internal requirements gathering and worked with both Microsoft and Linguamatics to come up with a solution to improve their search.

I2E worked in the background, using its natural language processing technology to identify concepts and mark up semantic entities such as genes, drugs, diseases, organizations, anatomy, authors and other relevant concepts and relationships. Once annotated, taxonomies/thesauri were built and the marked-up documents were fed back into SharePoint.
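To make the idea of marking up semantic entities a little more concrete, here is a minimal sketch of dictionary-based annotation. It is not how I2E works internally; it simply shows the general shape of the step under some loud assumptions: the tiny `THESAURUS`, the entity types and the `annotate` function are all hypothetical, and real natural language processing goes well beyond matching strings.

```python
import re

# Hypothetical mini-thesaurus (an assumption for illustration only):
# preferred concept name -> (entity type, surface forms seen in text).
THESAURUS = {
    "breast cancer": ("disease", ["breast cancer", "breast carcinoma",
                                  "breast tumor", "cancer of the breast"]),
    "tamoxifen": ("drug", ["tamoxifen"]),
    "BRCA1": ("gene", ["BRCA1"]),
}


def annotate(text):
    """Return (start, end, entity_type, concept) spans, ordered by position."""
    spans = []
    for concept, (entity_type, forms) in THESAURUS.items():
        for form in forms:
            # Whole-word, case-insensitive match of each surface form.
            pattern = r"\b" + re.escape(form) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                spans.append((match.start(), match.end(), entity_type, concept))
    return sorted(spans)


doc = "Tamoxifen response in breast carcinoma patients carrying BRCA1 mutations."
annotations = annotate(doc)

for start, end, entity_type, concept in annotations:
    print(f"{doc[start:end]!r} -> {entity_type}: {concept}")
```

Each span records where the entity was found and which concept it maps to, which is the kind of markup a search engine can then index alongside the original document.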

To the users, the search interface remained the same but there was a difference in the results. I2E was able to provide semantic facets for the search engine to allow the user to quickly filter the results to what they were looking for.

The facets were concepts rather than words, which allowed users to filter results down to a more intuitive set of things they were looking for, e.g. “just show me the results for ‘breast cancer’ as a concept”. This would also include all results containing variations of how that concept appeared in the text, e.g. breast carcinoma, breast tumor, cancer of the breast, etc.
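Here is a small sketch of what filtering by concept rather than by word might look like. Everything in it is an assumption made for illustration: the `VARIANT_TO_CONCEPT` table, the sample `DOCS` and the function names are hypothetical, and the naive substring matching stands in for proper text analytics.

```python
from collections import Counter

# Hypothetical variant table (assumption): surface form -> concept.
VARIANT_TO_CONCEPT = {
    "breast cancer": "breast cancer",
    "breast carcinoma": "breast cancer",
    "breast tumor": "breast cancer",
    "cancer of the breast": "breast cancer",
    "lung cancer": "lung cancer",
    "lung carcinoma": "lung cancer",
}

# Hypothetical sample documents (assumption).
DOCS = {
    "memo-1": "Study report on breast carcinoma biomarkers.",
    "memo-2": "Meeting minutes: cancer of the breast trial design.",
    "memo-3": "Literature review of lung carcinoma therapies.",
    "memo-4": "Project plan for the new lab building.",
}


def concepts_in(text):
    """Return the set of concepts whose variants occur in the text."""
    lowered = text.lower()
    return {concept for variant, concept in VARIANT_TO_CONCEPT.items()
            if variant in lowered}


def facet_counts(docs):
    """Count how many documents mention each concept (the facet values)."""
    counts = Counter()
    for text in docs.values():
        counts.update(concepts_in(text))
    return counts


def filter_by_concept(docs, concept):
    """Return the ids of documents that mention the concept in any variant."""
    return sorted(doc_id for doc_id, text in docs.items()
                  if concept in concepts_in(text))


facets = facet_counts(DOCS)
breast_cancer_hits = filter_by_concept(DOCS, "breast cancer")
print(facets)
print(breast_cancer_hits)
```

Neither of the two matching documents contains the literal phrase “breast cancer”, yet both show up under that facet because their wording maps to the same concept.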

In addition, I2E provided SharePoint with the ability to autocomplete terms as the user typed them, and, when performing the search, SharePoint was taught to look for synonyms of the words typed in.
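As a rough illustration of these two features, the sketch below suggests terms from a sorted vocabulary by prefix and expands a query with its synonyms. It does not use any SharePoint or I2E API; the `VOCABULARY`, the `SYNONYMS` table and both function names are hypothetical and chosen only to show the idea.

```python
import bisect

# Hypothetical synonym table (assumption): query term -> synonyms.
SYNONYMS = {
    "breast cancer": ["breast carcinoma", "breast tumor", "cancer of the breast"],
    "aspirin": ["acetylsalicylic acid"],
}

# Hypothetical vocabulary of known terms (assumption), kept sorted so that
# all terms sharing a prefix sit next to each other.
VOCABULARY = sorted(["aspirin", "asthma", "breast cancer",
                     "breast carcinoma", "bronchitis"])


def autocomplete(prefix, limit=5):
    """Suggest up to `limit` vocabulary terms that start with the prefix."""
    prefix = prefix.lower()
    start = bisect.bisect_left(VOCABULARY, prefix)
    matches = []
    for term in VOCABULARY[start:]:
        if not term.startswith(prefix) or len(matches) == limit:
            break
        matches.append(term)
    return matches


def expand_query(query):
    """Return the query term followed by any known synonyms."""
    key = query.lower().strip()
    return [key] + SYNONYMS.get(key, [])


print(autocomplete("bre"))
print(expand_query("Aspirin"))
```

A search engine would then run the expanded list of terms instead of the single term typed by the user, so documents that only mention a synonym are still found.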

The organization was incredibly happy with the improved search performance. It named the main benefits as improved efficiency, better quality search results, and information that was more transparent and available, which in turn stimulated innovation within the organization.

This is just the beginning.

The capabilities of I2E could also be applied to other search engines and to other scenarios where search needs to be improved. Doing so can increase the return on the investment already made in the system, protect and develop future investments, and increase usage and findability.

If you’d like to find out more, sign up for Linguamatics’ webinar or contact us for a demo.