Introducing I2E 5.4
Development of I2E 5.4 encompassed all areas of the product, with improvements to ontologies, indexing and query evaluation. These improvements benefit I2E users with Enterprise and OnDemand deployments, and also enhance the experience for users of iScite, I2E AMP and I2E Web Portals.
In every release our Ready-to-Use Ontologies (which are used in our I2E OnDemand indexes) are rebuilt with the latest sources, often with other improvements as well. I2E 5.4 includes several of these additional changes, including Cache Document Tooltips and Ontology Aliases.
Cache Document Tooltips now provide extra information when Ready-to-Use Ontology terms are highlighted in cached documents: click on the term to see additional information for that item (figure 1).
Ontology Aliases provides a mechanism for you to be informed that an out-of-date concept has been replaced with a new concept in an ontology. This means that the I2E server will know how to deal with concepts that are in older queries (developed against older indexes) which have been superseded in current indexes. This new feature is used extensively in I2E 5.4 to accelerate changes to our NCI Enhanced and Organizations by Sector ontologies.
Specific changes in our Ready-to-Use Ontologies include:
- Re-arranging the Linguamatics Diseases hierarchy to split it into two major branches: “Diseases and Disorders” and “Signs and Symptoms” (figure 2). This allows you to focus on Symptoms or, conversely, exclude Symptoms from your results.
- Many gene aliases (which are actually related but non-synonymous terms) have been reviewed and excluded from the Gene/Protein ontology. This improves precision when the ontology is used by itself or in bigger patterns, for example, protein-protein interactions.
- The matching of names from the Organizations by Sector ontology has been improved. This means that indexing will automatically match “Contoso Société à responsabilité limitée” to “Contoso SARL” without requiring both terms to exist in the ontology.
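The idea behind the improved organization name matching can be illustrated with a minimal sketch. This is not how I2E implements it; the `LEGAL_FORMS` table and the `normalize_org_name` function are hypothetical names used only for illustration, and the approach shown (rewriting long legal forms to their abbreviations before comparing) is an assumption.

```python
import re
import unicodedata

# Hypothetical lookup of long legal forms and their abbreviations.
# A real ontology would cover many more forms and languages.
LEGAL_FORMS = {
    "société à responsabilité limitée": "sarl",
    "incorporated": "inc",
    "limited": "ltd",
}


def normalize_org_name(name: str) -> str:
    """Return a comparison key in which legal forms are abbreviated."""
    # Compose accented characters and ignore case differences.
    text = unicodedata.normalize("NFC", name).casefold()
    # Drop dots and commas so that "S.A.R.L." and "SARL" compare equal.
    text = re.sub(r"[.,]", "", text)
    # Replace each long legal form with its abbreviation.
    for long_form, short_form in LEGAL_FORMS.items():
        text = re.sub(r"\b" + re.escape(long_form) + r"\b", short_form, text)
    # Collapse any repeated whitespace.
    return " ".join(text.split())


full_name = normalize_org_name("Contoso Société à responsabilité limitée")
short_name = normalize_org_name("Contoso SARL")
print(full_name == short_name)
```

With a key like this, only one variant of the name needs to be present in the ontology for both spellings to match.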
Query Gold Standard Evaluation
I2E 5.4 introduces the new Query Gold Standard Evaluation feature. In use for several years at Hackathons at Linguamatics events, this is now available to all users of I2E, allowing you to objectively measure the quality of your queries against a pre-determined gold standard dataset. Running the evaluation will run your query on your index, compare the results against the gold standard and generate measures of precision, recall and F-score. Evaluation can be run in either training or test mode: training mode will also show you the query results classified as true positives, false positives and false negatives (figure 3).
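For readers unfamiliar with these measures, the following minimal sketch shows how precision, recall and F-score are conventionally computed from a set of query results and a gold standard. It uses the standard definitions and the balanced F1 score; the `evaluate` function and the sample identifiers are hypothetical and do not represent the I2E implementation or its output format.

```python
def evaluate(query_results, gold_standard):
    """Compare query results with a gold standard using standard definitions."""
    results = set(query_results)
    gold = set(gold_standard)

    true_positives = results & gold    # returned and expected
    false_positives = results - gold   # returned but not expected
    false_negatives = gold - results   # expected but not returned

    precision = len(true_positives) / len(results) if results else 0.0
    recall = len(true_positives) / len(gold) if gold else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall > 0:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0

    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Hypothetical result identifiers, for illustration only.
scores = evaluate(
    query_results=["doc1", "doc2", "doc3", "doc4"],
    gold_standard=["doc2", "doc3", "doc5"],
)
print(scores["precision"], scores["recall"], scores["f_score"])
```

In this example two of the four returned results are correct (precision 0.5) and two of the three expected results are found (recall 2/3). The three sets in the returned dictionary correspond to the classification that training mode displays.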
Additional Indexing Features
There are some other new features in I2E 5.4 that can improve document processing and indexing. These are enabled by a new way to index file metadata: you can pass a properties file alongside your document, and its contents will be indexed to create and populate new shadow regions in your index. This mechanism improves the experience of indexing Documentum files using I2E via the ManifoldCF connectors, allowing you to link directly back to your document in Documentum from your I2E results.