Introducing I2E 5.3.1
I2E 5.3.1 includes improvements that make it easier to integrate the tool into your organization and process your internal documents, as well as the usual usability enhancements and under-the-hood modifications.
Single Sign-On (SSO)
I2E 5.3.1 can now act as a Service Provider, allowing the application to work with federated authentication systems such as ADFS and Shibboleth. Once this has been set up in your organization, users who have authenticated in one system can log into the I2E client seamlessly, without being prompted for their credentials. Users who are not already logged in via another system are redirected by I2E to a special web page to start the login process.
ManifoldCF: I2E Output Connector [BETA release]
Apache ManifoldCF is “an effort to provide an open source framework for connecting source content repositories like Microsoft SharePoint and OpenText Documentum, to target repositories or indexes”. Linguamatics have created an I2E Output Connector, which is being released as a beta, to simplify the ingestion and indexing of documents stored within customer organizations. This means that I2E 5.3.1 can be used to index documents stored in Documentum, allowing you to run precise analytical queries over your files (and link back to them in Documentum) without needing to export the documents manually.
The Hit column for Excel (XLS-HTML) and Excel (ODS) in I2E 5.3.1 no longer uses blue text to highlight information. It now uses the same highlight colors that are used in the HTML view and the cached document to provide consistent coloring and make it easier to review your results.
Class Matching: New Ontology column
When searching for a match in the Class Properties window, there is a new Ontology column in the results in I2E 5.3.1.
This helps you to review your results quickly and find the correct match(es), particularly when you’re using a term that could occur in different ontologies.
Reporting of Skipped Documents and new Find/Replace pane
There are a couple of changes in the Indexing interface in the I2E Administration view to make it easier to identify problematic files in your source data.
If your completed index status is “Succeeded (with warnings)”, it may be that a particular document could not be processed. In I2E 5.3.1, there is a new tab in Index properties, labelled “Skipped input”, that lists all of the source files that could not be processed.
Another way to find this information is to use the new Find/Replace panel. As the name suggests, this lets you find (and, in editable fields, replace) text in multi-line text fields. It works in I2E Express, I2E Pro and (most usefully) in the I2E Administration view, and is available via the usual Ctrl-F keyboard shortcut (or by right-clicking in a text field to bring up the option).
New Indexing Mode: PSV (Pipe-separated values)
We are increasingly seeing documents that use the pipe symbol (“|”) to separate fields. To index those documents directly, without a pre-processing step (for example, converting them to TSV), select the new indexing mode and use the .psv file extension for your files.
Documents that are indexed using PSV mode will have the same Regions as those that are indexed with TSV and CSV modes, enabling you to re-use your queries.
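To illustrate the kind of pre-processing step that PSV mode makes unnecessary, here is a minimal sketch of a pipe-to-tab conversion using Python's standard csv module. The function name psv_to_tsv and the sample field names are hypothetical, and the sketch assumes simple files with no pipes inside field values; it does not describe how I2E itself parses PSV input.

```python
import csv
import io


def psv_to_tsv(psv_text: str) -> str:
    """Convert pipe-separated text to tab-separated text.

    Hypothetical helper for illustration only: assumes that field
    values do not themselves contain pipes, tabs or quote characters.
    """
    reader = csv.reader(io.StringIO(psv_text), delimiter="|")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        # Each row is a list of field values; write it back tab-separated.
        writer.writerow(row)
    return out.getvalue()


# Made-up sample data: one header row and one record.
sample = "id|title|abstract\n1|Aspirin study|Aspirin reduced pain.\n"
tsv = psv_to_tsv(sample)
```

With the new indexing mode, a file like the sample above could be saved with the .psv extension and indexed as-is, so a conversion script of this kind would no longer be needed.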
EASL snippets in Smart Query Parameter Fields via API
I2E 5.3.1 adds the ability to include EASL in Smart Query parameter fields as part of Query Task (API) submission. This supersedes the I2E Query Notation that was previously the available syntax for Smart Query parameter fields.
One of the reasons for adopting EASL as the Smart Query parameter syntax is to allow more expressive terms in your search. For example, you could search for Pharmacologic Substances but exclude the general term “drug”.
These EASL snippets can be used whenever you are submitting a query via the API, e.g. using a Query Template, writing code with the SDK or using the Query Submitter node in KNIME.