At the recent Text Mining Summit, one piece of feedback that we received was that video tutorials were a good source of helpful information for users. We considered this good timing as we had just started work on one!

KNIME and Pipeline Pilot are both popular workflow tools that I2E customers use to enhance the power of text mining. However, whereas the Pipeline Pilot components provided by Linguamatics are installed on the server, the KNIME nodes that we have produced are often deployed by individuals in their desktop KNIME application. To get those users up and running quickly, we've put together a 15-minute YouTube video explaining the steps needed to:

  • Download and install the nodes
  • Create a new KNIME workflow and add the Linguamatics I2E nodes
  • Configure the nodes and run the workflow

We would love your feedback on this video (too long or too short? too quick or too slow?). Please also let us know which other topics you would like a video tutorial to cover.


There are various ways of running searches using I2E, but for most purposes the modes can be simplified to:

  • Search using the I2E Java Client, and
  • Everything else

This distinction is important for users, administrators and developers because access to querying is licensed along the same lines. Today’s post will explain the differences between the two modes as well as how to make sure that you’re using your existing capabilities in the most efficient way, with reference to license pools, capabilities and user groups.
 

Querying using the I2E Java Client

If you’re running a search via the I2E Java client, you will have an interactive license pool that has a “Pro Query” capability (for simplicity, I’m ignoring “Express Query” and “Smart Query” capabilities; the description mostly applies to these as well).

In addition to allowing you to run a search, the “Pro Query” capability also gives you uncontested access to the server (until you log out or your session times out) and the ability to load, create and save queries.

Tasks available for I2E Java client

If two people want to run the I2E Java client at the same time, they both need to belong to a license pool with a “Pro Query” capability, and those pools must provide at least two seats between them (e.g. two named-user license pools, or one concurrent-user license pool with two seats).


Part of the I2E Enterprise installation is the Sample Web GUI — a Smart Query interface written as a web application that allows users to run smart queries using only their browser.

The Smart Query interface

A neat trick that it performs is on-the-fly class matching: start typing in a word and the server starts to suggest terms in your dictionary that would match. So a search for “psor” will suggest Psoriasis, Psoriatic Arthritis, etc.

Accepting the suggestion will then populate the search with that class rather than the word. The autosuggestion, dropdowns and tooltips are very nice from the user experience perspective, but today’s post will concentrate on the class match itself – how can a search for “psoriasis” retrieve a class match?

There is a two-part answer to that question – the first part is quite easy to answer and the second part is (only slightly) more complicated. So, let’s start with the first part.

Using the query parameters “search”, “pt” or “synonym”

Class matching is a synchronous operation in I2E that uses a query parameter to specify the input and returns the matches as a list/array of classes. Because of this, it’s something that you can try very simply with your web browser. The general form of the URL is (omitting the protocol, server name and port information for brevity):

/api;type=class/pathto/myindex/?search=psoriasis
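As a rough illustration of the URL form above, the following Python sketch assembles a class-match URL from an index path and a search term. The function name `class_match_url` and the `BASE_URL` host and port are placeholders chosen for this example, not part of the I2E API, and encoding spaces as `%20` is an assumption about what the server accepts.

```python
from urllib.parse import quote, urlencode

# Placeholder protocol, server name and port; substitute your own I2E server.
BASE_URL = "http://i2e.company.com:8334"

# The three query parameter names mentioned in the text.
ALLOWED_PARAMS = ("search", "pt", "synonym")


def class_match_url(index_path, term, param="search"):
    """Build a class-match URL of the form /api;type=class/<index path>/?<param>=<term>."""
    if param not in ALLOWED_PARAMS:
        raise ValueError(f"param must be one of {ALLOWED_PARAMS}")
    # Percent-encode the term (spaces become %20 rather than '+').
    query = urlencode({param: term}, quote_via=quote)
    return f"{BASE_URL}/api;type=class/{index_path.strip('/')}/?{query}"


print(class_match_url("pathto/myindex", "psoriasis"))
```

Pasting the printed URL into a browser that is logged in to the server is the quickest way to check whether the path to your index is right.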


Although I2E Queries and Multi Queries are binary objects, the I2E Web Services API provides an interface to a subset of the properties of those items, including some that can be modified when running a query programmatically.

Query properties that are read-only and that can be retrieved using the API include title, creator, comments and column headers. Query properties that can be modified before query submission include number of hits, time limit and smart query parameters.

I2E has two related query resources: Saved Queries (which represent the binary files on disk, stored in the Repository) and Published Queries (which represent the Published location of the Saved Queries). To ensure that users have permission to see Query Properties, it is recommended that you only expose access to Published Queries.

Retrieving (by GET) a Published Query provides a “handle” to the Saved Query:

HTTP Header = X-Version: *, Accept: application/json
GET http://i2e.company.com:8334/api;type=published_query/QueryTree/Query1.i2q
Success 200

The response should look something like:

{
  "shared": true,
  "valid": true,
  "handle": "/api;type=saved_query/4.1/Query1.i2q",
  "error": null,
  "editable": true
}
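The request and response above could be handled from Python along the following lines. This is only a sketch using the standard library: the helper names `published_query_request` and `extract_handle` are made up for this example, and authentication is left out because it depends on how your server is configured.

```python
import json
from urllib.request import Request

# Example server from the request shown above; substitute your own.
BASE_URL = "http://i2e.company.com:8334"


def published_query_request(path):
    """Build (but do not send) a GET request for a Published Query."""
    url = f"{BASE_URL}/api;type=published_query/{path}"
    headers = {"X-Version": "*", "Accept": "application/json"}
    return Request(url, headers=headers, method="GET")


def extract_handle(body):
    """Return the Saved Query handle from a Published Query JSON response."""
    data = json.loads(body)
    if data.get("error"):
        raise ValueError(f"server reported an error: {data['error']}")
    return data["handle"]


request = published_query_request("QueryTree/Query1.i2q")
# To send it: urllib.request.urlopen(request).read().decode("utf-8")
sample_body = '{"shared": true, "valid": true, "handle": "/api;type=saved_query/4.1/Query1.i2q", "error": null, "editable": true}'
print(extract_handle(sample_body))
```

Keeping the request construction separate from the response parsing makes it easy to try the parsing step against a saved response before pointing the code at a live server.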

 

If you then retrieve that handle, you will receive an error because the server is trying to represent the query itself as JSON:

HTTP Header = X-Version: *, Accept: application/json
GET http://i2e.company.com:8334/api;type=saved_query/4.1/Query1.i2q
Error 406
 


So, an I2E user has built a great query that detects side effects and adverse events for a drug. It looks ideal as a candidate for repeated use: for example, search MEDLINE whenever it is updated.

The I2E user has also saved this query as a Smart Query, meaning the drug can be changed when the query is run. Changing the settings of a smart query in the I2E client is easy: type in a few words, click to add a class or load in a list of alternatives, or some combination of those options.

So how can options like these be transferred to the I2E server as part of the query using the Web Services API?

The answer (if you haven’t guessed from the title of this post) is to use the I2E Query Notation, a simple textual syntax for specifying words, phrases, alternatives, classes and macros. It is based on the notation that has been in use in I2E Express queries for some time, so a compound search like:

["big brown bear" mouse* CAT^]

would find the phrase “big brown bear” or mouse (and variants like mice) or CAT (but not cat, Cat or cAt).
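If you are generating expressions like this from code, a few small string helpers are enough. The sketch below covers only the pieces of the notation shown in the example above (quoted phrases, `*` for variants, `^` for case-sensitive matching, and square brackets for alternatives); the function names are invented for illustration and are not part of any I2E library.

```python
def phrase(text):
    """Wrap a multi-word phrase in double quotes so it is matched as a unit."""
    return f'"{text}"'


def with_variants(word):
    """Append * so that variants of the word (e.g. mice for mouse) also match."""
    return f"{word}*"


def case_sensitive(word):
    """Append ^ so that only the exact capitalisation matches."""
    return f"{word}^"


def alternatives(*items):
    """Join items with spaces inside square brackets to form a list of alternatives."""
    return "[" + " ".join(items) + "]"


expression = alternatives(
    phrase("big brown bear"),
    with_variants("mouse"),
    case_sensitive("CAT"),
)
print(expression)  # prints ["big brown bear" mouse* CAT^]
```

The helpers do no escaping, so a term that itself contains a double quote or a square bracket would need extra handling that is not covered here.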

As well as words, phrases and alternatives, you can refer to classes in your I2E indexes using the identifier for the class (its “nodeid”) and the identifier for the ontology to which it belongs (its “supplierid”): together they uniquely identify a class in an index. The format is:

/sn<supplierid.nodeid>