Earlier this year, Linguamatics announced our new Connected Data Technology for federated search, and in our newest version, I2E 4.4, we build on this to take another step along the path of better data interoperability. I2E 4.4 introduces a more powerful way to customize your text analytics results using enhanced linkouts in the HTML output, enabling you, for example, to connect your text-mined data to structured content.
Linkouts enable you to link out to, or pull in, additional information relating to the preferred terms (PTs) or concept identifiers (NodeIDs) in your query results. They can be hyperlinks, images or customized output. For example, you can configure linkouts to see information from an external website by clicking on the concept in the text-mined query results. Alternatively, it is possible to enable the interface to display an image in the query results, such as a chemical structure, instead of the preferred term.
With this new functionality, linkouts can enhance your query results by giving you access to related information that adds context or metadata to your search. For example, a search for chemicals from ChEBI could link directly from the preferred term in your results to the web page for that concept on the EBI website (e.g. Cyclosporine), whilst a gene name in the same result links to EntrezGene (e.g. ICAM1).
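To make the idea concrete, here is a minimal Python sketch of what a linkout does conceptually: it takes a concept identifier and its preferred term and produces a hyperlink to an external resource. This is not I2E's configuration format or API. The `LINKOUT_TEMPLATES` table, the `linkout_html` function, the URL patterns and the example identifiers are all assumptions made for illustration, and the public sites may change their URL schemes.

```python
from html import escape
from urllib.parse import quote

# Hypothetical URL templates, keyed by the ontology a concept comes from.
# The patterns are assumptions about the public sites, not I2E settings.
LINKOUT_TEMPLATES = {
    "ChEBI": "https://www.ebi.ac.uk/chebi/searchId.do?chebiId={node_id}",
    "EntrezGene": "https://www.ncbi.nlm.nih.gov/gene/{node_id}",
}


def linkout_html(source, node_id, preferred_term):
    """Return an HTML hyperlink for a concept, or plain text if no template exists."""
    template = LINKOUT_TEMPLATES.get(source)
    if template is None:
        # No linkout configured for this source: show the preferred term only.
        return escape(preferred_term)
    # Percent-encode the identifier, keeping ':' so IDs like 'CHEBI:4031' stay readable.
    url = template.format(node_id=quote(str(node_id), safe=":"))
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(preferred_term))


if __name__ == "__main__":
    # Identifier values are illustrative only.
    print(linkout_html("ChEBI", "CHEBI:4031", "Cyclosporine"))
    print(linkout_html("EntrezGene", "3383", "ICAM1"))
```

The same lookup could return an image tag rather than a hyperlink, which is the idea behind showing a chemical structure in place of the preferred term.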
This functionality has been of particular interest for those using I2E for advanced text analytics relating to intelligence around chemicals. Public websites that contain a wide variety of chemical information for linkouts include:
The linkout capability is now combined with further enhancements in order to associate chemical images with a chemical concept. It is now possible to output results containing chemicals as structural images (rather than text or SMILES) giving a chemist an immediate visual understanding of their search results.
This powerful combination of text analytics, chemical visualisation, data charting and linkouts for extended information allows for a much more efficient and meaningful review of the results for chemoinformaticians, discovery scientists, or medicinal chemists needing to extract and understand chemical structures buried in free text sources such as scientific literature or patents.
Figures: Text mining facilitates easier exploration of chemical information “trapped” in patents. Patents contain a wealth of novel chemical data, and studies suggest that less than 10% of compound structures exemplified in patents are also published in journal articles. I2E Chemistry enables you to search for exemplified compounds and pull out biological context (e.g. targets, assay metrics) or physicochemical properties (such as melting point). The screenshots below show (a) tabulated results of exemplified chemicals and associated melting point data in I2E results; (b) a bar chart of melting points to enable data filtering; (c) a snapshot of a linkout to Chemicalize for one of the novel exemplified compounds; and (d) a highlighted version of the corresponding patent text referring to the structure.
