II-SDV 2015: The International Information Conference on Search, Data Mining and Visualization

May 12 2015

The recent two-day II-SDV meeting in the beautiful town of Nice on the Côte d’Azur, France, started with a day of talks on how best to maximise the value of data extracted from a wide range of sources: patents, full text articles and even big data.

The programme kicked off with a presentation from Aleksander Kapisoda of Boehringer Ingelheim (BI), describing how innovative use of custom search techniques, beyond those currently offered by standard public search engines, can bring tangible benefits to a global pharmaceutical company.

One theme that emerged was the potential use of text mining, particularly in constructing landscapes related to emerging technologies. Jane List (Extract Information, UK) described some of the tools, workflows, and visualisations for patent landscaping, with a great quote from Marcel Proust: “The real voyage of discovery consists not in seeking new landscapes, but in having new eyes”. Emmanuelle Fortune (INPI, France) discussed how mining the patent literature makes it possible to classify world cities dubbed “Smart Cities” as hubs for technological development.

Staying on the topic of text mining, I presented a number of use cases related to the subject of “time” and pressed home the message that text mining can provide clear advantages for access to timely information. This presentation was followed by news from the Copyright Clearance Center (CCC) that the difficult process of obtaining legal permission for text mining has recently become a lot easier: it is now possible to directly create a set of full text documents ready for immediate use in text mining. This has long been a goal for many information scientists, as there are valuable nuggets of information in full text that just can’t be gained from mining abstracts.

Finally, the conference heard from a different group within Boehringer Ingelheim, concerned with automating the currently time-consuming process of extracting medicinally relevant chemistry from patents. Matthias Negri, collaborating with technology partners Chemaxon, has established a Knime™ workflow that uses Linguamatics I2E to extract the pharmacological context surrounding the chemistry described within a patent, to provide “a solid information base of value to any phase of a drug discovery project”.

With participants from both the US and Europe, the conference provided a great opportunity to meet information specialists, patent experts and scientists from across life science specialities, as well as to hear from vendors about their new product developments. If you are interested in any of the topics discussed, more information can be found at the conference website, and I’d be happy to hear your comments.