Posts from June 2015

Life sciences and healthcare professionals gathered at the UCSF Mission Bay campus for the West Coast Natural Language Processing (NLP) & Big Data Symposium on June 18th. The symposium, co-hosted by UCSF, featured presenters from UCSF, Merck, City of Hope, Copyright Clearance Center and Linguamatics, and drew delegates from a diverse range of organizations.

The central theme of this year’s symposium was “From bench-to-bedside, unlocking key insights from your data”. Healthcare delegates were keen to find new ways to address meaningful use and accountable care by leveraging NLP text mining of electronic health records. Life sciences delegates were keen to increase the efficiency and effectiveness of their business operations by mining real-world data. There was also strong interest in forging partnerships between pharma/biotech and hospitals/cancer centers.

Sorena Nadaf, the CIO and Director of Translational Informatics at UCSF Helen Diller Family Comprehensive Cancer Center, delivered the welcome address and highlighted the foundation of clinical NLP and its common uses for extracting and transforming narrative information in EMRs to support and accelerate clinical research.

Sorena Nadaf at the NLP & Big Data Symposium in San Francisco.


Over the past few months, several publications have used Linguamatics I2E to extract key information and provide value in a variety of projects. We are constantly amazed by the inventiveness of our users in applying text analytics across the bench-to-bedside continuum, and these publications are no exception. Using the natural language processing power of I2E, researchers are able to answer their questions rapidly and extract the results they need, with high precision and good recall. By contrast, a more standard keyword search returns a set of documents that they then need to read.

Let’s start with Hornbeck et al., “PhosphoSitePlus, 2014: mutations, PTMs and recalibrations”. PhosphoSitePlus is an online systems biology resource for the study of protein post-translational modifications (PTMs), including phosphorylation, ubiquitination, acetylation and methylation. It’s provided by Cell Signaling Technology, who have been users of I2E for several years. In the paper, they describe the value of integrating data on protein modifications from high-throughput mass spectrometry studies with high-quality data from manual curation of published low-throughput (LTP) scientific literature.


Linguamatics I2E: the first text mining platform to integrate with Copyright Clearance Center's RightFind XML for Mining, to allow access to full-text journal articles

(Cambridge, UK and Boston, USA - 24 June 2015) - Linguamatics is expanding its natural language processing (NLP)-based text mining platform I2E to include easier access to full-text articles, with the integration of Copyright Clearance Center's (CCC) new text mining solution, RightFind™ XML for Mining.

Commercial life science researchers can now create sets of full-text XML articles from more than 4,000 peer-reviewed journals produced by over 25 scientific, technical, and medical (STM) publishers, and automatically make them available for text mining in I2E.

The solution enables researchers to make discoveries and connections that can only be found in full-text articles. All of the content is stored securely by CCC and is pre-authorized by publishers for commercial text mining. Users access the content using Linguamatics’ unique federated text mining architecture, which allows researchers to find the key information to support business-critical decisions. The integrated solution is available now, and helps users save time, reduce costs and mitigate their organization’s copyright infringement risk.