Spring Text Mining Conference 2020

March 16, 2020 to March 18, 2020
Venue: Møller Institute, Cambridge, United Kingdom

The Linguamatics Spring Text Mining Conference 2020 will take place on 16-18 March 2020 at the Møller Institute in Cambridge, UK.

Our confirmed speakers are:

  • Ashley George, CEO, Bay Digital Consulting: "Digital transformation in Life sciences — BUT what the heck are we transforming to?"
  • Paul O'Regan, Clinical Informatician, Digital ECMT: "Text mining to match patients to clinical trials"
  • Sten Christensen, Principal Information Scientist, Novo Nordisk: "Early Scientific Information Pipeline"
  • Brian Schurmann Michels, Early Scientific Intelligence Project Manager, Novo Nordisk: "Early Scientific Information Pipeline"
  • Martin Menke, Medical Coding Lead Global Clinical Safety & Pharmacovigilance, CSL Behring: "Evaluation of Natural Language Processing to support Medical Coding in Adverse Event Report Processing at CSL Behring"
  • Ursula Schneider, Scientific Information Officer, Merck KGaA: "Accelerating patent evaluation with text mining"


See the agenda here.


We have training workshops run by our text mining experts available for beginners, intermediate users and for those interested in automating Linguamatics NLP. See this year's Training Workshop selection guide for the schedule and descriptions.


Registration for the event is now open!



Read more about last year's event.

Join the conversation on social media: #STMC20

Speakers & Abstracts

Novo Nordisk

Sten Christensen, Principal Information Scientist, Novo Nordisk and Brian Schurmann Michels, Early Scientific Intelligence Project Manager, Novo Nordisk

Early Scientific Information Pipeline

Learn how:

  • Text mining is being used to filter and extract information to scale up otherwise manual surveillance
  • News from 8 different sources is consolidated, distributed via regular newsletters and collected in a database
  • Collaborating with two key vendors, Linguamatics and Infodesk, helped establish an information flow allowing us to process large amounts of information
  • Internal social platforms are being used to involve the broader organization in filtering the information

The pharma company Novo Nordisk is a strong believer in collaborating with start-ups, biotechs and universities to develop the next state-of-the-art medical treatments. To help connect Novo Nordisk with such organizations, the Early Scientific Intelligence project was launched to create a drone view of potentially interesting companies and projects, based on systematic surveillance of at least 8 different information sources. The information is filtered via text mining and human curators to ensure fast processing while keeping quality high before the selected potential partners are approached. Data is shared broadly across the research organization with the aim of creating a transparent discussion of the potential value of a collaboration before moving into a more formal screening. Over time, a history of known information on specific companies and projects is built up and visualized via a dashboard.

CSL Behring

Martin A.O.H. Menke, Medical Coding Lead Global Clinical Safety & Pharmacovigilance, CSL Behring, Marburg, Germany

Evaluation of Natural Language Processing to support Medical Coding in Adverse Event Report Processing at CSL Behring

For processing of adverse event reports, the information provided by a reporter in natural language is transferred (coded) into a standardised format to allow database processing. For the adverse event, indication, medical history, etc., the Medical Dictionary for Regulatory Activities (MedDRA) must be used. Most of the coding is manual and time-consuming. Coding is automatic only when the verbatim exactly matches a MedDRA term (currently about 30%).
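As a rough illustration of the exact-match rule described above, the following Python sketch codes a verbatim automatically only when it matches a dictionary term and otherwise leaves it for manual coding. The term list, the placeholder codes and the function names are assumptions made for illustration; they are not MedDRA content or CSL Behring's system. Normalising case and whitespace before comparing is also an assumption.

```python
from typing import Optional

# Hypothetical miniature term list. The codes are placeholders, not real MedDRA codes.
TERM_CODES = {
    "headache": "CODE-0001",
    "nausea": "CODE-0002",
    "injection site pain": "CODE-0003",
}


def normalise(verbatim: str) -> str:
    """Lower-case and collapse whitespace (an assumed pre-processing step)."""
    return " ".join(verbatim.lower().split())


def autocode(verbatim: str) -> Optional[str]:
    """Return a code only on an exact match; None means manual coding is needed."""
    return TERM_CODES.get(normalise(verbatim))


def autocoding_rate(verbatims: list) -> float:
    """Fraction of verbatims that could be coded without manual work."""
    if not verbatims:
        return 0.0
    coded = sum(1 for v in verbatims if autocode(v) is not None)
    return coded / len(verbatims)
```

A verbatim such as "severe headache since Monday" returns None here, which is the kind of case where NLP support for the coder would come in.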

Natural language processing (NLP) is the programming of computers to process and analyse natural language data, to recognise relevant content and to act upon it in a specific way, for example recognising a verbatim as an adverse event and assigning the respective MedDRA code.

We have set up a proof-of-concept project to evaluate the potential of Natural Language Processing to support Medical Coding at CSL Behring’s Global Clinical Safety & Pharmacovigilance department (GCSP). Real-life data was used to account for the specific nature of our patients, who suffer from rare diseases.

We will provide an overview of the project and its results as well as an outlook on the next steps for integrating NLP into medical coding during adverse event processing.

Digital ECMT

Paul O'Regan, Clinical Informatician

Text mining to match patients to clinical trials

  • Cancer genomic medicine has the potential to select the treatment a patient is most likely to respond to, based on the molecular drivers of their disease. However, understanding the clinical and functional significance of genomic data presents a substantial challenge to the implementation of genomic medicine in the clinical setting.
  • Software tools are essential in adding value to genomic data, due to the volume of data, the requirement to integrate with other data from disparate sources, and the need to map between those sources. Furthermore, many medical data types are stored in an unstructured format, and natural language processing is critical to extract usable information from such data.
  • Here, we demonstrate a bespoke clinical decision support tool that incorporates natural language processing. The tool is currently being used within monthly molecular tumour board (MTB) meetings for an ongoing oncology study. We compare the sensitivity and specificity of results with and without natural language processing.
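For readers less familiar with the two measures mentioned above, the following Python sketch shows how sensitivity and specificity could be computed for two matching configurations against a reference standard. The function names and the label lists are invented for illustration; they are not data or results from the study.

```python
def confusion_counts(gold, predicted):
    """Count true positives, false negatives, true negatives and false positives."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    tn = sum(1 for g, p in pairs if not g and not p)
    fp = sum(1 for g, p in pairs if not g and p)
    return tp, fn, tn, fp


def sensitivity(gold, predicted):
    """Share of truly eligible cases that were matched: TP / (TP + FN)."""
    tp, fn, _, _ = confusion_counts(gold, predicted)
    return tp / (tp + fn)


def specificity(gold, predicted):
    """Share of truly ineligible cases that were not matched: TN / (TN + FP)."""
    _, _, tn, fp = confusion_counts(gold, predicted)
    return tn / (tn + fp)


# Invented toy labels: True means "patient is eligible for the trial".
gold = [True, True, True, True, False, False, False, False]
matches_a = [True, True, False, False, False, False, False, True]
matches_b = [True, True, True, False, False, False, False, False]

for name, predicted in [("configuration A", matches_a), ("configuration B", matches_b)]:
    print(name, sensitivity(gold, predicted), specificity(gold, predicted))
```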

Merck KGaA

Ursula Schneider, Qualified Patent Information Professional (QPIP); Silke Hoffmann, Qualified Patent Information Professional (QPIP); Kurt-Ludwig Richter, IT Consultant, Merck KGaA, Darmstadt, Germany

Accelerating patent evaluation with text mining

The Global Patent and Literature Search Services unit of Merck provides reliable and comprehensive search results for the Patent department as well as all the businesses. Our service includes a relevance check of the retrieved answers. The growing complexity of the requests and the increasing number of publications are making it increasingly difficult to deliver user-friendly results in a timely manner. We at Merck have accelerated our patent evaluation with text mining by:

  • Analyzing technical needs and optimizing hardware accordingly
  • Improving the download process from our patent source (co-development with Linguamatics)

The preparation of a patent index is now so fast that I2E can be used on a daily basis to text mine patent documents. The applied categories in the form of preferred terms (PT) as labels make relevance checks much easier and quicker. The highlighted hit terms simplify the reading of the full text, creating additional benefit for our customers and us.

Venue, Travel and Accommodation


The Møller Institute is an intelligently designed, purpose-built, residential leadership development and conference centre set within 42 acres of beautiful parkland on the grounds of Churchill College, University of Cambridge.


By car: 150 onsite car parking spaces are available to delegates free of charge. This includes disabled parking close to the main entrance and overnight parking.
If you are travelling to the Møller Institute using a satellite navigation system, you should enter postcode CB3 0DS, as this will lead you directly to the Møller Centre entrance.
By train: Cambridge Railway Station is a short taxi ride away and has frequent links to London Kings Cross, London Liverpool Street and Stansted Airport.
The station address is: Cambridge Railway Station, Station Road, Cambridge CB1 2JW.
By taxi: The recommended local taxi service provider is CamCab, +44 (0) 1223 704 704 or www.camcab.co.uk.
Approximate taxi travel times: Cambridge Train Station (15 minutes), Stansted Airport (30 minutes), Heathrow Airport (2 hrs), Gatwick Airport (2 hrs 30 minutes). Please note that actual travel times may vary depending on traffic.
By coach: Cambridge has a network of buses and coaches run by Stagecoach (www.stagecoachbus.com) and National Express (www.nationalexpress.com).


To book, call the Møller Institute and speak with the reception team on +44(0)1223 465 500 (mention Linguamatics).
Parking is free on site at the Møller Institute.