Fickett J, Hayes W.
European Pharmaceutical Contractor, Autumn 2004
A topic of growing prominence at recent conferences is text mining: its pros and cons, and how to extract the full value of the literature for drug discovery. There have been a few false starts and misunderstandings about what text mining is and what it can do. As with many new technologies, some of the claims made for it are excessive, yet we predict that text mining will take its place alongside the other key enabling technologies of drug discovery.
What is text mining? At a basic level, text mining is the process of highlighting a small volume of relevant information from a very large set of possibly interesting documents.
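To make this definition concrete, the short Python sketch below filters a tiny document collection down to the sentences that mention every term of interest. It is an illustration only, not a description of any particular text mining product: the function name `highlight_relevant`, the sample documents and the search terms are all hypothetical, and real systems rely on far richer linguistic analysis than keyword matching.

```python
import re

def highlight_relevant(documents, terms):
    """Return (doc_id, sentence) pairs for sentences that mention every term.

    Hypothetical helper for illustration; matching is case-insensitive
    and limited to whole words or phrases.
    """
    patterns = [
        re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    ]
    hits = []
    for doc_id, text in documents.items():
        # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if all(p.search(sentence) for p in patterns):
                hits.append((doc_id, sentence.strip()))
    return hits

# Invented sample data standing in for a very large literature collection.
documents = {
    "doc1": "Gene A is expressed in liver. Compound X inhibits gene A in vitro.",
    "doc2": "The weather was mild. Nothing relevant is reported here.",
    "doc3": "Compound X was well tolerated. Gene A levels were not measured.",
}

hits = highlight_relevant(documents, ["compound X", "gene A"])
for doc_id, sentence in hits:
    print(doc_id, "->", sentence)
```

Of the six sentences in the sample, only the one in which both terms co-occur is returned. The same idea, applied to millions of abstracts, is what lets a small volume of relevant text surface from a very large set of documents.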
There is nothing magical about the process; text mining software's 'understanding' of the literature is still rather rudimentary. It may make mistakes that a researcher would immediately recognise as wrong, or even stupid, but it is generally reliable. And unlike human analysts, who tire, the software behaves consistently.
The greatest benefit of using software to analyse the literature is that it can slog through volumes of text that no person could ever hope to manage. The resulting information can then be used directly by humans or as input to a follow-on data mining exercise.