Mining gene-centric relationships from literature: the roles of gene mutation and gene expression in supporting drug discovery

Tari L, Patel J, Küntzer J, Li Y, Peng Z, Wang Y, Aguiar L, Cai J

Int J Data Mining Bioinformatics. 2014 Sep; 10(4):357-373



Identifying drug target candidates is an important task in the early stages of the drug discovery process. It is supported by new high-throughput technologies that enable a better understanding of disease mechanisms. As a result, effective analysis of the large amount of biological data they produce becomes critical.

However, because much biological knowledge is recorded in the literature as natural-language text, the analysis and interpretation of high-throughput data have not reached their potential effectiveness. In this paper, we describe our use of text mining to find scientific information for target and biomarker discovery in the biomedical literature.

Our approach utilises natural language processing techniques to capture linguistic patterns for the extraction of biological knowledge from text. Additionally, we discuss how the extracted knowledge is used for the analysis of biological data such as next-generation sequencing and gene expression data.
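
As an illustration only, the idea of capturing a linguistic pattern to extract a gene-centric relationship can be sketched in a few lines of Python. Here a single hand-written regular expression stands in for the natural language processing techniques mentioned above. The trigger words, the disease suffixes, the gene-symbol heuristic and the function name `extract_relations` are all assumptions made for this sketch, not the patterns or the system described in the paper.

```python
import re

# Hypothetical pattern (an assumption for illustration, not taken from the paper):
#   <trigger> of|in <GENE> ... in|with <disease>
# The trigger is matched case-insensitively; the gene symbol must be upper case.
PATTERN = re.compile(
    r"(?P<kind>(?i:mutations?|overexpression|upregulation|downregulation))"
    r"\s+(?:of|in)\s+(?P<gene>[A-Z][A-Z0-9]{1,9})\b"
    r".*?\b(?:in|with)\s+"
    r"(?P<disease>[a-z][a-z ]*?(?:cancer|carcinoma|leukemia|melanoma))\b"
)


def extract_relations(sentence):
    """Return (gene, trigger, disease) triples found in a single sentence."""
    return [
        (m.group("gene"), m.group("kind").lower(), m.group("disease"))
        for m in PATTERN.finditer(sentence)
    ]


if __name__ == "__main__":
    examples = [
        "Mutations of BRAF are frequently observed in malignant melanoma.",
        "Overexpression of ERBB2 is associated with breast cancer.",
    ]
    for text in examples:
        print(extract_relations(text))
```

A single surface pattern like this misses most real phrasings (passives, coordination, negation), which is why approaches of this kind usually operate on parsed sentences rather than on raw strings.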