Gene-disease mapping and target identification

Get the insights you need to accelerate your early drug discovery

The challenge of gene-disease mapping and target identification

The first step in discovering a new medicine is to identify the biological origin of the disease and potential targets for intervention.

This requires a comprehensive understanding of the genes involved in the disease pathway, so a systematic review of the public domain literature is important. It is just as important to understand the intellectual property around any potential target area, in order to prioritise key targets for further development.

Keeping abreast of all the literature needed to identify targets and understand gene-disease associations is an almost impossible task, and key information can be missed.

The solution to finding the right information for gene-disease mapping in drug discovery

The Linguamatics Life Science Platform, powered by the text analytics engine I2E, is used by the top 18 pharmaceutical companies to quickly get results from scientific literature. The platform allows researchers to identify targets in disease areas of interest and rank them on factors such as safety and potential for therapeutic benefit. Related areas such as biomarker discovery and genotype-phenotype associations can also benefit.

Researchers report that I2E uncovers relevant papers that would never have been seen using previous methods.

According to our customers, adopting Linguamatics Life Science for querying has led to close to 100% improvement in the precision of literature review results, as well as increased recall. I2E also reduces the time needed: for example, a literature review can be cut from 3-4 weeks to just 7-10 days.

Sanofi use case for gene-disease mapping

Sanofi used the literature mining capabilities of Linguamatics I2E to annotate the association of human leukocyte antigen (HLA) alleles with diseases and drug hypersensitivity as part of a multiple sclerosis (MS) biomarker discovery project.

To learn more, you can watch our webinar on Text Mining at Sanofi for Genotype-Phenotype Associations in Multiple Sclerosis or download the case study.

You can also watch the presentation by Dongyu Liu, PhD, Associate Director, Translational Sciences at Sanofi, from the Linguamatics Text Mining Summit 2017.

Read our blog on how NLP enhances next-generation sequencing data analysis.

Don’t just take our word for it…

"Linguamatics has really helped us tackle our big data challenges."

Baerbel LoSacco, Computational Biology Group at Boehringer Ingelheim

"Linguamatics has a great track record."

Matthew Crawford, Molecular Informatics at Pfizer

"For sophisticated text mining and semantic search we think Linguamatics is the best."

Stuart Murray, Informatics at Agios Pharmaceuticals

To find out more, download the following case studies:

Pfizer case study on their use of I2E for target prioritization

AstraZeneca case study on their use of I2E for target selection

Pfizer case study on understanding IP landscapes on targets and diseases

Sanofi case study on genotype-phenotype associations in a multiple sclerosis biomarker project