Using Natural Language Processing (NLP), Agios Pharmaceuticals discovers new therapeutic candidate
As the search for novel anti-cancer agents continues apace, the biopharma industry struggles to make sense of the myriad published studies describing putative small-molecule inhibitors, potential genetic targets, and possibly susceptible points of attack. One area of growing interest in oncology is cancer cell metabolism: companies are striving to develop compounds that can interrupt or inhibit the metabolic process, leading to tumor cell suppression or death. The dual challenges are to identify promising lead compounds and to detect suitable genes that are both implicated in the metabolic process and sensitive to chemical intervention.
Rather than starting from scratch with a blank structure-activity canvas and pursuing the traditional (and potentially lengthy and risky) lead identification/lead optimization route to a pre-clinical candidate, Agios Pharmaceuticals decided to short-circuit the process and build on previously published studies. They wanted to locate and source known inhibitors for use as tool compounds in their chemical genetics screens and to identify genes with “druggable Achilles’ heels” susceptible to chemical attack, and they chose to use NLP to quickly and effectively scour the literature.
NLP-based Text Mining to accelerate drug discovery pipelines
Agios was already successfully using the Linguamatics NLP platform to identify candidate diseases and target genes in other therapeutic areas, so it opted to use Linguamatics I2E to jump-start its chemical genetics screens as part of its cancer metabolism program. I2E is well suited to teasing specific information out of published sources, and it deals efficiently and accurately with the variety and variability found in drug-related texts, using NLP, linguistic rules, and ontologies to detect and extract drug names, targets, diseases, and their relationships, no matter how they are expressed.
NLP-based Literature Mining for Tool Compounds for Chemical Genetics Screens
Optimal tool compounds are small molecules that inhibit metabolic targets with sufficient potency, selectivity, cell permeability, and bioavailability, plus on-target cellular pharmacodynamic read-outs. Finding these using traditional literature search techniques was cumbersome and inefficient, but Linguamatics’ powerful semantic and linguistic capabilities quickly and accurately homed in on likely compounds. The ability to use linguistic wild cards was especially valuable, e.g. to find all the novel entities that fit the phrase, “MTOR inhibitor xyz”. The initial list of possible compounds was refined with NLP queries relating to dose value, potency, mechanism of action, selectivity, and activity in animal models or a relevant cancer cell line. In-house domain experts then triaged the refined list to produce the optimum tool compound library to move to the chemical genetics screen.
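To give a feel for the wildcard idea, here is a minimal sketch using a plain Python regular expression. It is not the I2E query language, and it does not reflect how Agios built its queries: the pattern, the stopword list, the function name, and the sample sentences are all assumptions made for illustration. It simply captures whatever token follows the phrase "mTOR inhibitor".

```python
import re

# Hypothetical target name; any gene or protein symbol could be substituted.
TARGET = "mTOR"

# Assumed stopword list: tokens that follow "inhibitor" but are not compound names.
STOPWORDS = {"was", "is", "and", "in", "that", "which", "treatment", "therapy"}

# "<target> inhibitor(s) <token>": the trailing token plays the role of the wildcard "xyz".
PATTERN = re.compile(
    rf"\b{re.escape(TARGET)}\s+inhibitors?\s+(?P<compound>[A-Za-z][\w-]*)",
    re.IGNORECASE,
)


def find_candidates(text):
    """Return unique candidate compound names in order of first appearance."""
    found = []
    for match in PATTERN.finditer(text):
        name = match.group("compound")
        if name.lower() not in STOPWORDS and name not in found:
            found.append(name)
    return found


# Made-up sentences, written only to exercise the pattern.
sample = (
    "Cells were treated with the mTOR inhibitor rapamycin. "
    "The MTOR inhibitor everolimus reduced growth, "
    "whereas mTOR inhibitor treatment alone had no effect."
)
candidates = find_candidates(sample)
print(candidates)
```

A regular expression like this only matches one surface form. Handling the many ways a relationship can be phrased is where linguistic rules and ontologies come in, and a real pipeline would still need the dose, potency, and selectivity filters described above.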
Using Genetics Screens to Identify Cell Growth Inhibitors
The tool compound set was screened across Agios’ collection of over 470 tumor cell lines. Analysis of the cell line heat map response profiles showed which cell lines were sensitive to the tool compounds and identified two key compounds that significantly inhibited growth in specific tumor cell lines.
At Agios, we set a record for fastest time from discovery to market. And we did this by being very good at information extraction and information use. We mined the entire [oncology] space. That’s only really possible when you use tools like Linguamatics NLP, because you have a vast amount of information to sift through.
Testing Responder Gene Hypotheses with NLP Combined with Experimental Results
Agios then needed to elucidate the specific genes in those cell lines that enabled tumor growth and were inhibited by the tool compounds. These “responder genes” may be essential for a particular tumor type, and if inhibited, will suppress or kill the cancer cells. Agios used Linguamatics NLP to search scientific publications and other sources to find detailed information about the genes and mutations involved in these specific cancer cell lines.
The Agios team used statistics and machine learning to interrogate the tool compound and tumor line genetic feature data sets and categorize the cell lines as sensitive/moderate/resistant. These calculations informed which hits to follow up in biology, metabolomics, and biochemistry to test the hypothesis that ‘compound X causes sensitivity in selective tissue-derived cell lines with low responder gene expression’. Repeating the assay in-house with defined cell lines showed significant differences in tumor cell response to the compounds across sensitive vs. resistant cell lines.
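The source does not describe the statistical or machine learning methods used, so the following is only a simplified stand-in: a threshold rule that bins cell lines by growth inhibition and then compares mean responder gene expression between the bins. The cutoffs, cell line names, and values are all hypothetical.

```python
from statistics import mean

# Assumed cutoffs on percent growth inhibition; real thresholds would come from the assay.
SENSITIVE_CUTOFF = 70.0
RESISTANT_CUTOFF = 30.0


def classify_response(growth_inhibition_pct):
    """Bin a cell line as sensitive, moderate, or resistant to a compound."""
    if growth_inhibition_pct >= SENSITIVE_CUTOFF:
        return "sensitive"
    if growth_inhibition_pct <= RESISTANT_CUTOFF:
        return "resistant"
    return "moderate"


# Made-up data: name -> (percent growth inhibition, responder gene expression).
cell_lines = {
    "line_A": (85.0, 1.2),
    "line_B": (55.0, 3.4),
    "line_C": (10.0, 7.9),
    "line_D": (72.0, 0.8),
}

labels = {name: classify_response(inhib) for name, (inhib, _) in cell_lines.items()}


def mean_expression(category):
    """Mean responder gene expression over the cell lines in one category."""
    values = [expr for name, (_, expr) in cell_lines.items() if labels[name] == category]
    return mean(values)  # raises StatisticsError if the category is empty


print(labels)
print(mean_expression("sensitive"), mean_expression("resistant"))
```

In this toy data the sensitive lines have lower mean responder gene expression than the resistant line, which is the shape of the hypothesis quoted above. Real data would call for a proper statistical test before drawing any conclusion.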
Outcome: from early discovery to IND in just 3 years
Agios’ use of Linguamatics NLP substantially shortened the time to locate previously undiscovered tool compounds and responder genes for specific tumor types. The result, obtained over 15 months, was new targets, new responder genes, and lead compounds that could jump-start the target and lead optimization process. The final outcome was an IND submission in Q4 2018 for a small molecule (AG-636) against a novel target, the metabolic enzyme dihydroorotate dehydrogenase (DHODH), for the treatment of hematologic malignancies. The Phase I clinical trial started in May 2019 and is due to finish in September 2021.
Overall, by using NLP, Agios shaved an impressive three years off the average time (ca. 6.5 years) from target identification to clinical trials. If this time saving results in an earlier drug launch, Agios could be on track to make significant additional drug revenues (it’s estimated that, for a blockbuster drug, the value of each day saved would be $6M).
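As a rough back-of-the-envelope illustration of that estimate, the snippet below multiplies the quoted per-day figure by three years of saved time. It assumes 365-day years, that every day saved carries through to an earlier launch, and that the drug reaches blockbuster status. None of these assumptions is guaranteed, so the result is an upper-bound illustration, not a forecast.

```python
# Figures quoted in the text above.
YEARS_SAVED = 3
VALUE_PER_DAY_USD = 6_000_000  # estimated value of each day saved for a blockbuster drug

# Assumption: 365-day years and a one-to-one carry-over of saved days to launch date.
days_saved = YEARS_SAVED * 365
potential_value_usd = days_saved * VALUE_PER_DAY_USD

print(f"{days_saved} days saved, up to ${potential_value_usd / 1e9:.2f}B")
```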
Agios’ innovative Linguamatics NLP-based approach extensively mined the global space of genetic diseases to identify candidate diseases and target genes and, in doing so, created a whole therapeutic area using text mining technologies. Agios is now replicating this approach in other therapeutic areas.
Download the full case study to learn more about how Agios uses NLP to innovate, populate, and drastically speed up its drug discovery pipeline and deliver novel clinical candidates and targets in record-breaking time.