Picking Your Brain: Synergy of OMIM and PubMed in Understanding Gene-Disease Associations for Synapse Proteins
Linguamatics
I read with interest a recent publication which sheds light on the complex interactions of synapse protein complexes with human disease.
The study, run by the Genes to Cognition neuroscience research programme, combined wet-lab research with bioinformatics and text analytics to uncover genetic associations with these protein complexes in over seventy human brain diseases, including Alzheimer’s disease, schizophrenia and autism spectrum disorders.
The idea was to identify and develop suitable screening assays for synapse proteomes from post-mortem and neurosurgical brain samples, focusing specifically on membrane-associated guanylate kinase (MAGUK)-associated signalling complexes (MASC).
Our CTO, David Milward, was involved in the text analytics work. He used the natural language processing capabilities of the Linguamatics I2E platform to extract gene-mutation-disease associations from PubMed abstracts. The flexibility of I2E made it possible to strike an appropriate balance between recall and precision, giving comprehensive results without overloading curators with noise. Queries were built using linguistic patterns so that associations could be discovered between a list of several thousand relevant gene identifiers and appropriate MedDRA disease terms.
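To give a feel for what a pattern-based query does, here is a minimal Python sketch. It is not I2E and does not use any I2E API: it is a plain regular expression that looks for a mutation-type trigger word, a gene symbol and a disease term, in that order, within a single sentence. The gene symbols (`GENEA`, `GENEB`), the short disease list, the trigger words and the function name `extract_associations` are all placeholders chosen for illustration; the real work used several thousand gene identifiers and MedDRA terms.

```python
import re

# Placeholder vocabularies (assumptions for illustration only).
GENES = ["GENEA", "GENEB"]
DISEASES = ["schizophrenia", "autism"]
TRIGGERS = ["mutation", "mutations", "variant", "variants", "deletion", "deletions"]


def _alternation(terms):
    # Longest terms first so "mutations" is tried before "mutation".
    return "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))


# trigger ... gene ... disease, without crossing a full stop.
PATTERN = re.compile(
    rf"\b(?P<trigger>{_alternation(TRIGGERS)})\b[^.]*?"
    rf"\b(?P<gene>{_alternation(GENES)})\b[^.]*?"
    rf"\b(?P<disease>{_alternation(DISEASES)})\b",
    re.IGNORECASE,
)


def extract_associations(text):
    """Return (gene, disease) pairs found by the pattern, in order of appearance."""
    return [
        (m.group("gene").upper(), m.group("disease").lower())
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    sample = "Mutations in GENEA have been linked to autism. GENEB is expressed in the brain."
    print(extract_associations(sample))
```

Requiring a trigger word and keeping the match inside one sentence favours precision; dropping the trigger or widening the window would raise recall at the cost of more noise for curators, which is the trade-off described above.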
The key aim was to provide results that were both comprehensive and accurate enough to allow fast curation. These text-mined results were then combined with data from Online Mendelian Inheritance in Man (OMIM) on human MASC genes and their genetic disease associations.
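Combining the two sources can be pictured as a simple merge that records where each association came from. The Python sketch below is an assumption about how such a merge might look, not the study's actual pipeline; the function name `merge_associations`, the source labels and the gene-disease pairs are placeholders rather than real results.

```python
def merge_associations(text_mined, omim):
    """Merge (gene, disease) pairs from both sources, tracking provenance.

    Returns a dict mapping each (gene, disease) pair to the set of
    sources that reported it.
    """
    merged = {}
    for source, pairs in (("text_mining", text_mined), ("OMIM", omim)):
        for gene, disease in pairs:
            merged.setdefault((gene, disease), set()).add(source)
    return merged


if __name__ == "__main__":
    # Placeholder pairs for illustration only.
    text_mined = [("GENEA", "autism"), ("GENEB", "schizophrenia")]
    omim = [("GENEA", "autism")]
    for pair, sources in merge_associations(text_mined, omim).items():
        print(pair, sorted(sources))
```

Keeping the provenance lets a curator see at a glance which associations are supported by both OMIM and the literature and which come from only one of them.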