European Community research project to enrich multilingual terminologies in biomedicine
(Cambridge, England and Boston, USA – December 11th, 2012) A key factor in successful text mining is the use of comprehensive terminologies that capture the different ways a concept can be expressed. However, although extensive terminologies exist for English, comparable resources are scarcer for other languages.
According to Wikipedia, a mantra is “a group of sounds, syllables or words capable of creating transformation”. MANTRA is also the highly apposite acronym of a new European Community-funded research project: Multilingual Annotation of Named entities and Terminological Resource Acquisition. Linguamatics is pleased to announce its participation as a commercial partner.
The aim of MANTRA is to enrich multilingual terminologies in the biomedical domain by exploiting parallel corpora, that is, the same documents available in several languages. For example, knowing that claims 4, 5 and 6 of an English patent refer to Branching Enzyme, it should be possible to discover the previously unknown German synonym Verzweigungsenzym from claims 4, 5 and 6 of the German translation. The new synonym can then be used in analysing other document sets. In this way, a terminology in one language and translations of the same documents in other languages can be mined together to provide enriched terminologies in those other languages.
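The idea in the patent example can be illustrated with a minimal sketch. It is not MANTRA's actual method, which this announcement does not describe; it simply ranks German words by how strongly they co-occur (Dice coefficient) with a known English term across aligned claims. The function name, the scoring choice and the claim texts are all invented for illustration.

```python
from collections import Counter

# Toy parallel corpus: (English claim, German claim) pairs.
# These sentences are invented for illustration, not taken from a real patent.
CLAIM_PAIRS = [
    ("4. The method of claim 1 wherein the protein is a branching enzyme",
     "4. Verfahren nach Anspruch 1 wobei das Protein ein Verzweigungsenzym ist"),
    ("5. The method of claim 4 wherein the branching enzyme is from potato",
     "5. Verfahren nach Anspruch 4 wobei das Verzweigungsenzym aus Kartoffel stammt"),
    ("6. A plant cell comprising the branching enzyme",
     "6. Pflanzenzelle enthaltend das Verzweigungsenzym"),
    ("7. The method of claim 1 wherein the plant is maize",
     "7. Verfahren nach Anspruch 1 wobei die Pflanze Mais ist"),
    ("8. A plant cell of claim 6 wherein the protein is modified",
     "8. Pflanzenzelle nach Anspruch 6 wobei das Protein modifiziert ist"),
]


def candidate_translations(pairs, source_term, top_n=3):
    """Rank target-language tokens by Dice association with source_term.

    Hypothetical helper: a real system would use proper tokenisation and
    word alignment rather than whitespace splitting and co-occurrence.
    """
    term = source_term.lower()
    with_term = Counter()  # target segments containing the token, source has the term
    overall = Counter()    # target segments containing the token, any source
    n_term = 0             # number of source segments containing the term

    for src, tgt in pairs:
        tokens = set(tgt.split())  # count each token once per segment
        overall.update(tokens)
        if term in src.lower():
            n_term += 1
            with_term.update(tokens)

    # Dice coefficient: 2 * |A and B| / (|A| + |B|)
    scores = {
        tok: 2 * count / (n_term + overall[tok])
        for tok, count in with_term.items()
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    for token, score in candidate_translations(CLAIM_PAIRS, "branching enzyme"):
        print(f"{token}\t{score:.3f}")
```

On this toy data the top-ranked candidate is Verzweigungsenzym, because it appears in exactly the German claims whose English counterparts mention the term, whereas common words such as "das" also occur elsewhere and score lower.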