Deriving an English Biomedical Silver Standard Corpus for CLEF-ER

Lewin I, Clematide S

CLEF 2013: Evaluation Labs and Workshop: Online Working Notes. 2013 Sep

PMID: N/A

http://www.zora.uzh.ch/87213/1/mantrasilverstd.pdf

Abstract

We describe the automatic harmonization method used to build the English Silver Standard annotation supplied as a data source for the multilingual CLEF-ER named entity recognition challenge. Using an automatically derived Silver Standard removes the need for costly and time-consuming expert annotation.

Applying the final voting threshold of 3 to the harmonization of 6 different annotations from the project partners retained 45% of all available concept centroids. On average, 19% (SD 14%) of the original annotations are removed. Of the partner annotations that enter the Silver Standard Corpus, 97.8% have exactly the same boundaries as their harmonized representations.
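To make the threshold-based harmonization concrete, the following is a minimal sketch of majority voting over partner annotations. It assumes exact-match spans (start offset, end offset, concept ID) rather than the paper's full centroid construction, and all names, concept IDs, and the `harmonize` function are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from typing import Iterable

# Hypothetical annotation representation: (start offset, end offset, concept ID).
Annotation = tuple[int, int, str]


def harmonize(partner_annotations: Iterable[Iterable[Annotation]],
              threshold: int = 3) -> set[Annotation]:
    """Keep every annotation proposed by at least `threshold` partners."""
    votes: Counter[Annotation] = Counter()
    for annotations in partner_annotations:
        # Count each distinct annotation at most once per partner.
        votes.update(set(annotations))
    return {ann for ann, count in votes.items() if count >= threshold}


# Example: annotations from six partners for one sentence (made-up concept IDs).
partners = [
    [(0, 7, "C0000001"), (12, 20, "C0000002")],
    [(0, 7, "C0000001"), (12, 20, "C0000002")],
    [(0, 7, "C0000001")],
    [(0, 7, "C0000001"), (12, 20, "C0000002")],
    [(2, 7, "C0000001")],          # divergent boundary, gets only one vote
    [(12, 20, "C0000002")],
]

print(harmonize(partners, threshold=3))
# -> {(0, 7, 'C0000001'), (12, 20, 'C0000002')}
```

In this sketch, annotations supported by at least three of the six partners survive, while the divergent-boundary annotation is dropped, which mirrors how a voting threshold filters the pooled partner annotations into a single harmonized layer.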