Deriving an English Biomedical Silver Standard Corpus for CLEF-ER

Lewin I, Clematide S

CLEF 2013: Evaluation Labs and Workshop: Online Working Notes. 2013 Sep



Abstract. We describe the automatic harmonization method used to build the English Silver Standard annotation supplied as a data source for the multilingual CLEF-ER named entity recognition challenge. An automatic Silver Standard is intended to remove the need for costly and time-consuming expert annotation.

With a final voting threshold of 3, the harmonization of 6 different annotations from the project partners kept 45% of all available concept centroids. On average, 19% (SD 14%) of the original annotations were removed. Of the partner annotations that go into the Silver Standard Corpus, 97.8% have exactly the same boundaries as their harmonized representations.
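
As an illustration of threshold voting, the following is a minimal sketch in Python and not the harmonization method of the paper itself. It assumes that each partner annotation is a list of (start, end) character spans over the same text and that a region is kept when at least `threshold` annotators cover it. The function name `harmonize` and the character-level voting are assumptions made for this example; the centroid computation used for the corpus is not reproduced here.

```python
from collections import Counter


def harmonize(annotations, threshold=3):
    """Return the character regions covered by at least `threshold` annotators.

    `annotations` holds one list of (start, end) spans per annotator,
    with `end` exclusive. This is a simplified, hypothetical voting scheme.
    """
    votes = Counter()
    for spans in annotations:
        covered = set()
        for start, end in spans:
            covered.update(range(start, end))
        # Each annotator contributes at most one vote per character position.
        votes.update(covered)

    kept = sorted(pos for pos, count in votes.items() if count >= threshold)

    # Merge adjacent kept positions into contiguous regions.
    regions = []
    for pos in kept:
        if regions and regions[-1][1] == pos:
            regions[-1][1] = pos + 1
        else:
            regions.append([pos, pos + 1])
    return [tuple(region) for region in regions]


# Six hypothetical partner annotations over the same text.
partners = [[(0, 5)], [(0, 5)], [(0, 6)], [(10, 14)], [(10, 14)], []]
agreed = harmonize(partners, threshold=3)
```

In this toy input, three annotators agree on the first mention and only two on the second, so only the first region survives a threshold of 3. Lowering the threshold keeps more annotations at the cost of weaker agreement.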