Compensating for annotation errors in training a relation extractor

Bonan Min, Ralph Grishman

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

The well-studied supervised Relation Extraction algorithms require training data that is accurate and has good coverage. To obtain such a gold standard, the common practice is to do independent double annotation followed by adjudication. This takes significantly more human effort than annotation done by a single annotator. We do a detailed analysis on a snapshot of the ACE 2005 annotation files to understand the differences between single-pass annotation and the more expensive nearly three-pass process, and then propose an algorithm that learns from the much cheaper single-pass annotation and achieves a performance on a par with the extractor trained on multi-pass annotated data. Furthermore, we show that given the same amount of human labor, the better way to do relation annotation is not to annotate with high-cost quality assurance, but to annotate more.

Original language: English (US)
Title of host publication: EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 194-203
Number of pages: 10
ISBN (Electronic): 9781937284190
State: Published - 2012
Event: 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012 - Avignon, France
Duration: Apr 23 2012 - Apr 27 2012

Publication series

Name: EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings

Other

Other: 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012
Country/Territory: France
City: Avignon
Period: 4/23/12 - 4/27/12

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
