TY - GEN
T1 - Compensating for annotation errors in training a relation extractor
AU - Min, Bonan
AU - Grishman, Ralph
N1 - Publisher Copyright: © 2012 Association for Computational Linguistics.
PY - 2012
Y1 - 2012
N2 - The well-studied supervised Relation Extraction algorithms require training data that is accurate and has good coverage. To obtain such a gold standard, the common practice is to do independent double annotation followed by adjudication. This takes significantly more human effort than annotation done by a single annotator. We do a detailed analysis on a snapshot of the ACE 2005 annotation files to understand the differences between single-pass annotation and the more expensive nearly three-pass process, and then propose an algorithm that learns from the much cheaper single-pass annotation and achieves a performance on a par with the extractor trained on multi-pass annotated data. Furthermore, we show that given the same amount of human labor, the better way to do relation annotation is not to annotate with high-cost quality assurance, but to annotate more.
UR - http://www.scopus.com/inward/record.url?scp=85035354202&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85035354202&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85035354202
T3 - EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
SP - 194
EP - 203
BT - EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
PB - Association for Computational Linguistics (ACL)
T2 - 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012
Y2 - 23 April 2012 through 27 April 2012
ER -