Distant supervision for relation extraction with an incomplete knowledge base

Bonan Min, Ralph Grishman, Li Wan, Chang Wang, David Gondek

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Distant supervision, heuristically labeling a corpus using a knowledge base, has emerged as a popular choice for training relation extractors. In this paper, we show that a significant number of "negative" examples generated by the labeling process are false negatives because the knowledge base is incomplete. Therefore, the heuristic for generating negative examples has a serious flaw. Building on a state-of-the-art distantly-supervised extraction algorithm, we propose an algorithm that learns from only positive and unlabeled labels at the pair-of-entity level. Experimental results demonstrate its advantage over existing algorithms.
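
To make the flaw described in the abstract concrete, the following is a minimal sketch of the distant-supervision labeling heuristic, not the authors' algorithm. The toy knowledge base, the `Mention` type, the `born_in` relation name, and the `"NA"` label are all assumptions introduced for illustration. An entity pair missing from the knowledge base is labeled negative even when the sentence expresses the relation, which is how false negatives arise.

```python
from typing import Dict, List, NamedTuple, Tuple


class Mention(NamedTuple):
    """A sentence together with the entity pair it mentions (hypothetical type)."""
    sentence: str
    head: str
    tail: str


# Toy knowledge base (assumed for illustration): (head, tail) -> relation.
# It is deliberately incomplete: Marie Curie's birthplace is missing.
KB: Dict[Tuple[str, str], str] = {
    ("Ada Lovelace", "London"): "born_in",
}


def distant_label(mention: Mention, kb: Dict[Tuple[str, str], str]) -> str:
    """Return the KB relation for the entity pair, or 'NA' if the pair is absent."""
    return kb.get((mention.head, mention.tail), "NA")


mentions: List[Mention] = [
    Mention("Ada Lovelace was born in London.", "Ada Lovelace", "London"),
    Mention("Marie Curie was born in Warsaw.", "Marie Curie", "Warsaw"),
]

# Standard heuristic: every pair absent from the KB becomes a negative example.
labels = [distant_label(m, KB) for m in mentions]

# Positive-and-unlabeled view: keep KB matches as positives and treat the rest
# as unlabeled instead of negative.
positives = [m for m, y in zip(mentions, labels) if y != "NA"]
unlabeled = [m for m, y in zip(mentions, labels) if y == "NA"]

print(labels)
```

The second mention is labeled `"NA"` only because the toy knowledge base lacks the pair, so a learner that treats `"NA"` as negative is trained on a false negative. Treating such pairs as unlabeled, as in the last two lines, is the general idea the abstract points to; the paper's actual model is not reproduced here.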

Original language: English (US)
Title of host publication: NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies, Proceedings of the Main Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 777-782
Number of pages: 6
ISBN (Electronic): 9781937284473
State: Published - 2013
Event: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013 - Atlanta, United States
Duration: Jun 9 2013 → Jun 14 2013

Publication series

Name: NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference

Other

Other: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013
Country: United States
City: Atlanta
Period: 6/9/13 → 6/14/13

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language

Cite this

Min, B., Grishman, R., Wan, L., Wang, C., & Gondek, D. (2013). Distant supervision for relation extraction with an incomplete knowledge base. In NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference (pp. 777-782). Association for Computational Linguistics (ACL).