Distant supervision for relation extraction with an incomplete knowledge base

Bonan Min, Ralph Grishman, Li Wan, Chang Wang, David Gondek

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Distant supervision, heuristically labeling a corpus using a knowledge base, has emerged as a popular choice for training relation extractors. In this paper, we show that a significant number of "negative" examples generated by the labeling process are false negatives because the knowledge base is incomplete. Therefore, the heuristic for generating negative examples has a serious flaw. Building on a state-of-the-art distantly-supervised extraction algorithm, we propose an algorithm that learns from only positive and unlabeled labels at the pair-of-entity level. Experimental results demonstrate its advantage over existing algorithms.

Original language: English (US)
Title of host publication: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies
Publisher: Association for Computational Linguistics (ACL)
Pages: 777-782
Number of pages: 6
ISBN (Electronic): 9781937284473
State: Published - 2013
Event: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013 - Atlanta, United States
Duration: Jun 9 2013 to Jun 14 2013

Publication series

Name: NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference

Other

Other: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013
Country/Territory: United States
City: Atlanta
Period: 6/9/13 to 6/14/13

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language
