Passage Retrieval for Information Extraction using Distant Supervision

Wei Xu, Ralph Grishman, Le Zhao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper, we propose a keyword-based passage retrieval algorithm for information extraction, trained by distant supervision. Our goal is to be able to extract attributes of people and organizations more quickly and accurately by first ranking all the potentially relevant passages according to their likelihood of containing the answer and then performing a traditional deeper, slower analysis of individual passages. Using Freebase as our source of known relation instances and Wikipedia as our text source, we collect a weighted set of keywords indicative of each relation and then use it to re-rank the passages retrieved by the Lemur search engine. Experiments show that our algorithm significantly outperforms state-of-the-art passage retrieval techniques in evaluations of both individual passage retrieval and end-to-end information extraction.
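
The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's weighting scheme, so everything below is an assumption made for illustration: the function names (`tokenize`, `learn_keyword_weights`, `rerank`) are hypothetical, the smoothed log-odds weight is a stand-in rather than the authors' formula, and plain Python lists stand in for Freebase relation instances, Wikipedia text, and the passages a search engine such as Lemur would return.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a passage and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def learn_keyword_weights(passages, known_pairs):
    """Learn one weight per word for a single relation.

    Distant-supervision assumption: a passage that mentions both members of a
    known (entity, value) pair is treated as a positive example of the
    relation; every other passage is treated as a negative example.
    """
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for passage in passages:
        low = passage.lower()
        is_positive = any(e.lower() in low and v.lower() in low
                          for e, v in known_pairs)
        tokens = set(tokenize(passage))  # count each word once per passage
        if is_positive:
            pos.update(tokens)
            n_pos += 1
        else:
            neg.update(tokens)
            n_neg += 1

    weights = {}
    for word in set(pos) | set(neg):
        # Add-one smoothed log-odds (illustrative choice, not the paper's).
        p_pos = (pos[word] + 1) / (n_pos + 2)
        p_neg = (neg[word] + 1) / (n_neg + 2)
        weights[word] = math.log(p_pos / p_neg)
    return weights


def rerank(candidates, weights):
    """Sort candidate passages by the summed weight of their distinct words."""
    def score(passage):
        return sum(weights.get(t, 0.0) for t in set(tokenize(passage)))
    return sorted(candidates, key=score, reverse=True)


# Toy data standing in for a knowledge base and a text collection.
known_birthplaces = [("Barack Obama", "Honolulu"), ("Ada Lovelace", "London")]
training_passages = [
    "Barack Obama was born in Honolulu.",
    "Ada Lovelace was born in London.",
    "Barack Obama gave a speech in Chicago.",
    "Ada Lovelace wrote notes on the engine.",
]
birthplace_weights = learn_keyword_weights(training_passages, known_birthplaces)

# Candidates standing in for passages returned by a first-pass search engine.
retrieved = [
    "Marie Curie won the Nobel Prize.",
    "Marie Curie was born in Warsaw.",
]
print(rerank(retrieved, birthplace_weights))
```

In this toy setting, words such as "born" receive high weights because they occur mostly in passages that contain a known entity-value pair, so passages containing them move toward the top of the list before any slower extraction step runs.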

Original language: English (US)
Title of host publication: IJCNLP 2011 - Proceedings of the 5th International Joint Conference on Natural Language Processing
Editors: Haifeng Wang, David Yarowsky
Publisher: Association for Computational Linguistics (ACL)
Pages: 1046-1054
Number of pages: 9
ISBN (Electronic): 9789744665645
State: Published - 2011
Event: 5th International Joint Conference on Natural Language Processing, IJCNLP 2011 - Chiang Mai, Thailand
Duration: Nov 8 2011 to Nov 13 2011

Publication series

Name: IJCNLP 2011 - Proceedings of the 5th International Joint Conference on Natural Language Processing

Conference

Conference: 5th International Joint Conference on Natural Language Processing, IJCNLP 2011
Country/Territory: Thailand
City: Chiang Mai
Period: 11/8/11 to 11/13/11

ASJC Scopus subject areas

  • Language and Linguistics
  • Artificial Intelligence
  • Software
  • Linguistics and Language
