TY - GEN
T1 - Passage Retrieval for Information Extraction using Distant Supervision
AU - Xu, Wei
AU - Grishman, Ralph
AU - Zhao, Le
N1 - Funding Information:
Supported by the Intelligence Advanced Research Projects Activity (IARPA) via Air Force Research Laboratory (AFRL) contract number FA8650-10-C-7058. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, AFRL, or the U.S. Government.
We wish to thank Adam Meyers and Satoshi Sekine of New York University and Heng Ji of the City University of New York for their advice.
Publisher Copyright:
© 2011 AFNLP
PY - 2011
Y1 - 2011
N2 - In this paper, we propose a keyword-based passage retrieval algorithm for information extraction, trained by distant supervision. Our goal is to extract attributes of people and organizations more quickly and accurately by first ranking all the potentially relevant passages according to their likelihood of containing the answer and then performing a traditional deeper, slower analysis of individual passages. Using Freebase as our source of known relation instances and Wikipedia as our text source, we collect a weighted set of keywords indicative of each relation and then use it to re-rank the passages retrieved by the Lemur search engine. Experiments show that our algorithm significantly outperforms state-of-the-art passage retrieval techniques in evaluations of both individual passage retrieval and end-to-end information extraction.
AB - In this paper, we propose a keyword-based passage retrieval algorithm for information extraction, trained by distant supervision. Our goal is to extract attributes of people and organizations more quickly and accurately by first ranking all the potentially relevant passages according to their likelihood of containing the answer and then performing a traditional deeper, slower analysis of individual passages. Using Freebase as our source of known relation instances and Wikipedia as our text source, we collect a weighted set of keywords indicative of each relation and then use it to re-rank the passages retrieved by the Lemur search engine. Experiments show that our algorithm significantly outperforms state-of-the-art passage retrieval techniques in evaluations of both individual passage retrieval and end-to-end information extraction.
UR - http://www.scopus.com/inward/record.url?scp=84868539150&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84868539150&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84868539150
T3 - IJCNLP 2011 - Proceedings of the 5th International Joint Conference on Natural Language Processing
SP - 1046
EP - 1054
BT - IJCNLP 2011 - Proceedings of the 5th International Joint Conference on Natural Language Processing
A2 - Wang, Haifeng
A2 - Yarowsky, David
PB - Association for Computational Linguistics (ACL)
T2 - 5th International Joint Conference on Natural Language Processing, IJCNLP 2011
Y2 - 8 November 2011 through 13 November 2011
ER -