TY - CONF
T1 - Cross-lingual information extraction system evaluation
AU - Sudo, Kiyoshi
AU - Sekine, Satoshi
AU - Grishman, Ralph
N1 - Funding Information:
This research was supported in part by the Defense Advanced Research Projects Agency as part of the Translingual Information Detection, Extraction and Summarization (TIDES) program, under Grant N66001-001-1-8917 from the Space and Naval Warfare Systems Center, San Diego, and by the National Science Foundation under Grant ITS-00325657. This paper does not necessarily reflect the position of the U.S. Government.
Publisher Copyright:
© 2004 COLING 2004 - Proceedings of the 20th International Conference on Computational Linguistics. All rights reserved.
PY - 2004
Y1 - 2004
N2 - In this paper, we discuss the performance of crosslingual information extraction systems employing an automatic pattern acquisition module. This module, which creates extraction patterns starting from a user's narrative task description, allows rapid customization to new extraction tasks. We compare two approaches: (1) acquiring patterns in the source language, performing source language extraction, and then translating the resulting templates to the target language, and (2) translating the texts and performing pattern discovery and extraction in the target language. We demonstrate an average of 8-10% more recall using the first approach. We discuss some of the problems with machine translation and their effect on pattern discovery which lead to this difference in performance.
UR - http://www.scopus.com/inward/record.url?scp=85021659169&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021659169&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85021659169
T2 - 20th International Conference on Computational Linguistics, COLING 2004
Y2 - 23 August 2004 through 27 August 2004
ER -