Cross-lingual information extraction system evaluation

Kiyoshi Sudo, Satoshi Sekine, Ralph Grishman

Research output: Contribution to conference, Paper, peer-review

Abstract

In this paper, we discuss the performance of cross-lingual information extraction systems employing an automatic pattern acquisition module. This module, which creates extraction patterns starting from a user's narrative task description, allows rapid customization to new extraction tasks. We compare two approaches: (1) acquiring patterns in the source language, performing source language extraction, and then translating the resulting templates to the target language, and (2) translating the texts and performing pattern discovery and extraction in the target language. We demonstrate an average of 8-10% more recall using the first approach. We discuss some of the problems with machine translation and their effect on pattern discovery which lead to this difference in performance.

Original language: English (US)
State: Published - 2004
Event: 20th International Conference on Computational Linguistics, COLING 2004 - Geneva, Switzerland
Duration: Aug 23 2004 to Aug 27 2004

Conference

Conference: 20th International Conference on Computational Linguistics, COLING 2004
Country/Territory: Switzerland
City: Geneva
Period: 8/23/04 to 8/27/04

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language
