Acquiring topic features to improve event extraction: In pre-selected and balanced collections

Shasha Liao, Ralph Grishman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Event extraction is a particularly challenging type of information extraction (IE) that may require inferences from the whole article. However, most current event extraction systems rely on local information at the phrase or sentence level, and do not consider the article as a whole, thus limiting extraction performance. Moreover, most annotated corpora are artificially enriched to include enough positive samples of the events of interest; event identification on a more balanced collection, such as unfiltered newswire, may perform much worse. In this paper, we investigate the use of unsupervised topic models to extract topic features to improve event extraction both on test data similar to training data, and on more balanced collections. We compare this unsupervised approach to a supervised multi-label text classifier, and show that unsupervised topic modeling can get better results for both collections, and especially for a more balanced collection. We show that the unsupervised topic model can improve trigger, argument and role labeling by 3.5%, 6.9% and 6% respectively on a pre-selected corpus, and by 16.8%, 12.5% and 12.7% on a balanced corpus.

Original language: English (US)
Title of host publication: International Conference Recent Advances in Natural Language Processing, RANLP
Pages: 9-16
Number of pages: 8
State: Published - 2011
Event: 8th International Conference on Recent Advances in Natural Language Processing, RANLP 2011 - Hissar, Bulgaria
Duration: Sep 12 2011 to Sep 14 2011

Other

Other: 8th International Conference on Recent Advances in Natural Language Processing, RANLP 2011
Country/Territory: Bulgaria
City: Hissar
Period: 9/12/11 to 9/14/11

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
  • Electrical and Electronic Engineering
