Information Extraction

Ralph Grishman

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Information extraction (IE) is the automatic identification of selected types of entities, relations, or events in free text. This article appraises two specific strands of IE: name identification and classification, and event extraction. Conventional treatments of language pay little attention to proper names, addresses, etc. Presentations of language analysis generally look up words in a dictionary and identify them as nouns, etc. Names are pervasive in text, and unless they are identified by type and treated as linguistic units, their presence makes linguistic analysis of the text difficult. Name tagging involves creating several finite-state patterns, each corresponding to some noun subset. Elements of the patterns match specific tokens or classes of tokens with particular features. Event extraction typically works by creating a series of regular expressions customized to capture the relevant events. Each enhancement of an expression is matched by a corresponding enhancement in the event patterns.
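As a rough illustration of the two techniques the abstract names, the following minimal Python sketch tags names with finite-state patterns over token classes and then applies one regular expression to capture an event. Everything in it is an assumption made for illustration and is not taken from the chapter: the token classes, the word lists (`TITLES`, `COMPANY_SUFFIXES`), the `PERSON` and `ORGANIZATION` labels, the `<LABEL:name>` markup, and the single "hire" event pattern.

```python
import re

# Hypothetical word lists defining token classes (illustrative only).
TITLES = {"Mr.", "Ms.", "Dr.", "Prof."}
COMPANY_SUFFIXES = {"Inc.", "Corp.", "Ltd."}


def token_class(token):
    """Map a token to a one-letter class code, so that patterns over
    token sequences can be written as ordinary regular expressions."""
    if token in TITLES:
        return "T"  # personal title
    if token in COMPANY_SUFFIXES:
        return "S"  # company suffix
    if token[:1].isupper():
        return "C"  # capitalized token
    return "o"      # anything else


# Each entry is a finite-state pattern over class codes; group 1 is the name.
NAME_PATTERNS = [
    ("PERSON", re.compile(r"T(C+)")),        # title, then capitalized tokens
    ("ORGANIZATION", re.compile(r"(C+)S")),  # capitalized tokens, then suffix
]


def tag_names(text):
    """Return (label, name) pairs found by the patterns above."""
    tokens = text.split()
    codes = "".join(token_class(t) for t in tokens)
    found = []
    for label, pattern in NAME_PATTERNS:
        for m in pattern.finditer(codes):
            start, end = m.span(1)  # one code per token, so spans index tokens
            found.append((label, " ".join(tokens[start:end])))
    return found


def mark_names(text):
    """Wrap each tagged name in <LABEL:name> so event patterns can refer to
    name types instead of specific strings."""
    marked = text
    for label, name in tag_names(text):
        marked = marked.replace(name, f"<{label}:{name}>", 1)
    return marked


# A single hypothetical event pattern: a person joining an organization.
HIRE_EVENT = re.compile(
    r"<PERSON:(?P<person>[^>]+)> (?:joined|was hired by) "
    r"<ORGANIZATION:(?P<org>[^>]+)>"
)


def extract_events(text):
    """Apply the event pattern to the name-marked text."""
    return [
        {"type": "HIRE", "person": m["person"], "organization": m["org"]}
        for m in HIRE_EVENT.finditer(mark_names(text))
    ]


sentence = "Yesterday Dr. Jane Smith joined Acme Widgets Inc. as chief scientist"
names = tag_names(sentence)
events = extract_events(sentence)
```

In this sketch the event pattern is written over name types rather than raw strings, which is why name tagging runs first. Covering more ways of expressing the same event would mean adding alternatives to `HIRE_EVENT` or adding further patterns.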

Original language: English (US)
Title of host publication: The Oxford Handbook of Computational Linguistics
Publisher: Oxford University Press
Volume: 9780199276349
ISBN (Electronic): 9780191743573
ISBN (Print): 9780199276349
DOIs
State: Published - Sep 18 2012

Keywords

  • Automatic
  • Event
  • Linguistic analysis
  • Name
  • Patterns
  • Tagging

ASJC Scopus subject areas

  • General Arts and Humanities
  • General Social Sciences
