Abstract
Information extraction (IE) is the automatic identification of selected types of entities, relations, or events in free text. This article appraises two specific strands of IE: name identification and classification, and event extraction. Conventional treatments of language pay little attention to proper names, addresses, and the like. Presentations of language analysis generally look words up in a dictionary and identify them as nouns and so on. The frequent presence of names in a text makes linguistic analysis of that text difficult unless the names are identified by type and treated as linguistic units. Name tagging involves creating several finite-state patterns, each corresponding to some noun subset. Elements of the patterns match specific tokens or classes of tokens with particular features. Event extraction typically works by creating a series of regular expressions customized to capture the relevant events. Each enhancement to an expression is matched by a corresponding enhancement in the event patterns.
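As a rough illustration of the pattern-based approach the abstract describes, the sketch below tags names with a few regular-expression patterns and then runs an event pattern over the tagged text. It is not taken from the chapter: the title and company-suffix lists, the `PERSON` and `ORG` labels, the `<TYPE:string>` markup, and the "hire" event are all hypothetical choices made only for this example.

```python
import re

# Hypothetical token classes (assumptions for illustration only).
TITLE = r"(?:Mr|Ms|Mrs|Dr)\."          # specific tokens: personal titles
CAP = r"[A-Z][a-z]+"                   # token class: capitalized word
ORG_SUFFIX = r"(?:Inc|Corp|Ltd)\."     # specific tokens: company suffixes

# One pattern per name type; each element matches a token or token class.
NAME_PATTERNS = [
    ("PERSON", re.compile(rf"{TITLE} {CAP}(?: {CAP})*")),
    ("ORG", re.compile(rf"{CAP}(?: {CAP})* {ORG_SUFFIX}")),
]


def tag_names(text):
    """Return (type, string, start, end) tuples sorted by start offset."""
    found = []
    for label, pattern in NAME_PATTERNS:
        for m in pattern.finditer(text):
            found.append((label, m.group(0), m.start(), m.end()))
    return sorted(found, key=lambda item: item[2])


def mark_up(text):
    """Wrap each tagged name as <TYPE:string> so later patterns can use name types."""
    pieces, pos = [], 0
    for label, name, start, end in tag_names(text):
        if start < pos:  # skip a name that overlaps one already wrapped
            continue
        pieces.append(text[pos:start])
        pieces.append(f"<{label}:{name}>")
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces)


# Hypothetical event pattern: an organization hires or appoints a person.
HIRE = re.compile(
    r"<ORG:(?P<org>[^>]+)> (?:hired|appointed) <PERSON:(?P<person>[^>]+)>"
)


def extract_hires(text):
    """Return one dict per hire event found in the name-tagged text."""
    return [m.groupdict() for m in HIRE.finditer(mark_up(text))]
```

Because the event pattern refers to name types rather than to raw strings, extending a name pattern (for example, adding another title) widens what the event pattern can capture without changing the event pattern itself.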
Original language | English (US) |
---|---|
Title of host publication | The Oxford Handbook of Computational Linguistics |
Publisher | Oxford University Press |
Volume | 9780199276349 |
ISBN (Electronic) | 9780191743573 |
ISBN (Print) | 9780199276349 |
State | Published - Sep 18 2012 |
Keywords
- Automatic
- Event
- Linguistic analysis
- Name
- Patterns
- Tagging
ASJC Scopus subject areas
- General Arts and Humanities
- General Social Sciences