Improving Arabic Dependency Parsing with Lexical and Inflectional Morphological Features

Yuval Marton, Nizar Habash, Owen Rambow

Research output: Contribution to conference › Paper › peer-review

Abstract

We explore the contribution of different lexical and inflectional morphological features to dependency parsing of Arabic, a morphologically rich language. We experiment with all leading POS tagsets for Arabic, and introduce a few new sets. We show that training the parser using a simple regular expressive extension of an impoverished POS tagset with high prediction accuracy does better than using a highly informative POS tagset with only medium prediction accuracy, although the latter performs best on gold input. Using controlled experiments, we find that definiteness (or determiner presence), the so-called phi-features (person, number, gender), and undiacritized lemma are most helpful for Arabic parsing on predicted input, while case and state are most helpful on gold.

Original language: English (US)
Pages: 13-21
Number of pages: 9
State: Published - 2010
Event: 1st Workshop on Statistical Parsing of Morphologically-Rich Languages, SPMRL 2010 - Los Angeles, United States
Duration: Jun 5 2010 → …

Conference

Conference: 1st Workshop on Statistical Parsing of Morphologically-Rich Languages, SPMRL 2010
Country/Territory: United States
City: Los Angeles
Period: 6/5/10 → …

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Computer Science Applications
