One-step statistical parsing of hybrid dependency-constituency syntactic representations

Kais Dukes, Nizar Habash

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we describe and compare two statistical parsing approaches for the hybrid dependency-constituency syntactic representation used in the Quranic Arabic Treebank (Dukes and Buckwalter, 2010). In our first approach, we apply a multi-step process in which we use a shift-reduce algorithm trained on a pure dependency preprocessed version of the treebank. After parsing, the dependency output is converted into the hybrid representation. This is compared to a novel one-step parser that is able to learn the hybrid representation without preprocessing. We define an extended labelled attachment score (ELAS) as our performance metric for hybrid parsing, and report 87.47% (F1 score) for the multi-step approach, and 89.03% (F1 score) for the one-step integrated algorithm. We also consider the effect of using different sets of morphological features for parsing the Quran, comparing our results to recent work on Modern Standard Arabic.
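The abstract reports ELAS as an F1 score but does not define the metric. The sketch below only illustrates how a labelled-attachment F1 can be computed over sets of labelled edges. It assumes each edge is a (head, dependent, label) triple whose endpoints may be words or phrase nodes. The `Edge` type, the node identifiers, and the labels are hypothetical, and the paper's actual ELAS definition may differ.

```python
from typing import NamedTuple, Set, Tuple


class Edge(NamedTuple):
    """A labelled edge; head and dependent may name a word or a phrase node."""
    head: str
    dependent: str
    label: str


def labelled_f1(gold: Set[Edge], predicted: Set[Edge]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exactly matching labelled edges."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: node names and labels are invented for illustration.
gold = {
    Edge("w2", "w1", "subj"),
    Edge("w2", "w3", "obj"),
    Edge("w2", "p(4-5)", "circ"),  # dependent is a phrase node
    Edge("w5", "w4", "gen"),
}
predicted = {
    Edge("w2", "w1", "subj"),
    Edge("w2", "w3", "obj"),
    Edge("w5", "w4", "gen"),
}

precision, recall, f1 = labelled_f1(gold, predicted)
```

Here the parser finds three of the four gold edges and proposes no wrong ones, so precision is 1.0 and recall is 0.75. Reporting F1 rather than a single accuracy figure makes sense when the predicted and gold edge sets can differ in size, which can happen once phrase nodes are part of the output.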

Original language: English (US)
Title of host publication: IWPT 2011 - Proceedings of the 12th International Conference on Parsing Technologies
Publisher: Association for Computational Linguistics (ACL)
Pages: 92-103
Number of pages: 12
ISBN (Electronic): 9781932432046
State: Published - 2011
Event: 12th International Conference on Parsing Technologies, IWPT 2011 - Dublin, Ireland
Duration: Oct 5 2011 → Oct 7 2011

Publication series

Name: IWPT 2011 - Proceedings of the 12th International Conference on Parsing Technologies

Conference

Conference: 12th International Conference on Parsing Technologies, IWPT 2011
Country/Territory: Ireland
City: Dublin
Period: 10/5/11 → 10/7/11

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
