Morphological Analysis and Disambiguation for Dialectal Arabic

Nizar Habash, Ryan Roth, Owen Rambow, Ramy Eskander, Nadi Tomeh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The many differences between Dialectal Arabic and Modern Standard Arabic (MSA) pose a challenge to the majority of Arabic natural language processing tools, which are designed for MSA. In this paper, we retarget an existing state-of-the-art MSA morphological tagger to Egyptian Arabic (ARZ). Our evaluation demonstrates that our ARZ morphology tagger outperforms its MSA variant on ARZ input in terms of accuracy in part-of-speech tagging, diacritization, lemmatization and tokenization; and in terms of utility for ARZ-to-English statistical machine translation.

Original language: English (US)
Title of host publication: Proceedings of the 2nd Workshop on Computational Linguistics for Literature, CLfL 2013 at the 2013 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies, NAACL-HLT 2013
Editors: David Elson, Anna Kazantseva, Stan Szpakowicz
Publisher: Association for Computational Linguistics (ACL)
Pages: 426-432
Number of pages: 7
ISBN (Electronic): 9781937284473
State: Published - 2013
Event: 2nd Workshop on Computational Linguistics for Literature, CLfL 2013 at the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2013 - Atlanta, United States
Duration: Jun 14 2013 → …

Conference

Conference: 2nd Workshop on Computational Linguistics for Literature, CLfL 2013 at the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2013
Country/Territory: United States
City: Atlanta
Period: 6/14/13 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
