Using deep morphology to improve automatic error detection in Arabic handwriting recognition

Nizar Habash, Ryan M. Roth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Arabic handwriting recognition (HR) is a challenging problem due to Arabic's connected letter forms, consonantal diacritics and rich morphology. In this paper, we isolate the task of identifying erroneous words in HR from the task of producing corrections for these words. We consider a variety of linguistic (morphological and syntactic) and non-linguistic features to automatically identify these errors. Our best approach achieves an absolute increase of roughly 15% in F-score over a simple but reasonable baseline. A detailed error analysis shows that linguistic features, such as lemma (i.e., citation form) models, help improve HR-error detection precisely where we expect them to: semantically incoherent error words.

Original language: English (US)
Title of host publication: ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies
Pages: 875-884
Number of pages: 10
State: Published - 2011
Event: 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011 - Portland, OR, United States
Duration: Jun 19 2011 to Jun 24 2011

Publication series

Name: ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
Volume: 1

Other

Other: 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-HLT 2011
Country/Territory: United States
City: Portland, OR
Period: 6/19/11 to 6/24/11

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
