We present CALIMA-GLF, a Gulf Arabic morphological analyzer currently covering over 2,600 verbal lemmas. We describe in detail the process of building the analyzer, from phonetic dictionary entries to fully inflected orthographic paradigms, together with the associated lexicon and orthographic variants. We evaluate the coverage of CALIMA-GLF against Modern Standard Arabic and Egyptian Arabic analyzers on part of a Gulf Arabic novel. In verb analysis, CALIMA-GLF's token recall for identifying the correct POS tag exceeds that of the Modern Standard Arabic and Egyptian Arabic analyzers by over 27.4% and 16.9% absolute, respectively.
|Title of host publication|Proceedings of the Third Arabic Natural Language Processing Workshop|
|Place of Publication|Valencia, Spain|
|Publisher|Association for Computational Linguistics (ACL)|
|Number of pages|11|
|State|Published - Apr 1 2017|