Automatic extraction of morphological lexicons from morphologically annotated corpora

Ramy Eskander, Nizar Habash, Owen Rambow

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a method for automatically learning inflectional classes and associated lemmas from morphologically annotated corpora. The method consists of a core language-independent algorithm, which can be optimized for specific languages. The method is demonstrated on Egyptian Arabic and German, two morphologically rich languages. Our best method for Egyptian Arabic provides an error reduction of 55.6% over a simple baseline; our best method for German achieves a 66.7% error reduction.

Original language: English (US)
Title of host publication: EMNLP 2013 - 2013 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 1032-1043
Number of pages: 12
ISBN (Electronic): 9781937284978
State: Published - 2013
Event: 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013 - Seattle, United States
Duration: Oct 18 2013 to Oct 21 2013

Publication series

Name: EMNLP 2013 - 2013 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference

Other

Other: 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP 2013
Country: United States
City: Seattle
Period: 10/18/13 to 10/21/13

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Information Systems
  • Computer Vision and Pattern Recognition
