Improving NER in Arabic using a morphological tagger

Benjamin Farber, Dayne Freitag, Nizar Habash, Owen Rambow

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We discuss a named entity recognition system for Arabic, and show how we incorporated the information provided by MADA, a full morphological tagger which uses a morphological analyzer. Surprisingly, the relevant features used are the capitalization of the English gloss chosen by the tagger, and the fact that an analysis is returned (that a word is not OOV to the morphological analyzer). The use of the tagger also improves over a third system which just uses a morphological analyzer, yielding a 14% reduction in error over the baseline. We conduct a thorough error analysis to identify sources of success and failure among the variations, and show that by combining the systems in simple ways we can significantly influence the precision-recall trade-off.

Original language: English (US)
Title of host publication: Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008
Publisher: European Language Resources Association (ELRA)
Pages: 2509-2514
Number of pages: 6
ISBN (Electronic): 2951740840, 9782951740846
State: Published - 2008
Event: 6th International Conference on Language Resources and Evaluation, LREC 2008 - Marrakech, Morocco
Duration: May 28 2008 to May 30 2008

Publication series

Name: Proceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008

Other

Other: 6th International Conference on Language Resources and Evaluation, LREC 2008
Country: Morocco
City: Marrakech
Period: 5/28/08 to 5/30/08

ASJC Scopus subject areas

  • Library and Information Sciences
  • Linguistics and Language
  • Language and Linguistics
  • Education
