Identification of naturally occurring numerical expressions in Arabic

Nizar Habash, Ryan Roth

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

In this paper, we define the task of Number Identification in natural context. We present and validate a language-independent semiautomatic approach to quickly building a gold standard for evaluating number identification systems by exploiting hand-aligned parallel data. We also present and extensively evaluate a robust rule-based system for number identification in natural context for Arabic for a variety of number formats and types. The system is shown to have strong performance, achieving, on a blind test, a 94.8% F-score for the task of correctly identifying number expression spans in natural text, and a 92.1% F-score for the task of correctly determining the core numerical value.

Original languageEnglish (US)
Title of host publicationProceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008
PublisherEuropean Language Resources Association (ELRA)
Pages3330-3336
Number of pages7
ISBN (Electronic)2951740840, 9782951740846
StatePublished - 2008
Event6th International Conference on Language Resources and Evaluation, LREC 2008 - Marrakech, Morocco
Duration: May 28 2008May 30 2008

Publication series

NameProceedings of the 6th International Conference on Language Resources and Evaluation, LREC 2008

Other

Other6th International Conference on Language Resources and Evaluation, LREC 2008
Country/TerritoryMorocco
CityMarrakech
Period5/28/085/30/08

ASJC Scopus subject areas

  • Library and Information Sciences
  • Linguistics and Language
  • Language and Linguistics
  • Education

Fingerprint

Dive into the research topics of 'Identification of naturally occurring numerical expressions in Arabic'. Together they form a unique fingerprint.

Cite this