Improving the Arabic pronunciation dictionary for phone and word recognition with linguistically-based pronunciation rules

Fadi Biadsy, Nizar Habash, Julia Hirschberg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we show that linguistically motivated pronunciation rules can improve phone and word recognition results for Modern Standard Arabic (MSA). Using these rules and the MADA morphological analysis and disambiguation tool, multiple pronunciations per word are automatically generated to build two pronunciation dictionaries: one for training and another for decoding. We demonstrate that the use of these rules can significantly improve both MSA phone recognition and MSA word recognition accuracy over a baseline system using pronunciation rules typically employed in previous work on MSA Automatic Speech Recognition (ASR). We obtain a significant improvement in absolute accuracy of 3.77%-7.29% in phone recognition and a significant improvement of 4.1% in absolute accuracy in ASR.

Original language: English (US)
Title of host publication: NAACL HLT 2009 - Human Language Technologies
Subtitle of host publication: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 397-405
Number of pages: 9
ISBN (Print): 9781932432411
State: Published - 2009
Event: Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2009 - Boulder, CO, United States
Duration: May 31 2009 to Jun 5 2009

Publication series

Name: NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference

Other

Other: Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2009
Country/Territory: United States
City: Boulder, CO
Period: 5/31/09 to 6/5/09

ASJC Scopus subject areas

  • Language and Linguistics
  • Social Sciences (miscellaneous)
