TY - CONF
T1 - Spoken Arabic dialect identification using phonotactic modeling
AU - Biadsy, Fadi
AU - Hirschberg, Julia
AU - Habash, Nizar
N1 - Funding Information:
We thank Dan Ellis, Michael Mandel, and Andrew Rosenberg for useful discussions. This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR0011-06-C-0023 (approved for public release, distribution unlimited). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA.
Publisher Copyright:
© 2009 Association for Computational Linguistics.
PY - 2009
Y1 - 2009
AB - The Arabic language is a collection of multiple variants, among which Modern Standard Arabic (MSA) has a special status as the formal written standard language of the media, culture and education across the Arab world. The other variants are informal spoken dialects that are the media of communication for daily life. Arabic dialects differ substantially from MSA and from each other in terms of phonology, morphology, lexical choice and syntax. In this paper, we describe a system that automatically identifies the Arabic dialect (Gulf, Iraqi, Levantine, Egyptian and MSA) of a speaker given a sample of his/her speech. The phonotactic approach we use proves to be effective in identifying these dialects, with an overall accuracy of 81.60% using 30s test utterances.
UR - http://www.scopus.com/inward/record.url?scp=85026860559&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85026860559&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85026860559
SP - 53
EP - 61
T2 - EACL 2009 Workshop on Computational Approaches to Semitic Languages, SEMITIC@EACL 2009
Y2 - 31 March 2009
ER -