TY - GEN
T1 - Morphologically annotated corpora for seven Arabic dialects
T2 - 4th Arabic Natural Language Processing Workshop, WANLP 2019, held at ACL 2019
AU - Alshargi, Faisal
AU - Dibas, Shahd
AU - Alkhereyf, Sakhar
AU - Faraj, Reem
AU - Abdulkareem, Basmah
AU - Yagi, Sane
AU - Kacha, Ouafaa
AU - Habash, Nizar
AU - Rambow, Owen
N1 - Funding Information:
This work is supported by the Air Force Research Laboratory (AFRL) under a grant administered by Ball Aerospace. Alkhereyf is supported by the KACST Graduate Studies program. The views expressed here are those of the authors and do not reflect the official policy or position of the U.S. Department of Defense or the U.S. Government. We would also like to thank all the anonymous reviewers for their insightful and valuable comments and suggestions.
Publisher Copyright:
© ACL 2019. All rights reserved.
PY - 2019
Y1 - 2019
AB - We present a collection of morphologically annotated corpora for seven Arabic dialects: Taizi Yemeni, Sanaani Yemeni, Najdi, Jordanian, Syrian, Iraqi and Moroccan Arabic. The corpora collectively cover over 200,000 words, and are all manually annotated in a common set of standards for orthography, diacritized lemmas, tokenization, morphological units and English glosses. These corpora will be publicly available to serve as benchmarks for training and evaluating systems for Arabic dialect morphological analysis and disambiguation.
UR - http://www.scopus.com/inward/record.url?scp=85096536274&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096536274&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85096536274
T3 - ACL 2019 - 4th Arabic Natural Language Processing Workshop, WANLP 2019 - Proceedings of the Workshop
SP - 137
EP - 147
BT - ACL 2019 - 4th Arabic Natural Language Processing Workshop, WANLP 2019 - Proceedings of the Workshop
PB - Association for Computational Linguistics (ACL)
Y2 - 1 August 2019
ER -