NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task

Muhammad Abdul-Mageed, Amr Keleg, Abdel Rahim Elmadany, Chiyu Zhang, Injy Hamed, Walid Magdy, Houda Bouamor, Nizar Habash

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We describe the findings of the fifth Nuanced Arabic Dialect Identification Shared Task (NADI 2024). NADI’s objective is to help advance SoTA Arabic NLP by providing guidance, datasets, modeling opportunities, and standardized evaluation conditions that allow researchers to collaboratively compete on prespecified tasks. NADI 2024 targeted both dialect identification cast as a multi-label task (Subtask 1), identification of the Arabic level of dialectness (Subtask 2), and dialect-to-MSA machine translation (Subtask 3). A total of 51 unique teams registered for the shared task, of whom 12 teams have participated (with 76 valid submissions during the test phase). Among these, three teams participated in Subtask 1, three in Subtask 2, and eight in Subtask 3. The winning teams achieved 50.57 F1 on Subtask 1, 0.1403 RMSE for Subtask 2, and 20.44 BLEU in Subtask 3, respectively. Results show that Arabic dialect processing tasks such as dialect identification and machine translation remain challenging. We describe the methods employed by the participating teams and briefly offer an outlook for NADI.

Original languageEnglish (US)
Title of host publicationArabicNLP 2024 - 2nd Arabic Natural Language Processing Conference, Proceedings of the Conference
EditorsNizar Habash, Houda Bouamor, Ramy Eskander, Nadi Tomeh, Ibrahim Abu Farha, Ahmed Abdelali, Samia Touileb, Injy Hamed, Yaser Onaizan, Bashar Alhafni, Wissam Antoun, Salam Khalifa, Hatem Haddad, Imed Zitouni, Badr AlKhamissi, Rawan Almatham, Khalil Mrini
PublisherAssociation for Computational Linguistics (ACL)
Pages709-728
Number of pages20
ISBN (Electronic)9798891761322
StatePublished - 2024
Event2nd Arabic Natural Language Processing Conference, ArabicNLP 2024 - Bangkok, Thailand
Duration: Aug 16 2024 → …

Publication series

NameArabicNLP 2024 - 2nd Arabic Natural Language Processing Conference, Proceedings of the Conference

Conference

Conference2nd Arabic Natural Language Processing Conference, ArabicNLP 2024
Country/TerritoryThailand
CityBangkok
Period8/16/24 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Software
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task'. Together they form a unique fingerprint.

Cite this