Abstract
We present the results and findings of the First Nuanced Arabic Dialect Identification Shared Task (NADI). The shared task includes two subtasks: country-level dialect identification (Subtask 1) and province-level sub-dialect identification (Subtask 2). The data covers a total of 100 provinces from 21 Arab countries and was collected from Twitter. As such, NADI is the first shared task to target naturally occurring fine-grained dialectal text at the sub-country level. A total of 61 teams from 25 countries registered to participate, reflecting the community's interest in this area. We received 47 submissions for Subtask 1 from 18 teams and 9 submissions for Subtask 2 from 9 teams.
Field | Value |
---|---|
Original language | Undefined |
Title of host publication | Proceedings of the Fifth Arabic Natural Language Processing Workshop |
Place of Publication | Barcelona, Spain (Online) |
Publisher | Association for Computational Linguistics |
Pages | 97-110 |
Number of pages | 14 |
State | Published - Dec 1 2020 |