CamelParser2.0: A State-of-the-Art Dependency Parser for Arabic

Ahmed Elshabrawy, Muhammed AbuOdeh, Go Inoue, Nizar Habash

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present CamelParser2.0, an open-source Python-based Arabic dependency parser targeting two popular Arabic dependency formalisms: the Columbia Arabic Treebank (CATiB) and Universal Dependencies (UD). The CamelParser2.0 pipeline processes raw text and produces tokenization, part-of-speech tags, and rich morphological features. As part of developing CamelParser2.0, we explore many system design hyper-parameters, such as parsing model architecture and pretrained language model selection, achieving new state-of-the-art performance across diverse Arabic genres under gold and predicted tokenization settings.

Original language: English (US)
Title of host publication: ArabicNLP 2023 - 1st Arabic Natural Language Processing Conference, Proceedings
Editors: Hassan Sawaf, Samhaa El-Beltagy, Wajdi Zaghouani, Walid Magdy, Nadi Tomeh, Ibrahim Abu Farha, Nizar Habash, Salam Khalifa, Amr Keleg, Hatem Haddad, Imed Zitouni, Ahmed Abdelali, Khalil Mrini, Rawan Almatham
Publisher: Association for Computational Linguistics (ACL)
Pages: 170-180
Number of pages: 11
ISBN (Electronic): 9781959429272
State: Published - 2023
Event: 1st Arabic Natural Language Processing Conference, ArabicNLP 2023 - Hybrid, Singapore, Singapore
Duration: Dec 7 2023 → …

Publication series

Name: ArabicNLP 2023 - 1st Arabic Natural Language Processing Conference, Proceedings

Conference

Conference: 1st Arabic Natural Language Processing Conference, ArabicNLP 2023
Country/Territory: Singapore
City: Hybrid, Singapore
Period: 12/7/23 → …

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software
  • Linguistics and Language
