POS-tagging of Tunisian dialect using Standard Arabic resources and tools

Ahmed Hamdi, Alexis Nasr, Nizar Habash, Núria Gala

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Developing natural language processing tools usually requires a large number of resources (lexica, annotated corpora, etc.), which often do not exist for less-resourced languages. One way to overcome this lack of resources is to devote substantial effort to building new ones from scratch. Another approach is to exploit existing resources of closely related languages. In this paper, we focus on developing a part-of-speech tagger for the Tunisian Arabic dialect (TUN), a low-resource language, by exploiting its closeness to Modern Standard Arabic (MSA), which has many state-of-the-art resources and tools. Our system achieved an accuracy of 89% (∼20% absolute improvement over an MSA tagger baseline).

Original language: English (US)
Title of host publication: 2nd Workshop on Arabic Natural Language Processing, ANLP 2015 - held at 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015 - Proceedings
Editors: Nizar Habash, Stephan Vogel, Kareem Darwish
Publisher: Association for Computational Linguistics (ACL)
Pages: 59-68
Number of pages: 10
ISBN (Electronic): 9781941643587
State: Published - 2015
Event: 2nd Workshop on Arabic Natural Language Processing, ANLP 2015 - Beijing, China
Duration: Jul 30 2015 → …

Publication series

Name: 2nd Workshop on Arabic Natural Language Processing, ANLP 2015 - held at 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015 - Proceedings

Conference

Conference: 2nd Workshop on Arabic Natural Language Processing, ANLP 2015
Country/Territory: China
City: Beijing
Period: 7/30/15 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Computational Theory and Mathematics
  • Software
