CATiB: The Columbia Arabic Treebank

Nizar Habash, Ryan M. Roth

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

The Columbia Arabic Treebank (CATiB) is a database of syntactic analyses of Arabic sentences. CATiB contrasts with previous approaches to Arabic treebanking in its emphasis on speed, with some constraints on linguistic richness. Two basic ideas inspire the CATiB approach: avoiding annotation of redundant information, and using representations and terminology inspired by traditional Arabic syntax. We describe CATiB's representation and annotation procedure, and report on inter-annotator agreement and speed.

Original language: English (US)
Title of host publication: ACL-IJCNLP 2009 - Joint Conf. of the 47th Annual Meeting of the Association for Computational Linguistics and 4th Int. Joint Conf. on Natural Language Processing of the AFNLP, Proceedings of the Conf.
Pages: 221-224
Number of pages: 4
State: Published - 2009
Event: Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009 - Suntec, Singapore
Duration: Aug 2 2009 to Aug 7 2009

Publication series

Name: ACL-IJCNLP 2009 - Joint Conf. of the 47th Annual Meeting of the Association for Computational Linguistics and 4th Int. Joint Conf. on Natural Language Processing of the AFNLP, Proceedings of the Conf.

Other

Other: Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009
Country/Territory: Singapore
City: Suntec
Period: 8/2/09 to 8/7/09

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Artificial Intelligence
  • Software
