A large-scale leveled readability lexicon for standard Arabic

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We present a large-scale, 26,000-lemma leveled readability lexicon for Modern Standard Arabic. The lexicon was manually annotated in triplicate by language professionals from three regions of the Arab world. The annotations show a high degree of agreement, and major differences were limited to regional variations. Comparing lemma readability levels with their frequencies provided good insights into the benefits and pitfalls of frequency-based readability approaches. The lexicon will be publicly available.
Original language: English (US)
Title of host publication: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
Editors: Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association (ELRA)
Pages: 3053-3062
Number of pages: 10
ISBN (Electronic): 9791095546344
ISBN (Print): 9791095546344
State: Published - 2020

Publication series

Name: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings

Keywords

  • Arabic
  • Lexicon
  • Readability

ASJC Scopus subject areas

  • Education
  • Library and Information Sciences
  • Language and Linguistics
  • Linguistics and Language
