Text readability for Arabic as a foreign language

Hind Saddiki, Karim Bouzoubaa, Violetta Cavalli-Sforza

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this study, we evaluate the informativeness of lexical, morphological and semantic features in determining the readability of texts geared towards learners of Arabic as a foreign language. We have gathered low-complexity features with the purpose of establishing a baseline for future research in readability assessment, using freely available natural language processing (NLP) and machine learning (ML) tools on a publicly accessible corpus. We tested common classification algorithms, as well as random forests (an ensemble learning method), and report their results using several evaluation measures for comparability with similar work. Our results suggest that a small set of easily computed features can be indicative of the reading level of a text. Moreover, our findings will serve as a common ground, for ourselves and others, to evaluate and compare the performance of more elaborate techniques and feature sets.

Original language: English (US)
Title of host publication: 2015 IEEE/ACS 12th International Conference of Computer Systems and Applications, AICCSA 2015
Publisher: IEEE Computer Society
ISBN (Electronic): 9781509004782
State: Published - Jul 7 2016
Event: 12th IEEE/ACS International Conference of Computer Systems and Applications, AICCSA 2015 - Marrakech, Morocco
Duration: Nov 17 2015 → Nov 20 2015

Publication series

Name: Proceedings of IEEE/ACS International Conference on Computer Systems and Applications, AICCSA
Volume: 2016-July
ISSN (Print): 2161-5322
ISSN (Electronic): 2161-5330

Other

Other: 12th IEEE/ACS International Conference of Computer Systems and Applications, AICCSA 2015
Country/Territory: Morocco
City: Marrakech
Period: 11/17/15 → 11/20/15

Keywords

  • Arabic
  • foreign language learning
  • machine learning
  • natural language processing
  • text readability

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Signal Processing
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
