Machine translation evaluation for Arabic using morphologically-enriched embeddings

Francisco Guzmán, Houda Bouamor, Ramy Baly, Nizar Habash

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Evaluation of machine translation (MT) into morphologically rich languages (MRL) has not been well studied despite posing many challenges. In this paper, we explore the use of embeddings obtained from different levels of lexical and morpho-syntactic linguistic analysis and show that they improve MT evaluation into an MRL. Specifically we report on Arabic, a language with complex and rich morphology. Our results show that using a neural-network model with different input representations produces results that clearly outperform the state-of-the-art for MT evaluation into Arabic, by almost over 75% increase in correlation with human judgments on pairwise MT evaluation quality task. More importantly, we demonstrate the usefulness of morpho-syntactic representations to model sentence similarity for MT evaluation and address complex linguistic phenomena of Arabic.

Original languageEnglish (US)
Title of host publicationCOLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
Subtitle of host publicationTechnical Papers
PublisherAssociation for Computational Linguistics, ACL Anthology
Pages1398-1408
Number of pages11
ISBN (Print)9784879747020
StatePublished - 2016
Event26th International Conference on Computational Linguistics, COLING 2016 - Osaka, Japan
Duration: Dec 11 2016Dec 16 2016

Publication series

NameCOLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers

Other

Other26th International Conference on Computational Linguistics, COLING 2016
CountryJapan
CityOsaka
Period12/11/1612/16/16

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Linguistics and Language

Fingerprint Dive into the research topics of 'Machine translation evaluation for Arabic using morphologically-enriched embeddings'. Together they form a unique fingerprint.

Cite this