Orthographic and Morphological Processing for Persian-to-English Statistical Machine Translation

Mohammad Sadegh Rasooli, Ahmed El Kholy, Nizar Habash

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In statistical machine translation, data sparsity is a challenging problem, especially for languages with rich morphology and inconsistent orthography, such as Persian. We show that orthographic preprocessing and morphological segmentation, of Persian verbs in particular, improve Persian-English translation quality by 1.9 BLEU points on a blind test set.

Original language: English (US)
Title of host publication: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Proceedings of the Main Conference
Editors: Ruslan Mitkov, Jong C. Park
Publisher: Asian Federation of Natural Language Processing
Pages: 1047-1051
Number of pages: 5
ISBN (Electronic): 9784990734800
State: Published - 2013
Event: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Nagoya, Japan
Duration: Oct 14 2013 → …

Publication series

Name: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013 - Proceedings of the Main Conference

Conference

Conference: 6th International Joint Conference on Natural Language Processing, IJCNLP 2013
Country/Territory: Japan
City: Nagoya
Period: 10/14/13 → …

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
