Hebrew morphological preprocessing for statistical machine translation

Nimesh Singh, Nizar Habash

Research output: Contribution to conference (Paper)

Abstract

This paper presents a range of preprocessing solutions for Hebrew-English statistical machine translation. Our best system, which uses a morphological analyzer, gains 3.5 BLEU points over a no-tokenization baseline on a blind test set. The next best system uses Morfessor, an unsupervised morphological segmenter, and gains almost 3.0 BLEU points over the baseline.

Original language: English (US)
Pages: 43-50
Number of pages: 8
State: Published - 2012
Event: 16th Annual Conference of the European Association for Machine Translation, EAMT 2012 - Trento, Italy
Duration: May 28, 2012 to May 30, 2012

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Software

Cite this

Singh, N., & Habash, N. (2012). Hebrew morphological preprocessing for statistical machine translation (pp. 43-50). Paper presented at 16th Annual Conference of the European Association for Machine Translation, EAMT 2012, Trento, Italy.