Abstract
This paper presents a range of preprocessing solutions for Hebrew-English statistical machine translation. Our best system, which uses a morphological analyzer, improves by 3.5 BLEU points over a no-tokenization baseline on a blind test set. The next-best system uses Morfessor, an unsupervised morphological segmenter, and gains almost 3.0 BLEU points over the baseline.
Original language | English (US) |
---|---|
Pages | 43-50 |
Number of pages | 8 |
State | Published - 2012 |
Event | 16th Annual Conference of the European Association for Machine Translation, EAMT 2012 - Trento, Italy; Duration: May 28 2012 → May 30 2012 |
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Software