TY - GEN
T1 - Combination of Arabic preprocessing schemes for statistical machine translation
AU - Sadat, Fatiha
AU - Habash, Nizar
PY - 2006
Y1 - 2006
AB - Statistical machine translation is quite robust when it comes to the choice of input representation. It only requires consistency between training and testing. As a result, there is a wide range of possible preprocessing choices for data used in statistical machine translation. This is even more so for morphologically rich languages such as Arabic. In this paper, we study the effect of different word-level preprocessing schemes for Arabic on the quality of phrase-based statistical machine translation. We also present and evaluate different methods for combining preprocessing schemes, resulting in improved translation quality.
UR - http://www.scopus.com/inward/record.url?scp=84860521060&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84860521060&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84860521060
SN - 1932432655
SN - 9781932432657
T3 - COLING/ACL 2006 - 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
SP - 1
EP - 8
BT - COLING/ACL 2006 - 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
T2 - 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, COLING/ACL 2006
Y2 - 17 July 2006 through 21 July 2006
ER -