Multi-align: Combining linguistic and statistical techniques to improve alignments for adaptable MT

Necip Fazil Ayan, Bonnie J. Dorr, Nizar Habash

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

An adaptable statistical or hybrid MT system relies heavily on the quality of word-level alignments of real-world data. Statistical alignment approaches provide a reasonable initial estimate for word alignment. However, they cannot handle certain types of linguistic phenomena such as long-distance dependencies and structural differences between languages. We address this issue in Multi-Align, a new framework for incremental testing of different alignment algorithms and their combinations. Our design allows users to tune their systems to the properties of a particular genre/domain while still benefiting from general linguistic knowledge associated with a language pair. We demonstrate that a combination of statistical and linguistically-informed alignments can resolve translation divergences during the alignment process.

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Robert E. Frederking, Kathryn B. Taylor
Publisher: Springer Verlag
Pages: 17-26
Number of pages: 10
ISBN (Print): 3540233008, 9783540233008
State: Published - 2004

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3265
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
