Amortized Noisy Channel Neural Machine Translation

Richard Yuanzhe Pang, He He, Kyunghyun Cho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Noisy channel models have been especially effective in neural machine translation (NMT). However, recent approaches like “beam search and rerank” (BSR) incur significant computational overhead during inference, making real-world application infeasible. We aim to study whether it is possible to build an amortized noisy channel NMT model such that, when we do greedy decoding during inference, the translation accuracy matches that of BSR in terms of reward (based on the source-to-target log probability and the target-to-source log probability) and quality (based on BLEU and BLEURT). We attempt three approaches to train the new model: knowledge distillation, 1-step-deviation imitation learning, and Q learning. The first approach obtains the noisy channel signal from a pseudo-corpus, and the latter two approaches aim to optimize toward a noisy-channel MT reward directly. For all three approaches, the generated translations fail to achieve rewards comparable to BSR, but the translation quality approximated by BLEU and BLEURT is similar to the quality of BSR-produced translations. Additionally, all three approaches speed up inference by 1–2 orders of magnitude.
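As a rough illustration of the reward the abstract describes, the following sketch combines a direct (source-to-target) log probability log p(y|x) with a channel (target-to-source) log probability log p(x|y) and uses the combined score to rerank candidate translations. The function name, the interpolation weights, and the toy candidate scores are illustrative assumptions, not values or code from the paper.

```python
# Hedged sketch of a noisy-channel MT reward: a weighted combination of the
# direct model score log p(y|x) and the channel model score log p(x|y).
# The weights below are illustrative defaults, not the paper's settings.

def noisy_channel_reward(log_p_y_given_x: float,
                         log_p_x_given_y: float,
                         lam_direct: float = 1.0,
                         lam_channel: float = 1.0) -> float:
    """Combine direct and channel log probabilities into a single reward."""
    return lam_direct * log_p_y_given_x + lam_channel * log_p_x_given_y

# Toy reranking example: the channel term can overturn the direct model's
# preference, which is the point of "beam search and rerank" (BSR).
candidates = [
    {"text": "hyp_a", "log_p_y_given_x": -2.0, "log_p_x_given_y": -5.0},
    {"text": "hyp_b", "log_p_y_given_x": -2.5, "log_p_x_given_y": -3.0},
]
best = max(candidates,
           key=lambda c: noisy_channel_reward(c["log_p_y_given_x"],
                                              c["log_p_x_given_y"]))
print(best["text"])  # prints "hyp_b": the channel term favors it
```

An amortized model, by contrast, would be trained so that plain greedy decoding already scores well under this reward, avoiding the candidate generation and reranking at inference time.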

Original language: English (US)
Title of host publication: 15th International Natural Language Generation Conference, INLG 2022
Editors: Samira Shaikh, Thiago Castro Ferreira, Amanda Stent
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 13
ISBN (Electronic): 9781955917575
State: Published - 2022
Event: 15th International Natural Language Generation Conference, INLG 2022 - Hybrid, Waterville, United States
Duration: Jul 18 2022 - Jul 22 2022

Publication series

Name: 15th International Natural Language Generation Conference, INLG 2022


Conference: 15th International Natural Language Generation Conference, INLG 2022
Country/Territory: United States
City: Hybrid, Waterville

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
  • Software

