The representational geometry of word meanings acquired by neural machine translation models

Felix Hill, Kyunghyun Cho, Sébastien Jean, Yoshua Bengio

Research output: Contribution to journal › Article › peer-review


This work is the first comprehensive analysis of the properties of word embeddings learned by neural machine translation (NMT) models trained on bilingual texts. We show that the word representations of NMT models outperform those learned from monolingual text by established algorithms such as Skipgram and CBOW on tasks that require knowledge of semantic similarity and/or lexical–syntactic role. These effects hold when translating from English to French and from English to German, and we argue that the desirable properties of NMT word embeddings should emerge largely independently of the source and target languages. Further, we apply a recently proposed heuristic method for training NMT models with very large vocabularies, and show that this vocabulary expansion method results in minimal degradation of embedding quality. This allows us to make a large vocabulary of NMT embeddings available for future research and applications. Overall, our analyses indicate that NMT embeddings should be used in applications that require word concepts to be organised according to similarity and/or lexical function, while monolingual embeddings are better suited to modelling (nonspecific) inter-word relatedness.

Original language: English (US)
Pages (from-to): 3-18
Number of pages: 16
Journal: Machine Translation
Issue number: 1-2
State: Published - Jun 1 2017


Keywords

  • Machine translation
  • Representation
  • Word embeddings

ASJC Scopus subject areas

  • Software
  • Language and Linguistics
  • Linguistics and Language
  • Artificial Intelligence

