Probabilistic fasttext for multi-sense word embeddings

Ben Athiwaratkun, Andrew Gordon Wilson, Anima Anandkumar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information. In particular, we represent each word with a Gaussian mixture density, where the mean of a mixture component is given by the sum of n-grams. This representation allows the model to share statistical strength across sub-word structures (e.g. Latin roots), producing accurate representations of rare, misspelt, or even unseen words. Moreover, each component of the mixture can capture a different word sense. Probabilistic FastText outperforms both FastText, which has no probabilistic model, and dictionary-level probabilistic embeddings, which do not incorporate sub-word structures, on several word-similarity benchmarks, including English RareWord and foreign-language datasets. We also achieve state-of-the-art performance on benchmarks that measure the ability to discern different meanings. Thus, the proposed model is the first to achieve multi-sense representations while having enriched semantics on rare words.
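
The abstract's central construction, a mixture-component mean built as a sum of n-gram vectors, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the names `char_ngrams`, `component_mean`, and `ngram_table`, the dimension and bucket count, the n-gram length range, the CRC32 hashing, and the random initialisation standing in for trained parameters. The sketch covers only the mean of a single component, not the mixture weights, covariances, or training objective.

```python
import random
import zlib

DIM = 8             # embedding dimension (illustrative assumption)
NUM_BUCKETS = 1000  # size of the hashed n-gram table (illustrative assumption)

def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(wrapped) - n + 1)]

# Hypothetical n-gram vectors: random values stand in for learned parameters.
rng = random.Random(0)
ngram_table = [[rng.gauss(0.0, 0.1) for _ in range(DIM)]
               for _ in range(NUM_BUCKETS)]

def component_mean(word):
    """Mean of one mixture component, taken as the sum of the word's n-gram vectors."""
    mean = [0.0] * DIM
    for gram in char_ngrams(word):
        # Hash each n-gram to a row of the table (the hashing scheme is an assumption).
        row = ngram_table[zlib.crc32(gram.encode("utf-8")) % NUM_BUCKETS]
        mean = [m + r for m, r in zip(mean, row)]
    return mean

# A misspelt or unseen word still receives a mean, because it shares
# n-grams with words whose n-gram vectors already exist.
print(char_ngrams("cat", 3, 3))
print(len(component_mean("abnormaly")))
```

Because the mean depends only on character n-grams, two words that overlap in spelling share rows of the table, which is one way to read the abstract's point about sharing statistical strength across sub-word structures.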

Original language: English (US)
Title of host publication: ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-11
Number of pages: 11
ISBN (Electronic): 9781948087322
DOIs: https://doi.org/10.18653/v1/p18-1001
State: Published - 2018
Event: 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018 - Melbourne, Australia
Duration: Jul 15 2018 to Jul 20 2018

Publication series

Name: ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Volume: 1

Conference

Conference: 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018
Country: Australia
City: Melbourne
Period: 7/15/18 to 7/20/18

ASJC Scopus subject areas

  • Software
  • Computational Theory and Mathematics


Cite this

Athiwaratkun, B., Wilson, A. G., & Anandkumar, A. (2018). Probabilistic fasttext for multi-sense word embeddings. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) (pp. 1-11). (ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers); Vol. 1). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p18-1001