Contextual Embeddings Can Distinguish Homonymy from Polysemy in a Human-Like Way

Kyra Wilson, Alec Marantz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Lexical ambiguity is a pervasive feature of natural language, and a major difficulty in understanding language is selecting the intended meaning when more than one is possible. Despite this difficulty, many studies of single word recognition have found a processing advantage for ambiguous words compared to unambiguous ones. This effect is not homogeneous, however: studies find consistent advantages for polysemes (words with multiple related meanings) and inconsistent results for homonyms (words with multiple unrelated meanings). Complicating this is the fact that most measures of ambiguity are derived from human-annotated or curated lexicographic resources, and their use is not consistent between studies. Our work investigates whether contextualized word embeddings are able to capture human-like distinctions between senses and meanings, and whether they can predict human behavior. We reanalyze data from previous experiments reporting ambiguity (dis)advantages using the lexical decision times reported in the English Lexicon Project. We find that our method does replicate the polyseme advantage and homonym disadvantage previously reported, and that the predictors are superior to binary distinctions derived from lexicographic resources. Our findings point towards the benefits of using continuous-space representations of senses and meanings over more traditional measures. Additionally, we make our code publicly available for use in future research.

Original language: English (US)
Title of host publication: ICNLSP 2022 - Proceedings of the 5th International Conference on Natural Language and Speech Processing
Editors: Mourad Abbas, Abed Alhakim Freihat
Publisher: Association for Computational Linguistics (ACL)
Pages: 144-155
Number of pages: 12
ISBN (Electronic): 9781959429364
State: Published - 2022
Event: 5th International Conference on Natural Language and Speech Processing, ICNLSP 2022 - Virtual, Online
Duration: Dec 16 2022 to Dec 17 2022

Publication series

Name: ICNLSP 2022 - Proceedings of the 5th International Conference on Natural Language and Speech Processing

Conference

Conference: 5th International Conference on Natural Language and Speech Processing, ICNLSP 2022
City: Virtual, Online
Period: 12/16/22 to 12/17/22

ASJC Scopus subject areas

  • Artificial Intelligence
  • Signal Processing
  • Linguistics and Language
  • Communication
