Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment

William Merrill, Zhaofeng Wu, Norihito Naka, Yoon Kim, Tal Linzen

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Do LMs infer the semantics of text from co-occurrence patterns in their training data? Merrill et al. (2022) argue that, in theory, sentence co-occurrence probabilities predicted by an optimal LM should reflect the entailment relationship of the constituent sentences, but it is unclear whether probabilities predicted by neural LMs encode entailment in this way because of strong assumptions made by Merrill et al. (namely, that humans always avoid redundancy). In this work, we investigate whether their theory can be used to decode entailment relations from neural LMs. We find that a test similar to theirs can decode entailment relations between natural sentences, well above random chance, though not perfectly, across many datasets and LMs. This suggests LMs implicitly model aspects of semantics to predict semantic effects on sentence co-occurrence patterns. However, we find the test that predicts entailment in practice works in the opposite direction to the theoretical test. We thus revisit the assumptions underlying the original test, finding its derivation did not adequately account for redundancy in human-written text. We argue that better accounting for redundancy related to explanations might derive the observed flipped test and, more generally, improve computational models of speakers in linguistics.

    Original language: English (US)
    Title of host publication: The 62nd Annual Meeting of the Association for Computational Linguistics
    Subtitle of host publication: Findings of the Association for Computational Linguistics, ACL 2024
    Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 2752-2773
    Number of pages: 22
    ISBN (Electronic): 9798891760998
    State: Published - 2024
    Event: Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 - Hybrid, Bangkok, Thailand
    Duration: Aug 11 2024 to Aug 16 2024

    Publication series

    Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
    ISSN (Print): 0736-587X

    Conference

    Conference: Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
    Country/Territory: Thailand
    City: Hybrid, Bangkok
    Period: 8/11/24 to 8/16/24

    ASJC Scopus subject areas

    • Computer Science Applications
    • Linguistics and Language
    • Language and Linguistics
