A Language Model with Limited Memory Capacity Captures Interference in Human Sentence Processing

William Timkey, Tal Linzen

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Two of the central factors believed to underpin human sentence processing difficulty are expectations and retrieval from working memory. A recent attempt to create a unified cognitive model integrating these two factors relied on the parallels between the self-attention mechanism of transformer language models and cue-based retrieval theories of working memory in human sentence processing (Ryu and Lewis, 2021). While Ryu and Lewis show that attention patterns in specialized attention heads of GPT-2 are consistent with similarity-based interference, a key prediction of cue-based retrieval models, their method requires identifying syntactically specialized attention heads, and makes the cognitively implausible assumption that hundreds of memory retrieval operations take place in parallel. In the present work, we develop a recurrent neural language model with a single self-attention head, which more closely parallels the memory system assumed by cognitive theories. We show that our model's single attention head captures semantic and syntactic interference effects observed in human experiments.
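    The following is a minimal illustrative sketch, not the authors' released implementation, of the kind of architecture the abstract describes: a recurrent (LSTM) language model augmented with a single self-attention head over past hidden states, where the current hidden state serves as the retrieval cue and earlier hidden states serve as memory items, in the spirit of cue-based retrieval. All module and parameter names here are assumptions for illustration.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SingleHeadAttentionLM(nn.Module):
        """Hypothetical sketch: an LSTM language model with one
        self-attention head over past hidden states. The single head
        plays the role of a cue-based retrieval operation."""

        def __init__(self, vocab_size, d_embed=256, d_hidden=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_embed)
            self.rnn = nn.LSTM(d_embed, d_hidden, batch_first=True)
            # Query/key/value projections for the single retrieval head.
            self.q_proj = nn.Linear(d_hidden, d_hidden)
            self.k_proj = nn.Linear(d_hidden, d_hidden)
            self.v_proj = nn.Linear(d_hidden, d_hidden)
            self.out = nn.Linear(2 * d_hidden, vocab_size)

        def forward(self, tokens):
            h, _ = self.rnn(self.embed(tokens))            # (B, T, H)
            q = self.q_proj(h)                             # retrieval cues
            k, v = self.k_proj(h), self.v_proj(h)          # memory items
            scores = q @ k.transpose(1, 2) / k.size(-1) ** 0.5
            # Causal mask: each position may only retrieve from itself
            # and its past, like a serial memory store.
            T = tokens.size(1)
            mask = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                         device=tokens.device), diagonal=1)
            scores = scores.masked_fill(mask, float("-inf"))
            attn = F.softmax(scores, dim=-1)               # retrieval weights
            retrieved = attn @ v                           # weighted recall
            return self.out(torch.cat([h, retrieved], dim=-1))
    ```

    Under this framing, similarity-based interference would surface directly in the single head's attention weights: when a distractor noun shares retrieval-cue features with the target, it draws probability mass away from the target at the retrieval site.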

    Original language: English (US)
    Title of host publication: Findings of the Association for Computational Linguistics
    Subtitle of host publication: EMNLP 2023
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 8705-8720
    Number of pages: 16
    ISBN (Electronic): 9798891760615
    State: Published - 2023
    Event: 2023 Findings of the Association for Computational Linguistics: EMNLP 2023 - Singapore, Singapore
    Duration: Dec 6 2023 - Dec 10 2023

    Publication series

    Name: Findings of the Association for Computational Linguistics: EMNLP 2023

    Conference

    Conference: 2023 Findings of the Association for Computational Linguistics: EMNLP 2023
    Country/Territory: Singapore
    City: Singapore
    Period: 12/6/23 - 12/10/23

    ASJC Scopus subject areas

    • Computational Theory and Mathematics
    • Computer Science Applications
    • Information Systems
    • Language and Linguistics
    • Linguistics and Language
