When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it

Sebastian Schuster, Tal Linzen

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Understanding longer narratives or participating in conversations requires tracking of discourse entities that have been mentioned. Indefinite noun phrases (NPs), such as "a dog", frequently introduce discourse entities, but this behavior is modulated by sentential operators such as negation. For example, "a dog" in "Arthur doesn't own a dog" does not introduce a discourse entity due to the presence of negation. In this work, we adapt the psycholinguistic assessment of language models paradigm to higher-level linguistic phenomena and introduce an English evaluation suite that targets the knowledge of the interactions between sentential operators and indefinite NPs. We use this evaluation suite for a fine-grained investigation of the entity tracking abilities of the Transformer-based models GPT-2 and GPT-3. We find that while the models are to a certain extent sensitive to the interactions we investigate, they are all challenged by the presence of multiple NPs and their behavior is not systematic, which suggests that even models at the scale of GPT-3 do not fully acquire basic entity tracking abilities.

    Original language: English (US)
    Title of host publication: NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics
    Subtitle of host publication: Human Language Technologies, Proceedings of the Conference
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 969-982
    Number of pages: 14
    ISBN (Electronic): 9781955917711
    State: Published - 2022
    Event: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022 - Seattle, United States
    Duration: Jul 10 2022 to Jul 15 2022

    Publication series

    Name: NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

    Conference

    Conference: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022
    Country/Territory: United States
    City: Seattle
    Period: 7/10/22 to 7/15/22

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Hardware and Architecture
    • Information Systems
    • Software
