Causal analysis of syntactic agreement mechanisms in neural language models

Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart Shieber, Tal Linzen, Yonatan Belinkov

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    Abstract

    Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts. To elucidate the mechanisms by which the models accomplish this behavior, this study applies causal mediation analysis to pre-trained neural language models. We investigate the magnitude of models' preferences for grammatical inflections, as well as whether neurons process subject-verb agreement similarly across sentences with different syntactic structures. We uncover similarities and differences across architectures and model sizes; notably, larger models do not necessarily learn stronger preferences. We also observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure of the input sentence. Finally, we find that language models rely on similar sets of neurons when given sentences with similar syntactic structure.

    Original language: English (US)
    Title of host publication: ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 1828-1843
    Number of pages: 16
    ISBN (Electronic): 9781954085527
    State: Published - 2021
    Event: Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 - Virtual, Online
    Duration: Aug 1 2021 to Aug 6 2021

    Publication series

    Name: ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference

    Conference

    Conference: Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
    City: Virtual, Online
    Period: 8/1/21 to 8/6/21

    ASJC Scopus subject areas

    • Software
    • Computational Theory and Mathematics
    • Linguistics and Language
    • Language and Linguistics
