Targeted syntactic evaluation of language models

Rebecca Marvin, Tal Linzen

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    We present a dataset for evaluating the grammaticality of the predictions of a language model. We automatically construct a large number of minimally different pairs of English sentences, each consisting of a grammatical and an ungrammatical sentence. The sentence pairs represent different variations of structure-sensitive phenomena: subject-verb agreement, reflexive anaphora and negative polarity items. We expect a language model to assign a higher probability to the grammatical sentence than the ungrammatical one. In an experiment using this dataset, an LSTM language model performed poorly on many of the constructions. Multi-task training with a syntactic objective (CCG supertagging) improved the LSTM's accuracy, but a large gap remained between its performance and the accuracy of human participants recruited online. This suggests that there is considerable room for improvement over LSTMs in capturing syntax in a language model.
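
The evaluation criterion in the abstract (the grammatical sentence of each pair should receive the higher probability) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: `pair_accuracy` and `make_bigram_scorer` are hypothetical names, the add-one-smoothed bigram model is only a stand-in for the LSTM language model, and the two sentence pairs are made-up subject-verb agreement examples rather than items from the dataset.

```python
import math
from collections import Counter
from typing import Callable, Iterable, Tuple


def pair_accuracy(
    pairs: Iterable[Tuple[str, str]],
    log_prob: Callable[[str], float],
) -> float:
    """Fraction of (grammatical, ungrammatical) pairs in which the
    grammatical sentence gets a strictly higher log-probability."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one sentence pair")
    correct = sum(1 for good, bad in pairs if log_prob(good) > log_prob(bad))
    return correct / len(pairs)


def make_bigram_scorer(corpus: Iterable[str]) -> Callable[[str], float]:
    """Toy add-one-smoothed bigram model (assumption: a stand-in for
    whatever language model is actually being evaluated)."""
    context_counts: Counter = Counter()
    bigram_counts: Counter = Counter()
    vocab = set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split()
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    vocab_size = len(vocab) + 1  # one extra slot for unseen words

    def log_prob(sentence: str) -> float:
        tokens = ["<s>"] + sentence.split()
        return sum(
            math.log((bigram_counts[(p, c)] + 1) / (context_counts[p] + vocab_size))
            for p, c in zip(tokens, tokens[1:])
        )

    return log_prob


# Illustrative data only (not taken from the dataset).
training_corpus = ["the author laughs", "the authors laugh"]
minimal_pairs = [
    ("the author laughs", "the author laugh"),
    ("the authors laugh", "the authors laughs"),
]

scorer = make_bigram_scorer(training_corpus)
print(pair_accuracy(minimal_pairs, scorer))
```

Comparing whole-sentence probabilities keeps the measure independent of how the model is built: any model that can score a sentence can be passed in as `log_prob`.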

    Original language: English (US)
    Title of host publication: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
    Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun'ichi Tsujii
    Publisher: Association for Computational Linguistics
    Pages: 1192-1202
    Number of pages: 11
    ISBN (Electronic): 9781948087841
    State: Published - 2020
    Event: 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 - Brussels, Belgium
    Duration: Oct 31 2018 → Nov 4 2018

    Publication series

    Name: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018

    Conference

    Conference: 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
    Country: Belgium
    City: Brussels
    Period: 10/31/18 → 11/4/18

    ASJC Scopus subject areas

    • Computational Theory and Mathematics
    • Computer Science Applications
    • Information Systems
