Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference

R. Thomas McCoy, Ellie Pavlick, Tal Linzen

    Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

    Abstract

    A machine learning system can score well on a given test set by relying on heuristics that are effective for frequent example types but break down in more challenging cases. We study this issue within natural language inference (NLI), the task of determining whether one sentence entails another. We hypothesize that statistical NLI models may adopt three fallible syntactic heuristics: the lexical overlap heuristic, the subsequence heuristic, and the constituent heuristic. To determine whether models have adopted these heuristics, we introduce a controlled evaluation set called HANS (Heuristic Analysis for NLI Systems), which contains many examples where the heuristics fail. We find that models trained on MNLI, including BERT, a state-of-the-art model, perform very poorly on HANS, suggesting that they have indeed adopted these heuristics. We conclude that there is substantial room for improvement in NLI systems, and that the HANS dataset can motivate and measure progress in this area.
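
    The abstract names the three heuristics without defining them. As a rough illustration, the sketch below implements two of them under their commonly stated readings: lexical overlap (every hypothesis word also occurs in the premise) and subsequence (the hypothesis is a contiguous word sequence of the premise). The function names, the crude tokenizer, and the two-way label set are assumptions made for this example and are not taken from the paper or its released code. The constituent heuristic is omitted because it needs a syntactic parser.

```python
import re


def tokens(sentence):
    """Lowercase word tokens; a deliberately crude tokenizer (assumption)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def lexical_overlap(premise, hypothesis):
    """True when every hypothesis word also appears somewhere in the premise."""
    premise_words = set(tokens(premise))
    hyp = tokens(hypothesis)
    return bool(hyp) and all(word in premise_words for word in hyp)


def subsequence(premise, hypothesis):
    """True when the hypothesis is a contiguous word sequence of the premise."""
    p, h = tokens(premise), tokens(hypothesis)
    n = len(h)
    return n > 0 and any(p[i:i + n] == h for i in range(len(p) - n + 1))


def heuristic_label(premise, hypothesis):
    """Hypothetical baseline: predict entailment whenever a heuristic fires."""
    fires = lexical_overlap(premise, hypothesis) or subsequence(premise, hypothesis)
    return "entailment" if fires else "non-entailment"


# Illustrative pairs where the heuristic fires but the premise does not
# actually entail the hypothesis.
pairs = [
    ("The doctor was paid by the actor.", "The doctor paid the actor."),
    ("The doctor near the actor danced.", "The actor danced."),
]
for premise, hypothesis in pairs:
    print(heuristic_label(premise, hypothesis), "|", premise, "=>", hypothesis)
```

    A predictor like this labels both pairs as entailment even though neither hypothesis follows from its premise. That is the kind of failure an evaluation set built around such cases is meant to expose.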

    Original language: English (US)
    Title of host publication: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 3428-3448
    Number of pages: 21
    ISBN (Electronic): 9781950737482
    State: Published - 2020
    Event: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy
    Duration: Jul 28 2019 to Aug 2 2019

    ASJC Scopus subject areas

    • Language and Linguistics
    • General Computer Science
    • Linguistics and Language
