SuperGLUE: A stickier benchmark for general-purpose language understanding systems

Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

    Research output: Contribution to journal › Conference article › Peer-review

    Abstract

    In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced a little over one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently surpassed the level of non-expert humans, suggesting limited headroom for further research. In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. SuperGLUE is available at super.gluebenchmark.com.

    Original language: English (US)
    Journal: Advances in Neural Information Processing Systems
    Volume: 32
    State: Published - 2019
    Event: 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019 - Vancouver, Canada
    Duration: Dec 8, 2019 - Dec 14, 2019

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Information Systems
    • Signal Processing
