TY - JOUR
T1 - SuperGLUE
T2 - 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
AU - Wang, Alex
AU - Pruksachatkun, Yada
AU - Nangia, Nikita
AU - Singh, Amanpreet
AU - Michael, Julian
AU - Hill, Felix
AU - Levy, Omer
AU - Bowman, Samuel R.
N1 - Funding Information:
We thank the original authors of the included datasets in SuperGLUE for their cooperation in the creation of the benchmark, as well as those who proposed tasks and datasets that we ultimately could not include. This work was made possible in part by a donation to NYU from Eric and Wendy Schmidt made by recommendation of the Schmidt Futures program. We gratefully acknowledge the support of the NVIDIA Corporation with the donation of a Titan V GPU used at NYU for this research, and funding from DeepMind for the hosting of the benchmark platform. AW is supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE 1342536. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. SB is partly supported by Samsung Advanced Institute of Technology (Next Generation Deep Learning: from Pattern Recognition to AI) and Samsung Electronics (Improving Deep Learning using Latent Structure).
Publisher Copyright:
© 2019 Neural information processing systems foundation. All rights reserved.
PY - 2019
Y1 - 2019
N2 - In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced a little over one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently surpassed the level of non-expert humans, suggesting limited headroom for further research. In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. SuperGLUE is available at super.gluebenchmark.com.
AB - In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced a little over one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently surpassed the level of non-expert humans, suggesting limited headroom for further research. In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. SuperGLUE is available at super.gluebenchmark.com.
UR - http://www.scopus.com/inward/record.url?scp=85086222979&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85086222979&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85086222979
SN - 1049-5258
VL - 32
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
Y2 - 8 December 2019 through 14 December 2019
ER -