TY - GEN
T1 - CURI
T2 - 38th International Conference on Machine Learning, ICML 2021
AU - Vedantam, Ramakrishna
AU - Szlam, Arthur
AU - Nickel, Maximilian
AU - Morcos, Ari
AU - Lake, Brenden
N1 - Funding Information:
We would like to thank Laurens van der Maaten, Rob Fergus, Larry Zitnick, Edward Grefenstette, Devi Parikh and numerous other colleagues at Facebook AI Research for their feedback and discussions in helping shape this project. Specifically, we thank Larry Zitnick and Edward Grefenstette for comments on this draft. Finally, we would like to thank the developers of Hydra and PyTorch for providing amazing frameworks for running large scale deep learning experiments.
Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021
Y1 - 2021
N2 - Humans can learn and reason under substantial uncertainty in a space of infinitely many compositional, productive concepts. For example, if a scene with two blue spheres qualifies as “daxy,” one can reason that the underlying concept may require scenes to have “only blue spheres” or “only spheres” or “only two objects.” In contrast, standard benchmarks for compositional reasoning do not explicitly capture a notion of reasoning under uncertainty or evaluate compositional concept acquisition. We introduce a new benchmark, Compositional Reasoning Under Uncertainty (CURI), that instantiates a series of few-shot, meta-learning tasks in a productive concept space to evaluate different aspects of systematic generalization under uncertainty, including splits that test abstract understanding of disentangling, productive generalization, learning Boolean operations, variable binding, etc. Importantly, we also contribute a model-independent “compositionality gap” to evaluate the difficulty of generalizing out-of-distribution along each of these axes, allowing objective comparison of the difficulty of each compositional split. Evaluations across a range of modeling choices and splits reveal substantial room for improvement on the proposed benchmark.
AB - Humans can learn and reason under substantial uncertainty in a space of infinitely many compositional, productive concepts. For example, if a scene with two blue spheres qualifies as “daxy,” one can reason that the underlying concept may require scenes to have “only blue spheres” or “only spheres” or “only two objects.” In contrast, standard benchmarks for compositional reasoning do not explicitly capture a notion of reasoning under uncertainty or evaluate compositional concept acquisition. We introduce a new benchmark, Compositional Reasoning Under Uncertainty (CURI), that instantiates a series of few-shot, meta-learning tasks in a productive concept space to evaluate different aspects of systematic generalization under uncertainty, including splits that test abstract understanding of disentangling, productive generalization, learning Boolean operations, variable binding, etc. Importantly, we also contribute a model-independent “compositionality gap” to evaluate the difficulty of generalizing out-of-distribution along each of these axes, allowing objective comparison of the difficulty of each compositional split. Evaluations across a range of modeling choices and splits reveal substantial room for improvement on the proposed benchmark.
UR - http://www.scopus.com/inward/record.url?scp=85161295216&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161295216&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85161295216
T3 - Proceedings of Machine Learning Research
SP - 10519
EP - 10529
BT - Proceedings of the 38th International Conference on Machine Learning, ICML 2021
PB - ML Research Press
Y2 - 18 July 2021 through 24 July 2021
ER -