TY - GEN
T1 - Baby Intuitions Benchmark (BIB)
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
AU - Gandhi, Kanishk
AU - Stojnic, Gala
AU - Lake, Brenden M.
AU - Dillon, Moira R.
N1 - Funding Information:
This work was supported by the DARPA Machine Common Sense program (HR001119S0005). We thank Victoria Romero, Koleen McKrink, David Moore, Lisa Oakes, and Clark Dorman for their generous feedback. We are also grateful to Thomas Schellenberg, Dean Wetherby, and Brian Pippin for their development effort in porting the benchmark to 3D. Finally, we thank Brian Reilly for coming up with the name of the benchmark and finding the perfect acronym for our work.
Publisher Copyright:
© 2021 Neural information processing systems foundation. All rights reserved.
PY - 2021
Y1 - 2021
AB - To achieve human-like common sense about everyday life, machine learning systems must understand and reason about the goals, preferences, and actions of other agents in the environment. By the end of their first year of life, human infants intuitively achieve such common sense, and these cognitive achievements lay the foundation for humans’ rich and complex understanding of the mental states of others. Can machines achieve generalizable, commonsense reasoning about other agents like human infants? The Baby Intuitions Benchmark (BIB) challenges machines to predict the plausibility of an agent’s behavior based on the underlying causes of its actions. Because BIB’s content and paradigm are adopted from developmental cognitive science, BIB allows for direct comparison between human and machine performance. Nevertheless, recently proposed, deep-learning-based agency reasoning models fail to show infant-like reasoning, leaving BIB an open challenge.
UR - http://www.scopus.com/inward/record.url?scp=85131801363&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85131801363&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85131801363
T3 - Advances in Neural Information Processing Systems
SP - 9963
EP - 9976
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural information processing systems foundation
Y2 - 6 December 2021 through 14 December 2021
ER -