TY - CONF
T1 - Can neural networks acquire a structural bias from raw linguistic data?
AU - Warstadt, Alex
AU - Bowman, Samuel R.
N1 - Funding Information:
We thank Chris Barker, Chris Collins, Stephanie Harves, Brenden Lake, Tal Linzen, Alec Marantz, Tom McCoy, and the audience at NYU's Syntax Brown Bag for helpful feedback. This material is based on work supported by the National Science Foundation under Grant No. 1850208. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. This project has also benefited from support to SB by Eric and Wendy Schmidt (made by recommendation of the Schmidt Futures program), by Samsung Research (under the project Improving Deep Learning using Latent Structure), by Intuit, Inc., and by NVIDIA Corporation (with the donation of a Titan V GPU).
Publisher Copyright:
© 2020 The Author(s)
PY - 2020
Y1 - 2020
N2 - We evaluate whether BERT, a widely used neural network for sentence processing, acquires an inductive bias towards forming structural generalizations through pretraining on raw data. We conduct four experiments testing its preference for structural vs. linear generalizations in different structure-dependent phenomena. We find that BERT makes a structural generalization in 3 out of 4 empirical domains (subject-auxiliary inversion, reflexive binding, and verb tense detection in embedded clauses) but makes a linear generalization when tested on NPI licensing. We argue that these results are the strongest evidence so far from artificial learners supporting the proposition that a structural bias can be acquired from raw data. If this conclusion is correct, it is tentative evidence that some linguistic universals can be acquired by learners without innate biases. However, the precise implications for human language acquisition are unclear, as humans learn language from significantly less data than BERT.
AB - We evaluate whether BERT, a widely used neural network for sentence processing, acquires an inductive bias towards forming structural generalizations through pretraining on raw data. We conduct four experiments testing its preference for structural vs. linear generalizations in different structure-dependent phenomena. We find that BERT makes a structural generalization in 3 out of 4 empirical domains (subject-auxiliary inversion, reflexive binding, and verb tense detection in embedded clauses) but makes a linear generalization when tested on NPI licensing. We argue that these results are the strongest evidence so far from artificial learners supporting the proposition that a structural bias can be acquired from raw data. If this conclusion is correct, it is tentative evidence that some linguistic universals can be acquired by learners without innate biases. However, the precise implications for human language acquisition are unclear, as humans learn language from significantly less data than BERT.
KW - BERT
KW - inductive bias
KW - learnability of grammar
KW - neural network
KW - poverty of the stimulus
KW - self-supervised learning
KW - structure dependence
UR - http://www.scopus.com/inward/record.url?scp=85139518962&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85139518962&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85139518962
SP - 1737
EP - 1743
T2 - 42nd Annual Meeting of the Cognitive Science Society: Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020
Y2 - 29 July 2020 through 1 August 2020
ER -