TY - CONF
T1 - Syntactic data augmentation increases robustness to inference heuristics
AU - Min, Junghyun
AU - McCoy, R. Thomas
AU - Das, Dipanjan
AU - Pitler, Emily
AU - Linzen, Tal
N1 - Funding Information:
This research was supported by a gift from Google, NSF Graduate Research Fellowship No. 1746891, and NSF Grant No. BCS-1920924. Our experiments were conducted using the Maryland Advanced Research Computing Center (MARCC).
Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
AB - Pretrained neural models such as BERT, when fine-tuned to perform natural language inference (NLI), often show high accuracy on standard datasets, but display a surprising lack of sensitivity to word order on controlled challenge sets. We hypothesize that this issue is not primarily caused by the pretrained model's limitations, but rather by the paucity of crowd-sourced NLI examples that might convey the importance of syntactic structure at the fine-tuning stage. We explore several methods to augment standard training sets with syntactically informative examples, generated by applying syntactic transformations to sentences from the MNLI corpus. The best-performing augmentation method, subject/object inversion, improved BERT's accuracy on controlled examples that diagnose sensitivity to word order from 0.28 to 0.73, without affecting performance on the MNLI test set. This improvement generalized beyond the particular construction used for data augmentation, suggesting that augmentation causes BERT to recruit abstract syntactic representations.
UR - http://www.scopus.com/inward/record.url?scp=85107287716&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107287716&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85107287716
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 2339
EP - 2352
BT - ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
Y2 - 5 July 2020 through 10 July 2020
ER -