TY - GEN
T1 - Probing what different NLP tasks teach machines about function word comprehension
AU - Kim, Najoung
AU - Patel, Roma
AU - Poliak, Adam
AU - Wang, Alex
AU - Xia, Patrick
AU - McCoy, R. Thomas
AU - Tenney, Ian
AU - Ross, Alexis
AU - Linzen, Tal
AU - Van Durme, Benjamin
AU - Bowman, Samuel R.
AU - Pavlick, Ellie
N1 - Publisher Copyright:
© 2019 Association for Computational Linguistics
PY - 2019
Y1 - 2019
AB - We introduce a set of nine challenge tasks that test for the understanding of function words. These tasks are created by structurally mutating sentences from existing datasets to target the comprehension of specific types of function words (e.g., prepositions, wh-words). Using these probing tasks, we explore the effects of various pretraining objectives for sentence encoders (e.g., language modeling, CCG supertagging and natural language inference (NLI)) on the learned representations. Our results show that pretraining on language modeling performs the best on average across our probing tasks, supporting its widespread use for pretraining state-of-the-art NLP models, and CCG supertagging and NLI pretraining perform comparably. Overall, no pretraining objective dominates across the board, and our function word probing tasks highlight several intuitive differences between pretraining objectives, e.g., that NLI helps the comprehension of negation.
UR - http://www.scopus.com/inward/record.url?scp=85086138447&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85086138447&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85086138447
T3 - *SEM@NAACL-HLT 2019 - 8th Joint Conference on Lexical and Computational Semantics
SP - 235
EP - 249
BT - *SEM@NAACL-HLT 2019 - 8th Joint Conference on Lexical and Computational Semantics
PB - Association for Computational Linguistics (ACL)
T2 - 8th Joint Conference on Lexical and Computational Semantics, *SEM@NAACL-HLT 2019
Y2 - 6 June 2019 through 7 June 2019
ER -