TY - GEN
T1 - When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it
AU - Schuster, Sebastian
AU - Linzen, Tal
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - Understanding longer narratives or participating in conversations requires tracking of discourse entities that have been mentioned. Indefinite noun phrases (NPs), such as a dog, frequently introduce discourse entities but this behavior is modulated by sentential operators such as negation. For example, a dog in Arthur doesn't own a dog does not introduce a discourse entity due to the presence of negation. In this work, we adapt the psycholinguistic assessment of language models paradigm to higher-level linguistic phenomena and introduce an English evaluation suite that targets the knowledge of the interactions between sentential operators and indefinite NPs. We use this evaluation suite for a fine-grained investigation of the entity tracking abilities of the Transformer-based models GPT-2 and GPT-3. We find that while the models are to a certain extent sensitive to the interactions we investigate, they are all challenged by the presence of multiple NPs and their behavior is not systematic, which suggests that even models at the scale of GPT-3 do not fully acquire basic entity tracking abilities.
UR - http://www.scopus.com/inward/record.url?scp=85137212025&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137212025&partnerID=8YFLogxK
DO - 10.18653/v1/2022.naacl-main.71
M3 - Conference contribution
AN - SCOPUS:85137212025
T3 - NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
SP - 969
EP - 982
BT - NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022
Y2 - 10 July 2022 through 15 July 2022
ER -