TY - GEN
T1 - Are All Spurious Features in Natural Language Alike? An Analysis through a Causal Lens
AU - Joshi, Nitish
AU - Pan, Xiang
AU - He, He
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
AB - The term 'spurious correlations' has been used in NLP to informally denote any undesirable feature-label correlations. However, a correlation can be undesirable because (i) the feature is irrelevant to the label (e.g. punctuation in a review), or (ii) the feature's effect on the label depends on the context (e.g. negation words in a review), which is ubiquitous in language tasks. In case (i), we want the model to be invariant to the feature, which is neither necessary nor sufficient for prediction. But in case (ii), even an ideal model (e.g. humans) must rely on the feature, since it is necessary (but not sufficient) for prediction. Therefore, a more fine-grained treatment of spurious features is needed to specify the desired model behavior. We formalize this distinction using a causal model and probabilities of necessity and sufficiency, which delineate the causal relations between a feature and a label. We then show that this distinction helps explain results of existing debiasing methods on different spurious features, and demystifies surprising results such as the encoding of spurious features in model representations after debiasing.
UR - http://www.scopus.com/inward/record.url?scp=85142485808&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142485808&partnerID=8YFLogxK
U2 - 10.18653/v1/2022.emnlp-main.666
DO - 10.18653/v1/2022.emnlp-main.666
M3 - Conference contribution
AN - SCOPUS:85142485808
T3 - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
SP - 9804
EP - 9817
BT - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
A2 - Goldberg, Yoav
A2 - Kozareva, Zornitsa
A2 - Zhang, Yue
PB - Association for Computational Linguistics (ACL)
T2 - 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Y2 - 7 December 2022 through 11 December 2022
ER -