TY - JOUR
T1 - Don't be fooled: label leakage in explanation methods and the importance of their quantitative evaluation
T2 - 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023
AU - Jethani, Neil
AU - Saporta, Adriel
AU - Ranganath, Rajesh
N1 - Funding Information:
We thank the reviewers for their thoughtful comments. This work was supported by NIH T32GM007308, NIH T32GM136573, a DeepMind Scholarship, NIH/NHLBI Award R01HL148248, NSF Award 1922658 NRT-HDR: FUTURE Foundations, Translation, and Responsibility for Data Science, and NSF CAREER Award 2145542.
Publisher Copyright:
Copyright © 2023 by the author(s)
PY - 2023
Y1 - 2023
AB - Feature attribution methods identify which features of an input most influence a model's output. Most widely-used feature attribution methods (such as SHAP, LIME, and Grad-CAM) are “class-dependent” methods in that they generate a feature attribution vector as a function of class. In this work, we demonstrate that class-dependent methods can “leak” information about the selected class, making that class appear more likely than it is. Thus, an end user runs the risk of drawing false conclusions when interpreting an explanation generated by a class-dependent method. In contrast, we introduce “distribution-aware” methods, which favor explanations that keep the label's distribution close to its distribution given all features of the input. We introduce SHAP-KL and FastSHAP-KL, two baseline distribution-aware methods that compute Shapley values. Finally, we perform a comprehensive evaluation of seven class-dependent and three distribution-aware methods on three clinical datasets of different high-dimensional data types: images, biosignals, and text.
UR - http://www.scopus.com/inward/record.url?scp=85165182308&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85165182308&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85165182308
SN - 2640-3498
VL - 206
SP - 8925
EP - 8953
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
Y2 - 25 April 2023 through 27 April 2023
ER -