TY - JOUR
T1 - Have We Learned to Explain?
T2 - 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021
AU - Jethani, Neil
AU - Sudarshan, Mukund
AU - Aphinyanaphongs, Yindalon
AU - Ranganath, Rajesh
N1 - Funding Information:
Neil Jethani was partially supported by NIH T32 GM136573. Mukund Sudarshan was partially supported by a PhRMA Foundation Predoctoral Fellowship. Yin Aphinyanaphongs was partially supported by NIH 3UL1TR001445-05 and National Science Foundation award #1928614. Mukund Sudarshan and Rajesh Ranganath were partly supported by NIH/NHLBI Award R01HL148248, and by NSF Award 1922658 NRT-HDR: FUTURE Foundations, Translation, and Responsibility for Data Science.
Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021
Y1 - 2021
N2 - While the need for interpretable machine learning has been established, many common approaches are slow, lack fidelity, or are hard to evaluate. Amortized explanation methods reduce the cost of providing interpretations by learning a global selector model that returns feature importances for a single instance of data. The selector model is trained to optimize the fidelity of the interpretations, as evaluated by a predictor model for the target. Popular methods learn the selector and predictor models in concert, which we show allows predictions to be encoded within interpretations. We introduce EVAL-X as a method to quantitatively evaluate interpretations and REAL-X as an amortized explanation method, both of which learn a predictor model that approximates the true data-generating distribution given any subset of the input. We show that EVAL-X can detect when predictions are encoded in interpretations and demonstrate the advantages of REAL-X through quantitative and radiologist evaluation.
AB - While the need for interpretable machine learning has been established, many common approaches are slow, lack fidelity, or are hard to evaluate. Amortized explanation methods reduce the cost of providing interpretations by learning a global selector model that returns feature importances for a single instance of data. The selector model is trained to optimize the fidelity of the interpretations, as evaluated by a predictor model for the target. Popular methods learn the selector and predictor models in concert, which we show allows predictions to be encoded within interpretations. We introduce EVAL-X as a method to quantitatively evaluate interpretations and REAL-X as an amortized explanation method, both of which learn a predictor model that approximates the true data-generating distribution given any subset of the input. We show that EVAL-X can detect when predictions are encoded in interpretations and demonstrate the advantages of REAL-X through quantitative and radiologist evaluation.
UR - http://www.scopus.com/inward/record.url?scp=85161809721&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161809721&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85161809721
SN - 2640-3498
VL - 130
SP - 1459
EP - 1467
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
Y2 - 13 April 2021 through 15 April 2021
ER -