TY - CONF
T1 - Subspace inference for Bayesian deep learning
AU - Izmailov, Pavel
AU - Maddox, Wesley J.
AU - Kirichenko, Polina
AU - Garipov, Timur
AU - Vetrov, Dmitry
AU - Wilson, Andrew Gordon
N1 - Funding Information:
WJM, PI, PK and AGW were supported by an Amazon Research Award, Facebook Research, and NSF IIS-1563887. WJM was additionally supported by an NSF Graduate Research Fellowship under Grant No. DGE-1650441.
Publisher Copyright:
© 2019 Association for Uncertainty in Artificial Intelligence (AUAI). All rights reserved.
PY - 2019
Y1 - 2019
AB - Bayesian inference was once a gold standard for learning with neural networks, providing accurate full predictive distributions and well-calibrated uncertainty. However, scaling Bayesian inference techniques to deep neural networks is challenging due to the high dimensionality of the parameter space. In this paper, we construct low-dimensional subspaces of parameter space, such as the first principal components of the stochastic gradient descent (SGD) trajectory, which contain diverse sets of high-performing models. In these subspaces, we are able to apply elliptical slice sampling and variational inference, which struggle in the full parameter space. We show that Bayesian model averaging over the induced posterior in these subspaces produces accurate predictions and well-calibrated predictive uncertainty for both regression and image classification.
UR - http://www.scopus.com/inward/record.url?scp=85073216200&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073216200&partnerID=8YFLogxK
M3 - Paper
T2 - 35th Conference on Uncertainty in Artificial Intelligence, UAI 2019
Y2 - 22 July 2019 through 25 July 2019
ER -
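
For orientation, below is a minimal NumPy sketch of the procedure the abstract describes: build a low-dimensional subspace from the first principal components of the SGD weight trajectory, then run elliptical slice sampling within it. This is an illustrative sketch under stated assumptions, not the authors' implementation; the toy trajectory, the quadratic log_lik, and all names here are hypothetical.

import numpy as np

def subspace_from_sgd_trajectory(weight_iterates, k):
    # Stack T flattened weight vectors into a (T, D) matrix and center them;
    # the top-k right singular vectors span the principal subspace of the trajectory.
    W = np.asarray(weight_iterates)
    w_bar = W.mean(axis=0)
    _, _, Vt = np.linalg.svd(W - w_bar, full_matrices=False)
    return w_bar, Vt[:k]                           # w_bar: (D,), P: (k, D)

def elliptical_slice_step(z, log_lik, sigma, rng):
    # One elliptical slice sampling update (Murray et al., 2010) for z
    # under an isotropic Gaussian prior N(0, sigma^2 I).
    nu = sigma * rng.standard_normal(z.shape)      # auxiliary draw from the prior
    log_y = log_lik(z) + np.log(rng.uniform())     # slice height
    theta = rng.uniform(0.0, 2.0 * np.pi)
    theta_min, theta_max = theta - 2.0 * np.pi, theta
    while True:
        z_new = z * np.cos(theta) + nu * np.sin(theta)
        if log_lik(z_new) > log_y:                 # on the slice: accept
            return z_new
        if theta < 0.0:                            # off the slice: shrink bracket
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for an SGD trajectory: 50 noisy 100-dim iterates near an optimum.
    iterates = 1.0 + 0.1 * rng.standard_normal((50, 100))
    w_bar, P = subspace_from_sgd_trajectory(iterates, k=5)
    # Hypothetical log-likelihood of the full weights; z maps back via w = w_bar + z @ P.
    log_lik = lambda z: -0.5 * np.sum((w_bar + z @ P - 1.0) ** 2)
    z, samples = np.zeros(5), []
    for _ in range(200):
        z = elliptical_slice_step(z, log_lik, sigma=1.0, rng=rng)
        samples.append(w_bar + z @ P)
    # Bayesian model averaging: average network predictions over the sampled weights.
    print(np.mean(samples, axis=0)[:3])

A variational-inference variant, also discussed in the abstract, would instead fit an approximate Gaussian posterior over z in the same subspace; either way, averaging predictions over weights drawn from the induced posterior gives the Bayesian model average.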