TY - JOUR
T1 - Automatic variational inference in Stan
AU - Kucukelbir, Alp
AU - Ranganath, Rajesh
AU - Gelman, Andrew
AU - Blei, David M.
N1 - Funding Information:
We thank Dustin Tran, Bruno Jacobs, and the reviewers for their comments. This work is supported by NSF IIS-0745520, IIS-1247664, IIS-1009542, SES-1424962, ONR N00014-11-1-0651, DARPA FA8750-14-2-0009, N66001-15-C-4032, Sloan G-2015-13987, IES DE R305D140059, NDSEG, Facebook, Adobe, Amazon, and the Siebel Scholar and John Templeton Foundations.
PY - 2015
Y1 - 2015
AB - Variational inference is a scalable technique for approximate Bayesian inference. Deriving variational inference algorithms requires tedious model-specific calculations, which makes variational inference difficult for non-experts to use. We propose an automatic variational inference algorithm, automatic differentiation variational inference (ADVI), and implement it in Stan (code available), a probabilistic programming system. In ADVI, the user provides a Bayesian model and a dataset; nothing else. We make no conjugacy assumptions and support a broad class of models. The algorithm automatically determines an appropriate variational family and optimizes the variational objective. We compare ADVI to MCMC sampling across hierarchical generalized linear models, nonconjugate matrix factorization, and a mixture model. We train the mixture model on a quarter million images. With ADVI we can use variational inference on any model we write in Stan.
UR - http://www.scopus.com/inward/record.url?scp=84965158671&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84965158671&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84965158671
SN - 1049-5258
VL - 2015-January
SP - 568
EP - 576
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 29th Annual Conference on Neural Information Processing Systems, NIPS 2015
Y2 - 7 December 2015 through 12 December 2015
ER -
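
For context on the workflow the abstract describes (the user supplies only a Bayesian model and a dataset), below is a minimal sketch of invoking Stan's ADVI from Python through CmdStanPy. The toy model, file names, data values, and tuning arguments are illustrative assumptions, not taken from the paper; CmdStanPy's variational() method is one current interface to the algorithm described here.

# Minimal sketch of running ADVI via CmdStanPy (assumed setup: CmdStan and
# cmdstanpy installed). The toy model and data below are illustrative only.
import json
from cmdstanpy import CmdStanModel

# A small Stan model: ADVI applies to any model expressible in Stan.
stan_code = """
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 10);
  sigma ~ cauchy(0, 2.5);  // half-Cauchy via the lower bound on sigma
  y ~ normal(mu, sigma);
}
"""
with open("toy.stan", "w") as f:
    f.write(stan_code)
with open("toy.data.json", "w") as f:
    json.dump({"N": 5, "y": [1.2, 0.8, 1.5, 0.9, 1.1]}, f)

model = CmdStanModel(stan_file="toy.stan")

# The user provides a model and a dataset, nothing else; "meanfield" and
# "fullrank" select between the two Gaussian variational families.
fit = model.variational(data="toy.data.json", algorithm="meanfield", seed=1)
print(fit.variational_params_dict)  # means of the fitted approximation

As the paper explains, ADVI first transforms constrained parameters (here, sigma) to the real line and then fits a Gaussian approximation there, which is why no model-specific derivation is required from the user.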