TY - JOUR
T1 - Operator variational inference
AU - Ranganath, Rajesh
AU - Altosaar, Jaan
AU - Tran, Dustin
AU - Blei, David M.
N1 - Funding Information:
This work is supported by NSF IIS-1247664, ONR N00014-11-1-0651, DARPA FA8750-14-2-0009, DARPA N66001-15-C-4032, Adobe, NSERC PGS-D, the Porter Ogden Jacobus Fellowship, the Seibel Foundation, and the Sloan Foundation. The authors would like to thank Dawen Liang, Ben Poole, Stephan Mandt, Kevin Murphy, Christian Naesseth, and the anonymous reviewers for their helpful feedback and comments.
Publisher Copyright:
© 2016 NIPS Foundation - All Rights Reserved.
PY - 2016
Y1 - 2016
N2 - Variational inference is an umbrella term for algorithms that cast Bayesian inference as optimization. Classically, variational inference uses the Kullback-Leibler divergence to define the optimization. Though this divergence has been widely used, the resultant posterior approximation can suffer from undesirable statistical properties. To address this, we reexamine variational inference from its roots as an optimization problem. We use operators, or functions of functions, to design variational objectives. As one example, we design a variational objective with a Langevin-Stein operator. We develop a black box algorithm, operator variational inference (OPVI), for optimizing any operator objective. Importantly, operators enable us to make explicit the statistical and computational tradeoffs for variational inference. We can characterize different properties of variational objectives, such as objectives that admit data subsampling (allowing inference to scale to massive data) and objectives that admit variational programs (a rich class of posterior approximations that does not require a tractable density). We illustrate the benefits of OPVI on a mixture model and a generative model of images.
UR - http://www.scopus.com/inward/record.url?scp=85019244640&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85019244640&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85019244640
SN - 1049-5258
SP - 496
EP - 504
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 30th Annual Conference on Neural Information Processing Systems, NIPS 2016
Y2 - 5 December 2016 through 10 December 2016
ER -