TY - JOUR
T1 - Energy-inspired models
T2 - 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
AU - Lawson, Dieterich
AU - Tucker, George
AU - Dai, Bo
AU - Ranganath, Rajesh
N1 - Funding Information:
We thank Ben Poole, Abhishek Kumar, and Diederik Kingma for helpful comments. We thank Matthias Bauer for answering implementation questions about LARS.
PY - 2019
Y1 - 2019
N2 - Energy-based models (EBMs) are powerful probabilistic models [8, 44], but suffer from intractable sampling and density evaluation due to the partition function. As a result, inference in EBMs relies on approximate sampling algorithms, leading to a mismatch between the model and inference. Motivated by this, we consider the sampler-induced distribution as the model of interest and maximize the likelihood of this model. This yields a class of energy-inspired models (EIMs) that incorporate learned energy functions while still providing exact samples and tractable log-likelihood lower bounds. We describe and evaluate three instantiations of such models based on truncated rejection sampling, self-normalized importance sampling, and Hamiltonian importance sampling. These models outperform or perform comparably to the recently proposed Learned Accept/Reject Sampling algorithm [5] and provide new insights on ranking Noise Contrastive Estimation [34, 46] and Contrastive Predictive Coding [57]. Moreover, EIMs allow us to generalize a recent connection between multi-sample variational lower bounds [9] and auxiliary variable variational inference [1, 63, 59, 47]. We show how recent variational bounds [9, 49, 52, 42, 73, 51, 65] can be unified with EIMs as the variational family.
UR - http://www.scopus.com/inward/record.url?scp=85090170623&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090170623&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85090170623
VL - 32
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
SN - 1049-5258
Y2 - 8 December 2019 through 14 December 2019
ER -