TY - GEN
T1 - Generative Models as Out-of-Equilibrium Particle Systems
T2 - 2nd International Conference on Nonlinear Dynamics and Applications, ICNDA 2024
AU - Carbone, Davide
AU - Hua, Mengjian
AU - Coste, Simon
AU - Vanden-Eijnden, Eric
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - Energy-based models (EBMs) are generative models rooted in principles from statistical physics that find diverse applications in unsupervised learning. The evaluation of their performance often hinges on the cross-entropy (CE), which gauges the model distribution’s fidelity to the underlying data distribution. However, training EBMs using CE as the objective poses challenges due to the need to compute its gradient with respect to the model parameters, a task demanding sampling from the model distribution at each optimization step. By incorporating tools from sequential Monte-Carlo sampling, we achieved efficient computation of the gradient of CE, thereby circumventing the uncontrolled approximations present in standard contrastive divergence algorithms. Numerical experiments conducted on Gaussian mixture distributions, as well as the MNIST and CIFAR-10 datasets, provided empirical support for our theoretical findings. In this contribution, we present and emphasize our recent results, drawing particular attention to the physical interpretation of the proposed methodology.
AB - Energy-based models (EBMs) are generative models rooted in principles from statistical physics that find diverse applications in unsupervised learning. The evaluation of their performance often hinges on the cross-entropy (CE), which gauges the model distribution’s fidelity to the underlying data distribution. However, training EBMs using CE as the objective poses challenges due to the need to compute its gradient with respect to the model parameters, a task demanding sampling from the model distribution at each optimization step. By incorporating tools from sequential Monte-Carlo sampling, we achieved efficient computation of the gradient of CE, thereby circumventing the uncontrolled approximations present in standard contrastive divergence algorithms. Numerical experiments conducted on Gaussian mixture distributions, as well as the MNIST and CIFAR-10 datasets, provided empirical support for our theoretical findings. In this contribution, we present and emphasize our recent results, drawing particular attention to the physical interpretation of the proposed methodology.
KW - Generative models
KW - Jarzynski Identity
KW - Non-Equilibrium Thermodynamics
KW - Sequential Monte-Carlo
KW - Unsupervised machine learning
UR - http://www.scopus.com/inward/record.url?scp=85213384058&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85213384058&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-69146-1_23
DO - 10.1007/978-3-031-69146-1_23
M3 - Conference contribution
AN - SCOPUS:85213384058
SN - 9783031691454
T3 - Springer Proceedings in Physics
SP - 287
EP - 311
BT - Proceedings of the 2nd International Conference on Nonlinear Dynamics and Applications (ICNDA 2024) - Dynamical Models, Communications and Networks
A2 - Saha, Asit
A2 - Banerjee, Santo
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 21 February 2024 through 23 February 2024
ER -