A two-stage pretraining algorithm for deep Boltzmann machines

Kyunghyun Cho, Tapani Raiko, Alexander Ilin, Juha Karhunen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A deep Boltzmann machine (DBM) is a recently introduced Markov random field model that has multiple layers of hidden units. It has been shown empirically that, unlike its simpler special case, the restricted Boltzmann machine (RBM), a DBM is difficult to train with approximate maximum-likelihood learning using the stochastic gradient. In this paper, we propose a novel pretraining algorithm that consists of two stages: obtaining approximate posterior distributions over hidden units from a simpler model, and maximizing the variational lower bound given the fixed hidden posterior distributions. We show empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a generative model that is better than, or comparable to, the one obtained with the conventional pretraining algorithm.
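To make the second stage more concrete, the following is a minimal sketch of the quantity it maximizes: the variational lower bound on log p(v) for a fixed factorial posterior Q over the hidden units. It is not the authors' implementation. The layer sizes, the weights `W1` and `W2`, the omission of bias terms, and the hard-coded posterior means `mu1` and `mu2` (which stand in for the output of the first stage) are all illustrative assumptions. The model is kept small enough that the partition function can be enumerated exactly, which is not possible for a DBM of realistic size.

```python
import itertools
import math

# Hypothetical toy DBM: 2 visible units and 2 units in each of two hidden layers.
# Weights are arbitrary illustrative values; biases are omitted for brevity.
W1 = [[0.5, -0.3], [0.2, 0.8]]   # visible <-> first hidden layer
W2 = [[-0.4, 0.6], [0.7, 0.1]]   # first <-> second hidden layer

STATES = list(itertools.product([0, 1], repeat=2))  # all binary vectors of length 2


def neg_energy(v, h1, h2):
    """-E(v, h1, h2) = v^T W1 h1 + h1^T W2 h2 for binary units."""
    a = sum(v[i] * W1[i][j] * h1[j] for i in range(2) for j in range(2))
    b = sum(h1[j] * W2[j][k] * h2[k] for j in range(2) for k in range(2))
    return a + b


def log_sum_exp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_partition():
    """Exact log Z by enumerating all 64 joint states (toy model only)."""
    return log_sum_exp(
        [neg_energy(v, h1, h2) for v in STATES for h1 in STATES for h2 in STATES]
    )


def log_likelihood(v):
    """Exact log p(v), obtained by summing out both hidden layers."""
    unnorm = log_sum_exp([neg_energy(v, h1, h2) for h1 in STATES for h2 in STATES])
    return unnorm - log_partition()


def bernoulli_prob(state, mu):
    """Probability of a binary vector under independent Bernoulli means mu."""
    p = 1.0
    for s, m in zip(state, mu):
        p *= m if s else 1.0 - m
    return p


def lower_bound(v, mu1, mu2):
    """E_Q[-E(v, h)] + H(Q) - log Z for a fixed factorial Q(h1, h2)."""
    expected, entropy = 0.0, 0.0
    for h1 in STATES:
        for h2 in STATES:
            q = bernoulli_prob(h1, mu1) * bernoulli_prob(h2, mu2)
            if q > 0.0:
                expected += q * neg_energy(v, h1, h2)
                entropy -= q * math.log(q)
    return expected + entropy - log_partition()


# Assumed stage-1 output: posterior means for one training vector.
v = (1, 0)
exact = log_likelihood(v)
bound = lower_bound(v, mu1=[0.6, 0.4], mu2=[0.5, 0.5])
print(exact, bound)
```

By Jensen's inequality, `bound` never exceeds `exact`, and the gap is the KL divergence between Q and the true posterior. With `mu1` and `mu2` held fixed, only the expected negative energy and the log-partition term depend on the weights, so a second-stage update would adjust `W1` and `W2` to increase `lower_bound`.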

Original language: English (US)
Title of host publication: Artificial Neural Networks and Machine Learning, ICANN 2013 - 23rd International Conference on Artificial Neural Networks, Proceedings
Pages: 106-113
Number of pages: 8
DOIs: https://doi.org/10.1007/978-3-642-40728-4_14
State: Published - 2013
Event: 23rd International Conference on Artificial Neural Networks, ICANN 2013 - Sofia, Bulgaria
Duration: Sep 10 2013 to Sep 13 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8131 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 23rd International Conference on Artificial Neural Networks, ICANN 2013
Country: Bulgaria
City: Sofia
Period: 9/10/13 to 9/13/13

Keywords

  • Deep Boltzmann Machine
  • Deep Learning
  • Pretraining

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

  • Cite this

    Cho, K., Raiko, T., Ilin, A., & Karhunen, J. (2013). A two-stage pretraining algorithm for deep Boltzmann machines. In Artificial Neural Networks and Machine Learning, ICANN 2013 - 23rd International Conference on Artificial Neural Networks, Proceedings (pp. 106-113). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8131 LNCS). https://doi.org/10.1007/978-3-642-40728-4_14