How to pretrain deep Boltzmann machines in two stages

Kyunghyun Cho, Tapani Raiko, Alexander Ilin, Juha Karhunen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A deep Boltzmann machine (DBM) is a recently introduced Markov random field model with multiple layers of hidden units. It has been shown empirically that, unlike its simpler special case, the restricted Boltzmann machine (RBM), a DBM is difficult to train with approximate maximum-likelihood learning based on stochastic gradients. In this paper, we propose a novel pretraining algorithm that consists of two stages: first, obtaining approximate posterior distributions over the hidden units from a simpler model, and second, maximizing the variational lower bound with those hidden posterior distributions held fixed. We show empirically that the proposed method overcomes the difficulty of training DBMs from randomly initialized parameters and yields a generative model that is better than, or comparable to, one obtained with the conventional pretraining algorithm.
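The sketch below illustrates the two-stage idea described in the abstract, assuming binary units and a DBM with one visible and two hidden layers. The function names (train_rbm, pretrain_dbm_two_stage), the use of CD-1-trained stacked RBMs as the "simpler model" in stage 1, and the persistent Gibbs chain in stage 2 are illustrative assumptions, not the paper's actual algorithm or implementation.

```python
# Hypothetical sketch of two-stage DBM pretraining; not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, n_epochs=5, lr=0.05):
    """Stage-1 helper: fit a small binary RBM with one-step contrastive
    divergence; a stand-in for the 'simpler model' that supplies the
    approximate hidden posteriors."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    for _ in range(n_epochs):
        for v0 in data:
            p_h0 = sigmoid(v0 @ W + b_h)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)
            v1 = sigmoid(h0 @ W.T + b_v)
            p_h1 = sigmoid(v1 @ W + b_h)
            W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
            b_v += lr * (v0 - v1)
            b_h += lr * (p_h0 - p_h1)
    return W, b_h

def pretrain_dbm_two_stage(data, n_h1, n_h2, n_epochs=5, lr=0.01):
    n_v = data.shape[1]

    # Stage 1: approximate posteriors over both hidden layers, obtained
    # here from a greedy stack of two RBMs.
    W1_init, c1 = train_rbm(data, n_h1)
    q1 = sigmoid(data @ W1_init + c1)   # Q(h1 = 1 | v), held fixed below
    W2_init, c2 = train_rbm(q1, n_h2)
    q2 = sigmoid(q1 @ W2_init + c2)     # Q(h2 = 1 | h1), held fixed below

    # Stage 2: with Q fixed, the gradient of the variational lower bound
    # w.r.t. each DBM weight matrix is the difference between statistics
    # under Q (positive phase) and under the model (negative phase); the
    # model statistics are estimated with a persistent Gibbs chain.
    # Biases are omitted for brevity.
    W1 = 0.01 * rng.standard_normal((n_v, n_h1))
    W2 = 0.01 * rng.standard_normal((n_h1, n_h2))
    v_f, h2_f = data.copy(), q2.copy()  # fantasy particles
    for _ in range(n_epochs):
        # Negative phase: one sweep of block Gibbs sampling in the DBM.
        h1_f = sigmoid(v_f @ W1 + h2_f @ W2.T)
        v_f = sigmoid(h1_f @ W1.T)
        h2_f = sigmoid(h1_f @ W2)
        # Positive phase: statistics under the fixed posteriors Q.
        W1 += lr * (data.T @ q1 - v_f.T @ h1_f) / len(data)
        W2 += lr * (q1.T @ q2 - h1_f.T @ h2_f) / len(data)
    return W1, W2

# Toy usage on random binary data.
X = (rng.random((100, 16)) < 0.3).astype(float)
W1, W2 = pretrain_dbm_two_stage(X, n_h1=8, n_h2=4)
```

Holding Q fixed in stage 2 is the point of the construction: the intractable joint optimization over posteriors and parameters reduces to estimating model statistics alone, which in this sketch is handled by the persistent chain.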

Original language: English (US)
Title of host publication: Artificial Neural Networks - Methods and Applications in Bio-/Neuroinformatics
Publisher: Springer Verlag
Pages: 201-219
Number of pages: 19
ISBN (Print): 9783319099026
DOIs
State: Published - 2015
Event: 23rd International Conference on Artificial Neural Networks, ICANN 2013 - Sofia, Bulgaria
Duration: Sep 10, 2013 - Sep 13, 2013

Publication series

Name: Artificial Neural Networks - Methods and Applications in Bio-/Neuroinformatics

Other

Other: 23rd International Conference on Artificial Neural Networks, ICANN 2013
Country/Territory: Bulgaria
City: Sofia
Period: 9/10/13 - 9/13/13

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems
