Enhanced gradient for training restricted Boltzmann machines

Kyung Hyun Cho, Tapani Raiko, Alexander Ilin

Research output: Contribution to journal › Letter › peer-review

Abstract

Restricted Boltzmann machines (RBMs) are often used as building blocks in greedy learning of deep networks. However, training this simple model can be laborious. Traditional learning algorithms often converge only with the right choice of metaparameters that specify, for example, learning rate scheduling and the scale of the initial weights. They are also sensitive to the specific data representation. An equivalent RBM can be obtained by flipping some bits and changing the weights and biases accordingly, but traditional learning rules are not invariant to such transformations. Without careful tuning of these training settings, traditional algorithms can easily get stuck or even diverge. In this letter, we present an enhanced gradient that is derived to be invariant to bit-flipping transformations. We experimentally show that the enhanced gradient yields more stable training of RBMs both when used with a fixed learning rate and an adaptive one.
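The sketch below illustrates, in NumPy, the bit-flipping issue the abstract describes: flipping one visible unit (and adjusting weights and biases accordingly) yields an equivalent RBM, yet the traditional weight gradient does not transform consistently under the flip, while a covariance-style gradient does. The covariance form is one reading of the enhanced-gradient idea, not the paper's exact definition; the RBM sizes, the synthetic "data" and "model" statistics, and the function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny Bernoulli-Bernoulli RBM; sizes and parameter scales are illustrative.
n_vis, n_hid, n_samples = 6, 4, 1000
W = rng.normal(scale=0.1, size=(n_vis, n_hid))   # weights
b = rng.normal(scale=0.1, size=n_vis)            # visible biases
c = rng.normal(scale=0.1, size=n_hid)            # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Synthetic positive ("data") and negative ("model") phase visible samples,
# standing in for the statistics CD/PCD would collect during training.
v_data  = (rng.random((n_samples, n_vis)) < 0.3).astype(float)
v_model = (rng.random((n_samples, n_vis)) < 0.5).astype(float)
h_data  = sigmoid(v_data  @ W + c)   # mean-field hidden activations
h_model = sigmoid(v_model @ W + c)

def standard_grad_W(v_d, h_d, v_m, h_m):
    """Traditional weight gradient: <v h>_data - <v h>_model."""
    return v_d.T @ h_d / len(v_d) - v_m.T @ h_m / len(v_m)

def enhanced_grad_W(v_d, h_d, v_m, h_m):
    """Covariance-style weight gradient, Cov_data(v, h) - Cov_model(v, h);
    a sketch of the enhanced-gradient idea (the paper also adjusts the
    bias gradients, which are omitted here)."""
    def cov(v, h):
        return v.T @ h / len(v) - np.outer(v.mean(0), h.mean(0))
    return cov(v_d, h_d) - cov(v_m, h_m)

# Bit-flip of visible unit i (v_i -> 1 - v_i): the equivalent RBM negates
# row i of W and bias b_i, and the hidden biases absorb the old row W[i].
i = 0
W2, b2 = W.copy(), b.copy()
W2[i] *= -1.0
b2[i] *= -1.0
c2 = c + W[i]

v_data2, v_model2 = v_data.copy(), v_model.copy()
v_data2[:, i]  = 1.0 - v_data2[:, i]
v_model2[:, i] = 1.0 - v_model2[:, i]
h_data2  = sigmoid(v_data2  @ W2 + c2)   # identical to h_data by construction
h_model2 = sigmoid(v_model2 @ W2 + c2)

# For a flip-invariant rule, row i of the weight gradient should change sign
# together with W[i], so updates in either parameterization stay equivalent.
g_std  = standard_grad_W(v_data,  h_data,  v_model,  h_model)
g_std2 = standard_grad_W(v_data2, h_data2, v_model2, h_model2)
g_enh  = enhanced_grad_W(v_data,  h_data,  v_model,  h_model)
g_enh2 = enhanced_grad_W(v_data2, h_data2, v_model2, h_model2)

print("standard gradient, row i flips sign:", np.allclose(g_std2[i], -g_std[i]))  # False
print("enhanced gradient, row i flips sign:", np.allclose(g_enh2[i], -g_enh[i]))  # True
```

Because the hidden activations are unchanged by the flip, only the covariance-based gradient mirrors the sign change of the flipped weight row; the traditional gradient picks up an extra term proportional to the difference in mean hidden activations, which is what makes it sensitive to the data representation.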

Original language: English (US)
Pages (from-to): 805-831
Number of pages: 27
Journal: Neural Computation
Volume: 25
Issue number: 3
State: Published - 2013

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Cognitive Neuroscience
