Two-layer contractive encodings for learning stable nonlinear features

Hannes Schulz, Kyunghyun Cho, Tapani Raiko, Sven Behnke

Research output: Contribution to journal › Article › peer-review

Abstract

Unsupervised learning of feature hierarchies is often a good strategy to initialize deep architectures for supervised learning. Most existing deep learning methods build these feature hierarchies layer by layer in a greedy fashion using either auto-encoders or restricted Boltzmann machines. Both yield encoders which compute linear projections of input followed by a smooth thresholding function. In this work, we demonstrate that these encoders fail to find stable features when the required computation is in the exclusive-or class. To overcome this limitation, we propose a two-layer encoder which is less restricted in the type of features it can learn. The proposed encoder is regularized by an extension of previous work on contractive regularization. This proposed two-layer contractive encoder potentially poses a more difficult optimization problem, and we further propose to linearly transform hidden neurons of the encoder to make learning easier. We demonstrate the advantages of the two-layer encoders qualitatively on artificially constructed datasets as well as commonly used benchmark datasets. We also conduct experiments on a semi-supervised learning task and show the benefits of the proposed two-layer encoders trained with the linear transformation of perceptrons.
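
The contrast drawn in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not state the exact form of their regularizer, so the penalty below is assumed to be the standard contractive penalty (squared Frobenius norm of the encoder Jacobian), applied to a two-layer sigmoid encoder via the chain rule. All function names and the hand-picked XOR weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    """Smooth thresholding function applied after a linear projection."""
    return 1.0 / (1.0 + np.exp(-z))


def one_layer_encoder(x, W, b):
    """Single-layer encoder: linear projection followed by a sigmoid."""
    return sigmoid(W @ x + b)


def two_layer_encoder(x, W1, b1, W2, b2):
    """Two-layer encoder; returns both hidden activations."""
    h1 = sigmoid(W1 @ x + b1)
    h2 = sigmoid(W2 @ h1 + b2)
    return h1, h2


def two_layer_jacobian(x, W1, b1, W2, b2):
    """Jacobian dh2/dx via the chain rule:
    diag(h2 * (1 - h2)) @ W2 @ diag(h1 * (1 - h1)) @ W1."""
    h1, h2 = two_layer_encoder(x, W1, b1, W2, b2)
    outer = (h2 * (1.0 - h2))[:, None] * W2
    inner = (h1 * (1.0 - h1))[:, None] * W1
    return outer @ inner


def contractive_penalty(J):
    """Assumed penalty: squared Frobenius norm of the encoder Jacobian."""
    return float(np.sum(J ** 2))


# Hand-picked (hypothetical) weights under which the two-layer encoder
# computes exclusive-or: h1 approximates [OR, AND], h2 is "OR and not AND".
W1 = np.array([[20.0, 20.0], [20.0, 20.0]])
b1 = np.array([-10.0, -30.0])
W2 = np.array([[20.0, -20.0]])
b2 = np.array([-10.0])

if __name__ == "__main__":
    for x in np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]):
        _, h2 = two_layer_encoder(x, W1, b1, W2, b2)
        J = two_layer_jacobian(x, W1, b1, W2, b2)
        print(x, int(round(h2[0])), contractive_penalty(J))
```

A single unit of `one_layer_encoder` cannot reproduce this mapping for any choice of `W` and `b`, because exclusive-or is not linearly separable; the second layer is what makes the feature representable. The penalty is small at the four input points here because both sigmoid layers are close to saturation, which is the sense in which a contractive regularizer rewards features that are stable under small input perturbations.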

Original language: English (US)
Pages (from-to): 4-11
Number of pages: 8
Journal: Neural Networks
Volume: 64
State: Published - Apr 1 2015

Keywords

  • Deep learning
  • Linear transformation
  • Multi-layer perceptron
  • Pretraining
  • Semi-supervised learning
  • Two-layer contractive encoding

ASJC Scopus subject areas

  • Cognitive Neuroscience
  • Artificial Intelligence
