Two-layer contractive encodings with shortcuts for semi-supervised learning

Hannes Schulz, Kyunghyun Cho, Tapani Raiko, Sven Behnke

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Supervised training of multi-layer perceptrons (MLPs) with only a few labeled examples is prone to overfitting. Pretraining an MLP with unlabeled samples of the input distribution may achieve better generalization. Usually, pretraining is done in a layer-wise, greedy fashion, which limits the complexity of the learnable features. To overcome this limitation, two-layer contractive encodings have been proposed recently; however, they pose a more difficult optimization problem. On the other hand, linear transformations of perceptrons have been proposed to make optimization of deep networks easier. In this paper, we propose to combine these two approaches. Experiments on handwritten digit recognition show the benefits of our combined approach to semi-supervised learning.

Original language: English (US)
Title of host publication: Neural Information Processing - 20th International Conference, ICONIP 2013, Proceedings
Pages: 450-457
Number of pages: 8
Edition: PART 1
State: Published - 2013
Event: 20th International Conference on Neural Information Processing, ICONIP 2013 - Daegu, Korea, Republic of
Duration: Nov 3 2013 to Nov 7 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 8226 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 20th International Conference on Neural Information Processing, ICONIP 2013
Country/Territory: Korea, Republic of
City: Daegu
Period: 11/3/13 to 11/7/13

Keywords

  • Linear transformation
  • Multi-layer perceptron
  • Semi-supervised learning
  • Two-layer contractive encoding

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
