Sparse feature learning for deep belief networks

Marc'Aurelio Ranzato, Y-Lan Boureau, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g., low dimension or sparsity). Others are based on approximating density by stochastically reconstructing the input from the representation. We describe a novel and efficient algorithm to learn sparse representations, and compare it theoretically and experimentally with a similar machine trained probabilistically, namely a Restricted Boltzmann Machine. We propose a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation. We demonstrate this method by extracting features from a dataset of handwritten numerals, and from a dataset of natural image patches. We show that by stacking multiple levels of such machines and by training sequentially, high-order dependencies between the observed input variables can be captured.
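
The reconstruction-with-constraint idea mentioned in the abstract can be illustrated with a small sketch. This is not the algorithm from the paper, which the abstract does not specify; it is a generic example that assumes a logistic encoder, a linear decoder with tied weights, and an L1 penalty on the code. All names (`encode`, `decode`, `objective`, `sparsity_weight`) are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def encode(W, b, x):
    """Code z_j = sigmoid(sum_i W[j][i] * x[i] + b[j]) (assumed encoder form)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def decode(W, z):
    """Linear reconstruction using the transposed (tied) weights (an assumption)."""
    n_in = len(W[0])
    return [sum(W[j][i] * z[j] for j in range(len(W))) for i in range(n_in)]


def objective(W, b, x, sparsity_weight):
    """Return (reconstruction error, L1 sparsity term, weighted total)."""
    z = encode(W, b, x)
    x_hat = decode(W, z)
    recon = sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    sparsity = sum(abs(zj) for zj in z)  # generic L1 penalty, not the paper's
    return recon, sparsity, recon + sparsity_weight * sparsity


# Toy setup: 4 input units, 6 code units, small random weights.
random.seed(0)
n_in, n_code = 4, 6
W = [[random.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_code)]
b = [0.0] * n_code
x = [0.0, 1.0, 1.0, 0.0]

recon, sparsity, total = objective(W, b, x, sparsity_weight=0.1)
```

Raising `sparsity_weight` in such an objective favors codes with smaller activations at the cost of a larger reconstruction error, which is the kind of trade-off the abstract refers to.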

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference
State: Published - 2009
Event: 21st Annual Conference on Neural Information Processing Systems, NIPS 2007 - Vancouver, BC, Canada
Duration: Dec 3 2007 to Dec 6 2007

Publication series

Name: Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference

ASJC Scopus subject areas

  • Information Systems
