A sparse and locally shift invariant feature extractor applied to document images

Marc Aurelio Ranzato, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We describe an unsupervised learning algorithm for extracting sparse and locally shift-invariant features. We also devise a principled procedure for learning hierarchies of invariant features. Each feature detector is composed of a set of trainable convolutional filters followed by a max-pooling layer over non-overlapping windows, and a point-wise sigmoid non-linearity. A second stage of more invariant features is fed with patches provided by the first stage feature extractor, and is trained in the same way. The method is used to pre-train the first four layers of a deep convolutional network which achieves state-of-the-art performance on the MNIST dataset of handwritten digits. The final testing error rate is equal to 0.42%. Preliminary experiments on compression of bitonal document images show very promising results in terms of compression ratio and reconstruction error.
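
The feed-forward path of one feature-extraction stage, as the abstract describes it (convolutional filters, then max-pooling over non-overlapping windows, then a point-wise sigmoid), might be sketched as follows. This is only an illustration and not the authors' implementation: the function names, the 5x5 filter size, the 2x2 pooling window, and the random filters are all assumptions made for the example, and the unsupervised training procedure is not shown.

```python
import numpy as np

def sigmoid(x):
    """Point-wise logistic non-linearity."""
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_valid(image, kernel):
    """Valid-mode 2-D cross-correlation of one image with one kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size):
    """Max over non-overlapping size x size windows (ragged edges are cropped)."""
    h = (fmap.shape[0] // size) * size
    w = (fmap.shape[1] // size) * size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def extract_features(image, filters, pool_size=2):
    """Filter bank -> max-pooling -> sigmoid, producing one map per filter."""
    maps = [sigmoid(max_pool(conv2d_valid(image, k), pool_size)) for k in filters]
    return np.stack(maps)

# Hypothetical sizes: a 28x28 input (the MNIST image size) and four 5x5 filters.
rng = np.random.default_rng(0)
image = rng.random((28, 28))
filters = rng.standard_normal((4, 5, 5))
features = extract_features(image, filters, pool_size=2)
```

With these assumed sizes, each 28x28 input yields a 24x24 filter response that is pooled down to 12x12. Because the maximum is taken over a whole window, moving an activation anywhere inside the same window leaves the pooled output unchanged, which is where the local shift invariance comes from.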

Original language: English (US)
Title of host publication: Proceedings - 9th International Conference on Document Analysis and Recognition, ICDAR 2007
Pages: 1213-1217
Number of pages: 5
State: Published - 2007
Event: 9th International Conference on Document Analysis and Recognition, ICDAR 2007 - Curitiba, Brazil
Duration: Sep 23 2007 → Sep 26 2007

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Volume: 2
ISSN (Print): 1520-5363

Other

Other: 9th International Conference on Document Analysis and Recognition, ICDAR 2007
Country/Territory: Brazil
City: Curitiba
Period: 9/23/07 → 9/26/07

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
