TY - GEN
T1 - Learning convolutional feature hierarchies for visual recognition
AU - Kavukcuoglu, Koray
AU - Sermanet, Pierre
AU - Boureau, Y. Lan
AU - Gregor, Karol
AU - Mathieu, Michaël
AU - LeCun, Yann
PY - 2010
Y1 - 2010
N2 - We propose an unsupervised method for learning multi-stage hierarchies of sparse convolutional features. While sparse coding has become an increasingly popular method for learning visual features, it is most often trained at the patch level. Applying the resulting filters convolutionally results in highly redundant codes because overlapping patches are encoded in isolation. By training convolutionally over large image windows, our method reduces the redundancy between feature vectors at neighboring locations and improves the efficiency of the overall representation. In addition to a linear decoder that reconstructs the image from sparse features, our method trains an efficient feed-forward encoder that predicts quasi-sparse features from the input. While patch-based training rarely produces anything but oriented edge detectors, we show that convolutional training produces highly diverse filters, including center-surround filters, corner detectors, cross detectors, and oriented grating detectors. We show that using these filters in a multi-stage convolutional network architecture improves performance on a number of visual recognition and detection tasks.
AB - We propose an unsupervised method for learning multi-stage hierarchies of sparse convolutional features. While sparse coding has become an increasingly popular method for learning visual features, it is most often trained at the patch level. Applying the resulting filters convolutionally results in highly redundant codes because overlapping patches are encoded in isolation. By training convolutionally over large image windows, our method reduces the redundancy between feature vectors at neighboring locations and improves the efficiency of the overall representation. In addition to a linear decoder that reconstructs the image from sparse features, our method trains an efficient feed-forward encoder that predicts quasi-sparse features from the input. While patch-based training rarely produces anything but oriented edge detectors, we show that convolutional training produces highly diverse filters, including center-surround filters, corner detectors, cross detectors, and oriented grating detectors. We show that using these filters in a multi-stage convolutional network architecture improves performance on a number of visual recognition and detection tasks.
UR - http://www.scopus.com/inward/record.url?scp=85162460675&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85162460675&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85162460675
SN - 9781617823800
T3 - Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010
BT - Advances in Neural Information Processing Systems 23
PB - Neural Information Processing Systems
T2 - 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010
Y2 - 6 December 2010 through 9 December 2010
ER -