TY - GEN
T1 - A theoretical analysis of feature pooling in visual recognition
AU - Boureau, Y-Lan
AU - Ponce, Jean
AU - LeCun, Yann
PY - 2010
Y1 - 2010
N2 - Many modern visual recognition algorithms incorporate a step of spatial 'pooling', where the outputs of several nearby feature detectors are combined into a local or global 'bag of features', in a way that preserves task-related information while removing irrelevant details. Pooling is used to achieve invariance to image transformations, more compact representations, and better robustness to noise and clutter. Several papers have shown that the details of the pooling operation can greatly influence the performance, but studies have so far been purely empirical. In this paper, we show that the reasons underlying the performance of various pooling methods are obscured by several confounding factors, such as the link between the sample cardinality in a spatial pool and the resolution at which low-level features have been extracted. We provide a detailed theoretical analysis of max pooling and average pooling, and give extensive empirical comparisons for object recognition tasks.
AB - Many modern visual recognition algorithms incorporate a step of spatial 'pooling', where the outputs of several nearby feature detectors are combined into a local or global 'bag of features', in a way that preserves task-related information while removing irrelevant details. Pooling is used to achieve invariance to image transformations, more compact representations, and better robustness to noise and clutter. Several papers have shown that the details of the pooling operation can greatly influence the performance, but studies have so far been purely empirical. In this paper, we show that the reasons underlying the performance of various pooling methods are obscured by several confounding factors, such as the link between the sample cardinality in a spatial pool and the resolution at which low-level features have been extracted. We provide a detailed theoretical analysis of max pooling and average pooling, and give extensive empirical comparisons for object recognition tasks.
UR - http://www.scopus.com/inward/record.url?scp=77956502203&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77956502203&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:77956502203
SN - 9781605589077
T3 - ICML 2010 - Proceedings, 27th International Conference on Machine Learning
SP - 111
EP - 118
BT - ICML 2010 - Proceedings, 27th International Conference on Machine Learning
T2 - 27th International Conference on Machine Learning, ICML 2010
Y2 - 21 June 2010 through 25 June 2010
ER -