TY - GEN
T1 - Ask the locals: Multi-way local pooling for image recognition
T2 - 2011 IEEE International Conference on Computer Vision, ICCV 2011
AU - Boureau, Y-Lan
AU - Le Roux, Nicolas
AU - Bach, Francis
AU - Ponce, Jean
AU - LeCun, Yann
PY - 2011
Y1 - 2011
N2 - Invariant representations in object recognition systems are generally obtained by pooling feature vectors over spatially local neighborhoods. But pooling is not local in the feature vector space, so that widely dissimilar features may be pooled together if they are in nearby locations. Recent approaches rely on sophisticated encoding methods and more specialized codebooks (or dictionaries), e.g., learned on subsets of descriptors which are close in feature space, to circumvent this problem. In this work, we argue that a common trait found in much recent work in image recognition or retrieval is that it leverages locality in feature space on top of purely spatial locality. We propose to apply this idea in its simplest form to an object recognition system based on the spatial pyramid framework, to increase the performance of small dictionaries with very little added engineering. State-of-the-art results on several object recognition benchmarks show the promise of this approach.
UR - http://www.scopus.com/inward/record.url?scp=84856649187&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84856649187&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2011.6126555
DO - 10.1109/ICCV.2011.6126555
M3 - Conference contribution
AN - SCOPUS:84856649187
SN - 9781457711015
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 2651
EP - 2658
BT - 2011 International Conference on Computer Vision, ICCV 2011
Y2 - 6 November 2011 through 13 November 2011
ER -