TY - GEN
T1 - Hyper-class augmented and regularized deep learning for fine-grained image classification
AU - Xie, Saining
AU - Yang, Tianbao
AU - Wang, Xiaoyu
AU - Lin, Yuanqing
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/10/14
Y1 - 2015/10/14
N2 - Deep convolutional neural networks (CNN) have seen tremendous success in large-scale generic object recognition. In comparison with generic object recognition, fine-grained image classification (FGIC) is much more challenging because (i) fine-grained labeled data is much more expensive to acquire (usually requiring domain expertise); (ii) there exists large intra-class and small inter-class variance. Most recent work exploiting deep CNN for image recognition with small training data adopts a simple strategy: pre-train a deep CNN on a large-scale external dataset (e.g., ImageNet) and fine-tune on the small-scale target data to fit the specific classification task. In this paper, beyond the fine-tuning strategy, we propose a systematic framework of learning a deep CNN that addresses the challenges from two new perspectives: (i) identifying easily annotated hyper-classes inherent in the fine-grained data and acquiring a large number of hyper-class-labeled images from readily available external sources (e.g., image search engines), and formulating the problem into multitask learning; (ii) a novel learning model by exploiting a regularization between the fine-grained recognition model and the hyper-class recognition model. We demonstrate the success of the proposed framework on two small-scale fine-grained datasets (Stanford Dogs and Stanford Cars) and on a large-scale car dataset that we collected.
AB - Deep convolutional neural networks (CNN) have seen tremendous success in large-scale generic object recognition. In comparison with generic object recognition, fine-grained image classification (FGIC) is much more challenging because (i) fine-grained labeled data is much more expensive to acquire (usually requiring domain expertise); (ii) there exists large intra-class and small inter-class variance. Most recent work exploiting deep CNN for image recognition with small training data adopts a simple strategy: pre-train a deep CNN on a large-scale external dataset (e.g., ImageNet) and fine-tune on the small-scale target data to fit the specific classification task. In this paper, beyond the fine-tuning strategy, we propose a systematic framework of learning a deep CNN that addresses the challenges from two new perspectives: (i) identifying easily annotated hyper-classes inherent in the fine-grained data and acquiring a large number of hyper-class-labeled images from readily available external sources (e.g., image search engines), and formulating the problem into multitask learning; (ii) a novel learning model by exploiting a regularization between the fine-grained recognition model and the hyper-class recognition model. We demonstrate the success of the proposed framework on two small-scale fine-grained datasets (Stanford Dogs and Stanford Cars) and on a large-scale car dataset that we collected.
UR - http://www.scopus.com/inward/record.url?scp=84959248782&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959248782&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2015.7298880
DO - 10.1109/CVPR.2015.7298880
M3 - Conference contribution
AN - SCOPUS:84959248782
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 2645
EP - 2654
BT - IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
PB - IEEE Computer Society
T2 - IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Y2 - 7 June 2015 through 12 June 2015
ER -