TY - GEN
T1 - Learning pairwise neural network encoder for depth image-based 3D model retrieval
AU - Zhu, Jing
AU - Zhu, Fan
AU - Wong, Edward K.
AU - Fang, Yi
N1 - Publisher Copyright:
© 2015 ACM.
PY - 2015/10/13
Y1 - 2015/10/13
N2 - With the emergence of RGB-D cameras (e.g., Kinect), the sensing capability of artificial intelligence systems has dramatically increased, and as a consequence, a wide range of depth image-based human-machine interaction applications have been proposed. In the design industry, a 3D model contains abundant information that is required for manufacturing. Since depth images can be conveniently acquired, a retrieval system that returns 3D models based on depth image inputs can assist or improve the traditional product design process. In this work, we address the depth image-based 3D model retrieval problem. By extending the neural network to a neural network pair with identical output layers for objects of the same category, unified domain-invariant representations can be learned from the low-level mismatched depth image features and 3D model features. A unique advantage of the framework is that correspondence information between depth images and 3D models is not required, so it can easily be generalized to large-scale databases. To evaluate the effectiveness of our approach, depth images (with Kinect-type noise) from the NYU Depth V2 dataset are used as queries to retrieve 3D models of the same categories from the SHREC 2014 dataset. Experimental results suggest that our approach outperforms state-of-the-art methods, as well as the paradigm that directly uses the original representations of depth images and 3D models for retrieval.
AB - With the emergence of RGB-D cameras (e.g., Kinect), the sensing capability of artificial intelligence systems has dramatically increased, and as a consequence, a wide range of depth image-based human-machine interaction applications have been proposed. In the design industry, a 3D model contains abundant information that is required for manufacturing. Since depth images can be conveniently acquired, a retrieval system that returns 3D models based on depth image inputs can assist or improve the traditional product design process. In this work, we address the depth image-based 3D model retrieval problem. By extending the neural network to a neural network pair with identical output layers for objects of the same category, unified domain-invariant representations can be learned from the low-level mismatched depth image features and 3D model features. A unique advantage of the framework is that correspondence information between depth images and 3D models is not required, so it can easily be generalized to large-scale databases. To evaluate the effectiveness of our approach, depth images (with Kinect-type noise) from the NYU Depth V2 dataset are used as queries to retrieve 3D models of the same categories from the SHREC 2014 dataset. Experimental results suggest that our approach outperforms state-of-the-art methods, as well as the paradigm that directly uses the original representations of depth images and 3D models for retrieval.
KW - Cross-Domain
KW - Depth Image
KW - Neural Network
KW - Retrieval
UR - http://www.scopus.com/inward/record.url?scp=84962903684&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962903684&partnerID=8YFLogxK
U2 - 10.1145/2733373.2806323
DO - 10.1145/2733373.2806323
M3 - Conference contribution
AN - SCOPUS:84962903684
T3 - MM 2015 - Proceedings of the 2015 ACM Multimedia Conference
SP - 1227
EP - 1230
BT - MM 2015 - Proceedings of the 2015 ACM Multimedia Conference
PB - Association for Computing Machinery, Inc
T2 - 23rd ACM International Conference on Multimedia, MM 2015
Y2 - 26 October 2015 through 30 October 2015
ER -