TY - GEN
T1 - Unsupervised joint 3D object model learning and 6D pose estimation for depth-based instance segmentation
AU - Wu, Yuanwei
AU - Marks, Tim
AU - Cherian, Anoop
AU - Chen, Siheng
AU - Feng, Chen
AU - Wang, Guanghui
AU - Sullivan, Alan
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - In this work, we propose a novel unsupervised approach to jointly learn the 3D object model and estimate the 6D poses of multiple instances of the same object, with applications to depth-based instance segmentation. The inputs are depth images, and the learned object model is represented by a 3D point cloud. Traditional 6D pose estimation approaches are not sufficient to address this problem, where neither a CAD model of the object nor the ground-truth 6D poses of its instances are available during training. To solve this problem, we propose to jointly optimize the model learning and pose estimation in an end-to-end deep learning framework. Specifically, our network produces a 3D object model and a list of rigid transformations on this model to generate instances, which, when rendered, must match the observed point cloud to minimize the Chamfer distance. To render the set of instance point clouds with occlusions, the network automatically removes the occluded points in a given camera view. Extensive experiments evaluate our technique on several object models and varying numbers of instances in 3D point clouds. We demonstrate the application of our method to instance segmentation of depth images of small bins of industrial parts. Compared with popular baselines for instance segmentation, our model not only demonstrates competitive performance, but also learns a 3D object model that is represented as a 3D point cloud.
AB - In this work, we propose a novel unsupervised approach to jointly learn the 3D object model and estimate the 6D poses of multiple instances of the same object, with applications to depth-based instance segmentation. The inputs are depth images, and the learned object model is represented by a 3D point cloud. Traditional 6D pose estimation approaches are not sufficient to address this problem, where neither a CAD model of the object nor the ground-truth 6D poses of its instances are available during training. To solve this problem, we propose to jointly optimize the model learning and pose estimation in an end-to-end deep learning framework. Specifically, our network produces a 3D object model and a list of rigid transformations on this model to generate instances, which, when rendered, must match the observed point cloud to minimize the Chamfer distance. To render the set of instance point clouds with occlusions, the network automatically removes the occluded points in a given camera view. Extensive experiments evaluate our technique on several object models and varying numbers of instances in 3D point clouds. We demonstrate the application of our method to instance segmentation of depth images of small bins of industrial parts. Compared with popular baselines for instance segmentation, our model not only demonstrates competitive performance, but also learns a 3D object model that is represented as a 3D point cloud.
KW - 3D object model learning
KW - 6D pose estimation
KW - Instance segmentation
KW - Unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85082453207&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082453207&partnerID=8YFLogxK
U2 - 10.1109/ICCVW.2019.00339
DO - 10.1109/ICCVW.2019.00339
M3 - Conference contribution
AN - SCOPUS:85082453207
T3 - Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019
SP - 2777
EP - 2786
BT - Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE/CVF International Conference on Computer Vision Workshop, ICCVW 2019
Y2 - 27 October 2019 through 28 October 2019
ER -