TY - GEN
T1 - PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
T2 - 16th European Conference on Computer Vision, ECCV 2020
AU - Xie, Saining
AU - Gu, Jiatao
AU - Guo, Demi
AU - Qi, Charles R.
AU - Guibas, Leonidas
AU - Litany, Or
N1 - Funding Information:
Acknowledgements. O.L. and L.G. were supported in part by NSF grant IIS-1763268, a Vannevar Bush Faculty Fellowship, and a grant from the SAIL-Toyota Center for AI Research.
Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - Arguably one of the top success stories of deep learning is transfer learning. The finding that pre-training a network on a rich source set (e.g., ImageNet) can help boost performance once fine-tuned on a usually much smaller target set has been instrumental to many applications in language and vision. Yet, very little is known about its usefulness in 3D point cloud understanding. We see this as an opportunity, considering the effort required for annotating data in 3D. In this work, we aim at facilitating research on 3D representation learning. Different from previous works, we focus on high-level scene understanding tasks. To this end, we select a suite of diverse datasets and tasks to measure the effect of unsupervised pre-training on a large source set of 3D scenes. Our findings are extremely encouraging: using a unified triplet of architecture, source dataset, and contrastive loss for pre-training, we achieve improvement over recent best results in segmentation and detection across 6 different benchmarks for indoor and outdoor, real and synthetic datasets – demonstrating that the learned representation can generalize across domains. Furthermore, the improvement was similar to supervised pre-training, suggesting that future efforts should favor scaling data collection over more detailed annotation. We hope these findings will encourage more research on unsupervised pretext task design for 3D deep learning.
KW - 3D scene understanding
KW - Point cloud recognition
KW - Representation learning
KW - Unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85097812733&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097812733&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-58580-8_34
DO - 10.1007/978-3-030-58580-8_34
M3 - Conference contribution
AN - SCOPUS:85097812733
SN - 9783030585792
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 574
EP - 591
BT - Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
A2 - Vedaldi, Andrea
A2 - Bischof, Horst
A2 - Brox, Thomas
A2 - Frahm, Jan-Michael
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 August 2020 through 28 August 2020
ER -