TY - GEN
T1 - 3D-OAE
T2 - 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
AU - Zhou, Junsheng
AU - Wen, Xin
AU - Ma, Baorui
AU - Liu, Yu Shen
AU - Gao, Yue
AU - Fang, Yi
AU - Han, Zhizhong
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - The manual annotation for large-scale point clouds is still tedious and unavailable for many harsh real-world tasks. Self-supervised learning, which is used on raw and unlabeled data to pre-train deep neural networks, is a promising approach to address this issue. Existing works usually take the common aid from auto-encoders to establish the self-supervision by the self-reconstruction schema. However, the previous auto-encoders merely focus on the global shapes and do not distinguish the local and global geometric features apart. To address this problem, we present a novel and efficient self-supervised point cloud representation learning framework, named 3D Occlusion Auto-Encoder (3D-OAE), to facilitate the detailed supervision inherited in local regions and global shapes. We propose to randomly occlude some local patches of point clouds and establish the supervision via inpainting the occluded patches using the remaining ones. Specifically, we design an asymmetrical encoder-decoder architecture based on standard Transformer, where the encoder operates only on the visible subset of patches to learn local patterns, and a lightweight decoder is designed to leverage these visible patterns to infer the missing geometries via self-attention. We find that occluding a very high proportion of the input point cloud (e.g. 75%) will still yield a nontrivial self-supervisory performance, which enables us to achieve 3-4 times faster during training but also improve accuracy. Experimental results show that our approach outperforms the state-of-the-art on a diverse range of down-stream discriminative and generative tasks. Code is available at https://github.com/junshengzhou/3D-OAE.
AB - The manual annotation for large-scale point clouds is still tedious and unavailable for many harsh real-world tasks. Self-supervised learning, which is used on raw and unlabeled data to pre-train deep neural networks, is a promising approach to address this issue. Existing works usually take the common aid from auto-encoders to establish the self-supervision by the self-reconstruction schema. However, the previous auto-encoders merely focus on the global shapes and do not distinguish the local and global geometric features apart. To address this problem, we present a novel and efficient self-supervised point cloud representation learning framework, named 3D Occlusion Auto-Encoder (3D-OAE), to facilitate the detailed supervision inherited in local regions and global shapes. We propose to randomly occlude some local patches of point clouds and establish the supervision via inpainting the occluded patches using the remaining ones. Specifically, we design an asymmetrical encoder-decoder architecture based on standard Transformer, where the encoder operates only on the visible subset of patches to learn local patterns, and a lightweight decoder is designed to leverage these visible patterns to infer the missing geometries via self-attention. We find that occluding a very high proportion of the input point cloud (e.g. 75%) will still yield a nontrivial self-supervisory performance, which enables us to achieve 3-4 times faster during training but also improve accuracy. Experimental results show that our approach outperforms the state-of-the-art on a diverse range of down-stream discriminative and generative tasks. Code is available at https://github.com/junshengzhou/3D-OAE.
UR - http://www.scopus.com/inward/record.url?scp=85191707715&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85191707715&partnerID=8YFLogxK
U2 - 10.1109/ICRA57147.2024.10610588
DO - 10.1109/ICRA57147.2024.10610588
M3 - Conference contribution
AN - SCOPUS:85191707715
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 15416
EP - 15423
BT - 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 13 May 2024 through 17 May 2024
ER -