TY - GEN
T1 - Pyramid Learnable Tokens for 3D LiDAR Place Recognition
AU - Wen, Congcong
AU - Huang, Hao
AU - Liu, Yu Shen
AU - Fang, Yi
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - 3D LiDAR place recognition plays a vital role in various robot applications' including robotic navigation, autonomous driving, and simultaneous localization and mapping. However, most previous studies evaluated their models on accumulated 2D scans instead of real-world 3D LiDAR scans with a larger number of points, which limits the application in real scenarios. To address this limitation, we propose a point transformer network with pyramid learnable tokens (PTNet-PLT) to learn global descriptors for an actual scanned 3D LiDAR place recognition. Specifically, we first present a novel shifted cube attention module that consists of a self-attention module for local feature extraction and a cross-attention module for regional feature aggregation. The self-attention module constrains attention computation on a locally partitioned cube and builds connections across cubes based on the shifted cube scheme. In addition, the cross-attention module introduces several learnable tokens to separately aggregate features of points with similar features but spatially distant into an arbitrarily shaped region, which enables the model to capture long-term dependencies of the points. Next, we build a pyramid architecture network to learn multi-scale features and involve a decreasing number of tokens at each layer to aggregate features over a larger region. Finally, we obtain the global descriptor by concatenating learned region tokens of all layers. Experiments on three datasets, including USyd Campus, Oxford Robot-Car, and KITTI, demonstrate the effectiveness and generalization of the proposed model for large-scale 3D LiDAR place recognition.
AB - 3D LiDAR place recognition plays a vital role in various robot applications' including robotic navigation, autonomous driving, and simultaneous localization and mapping. However, most previous studies evaluated their models on accumulated 2D scans instead of real-world 3D LiDAR scans with a larger number of points, which limits the application in real scenarios. To address this limitation, we propose a point transformer network with pyramid learnable tokens (PTNet-PLT) to learn global descriptors for an actual scanned 3D LiDAR place recognition. Specifically, we first present a novel shifted cube attention module that consists of a self-attention module for local feature extraction and a cross-attention module for regional feature aggregation. The self-attention module constrains attention computation on a locally partitioned cube and builds connections across cubes based on the shifted cube scheme. In addition, the cross-attention module introduces several learnable tokens to separately aggregate features of points with similar features but spatially distant into an arbitrarily shaped region, which enables the model to capture long-term dependencies of the points. Next, we build a pyramid architecture network to learn multi-scale features and involve a decreasing number of tokens at each layer to aggregate features over a larger region. Finally, we obtain the global descriptor by concatenating learned region tokens of all layers. Experiments on three datasets, including USyd Campus, Oxford Robot-Car, and KITTI, demonstrate the effectiveness and generalization of the proposed model for large-scale 3D LiDAR place recognition.
UR - http://www.scopus.com/inward/record.url?scp=85168710446&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85168710446&partnerID=8YFLogxK
U2 - 10.1109/ICRA48891.2023.10161523
DO - 10.1109/ICRA48891.2023.10161523
M3 - Conference contribution
AN - SCOPUS:85168710446
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 4143
EP - 4149
BT - Proceedings - ICRA 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE International Conference on Robotics and Automation, ICRA 2023
Y2 - 29 May 2023 through 2 June 2023
ER -