TY - GEN
T1 - Pairwise Attention Encoding for Point Cloud Feature Learning
AU - Shi, Yunxiao
AU - Fang, Haoyu
AU - Zhu, Jing
AU - Fang, Yi
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - Compared to hand-crafted descriptors, learned 3D point signatures have attracted increasing attention in the research community as a way to better address challenging issues such as deformation and structural variation in 3D objects. PointNet is a pioneering work that learns 3D point signatures directly by consuming raw point clouds as input and applying shared convolutions to each individual point. Ground-breaking as it is, PointNet has limited capability in capturing local structure when learning visual features from each individual point. Recent variants of PointNet improve the quality of 3D point signature learning by taking neighbourhood information into account, but typically do so through hard-coded mechanisms (e.g. manually setting 'k' for k-Nearest Neighbour search, radius 'r' for Ball Query, etc.). In this paper, we develop a novel point signature learning approach that considers the pairwise interaction between every two individual points, moving beyond hard-coded neighbourhood exploitation and further improving the quality of 3D point signature learning by encouraging the model to be aware of both neighbourhood information and global context. Specifically, we first introduce a novel pairwise reference tensor (PRT) in the original input point space to represent the influence that every two individual points have on each other. Then, by passing the pairwise reference tensor through a multi-layer perceptron (MLP), we obtain a high-dimensional attention tensor that encodes pairwise relationships and acts as an attention mechanism. Next, we fuse the learned point features with the attention weights to obtain global visual features. Our proposed method demonstrates superior performance on various 3D visual recognition tasks (e.g. object classification, part segmentation and scene semantic segmentation).
AB - Compared to hand-crafted descriptors, learned 3D point signatures have attracted increasing attention in the research community as a way to better address challenging issues such as deformation and structural variation in 3D objects. PointNet is a pioneering work that learns 3D point signatures directly by consuming raw point clouds as input and applying shared convolutions to each individual point. Ground-breaking as it is, PointNet has limited capability in capturing local structure when learning visual features from each individual point. Recent variants of PointNet improve the quality of 3D point signature learning by taking neighbourhood information into account, but typically do so through hard-coded mechanisms (e.g. manually setting 'k' for k-Nearest Neighbour search, radius 'r' for Ball Query, etc.). In this paper, we develop a novel point signature learning approach that considers the pairwise interaction between every two individual points, moving beyond hard-coded neighbourhood exploitation and further improving the quality of 3D point signature learning by encouraging the model to be aware of both neighbourhood information and global context. Specifically, we first introduce a novel pairwise reference tensor (PRT) in the original input point space to represent the influence that every two individual points have on each other. Then, by passing the pairwise reference tensor through a multi-layer perceptron (MLP), we obtain a high-dimensional attention tensor that encodes pairwise relationships and acts as an attention mechanism. Next, we fuse the learned point features with the attention weights to obtain global visual features. Our proposed method demonstrates superior performance on various 3D visual recognition tasks (e.g. object classification, part segmentation and scene semantic segmentation).
KW - 3D Vision
KW - Computer Vision
KW - Local Structure
KW - Shape Classification
KW - Shape Segmentation
UR - http://www.scopus.com/inward/record.url?scp=85075033034&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075033034&partnerID=8YFLogxK
U2 - 10.1109/3DV.2019.00024
DO - 10.1109/3DV.2019.00024
M3 - Conference contribution
T3 - Proceedings - 2019 International Conference on 3D Vision, 3DV 2019
SP - 135
EP - 144
BT - Proceedings - 2019 International Conference on 3D Vision, 3DV 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Conference on 3D Vision, 3DV 2019
Y2 - 15 September 2019 through 18 September 2019
ER -