Pairwise Attention Encoding for Point Cloud Feature Learning

Yunxiao Shi, Haoyu Fang, Jing Zhu, Yi Fang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Learned 3D point signatures have attracted increasing attention in the research community because, compared to hand-crafted ones, they better address challenging issues such as deformation and structural variation in 3D objects. PointNet is a pioneering work that learns 3D point signatures directly, consuming a raw point cloud as input and applying convolution to each point. Ground-breaking as it is, PointNet has limited capability to capture local structure when learning visual features from each individual point. Recent variants of PointNet improve the quality of 3D point signature learning by taking neighbourhood information into account, but typically do so through hard-coded mechanisms (e.g. manually setting 'k' for k-Nearest Neighbour search or the radius 'r' for Ball Query). In this paper, we develop a novel point signature learning approach that considers the pairwise interaction between every two individual points and so moves beyond hard-coded neighbourhood exploitation. This further improves the quality of 3D point signature learning by encouraging the model to be aware of both neighbourhood information and global context. Specifically, we first introduce a novel pairwise reference tensor (PRT) in the original input point space to represent the influence that every two individual points have on each other. Then, by passing the pairwise reference tensor through a multi-layer perceptron (MLP), we obtain a high-dimensional attention tensor that encodes pairwise relationships and acts as an attention mechanism. Next, we fuse the learned point features with the attention weights to obtain global visual features. Our proposed method demonstrates superior performance on various 3D visual recognition tasks (e.g. object classification, part segmentation and scene semantic segmentation).
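
The pipeline described in the abstract (pairwise reference tensor, MLP, attention tensor, fusion with point features) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so the following choices are assumptions made only for illustration: the PRT is taken to be the coordinate difference between every pair of points, the attention weights are normalised with a softmax over the second point index, fusion is an attention-weighted sum of per-point features, and the global feature is obtained by max pooling. All function names, layer sizes and the random weights are hypothetical.

```python
import numpy as np


def pairwise_reference_tensor(points):
    """(N, 3) -> (N, N, 3). Assumed encoding: entry [i, j] is p_i - p_j."""
    return points[:, None, :] - points[None, :, :]


def shared_mlp(x, weights, biases):
    """Apply the same dense layers along the last axis (ReLU between layers)."""
    last = len(weights) - 1
    for k, (w, b) in enumerate(zip(weights, biases)):
        x = x @ w + b
        if k < last:
            x = np.maximum(x, 0.0)
    return x


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def init_params(rng, sizes):
    """Random weights for illustration only; a real model would train these."""
    weights = [rng.normal(scale=0.1, size=(a, b))
               for a, b in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(b) for b in sizes[1:]]
    return weights, biases


def pairwise_attention_features(points, feat_params, attn_params):
    # Per-point features from a shared MLP: (N, C).
    point_feats = shared_mlp(points, *feat_params)
    # Pairwise reference tensor in the input space: (N, N, 3).
    prt = pairwise_reference_tensor(points)
    # High-dimensional attention tensor, normalised over j (assumption): (N, N, C).
    attn = softmax(shared_mlp(prt, *attn_params), axis=1)
    # Fuse: each point aggregates all point features, weighted per channel: (N, C).
    attended = np.einsum("ijc,jc->ic", attn, point_feats)
    # Global feature via max pooling over points (assumption): (C,).
    global_feat = attended.max(axis=0)
    return attended, global_feat, attn


rng = np.random.default_rng(0)
points = rng.normal(size=(16, 3))            # toy cloud of 16 points
feat_params = init_params(rng, [3, 32, 64])  # hypothetical layer sizes
attn_params = init_params(rng, [3, 32, 64])
attended, global_feat, attn = pairwise_attention_features(
    points, feat_params, attn_params)
print(attended.shape, global_feat.shape, attn.shape)
```

Because every point attends to every other point, no 'k' or 'r' has to be chosen by hand, but the attention tensor grows quadratically with the number of points. Under these assumptions the global feature is also unchanged when the input points are reordered.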

Original language: English (US)
Title of host publication: Proceedings - 2019 International Conference on 3D Vision, 3DV 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 135-144
Number of pages: 10
ISBN (Electronic): 9781728131313
State: Published - Sep 2019
Event: 7th International Conference on 3D Vision, 3DV 2019 - Quebec, Canada
Duration: Sep 15 2019 to Sep 18 2019

Publication series

Name: Proceedings - 2019 International Conference on 3D Vision, 3DV 2019

Conference

Conference: 7th International Conference on 3D Vision, 3DV 2019
Country/Territory: Canada
City: Quebec
Period: 9/15/19 to 9/18/19

Keywords

  • 3D Vision
  • Computer Vision
  • Local Structure
  • Shape Classification
  • Shape Segmentation

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Media Technology
  • Modeling and Simulation
