ActFormer: Scalable Collaborative Perception via Active Queries

Suozhi Huang, Juexiao Zhang, Yiming Li, Chen Feng

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Collaborative perception leverages rich visual observations from multiple robots to extend a single robot's perception ability beyond its field of view. Many prior works receive messages broadcast from all collaborators, leading to a scalability challenge when dealing with a large number of robots and sensors. In this work, we aim to address scalable camera-based collaborative perception with a Transformer-based architecture. Our key idea is to enable a single robot to intelligently discern the relevance of the collaborators and their associated cameras according to a learned spatial prior. This proactive understanding of the visual features' relevance does not require the transmission of the features themselves, enhancing both communication and computation efficiency. Specifically, we present ActFormer, a Transformer that learns bird's eye view (BEV) representations by using predefined BEV queries to interact with multi-robot multi-camera inputs. Each BEV query can actively select relevant cameras for information aggregation based on pose information, instead of interacting with all cameras indiscriminately. Experiments on the V2X-Sim dataset demonstrate that ActFormer improves the detection performance from 29.89% to 45.15% in terms of [email protected] with about 50% fewer queries, showcasing the effectiveness of ActFormer in multi-agent collaborative 3D object detection.

Original languageEnglish (US)
Title of host publication2024 IEEE International Conference on Robotics and Automation, ICRA 2024
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages14716-14723
Number of pages8
ISBN (Electronic)9798350384574
DOIs
StatePublished - 2024
Event2024 IEEE International Conference on Robotics and Automation, ICRA 2024 - Yokohama, Japan
Duration: May 13 2024May 17 2024

Publication series

NameProceedings - IEEE International Conference on Robotics and Automation
ISSN (Print)1050-4729

Conference

Conference2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Country/TerritoryJapan
CityYokohama
Period5/13/245/17/24

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'ActFormer: Scalable Collaborative Perception via Active Queries'. Together they form a unique fingerprint.

Cite this