Unifying Foundation Models with Quadrotor Control for Visual Tracking beyond Object Categories

Alessandro Saviolo, Pratyaksh Rao, Vivek Radhakrishnan, Jiuhong Xiao, Giuseppe Loianno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Visual control enables quadrotors to adaptively navigate using real-time sensory data, bridging perception with action. Yet, challenges persist, including generalization across scenarios, maintaining reliability, and ensuring real-time responsiveness. This paper introduces a perception framework grounded in foundation models for universal object detection and tracking, moving beyond specific training categories. Integral to our approach is a multi-layered tracker integrated with the foundation detector, ensuring continuous target visibility, even when faced with motion blur, abrupt light shifts, and occlusions. Complementing this, we introduce a model-free controller tailored for resilient quadrotor visual tracking. Our system operates efficiently on limited hardware, relying solely on an onboard camera and an inertial measurement unit. Through extensive validation in diverse challenging indoor and outdoor environments, we demonstrate our system's effectiveness and adaptability. In conclusion, our research represents a step forward in quadrotor visual tracking, moving from task-specific methods to more versatile and adaptable operations.

Original language: English (US)
Title of host publication: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7389-7396
Number of pages: 8
ISBN (Electronic): 9798350384574
State: Published - 2024
Event: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 - Yokohama, Japan
Duration: May 13 2024 - May 17 2024

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (Print): 1050-4729

Conference

Conference: 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Country/Territory: Japan
City: Yokohama
Period: 5/13/24 - 5/17/24

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Artificial Intelligence

