TY - GEN
T1 - Unifying Foundation Models with Quadrotor Control for Visual Tracking beyond Object Categories
AU - Saviolo, Alessandro
AU - Rao, Pratyaksh
AU - Radhakrishnan, Vivek
AU - Xiao, Jiuhong
AU - Loianno, Giuseppe
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Visual control enables quadrotors to adaptively navigate using real-time sensory data, bridging perception with action. Yet, challenges persist, including generalization across scenarios, maintaining reliability, and ensuring real-time responsiveness. This paper introduces a perception framework grounded in foundation models for universal object detection and tracking, moving beyond specific training categories. Integral to our approach is a multi-layered tracker integrated with the foundation detector, ensuring continuous target visibility, even when faced with motion blur, abrupt light shifts, and occlusions. Complementing this, we introduce a model-free controller tailored for resilient quadrotor visual tracking. Our system operates efficiently on limited hardware, relying solely on an onboard camera and an inertial measurement unit. Through extensive validation in diverse challenging indoor and outdoor environments, we demonstrate our system's effectiveness and adaptability. In conclusion, our research represents a step forward in quadrotor visual tracking, moving from task-specific methods to more versatile and adaptable operations.
UR - http://www.scopus.com/inward/record.url?scp=85197028472&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85197028472&partnerID=8YFLogxK
U2 - 10.1109/ICRA57147.2024.10610111
DO - 10.1109/ICRA57147.2024.10610111
M3 - Conference contribution
AN - SCOPUS:85197028472
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 7389
EP - 7396
BT - 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Robotics and Automation, ICRA 2024
Y2 - 13 May 2024 through 17 May 2024
ER -