TY - GEN
T1 - VIPose
T2 - 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2021
AU - Ge, Rundong
AU - Loianno, Giuseppe
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Estimating the 6D pose of objects is beneficial for robotics tasks such as transportation, autonomous navigation, manipulation as well as in scenarios beyond robotics like virtual and augmented reality. With respect to single image pose estimation, pose tracking takes into account the temporal information across multiple frames to overcome possible detection inconsistencies and to improve the pose estimation efficiency. In this work, we introduce a novel Deep Neural Network (DNN) called VIPose, that combines inertial and camera data to address the object pose tracking problem in real-time. The key contribution is the design of a novel DNN architecture which fuses visual and inertial features to predict the objects' relative 6D pose between consecutive image frames. The overall 6D pose is then estimated by consecutively combining relative poses. Our approach shows remarkable pose estimation results for heavily occluded objects that are well known to be very challenging to handle by existing state-of-the-art solutions. The effectiveness of the proposed approach is validated on a new dataset called VIYCB with RGB image, IMU data, and accurate 6D pose annotations created by employing an automated labeling technique. The approach presents accuracy performances comparable to state-of-the-art techniques, but with the additional benefit of being real-time.
AB - Estimating the 6D pose of objects is beneficial for robotics tasks such as transportation, autonomous navigation, manipulation as well as in scenarios beyond robotics like virtual and augmented reality. With respect to single image pose estimation, pose tracking takes into account the temporal information across multiple frames to overcome possible detection inconsistencies and to improve the pose estimation efficiency. In this work, we introduce a novel Deep Neural Network (DNN) called VIPose, that combines inertial and camera data to address the object pose tracking problem in real-time. The key contribution is the design of a novel DNN architecture which fuses visual and inertial features to predict the objects' relative 6D pose between consecutive image frames. The overall 6D pose is then estimated by consecutively combining relative poses. Our approach shows remarkable pose estimation results for heavily occluded objects that are well known to be very challenging to handle by existing state-of-the-art solutions. The effectiveness of the proposed approach is validated on a new dataset called VIYCB with RGB image, IMU data, and accurate 6D pose annotations created by employing an automated labeling technique. The approach presents accuracy performances comparable to state-of-the-art techniques, but with the additional benefit of being real-time.
UR - http://www.scopus.com/inward/record.url?scp=85124352335&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124352335&partnerID=8YFLogxK
U2 - 10.1109/IROS51168.2021.9636283
DO - 10.1109/IROS51168.2021.9636283
M3 - Conference contribution
AN - SCOPUS:85124352335
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 4597
EP - 4603
BT - IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 September 2021 through 1 October 2021
ER -