TY - JOUR
T1 - Learning long-range vision for autonomous off-road driving
AU - Hadsell, Raia
AU - Sermanet, Pierre
AU - Ben, Jan
AU - Erkan, Ayse
AU - Scoffier, Marco
AU - Kavukcuoglu, Koray
AU - Muller, Urs
AU - LeCun, Yann
PY - 2009
AB - Most vision-based approaches to mobile robotics suffer from the limitations imposed by stereo obstacle detection, which is short range and prone to failure. We present a self-supervised learning process for long-range vision that is able to accurately classify complex terrain at distances up to the horizon, thus allowing superior strategic planning. The success of the learning process is due to the self-supervised training data that are generated on every frame: robust, visually consistent labels from a stereo module; normalized wide-context input windows; and a discriminative and concise feature representation. A deep hierarchical network is trained to extract informative and meaningful features from an input image, and the features are used to train a real-time classifier to predict traversability. The trained classifier sees obstacles and paths from 5 to more than 100 m, far beyond the maximum stereo range of 12 m, and adapts very quickly to new environments. The process was developed and tested on the LAGR (Learning Applied to Ground Robots) mobile robot. Results from a ground truth data set, as well as field test results, are given.
UR - http://www.scopus.com/inward/record.url?scp=67649219352&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67649219352&partnerID=8YFLogxK
DO - 10.1002/rob.20276
M3 - Article
AN - SCOPUS:67649219352
SN - 1556-4959
VL - 26
SP - 120
EP - 144
JF - Journal of Field Robotics
IS - 2
ER -