TY - GEN
T1 - VisualBackProp: Efficient visualization of CNNs for autonomous driving
T2 - 2018 IEEE International Conference on Robotics and Automation, ICRA 2018
AU - Bojarski, Mariusz
AU - Choromanska, Anna
AU - Choromanski, Krzysztof
AU - Firner, Bernhard
AU - Jackel, Larry D.
AU - Muller, Urs
AU - Yeres, Phil
AU - Zieba, Karol
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/9/10
Y1 - 2018/9/10
N2 - This paper proposes a new method, which we call VisualBackProp, for visualizing which sets of pixels of the input image contribute most to the predictions made by a convolutional neural network (CNN). The method hinges on the intuition that, moving deeper into the network, the feature maps contain progressively less information that is irrelevant to the prediction decision. The technique we propose is designed for CNN-based systems that steer self-driving cars and is therefore required to run in real time. This makes the proposed visualization method a valuable debugging tool that can easily be used during both training and inference. We justify our approach with theoretical arguments and confirm that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction. We use the proposed visualization tool in the NVIDIA neural-network-based end-to-end learning system for autonomous driving, known as PilotNet. We demonstrate that VisualBackProp determines which elements in the road image most influence PilotNet's steering decision and indeed captures relevant objects on the road. The empirical evaluation further shows the plausibility of the proposed approach on public road video data as well as in other applications, and reveals that it compares favorably to the layer-wise relevance propagation approach, i.e. it obtains similar visualization results while achieving order-of-magnitude speed-ups.
AB - This paper proposes a new method, which we call VisualBackProp, for visualizing which sets of pixels of the input image contribute most to the predictions made by a convolutional neural network (CNN). The method hinges on the intuition that, moving deeper into the network, the feature maps contain progressively less information that is irrelevant to the prediction decision. The technique we propose is designed for CNN-based systems that steer self-driving cars and is therefore required to run in real time. This makes the proposed visualization method a valuable debugging tool that can easily be used during both training and inference. We justify our approach with theoretical arguments and confirm that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction. We use the proposed visualization tool in the NVIDIA neural-network-based end-to-end learning system for autonomous driving, known as PilotNet. We demonstrate that VisualBackProp determines which elements in the road image most influence PilotNet's steering decision and indeed captures relevant objects on the road. The empirical evaluation further shows the plausibility of the proposed approach on public road video data as well as in other applications, and reveals that it compares favorably to the layer-wise relevance propagation approach, i.e. it obtains similar visualization results while achieving order-of-magnitude speed-ups.
UR - http://www.scopus.com/inward/record.url?scp=85063157771&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063157771&partnerID=8YFLogxK
U2 - 10.1109/ICRA.2018.8461053
DO - 10.1109/ICRA.2018.8461053
M3 - Conference contribution
AN - SCOPUS:85063157771
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 4701
EP - 4708
BT - 2018 IEEE International Conference on Robotics and Automation, ICRA 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 21 May 2018 through 25 May 2018
ER -