TY - GEN
T1 - On the detection of adversarial attacks against deep neural networks
AU - Wang, Weiyu
AU - Zhu, Quanyan
N1 - Publisher Copyright:
© 2017 Copyright is held by the owner/author(s).
PY - 2017/11/3
Y1 - 2017/11/3
N2 - Deep learning models have been widely studied and proven to achieve high accuracy in various pattern recognition tasks, especially image recognition. However, due to their non-linear architecture and high-dimensional inputs, their ill-posedness [1] with respect to adversarial perturbations - small, deliberately crafted perturbations of the input that lead to completely different outputs - has also attracted researchers' attention. This work takes the traffic sign recognition system on a self-driving car as an example and aims to design an additional mechanism that improves the robustness of the recognition system. It uses a machine learning model that learns from the deep learning model's predictions, with human feedback as labels, and provides the credibility of the current prediction. The mechanism uses both the input image and the recognition result as the sample space, queries a human user about the correctness (True/False) of the current classification result as few times as possible, and thereby detects adversarial attacks.
AB - Deep learning models have been widely studied and proven to achieve high accuracy in various pattern recognition tasks, especially image recognition. However, due to their non-linear architecture and high-dimensional inputs, their ill-posedness [1] with respect to adversarial perturbations - small, deliberately crafted perturbations of the input that lead to completely different outputs - has also attracted researchers' attention. This work takes the traffic sign recognition system on a self-driving car as an example and aims to design an additional mechanism that improves the robustness of the recognition system. It uses a machine learning model that learns from the deep learning model's predictions, with human feedback as labels, and provides the credibility of the current prediction. The mechanism uses both the input image and the recognition result as the sample space, queries a human user about the correctness (True/False) of the current classification result as few times as possible, and thereby detects adversarial attacks.
KW - Active Learning
KW - Adversarial Machine Learning
KW - Deep Neural Network
KW - Machine Learning Security
KW - Support Vector Machine
UR - http://www.scopus.com/inward/record.url?scp=85037084091&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85037084091&partnerID=8YFLogxK
U2 - 10.1145/3140368.3140373
DO - 10.1145/3140368.3140373
M3 - Conference contribution
AN - SCOPUS:85037084091
T3 - SafeConfig 2017 - Proceedings of the 2017 Workshop on Automated Decision Making for Active Cyber Defense, co-located with CCS 2017
SP - 27
EP - 30
BT - SafeConfig 2017 - Proceedings of the 2017 Workshop on Automated Decision Making for Active Cyber Defense, co-located with CCS 2017
PB - Association for Computing Machinery, Inc
T2 - 10th Workshop on Automated Decision Making for Active Cyber Defense, SafeConfig 2017
Y2 - 3 November 2017
ER -