TY - JOUR
T1 - A Feature-Based On-Line Detector to Remove Adversarial-Backdoors by Iterative Demarcation
AU - Fu, Hao
AU - Veldanda, Akshaj Kumar
AU - Krishnamurthy, Prashanth
AU - Garg, Siddharth
AU - Khorrami, Farshad
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2022
Y1 - 2022
N2 - This paper proposes a novel feature-based on-line detection strategy, Removing Adversarial-Backdoors by Iterative Demarcation (RAID), for backdoor attacks. The proposed method is comprised of two parts: off-line training and on-line retraining. In the off-line training, a novelty detector and a shallow neural network are trained with clean validation data. During the on-line implementation, both models attempt to detect samples from the streaming data that differ from the validation data (i.e., flag likely-poisoned samples and possibly a few clean samples as false positives). An anomaly detector is used to purify the anomalous data by removing the clean samples. A binary support vector machine (SVM) is trained with the purified anomalous data and the clean validation data. RAID uses the SVM to detect poisoned inputs. To increase the accuracy as new anomalous data is being detected, the SVM is updated as well in real-time. It is shown that with updating, RAID can efficiently reduce the attack success rate while maintaining the classification accuracy against various types of backdoor attacks. The efficacy of RAID is compared against several state-of-the-art techniques. Additionally, it is shown that RAID only requires a small clean validation dataset to achieve such performance, and therefore provides a practical and efficient approach.
KW - Machine learning
KW - Neural networks
KW - Pattern analysis
UR - http://www.scopus.com/inward/record.url?scp=85122890161&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85122890161&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2022.3141077
DO - 10.1109/ACCESS.2022.3141077
M3 - Article
AN - SCOPUS:85122890161
SN - 2169-3536
VL - 10
SP - 5545
EP - 5558
JO - IEEE Access
JF - IEEE Access
ER -