A Feature-Based On-Line Detector to Remove Adversarial-Backdoors by Iterative Demarcation

Hao Fu, Akshaj Kumar Veldanda, Prashanth Krishnamurthy, Siddharth Garg, Farshad Khorrami

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes a novel feature-based on-line detection strategy for backdoor attacks, Removing Adversarial-Backdoors by Iterative Demarcation (RAID). The method comprises two parts: off-line training and on-line retraining. In off-line training, a novelty detector and a shallow neural network are trained on clean validation data. During on-line operation, both models attempt to detect samples in the streaming data that differ from the validation data; that is, they flag likely-poisoned samples and possibly a few clean samples as false positives. An anomaly detector then purifies the flagged data by removing the clean samples. A binary support vector machine (SVM) is trained on the purified anomalous data and the clean validation data, and RAID uses this SVM to detect poisoned inputs. As new anomalous data are detected, the SVM is updated in real time to increase its accuracy. It is shown that, with updating, RAID can efficiently reduce the attack success rate while maintaining classification accuracy against various types of backdoor attacks. The efficacy of RAID is compared against several state-of-the-art techniques. Additionally, it is shown that RAID requires only a small clean validation dataset to achieve this performance and therefore provides a practical and efficient approach.
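The flag, purify, and retrain loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name below (`DistanceNoveltyDetector`, `purify`, `train_linear_svm`, `run_online`) is hypothetical, the shallow neural network is omitted, the novelty detector is replaced by a simple distance threshold around the clean centroid, the anomaly detector by a median-radius filter that assumes clean false positives are a minority of the flagged samples, and the SVM by a linear model trained with subgradient descent on the hinge loss. The 2-D "features" are synthetic toy data.

```python
import random
from statistics import median

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class DistanceNoveltyDetector:
    """Stand-in novelty detector, fitted on clean validation features only."""
    def __init__(self, clean, quantile=0.95):
        self.center = [sum(col) / len(clean) for col in zip(*clean)]
        d = sorted(dist(x, self.center) for x in clean)
        self.threshold = d[int(quantile * (len(d) - 1))]

    def is_novel(self, x):
        return dist(x, self.center) > self.threshold

def purify(flagged, keep_radius=3.0):
    """Stand-in anomaly detector: keep flagged samples near the flagged median,
    assuming clean false positives are a minority and lie elsewhere."""
    med = [median(col) for col in zip(*flagged)]
    return [x for x in flagged if dist(x, med) <= keep_radius]

def train_linear_svm(poisoned, clean, epochs=100, lr=0.01, lam=0.01, seed=0):
    """Linear SVM via subgradient descent on the regularized hinge loss.
    Labels: +1 = poisoned, -1 = clean."""
    rng = random.Random(seed)
    data = [(x, 1.0) for x in poisoned] + [(x, -1.0) for x in clean]
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]  # L2 shrinkage
            if margin < 1:  # hinge loss is active for this sample
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def svm_says_poisoned(model, x):
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0

def run_online(clean_val, stream_batches):
    """Screen each batch with the current SVM, then flag, purify, and retrain."""
    detector = DistanceNoveltyDetector(clean_val)
    flagged, model, rejected = [], None, 0
    for batch in stream_batches:
        if model is not None:  # the current SVM screens incoming inputs
            rejected += sum(svm_says_poisoned(model, x) for x in batch)
        flagged += [x for x in batch if detector.is_novel(x)]
        purified = purify(flagged) if flagged else []
        if purified:  # update the SVM as new anomalous data accumulate
            model = train_linear_svm(purified, clean_val)
    return model, rejected

# Toy 2-D "features": clean samples near the origin, poisoned near (6, 6).
rng = random.Random(42)

def sample(cx, cy, n):
    return [[cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)] for _ in range(n)]

clean_val = sample(0, 0, 100)
batches = [sample(0, 0, 20) + sample(6, 6, 10) for _ in range(4)]
model, rejected = run_online(clean_val, batches)
print(svm_says_poisoned(model, [6.0, 6.0]), svm_says_poisoned(model, [0.0, 0.0]))
```

In this sketch the SVM is retrained from scratch after every batch, which keeps the code short; an incremental update would be the natural choice for a real streaming setting. The toy clusters are deliberately well separated, so the example says nothing about detection performance on real backdoor attacks.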

Original language: English (US)
Pages (from-to): 5545-5558
Number of pages: 14
Journal: IEEE Access
Volume: 10
State: Published - 2022

Keywords

  • Machine learning
  • Neural networks
  • Pattern analysis

ASJC Scopus subject areas

  • Computer Science (all)
  • Materials Science (all)
  • Engineering (all)
