Abstract
We propose a steganography-based technique for generating adversarial perturbations that fool deep models on any image. The perturbations are computed in a transform domain, where a single secret image embedded in any target image causes a deep model to misclassify that target image with high probability. The resulting attack is well suited to the black-box setting, as it requires no information about the target model. Moreover, being non-iterative, our perturbation estimation remains computationally efficient. The computed perturbations are imperceptible to humans yet achieve high fooling ratios for models trained on the large-scale ImageNet dataset. We demonstrate successful fooling of ResNet-50, VGG-16, Inception-V3 and MobileNet-V2, achieving up to 89% fooling of these popular classification models.
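To make the idea of a transform-domain, steganographic perturbation concrete, the following is a minimal sketch (not the authors' exact method) of embedding a secret image into the wavelet coefficients of a target image. The wavelet choice (`haar`), the embedding strength `alpha`, and the function name `embed_secret` are illustrative assumptions; the actual attack in the paper may differ in which sub-bands are modified and how the secret is chosen.

```python
# Sketch: wavelet-domain embedding of a secret image into a target image,
# in the spirit of a steganography-based adversarial perturbation.
import numpy as np
import pywt


def embed_secret(target: np.ndarray, secret: np.ndarray, alpha: float = 0.02) -> np.ndarray:
    """Blend the secret's detail coefficients into the target's wavelet domain.

    target, secret: HxWx3 float arrays in [0, 255] of the same size.
    alpha: embedding strength (assumed value, controls imperceptibility).
    """
    h, w, _ = target.shape
    perturbed = np.empty_like(target, dtype=np.float64)
    for c in range(target.shape[2]):  # per colour channel
        cA_t, (cH_t, cV_t, cD_t) = pywt.dwt2(target[..., c], 'haar')
        cA_s, (cH_s, cV_s, cD_s) = pywt.dwt2(secret[..., c], 'haar')
        # Keep the target's approximation band; mix only the detail bands so
        # the change stays largely imperceptible in the spatial domain.
        mixed = (cA_t,
                 (cH_t + alpha * cH_s,
                  cV_t + alpha * cV_s,
                  cD_t + alpha * cD_s))
        rec = pywt.idwt2(mixed, 'haar')
        perturbed[..., c] = rec[:h, :w]  # crop padding for odd-sized inputs
    return np.clip(perturbed, 0.0, 255.0)


# Usage (hypothetical): `adv` is then fed to any classifier without
# querying it, which is what makes the attack black-box and non-iterative.
# adv = embed_secret(target.astype(np.float64), secret.astype(np.float64))
```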
Original language | English (US) |
---|---|
Pages (from-to) | 146-152 |
Number of pages | 7 |
Journal | Pattern Recognition Letters |
Volume | 135 |
DOIs | |
State | Published - Jul 2020 |
Keywords
- Adversarial attack
- Deep neural networks
- Steganography
- Wavelet transform
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence