Detecting Backdoor Attacks in Black-Box Neural Networks through Hardware Performance Counters

Manaar Alam, Yue Wang, Michail Maniatakos

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Deep Neural Networks (DNNs) have made significant strides, but their susceptibility to backdoor attacks still remains a concern. Most defenses typically assume access to white-box models or poisoned data, requirements that are often not feasible in practice, especially for proprietary DNNs. Existing defenses in a black-box setting usually rely on confidence scores of DNN's predictions. However, this exposes DNNs to the risk of model stealing attacks, a significant concern for proprietary DNNs. In this paper, we introduce a novel strategy for detecting back-doors, focusing on a more realistic black-box scenario where only hard-label (i.e., without any prediction confidence) query access is available. Our strategy utilizes data flow dynamics in a computational environment during DNN inference to identify potential backdoor inputs and is agnostic of trigger types or their locations in the input. We observe that a clean image and its corresponding backdoor counterpart with a trigger induce distinct patterns across various microarchitectural activities during the inference phase. We exploit these variations captured by Hardware Performance Counters (HPCs) and use principles of the Gaussian Mixture Model to detect backdoor inputs. To the best of our knowledge, this is the first work that utilizes HPCs for detecting backdoors in DNNs. Extensive evaluation considering a range of benchmark datasets, DNN architectures, and trigger patterns shows the efficacy of the proposed method in distinguishing between clean and backdoor inputs using HPCs.

Original languageEnglish (US)
Title of host publication2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9798350348590
StatePublished - 2024
Event2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024 - Valencia, Spain
Duration: Mar 25 2024Mar 27 2024

Publication series

NameProceedings -Design, Automation and Test in Europe, DATE
ISSN (Print)1530-1591

Conference

Conference2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024
Country/TerritorySpain
CityValencia
Period3/25/243/27/24

Keywords

  • Backdoor Attacks
  • Hardware Performance Counters
  • Neural Networks
  • Side-Channel Information

ASJC Scopus subject areas

  • General Engineering

Fingerprint

Dive into the research topics of 'Detecting Backdoor Attacks in Black-Box Neural Networks through Hardware Performance Counters'. Together they form a unique fingerprint.

Cite this