On Hardware Security Bug Code Fixes By Prompting Large Language Models

Baleegh Ahmad, Shailja Thakur, Benjamin Tan, Ramesh Karri, Hammond Pearce

Research output: Contribution to journal › Article › peer-review


Novel AI-based code-writing Large Language Models (LLMs) such as OpenAI’s Codex have demonstrated capabilities in many coding-adjacent domains. In this work, we consider how LLMs may be leveraged to automatically repair identified security-relevant bugs present in hardware designs by generating replacement code. We focus on bug repair in code written in Verilog. For this study, we curate a corpus of domain-representative hardware security bugs. We then design and implement a framework to quantitatively evaluate the performance of any LLM tasked with fixing the specified bugs. The framework supports design space exploration of prompts (i.e., prompt engineering) and identifying the best parameters for the LLM. We show that an ensemble of LLMs can repair all fifteen of our benchmarks. This ensemble outperforms a state-of-the-art automated hardware bug repair tool on its own suite of bugs. These results show that LLMs have the ability to repair hardware security bugs and the framework is an important step towards the ultimate goal of an automated end-to-end bug repair tool.

Original language: English (US)
Pages (from-to): 1
Number of pages: 1
Journal: IEEE Transactions on Information Forensics and Security
State: Published - 2024

Keywords

  • Bug Repair
  • Codes
  • Computer bugs
  • Hardware
  • Hardware Security
  • Large Language Models
  • Maintenance engineering
  • Registers
  • Security
  • Software

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Computer Networks and Communications

