Examining Zero-Shot Vulnerability Repair with Large Language Models

Hammond Pearce, Benjamin Tan, Baleegh Ahmad, Ramesh Karri, Brendan Dolan-Gavitt

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Human developers can produce code with cybersecurity bugs. Can emerging 'smart' code completion tools help repair those bugs? In this work, we examine the use of large language models (LLMs) for code (such as OpenAI's Codex and AI21's Jurassic J-1) for zero-shot vulnerability repair. We investigate challenges in the design of prompts that coax LLMs into generating repaired versions of insecure code. This is difficult due to the numerous ways to phrase key information - both semantically and syntactically - with natural languages. We perform a large scale study of five commercially available, black-box, "off-the-shelf"LLMs, as well as an open-source model and our own locally-trained model, on a mix of synthetic, hand-crafted, and real-world security bug scenarios. Our experiments demonstrate that while the approach has promise (the LLMs could collectively repair 100% of our synthetically generated and hand-crafted scenarios), a qualitative evaluation of the model's performance over a corpus of historical real-world examples highlights challenges in generating functionally correct code.

Original languageEnglish (US)
Title of host publicationProceedings - 44th IEEE Symposium on Security and Privacy, SP 2023
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages2339-2356
Number of pages18
ISBN (Electronic)9781665493369
DOIs
StatePublished - 2023
Event44th IEEE Symposium on Security and Privacy, SP 2023 - Hybrid, San Francisco, United States
Duration: May 22 2023May 25 2023

Publication series

NameProceedings - IEEE Symposium on Security and Privacy
Volume2023-May
ISSN (Print)1081-6011

Conference

Conference44th IEEE Symposium on Security and Privacy, SP 2023
Country/TerritoryUnited States
CityHybrid, San Francisco
Period5/22/235/25/23

Keywords

  • AI
  • code generation
  • CWE
  • Cybersecurity

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Software
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Examining Zero-Shot Vulnerability Repair with Large Language Models'. Together they form a unique fingerprint.

Cite this