TY - GEN
T1 - CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot
T2 - 32nd USENIX Security Symposium, USENIX Security 2023
AU - Niu, Liang
AU - Mirza, Shujaat
AU - Maradni, Zayd
AU - Pöpper, Christina
N1 - Publisher Copyright:
© USENIX Security 2023. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Code generation language models are trained on billions of lines of source code to provide code generation and auto-completion features, such as those offered by the code assistant GitHub Copilot, which has more than a million users. These datasets may contain sensitive personal information, whether personally identifiable, private, or secret, that these models may regurgitate. This paper introduces and evaluates a semi-automated pipeline for extracting sensitive personal information from the Codex model used in GitHub Copilot. We employ carefully designed templates to construct prompts that are more likely to result in privacy leaks. To overcome the challenge of non-public training data, we propose a semi-automated filtering method using a blind membership inference attack. We validate the effectiveness of our membership inference approach on different code generation models. We use the hit rate from the GitHub Search API as a distinguishing heuristic, followed by human-in-the-loop evaluation, and find that approximately 8% (43) of the prompts yield privacy leaks. Notably, we observe that the model tends to produce indirect leaks, compromising privacy as contextual integrity by generating information about individuals closely related to the queried subject in the training corpus.
AB - Code generation language models are trained on billions of lines of source code to provide code generation and auto-completion features, such as those offered by the code assistant GitHub Copilot, which has more than a million users. These datasets may contain sensitive personal information, whether personally identifiable, private, or secret, that these models may regurgitate. This paper introduces and evaluates a semi-automated pipeline for extracting sensitive personal information from the Codex model used in GitHub Copilot. We employ carefully designed templates to construct prompts that are more likely to result in privacy leaks. To overcome the challenge of non-public training data, we propose a semi-automated filtering method using a blind membership inference attack. We validate the effectiveness of our membership inference approach on different code generation models. We use the hit rate from the GitHub Search API as a distinguishing heuristic, followed by human-in-the-loop evaluation, and find that approximately 8% (43) of the prompts yield privacy leaks. Notably, we observe that the model tends to produce indirect leaks, compromising privacy as contextual integrity by generating information about individuals closely related to the queried subject in the training corpus.
UR - http://www.scopus.com/inward/record.url?scp=85174560822&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174560822&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85174560822
T3 - 32nd USENIX Security Symposium, USENIX Security 2023
SP - 2133
EP - 2150
BT - 32nd USENIX Security Symposium, USENIX Security 2023
PB - USENIX Association
Y2 - 9 August 2023 through 11 August 2023
ER -