TY - GEN
T1 - A Stackelberg game perspective on the conflict between machine learning and data obfuscation
AU - Pawlick, Jeffrey
AU - Zhu, Quanyan
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2017/1/18
Y1 - 2017/1/18
N2 - Data is the new oil; this refrain is repeated extensively in the age of internet tracking, machine learning, and data analytics. As data collection becomes more personal and pervasive, however, public pressure is mounting for privacy protection. In this atmosphere, developers have created applications to add noise to user attributes visible to tracking algorithms. This creates a strategic interaction between trackers and users when incentives to maintain privacy and improve accuracy are misaligned. In this paper, we conceptualize this conflict through an (N + 1)-player, augmented Stackelberg game. First, a machine learner declares a privacy protection level, and then users respond by choosing their own perturbation amounts. We use the general frameworks of differential privacy and empirical risk minimization to quantify the utility components due to privacy and accuracy, respectively. In equilibrium, each user perturbs her data independently, which leads to a high net loss in accuracy. To remedy this scenario, we show that the learner improves his utility by proactively perturbing the data himself. While other work in this area has studied privacy markets and mechanism design for truthful reporting of user information, we take a different viewpoint by considering both user and learner perturbation.
AB - Data is the new oil; this refrain is repeated extensively in the age of internet tracking, machine learning, and data analytics. As data collection becomes more personal and pervasive, however, public pressure is mounting for privacy protection. In this atmosphere, developers have created applications to add noise to user attributes visible to tracking algorithms. This creates a strategic interaction between trackers and users when incentives to maintain privacy and improve accuracy are misaligned. In this paper, we conceptualize this conflict through an (N + 1)-player, augmented Stackelberg game. First, a machine learner declares a privacy protection level, and then users respond by choosing their own perturbation amounts. We use the general frameworks of differential privacy and empirical risk minimization to quantify the utility components due to privacy and accuracy, respectively. In equilibrium, each user perturbs her data independently, which leads to a high net loss in accuracy. To remedy this scenario, we show that the learner improves his utility by proactively perturbing the data himself. While other work in this area has studied privacy markets and mechanism design for truthful reporting of user information, we take a different viewpoint by considering both user and learner perturbation.
UR - http://www.scopus.com/inward/record.url?scp=85015029986&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015029986&partnerID=8YFLogxK
U2 - 10.1109/WIFS.2016.7823893
DO - 10.1109/WIFS.2016.7823893
M3 - Conference contribution
AN - SCOPUS:85015029986
T3 - 8th IEEE International Workshop on Information Forensics and Security, WIFS 2016
BT - 8th IEEE International Workshop on Information Forensics and Security, WIFS 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th IEEE International Workshop on Information Forensics and Security, WIFS 2016
Y2 - 4 December 2016 through 7 December 2016
ER -