Abstract
Privacy-preserving distributed machine learning has become increasingly important due to the recent rapid growth of data. This paper focuses on a class of regularized empirical risk minimization machine learning problems, and develops two methods to provide differential privacy to distributed learning algorithms over a network. We first decentralize the learning algorithm using the alternating direction method of multipliers (ADMM), and propose the methods of dual variable perturbation and primal variable perturbation to provide dynamic differential privacy. The two mechanisms lead to algorithms that can provide privacy guarantees under mild conditions on the convexity and differentiability of the loss function and the regularizer. We study the performance of the algorithms, and show that dual variable perturbation outperforms its primal counterpart. To design an optimal privacy mechanism, we analyze the fundamental tradeoff between privacy and accuracy, and provide guidelines for choosing privacy parameters. Numerical experiments using a customer information database are performed to corroborate the results on privacy and utility tradeoffs and design.
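The idea of primal variable perturbation mentioned in the abstract can be illustrated with a minimal sketch: a node adds random noise to its local model before sharing it with its neighbors. The abstract does not specify the noise distribution, so the density proportional to exp(-beta * ||v||) used here is an assumption borrowed from common practice in differentially private learning. The function names and the parameter `beta` are hypothetical and not taken from the paper.

```python
import math
import random


def sample_noise(dim, beta, rng):
    """Draw v in R^dim with density proportional to exp(-beta * ||v||_2).

    Assumed noise model for illustration; the abstract does not state it.
    """
    # Direction: a normalized Gaussian vector is uniform on the unit sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in g))
    # Radius: under this density the norm follows Gamma(shape=dim, scale=1/beta).
    radius = rng.gammavariate(dim, 1.0 / beta)
    return [radius * x / norm for x in g]


def perturb_before_sharing(local_model, beta, rng):
    """Return a noisy copy of a node's local model to send to its neighbors."""
    noise = sample_noise(len(local_model), beta, rng)
    return [w + n for w, n in zip(local_model, noise)]


rng = random.Random(0)
local_model = [0.0] * 5  # hypothetical local primal variable
shared = perturb_before_sharing(local_model, beta=2.0, rng=rng)
```

In this sketch a larger `beta` yields less noise, so the shared model is more accurate but reveals more about the local data. This is one concrete form of the privacy and accuracy tradeoff the abstract refers to.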
Original language | English (US) |
---|---|
Pages (from-to) | 172-187 |
Number of pages | 16 |
Journal | IEEE Transactions on Information Forensics and Security |
Volume | 12 |
Issue number | 1 |
State | Published - Jan 2017 |
Keywords
- ADMM
- Machine learning
- differential privacy
- distributed computing
- dynamic programming
- privacy
ASJC Scopus subject areas
- Safety, Risk, Reliability and Quality
- Computer Networks and Communications