TY - GEN
T1 - SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
T2 - 37th International Conference on Machine Learning, ICML 2020
AU - Karimireddy, Sai Praneeth
AU - Kale, Satyen
AU - Mohri, Mehryar
AU - Reddi, Sashank J.
AU - Stich, Sebastian U.
AU - Suresh, Ananda Theertha
N1 - Publisher Copyright:
© 2020 by the Authors.
PY - 2020
Y1 - 2020
AB - Federated Averaging (FEDAVG) has emerged as the algorithm of choice for federated learning due to its simplicity and low communication cost. However, in spite of recent research efforts, its performance is not fully understood. We obtain tight convergence rates for FEDAVG and prove that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow convergence. As a solution, we propose a new algorithm (SCAFFOLD) which uses control variates (variance reduction) to correct for the 'client-drift' in its local updates. We prove that SCAFFOLD requires significantly fewer communication rounds and is not affected by data heterogeneity or client sampling. Further, we show that (for quadratics) SCAFFOLD can take advantage of similarity in the clients' data, yielding even faster convergence. The latter is the first result to quantify the usefulness of local steps in distributed optimization.
UR - http://www.scopus.com/inward/record.url?scp=85100526407&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100526407&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85100526407
T3 - 37th International Conference on Machine Learning, ICML 2020
SP - 5088
EP - 5099
BT - 37th International Conference on Machine Learning, ICML 2020
A2 - Daumé, Hal
A2 - Singh, Aarti
PB - International Machine Learning Society (IMLS)
Y2 - 13 July 2020 through 18 July 2020
ER -