TY - GEN
T1 - Breaking the centralized barrier for cross-device federated learning
AU - Karimireddy, Sai Praneeth
AU - Jaggi, Martin
AU - Kale, Satyen
AU - Mohri, Mehryar
AU - Reddi, Sashank J.
AU - Stich, Sebastian U.
AU - Suresh, Ananda Theertha
N1 - Publisher Copyright:
© 2021 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients, which can cause a client drift phenomenon. In fact, designing an algorithm for FL that is uniformly better than simple centralized training has been a major open problem thus far. In this work, we propose a general algorithmic framework, MIME, which i) mitigates client drift and ii) adapts arbitrary centralized optimization algorithms, such as SGD with momentum or Adam, to the cross-device federated learning setting. MIME uses a combination of control variates and server-level optimizer state (e.g., momentum) at every client-update step to ensure that each local update mimics that of the centralized method run on i.i.d. data. We prove a reduction result showing that MIME can translate the convergence of a generic algorithm in the centralized setting into convergence in the federated setting. Moreover, we show that, when combined with momentum-based variance reduction, MIME is provably faster than any centralized method; this is the first such result. We also perform a thorough experimental exploration of MIME's performance on real-world datasets.
AB - Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients, which can cause a client drift phenomenon. In fact, designing an algorithm for FL that is uniformly better than simple centralized training has been a major open problem thus far. In this work, we propose a general algorithmic framework, MIME, which i) mitigates client drift and ii) adapts arbitrary centralized optimization algorithms, such as SGD with momentum or Adam, to the cross-device federated learning setting. MIME uses a combination of control variates and server-level optimizer state (e.g., momentum) at every client-update step to ensure that each local update mimics that of the centralized method run on i.i.d. data. We prove a reduction result showing that MIME can translate the convergence of a generic algorithm in the centralized setting into convergence in the federated setting. Moreover, we show that, when combined with momentum-based variance reduction, MIME is provably faster than any centralized method; this is the first such result. We also perform a thorough experimental exploration of MIME's performance on real-world datasets.
UR - http://www.scopus.com/inward/record.url?scp=85125689042&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125689042&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85125689042
T3 - Advances in Neural Information Processing Systems
SP - 28663
EP - 28676
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural Information Processing Systems Foundation
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Y2 - 6 December 2021 through 14 December 2021
ER -