TY - GEN
T1 - Communication-efficient agnostic federated averaging
AU - Ro, Jae
AU - Chen, Mingqing
AU - Mathews, Rajiv
AU - Mohri, Mehryar
AU - Suresh, Ananda Theertha
N1 - Publisher Copyright:
Copyright © 2021 ISCA.
PY - 2021
Y1 - 2021
N2 - In distributed learning settings such as federated learning, the training algorithm can potentially be biased towards different clients. To overcome this bias, [1] proposed a domain-agnostic learning algorithm in which the model is optimized for any target distribution formed by a mixture of the client distributions. They further proposed an algorithm for the cross-silo federated learning setting, where the number of clients is small. We consider this problem in the cross-device setting, where the number of clients is much larger. We propose a communication-efficient distributed algorithm called AGNOSTIC FEDERATED AVERAGING (or AGNOSTICFEDAVG) to minimize the domain-agnostic objective proposed in [1], which is amenable to other private mechanisms such as secure aggregation. We highlight two types of naturally occurring domains in federated learning and argue that AGNOSTICFEDAVG performs well on both. To demonstrate the practical effectiveness of AGNOSTICFEDAVG, we report positive results for large-scale language modeling tasks in both simulation and live experiments, where the latter involves training language models for a Spanish virtual keyboard on millions of user devices.
AB - In distributed learning settings such as federated learning, the training algorithm can potentially be biased towards different clients. To overcome this bias, [1] proposed a domain-agnostic learning algorithm in which the model is optimized for any target distribution formed by a mixture of the client distributions. They further proposed an algorithm for the cross-silo federated learning setting, where the number of clients is small. We consider this problem in the cross-device setting, where the number of clients is much larger. We propose a communication-efficient distributed algorithm called AGNOSTIC FEDERATED AVERAGING (or AGNOSTICFEDAVG) to minimize the domain-agnostic objective proposed in [1], which is amenable to other private mechanisms such as secure aggregation. We highlight two types of naturally occurring domains in federated learning and argue that AGNOSTICFEDAVG performs well on both. To demonstrate the practical effectiveness of AGNOSTICFEDAVG, we report positive results for large-scale language modeling tasks in both simulation and live experiments, where the latter involves training language models for a Spanish virtual keyboard on millions of user devices.
UR - http://www.scopus.com/inward/record.url?scp=85119186924&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119186924&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2021-153
DO - 10.21437/Interspeech.2021-153
M3 - Conference contribution
AN - SCOPUS:85119186924
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 1753
EP - 1757
BT - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
PB - International Speech Communication Association
T2 - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Y2 - 30 August 2021 through 3 September 2021
ER -