TY - CONF
T1 - Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data
AU - Esfandiari, Yasaman
AU - Tan, Sin Yong
AU - Jiang, Zhanhong
AU - Balu, Aditya
AU - Herron, Ethan
AU - Hegde, Chinmay
AU - Sarkar, Soumik
N1 - Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021
Y1 - 2021
N2 - Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server. Recently, decentralized learning algorithms have demonstrated state-of-the-art results on benchmark datasets, comparable with centralized algorithms. However, the key assumption for achieving competitive performance is that the data is independently and identically distributed (IID) among the agents, which often does not hold in real-life applications. Inspired by ideas from continual learning, we propose Cross-Gradient Aggregation (CGA), a novel decentralized learning algorithm in which (i) each agent aggregates cross-gradient information, i.e., derivatives of its model with respect to its neighbors' datasets, and (ii) updates its model using a projected gradient obtained via quadratic programming (QP). We theoretically analyze the convergence characteristics of CGA and demonstrate its efficiency on non-IID data distributions sampled from the MNIST and CIFAR-10 datasets. Our empirical comparisons show that CGA outperforms existing state-of-the-art decentralized learning algorithms and that this improved performance is maintained under information compression to reduce peer-to-peer communication overhead. The code is available on GitHub.
UR - http://www.scopus.com/inward/record.url?scp=85161297827&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161297827&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85161297827
T3 - Proceedings of Machine Learning Research
SP - 3036
EP - 3046
BT - Proceedings of the 38th International Conference on Machine Learning, ICML 2021
PB - ML Research Press
T2 - 38th International Conference on Machine Learning, ICML 2021
Y2 - 18 July 2021 through 24 July 2021
ER -