Gradient coding: Avoiding stragglers in distributed learning

Rashish Tandon, Qi Lei, Alexandros G. Dimakis, Nikos Karampatziakis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a novel coding theoretic framework for mitigating stragglers in distributed learning. We show how carefully replicating data blocks and coding across gradients can provide tolerance to failures and stragglers for synchronous Gradient Descent. We implement our schemes in Python (using MPI) to run on Amazon EC2, and show how we compare against baseline approaches in running time and generalization error.
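To illustrate the idea of coding across gradients, the following is a minimal NumPy sketch of a 3-worker scheme that tolerates any single straggler: each worker holds two of three data blocks and returns one coded combination of its partial gradients, and the master recovers the full gradient from any two responses. This is a self-contained toy example, not the paper's MPI implementation; the block sizes, loss, and helper names are illustrative assumptions.

```python
import numpy as np

# Toy setup: least-squares gradients over 3 data blocks (illustrative sizes).
rng = np.random.default_rng(0)
d = 5
X = [rng.standard_normal((10, d)) for _ in range(3)]  # 3 data blocks
y = [rng.standard_normal(10) for _ in range(3)]
w = rng.standard_normal(d)                            # current model

def partial_grad(i):
    """Gradient of 0.5 * ||X_i w - y_i||^2 on block i."""
    return X[i].T @ (X[i] @ w - y[i])

g = [partial_grad(i) for i in range(3)]

# Each worker sends one coded vector built from its two blocks.
msgs = {
    0: 0.5 * g[0] + g[1],   # worker 0 holds blocks {0, 1}
    1: g[1] - g[2],         # worker 1 holds blocks {1, 2}
    2: 0.5 * g[0] + g[2],   # worker 2 holds blocks {0, 2}
}

# Decoding coefficients: for every pair of non-straggling workers,
# a linear combination of their messages equals g0 + g1 + g2.
decode = {
    frozenset({0, 1}): {0: 2.0, 1: -1.0},
    frozenset({1, 2}): {1: 1.0, 2: 2.0},
    frozenset({0, 2}): {0: 1.0, 2: 1.0},
}

full = g[0] + g[1] + g[2]
for survivors, coeffs in decode.items():
    recovered = sum(c * msgs[i] for i, c in coeffs.items())
    assert np.allclose(recovered, full)  # exact recovery despite 1 straggler
print("full gradient recovered from any 2 of 3 workers")
```

Each block is replicated on two workers, so one slow or failed worker never removes the only copy of a partial gradient; the decoding table simply picks the right linear combination for whichever pair responds first.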

Original language: English (US)
Title of host publication: 34th International Conference on Machine Learning, ICML 2017
Publisher: International Machine Learning Society (IMLS)
Pages: 5166-5178
Number of pages: 13
ISBN (Electronic): 9781510855144
State: Published - 2017
Event: 34th International Conference on Machine Learning, ICML 2017 - Sydney, Australia
Duration: Aug 6 2017 - Aug 11 2017

Publication series

Name: 34th International Conference on Machine Learning, ICML 2017
Volume: 7

Other

Other: 34th International Conference on Machine Learning, ICML 2017
Country/Territory: Australia
City: Sydney
Period: 8/6/17 - 8/11/17

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software
