TY - JOUR

T1 - First- and second-order diffusive methods for rapid, coarse, distributed load balancing

AU - Muthukrishnan, S.

AU - Ghosh, B.

AU - Schultz, M. H.

N1 - Funding Information:
⁄The first author was partly supported by ALCOM-IT and the work was partly done at DIMACS (Center for Discrete Mathematics and Theoretical Computer Science), a National Science Foundation Science and Technology Center under NSF Contract STC-8809648. The major part of this work was done when the second author was at the Department of Computer Science, Yale University with support from ONR under Grant Number 4-91-J-1576 and a Yale/IBM joint study.

PY - 1998

Y1 - 1998

N2 - We consider the following general problem modeling load balancing in a variety of distributed settings. Given an arbitrary undirected connected graph G = (V, E) and a weight distribution w0 on the nodes, determine a schedule to move weights across edges in each step so as to (approximately) balance the weights on the nodes. We focus on diffusive schedules for this problem. All previously studied diffusive schedules can be modeled as wt+1 = Mwt where wt is the weight distribution after t steps and M is a doubly stochastic matrix. We call these the first-order schedules. First-order schedules, although widely used in practice, are often slow. In this paper we introduce a new direction in diffusive schedules by considering schedules that are modeled as: w1 = Mw0; wt+1 = βMwt + (1 - β)wt-1 for some appropriate β; we call these the second-order schedules. In the idealized setting of weights being real numbers, we adopt known results to show that βcan be chosen so that the second-order schedule involves significantly fewer steps than the first-order method for approximate load balancing. In the realistic setting when the weights are positive integers, we simulate the idealized schedules by maintaining I Owe You units on the edges. Extensive experiments with simulated data and real-life data from JOSTLE, a mesh-partitioning software, show that the resultant realistic schedule is close to the idealized schedule, and it again involves fewer steps than the first-order schedules for approximate load balancing. Our main result is therefore a fast algorithm for coarse load balancing that can be used in a variety of applications.

AB - We consider the following general problem modeling load balancing in a variety of distributed settings. Given an arbitrary undirected connected graph G = (V, E) and a weight distribution w0 on the nodes, determine a schedule to move weights across edges in each step so as to (approximately) balance the weights on the nodes. We focus on diffusive schedules for this problem. All previously studied diffusive schedules can be modeled as wt+1 = Mwt where wt is the weight distribution after t steps and M is a doubly stochastic matrix. We call these the first-order schedules. First-order schedules, although widely used in practice, are often slow. In this paper we introduce a new direction in diffusive schedules by considering schedules that are modeled as: w1 = Mw0; wt+1 = βMwt + (1 - β)wt-1 for some appropriate β; we call these the second-order schedules. In the idealized setting of weights being real numbers, we adopt known results to show that βcan be chosen so that the second-order schedule involves significantly fewer steps than the first-order method for approximate load balancing. In the realistic setting when the weights are positive integers, we simulate the idealized schedules by maintaining I Owe You units on the edges. Extensive experiments with simulated data and real-life data from JOSTLE, a mesh-partitioning software, show that the resultant realistic schedule is close to the idealized schedule, and it again involves fewer steps than the first-order schedules for approximate load balancing. Our main result is therefore a fast algorithm for coarse load balancing that can be used in a variety of applications.

UR - http://www.scopus.com/inward/record.url?scp=0032378726&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032378726&partnerID=8YFLogxK

U2 - 10.1007/s002240000092

DO - 10.1007/s002240000092

M3 - Article

AN - SCOPUS:0032378726

VL - 31

SP - 331

EP - 354

JO - Theory of Computing Systems

JF - Theory of Computing Systems

SN - 1432-4350

IS - 4

ER -