TY - GEN
T1 - Dynamic load balancing in parallel and distributed networks by random matchings (extended abstract)
AU - Ghosh, Bhaskar
AU - Muthukrishnan, S.
N1 - Publisher Copyright: © 1994 ACM.
PY - 1994/8/1
Y1 - 1994/8/1
N2 - The fundamental problems in dynamic load balancing and job scheduling in parallel and distributed computers involve moving load between processors. In this paper, we consider a new model for load movement in synchronous parallel and distributed machines. In each step of our model, each processor can transfer load to at most one neighbor; also, any amount of load can be moved along a communication link between two processors in one step. This is a reasonable model for load movement in significant classes of dynamic load balancing problems. We derive efficient algorithms for a number of task reallocation problems under our model of load movement. These include dynamic load balancing on processor networks, adaptive mesh re-partitioning such as that arising in finite element methods, and progressive job migration under dynamic generation and consumption of load. To obtain the above-mentioned results, we introduce and solve the abstract problem of Incremental Weight Migration (IWM) on arbitrary graphs. Our main result is a simple randomized algorithm for this problem which provably results in asymptotically optimal convergence towards the state where weights on the nodes of the graph are all equal. This algorithm utilizes an appropriate random set of edges forming a matching. Our algorithm for the IWM problem is used in deriving efficient algorithms for all the problems mentioned above. Our results are very general. The algorithms we derive are local and hence scalable. They work for arbitrary load distributions and for networks of arbitrary topology that may undergo link failures. Of independent interest is our proof technique, which we use to lower-bound the convergence of our algorithms in terms of the eigenstructure of the underlying graph. Finally, we present preliminary experimental results analyzing issues in load balancing related to our algorithms.
UR - http://www.scopus.com/inward/record.url?scp=45449103550&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=45449103550&partnerID=8YFLogxK
U2 - 10.1145/181014.181366
DO - 10.1145/181014.181366
M3 - Conference contribution
AN - SCOPUS:45449103550
T3 - Proceedings of the 6th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 1994
SP - 226
EP - 235
BT - Proceedings of the 6th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 1994
PB - Association for Computing Machinery, Inc
T2 - 6th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA 1994
Y2 - 27 June 1994 through 29 June 1994
ER -