Efficient communication using total-exchange

Satish Rao, Torsten Suel, Thanasis Tsantilas, Mark Goudreau

    Research output: Chapter in Book/Report/Conference proceedingConference contribution


    A central question in parallel computing is to determine the extent to which one can write parallel programs using a high-level, general-purpose, and architecture-independent programming language and have them executed on a variety of parallel and distributed architectures without sacrificing efficiency. A large body of research suggests that, at least in theory, general-purpose parallel computing is indeed possible provided certain conditions are met: an excess of logical parallelism in the program, and the ability of the target architecture to efficiently realize balanced communication patterns. The canonical example of a balanced communication pattern is an h-relation, in which each processor is the origin and destination of at most h messages. A plethora of protocols has been designed for routing h-relations in a variety of networks. The goal has been to minimize the value of h while guaranteeing delivery of the messages within time a constant factor from optimal. In this paper we describe protocols that meet the most stringent efficiency requirement, namely delivery of messages within time that is a lower order additive term from the best achievable. Such protocols are called 1-optimal. While these protocols achieve 1-optimality only for heavily loaded networks, that is, for large values of h, they are remarkable for their simplicity in that they only use the total-exchange communication primitive. The total-exchange can be realized in many networks using very simple, contention-free, and extremely efficient schemes. The technical contribution of this paper is a protocol to route random h-relations in an N-processor network using h÷N(1 + 0(1)) + O(log log N) total-exchange rounds with high probability. Using message duplication, we can improve the bound to h÷N(1 + 0(1)) + O(log * N). This improves upon the h÷N(1 + 0(1)) + O(log N) bound of Gerbessiotis and Valiant. While our theoretical improvements are modest, our experimental results show an improvement over the protocol of Gerebessiotis and Valiant.

    Original languageEnglish (US)
    Title of host publicationIEEE Symposium on Parallel and Distributed Processing - Proceedings
    Editors Anon
    Number of pages7
    StatePublished - 1995
    EventProceedings of the IEEE 9th International Parallel Processing Symposium - Santa Barbara, CA, USA
    Duration: Apr 25 1995Apr 28 1995


    OtherProceedings of the IEEE 9th International Parallel Processing Symposium
    CitySanta Barbara, CA, USA

    ASJC Scopus subject areas

    • General Engineering


    Dive into the research topics of 'Efficient communication using total-exchange'. Together they form a unique fingerprint.

    Cite this