TY - GEN

T1 - Spartan

T2 - 2015 USENIX Annual Technical Conference, USENIX ATC 2015

AU - Huang, Chien Chin

AU - Chen, Qi

AU - Wang, Zhaoguo

AU - Power, Russell

AU - Ortiz, Jorge

AU - Li, Jinyang

AU - Xiao, Zhen

N1 - Funding Information:
We thank the anonymous reviewers and our shepherd David Shue. This work is supported in part by NSF (CSR-1065169) and a Google research award.
Publisher Copyright:
© 2015 USENIX Annual Technical Conference.

PY - 2015

Y1 - 2015

N2 - Application programmers in domains like machine learning, scientific computing, and computational biology are accustomed to using powerful, high-productivity array languages such as MATLAB, R and NumPy. Distributed array frameworks aim to scale array programs across machines. However, maximizing the locality of access to distributed arrays is an unsolved problem; such locality is critical for high performance. This paper presents Spartan, a distributed array framework that automatically determines how to best partition (aka "tile") n-dimensional arrays and to co-locate data with computation to maximize locality. Spartan combines a lazy-evaluation based, optimizing frontend with a distributed tiled array backend. Central to Spartan's design is a small number of carefully chosen parallel high-level operators, which form the expression graph captured by Spartan's frontend during runtime. These operators simplify the programming of distributed applications. More importantly, their well-defined semantics allow Spartan's runtime to calculate the costs of different tiling strategies and pick the best one for evaluating the entire expression graph. Using Spartan, we have implemented 12 applications from a variety of domains including machine learning and scientific computing. Our evaluations show that Spartan's automatic tiling mechanism leads to good and scala.

AB - Application programmers in domains like machine learning, scientific computing, and computational biology are accustomed to using powerful, high-productivity array languages such as MATLAB, R and NumPy. Distributed array frameworks aim to scale array programs across machines. However, maximizing the locality of access to distributed arrays is an unsolved problem; such locality is critical for high performance. This paper presents Spartan, a distributed array framework that automatically determines how to best partition (aka "tile") n-dimensional arrays and to co-locate data with computation to maximize locality. Spartan combines a lazy-evaluation based, optimizing frontend with a distributed tiled array backend. Central to Spartan's design is a small number of carefully chosen parallel high-level operators, which form the expression graph captured by Spartan's frontend during runtime. These operators simplify the programming of distributed applications. More importantly, their well-defined semantics allow Spartan's runtime to calculate the costs of different tiling strategies and pick the best one for evaluating the entire expression graph. Using Spartan, we have implemented 12 applications from a variety of domains including machine learning and scientific computing. Our evaluations show that Spartan's automatic tiling mechanism leads to good and scala.

UR - http://www.scopus.com/inward/record.url?scp=85013648065&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85013648065&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85013648065

T3 - Proceedings of the 2015 USENIX Annual Technical Conference, USENIX ATC 2015

SP - 1

EP - 15

BT - Proceedings of the 2015 USENIX Annual Technical Conference, USENIX ATC 2015

PB - USENIX Association

Y2 - 8 July 2015 through 10 July 2015

ER -