TY - GEN
T1 - The power of choice in data-aware cluster scheduling
AU - Venkataraman, Shivaram
AU - Panda, Aurojit
AU - Ananthanarayanan, Ganesh
AU - Franklin, Michael J.
AU - Stoica, Ion
PY - 2014/1/1
Y1 - 2014/1/1
N2 - Providing timely results in the face of rapid growth in data volumes has become important for analytical frameworks. For this reason, frameworks increasingly operate on only a subset of the input data. A key property of such sampling is that combinatorially many subsets of the input are present. We present KMN, a system that leverages these choices to perform data-aware scheduling, i.e., minimize time taken by tasks to read their inputs, for a DAG of tasks. KMN not only uses choices to co-locate tasks with their data but also percolates such combinatorial choices to downstream tasks in the DAG by launching a few additional tasks at every upstream stage. Evaluations using workloads from Facebook and Conviva on a 100-machine EC2 cluster show that KMN reduces average job duration by 81% using just 5% additional resources.
UR - http://www.scopus.com/inward/record.url?scp=84971514453&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84971514453&partnerID=8YFLogxK
M3 - Conference contribution
T3 - Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2014
SP - 301
EP - 316
BT - Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2014
PB - USENIX Association
T2 - 11th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2014
Y2 - 6 October 2014 through 8 October 2014
ER -