TY - GEN
T1 - Reservation-based scheduling
T2 - 5th ACM Symposium on Cloud Computing, SOCC 2014
AU - Curino, Carlo
AU - Difallah, Djellel E.
AU - Douglas, Chris
AU - Krishnan, Subru
AU - Ramakrishnan, Raghu
AU - Rao, Sriram
N1 - Publisher Copyright: Copyright © 2014 by the Association for Computing Machinery, Inc. (ACM).
PY - 2014/11/3
Y1 - 2014/11/3
N2 - The continuous shift towards data-driven approaches to business, and a growing attention to improving return on investments (ROI) for cluster infrastructures is generating new challenges for big-data frameworks. Systems originally designed for big batch jobs now handle an increasingly complex mix of computations. Moreover, they are expected to guarantee stringent SLAs for production jobs and minimize latency for best-effort jobs. In this paper, we introduce reservation-based scheduling, a new approach to this problem. We develop our solution around four key contributions: 1) we propose a reservation definition language (RDL) that allows users to declaratively reserve access to cluster resources, 2) we formalize planning of current and future cluster resources as a Mixed-Integer Linear Programming (MILP) problem, and propose scalable heuristics, 3) we adaptively distribute resources between production jobs and best-effort jobs, and 4) we integrate all of this in a scalable system named Rayon, that builds upon Hadoop / YARN. We evaluate Rayon on a 256-node cluster against workloads derived from Microsoft, Yahoo!, Facebook, and Cloudera's clusters. To enable practical use of Rayon, we open-sourced our implementation as part of Apache Hadoop 2.6.
AB - The continuous shift towards data-driven approaches to business, and a growing attention to improving return on investments (ROI) for cluster infrastructures is generating new challenges for big-data frameworks. Systems originally designed for big batch jobs now handle an increasingly complex mix of computations. Moreover, they are expected to guarantee stringent SLAs for production jobs and minimize latency for best-effort jobs. In this paper, we introduce reservation-based scheduling, a new approach to this problem. We develop our solution around four key contributions: 1) we propose a reservation definition language (RDL) that allows users to declaratively reserve access to cluster resources, 2) we formalize planning of current and future cluster resources as a Mixed-Integer Linear Programming (MILP) problem, and propose scalable heuristics, 3) we adaptively distribute resources between production jobs and best-effort jobs, and 4) we integrate all of this in a scalable system named Rayon, that builds upon Hadoop / YARN. We evaluate Rayon on a 256-node cluster against workloads derived from Microsoft, Yahoo!, Facebook, and Cloudera's clusters. To enable practical use of Rayon, we open-sourced our implementation as part of Apache Hadoop 2.6.
UR - http://www.scopus.com/inward/record.url?scp=85118316836&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118316836&partnerID=8YFLogxK
U2 - 10.1145/2670979.2670981
DO - 10.1145/2670979.2670981
M3 - Conference contribution
AN - SCOPUS:85118316836
T3 - Proceedings of the 5th ACM Symposium on Cloud Computing, SOCC 2014
BT - Proceedings of the 5th ACM Symposium on Cloud Computing, SOCC 2014
PB - Association for Computing Machinery
Y2 - 3 November 2014 through 5 November 2014
ER -