Reservation-based scheduling: If you're late don't blame us!

Carlo Curino, Djellel E. Difallah, Chris Douglas, Subru Krishnan, Raghu Ramakrishnan, Sriram Rao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The continuous shift towards data-driven approaches to business, and growing attention to improving return on investment (ROI) for cluster infrastructures, are generating new challenges for big-data frameworks. Systems originally designed for big batch jobs now handle an increasingly complex mix of computations. Moreover, they are expected to guarantee stringent SLAs for production jobs and minimize latency for best-effort jobs. In this paper, we introduce reservation-based scheduling, a new approach to this problem. We develop our solution around four key contributions: 1) we propose a reservation definition language (RDL) that allows users to declaratively reserve access to cluster resources, 2) we formalize planning of current and future cluster resources as a Mixed-Integer Linear Programming (MILP) problem, and propose scalable heuristics, 3) we adaptively distribute resources between production jobs and best-effort jobs, and 4) we integrate all of this in a scalable system named Rayon, which builds upon Hadoop / YARN. We evaluate Rayon on a 256-node cluster against workloads derived from Microsoft, Yahoo!, Facebook, and Cloudera's clusters. To enable practical use of Rayon, we open-sourced our implementation as part of Apache Hadoop 2.6.

Original language: English (US)
Title of host publication: Proceedings of the 5th ACM Symposium on Cloud Computing, SOCC 2014
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 1595930361, 9781450332521
DOIs: https://doi.org/10.1145/2670979.2670981
State: Published - Nov 3 2014
Event: 5th ACM Symposium on Cloud Computing, SOCC 2014 - Seattle, United States
Duration: Nov 3 2014 → Nov 5 2014

Publication series

Name: Proceedings of the 5th ACM Symposium on Cloud Computing, SOCC 2014

Conference

Conference: 5th ACM Symposium on Cloud Computing, SOCC 2014
Country: United States
City: Seattle
Period: 11/3/14 → 11/5/14

ASJC Scopus subject areas

  • Software

  • Cite this

    Curino, C., Difallah, D. E., Douglas, C., Krishnan, S., Ramakrishnan, R., & Rao, S. (2014). Reservation-based scheduling: If you're late don't blame us! In Proceedings of the 5th ACM Symposium on Cloud Computing, SOCC 2014 (Proceedings of the 5th ACM Symposium on Cloud Computing, SOCC 2014). Association for Computing Machinery, Inc. https://doi.org/10.1145/2670979.2670981