isard-sosp09 - Quincy: Fair Scheduling for Distributed Computing Clusters

Quincy: Fair Scheduling for Distributed Computing Clusters

Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar and Andrew Goldberg
Microsoft Research, Silicon Valley, Mountain View, CA, USA
{misard, vijayanp, jcurrey, uwieder, kunal, [email protected]

ABSTRACT

This paper addresses the problem of scheduling concurrent jobs on clusters where application data is stored on the computing nodes. This setting, in which scheduling computations close to their data is crucial for performance, is increasingly common and arises in systems such as MapReduce, Hadoop, and Dryad, as well as many grid-computing environments. We argue that data-intensive computation benefits from a fine-grain resource sharing model that differs from the coarser semi-static resource allocations implemented by most existing cluster computing architectures. The problem of scheduling with locality and fairness constraints has not previously been extensively studied under this model of resource sharing.

We introduce a powerful and flexible new framework for scheduling concurrent distributed jobs with fine-grain resource sharing. The scheduling problem is mapped to a graph data structure, where edge weights and capacities encode the competing demands of data locality, fairness, and starvation-freedom, and a standard solver computes the optimal online schedule according to a global cost model. We evaluate our implementation of this framework, which we call Quincy, on a cluster of a few hundred computers using a varied workload of data- and CPU-intensive jobs. We evaluate Quincy against an existing queue-based algorithm and implement several policies for each scheduler, with and without fairness constraints. Quincy gets better fairness when fairness is requested, while substantially improving data locality.
The volume of data transferred across the cluster is reduced by up to a factor of 3.9 in our experiments, leading to a throughput increase of up to 40%.

Categories and Subject Descriptors
D.4.1 [Operating Systems]: Process Management—Scheduling

General Terms
Algorithms, Design, Performance

[Figure 1: Distribution of job running times from a production cluster used inside Microsoft's search division. The horizontal axis shows the running time in minutes on a log scale, and the vertical axis shows the number of jobs with the corresponding running time.]

Table 1: Job running time. The table shows the same data as Figure 1, but presented here as the percentage of jobs under a particular running time in minutes.

Run time (m) |    5 |   10 |   15 |    30 |   60 |  300
% Jobs       | 18.9 | 28.0 | 34.7 | 51.31 | 72.0 | 95.7
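The abstract describes scheduling as a global cost-minimization problem in which placing a task away from its input data incurs a transfer cost. As a rough illustration of that idea only, the toy sketch below exhaustively searches for the cheapest task-to-machine assignment under a two-level locality cost model; all names, the data layout, and the cost values are hypothetical, and the real system instead encodes the problem as a flow network (with fairness and starvation-freedom terms) solved by a standard min-cost flow solver.

```python
# Toy sketch of scheduling as global cost minimization (hypothetical
# data layout and costs; Quincy itself uses a min-cost flow solver over
# a graph that also encodes fairness and starvation-freedom).
from itertools import permutations

# Which machine holds each task's input data (hypothetical layout).
data_location = {"t0": "m0", "t1": "m1", "t2": "m0"}
machines = ["m0", "m1", "m2"]

REMOTE_COST = 10  # cost of reading input across the network
LOCAL_COST = 1    # cost of reading input from local disk

def schedule_cost(assignment):
    """Total cost of an assignment {task: machine} under the toy model."""
    return sum(LOCAL_COST if data_location[t] == m else REMOTE_COST
               for t, m in assignment.items())

def best_schedule(tasks, machines):
    """Exhaustively find the cheapest one-task-per-machine assignment."""
    best = None
    for perm in permutations(machines, len(tasks)):
        candidate = dict(zip(tasks, perm))
        if best is None or schedule_cost(candidate) < schedule_cost(best):
            best = candidate
    return best

sched = best_schedule(["t0", "t1", "t2"], machines)
```

Note that with t0 and t2 both storing data on m0, some remote transfer is unavoidable; the global search still keeps t1 local to m1, which is the kind of trade-off a queue-per-machine greedy scheduler can miss. Exhaustive search is exponential and only viable at toy scale, which is why a flow-based formulation matters in practice.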

This note was uploaded on 11/12/2011 for the course CE 726 taught by Professor Staf during the Spring '11 term at SUNY Buffalo.
