Slide1
Dominant Resource Fairness: Fair Allocation of Multiple Resource Types
Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion Stoica
University of California, Berkeley
Slide2
Resource Sharing
Multiple users share the resources of a system
Resources: CPU, memory, storage, etc.
A user may need multiple kinds of resources
We need to allocate the system’s resources fairly among the users
Slide3
What is fair sharing?
n users want to share the system’s CPU
Solution:
Allocate each user 1/n of the shared resource
- e.g., three users who want to use the CPU each get 1/3
Generalized by max-min fairness
Handles the case where a user wants less than its fair share
- e.g., user 1 wants no more than 20% of the CPU
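The single-resource case can be sketched as water-filling. This is a minimal illustration, not code from the presentation; the function name `max_min_share` and the assumption that the other two users want the whole CPU are hypothetical.

```python
from fractions import Fraction


def max_min_share(capacity, demands):
    """Split `capacity` so no user exceeds its demand and unmet users get equal shares."""
    alloc = [Fraction(0)] * len(demands)
    remaining = Fraction(capacity)
    unsatisfied = set(range(len(demands)))
    while unsatisfied and remaining > 0:
        fair = remaining / len(unsatisfied)
        # Users whose leftover demand fits inside the current equal split are capped.
        capped = {i for i in unsatisfied if demands[i] - alloc[i] <= fair}
        if not capped:
            # Nobody can be fully satisfied: hand out the equal split and stop.
            for i in unsatisfied:
                alloc[i] += fair
            remaining = Fraction(0)
            break
        for i in capped:
            remaining -= demands[i] - alloc[i]
            alloc[i] = Fraction(demands[i])
        unsatisfied -= capped
    return alloc


# Three users share 100% of the CPU; user 1 wants at most 20%.
# Assumed here: users 2 and 3 would each take the whole CPU if they could.
shares = max_min_share(100, [20, 100, 100])
print([float(s) for s in shares])
```

Under these assumptions user 1 receives its 20% and the freed capacity is split evenly, so users 2 and 3 get 40% each.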
Slide4
Properties of max-min fairness
Share guarantee
Each user gets at least 1/n of the resource
But a user gets less if her or his demand is less
Strategy-proof
Users are not better off asking for more than they need
Users have no incentive to lie
Slide5
Why max-min fairness is not enough
Job scheduling in datacenters is not only about CPUs
Jobs consume CPU, memory, disk, etc.
Challenge: heterogeneity in resource demands
Slide6
Problem
How to fairly share multiple resources when users have heterogeneous demands on them?
Example
2 resources: CPUs & mem
User 1 wants <1 CPU, 4 GB> per task
User 2 wants <3 CPU, 1 GB> per task
Slide7
Model
Users have tasks according to a demand vector
e.g., <2, 3, 1>: each of the user’s tasks needs 2 units of resource 1, 3 units of resource 2, and 1 unit of resource 3
Assume divisible resources
Slide8
A Natural Policy
Asset Fairness
Equalize each user’s sum of resource shares
A cluster with 70 CPUs, 70 GB RAM
User 1 needs <2 CPU, 2 GB RAM> per task
User 2 needs <1 CPU, 2 GB RAM> per task
Asset Fairness yields
User 1: 15 tasks – 30 CPUs, 30 GB (sum = 60)
User 2: 20 tasks – 20 CPUs, 40 GB (sum = 60)
User 1 has < 50% of both CPUs and RAM
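The asset-fair allocation above can be reproduced with a short calculation. The sketch below assumes divisible resources, unbounded demand from every user, and that each resource is requested by at least one user; the function name `asset_fair_tasks` is hypothetical.

```python
from fractions import Fraction


def asset_fair_tasks(capacity, demands):
    """Tasks per user when every user's sum of resource shares is equalized."""
    # Asset share consumed by ONE task of each user: sum over resources of demand/capacity.
    per_task = [sum(Fraction(d, c) for d, c in zip(dem, capacity)) for dem in demands]
    # If every user holds the same asset share s, user i runs s / per_task[i] tasks.
    # The largest feasible s is fixed by the first resource that saturates.
    s = min(
        Fraction(c) / sum(dem[r] / per_task[i] for i, dem in enumerate(demands))
        for r, c in enumerate(capacity)
    )
    return [s / p for p in per_task]


capacity = (70, 70)            # <CPUs, GB RAM>
demands = [(2, 2), (1, 2)]     # per-task demand of user 1 and user 2
tasks = asset_fair_tasks(capacity, demands)
for user, (n, dem) in enumerate(zip(tasks, demands), start=1):
    print(f"User {user}: {n} tasks, {n * dem[0]} CPUs, {n * dem[1]} GB")
```

For the cluster on this slide the calculation gives 15 tasks for user 1 and 20 tasks for user 2, with memory as the saturated resource.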
Slide9
Share Guarantee
Every user should get 1/n of at least one resource
Intuition:
You shouldn’t be worse off than if you ran your own cluster with 1/n of the resources
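The guarantee is easy to check for a given allocation. The helper below is a hypothetical sketch (the name `meets_share_guarantee` is not from the presentation); it is applied to the asset-fairness allocation from the 70 CPU, 70 GB cluster.

```python
from fractions import Fraction


def meets_share_guarantee(allocation, capacity, n_users):
    """True if the user holds at least 1/n of at least one resource."""
    return any(
        Fraction(a, c) >= Fraction(1, n_users)
        for a, c in zip(allocation, capacity)
    )


capacity = (70, 70)  # <CPUs, GB RAM>
# Allocations produced by asset fairness for two users.
print(meets_share_guarantee((30, 30), capacity, 2))  # user 1: 30/70 of each resource
print(meets_share_guarantee((20, 40), capacity, 2))  # user 2: 40/70 of memory
```

User 1 fails the check because 30/70 is below 1/2 for both resources, which is why asset fairness does not provide the share guarantee in that example.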
Slide10
Strategy-proof
A user should not be able to increase her allocation by lying about her demand vector
Intuition
Users are incentivized to provide truthful resource requirements
Slide11
What we need to do
Find a fair sharing policy that provides
Share guarantee
Strategy-proofness
Max-min fairness for a single resource has these properties
Can we generalize it to multiple resources?
Slide12
Dominant Resource Fairness
A user’s dominant resource is the resource of which she has the biggest share
Example:
Total resources: <10 CPU, 4 GB>
User 1’s allocation: <2 CPU, 1 GB>
Dominant resource is memory, as 1/4 (25%) > 2/10 (20%)
A user’s dominant share is the fraction of the dominant resource she is allocated
User 1’s dominant share is 25% (1/4)
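The two definitions translate directly into code. This is a minimal sketch; the function name `dominant` and the resource labels are illustrative choices, not part of the presentation.

```python
from fractions import Fraction


def dominant(allocation, capacity, names):
    """Return (dominant resource, dominant share) for one user's allocation."""
    shares = {
        name: Fraction(a, c)
        for name, a, c in zip(names, allocation, capacity)
    }
    resource = max(shares, key=shares.get)
    return resource, shares[resource]


# Total resources <10 CPU, 4 GB>, user 1 holds <2 CPU, 1 GB>.
resource, share = dominant((2, 1), (10, 4), ("CPU", "memory"))
print(resource, float(share))
```

For the example on this slide the function reports memory with a share of 1/4, since 1/4 exceeds 2/10.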
Slide13
Dominant Resource Fairness
Apply max‐min fairness to dominant shares
Equalize the dominant share of the users
Example:
Total resources: <9 CPU, 18 GB>
User 1 demand: <1 CPU, 4 GB> dom res: mem
User 2 demand: <3 CPU, 1 GB> dom res: CPU
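Equalizing dominant shares for this example can be worked out in a few lines. The sketch assumes divisible resources, unbounded demand, and that every resource is requested by at least one user; `drf_tasks` is a hypothetical name.

```python
from fractions import Fraction


def drf_tasks(capacity, demands):
    """Equalized dominant share and tasks per user under the stated assumptions."""
    # Dominant share consumed by ONE task of each user.
    per_task = [max(Fraction(d, c) for d, c in zip(dem, capacity)) for dem in demands]
    # With a common dominant share s, user i runs s / per_task[i] tasks.
    # The largest feasible s is fixed by the first resource that saturates.
    s = min(
        Fraction(c) / sum(dem[r] / per_task[i] for i, dem in enumerate(demands))
        for r, c in enumerate(capacity)
    )
    return s, [s / p for p in per_task]


capacity = (9, 18)             # <CPUs, GB>
demands = [(1, 4), (3, 1)]     # per-task demand of user 1 and user 2
share, tasks = drf_tasks(capacity, demands)
for user, (n, dem) in enumerate(zip(tasks, demands), start=1):
    print(f"User {user}: {n} tasks, {n * dem[0]} CPUs, {n * dem[1]} GB")
print("dominant share:", share)
```

With these inputs both users end at a dominant share of 2/3: user 1 runs 3 tasks (3 CPUs, 12 GB) and user 2 runs 2 tasks (6 CPUs, 2 GB), with CPU as the saturated resource.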
Slide14
Online Dominant Resource Scheduler
Whenever there are available resources and tasks to run
Schedule a task for the user with the smallest dominant share
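The scheduling rule can be sketched as a loop over whole tasks. This is an illustrative simulation, not the Mesos implementation: it assumes every user always has another task ready, that each task demands a positive amount of at least one resource, and that ties go to the lowest-numbered user. The name `online_drf` is hypothetical.

```python
from fractions import Fraction


def online_drf(capacity, demands):
    """Launch whole tasks one at a time, always for the user with the smallest dominant share."""
    n = len(demands)
    used = [0] * len(capacity)
    allocs = [[0] * len(capacity) for _ in range(n)]
    tasks = [0] * n

    def dominant_share(i):
        return max(Fraction(a, c) for a, c in zip(allocs[i], capacity))

    def fits(i):
        return all(u + d <= c for u, d, c in zip(used, demands[i], capacity))

    while True:
        # Only users whose next task still fits are candidates (an assumption of this sketch).
        runnable = [i for i in range(n) if fits(i)]
        if not runnable:
            break
        i = min(runnable, key=dominant_share)  # ties go to the lowest index
        for r, d in enumerate(demands[i]):
            used[r] += d
            allocs[i][r] += d
        tasks[i] += 1
    return tasks, allocs


tasks, allocs = online_drf((9, 18), [(1, 4), (3, 1)])
print(tasks, allocs)
```

Run on a <9 CPU, 18 GB> cluster with per-task demands <1 CPU, 4 GB> and <3 CPU, 1 GB>, the loop stops after user 1 has launched 3 tasks and user 2 has launched 2, at which point all 9 CPUs are in use.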
Slide15
An Approach from the Economics Community
Let the market determine the prices
Competitive Equilibrium from Equal Incomes (CEEI)
Give each user 1/n of every resource
Let users trade in a perfectly competitive market
Nash bargaining solution
Maximize $\prod_i u_i$
- $u_i$ is the utility that user $i$ gets from its allocation
Suppose the utility is the number of tasks
Example
Total resources: <9 CPU, 18 GB>
User 1 demand: <1 CPU, 4 GB> dom res: mem
User 2 demand: <3 CPU, 1 GB> dom res: CPU
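For two users, the product of utilities can be maximized numerically. The sketch below takes utility to be the (fractional) number of tasks, as the slide suggests, and uses a ternary search over user 1's task count; it assumes divisible resources and strictly positive demands. The function name `nash_bargaining_two_users` is hypothetical, and the search is just one convenient way to find the maximum.

```python
def nash_bargaining_two_users(capacity, d1, d2, iters=200):
    """Maximize x * y, where x and y are the task counts of user 1 and user 2."""

    def y_max(x):
        # Most user-2 tasks that still fit once user 1 runs x tasks.
        return min((c - a * x) / b for c, a, b in zip(capacity, d1, d2))

    def product(x):
        return x * y_max(x)

    # x ranges from 0 to the point where user 1 alone saturates some resource.
    lo, hi = 0.0, min(c / a for c, a in zip(capacity, d1))
    # The product has a single peak on this interval, so ternary search applies.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if product(m1) < product(m2):
            lo = m1
        else:
            hi = m2
    x = (lo + hi) / 2
    return x, y_max(x)


x, y = nash_bargaining_two_users((9, 18), (1, 4), (3, 1))
print(f"User 1: {x:.2f} tasks, {x * 1:.2f} CPUs, {x * 4:.2f} GB")
print(f"User 2: {y:.2f} tasks, {y * 3:.2f} CPUs, {y * 1:.2f} GB")
```

For the example the maximum sits where both resources are exhausted: user 1 runs 45/11 (about 4.09) tasks and user 2 runs 18/11 (about 1.64) tasks.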
Slide16
Comparison in a toy example
Example
Total resources: <9 CPU, 18 GB>
User 1 demand: <1 CPU, 4 GB> dom res: mem
User 2 demand: <3 CPU, 1 GB> dom res: CPU
Slide17
Evaluation
Micro‐experiments on EC2
Evaluate DRF’s dynamic behavior when demands change
Compare DRF with the current Hadoop scheduler
Macro‐benchmark through simulations
Simulate a Facebook trace with DRF and the current Hadoop scheduler
Slide18
DRF inside Mesos on EC2
In the first 2 minutes, job 1 uses <1 CPU, 10 GB RAM> per task and job 2 uses <1 CPU, 1 GB RAM> per task.
After 2 minutes, the task sizes of both jobs change to <2 CPUs, 4 GB> for job 1 and <1 CPU, 3 GB> for job 2.
Slide19
DRF vs Hadoop Scheduler
Hadoop Fair Scheduler / Capacity Scheduler / Quincy
Each machine consists of k slots (e.g., k = 14)
Run at most one task per slot
Give jobs an “equal” number of slots
i.e., apply max‐min fairness to slot‐count
Slide20
Experiment: DRF vs Slots
80 jobs for each task
In 10 minutes
Slide21
Experiment: DRF vs Slots
80 jobs for each task
In 10 minutes
Slide22
Simulation: DRF vs Slots on Facebook Traces
Slide23
Selected Questions
Why is the sharing-incentive property important? If a user doesn’t know it’s obtaining less than what it could get from sharing the resources evenly, does this matter?
How does DRF deal with unutilized resources if it over-allocates them?