Cluster Scheduler Reference: Mesos: A Platform for
Author : sherrill-nordquist | Published Date : 2025-05-28
Description: Cluster Scheduler Reference. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center (NSDI 2011); Multi-agent Cluster Scheduling for Scalability and Flexibility (Berkeley tech report EECS-2012-273, doctoral dissertation); Omega
Transcript: Cluster Scheduler Reference: Mesos: A Platform for
Cluster Scheduler Reference
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. NSDI 2011.
Multi-agent Cluster Scheduling for Scalability and Flexibility. Berkeley tech report EECS-2012-273 (doctoral dissertation).
Omega: flexible, scalable schedulers for large compute clusters. EuroSys 2013.
Ytao, 2013.5.15

Cluster Scheduler (Intro)
Cloud computing frameworks vary. New frameworks will likely continue to emerge, and no single framework will be optimal for all applications.
Running multiple frameworks on one cluster improves utilization and data sharing.
Problem statement: In the face of increasing demand for cluster resources by diverse cluster computing applications and the growing number of machines in typical clusters, it is a challenge to design cluster schedulers that provide flexible, scalable, and efficient resource allocations.

Cluster Scheduler: Monolithic State Scheduling (MSS)
Traditional and popular examples: LSF, Condor, Hadoop.
Concept: a single scheduling agent process makes all scheduling decisions sequentially.
Usage: the agent takes input about framework requirements, resource availability, and organizational policies, and computes a global schedule for all tasks.

Cluster Scheduler: Monolithic State Scheduling (MSS 2)
Advantage: optimal scheduling, because the single agent has a global view.
Challenges:
Complexity: the scheduler must capture all framework requirements.
Scalability: new frameworks keep emerging.
Frameworks lose their own scheduling optimizations.

Cluster Scheduler (update)
Scalability (response time, number of machines)
Flexibility (heterogeneous mix of jobs)
Usability and maintainability (easily adapt to new types of jobs and frameworks)
Fault isolation (minimize dependencies between unrelated jobs)
Utilization (achieve high cluster resource utilization, e.g., CPU utilization, memory utilization)

Partitioned State Scheduling (PSS)
In PSS, cluster state is divided between multiple scheduling agents as non-overlapping scheduling domains.
Statically Partitioned State Scheduling (SPS): cluster resources are statically assigned to particular frameworks.
Dynamically Partitioned State Scheduling (DPS): Mesos, NSDI 2011.

Replicated State Scheduling (RSS)
In RSS, scheduling domains may overlap, and optimistic consistency control is used to resolve conflicting transactions.
Omega, EuroSys 2013.

Cluster Environment
Use of commodity servers
Tens to hundreds of thousands of servers
Heterogeneous resources
Use of commodity networks
Mixed workloads

Service Jobs vs. Terminating Jobs
Service jobs consist of a set of service tasks that are conceptually intended to run forever; these tasks are interacted with through request-response interfaces, e.g., a set of web servers or relational database servers.
Terminating jobs, on the other hand, are given a set of inputs, perform some work as a function of those inputs, and are intended to terminate eventually (traditional HPC cluster management considers only this kind of job).

Mesos Goal
Support and demonstrate multi-agent scheduling
Support fair-sharing meta-scheduling policy
Increase overall cluster utilization
Scale to tens of thousands of