Senior Design Project: Parallel

luanne-stotts . @luanne-stotts
Uploaded On 2017-06-10

Presentation Transcript


Senior Design Project: Parallel Task Scheduling in Heterogeneous Computing Environments

Senior Design Students: Christopher Blandin and Dylan Machovec
Post-doctoral Scholar: Bhavesh Khemka
Faculty Advisor: H. J. Siegel

Senior Design Presentation

Outline
- motivation
- our system model
- problem statement
- existing work
- simulation details
- future work

Motivation
- High Performance Computing (HPC) is used by a wide variety of fields to solve challenging problems
  - physics simulations, oil and gas industry, climate modeling, computational biology, computational chemistry, and many more
- improving performance increases productivity in these fields
- we plan on improving the performance of the system by designing novel scheduling techniques
  - scheduling refers to the assignment and ordering of tasks to machines for execution

System Model – Definitions
- heterogeneity: differing execution characteristics
- homogeneity: have the same execution characteristics
- oversubscribed: more tasks arriving than the system can execute immediately

System Model – Cluster Model
- clusters have multiple homogeneous nodes
- clusters are heterogeneous from each other
- nodes may have multiple multicore processors
- each node may only have one task running at a given time
  - avoids interference between tasks
- task assignments are done at node-level
- a task cannot be spread across two clusters

System Model – Workload Characteristics
- dynamically arriving tasks
- when a task arrives, scheduler obtains the following information:
  - arrival time
  - execution time
    - different times on different clusters (because of heterogeneity)
  - number of processing cores required
  - value function
- tasks are heterogeneous
- no pre-emption
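The cluster and workload model above can be sketched as two small records. This is only an illustration, not the project's simulator: the class names, field names, and the rule that a task occupies `ceil(cores / cores_per_node)` whole nodes are assumptions that follow from "one task per node" and "assignments at node-level". The value function is left out of this sketch.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Cluster:
    """A cluster of identical nodes (names and fields are hypothetical)."""
    name: str
    num_nodes: int
    cores_per_node: int


@dataclass
class Task:
    """Information the scheduler receives when a task arrives."""
    arrival_time: float
    cores_required: int
    exec_time: Dict[str, float]  # cluster name -> execution time on that cluster

    def nodes_needed(self, cluster: Cluster) -> int:
        # Assumption: a node runs one task at a time, so the task
        # takes whole nodes even if it cannot fill the last one.
        return math.ceil(self.cores_required / cluster.cores_per_node)


cluster = Cluster(name="c1", num_nodes=4, cores_per_node=16)
task = Task(arrival_time=0.0, cores_required=40, exec_time={"c1": 100.0})
print(task.nodes_needed(cluster))
```

Under these assumptions a 40-core task on 16-core nodes occupies three nodes, leaving part of the third node unused.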

System Model – Value Function
- each task has a value function
  - represents value of the task when it completes
  - value function may be different for each task
- monotonically decreasing functions
- value functions can be fully described with four parameters
  - a constant starting value
  - after soft deadline value decays linearly to a final value
  - after hard deadline value drops to zero
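A minimal sketch of such a value function follows. It assumes the four parameters are the starting value, the final value, the soft deadline, and the hard deadline, that the linear decay runs from the soft deadline to the hard deadline, and that a task finishing exactly at the hard deadline still earns the final value. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueFunction:
    start_value: float    # value earned up to the soft deadline
    final_value: float    # value reached at the hard deadline
    soft_deadline: float
    hard_deadline: float

    def value_at(self, completion_time: float) -> float:
        if completion_time <= self.soft_deadline:
            return self.start_value
        if completion_time > self.hard_deadline:
            return 0.0
        # Linear decay between the two deadlines. The span is positive
        # here because soft_deadline < completion_time <= hard_deadline.
        span = self.hard_deadline - self.soft_deadline
        fraction = (completion_time - self.soft_deadline) / span
        return self.start_value + fraction * (self.final_value - self.start_value)


vf = ValueFunction(start_value=10.0, final_value=4.0,
                   soft_deadline=100.0, hard_deadline=200.0)
for t in (50.0, 150.0, 200.0, 201.0):
    print(t, vf.value_at(t))
```

With these made-up numbers the task is worth 10 until time 100, 7 at time 150, 4 at time 200, and nothing afterwards.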

Problem Statement
- we measure the performance of a scheduler in our environment as the sum of the value earned by completing tasks over a given amount of time
- goal of heuristics: maximize total sum of value earned over a given amount of time
  - improve performance of HPC systems
- main contribution
  - design, simulation, and analysis of resource allocation heuristics for task scheduling
    - heterogeneous HPC system with multiple clusters
    - tasks with associated value functions with soft and hard deadlines
    - each task executes in parallel over multiple cores
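One way to write the performance measure down, using symbols that are not in the slides: let $c_i$ be the completion time of task $i$, $v_i$ its value function, and $T$ the length of the evaluation window. Assuming only tasks that complete within the window count, the value earned is:

```latex
\[
  V(T) \;=\; \sum_{i \,:\, c_i \le T} v_i(c_i)
\]
```

A heuristic is then judged by how large it makes $V(T)$ for the same workload.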

Mapping Event
- mapping event: when task assignment decision(s) are made
- trigger mapping event whenever:
  - a node becomes available, or
  - a task arrives
- during mapping event, all tasks that have not been reserved or have not started execution are considered mappable
- only makes task assignments that can start now
- heuristic may or may not make reservations

[Figure: unmapped tasks set (task labels t2 and t4), nodes of cluster 1, nodes of cluster 2, current time]

Planned Heuristics
- four planned heuristics
  - EASY Backfilling
  - FCFS with Multiple Queues
  - Max-Max Value
  - Max-Max Value-Per-Resource
- submit to Metaheuristics International Conference (MIC 2015)
  - submission deadline: 2/6/15

Existing Work – Dr. Siegel’s Group
- focuses on utility of tasks
  - B. Khemka, R. Friese, L. D. Briceño, H. J. Siegel, A. A. Maciejewski, G. A. Koenig, C. Groer, G. Okonski, M. M. Hilton, R. Rambharos and S. Poole, “Utility Functions and Resource Management in an Oversubscribed Heterogeneous Computing Environment,” IEEE Transactions on Parallel and Distributed Systems, accepted 2014, to appear.
- another work that models stepped value functions
  - J.-K. Kim, S. Shivle, H. J. Siegel, A. A. Maciejewski, T. D. Braun, et al., “Dynamically Mapping Tasks with Priorities and Multiple Deadlines in a Heterogeneous Environment,” Journal of Parallel and Distributed Computing, vol. 67, no. 2, pp. 154-169, Feb. 2007.

Existing Work
- other parallel task scheduling techniques
  - EASY Backfilling
    - D. A. Lifka, “The ANL/IBM SP Scheduling System,” Proc. First Workshop Job Scheduling Strategies for Parallel Processing, pp. 295-303, 1995.
  - G. Sabin, R. Kettimuthu, A. Rajan and P. Sadayappan, “Scheduling of Parallel Jobs in a Heterogeneous Multi-Site Environment,” Job Scheduling Strategies for Parallel Processing, pp. 87-104, 2003.

Design of Parallel Simulator for Experiments
- extends existing serial simulator from Dr. Siegel’s group
  - modified to handle scheduling of parallel tasks
- created new modules
  - cluster class
    - has nodes within it
  - methods for obtaining parallel task information from workload trace
- created a sleep task object to model idle time within each machine
- developed an algorithm to locate slots for parallel tasks within the area occupied by sleep tasks
- developed a method that picks the nodes that create the best packing (i.e., create the least future restrictions)
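The slot-location step might look roughly like the sketch below. It is not the simulator's algorithm, which the slides do not show. It assumes each node's idle time ("sleep tasks") is available as a list of `(start, end)` intervals and searches for the earliest time at which enough nodes of one cluster are idle for the whole execution time. The function name and data layout are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end) of one idle period on a node


def earliest_slot(idle: Dict[str, List[Interval]], nodes_needed: int,
                  duration: float, now: float = 0.0
                  ) -> Optional[Tuple[float, List[str]]]:
    """Return (start_time, idle_nodes) for the earliest feasible slot, or None."""
    # The earliest feasible start is either `now` or the start of some idle
    # period: any later feasible start can be moved back to one of these.
    candidates = sorted({now} | {s for periods in idle.values()
                                 for s, _ in periods if s >= now})
    for start in candidates:
        end = start + duration
        free = [node for node, periods in idle.items()
                if any(s <= start and end <= e for s, e in periods)]
        if len(free) >= nodes_needed:
            return start, sorted(free)
    return None


INF = float("inf")
idle_periods = {
    "n1": [(0.0, 5.0), (10.0, INF)],
    "n2": [(3.0, INF)],
    "n3": [(8.0, INF)],
}
print(earliest_slot(idle_periods, nodes_needed=2, duration=2.0))
print(earliest_slot(idle_periods, nodes_needed=3, duration=2.0))
```

The function returns every node that is idle for the slot. Choosing which of those nodes to use is the separate packing decision.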

Workloads for Simulations
- will use Dr. Dror Feitelson’s Parallel Workload Trace to model the workload arrival
  - workload log from Curie Supercomputer in France (has 93,312 cores)
  - using last 10 months of data
- may use Downey’s model for execution time scaling

Future Work
- use simulator to implement and compare the planned heuristics
- running a post-mortem analysis
  - use a genetic algorithm to find a loose upper bound solution when we know in advance the arrival time and characteristics of all tasks
  - since scheduling is NP-hard it is hard to quantify the performance of heuristics
  - this analysis will give us a better metric to compare our results with

Thank You
- Questions?
- Feedback?

Back-up Slides

Packing Nodes Efficiently
- whenever an assignment is to be made, all heuristics pick the nodes that create the least amount of restrictions for future assignments
  - e.g., if task t8 needs 3 nodes, it will be assigned: n1, n2, n5

[Figure: node timeline marked with the current time]
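The slides do not define "least amount of restrictions", so the sketch below swaps in a simple best-fit rule as a stand-in: among the nodes that stay idle long enough, prefer those whose idle gap would be left with the least unused time, which keeps the long gaps open for later tasks. The function name, the input layout, and the numbers are assumptions. The numbers were chosen so that the result matches the n1, n2, n5 example and are not taken from the original figure.

```python
from typing import Dict, List, Optional


def pick_nodes(idle_until: Dict[str, float], nodes_needed: int,
               start: float, duration: float) -> Optional[List[str]]:
    """idle_until maps each currently idle node to the time its idle gap ends
    (float('inf') if nothing is reserved on it later)."""
    end = start + duration
    # Keep only nodes whose gap covers the task, scored by leftover idle time.
    fits = [(gap_end - end, node)
            for node, gap_end in idle_until.items() if gap_end >= end]
    if len(fits) < nodes_needed:
        return None
    fits.sort()  # tightest gaps first; ties broken by node name
    return sorted(node for _, node in fits[:nodes_needed])


INF = float("inf")
gaps = {"n1": 10.0, "n2": 12.0, "n3": INF, "n4": INF, "n5": 15.0}
print(pick_nodes(gaps, nodes_needed=3, start=0.0, duration=8.0))
```

Here n3 and n4 have no later reservations, so the rule leaves them untouched and fills the three bounded gaps instead.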





Heuristics – Overview

EASY Backfilling
- considers tasks in a first come first serve (FCFS) order
- makes only one reservation for the first task that cannot fit on idle machines
- backfills other tasks so that they do not delay the reservation

FCFS with Multiple Queues
- puts the tasks in three queues
- takes 1, 4, and 8 tasks from the large, medium, and small queues respectively
- assigns tasks if possible, and otherwise makes the earliest reservation for them
- repeats until the queues are empty
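As a rough illustration of EASY Backfilling, the sketch below handles one mapping event on a single homogeneous cluster with known run times. That is a simplification of the multi-cluster system described in these slides, and the names (`Job`, `easy_backfill`) are hypothetical. A queued job may jump ahead only if it finishes before the reserved start of the first blocked job, or if it uses nodes the blocked job will not need at that time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Job:
    name: str
    nodes: int
    runtime: float


def easy_backfill(queue: List[Job], running: List[Tuple[float, int]],
                  free: int, now: float) -> List[str]:
    """queue is in FCFS order; running holds (end_time, nodes) pairs.
    Returns the names of the jobs that start at time `now`."""
    started: List[str] = []
    running = list(running)
    pending = list(queue)
    # Step 1: start jobs in FCFS order while they fit on idle nodes.
    while pending and pending[0].nodes <= free:
        job = pending.pop(0)
        free -= job.nodes
        running.append((now + job.runtime, job.nodes))
        started.append(job.name)
    if not pending:
        return started
    # Step 2: one reservation for the first job that does not fit.
    head = pending[0]
    available, reserved_start = free, None
    for end, nodes in sorted(running):
        available += nodes
        if available >= head.nodes:
            reserved_start = end
            break
    if reserved_start is None:  # head needs more nodes than the cluster has
        return started
    spare = available - head.nodes  # nodes the head job leaves unused
    # Step 3: backfill later jobs that cannot delay the reservation.
    for job in pending[1:]:
        if job.nodes > free:
            continue
        done_in_time = now + job.runtime <= reserved_start
        if done_in_time or job.nodes <= spare:
            free -= job.nodes
            if not done_in_time:
                spare -= job.nodes
            started.append(job.name)
    return started


queue = [Job("A", 8, 5.0), Job("B", 3, 20.0), Job("C", 2, 8.0), Job("D", 2, 30.0)]
print(easy_backfill(queue, running=[(10.0, 6)], free=4, now=0.0))
```

In this made-up case A is reserved to start at time 10. B would still hold three nodes at that time and is skipped, C finishes at time 8, and D fits in the two nodes A does not need, so C and D are backfilled.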

Heuristics – Overview

Max-Max Value
- first phase: consider all tasks
  - for each task, determine the allocation choice that will earn it the highest value without delaying any place-holder task
  - if there are ties, pick the choice with the earlier completion time
- second phase: consider the tasks from the first phase
  - make an assignment or a place-holder for the choice that earns the highest value
  - this assignment should not start execution after the start of the earliest place-holder task
- repeat the two phases until no more tasks can be mapped

Max-Max Value-Per-Resource
- similar to Max-Max Value
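The two-phase loop of Max-Max Value can be sketched as follows. This is a simplified illustration under stated assumptions: it only makes assignments that start now (no place-holders), clusters are reduced to a count of idle nodes, and each value function is passed in as a plain callable of completion time. The function name and data layout are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# task name -> (nodes needed, {cluster: execution time}, value_fn(completion time))
TaskSpec = Tuple[int, Dict[str, float], Callable[[float], float]]


def max_max_value(tasks: Dict[str, TaskSpec], free_nodes: Dict[str, int],
                  now: float) -> List[Tuple[str, str]]:
    """Return (task, cluster) assignments in the order they are made."""
    free = dict(free_nodes)
    unmapped = dict(tasks)
    assignments: List[Tuple[str, str]] = []
    while True:
        best = None
        for name, (need, exec_times, value_fn) in unmapped.items():
            # Phase 1: best choice for this task. Highest value wins;
            # ties go to the earlier completion time (hence the minus sign).
            choices = [(value_fn(now + t), -(now + t), name, cluster)
                       for cluster, t in exec_times.items()
                       if free.get(cluster, 0) >= need]
            if not choices:
                continue
            task_best = max(choices)
            # Phase 2: keep the highest-value choice across all tasks.
            if best is None or task_best > best:
                best = task_best
        if best is None:  # no remaining task fits anywhere right now
            return assignments
        _, _, name, cluster = best
        free[cluster] -= unmapped.pop(name)[0]
        assignments.append((name, cluster))


tasks = {
    "t1": (2, {"c1": 10.0, "c2": 20.0}, lambda c: 10.0 if c <= 15.0 else 5.0),
    "t2": (2, {"c1": 5.0}, lambda c: 8.0),
    "t3": (1, {"c2": 4.0}, lambda c: 3.0),
}
print(max_max_value(tasks, free_nodes={"c1": 2, "c2": 3}, now=0.0))
```

In this example t1 earns the most on c1 and takes both of its idle nodes, so t2 can no longer be mapped and t3 is assigned to c2 in the next round.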

Simulation Study
- to model real-world system environment
  - experiments run on ISTeC Cray HPC System
  - uses real workload traces as inputs