Slide1
Starfish: A Self-tuning System for Big Data Analytics
Presented by Carl Erhard & Zahid Mian
Authors: Herodotos Herodotou, Harold Lim, Fei Dong, Shivnath Babu
Duke University
Slide2
Analysis in the Big Data Era
9/26/2011
Massive Data → Data Analysis → Insight
Key to Success = Timely and Cost-Effective Analysis
Slide3
We want a MAD System
Magnetism: “Attracts” or welcomes all sources of data, regardless of structure, values, etc.
Agility: Adaptive; remains in sync with rapid data evolution and modification
Depth: More than just typical analytics; we need to support complex operations like statistical analysis and machine learning
Slide4
No wait…I mean MADDER
Data-lifecycle Awareness: Do more than just queries; optimize the movement, storage, and processing of big data
Elasticity: Dynamically adjust resource usage and operational costs based on workload and user requirements
Robustness: Provide storage and querying services even in the event of some failures
Slide5
Practitioners of Big Data Analytics
Who are the users?
  Data analysts, statisticians, computational scientists…
  Researchers, developers, testers…
  Business Analysts…
  You!
Who performs setup and tuning?
  The users!
  Usually lack expertise to tune the system
Slide6
Motivation
Slide7
Tuning Challenges
Heavy use of programming languages for MapReduce programs (e.g., Java/Python)
Data loaded/accessed as opaque files
Large space of tuning choices (over 190 parameters!)
Elasticity is wonderful, but hard to achieve (Hadoop has many useful mechanisms, but policies are lacking)
Terabyte-scale data cycles
Slide8
Starfish: Self-tuning System
Our goal: Provide good performance automatically
[Figure: the Hadoop stack. Labels: Java / C++ / R / Python, Oozie, Hive, Pig, Elastic MapReduce, Jaql, HBase; Hadoop (MapReduce Execution Engine, Distributed File System); Starfish; Analytics System]
Slide9
What are the Tuning Problems?
Job-level MapReduce configuration
Workload management
Data layout tuning
Cluster sizing
Workflow optimization
[Figure: jobs J1, J2, J3, J4]
Slide10
Starfish’s Core Approach to Tuning
Profiler: Collects concise summaries of execution
What-if Engine: Estimates impact of hypothetical changes on execution
  if Δ(conf. parameters) then what …?
  if Δ(data properties) then what …?
  if Δ(cluster properties) then what …?
Optimizers: Search through space of tuning choices (Job, Workflow, Workload, Data layout, Cluster)
Slide11
Starfish Architecture
Slide12
Job Level Tuning
Just-in-Time Optimizer: Automatically selects efficient execution techniques for MapReduce jobs.
Profiler: A Starfish component that collects detailed summaries of jobs on a task-by-task basis.
Sampler: Collects statistics about input, intermediate, and output data of a MapReduce job.
Slide13
MapReduce Job Execution
[Figure: input splits 0 to 3 are each processed by a map task; two reduce tasks write Out 0 and Out 1]
job j = <program p, data d, resources r, configuration c>
Slide14
What Controls MR Job Execution?
Space of configuration choices:
  Number of map tasks
  Number of reduce tasks
  Partitioning of map outputs to reduce tasks
  Memory allocation to task-level buffers
  Whether output data from tasks should be compressed
  Whether combine function should be used
job j = <program p, data d, resources r, configuration c>
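The job tuple and its configuration space can be sketched in a few lines of Python. This is only an illustration: the slide lists tuning dimensions, not parameter names, so the Hadoop 1.x-era keys below (`mapred.reduce.tasks`, `io.sort.mb`, `mapred.compress.map.output`) and their candidate values are assumptions, and `use_combiner` is a made-up flag rather than a Hadoop setting.

```python
from dataclasses import dataclass
from itertools import product

# Assumed parameter names and candidate values, chosen for illustration only.
CONFIG_SPACE = {
    "mapred.reduce.tasks": [1, 8, 32],             # number of reduce tasks
    "io.sort.mb": [100, 200],                      # map-side sort buffer (MB)
    "mapred.compress.map.output": [False, True],   # compress map output?
    "use_combiner": [False, True],                 # hypothetical flag
}


@dataclass(frozen=True)
class Job:
    """job j = <program p, data d, resources r, configuration c>."""
    program: str
    data: str
    resources: str
    configuration: tuple  # sorted (name, value) pairs, so the job is hashable


def enumerate_configurations(space):
    """Yield every combination of candidate values as a dict."""
    names = sorted(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def make_job(program, data, resources, conf):
    return Job(program, data, resources, tuple(sorted(conf.items())))


configs = list(enumerate_configurations(CONFIG_SPACE))
jobs = [make_job("word-cooccurrence", "wikipedia-10GB", "16 x c1.medium", c)
        for c in configs]
print(len(jobs), "candidate jobs for the same program, data, and resources")
```

Even four parameters with two or three values each give 24 candidates; with over 190 parameters, exhaustive enumeration is clearly not an option.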
Slide15
Effect of Configuration Settings
Use defaults or set manually (rules-of-thumb)
Rules-of-thumb may not suffice
[Figure: two-dimensional projection of a multi-dimensional surface (Word Co-occurrence MapReduce program), with the rules-of-thumb settings marked]
Slide16
MapReduce Job Tuning in a Nutshell
Goal: [formula missing from the source text]
Challenges: p is an arbitrary MapReduce program; c is high-dimensional; …
Profiler: Runs p to collect a job profile (concise execution summary) of <p, d1, r1, c1>
What-if Engine: Given profile of <p, d1, r1, c1>, estimates virtual profile for <p, d2, r2, c2>
Optimizer: Enumerates and searches through the optimization space S efficiently
Slide17
Job Profile
Concise representation of program execution as a job
Records information at the level of “task phases”
Generated by Profiler through measurement or by the What-if Engine through estimation
[Figure: phases of a map task: Read (split from DFS), Map, Collect (Serialize, Partition into the Memory Buffer), Spill (Sort, [Combine], [Compress]), Merge]
Slide18
Job Profile Fields
Dataflow: amount of data flowing through task phases
  Map output bytes
  Number of spills
  Number of records in buffer per spill
Costs: execution times at the level of task phases
  Read phase time in the map task
  Map phase time in the map task
  Spill phase time in the map task
Dataflow Statistics: statistical information about dataflow
  Width of input key-value pairs
  Map selectivity in terms of records
  Map output compression ratio
Cost Statistics: statistical information about resource costs
  I/O cost for reading from local disk per byte
  CPU cost for executing the Mapper per record
  CPU cost for uncompressing the input per byte
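One way to read the four field groups: Dataflow and Costs are measured directly, while the two kinds of statistics are ratios derived from them. The sketch below illustrates that relationship under stated assumptions. The class name, field names, and sample numbers are hypothetical, and treating the whole Map phase time as Mapper CPU time is a simplification.

```python
from dataclasses import dataclass


@dataclass
class MapTaskProfile:
    # Dataflow fields (measured)
    map_input_records: int
    map_output_records: int
    map_output_bytes: int
    num_spills: int
    # Cost fields (measured, in seconds)
    read_phase_s: float
    map_phase_s: float
    spill_phase_s: float

    def map_selectivity_records(self):
        """Dataflow statistic: output records produced per input record."""
        return self.map_output_records / self.map_input_records

    def avg_records_per_spill(self):
        """Average number of map output records written per spill."""
        return self.map_output_records / self.num_spills

    def map_cpu_s_per_record(self):
        """Cost statistic: Map phase seconds per input record (simplified)."""
        return self.map_phase_s / self.map_input_records


# Hypothetical measurements for a single map task.
profile = MapTaskProfile(
    map_input_records=1_000_000,
    map_output_records=2_500_000,
    map_output_bytes=50_000_000,
    num_spills=5,
    read_phase_s=4.0,
    map_phase_s=20.0,
    spill_phase_s=6.0,
)
print(profile.map_selectivity_records(), profile.avg_records_per_spill())
```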
Slide19
Generating Profiles by Measurement
Goals:
  Have zero overhead when profiling is turned off
  Require no modifications to Hadoop
  Support unmodified MapReduce programs written in Java or Hadoop Streaming/Pipes (Python/Ruby/C++)
Approach: Dynamic (on-demand) instrumentation
  Event-condition-action rules are specified (in Java)
  Leads to run-time instrumentation of Hadoop internals
  Monitors task phases of MapReduce job execution
  We currently use BTrace (Hadoop internals are in Java)
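The idea of on-demand instrumentation can be illustrated with a small Python analogy. This is not BTrace and not Starfish's implementation: the `Profiler` and `MapTask` classes and the rule functions are hypothetical. Each rule fires on a method exit (event), checks a predicate (condition), and records a timing (action). Detaching restores the original methods, so untouched code pays no cost when profiling is off.

```python
import time


class Profiler:
    """Attaches event-condition-action rules to methods at run time."""

    def __init__(self):
        self.records = []      # (phase, seconds) pairs written by actions
        self._originals = []   # (owner, name, original) so detach() can undo

    def attach(self, owner, name, condition, action):
        original = getattr(owner, name)

        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = original(*args, **kwargs)    # event: the method returns
            elapsed = time.perf_counter() - start
            if condition(name, elapsed):          # condition
                action(self, name, elapsed)       # action
            return result

        setattr(owner, name, wrapper)
        self._originals.append((owner, name, original))

    def detach(self):
        """Restore every instrumented method to its original definition."""
        for owner, name, original in reversed(self._originals):
            setattr(owner, name, original)
        self._originals.clear()


class MapTask:
    """Stand-in for an unmodified task with two phases."""

    def read(self, split):
        return list(split)

    def map(self, records):
        return [(record, 1) for record in records]


def always(phase, seconds):
    return True


def record_timing(profiler, phase, seconds):
    profiler.records.append((phase, seconds))


profiler = Profiler()
original_read = MapTask.read
profiler.attach(MapTask, "read", always, record_timing)
profiler.attach(MapTask, "map", always, record_timing)

task = MapTask()
task.map(task.read(["a", "b"]))    # both phases are timed
profiler.detach()
task.map(task.read(["c"]))         # instrumentation removed, nothing recorded
print([phase for phase, _ in profiler.records])
```

Note that `MapTask` itself is never edited, which mirrors the goal of supporting unmodified programs.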
Slide20
Generating Profiles by Measurement
[Figure: profiling is enabled by applying ECA rules to the task JVMs; map and reduce tasks emit raw data, which is combined into a map profile and a reduce profile, and then into the job profile]
Use of Sampling:
  Profile fewer tasks
  Execute fewer tasks
JVM = Java Virtual Machine, ECA = Event-Condition-Action
Slide21
Results of Job Profiling
Slide22
Results using Job Profiling
Slide23
Workflow-Aware Scheduling
Unbalanced Data Layout:
  Skewed Data
  Data Layout Not Considered when Scheduling Tasks
  Addition/Dropping Partitions: No Rebalance
  Can Lead to Failures Due to Space Issues
  Locality-Aware Schedulers Can Make Problem Worse
Possible Solutions:
  Change # of Replicas
  Collocating Data (Block Placement Policy)
Slide24
Impact of Unbalanced Data Layout
Slide25
Impact of Unbalanced Data Layout
Slide26
Impact of Unbalanced Data Layout
Slide27
Workflow-Aware Scheduling
Makes Decisions by Considering Producer-Consumer Relationships
[Figure: producer and consumer jobs across Nodes]
Slide28
Starfish’s Workflow-Aware Scheduler
Space of Choices:
  Block Placement Policy: Round Robin (Local Write is default)
  Replication Factor
  Size of blocks: generally large for large files
  Compression: Impacts I/O; not always beneficial
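A toy simulation shows why the block placement policy matters for layout balance. It assumes a replication factor of 1, equal-sized blocks, and only two nodes producing output. The function names and node names are hypothetical, and this is not HDFS code.

```python
from collections import Counter
from itertools import cycle


def local_write(writer_nodes, blocks_per_writer, all_nodes):
    """Local Write: each block stays on the node that wrote it."""
    placement = Counter({node: 0 for node in all_nodes})
    for node in writer_nodes:
        placement[node] += blocks_per_writer
    return placement


def round_robin(writer_nodes, blocks_per_writer, all_nodes):
    """Round Robin: blocks are spread over all nodes in turn."""
    placement = Counter({node: 0 for node in all_nodes})
    targets = cycle(all_nodes)
    for _ in range(len(writer_nodes) * blocks_per_writer):
        placement[next(targets)] += 1
    return placement


def imbalance(placement):
    """Difference between the fullest and the emptiest node, in blocks."""
    return max(placement.values()) - min(placement.values())


nodes = [f"node{i}" for i in range(4)]
writers = ["node0", "node1"]   # e.g. two reduce tasks producing the output

lw = local_write(writers, 6, nodes)
rr = round_robin(writers, 6, nodes)
print("local write:", dict(lw), "imbalance", imbalance(lw))
print("round robin:", dict(rr), "imbalance", imbalance(rr))
```

With Local Write, all 12 blocks sit on two of the four nodes, so a consumer job either loses data locality or overloads those two nodes. Round Robin leaves three blocks on every node.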
Slide29
Starfish’s Workflow-Aware Scheduler
What-If Questions:
A) Expected runtime of Job P if the RR (Round Robin) block placement policy is used for P’s output files?
B) New data layout in the cluster if the RR block placement policy is used for P’s output files?
C) Expected runtime of Job C1 (C2) if its input data layout is the one in the answer to Question B?
D) Expected runtimes of Jobs C1 and C2 if scheduled concurrently when Job P completes?
E) Given the Local Write block placement policy and RF (replication factor) = 1 for Job P’s output, what is the expected increase in the runtime of Job C1 if one node in the cluster fails during C1’s execution?
Slide30
Estimates from the What-if Engine
Hadoop cluster: 16 nodes, c1.medium
MapReduce Program: Word Co-occurrence
Data set: 10 GB Wikipedia
[Figure: true surface vs. estimated surface]
Slide31
Workflow Scheduler Picks Layout
Slide32
Optimizations: Workload Optimizer
Slide33
Provisioning: Elastisizer
Motivation: Amazon Elastic MapReduce
  Data in S3, processed in-cluster, stored to S3
  User Pays for Resources Used
Elastisizer Determines …
  Best cluster
  Hadoop configurations
  … Based on user-specified goals (execution time and costs)
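The trade-off between execution time and cost can be made concrete with a small sketch. All numbers are invented for illustration: the runtime table stands in for What-if Engine estimates, the hourly prices are not actual Amazon prices, and billing per started instance-hour is an assumption. Prices are kept in integer cents to avoid floating-point noise.

```python
import math

# Hypothetical estimates: (instance type, node count) -> runtime in minutes.
ESTIMATED_RUNTIME_MIN = {
    ("m1.large", 10): 130,
    ("m1.large", 20): 70,
    ("c1.medium", 10): 150,
    ("c1.medium", 20): 80,
}

# Illustrative hourly prices in cents per node (not real price data).
HOURLY_PRICE_CENTS = {"m1.large": 34, "c1.medium": 17}


def cost_cents(instance_type, nodes, runtime_min):
    """Cost of a run, assuming every started hour is billed in full."""
    billed_hours = math.ceil(runtime_min / 60)
    return nodes * billed_hours * HOURLY_PRICE_CENTS[instance_type]


def cheapest_within_deadline(deadline_min):
    """Return (cost, runtime, type, nodes) of the cheapest feasible cluster."""
    feasible = [
        (cost_cents(itype, nodes, runtime), runtime, itype, nodes)
        for (itype, nodes), runtime in ESTIMATED_RUNTIME_MIN.items()
        if runtime <= deadline_min
    ]
    return min(feasible) if feasible else None


for deadline in (60, 90, 180):
    print(deadline, "min deadline ->", cheapest_within_deadline(deadline))
```

A tight deadline forces the larger cluster, a loose one allows the cheaper cluster, and an infeasible deadline returns `None`.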
Slide34
Elastisizer Configuration Evaluation
Slide35
Elastisizer Configuration Evaluation
Slide36
Elastisizer: Cluster Configurations
Slide37
Multi-objective Cluster Provisioning
Instance Type for Source Cluster: m1.large
Slide38
Critique of Paper
Good:
  Source Available for Implementation
  Able to See the impact of various settings
  Good Visualization Tools
  Tutorials/Source available at duke.edu/starfish
Bad:
  The paper (and subsequent materials) talk a lot about what Starfish does, but not necessarily how it does it
  There is no documentation on LastWord, and this seems important
  Only works after the job/workflow has been executed at least once
Slide39
Starfish’s Visualizer
Timeline Views
  Shows progress of a job execution at the task level
  See execution of same job with different settings
Data-flow Views
  View of flow of data among nodes, along with MR jobs
  “Video Mode” allows playback of past executions
Profile Views
  Timings, data-flow, resource-level
Slide40
Timeline Views
Slide41
Timeline View
Slide42
Data Skew View
Slide43
Data Skew View
Slide44
Data Skew View
Slide45
Data-flow Views
Slide46
References
Herodotou, Herodotos, et al. "Starfish: A self-tuning system for big data analytics." Proc. of the Fifth CIDR Conf. 2011.
Dong, Fei. Extending Starfish to Support the Growing Hadoop Ecosystem. Diss. Duke University, 2012.
Herodotou, Herodotos, Fei Dong, and Shivnath Babu. "MapReduce programming and cost-based optimization? Crossing this chasm with Starfish." Proceedings of the VLDB Endowment 4.12 (2011).
http://www.cs.duke.edu/starfish/
http://www.youtube.com/watch?v=Upxe2dzE1uk
Slide47
Backup
Slide48
Hadoop MapReduce Ecosystem
Popular solution to Big Data Analytics
[Figure: the Hadoop stack. Labels: Java / C++ / R / Python, Oozie, Hive, Pig, Elastic MapReduce, Jaql, HBase; Hadoop (MapReduce Execution Engine, Distributed File System)]
Slide49
Workflow-level Tuning
Starfish has a Workflow-aware Scheduler which addresses several concerns:
  How do we equally distribute computation across nodes?
  How do we adapt to imbalance of a load or energy cost?
The Workflow-aware Scheduler works with the What-if Engine and the Data Manager to answer these questions
Slide50
Workload-level Tuning
Starfish’s Workload Optimizer is aware of each workflow that will be executed. It reorders the workflows in order to make them more efficient.
This includes reusing data for different workflows that use the same MapReduce jobs.
Slide51
What-if Engine
[Figure: What-if Engine]
Inputs: Job Profile <p, d1, r1, c1>; Properties of Hypothetical job (Possibly Hypothetical): Input Data Properties <d2>, Cluster Resources <r2>, Configuration Settings <c2>
Components: Job Oracle, Task Scheduler Simulator
Output: Virtual Job Profile for <p, d2, r2, c2>
Slide52
Virtual Profile Estimation
Given profile for job j = <p, d1, r1, c1>, estimate profile for job j' = <p, d2, r2, c2>
Inputs: Profile for j (Dataflow Statistics, Dataflow, Cost Statistics, Costs); Input Data d2; Configuration c2; Resources r2
Models:
  Cardinality Models → Dataflow Statistics
  White-box Models → Dataflow
  Relative Black-box Models → Cost Statistics
  White-box Models → Costs
Output: (Virtual) Profile for j' (Dataflow Statistics, Dataflow, Cost Statistics, Costs)
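A heavily simplified sketch of the estimation chain follows. It is not the actual Starfish models. It assumes the dataflow statistics of j carry over unchanged to j' (the simplest possible cardinality assumption), it replaces the relative black-box model with a single hypothetical scaling factor, and it computes dataflow and costs with one-line white-box formulas. All field names and numbers are made up.

```python
def estimate_virtual_profile(profile, input_records_d2, relative_cpu_factor):
    """Estimate a map-side virtual profile for j' from the profile of j."""
    stats = profile["dataflow_statistics"]
    cost_stats = profile["cost_statistics"]

    # Dataflow statistics for j': assumed identical to those measured for j.
    virtual_stats = dict(stats)

    # Cost statistics for j': scaled by a hypothetical per-cluster factor,
    # standing in for a relative black-box model.
    virtual_cost_stats = {
        "map_cpu_s_per_record":
            cost_stats["map_cpu_s_per_record"] * relative_cpu_factor,
    }

    # Dataflow for j' (white-box): apply the statistics to the new input size.
    out_records = input_records_d2 * virtual_stats["map_selectivity_records"]
    out_bytes = out_records * virtual_stats["map_output_record_bytes"]

    # Costs for j' (white-box): dataflow multiplied by the cost statistics.
    map_phase_s = input_records_d2 * virtual_cost_stats["map_cpu_s_per_record"]

    return {
        "dataflow_statistics": virtual_stats,
        "cost_statistics": virtual_cost_stats,
        "dataflow": {
            "map_input_records": input_records_d2,
            "map_output_records": out_records,
            "map_output_bytes": out_bytes,
        },
        "costs": {"map_phase_s": map_phase_s},
    }


# Hypothetical profile for j, measured on <p, d1, r1, c1>.
profile_j = {
    "dataflow_statistics": {
        "map_selectivity_records": 2.5,
        "map_output_record_bytes": 20,
    },
    "cost_statistics": {"map_cpu_s_per_record": 2e-5},
}

# j' reads 4 million records on nodes assumed to need half the CPU time.
virtual = estimate_virtual_profile(profile_j, 4_000_000, relative_cpu_factor=0.5)
print(virtual["dataflow"], virtual["costs"])
```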
Slide53
Job Optimizer
[Figure: Just-in-Time Optimizer]
Inputs: Job Profile <p, d1, r1, c1>; Input Data Properties <d2>; Cluster Resources <r2>
Just-in-Time Optimizer: Subspace Enumeration; Recursive Random Search (issues What-if calls)
Output: Best Configuration Settings <c_opt> for <p, d2, r2>
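The search step can be illustrated with a reduced form of recursive random search: sample the space at random, then repeatedly resample in a shrinking box around the best point found. This is a sketch under assumptions, not the optimizer's actual algorithm. It omits restarts and subspace enumeration, treats both parameters as continuous, and uses a toy `what_if_cost` function in place of real What-if calls.

```python
import random


def what_if_cost(conf):
    """Toy stand-in for a What-if call: predicted runtime in seconds."""
    reducers, sort_mb = conf
    return (reducers - 24) ** 2 + 0.05 * (sort_mb - 160) ** 2 + 100


# Hypothetical search limits: (number of reducers, sort buffer in MB).
BOUNDS = [(1, 64), (50, 400)]


def sample(bounds, rng):
    """Draw one configuration uniformly at random from the box."""
    return tuple(rng.uniform(lo, hi) for lo, hi in bounds)


def shrink(bounds, center, factor, limits):
    """Return a box `factor` times as wide, centered on `center`, clipped to `limits`."""
    new_bounds = []
    for (lo, hi), c, (lim_lo, lim_hi) in zip(bounds, center, limits):
        half = (hi - lo) * factor / 2
        new_bounds.append((max(lim_lo, c - half), min(lim_hi, c + half)))
    return new_bounds


def recursive_random_search(cost, limits, rng, rounds=6, samples=30, factor=0.5):
    bounds = list(limits)
    # Exploration: random sampling over the whole space.
    best = min((sample(bounds, rng) for _ in range(samples)), key=cost)
    # Exploitation: resample in a shrinking neighborhood of the best point.
    for _ in range(rounds):
        bounds = shrink(bounds, best, factor, limits)
        candidate = min((sample(bounds, rng) for _ in range(samples)), key=cost)
        if cost(candidate) < cost(best):
            best = candidate
    return best


best = recursive_random_search(what_if_cost, BOUNDS, random.Random(7))
print("best configuration:", best, "predicted runtime:", what_if_cost(best))
```

Each candidate costs one model evaluation rather than one job run, which is what makes sampling hundreds of configurations practical.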