Map Reduce Basics - Chapter 2

Presentation Transcript

Slide 1: Map Reduce Basics - Chapter 2

Slide 2: Basics

Divide and conquer:
Partition a large problem into smaller subproblems
Workers work on the subproblems in parallel
(threads in a core, cores in a multi-core processor, multiple processors in a machine, machines in a cluster)
Combine intermediate results from the workers into a final result

Issues:
How to break the problem up into smaller tasks
How to assign tasks to workers
How workers get the data they need
How to coordinate synchronization among workers
How to share partial results
How to do all of this in the presence of software errors and hardware faults

Slide 3: Basics

MapReduce (MR) is an abstraction that hides system-level details from the programmer
Move code to the data
Spread data across disks
A distributed file system (DFS) manages storage

Slide 4: Topics

Functional programming
MapReduce
Distributed file system

Slide 5: Functional Programming Roots

MapReduce = functional programming plus distributed processing, on steroids
Not a new idea... dates back to the 1950s (or even the 1930s)

What is functional programming?
Computation as the application of functions
Computation is the evaluation of mathematical functions
Avoids state and mutable data
Emphasizes application of functions instead of changes in state

Slide 6: Functional Programming Roots

How is it different?
Traditional notions of "data" and "instructions" are not applicable
Data flows are implicit in the program
Different orders of execution are possible
Theoretical foundation provided by the lambda calculus, a formal system for function definition
Exemplified by LISP and Scheme

Slide 7: Overview of Lisp

Functions are written in prefix notation:

(+ 1 2) → 3
(* 3 4) → 12
(sqrt (+ (* 3 3) (* 4 4))) → 5
(define x 3) → x
(* x 5) → 15

Slide 8: Functions

Functions = lambda expressions bound to variables
Example expressed with lambda: (+ 1 2) → 3 applies the function λx.λy.x+y

The definition:
(define (foo x y)
  (sqrt (+ (* x x) (* y y))))

is equivalent to binding a lambda expression to a variable:
(define foo
  (lambda (x y) (sqrt (+ (* x x) (* y y)))))

Once defined, the function can be applied:
(foo 3 4) → 5

Slide 9: Functional Programming Roots

Two important concepts in functional programming:
Map: do something to everything in a list
Fold: combine the results of a list in some way

Slide 10: Functional Programming - Map

Higher-order functions accept other functions as arguments
Map takes a function f and a list as its arguments
Applies f to all elements in the list
Returns a list as the result
Lists are primitive data types:
[1 2 3 4 5]
[[a 1] [b 2] [c 3]]

Slide 11: Map/Fold in Action

Simple map example:
(map (lambda (x) (* x x)) [1 2 3 4 5]) → [1 4 9 16 25]

Slide 12: Functional Programming - Reduce (Fold)

Fold takes a function g of two arguments, an initial value, and a list
g is applied to the initial value and the 1st item in the list
The result is stored in an intermediate variable
The intermediate variable and the next item in the list form the 2nd application of g, and so on
Fold returns the final value of the intermediate variable
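
To make the accumulation above concrete (an illustrative sketch, not from the slides; the class and method names are my own), here is fold written as a plain Java loop:

    import java.util.List;
    import java.util.function.BinaryOperator;

    public class Fold {
        // Apply g to an accumulator (initialized to init) and each list element in turn.
        static <T> T fold(BinaryOperator<T> g, T init, List<T> xs) {
            T acc = init;                  // the intermediate variable
            for (T x : xs) {
                acc = g.apply(acc, x);     // g(intermediate value, next item)
            }
            return acc;                    // final value of the intermediate variable
        }

        public static void main(String[] args) {
            System.out.println(fold(Integer::sum, 0, List.of(1, 2, 3, 4, 5)));     // 15
            System.out.println(fold((a, b) -> a * b, 1, List.of(1, 2, 3, 4, 5)));  // 120
        }
    }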

Slide 13: Map/Fold in Action

Simple map example:
(map (lambda (x) (* x x)) [1 2 3 4 5]) → [1 4 9 16 25]

Fold examples:
(fold + 0 [1 2 3 4 5]) → 15
(fold * 1 [1 2 3 4 5]) → 120

Sum of squares:
(define (sum-of-squares v)   ; where v is a list
  (fold + 0 (map (lambda (x) (* x x)) v)))

(sum-of-squares [1 2 3 4 5]) → 55

Slide 15: Functional Programming Roots

Use map and fold in combination
Map: a transformation over a dataset
Fold: an aggregation operation
Map can be applied to elements in parallel
Fold has more restrictions: elements must be brought together
Many applications do not require g to be applied to all elements of the list at once, so fold aggregations can also run in parallel

Slide 16: Functional Programming Roots

Map in MapReduce is the same as in functional programming
Reduce corresponds to fold
Two stages:
A user-specified computation is applied over all input; it can occur in parallel and returns intermediate output
The intermediate output is aggregated by another user-specified computation

Slide 17: Mappers/Reducers

The key-value pair (k, v) is the basic data structure in MapReduce
Keys and values can be ints, strings, etc., and are user defined
e.g. keys are URLs, values are HTML content
e.g. keys are node ids, values are adjacency lists of nodes

Map: (k1, v1) -> [(k2, v2)]
Reduce: (k2, [v2]) -> [(k3, v3)]
where [...] denotes a list
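
In Hadoop's Java API these signatures show up as the type parameters of the Mapper and Reducer base classes. A minimal sketch (the concrete class names and Writable types here are chosen only for illustration):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Map: (k1, v1) -> [(k2, v2)]   corresponds to   Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
    class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        // override map() to emit zero or more (k2, v2) pairs per input pair
    }

    // Reduce: (k2, [v2]) -> [(k3, v3)]   corresponds to   Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
    class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        // override reduce() to emit zero or more (k3, v3) pairs per key
    }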

Slide 18: General Flow

Map:
Apply the mapper to every input key-value pair stored in the DFS
Generate an arbitrary number of intermediate (k, v) pairs

Shuffle:
Distributed group-by operation (shuffle) on the intermediate keys
Sort intermediate results by key (within each reducer, not across reducers)

Reduce:
Aggregate the intermediate results
Generate the final output to the DFS, one file per reducer

Slide 19: What function is implemented?

Slide 21: Example: unigram counts (word count)

Input: (docid, doc) pairs stored on the DFS, where doc is the document text
The mapper tokenizes (docid, doc) and emits a (k, v) pair for every word: (word, 1)
The execution framework brings all pairs with the same key together at a reducer
The reducer sums all the counts (of 1) for each word
Each reducer writes one output file
Words within a file are in sorted order; each file holds roughly the same number of words
The output can be used as input to another MapReduce job
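
A sketch of this job in Hadoop's Java API, modeled on the standard WordCount example (class names are illustrative; the driver that wires these classes together is sketched on a later slide):

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Mapper: tokenizes each document and emits (word, 1) for every token.
    class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object docid, Text doc, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(doc.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);    // emit (word, 1)
            }
        }
    }

    // Reducer: receives (word, [1, 1, ...]) and sums the counts.
    class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable count : counts) {
                sum += count.get();
            }
            result.set(sum);
            context.write(word, result);     // emit (word, total count)
        }
    }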

Slide 22: Combiners - Bandwidth Optimization

Issue: a large number of intermediate key-value pairs
Example: word count emits (word, 1) for every token
If copied across the network, the intermediate data can be larger than the input

Solution: use combiner functions
Allow local aggregation (after the mapper) before the shuffle and sort
Word count: aggregate (count each word locally), so the intermediate data shrinks to one pair per unique word per mapper
Executed on the same machine as the mapper, with no output from other mappers
Results in a "mini-reduce" right after the map phase
The combiner's input and output (k, v) pairs must be of the same type
If the reduce operation is associative and commutative, the reducer itself can serve as the combiner
Reduces the number of key-value pairs to save bandwidth
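
Because summing counts is associative and commutative, the reducer sketched above can also be registered as the combiner. A minimal illustration of the relevant job-setup call (Job.setCombinerClass); it assumes the IntSumReducer class from the earlier sketch:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class CombinerSetup {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count with combiner");
            // Run a "mini-reduce" on each mapper's local output before the shuffle;
            // legal here because summing counts is associative and commutative.
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
        }
    }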

Slide 23: Partitioners - Load Balance

Issue: intermediate results could all land on one reducer
Solution: use partitioner functions
Divide up the intermediate key space and assign (k, v) pairs to reducers
The partitioner specifies the reduce task to which each (k, v) pair is copied
Each reducer processes its keys in sorted order
A typical partitioner computes a hash value of the key and takes it modulo the number of reducers
Hopefully this sends about the same number of keys to each reducer
But the key distribution may be Zipfian (skewed)
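
This hash-and-mod scheme is essentially what Hadoop's default partitioner does; a sketch (the class name is my own; the masking keeps the result non-negative for negative hash codes):

    import org.apache.hadoop.mapreduce.Partitioner;

    // Assign each key to a reducer: hash the key, mask off the sign bit,
    // and take the value modulo the number of reduce tasks.
    class HashKeyPartitioner<K, V> extends Partitioner<K, V> {
        @Override
        public int getPartition(K key, V value, int numReduceTasks) {
            return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
        }
    }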

Slide 24: MapReduce

Programmers specify two functions:
map (k, v) → <k', v'>*
reduce (k', v') → <k', v'>*
All values v' with the same key k' are reduced together

Usually, programmers also specify:
partition (k', number of partitions) → partition for k'
Often a simple hash of the key, e.g. hash(k') mod n
Allows reduce operations for different keys to run in parallel

Implementations:
Google has a proprietary implementation in C++
Hadoop is an open-source implementation in Java (led by Yahoo)

Slide 25: It's not just Map and Reduce

Map: apply the mapper to every input key-value pair stored in the DFS, generating an arbitrary number of intermediate (k, v) pairs
Combine: aggregate locally on each mapper's machine
Partition: assign intermediate (k, v) pairs to reducers
Shuffle: distributed group-by operation on the intermediate keys; sort intermediate results by key (within each reducer, not across reducers)
Reduce: aggregate the intermediate results and generate the final output to the DFS, one file per reducer

Slide 26: Execution Framework

A MapReduce program (job) contains:
Code for the mappers
Combiners
Partitioners
Code for the reducers
Configuration parameters (where the input lives, where to store the output)

The execution framework takes care of everything else
The developer submits the job to the submission node of the cluster (the jobtracker)
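
A hedged sketch of such a job, following the standard Hadoop WordCount driver and reusing the mapper/reducer classes sketched on earlier slides; the input and output paths are exactly the configuration parameters mentioned above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // The "job" bundles mapper, combiner, reducer, and configuration,
    // then hands everything to the execution framework.
    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCountDriver.class);

            job.setMapperClass(TokenizerMapper.class);   // code for the mappers
            job.setCombinerClass(IntSumReducer.class);   // optional combiner
            job.setReducerClass(IntSumReducer.class);    // code for the reducers

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));    // where the input is
            FileOutputFormat.setOutputPath(job, new Path(args[1]));  // where to store output

            System.exit(job.waitForCompletion(true) ? 0 : 1);        // submit and wait
        }
    }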

Slide 27: Recall these problems?

How do we assign work units to workers?
What if we have more work units than workers?
What if workers need to share partial results?
How do we aggregate partial results?
How do we know all the workers have finished?
What if workers die?

Slide 28: Execution Framework - Scheduling

A job is divided into tasks (each responsible for a certain block of (k, v) pairs)
A job can have thousands of tasks that need to be assigned
This may exceed the number that can run concurrently, so tasks wait in a task queue
Also requires coordination among tasks from different jobs

Slide 29: Execution Framework - Speculative Execution

The map phase is only as fast as... the slowest map task
Problem: stragglers, flaky hardware
Solution: use speculative execution
Run an exact copy of the same task on a different machine
Use the result of whichever copy finishes first
Better for map or for reduce?
Can improve running time by 44% (Google)
Doesn't help if the distribution of values is skewed

Slide 30: Execution Framework - Data/Code Co-location

Execute code near the data
If that is not possible, the data must be streamed over the network
Try to keep it within the same rack

Slide 31: Execution Framework - Synchronization

Concurrently running processes must join up
Intermediate (k, v) pairs are grouped by key; the intermediate data is copied over the network and shuffled/sorted
Number of copy operations? Worst case: M x R copy operations, since each mapper may send intermediate results to every reducer
Reduce computation cannot start until all mappers have finished and the (k, v) pairs have been shuffled and sorted
This differs from functional programming
However, intermediate (k, v) pairs can be copied over the network to reducers as soon as each mapper finishes

Slide 32: Execution Framework - Error/Fault Handling

Failures are the norm:
Disk failures, RAM errors, datacenter outages
Software errors
Corrupted data

Slide 34: Differences in MapReduce Implementations

Hadoop (Apache) vs. Google:
Hadoop: values arrive in arbitrary order, and the reducer can change the key
Google: the program can specify a secondary sort, but the reducer cannot change the key
Hadoop: the programmer can specify the number of map tasks, but the framework makes the final decision
For reduce, the programmer-specified number of tasks is used

Slide 35: Hadoop

Be careful using external resources (e.g. querying a SQL database can become a bottleneck)
Mappers can emit an arbitrary number of intermediate (k, v) pairs, which can be of a different type than the input
Reducers can emit an arbitrary number of final (k, v) pairs, which can be of a different type than the intermediate (k, v) pairs
Different from functional programming: code can have side effects (internal state changes may cause problems; external side effects may write to files)
A MapReduce job can have no reduce stage, but it must have a mapper
Can just pass the identity function as the reducer
A job may not even need any input (e.g. computing pi)

Slide 36: Other Sources

Other systems can serve as the source or destination for MapReduce data
Google: BigTable
Hadoop: HBase, a BigTable clone
Hadoop has also been integrated with parallel relational databases, so jobs can read from and write to database tables

Slide 37: Distributed File System (DFS)

In HPC, storage is distinct from computation
NAS (network-attached storage) and SAN (storage area network) are common
Separate, dedicated nodes for storage
Fetch data, load it, process it, write results back
The link to storage becomes a bottleneck
Higher-performance networks cost more (10G Ethernet); special-purpose interconnects cost even more (InfiniBand)
Cost increases non-linearly
In GFS, computation and storage are not distinct components

Slide 38: Hadoop Distributed File System (HDFS)

GFS supports Google's proprietary MapReduce
HDFS supports Hadoop
MapReduce does not have to run on top of these file systems, but without them it misses their advantages
Differences of GFS and HDFS vs. a generic DFS:
Adapted to large-scale data processing
User data is divided into chunks/blocks that are LARGE
These blocks are replicated across the local disks of nodes in the cluster
Master-slave architecture

Slide 39: HDFS vs. GFS (Google File System)

Terminology differences in the master-slave architecture:
GFS: master (master), slave (chunkserver)
HDFS: master (namenode), slave (datanode)

Master: manages the namespace (metadata, directory structure, file-to-block mapping, location of blocks, access permissions)
Slaves: manage the actual data blocks
A client contacts the namenode for metadata, then gets data from the slaves; there are 3 copies of each block, etc.
A block is 64 MB
Initially, files were immutable: once closed, they could not be modified
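
The block size and replication factor above are just HDFS configuration values; a minimal sketch of setting them programmatically (my own illustration; property names as used in Hadoop 2.x and later, values mirroring the slide):

    import org.apache.hadoop.conf.Configuration;

    public class HdfsSettings {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            conf.setInt("dfs.replication", 3);                  // three copies of each block
            conf.setLong("dfs.blocksize", 64L * 1024 * 1024);   // 64 MB blocks (historical default)
            System.out.println(conf.get("dfs.blocksize"));
        }
    }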

Slide 42: HDFS Namenode

Namespace management
Coordinates file operations
Lazy garbage collection
Maintains file system health
Heartbeats, under-replication detection, balancing
Supports a subset of the POSIX API; the rest is pushed to the application
No security

Slide 43: Hadoop Cluster Architecture

The HDFS namenode runs the namenode daemon
The job submission node runs the jobtracker
It is the point of contact for running MapReduce jobs
It monitors the progress of MapReduce jobs and coordinates mappers and reducers
Slaves run a tasktracker and a datanode daemon
The tasktracker runs the user's code; the datanode daemon serves HDFS data
Slaves send heartbeat messages to the jobtracker

Slide 44: Hadoop Cluster Architecture

The number of reduce tasks is whatever number of reducers the programmer specifies
The number of map tasks depends on:
A hint from the programmer
The number of input files
The number of HDFS data blocks of those files
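
A small illustration of the first point (my own sketch): the reduce-task count is set directly on the job, while the map-task count has no direct setter in the current API and follows from the input splits:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class TaskCounts {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "task counts");
            job.setNumReduceTasks(8);   // reduce tasks: exactly what the programmer specifies
            // Map tasks: no direct setter here; the count follows from the number of
            // input files and their HDFS blocks (the computed input splits).
        }
    }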

Slide 46: Hadoop Cluster Architecture

Each map task is assigned a portion of the input (k, v) pairs, called an input split
Input splits are computed automatically
Splits are aligned on HDFS block boundaries so each is associated with a single block, which simplifies scheduling
Data locality is exploited; if that is not possible, data is streamed across the network (from the same rack if possible)

Slide 47: How can we use MapReduce to solve problems?

Slide 49: Hadoop Cluster Architecture - Mappers in Hadoop

Mappers are Java objects with a MAP method
A mapper object is instantiated for every map task by the tasktracker
Life cycle: instantiation, then a hook in the API for programmer-specified initialization code
Mappers can load state, static data sources, dictionaries, etc.
After initialization, the MAP method is called by the framework on every (k, v) pair in the input split
The method calls happen within the same Java object, so state can be preserved across multiple (k, v) pairs in the same task
Programmer-specified termination code can also be run
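
An illustrative mapper using all three life-cycle hooks (setup, map, cleanup) and keeping state across the (k, v) pairs of one input split; the "in-mapper combining" framing and class name are my own, not from the slides:

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    class InMapperCombiningMapper extends Mapper<Object, Text, Text, IntWritable> {
        private Map<String, Integer> counts;

        @Override
        protected void setup(Context context) {
            counts = new HashMap<>();       // initialization hook: load state, dictionaries, etc.
        }

        @Override
        public void map(Object docid, Text doc, Context context) {
            StringTokenizer itr = new StringTokenizer(doc.toString());
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);  // state persists across calls
            }
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            // termination hook: emit the locally aggregated counts once per task
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                context.write(new Text(e.getKey()), new IntWritable(e.getValue()));
            }
        }
    }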

Slide 50: Hadoop Cluster Architecture - Reducers in Hadoop

Execution is similar to that of mappers
Instantiation, initialization, then the framework calls the REDUCE method with an intermediate key and an iterator over all of that key's values
Intermediate keys are processed in sorted order
State can be preserved across multiple intermediate keys
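
An illustrative reducer showing state carried across keys: it sums each key's values and also keeps a running grand total that is emitted once in cleanup (class and key names are my own):

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // reduce() is called once per intermediate key (in sorted order);
    // instance fields preserve state across those calls until cleanup().
    class TotallingReducer extends Reducer<Text, IntWritable, Text, LongWritable> {
        private long grandTotal;             // state carried across keys

        @Override
        protected void setup(Context context) {
            grandTotal = 0L;
        }

        @Override
        public void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            grandTotal += sum;               // accumulate across keys
            context.write(word, new LongWritable(sum));
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            context.write(new Text("*TOTAL*"), new LongWritable(grandTotal));
        }
    }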

Slide 51: CAP Theorem

Consistency, availability, partition tolerance
Cannot satisfy all 3 at once
Partitioning is unavoidable in large data systems, so we must trade off availability against consistency
If the master fails, the system is unavailable, but consistent!
With multiple masters, the system is more available, but inconsistent
Workaround to the single namenode:
A warm standby namenode
The Hadoop community is working on it