Presentation Transcript

1. MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean & Sanjay Ghemawat
Appeared in: OSDI '04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, December 2004.
Presented by: Hemanth Makkapati (makka@vt.edu)

2. Mom told Sam
An apple a day keeps a doctor away!

3. One day
Sam thought of "drinking" the apple.
So, he used a knife to cut the apple and a juicer to make the juice.

4. Next Day
Sam applied his invention to all the fruits he could find in the fruit basket.
Simple!!!

5. 18 Years Later
Sam got his first job with the juice-making giant, Joogle, for his talent in making juice.
Now, it's not just one basket but a whole container of fruits.
Also, he has to make juice of different fruits separately.
But Sam still has just ONE knife and ONE juicer!
Simple???
Is this representative of computations that process large-scale data?

6. Why?
The operations themselves are conceptually simple:
making juice, indexing, recommendations, etc.
But the data to process is HUGE!!!
Google processes over 20 PB of data every day.
Sequential execution just won't scale up.

7. Why?
Parallel execution achieves greater efficiency.
But parallel programming is hard:
Parallelization
Race Conditions
Debugging
Fault Tolerance
Data Distribution
Load Balancing

8. MapReduce
"MapReduce is a programming model and an associated implementation for processing and generating large data sets."
Programming model: abstractions to express simple computations.
Library: takes care of the gory stuff: parallelization, fault tolerance, data distribution and load balancing.

9. Programming Model
Goal: generate a set of output key-value pairs from a set of input key-value pairs:
{ <ki, vi> } → { <ko, vo> }
Expressed using two abstractions:
Map task: <ki, vi> → { <kint, vint> }
Reduce task: <kint, {vint}> → <ko, vo>
Library:
aggregates all intermediate values associated with the same intermediate key
passes the intermediate key-value pairs to the reduce function
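
As a minimal sketch of this contract in Python (illustrative only, not Google's actual library API; the helper name run_mapreduce is made up for this deck):

  from collections import defaultdict

  def run_mapreduce(records, map_fn, reduce_fn):
      # Map phase: each input <k, v> pair yields a list of intermediate pairs.
      intermediate = defaultdict(list)
      for k_in, v_in in records:
          for k_int, v_int in map_fn(k_in, v_in):
              intermediate[k_int].append(v_int)
      # Library step: group values by intermediate key, then apply reduce
      # to each <k_int, [v_int, ...]> group to produce an output pair.
      return {k_int: reduce_fn(k_int, values)
              for k_int, values in intermediate.items()}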

10. Inspiration
'map' & 'reduce/fold' of functional programming languages:
(map f list [list2 list3 ...])
(map square (1 2 3 4)) → (1 4 9 16)
(reduce f list [...])
(reduce + (1 4 9 16)) → 30
(reduce + (map square (map - l1 l2)))
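
For comparison, the same idea with Python's built-in map and functools.reduce (a small illustration, not part of the original slides):

  from functools import reduce

  squares = list(map(lambda x: x * x, [1, 2, 3, 4]))  # [1, 4, 9, 16]
  total = reduce(lambda a, b: a + b, squares)          # 30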

11. Example: MapReduce

12. Example: Word Count

  map(String input_key, String input_value):
    // input_key: document name
    // input_value: document contents
    for each word w in input_value:
      EmitIntermediate(w, "1");

  reduce(String output_key, Iterator intermediate_values):
    // output_key: a word
    // intermediate_values: a list of counts
    int result = 0;
    for each v in intermediate_values:
      result += ParseInt(v);
    Emit(AsString(result));

Map output: <"Sam", "1">, <"Apple", "1">, <"Sam", "1">, <"Mom", "1">, <"Sam", "1">, <"Mom", "1">
Grouped for reduce: <"Sam", ["1", "1", "1"]>, <"Apple", ["1"]>, <"Mom", ["1", "1"]>
Reduce output: "3", "1", "2"
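
A runnable Python version of the same word count, plugged into the illustrative run_mapreduce sketch from slide 9 above:

  def word_count_map(doc_name, contents):
      # emit <word, 1> for every word in the document
      return [(w, 1) for w in contents.split()]

  def word_count_reduce(word, counts):
      # sum the partial counts for one word
      return sum(counts)

  docs = [("d1", "Sam Apple Sam Mom"), ("d2", "Sam Mom")]
  print(run_mapreduce(docs, word_count_map, word_count_reduce))
  # {'Sam': 3, 'Apple': 1, 'Mom': 2}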

13. Implementation
Large cluster of commodity PCs:
connected together with switched Ethernet
x86 dual-processor machines with 2-4 GB memory each
Linux OS
storage by GFS on IDE disks
Scheduling system:
users submit jobs
tasks are scheduled to available machines on the cluster

14. Google File System (GFS)
A file is divided into several chunks of a predetermined size (typically 16-64 MB).
Each chunk is replicated by a predetermined factor (usually three replicas).
Replication achieves fault tolerance:
availability
reliability

15. GFS Architecture

16. Execution
User specifies:
M: number of map tasks
R: number of reduce tasks
Map phase:
input is partitioned into M splits
map tasks are distributed across multiple machines
Reduce phase:
reduce tasks are distributed across multiple machines
intermediate keys are partitioned (using the partitioning function) so each is processed by the desired reduce task

17. Execution Flow

18. Sam & MapReduce
Sam implemented a parallel version of his innovation.

19. Master Data Structures
For each task:
state: { idle, in-progress, completed }
identity of the worker machine
For each completed map task:
size and location of the intermediate data
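
One way to picture this bookkeeping is a small Python sketch (field and type names are illustrative, not taken from the paper):

  from dataclasses import dataclass, field
  from enum import Enum
  from typing import Optional

  class State(Enum):
      IDLE = 1
      IN_PROGRESS = 2
      COMPLETED = 3

  @dataclass
  class TaskInfo:
      state: State = State.IDLE
      worker: Optional[str] = None  # identity of the worker machine (non-idle tasks)
      # For completed map tasks only: location and size of each of the R
      # intermediate partitions written to the worker's local disk.
      intermediate: dict = field(default_factory=dict)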

20. Fault Tolerance
Worker failure: handled via re-execution
identified by no response to heartbeat messages
in-progress and completed map tasks are re-scheduled
workers executing reduce tasks are notified of the re-scheduling
completed reduce tasks are not re-scheduled
Master failure:
rare
can be recovered from checkpoints; otherwise, all tasks abort

21. Disk Locality
Leverages GFS:
map tasks are scheduled close to the data
on nodes that have the input data
if not, on nodes that are nearer to the input data (e.g., on the same network switch)
Conserves network bandwidth.

22. Task Granularity
Number of map tasks > number of worker nodes:
better load balancing
better recovery
But this increases the load on the Master:
more scheduling decisions
more state to be saved
M can be chosen with respect to the block size of the file system, to effectively leverage locality.
R is usually specified by users; each reduce task produces one output file.

23. Stragglers
Slow workers delay the completion time:
bad disks with soft errors
other tasks eating up resources
strange reasons, like the processor cache being disabled
Solution: start back-up tasks as the job nears completion.
Whichever copy completes first is used.

24. Refinement: Partitioning Function
Identifies the desired reduce task, given the intermediate key and R.
Default partitioning function: hash(key) mod R
Important to choose a well-balanced partitioning function;
if not, some reduce tasks may delay the completion time.
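
A sketch of the default partitioner and a custom one along the lines of the paper's URL-by-hostname example (Python's built-in hash stands in for a real, stable hash function here):

  from urllib.parse import urlparse

  def default_partition(key, R):
      # default partitioning function: hash(key) mod R
      return hash(key) % R

  def hostname_partition(url, R):
      # keep all URLs from the same host in the same reduce partition
      # (and therefore in the same output file) by hashing only the hostname
      return hash(urlparse(url).hostname) % R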

25. Refinement: Combiner Function
A mini-reduce phase before the intermediate data is sent to the reduce tasks:
significant repetition of intermediate keys is possible
merge the values of repeated intermediate keys before sending them to the reduce tasks
similar to the reduce function
Saves network bandwidth.
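
For word count, the combiner can simply perform the partial sum on the map worker; a small self-contained sketch:

  from collections import defaultdict

  def word_count_combine(word, local_counts):
      # partial sum on the map worker; the reduce task later sums the partial sums
      return sum(local_counts)

  # On one map worker: collapse repeated <word, 1> pairs before any network transfer.
  local = defaultdict(list)
  for w in "Sam Apple Sam Mom".split():
      local[w].append(1)
  combined = {w: word_count_combine(w, counts) for w, counts in local.items()}
  # {'Sam': 2, 'Apple': 1, 'Mom': 1} is sent to the reduce tasks instead of four pairs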

26. Refinement: Skipping Bad Records
Map/Reduce tasks may fail on certain records due to bugs.
Ideally, debug and fix; not possible if third-party code is buggy.
When a worker dies on a record, the Master is notified of that record.
If more than one worker dies on the same record, the Master re-schedules the task and asks it to skip the record.

27. Refinements: Others
Ordering guarantees: sorted output files
Temporary files
Local sequential execution: to debug and test
Status information: input, intermediate & output bytes processed so far; error & failure reports
Counters: to keep track of specific events

28. Performance
Evaluated on two programs, grep and sort, running on a large cluster and processing 1 TB of data.
Cluster configuration:
1800 machines
2 GHz Intel Xeon processors
4 GB memory
two 160 GB IDE disks
gigabit Ethernet link
hosted in the same facility

29. Grep
Scans for a three-character pattern.
M = 15,000 (at 64 MB per split), R = 1
The entire computation takes about 150 s.
Startup overhead of approximately 60 s:
propagation of the program to worker machines
GFS operations
gathering the information needed for the locality optimization

30. Sort
Models the TeraSort benchmark.
M = 15,000 (at 64 MB per split), R = 4,000, workers = 1,700
Evaluated on three executions:
with backup tasks
without backup tasks
with machine failures

31. Sort
Panels: normal execution, without backup tasks, with machine failures.

32. Experience: Google Indexing System
Complete rewrite using MapReduce.
Modeled as a sequence of multiple MapReduce operations.
Much simpler code:
fewer lines of code
shorter code changes
easier to improve performance

33. Conclusion
An easy-to-use, scalable programming model for large-scale data processing on clusters.
Allows users to focus on their computations.
Hides the issues of parallelization, fault tolerance, data partitioning & load balancing.
Achieves efficiency through disk locality.
Achieves fault tolerance through replication.

34. New Trend: Disk Locality Irrelevant
Disk locality assumes that disk bandwidth exceeds network bandwidth.
Network speeds are improving fast, while disk speeds have stagnated.
Next step: attain memory locality.

35. Hadoop
An open-source framework for data-intensive distributed applications.
Inspired by Google's MapReduce & GFS.
Implemented in Java.
Named after the toy elephant of the creator's son.

36. Hadoop History
Started with Nutch, an effort to develop an open-source search engine.
Nutch soon encountered scalability issues in storing and processing large data sets.
Google published MapReduce and GFS.
The Nutch developers attempted a re-creation for Nutch.
Yahoo! got interested and contributed.

37. Hadoop Ecosystem
Hadoop MapReduce: inspired by Google's MapReduce
Hadoop Distributed File System (HDFS): inspired by the Google File System
HBase (Hadoop database): inspired by Google BigTable
Cassandra (distributed key-value store): inspired by Amazon Dynamo & Google BigTable

38. References
Google File System: http://labs.google.com/papers/gfs.html
Google BigTable: http://labs.google.com/papers/bigtable.html
Apache Hadoop: http://hadoop.apache.org/
Juice example: http://esaliya.blogspot.com/2010/03/mapreduce-explained-simply-as-story-of.html

39. Thank you