Introduction of Apache - PowerPoint Presentation

conchita-marotz
Uploaded On 2018-02-27

Presentation Transcript

Slide1

Introduction of Apache Hama

Edward J. Yoon, October 11, 2011

<edwardyoon@apache.org>

Slide2

About Me

Founder of Apache Hama.

Committer of Apache Bigtop.

Employee of KT.

http://twitter.com/eddieyoon

Slide3

What Is Hama?

Apache Incubator project.

BSP (Bulk Synchronous Parallel) for massive scientific computations.

Written in Java.

Currently 2 releases, 3 main committers.

Slide4

Hama Characteristics

Provides a pure BSP model.

Job submission and management interface.

Multiple tasks per node.

Checkpoint recovery.

Supports running in the cloud using Apache Whirr.

Supports running on Hadoop NextGen.

Slide5

Bulk Synchronous Parallel?

Parallel programming model introduced by Valiant.

Consists of a sequence of supersteps.

Conceptually simple and intuitive from a programming standpoint.

Used for a variety of applications, e.g., scientific computing, genetic programming, …

Slide6

Schematic diagram of a superstep

Figure labels: Local Computation, Communication, Barrier Synchronization; waiting processes are marked Idle.

Slide7
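
The three phases named in the Slide6 schematic (local computation, communication, barrier synchronization) can be illustrated with a small thread-based sketch. This is not Hama's API: the function `run_bsp`, the ring topology, and the toy "add what you received" computation are all assumptions made for illustration, and Python's `threading.Barrier` stands in for a distributed barrier.

```python
import threading

def run_bsp(num_tasks, num_supersteps):
    """Toy BSP job on a ring (hypothetical example, not Hama code)."""
    barrier = threading.Barrier(num_tasks)
    lock = threading.Lock()
    # Two mailbox sets: one is read in the current superstep, the other is
    # filled for the next one, so messages only become visible after the barrier.
    mailboxes = [[[] for _ in range(num_tasks)] for _ in range(2)]
    values = list(range(num_tasks))  # initial local state: task i holds i

    def task(me):
        for step in range(num_supersteps):
            inbox = mailboxes[step % 2][me]
            outgoing = mailboxes[(step + 1) % 2]
            # 1. Local computation on messages delivered by the previous superstep.
            values[me] += sum(inbox)
            inbox.clear()
            # 2. Communication: send the local value to the right-hand neighbour.
            with lock:
                outgoing[(me + 1) % num_tasks].append(values[me])
            # 3. Barrier synchronization: nobody proceeds until all tasks arrive.
            barrier.wait()

    threads = [threading.Thread(target=task, args=(i,)) for i in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return values
```

Calling `run_bsp(4, 3)` returns `[8, 4, 4, 8]`. Because a message sent in one superstep is only read in the next, the result does not depend on how the threads are scheduled.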

Internals

Hadoop RPC is used for BSP tasks to communicate with each other.

Messages are collected and bundled to reduce network overhead and contention.

ZooKeeper is used for barrier synchronization.

Slide8
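
The message bundling mentioned under Internals can be sketched as follows. The idea is to queue outgoing messages during a superstep and issue one send per destination instead of one per message. The names `bundle_messages`, `flush`, and the `send` callback are hypothetical; this is not how Hama's RPC layer is actually structured.

```python
from collections import defaultdict

def bundle_messages(outgoing):
    """Group (destination, payload) pairs so each peer gets a single bundle."""
    bundles = defaultdict(list)
    for dest, payload in outgoing:
        bundles[dest].append(payload)
    return dict(bundles)

def flush(outgoing, send):
    """Send one bundle per destination; return the number of sends issued.

    `send` is a hypothetical transport callback taking (destination, payloads).
    """
    bundles = bundle_messages(outgoing)
    for dest, payloads in bundles.items():
        send(dest, payloads)
    return len(bundles)
```

With three queued messages for two peers, `flush` issues two sends rather than three. The saving grows with the number of messages per peer in a superstep.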

Pi Calculation

Each task locally executes its portion of the loop a number of times.

One task acts as master and collects the results through the BSP communication interface.

Slide9
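
A minimal sketch of the Pi calculation pattern described on Slide8: the loop is split across tasks, and a master sums the partial results. The slides do not say which formula is used. This sketch assumes the midpoint rule for the integral of 4 / (1 + x^2) over [0, 1], and it models the BSP communication interface with a plain list (`master_inbox`). All names are hypothetical.

```python
import math

def local_portion(task_id, num_tasks, intervals):
    """Sum this task's share of the midpoint-rule terms for 4 / (1 + x^2)."""
    width = 1.0 / intervals
    total = 0.0
    # Task k handles iterations k, k + num_tasks, k + 2 * num_tasks, ...
    for i in range(task_id, intervals, num_tasks):
        x = (i + 0.5) * width
        total += 4.0 / (1.0 + x * x)
    return total * width

def estimate_pi(num_tasks=4, intervals=100_000):
    # Superstep 1: every task computes locally and "sends" its partial sum
    # to the master (modelled here as appending to the master's inbox).
    master_inbox = [local_portion(t, num_tasks, intervals) for t in range(num_tasks)]
    # After the barrier, the master adds up the partial results it received.
    return math.fsum(master_inbox)
```

The interleaved split keeps each task's share of the iterations nearly equal, so no task waits long at the barrier.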

Structural Analysis of Network Traffic Flows

Traffic flows in KT clouds.

Traffic engineering, anomaly detection, traffic forecasting, and capacity planning.

Currently, BSP jobs are running experimentally on 512 multi-core machines.

Slide10

Random Communication Benchmarks

Benchmarked on 16 1U servers using 10 tasks per server.

X axis is the time (sec.) of BSP job execution (32 supersteps).

Y axis is the number of messages to be sent to random BSP tasks in each superstep.

Slide11

What’s Next?

Support Input/Output Formatter like MapReduce.

Message compression for high performance.

Add some frameworks on top of Hama.

Slide12

More Information

http://incubator.apache.org/hama

http://wiki.apache.org/hama