
Slide1

Fault Tolerance in MPI

Minaashi Kalyanaraman

Pragya Upreti

CSS 534 Parallel Programming

Slide2

OVERVIEW

Fault Tolerance in MPI

Levels of survival in MPI

Approaches to fault tolerance in MPI

Advantages & disadvantages of implementing fault tolerance in MPI

Extending MPI to HARNESS

Why FT-MPI

Implementation

Comparison of MPI and FT-MPI

Performance consideration

Conclusion

Future scope

Slide3

MPI is not fault tolerant! Is that true?

This is a common misconception about MPI. MPI provides considerable flexibility in the handling of errors. FAULT TOLERANCE IS A PROPERTY OF THE MPI PROGRAM!

[Diagram: Job1 runs processes P1–P4 in MPI_COMM_WORLD. While all processes are healthy, MPI calls return MPI_SUCCESS. When process P2 dies, the default error handler MPI_ERRORS_ARE_FATAL causes the other processes to detect the error and abort.]

Slide4

Levels of Survival of an MPI Implementation

Approaches to achieve fault tolerance in MPI:

Level 1 – The MPI implementation automatically recovers from the failure and continues without significant change to its behavior. This is the highest level of survival and the most difficult to implement.

Level 2 – The MPI implementation is notified of the problem and is prepared to take corrective action. Example: using intercommunicators.

Level 3 – In case of failure, certain MPI operations, although not all, become invalid. Examples: modifying MPI semantics, extending MPI.

Level 4 – In case of failure, the MPI program can abort and be restarted from a checkpoint. Example: checkpointing.

In each case, the program state of the failed process is retained so that the overall computation can proceed.

Slide5

The MPI Standard and Fault Tolerance

Reliable communication:
- The MPI implementation is responsible for detecting and handling network faults.
- It can retransmit the message, or inform the application that an error has occurred and let the application take its own corrective action.

Error handlers:
- Error handlers are set on communicators with MPI_Comm_set_errhandler.
- The default is MPI_ERRORS_ARE_FATAL; it can be changed to MPI_ERRORS_RETURN.
- Users can define their own error handlers and attach them to communicators.
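As a minimal sketch of these two facilities, the following program switches MPI_COMM_WORLD from MPI_ERRORS_ARE_FATAL to MPI_ERRORS_RETURN and then inspects the code returned by a failing call; the deliberately out-of-range destination rank is only a device to provoke an error:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Replace the default MPI_ERRORS_ARE_FATAL handler so MPI
           calls return an error code instead of aborting the job. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        int size, value = 42;
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* 'size' is never a valid rank, so this send must fail. */
        int rc = MPI_Send(&value, 1, MPI_INT, size, 0, MPI_COMM_WORLD);
        if (rc != MPI_SUCCESS) {
            char msg[MPI_MAX_ERROR_STRING];
            int len;
            MPI_Error_string(rc, msg, &len);
            printf("MPI_Send failed: %s\n", msg);
        }

        MPI_Finalize();
        return 0;
    }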

Slide6

ERROR HANDLING – CONTINUED

In C++, MPI::ERRORS_THROW_EXCEPTIONS is defined for handling errors. If an error is returned, the standard does not require that subsequent operations succeed, or that they fail. The standard thus allows implementations to take various approaches to the fault-tolerance issue.

Slide7

Approach to Fault Tolerance in MPI Programs

1. Checkpointing:

A common technique that periodically saves the state of a computation, allowing the computation to be restarted from that point in the event of a failure.

The cost of checkpointing is determined by:
- The cost to create and write a checkpoint.
- The cost to read and restore a checkpoint.
- The probability of failure.
- The time between checkpoints.
- The total time to run without checkpoints.

Types of checkpointing:
- User-directed checkpointing.
- System-directed checkpointing.

Advantage & disadvantage:
- It is easy to implement.
- The cost of saving and restoring checkpoints must be kept relatively small.
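A minimal sketch of user-directed checkpointing, in which each rank periodically writes its state to its own file and restores it on restart; the file name, checkpoint interval, and loop body are illustrative assumptions, not part of the slides:

    #include <mpi.h>
    #include <stdio.h>

    #define CKPT_INTERVAL 100  /* iterations between checkpoints (assumed) */

    /* User-directed checkpoint: each rank saves its own state. */
    static void write_checkpoint(int rank, long iter, double state)
    {
        char name[64];
        snprintf(name, sizeof name, "ckpt.%d", rank);
        FILE *f = fopen(name, "wb");
        if (f == NULL) return;
        fwrite(&iter, sizeof iter, 1, f);
        fwrite(&state, sizeof state, 1, f);
        fclose(f);
    }

    /* Returns 1 if a checkpoint was found and restored. */
    static int read_checkpoint(int rank, long *iter, double *state)
    {
        char name[64];
        snprintf(name, sizeof name, "ckpt.%d", rank);
        FILE *f = fopen(name, "rb");
        if (f == NULL) return 0;
        int ok = fread(iter, sizeof *iter, 1, f) == 1 &&
                 fread(state, sizeof *state, 1, f) == 1;
        fclose(f);
        return ok;
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        long iter = 0;
        double state = 0.0;
        read_checkpoint(rank, &iter, &state);  /* resume if restarted */

        for (; iter < 1000; iter++) {
            state += 1.0;                      /* stand-in for real work */
            if (iter % CKPT_INTERVAL == 0)
                write_checkpoint(rank, iter, state);
        }

        MPI_Finalize();
        return 0;
    }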

Slide8

Approach to Fault Tolerance in MPI Programs

2. Using Intercommunicators:

An intercommunicator contains two groups of processes; all communication occurs between processes in one group and processes in the other group.

Example: manager-worker
- The manager process keeps track of a pool of tasks and dispatches them to worker processes for completion.
- Workers return results to the manager, simultaneously requesting a new task.

Advantages & disadvantage:
- The manager can easily recognize that a particular worker has failed and communicate this to the other processes.
- Each group can keep track of the state held by the other group.
- Difficult to implement in complex systems.
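A manager-side sketch of this pattern, using the intercommunicator returned by MPI_Comm_spawn; the "./worker" executable name, the worker count, and the task encoding are assumptions made for illustration:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Spawn 4 workers; 'workers' is an intercommunicator whose
           remote group holds the worker processes. */
        MPI_Comm workers;
        MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &workers, MPI_ERRCODES_IGNORE);

        /* Return error codes instead of aborting if a worker dies. */
        MPI_Comm_set_errhandler(workers, MPI_ERRORS_RETURN);

        /* Dispatch one task per worker; ranks in an intercommunicator
           refer to the remote group. */
        for (int task = 0; task < 4; task++) {
            int rc = MPI_Send(&task, 1, MPI_INT, task, 0, workers);
            if (rc != MPI_SUCCESS)
                fprintf(stderr, "worker %d appears to have failed\n", task);
        }

        /* Collect one result from each worker. */
        for (int w = 0; w < 4; w++) {
            int result;
            if (MPI_Recv(&result, 1, MPI_INT, w, 0, workers,
                         MPI_STATUS_IGNORE) == MPI_SUCCESS)
                printf("worker %d returned %d\n", w, result);
        }

        MPI_Comm_disconnect(&workers);
        MPI_Finalize();
        return 0;
    }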

Slide9

Approach to Fault Tolerance in MPI Programs

3. Modifying MPI Semantics:

Takes advantage of existing MPI objects that contain additional state, and of MPI functions defined in the standard.

Example: MPI guarantees that the number of processes in a communicator, and each process's rank within it, stay constant. A program can use this property:
- To decompose data according to a communicator's size.
- To calculate the data assigned to a process using its rank.

Advantage & disadvantage:
- Fault-tolerant programs can be written for a wider set of algorithms.
- Because this approach uses only the already existing semantics, it provides fewer fault-tolerance features than the other approaches.
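A sketch of the size/rank decomposition described above; the problem size N is an assumed placeholder:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int size, rank;
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Block decomposition of N elements: each process derives its
           share purely from (rank, size), which MPI guarantees are
           constant for the lifetime of the communicator. */
        const long N = 1000000;               /* assumed problem size */
        long chunk = N / size;
        long lo = rank * chunk;
        long hi = (rank == size - 1) ? N : lo + chunk;

        printf("rank %d of %d handles elements [%ld, %ld)\n",
               rank, size, lo, hi);

        MPI_Finalize();
        return 0;
    }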

Slide10

Approach to Fault Tolerance in MPI Programs

4. Extending MPI:

This approach was developed to address the difficulty of using MPI communicators when processes may fail. It is difficult to construct a communicator consisting of two individual processes; if the manager group has failed, it is even harder because of the collective semantics of communicator construction in MPI.

Slide11

Advantages of using MPI fault tolerance features

- It is simple and easy to use the existing error-handling features in MPI.
- Users can extend MPI_ERRORS_RETURN to define errors specific to their needs.
- Error handling is purely local: every process can have a different handler.
- The ability to attach error handlers to a communicator increases the modularity of MPI.
- MPI provides the ability to define one's own application-specific error handler, which is an important approach to fault tolerance.
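A sketch of an application-specific handler attached with MPI_Comm_create_errhandler; the handler body here only logs the error, and any real recovery logic is an application-level assumption:

    #include <mpi.h>
    #include <stdio.h>

    /* Application-specific error handler: log the error locally. */
    static void my_handler(MPI_Comm *comm, int *errcode, ...)
    {
        char msg[MPI_MAX_ERROR_STRING];
        int len, rank;
        MPI_Comm_rank(*comm, &rank);
        MPI_Error_string(*errcode, msg, &len);
        fprintf(stderr, "[rank %d] MPI error: %s\n", rank, msg);
        /* Application-specific recovery would go here. */
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Wrap the C function in an MPI_Errhandler object and attach
           it; errors on MPI_COMM_WORLD now invoke my_handler. */
        MPI_Errhandler handler;
        MPI_Comm_create_errhandler(my_handler, &handler);
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, handler);
        MPI_Errhandler_free(&handler);

        /* ... application code ... */

        MPI_Finalize();
        return 0;
    }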

Slide12

Limitations of Fault Tolerance in MPI

- The specification makes no demands on MPI to survive failures.
- The defined MPI error classes are used only to clarify to the user the source of the error.
- It is difficult for MPI to notify users of failures of a given function that happen after the function has already returned.
- There is no description of when error notification will happen relative to the occurrence of the error.
- It is not possible for one application process to ask to be informed of errors on other processes, or for the application to be informed of specific classes of errors.

Slide13

HARNESS / Fault-Tolerant MPI: an Extension to MPI

HARNESS (Heterogeneous Adaptive Reconfigurable Networked SyStem) is an experimental system that provides a highly dynamic, fault-tolerant computing environment for high-performance computing applications.

HARNESS is a joint DOE-funded project involving Oak Ridge National Laboratory (ORNL), the University of Tennessee at Knoxville (UTK/ICL), and Emory University in Atlanta, GA.

Slide14

HARNESS: an Extension to MPI

- Current MPI implementations either abort on failure or rely on checkpointing.
- Communication happens only via communicators.
- The MPI communicator is based on a static process model.

Slide15

Implementation

- FT-MPI (HARNESS) extends MPI.
- It allows applications to decide what to do when errors occur:
  - Restart the failed node.
  - Continue with a smaller number of nodes.
- When a member of a communicator fails:
  - The communicator state changes to indicate the problem.
  - Message transfers continue if safe, or are stopped or ignored.
  - The user application can fix or abort the communicator in order to continue.

Slide16

Comparison of FT-MPI and MPI: Communicator and Process States

Communicator states:
  FT-MPI: FT_OK, FT_DETECTED, FT_RECOVER, FT_RECOVERED, FT_FAILED
  MPI:    VALID, INVALID

Process states:
  FT-MPI: OK, UNAVAILABLE, JOINING, FAILED
  MPI:    OK, FAILED

Slide17

Implementation: Extending MPI

When running an FT-MPI application, two parameters specify the modes in which the application runs. The first parameter, the 'communicator mode', indicates the status of an MPI object after recovery; it can be specified when starting the application:

ABORT – Like MPI, FT-MPI can abort on an error.
BLANK – Failed processes are not replaced, leaving gaps in the communicator.
REBUILD – Failed processes are respawned, and surviving processes keep the same rank. This is the default mode.
SHRINK – Failed processes are not replaced; the communicator is compacted so there are no gaps in the list of processes.

Slide18

FT/MPI: Communication Modes

The second parameter is the 'communication mode'. There are two types of communication:

CONT / CONTINUE – All operations that returned the MPI_SUCCESS code will finish properly.
NOOP / RESET – All ongoing messages are dropped; an error returns the application to its last consistent state.

Slide19

FT/MPI: Communicator (COMM) Failure Handling

- The communicator is invalidated if a failure is detected.
- The underlying system then sends a state update to all processes of that communicator.
- System behavior depends on the communicator mode chosen.
- Communicators are not invalidated by communication errors, only by a process exit.

Slide20

FT/MPI Usage

Usage takes the form of an error check, followed by some corrective action such as a communicator rebuild. For example (simple FT-MPI send usage):

    rc = MPI_Send(..., com);
    if (rc == MPI_ERR_OTHER) {
        MPI_Comm_dup(com, &newcom);
        com = newcom;
    }

In an SPMD master-worker model, only the master code needs to check for errors if the user treats the master code as the only point of failure.

Slide21

Example: MPI Error Handling

Slide22

Example of Error Handling Using FT-MPI
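The original slide content is an image that did not survive extraction. A sketch consistent with the usage pattern on Slide 20 (check the return code, then rebuild the communicator with MPI_Comm_dup, assuming FT-MPI's REBUILD mode) might look like:

    #include <mpi.h>

    /* Send with one recovery attempt: if a peer has failed, FT-MPI
       reports MPI_ERR_OTHER; rebuilding the communicator via
       MPI_Comm_dup (REBUILD mode) lets the application continue. */
    int send_with_recovery(int *buf, int dest, MPI_Comm *com)
    {
        int rc = MPI_Send(buf, 1, MPI_INT, dest, 0, *com);
        if (rc == MPI_ERR_OTHER) {
            MPI_Comm newcom;
            MPI_Comm_dup(*com, &newcom);
            *com = newcom;
            rc = MPI_Send(buf, 1, MPI_INT, dest, 0, *com);
        }
        return rc;
    }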

Slide23

Performance Consideration

- The fault-free overhead of point-to-point communication in FT-MPI is negligible in long-running applications.
- Checkpointing increases communication overhead considerably; the user must therefore choose a correspondingly lower checkpoint frequency.

Slide24

Conclusions

FT-MPI is a tool that provides methods of dealing with failures within MPI applications.

FT-MPI is useful for experimenting with:
- Self-tuning collective communications.
- Distributed control algorithms.
- Dynamic library download methods.

Slide25

Future Scope

- Developing further implementations that support more restrictive environments (i.e., embedded clusters).
- Creating a number of drop-in library templates to simplify the construction of fault-tolerant applications.
- Achieving both high performance and survivability.

Slide26

References

- Fault Tolerance in MPI Programs: http://www.mcs.anl.gov/~lusk/papers/fault-tolerance.pdf
- LEGION: http://legion.virginia.edu/documentation/FAQ_mpi_run.html
- HARNESS: http://icl.cs.utk.edu/ftmpi/index.html
- MPI 3.0 Fault Tolerance Working Group: http://meetings.mpi-forum.org/mpi3.0_ft.php
- Graham E. Fagg, George Bosilca, Thara Angskun, Zhizhong Chen, Jelena Pjesivac-Grbovic, Kevin London, and Jack J. Dongarra, "Extending the MPI Specification for Process Fault Tolerance on High Performance Computing Systems".
- Graham E. Fagg, Antonin Bukovsky, and Jack J. Dongarra, "HARNESS and Fault Tolerant MPI", Parallel Computing 27, 2001, pp. 1479–1495.
- Graham E. Fagg and Jack J. Dongarra, "Building and Using a Fault-Tolerant MPI Implementation", The International Journal of High Performance Computing Applications, Volume 18, No. 3, Fall 2004, pp. 353–361.
- Graham E. Fagg and Jack J. Dongarra, FT-MPI presentation, conference proceedings.

Slide27

Q & A ?

Slide28

FAQs

1. MPI vs. TCP sockets:

Arguably, one of the biggest weaknesses of MPI is its lack of resilience: most (if not all) MPI implementations will kill an entire MPI job if any individual process dies. This is in contrast to the reliability of TCP sockets, for example: if a process on one side of a socket suddenly goes away, the peer just gets a stale socket.

2. Does MPI guarantee that a user-defined handler is invoked like MPI_ERRORS_RETURN?

The specification does not state whether an error that would cause MPI functions to return an error code under the MPI_ERRORS_RETURN error handler would cause a user-defined error handler to be called during the same MPI function, or at some earlier or later point in time.

3. Relation between checkpointing and I/O:

The practicality of checkpointing is related to the performance of parallel I/O, since checkpoint data is saved to a parallel file system.

Slide29

FAQs

4. Usability of HARNESS FT-MPI:

The fault tolerance that HARNESS provides depends on its implementation. The HARNESS team actively works on reported bugs and releases new versions.

5. Data recovery in MPI:

The MPI standard does not provide a way to recover data. That depends on the implementation of the MPI program.

6. Can fault tolerance in MPI be made transparent?

It is very difficult to make fault tolerance in MPI transparent, because of the complexity involved in communication between processes.

Slide30

Reference Slides

Slide31

Reference: Structure of FT-MPI

Slide32

Slide33

Derived datatype handling

Reduces memory copies while allowing overlapping of the three stages of data handling:
- Gather/scatter
- Encoding/decoding
- Send/receive of the packaged data

Slide34

Handling of compacted datatypes: only MPI_Send and MPI_Recv were used.

Slide35

Performance Consideration

Tests show that compacted data handling gives a 10% to 19% improvement. The benefit of buffer reuse and reordering of data elements leads to considerable improvements on heterogeneous networks.

Slide36

Slide37

Slide38

Slide39

Slide40

Slide41

