Uncoordinated Checkpointing - PowerPoint Presentation

olivia-moreira
Uploaded On 2016-08-03

Presentation Transcript

Slide1

Uncoordinated Checkpointing

The Global State Recording Algorithm

Slide2

CS 5204 – Operating Systems

The Model

Node properties

No shared memory

No global clock

[Figure: nodes connected by channels]

Channel properties:

FIFO

loss free

non-duplicating

Slide3

The Problem

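[Figure: three successive global states of two nodes connected by channels C1 and C2]

State 1: nodes hold $500 and $200; C1: empty; C2: empty

State 2: nodes hold $450 and $200; C1: transfer $50; C2: empty

State 3: nodes hold $450 and $250; C1: empty; C2: empty

Slide4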

Motivation for recording a “consistent” state of the global computation:

checkpointing for fault tolerance (rollback, recovery)

testing and debugging

monitoring and auditing

Method: detecting stable properties in a distributed system via snapshots. A property is “stable” if, once it holds in a state, it holds in all subsequent states.

termination

deadlock

garbage collection

Distributed Snapshot (Global State Recording)

Slide5

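Local State and Actions:

local state: \( LS_i \)

message send: \( \mathrm{send}(m_{ij}) \)

message receive: \( \mathrm{rec}(m_{ij}) \)

time: \( \mathrm{time}(x) \)

\( \mathrm{send}(m_{ij}) \in LS_i \) iff \( \mathrm{time}(\mathrm{send}(m_{ij})) < \mathrm{time}(LS_i) \)

\( \mathrm{rec}(m_{ij}) \in LS_j \) iff \( \mathrm{time}(\mathrm{rec}(m_{ij})) < \mathrm{time}(LS_j) \)

Predicates:

\[ \mathrm{transit}(LS_i, LS_j) = \{\, m_{ij} \mid \mathrm{send}(m_{ij}) \in LS_i \;\wedge\; \neg(\mathrm{rec}(m_{ij}) \in LS_j) \,\} \]

\[ \mathrm{inconsistent}(LS_i, LS_j) = \{\, m_{ij} \mid \neg(\mathrm{send}(m_{ij}) \in LS_i) \;\wedge\; \mathrm{rec}(m_{ij}) \in LS_j \,\} \]

Consistent Global State:

\[ \forall i, \forall j : 1 \le i, j \le n :: \mathrm{inconsistent}(LS_i, LS_j) = \emptyset \]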
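The predicates above can be tried out on the two-node transfer from "The Problem". The following is a minimal Python sketch, not part of the original slides: the `Message` and `LocalState` types and the function names are hypothetical, and each local state is modelled simply as the set of send and receive events that precede it.

```python
from typing import FrozenSet, List, NamedTuple


class Message(NamedTuple):
    """A message m_ij from node src (i) to node dst (j). Hypothetical type."""
    src: int
    dst: int
    label: str


class LocalState(NamedTuple):
    """A recorded local state LS_i: the events that happened before it."""
    node: int
    sent: FrozenSet[Message]      # messages m with send(m) in LS_i
    received: FrozenSet[Message]  # messages m with rec(m) in LS_i


def transit(ls_i: LocalState, ls_j: LocalState) -> set:
    """Messages from i to j that are sent in LS_i but not received in LS_j."""
    return {m for m in ls_i.sent
            if m.dst == ls_j.node and m not in ls_j.received}


def inconsistent(ls_i: LocalState, ls_j: LocalState) -> set:
    """Messages from i to j that are received in LS_j but not sent in LS_i."""
    return {m for m in ls_j.received
            if m.src == ls_i.node and m not in ls_i.sent}


def is_consistent(states: List[LocalState]) -> bool:
    """Consistent iff inconsistent(LS_i, LS_j) is empty for every pair i, j."""
    return all(not inconsistent(a, b) for a in states for b in states)


# The $50 transfer from node 1 to node 2.
m = Message(src=1, dst=2, label="transfer $50")
none = frozenset()
before_send = LocalState(1, sent=none, received=none)            # node 1 holds $500
after_send = LocalState(1, sent=frozenset({m}), received=none)   # node 1 holds $450
before_rec = LocalState(2, sent=none, received=none)             # node 2 holds $200
after_rec = LocalState(2, sent=none, received=frozenset({m}))    # node 2 holds $250

# Recording node 1 after the send and node 2 before the receive is consistent:
# the $50 shows up in transit(LS_1, LS_2).
ok = is_consistent([after_send, before_rec])
# Recording node 1 before the send and node 2 after the receive is not:
# the receive is recorded without the matching send.
bad = is_consistent([before_send, after_rec])
```

In the inconsistent combination the recorded balances would add up to $750 instead of $700, which is the kind of error the consistency condition rules out.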

Definitions

Slide6

Marker-Sending Rule for a Process p:

for (each channel c, incident on, and directed away from p)

{ p sends one marker along c after p records its state and before p sends further messages along c; }

Marker-Receiving Rule for a Process q (on receiving a marker along a channel c):

if (q has not recorded its state) then { q records its state;

q records the state of c as the empty sequence; }

else { q records the state of c as the sequence of messages received along c after q's state was recorded and before q received the marker along c. }
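The two rules can be sketched as a small single-threaded simulation. This is an illustrative Python sketch under stated assumptions, not the course's implementation: the `Process` class, the `connect` helper, and the `MARKER` sentinel are hypothetical names, FIFO channels are modelled as `collections.deque` objects, and the caller decides when each process sends or receives.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel standing in for the marker message


class Process:
    def __init__(self, name, state):
        self.name = name
        self.state = state
        self.out = {}             # destination name -> outgoing FIFO channel
        self.inc = {}             # source name -> incoming FIFO channel
        self.has_recorded = False
        self.recorded_state = None
        self.channel_state = {}   # source name -> recorded channel contents
        self._pending = {}        # source name -> messages seen since recording

    def record(self):
        """Record the local state, then apply the marker-sending rule."""
        self.has_recorded = True
        self.recorded_state = self.state
        # Start logging every incoming channel until its marker arrives.
        self._pending = {src: [] for src in self.inc}
        # One marker on each outgoing channel, before any further message.
        for channel in self.out.values():
            channel.append(MARKER)

    def send(self, dst, msg):
        self.out[dst].append(msg)

    def receive(self, src):
        """Take the next item from the channel coming from src."""
        msg = self.inc[src].popleft()
        if msg == MARKER:
            if not self.has_recorded:
                # First marker: record own state; channel c is empty.
                self.record()
                self._pending.pop(src)
                self.channel_state[src] = []
            else:
                # Channel c holds what arrived after recording, before the marker.
                self.channel_state[src] = self._pending.pop(src)
        elif src in self._pending:
            self._pending[src].append(msg)
        return msg


def connect(a, b):
    """Create a one-way FIFO channel from process a to process b."""
    channel = deque()
    a.out[b.name] = channel
    b.inc[a.name] = channel


# Replay of a two-process run: p starts in state A, q in state C.
p, q = Process("p", "A"), Process("q", "C")
connect(p, q)
connect(q, p)
p.record()          # p records A and sends a marker to q
q.state = "D"       # q changes state and sends D before seeing the marker
q.send("p", "D")
q.receive("p")      # q gets the marker: records D, channel from p is empty
p.receive("q")      # p gets D after having recorded its state
p.state = "B"
p.receive("q")      # p gets q's marker: channel from q is recorded as [D]
```

After this run, `p.recorded_state` is `"A"`, `q.recorded_state` is `"D"`, the channel from p to q is recorded as empty, and the channel from q to p is recorded as `["D"]`. Calling `record()` sends all markers at once, which is one simple way to guarantee that no other message slips in between recording and marker sending.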

Global-State-Recording Algorithm

Slide7

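[Figure: two processes p and q joined by one channel in each direction, shown in four successive global states S0 to S3, followed by the recorded state]

S0: p in state A; q in state C; channel p→q: empty; channel q→p: empty. p records its state (A) and sends marker M on the channel to q.

S1: p in state A; q in state C; channel p→q: M; channel q→p: empty. Before receiving the marker, q changes its state and sends message D.

S2: p in state A; q in state D; channel p→q: M; channel q→p: D. q receives the marker and records its state (D) and the incoming channel as empty; q sends marker M' on its outgoing channel.

S3: p in state B; q in state D; channel p→q: empty; channel q→p: M'. On receiving the marker, p records the channel as having message D.

Recorded state: p in state A; q in state D; channel p→q: empty; channel q→p: D.

Slide8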

Snapshot/State Recording Example

M = Marker

[Figure: nodes p, q, and r, each holding 500, connected by channels c1, c2, c3, and c4]

Slide9

Snapshot/State Recording Example (Step 1)

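[Figure: nodes p, q, and r connected by channels c1 to c4. Balances shown: 490, 470, 500. On the channels: one marker M and the amounts 10, 20, and 10.]

Slide10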

Snapshot/State Recording Example (Step 2)

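[Figure: nodes p, q, and r connected by channels c1 to c4. Balances shown: 490, 480, 475. On the channels: two markers M and the amounts 20, 10, and 25.]

Slide11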

Snapshot/State Recording Example (Step 3)

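[Figure: nodes p, q, and r connected by channels c1 to c4. Balances shown: 470, 480, 485. On the channels: two markers M and the amounts 20, 20, and 25.]

Slide12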

Snapshot/State Recording Example (Step 4)

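[Figure: nodes p, q, and r connected by channels c1 to c4. Balances shown: 490, 500, 485. On the channels: one marker M and the amount 25.]

Slide13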

Snapshot/State Recording Example (Step 5)

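[Figure: nodes p, q, and r connected by channels c1 to c4. Balances shown: 485, 515, 500. No markers or amounts remain on the channels.]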
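One way to sanity-check the example is to add up the money in each figure. The short Python sketch below is an addition to the slides and rests on an interpretation: the bare numbers drawn on the channels are read as amounts in transit, and no assumption is made about which balance belongs to which node. The `steps` table and its labels are hypothetical names.

```python
# (node balances, amounts in transit) as read off each example figure.
steps = {
    "initial": ([500, 500, 500], []),
    "step 1": ([490, 470, 500], [10, 20, 10]),
    "step 2": ([490, 480, 475], [20, 10, 25]),
    "step 3": ([470, 480, 485], [20, 20, 25]),
    "step 4": ([490, 500, 485], [25]),
    "step 5": ([485, 515, 500], []),
}

# Total money in the system at each step: balances plus channel contents.
totals = {name: sum(balances) + sum(in_transit)
          for name, (balances, in_transit) in steps.items()}

for name, total in totals.items():
    print(f"{name}: {total}")
```

Under that reading every step totals 1500, so a consistent snapshot of this system would be expected to account for the same amount once recorded node states and recorded channel contents are summed.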