
Slide1

Networks and Distributed Snapshot

Sukumar Ghosh

Department of Computer Science

University of Iowa

Slide2

Contents

Part 1. The evolution of network topologies

Part 2. Distributed snapshot

Part 3. Tolerating failures

Slide3

Random Graphs

How a connected topology evolves in the real world

Erdös-Rényi graphs (ER graphs)

Power-law graphs

Small-world graphs

Slide4

Random graphs: the Erdös-Rényi model

The ER model is one of several models of random graphs. It presents a theory of how social webs are formed.

Start with a set of n isolated nodes. Connect each pair of nodes with a probability p.

The resulting graph is known as G(n, p).

Slide5

The Erdös-Rényi model

The G(n, p) model is different from the G(n, M) model.

The G(n, M) model randomly selects one graph from the entire family of graphs with n nodes and M edges.
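As an illustration (not part of the original slides), here is a minimal Python sketch of the G(n, p) construction just described; the function name er_graph and its parameters are illustrative choices, not notation from the deck.

```python
import random

def er_graph(n, p, seed=None):
    """Generate an Erdos-Renyi G(n, p) graph as an adjacency list."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:   # each pair is connected independently with probability p
                adj[u].add(v)
                adj[v].add(u)
    return adj

# The expected degree is roughly p * (n - 1)
g = er_graph(1000, 0.01, seed=42)
avg_deg = sum(len(nbrs) for nbrs in g.values()) / len(g)
print(avg_deg)  # close to 0.01 * 999, i.e. about 10
```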

Slide6

Properties of ER graphs

Property 1. The expected number of edges is p·n(n-1)/2.

Property 2. The expected degree per node is p·(n-1).

Property 3. The expected diameter of G(n, p) is ln n / ln(deg)
[deg = expected degree of a node].

Slide7

Diameter of a network

Let d(u, v) denote the length of the shortest path between a pair of nodes u and v. Over all such pairs of nodes, the largest value of d(u, v) is known as the diameter of the network.

Slide8

Degree distribution in random graphs

The probability that a node connects with a given set of k nodes (and not to the remaining n-1-k nodes) is p^k (1-p)^(n-1-k).

One can choose those k nodes out of the remaining n-1 nodes in C(n-1, k) ways.

So the degree distribution is P(k) = C(n-1, k) p^k (1-p)^(n-1-k) (a binomial distribution).

Slide9

Degree distribution in random graphs

N(k) = Number of nodes with degree k

Slide10

Properties of ER graphs

-- When np < 1, an ER graph is a collection of disjoint trees.

-- When np > 1, suddenly one giant (connected) component emerges. The other components have a much smaller size [phase change].

Slide11

Properties of ER graphs

When p > ln n / n, the graph is almost always connected.

These give "ideas" about how a network can evolve. But not all random topologies are ER graphs!

For example, social networks are often "clustered", but ER graphs have a poor (i.e. very low) clustering coefficient. (What is the clustering coefficient?)

Slide12

Clustering coefficient

For a given node, its local clustering coefficient (CC) measures what fraction of its pairs of neighbors are themselves neighbors of each other.

In the example graph on the slide, CC(B) = 3/6 = 1/2 and CC(D) = 2/3 = CC(E).

B's neighbors are {A, C, D, E}. Only the pairs (A, D), (D, E), (E, C) are connected.

The CC of a graph is the mean of the CCs of its nodes.
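A small sketch (not from the slides) of how the local and graph-level clustering coefficients could be computed from an adjacency list. The example graph below is a hypothetical reconstruction chosen to match the numbers quoted above (CC(B) = 1/2, CC(D) = CC(E) = 2/3); the function names are arbitrary.

```python
from itertools import combinations

def local_cc(adj, v):
    """Local clustering coefficient of v: the fraction of pairs of v's
    neighbors that are themselves connected."""
    nbrs = adj[v]
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(nbrs, 2))
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)

def graph_cc(adj):
    """Clustering coefficient of the graph: the mean of the local CCs."""
    return sum(local_cc(adj, v) for v in adj) / len(adj)

# A small illustrative undirected graph, as an adjacency dict
adj = {
    "A": {"B", "D"},
    "B": {"A", "C", "D", "E"},
    "C": {"B", "E"},
    "D": {"A", "B", "E"},
    "E": {"B", "C", "D"},
}
print(local_cc(adj, "B"))   # 3 of B's 6 neighbor pairs are connected: 0.5
print(graph_cc(adj))
```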

Slide13

The connectors

Malcolm Gladwell, a staff writer at the New Yorker magazine, describes in his book The Tipping Point a simple experiment to measure how social a person is.

He started with a list of 248 last names. A person scores a point if he or she knows someone with a last name from this list. If he/she knows three persons with the same last name, then he/she scores 3 points.

Slide14

The connectors

(Outcome of the Tipping Point experiment) Altogether 400 persons from different groups were tested. It was found that:

(min) 9, (max) 118 {from a random sample}

(min) 16, (max) 108 {from a highly homogeneous group}

(min) 2, (max) 95 {from a college class}

[Conclusion: Some people are very social, even in small or homogeneous samples. They are connectors.]

Slide15

Connectors

Barabási observed that connectors are not unique to human society: they appear in many complex networks, ranging from biology to computer science, where some nodes have an anomalously large number of links. This was not expected in ER graphs.

The World Wide Web, the ultimate forum of democracy, is not a random network, as Barabási's web-mapping project revealed.

Slide16

Anatomy of the World Wide Web

Barabási experimented with the Univ. of Notre Dame's web:

325,000 pages in total

270,000 pages (i.e. 82%) had three or fewer links

42 pages had 1000+ incoming links each

The entire WWW exhibited even more disparity. 90% of pages had ≤ 10 links, whereas a few (4-5) like Yahoo were referenced by close to a million pages! These are the hubs of the web. They help create short paths between nodes (mean distance = 19 for the WWW, obtained via extrapolation). (Some dispute this figure now.)

Slide17

Power-law graphs

The degree distribution of the web pages in the World Wide Web follows a power law. In a power-law graph, the number of nodes N(k) with degree k satisfies the condition N(k) ∝ k^(-γ) for some constant γ > 0.

Such graphs are also known as scale-free graphs. Other examples are:

-- Income and the number of people with that income

-- Magnitude and the number of earthquakes of that magnitude

-- Population and the number of cities with that population

Slide18

Random vs. Power-law Graphs

The degree distribution of the web pages in the World Wide Web follows a power law.

Slide19

Random vs. Power-Law networks

Slide20

Example: Airline Routes

Think of how new routes are added to an existing network.

Slide21

Preferential attachment

(Figure: a new node joining an existing network.)

A new node connects with an existing node with a probability proportional to its degree. (In the example figure, the sum of the node degrees = 8.)

Also known as the "rich gets richer" policy.

Slide22

Preferential attachment (continued)

Barabási and Albert showed that when large networks are formed via preferential attachment, the resulting graph exhibits a power-law distribution of the node degrees.
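A minimal sketch (not part of the slides) of growth by preferential attachment, the "rich gets richer" rule described above. Sampling in proportion to degree is done here by drawing from a list of edge endpoints; the function name and parameters are illustrative.

```python
import random

def preferential_attachment(n, m, seed=None):
    """Grow a graph by preferential attachment: each new node adds m links,
    choosing targets with probability proportional to their current degree."""
    rng = random.Random(seed)
    # start from a small clique of m + 1 nodes
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # each node appears in this list once per incident edge, so sampling from
    # it is degree-proportional sampling
    endpoints = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))   # rich get richer
        for t in targets:
            edges.append((new, t))
            endpoints.extend([new, t])
    return edges

edges = preferential_attachment(10000, 2, seed=1)
```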

Slide23

Other properties of power-law graphs

Graphs following a power-law degree distribution have a small diameter, much smaller than n (n = number of nodes). The clustering coefficient decreases as the node degree increases (also as a power law).

Graphs following a power-law distribution tend to be highly resilient to random edge removal, but quite vulnerable to targeted attacks on the hubs.

Slide24

The small-world model

Due to Watts and Strogatz (1998).

They followed up on Milgram's work (on six degrees of separation) and reasoned about why there is a small degree of separation between individuals in a social network.

The research was originally inspired by Watts's efforts to understand the synchronization of cricket chirps, which show a high degree of coordination over long ranges, as though the insects were being guided by an invisible conductor.

Disease spreads faster over a small-world network.

Slide25

Questions not answered by Milgram

Milgram's experiment tried to validate the theory of six degrees of separation between any two individuals on the planet.

Why six degrees of separation? Is there any scientific reason? What properties do these social graphs have? Are there other situations in which this model is applicable?

Time to reverse-engineer this.

Slide26

What are small-world graphs?

(Figure: a spectrum from a completely regular graph, through small-world graphs, to a completely random graph, with n >> k > ln(n) > 1.)

n = number of nodes, k = number of neighbors of each node

Slide27

Completely regular

A ring lattice, where each node is connected to its k nearest neighbors.

If k << n, then the diameter is too large!

Slide28

Completely random

The diameter is small, but the clustering coefficient is small too!

Slide29

Small-world graphs

Start with the regular graph, and with probability p rewire each link to a randomly selected node. This results in a graph that has a high clustering coefficient but a low diameter.
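A minimal sketch (not from the slides) of the Watts-Strogatz construction just described: build a ring lattice and rewire each link with probability p. The function name and parameters are illustrative.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Build a ring lattice where each node links to its k nearest neighbors
    (k even), then rewire each original edge with probability p."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for j in range(1, k // 2 + 1):        # connect to k/2 clockwise neighbors
            v = (u + j) % n
            adj[u].add(v)
            adj[v].add(u)
    for u in range(n):
        for v in list(adj[u]):
            if v > u and rng.random() < p:    # rewire each original edge at most once
                w = rng.randrange(n)
                if w != u and w not in adj[u]:
                    adj[u].discard(v); adj[v].discard(u)
                    adj[u].add(w); adj[w].add(u)
    return adj

g = watts_strogatz(1000, 10, 0.05, seed=7)
```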

Slide30

Small-world graphs

Small-world properties hold.

Slide31

Limitation of the Watts-Strogatz model

Jon Kleinberg argued that the Watts-Strogatz small-world model illustrates the existence of short paths between pairs of nodes, but it does not give any clue about how those short paths will be discovered. A greedy search for the destination will not discover these short paths.

Slide32

Kleinberg’s Small-World Model

Consider

an

grid

. Each node has

a link to every node at lattice distance

(short range neighbors) &

long range links. Choose long-range links at

lattice distance with a probability proportional to

r = 2

p

= 1,

q

= 2

n

nSlide33

Results

Theorem 1. There is a constant α0 (depending on p and q but independent of n), such that when r = 0, the expected delivery time of any decentralized algorithm is at least α0 · n^(2/3).

Slide34

More results

Theorem 2. There is a decentralized algorithm A and a constant α2, dependent on p and q but independent of n, such that when r = 2 and p = q = 1, the expected delivery time of A is at most α2 · (log n)^2.
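The theorems above are about decentralized routing, where each node forwards a message using only local information. Below is a small simulation sketch (not from the slides, and only a rough variant of Kleinberg's construction) of greedy routing on an n × n grid where every node has its four grid neighbors plus one long-range contact drawn with probability proportional to d^(-r); all names are illustrative.

```python
import random

def lattice_dist(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def build_long_links(n, r, rng):
    """For each grid node, pick one long-range contact with
    probability proportional to (lattice distance)^(-r)."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    long_link = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [lattice_dist(u, v) ** (-r) for v in others]
        long_link[u] = rng.choices(others, weights=weights, k=1)[0]
    return long_link

def greedy_route(n, src, dst, long_link):
    """Forward the message to the known contact closest to the destination."""
    hops, cur = 0, src
    while cur != dst:
        x, y = cur
        neighbors = [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= x + dx < n and 0 <= y + dy < n]
        neighbors.append(long_link[cur])
        cur = min(neighbors, key=lambda v: lattice_dist(v, dst))
        hops += 1
    return hops

rng = random.Random(0)
n = 20
links = build_long_links(n, r=2, rng=rng)
print(greedy_route(n, (0, 0), (n - 1, n - 1), links))
```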

Slide35

Variation of search time with r

(Figure: log T, the logarithm of the search time, plotted against the exponent r.)

Slide36

Distributed Snapshot

Slide37

Think about these

How many messages are in transit on the internet?

What is the total cash reserve in the Bank of America?

How many cars are on the streets of Kolkata now? How much pollutant is in the air (or water) right now?

What are most people in the US thinking about the election?

How do we compute these?

Slide38

UAV surveillance of traffic

Slide39

Importance of snapshots

Major uses in:

- data collection
- surveillance
- deadlock detection
- termination detection
- rollback recovery
- global predicate computation

Slide40

Importance of snapshots

A snapshot may consist of the internal states of the recording processes, or it may consist of the state of external shared objects updated by an updater process.

Slide41

Distributed Snapshot: First Case

Assume that the snapshot consists of the internal states of the recording processes.

The main issue is synchronization. An ad hoc combination of the local snapshots will not lead to a meaningful distributed snapshot.

Slide42

One-dollar bank

Let a $1 coin circulate in a network of a million banks. How can someone count the total $ in circulation? If not counted "properly," one may conclude that the total $ in circulation is one million.

Slide43

Review of Causal Ordering

Causality helps identify sequential and concurrent events in distributed systems, since clocks are not always reliable.

1. Local ordering: a → b → c (based on the local clock)

2. Message sent → message received [Thus joke → Re: joke]

3. If a → b and b → c then a → c

(→ denotes the "causally ordered before" or "happened before" relation)

Slide44

Consistent cut

A cut is a set of events. A cut C is consistent if, whenever an event e is in C, every event f with f → e is also in C.

If this is not true, then the cut C is inconsistent.
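A small sketch (not from the slides) of how cut consistency could be checked mechanically, assuming each event carries a vector clock (vector clocks appear later in the deck) so that f → e can be tested as VC(f) < VC(e); the event names and clock values are made up for illustration.

```python
def happened_before(vc_a, vc_b):
    """Vector-clock test: a -> b iff VC(a) <= VC(b) componentwise and VC(a) != VC(b)."""
    return all(x <= y for x, y in zip(vc_a, vc_b)) and vc_a != vc_b

def is_consistent_cut(cut, all_events, vc):
    """A cut (a set of event ids) is consistent if every event that happened
    before some event in the cut is itself in the cut."""
    for e in cut:
        for f in all_events - cut:
            if happened_before(vc[f], vc[e]):
                return False   # f -> e but f is outside the cut: inconsistent
    return True

# Two processes; vc[event] = (count at P0, count at P1); c receives a message sent at b
vc = {"a": (1, 0), "b": (2, 0), "c": (2, 1), "d": (2, 2)}
events = set(vc)
print(is_consistent_cut({"a", "b", "c"}, events, vc))  # True
print(is_consistent_cut({"a", "c"}, events, vc))       # False: b -> c but b is not in the cut
```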

Slide45

Consistent snapshot

The set of states immediately following the events (actions) in a consistent cut forms a consistent snapshot of a distributed system.

A snapshot of practical interest is the most recent one. Let C1 and C2 be two consistent cuts with C1 ⊆ C2. Then C2 is more recent than C1.

Analyze why certain cuts in the one-dollar bank are inconsistent.

Slide46

Consistent snapshot

How to record a consistent snapshot? Note that:

1. The recording must be non-invasive.

2. The recording must be done on-the-fly. You cannot stop the system.

Slide47

Chandy-Lamport Algorithm

It works when (1) the graph is strongly connected and (2) each channel is FIFO.

An initiator starts the algorithm by sending out a marker.

Slide48

White and red processes

Initially every process is white. When a process receives a marker, it turns red and remains red.

Every action by a process, and every message sent by a process, gets the color of that process. So:

white action = action by a white process
red action = action by a red process
white message = message sent by a white process
red message = message sent by a red process

Slide49

Two steps

Step 1. In one atomic action, the initiator (a) turns red, (b) records its own state, and (c) sends a marker along all outgoing channels.

Step 2. Every other process, upon receiving a marker for the first time (and before doing anything else), (a) turns red, (b) records its own state, and (c) sends markers along all outgoing channels.

The algorithm terminates when (1) every process has turned red, and (2) every process has received a marker through each incoming channel.
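A minimal single-threaded simulation sketch of the two steps above (not the authors' code). FIFO channels are modelled as deques, and messages received on a channel after recording but before that channel's marker are logged as the channel state; all class and variable names are illustrative.

```python
from collections import deque

class Process:
    def __init__(self, pid, peers, state=0):
        self.pid, self.peers, self.state = pid, peers, state
        self.recorded_state = None
        self.channel_log = {}        # incoming channel -> messages seen after recording
        self.marker_seen = set()     # incoming channels on which a marker has arrived

    def record(self, channels):
        """Turn red: record the local state and send a marker on every outgoing channel."""
        self.recorded_state = self.state
        for q in self.peers:
            channels[(self.pid, q)].append(("MARKER", None))
        self.channel_log = {(p, self.pid): [] for p in self.peers}

    def receive(self, src, msg, channels):
        kind, payload = msg
        if kind == "MARKER":
            if self.recorded_state is None:
                self.record(channels)            # first marker: record before anything else
            self.marker_seen.add((src, self.pid))
        else:
            self.state += payload                # an application message
            if self.recorded_state is not None and (src, self.pid) not in self.marker_seen:
                self.channel_log[(src, self.pid)].append(payload)   # in-transit at recording

# Three processes with FIFO channels (deques); each starts with state 10
pids = [0, 1, 2]
procs = {i: Process(i, [p for p in pids if p != i], state=10) for i in pids}
channels = {(p, q): deque() for p in pids for q in pids if p != q}

channels[(0, 1)].append(("APP", 5))       # an application message already in transit
procs[0].record(channels)                 # process 0 initiates the snapshot

# Deliver messages until all channels are empty (FIFO order per channel)
while any(channels.values()):
    for (p, q), ch in channels.items():
        if ch:
            procs[q].receive(p, ch.popleft(), channels)

# Recorded states sum to 35 = 10 + 10 + 10 + 5, so the snapshot is consistent
print({i: procs[i].recorded_state for i in pids})
print({k: v for p in procs.values() for k, v in p.channel_log.items() if v})
```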

Slide50

Why does it work?

Lemma 1. No red message is received in a white action.

Slide51

Why does it work?

Theorem. The global state recorded by the Chandy-Lamport algorithm is equivalent to the ideal snapshot state SSS.

Hint. A pair of actions (a, b) can be scheduled in any order if there is no causal order between them, so (a; b) is equivalent to (b; a).

(Easy conceptualization of the snapshot state SSS: all actions before it are white, all actions after it are red.)

Slide52

Why does it work?

Let an observer observe the following sequence of actions:

w[i] w[k] r[k] w[j] r[i] w[l] r[j] r[l] …
≡ w[i] w[k] w[j] r[k] r[i] w[l] r[j] r[l] …  [Lemma 1]
≡ w[i] w[k] w[j] r[k] w[l] r[i] r[j] r[l] …  [Lemma 1]
≡ w[i] w[k] w[j] w[l] r[k] r[i] r[j] r[l] …  [done!]

The recorded state lies at the boundary between the white actions and the red actions.

Slide53

Example 1: Count the tokens

Let us verify that the Chandy-Lamport snapshot algorithm correctly counts the tokens circulating in the system.

How to account for the channel states? Compute them using the sent and received variables of each process.

(Figure: processes A, B, C with "token" / "no token" labels, and two cuts marked 1 and 2.) Are these consistent cuts?

Slide54

Example 2: Communicating State Machines

Slide55

Something unusual

Let machine i start the Chandy-Lamport snapshot before it has sent M along ch1. Also, let machine j receive the marker after it sends out M' along ch2.

Observe that the recorded snapshot state is SSS = (down, ∅, up, M').

Doesn't this appear strange? This state was never reached during the computation!

Slide56

Understanding snapshot

Slide57

Understanding snapshot

The observed state is a feasible state that is reachable from the initial configuration. It may not actually be visited during a specific execution.

The final state of the original computation is always reachable from the observed state.

Slide58

Discussions

What good is a snapshot if that state has never been visited by the system?

- It is relevant for the detection of stable predicates.

- It is useful for checkpointing.

Slide59

Discussions

What if the channels are not FIFO? Study how the Lai-Yang algorithm works. It does not use any markers.

LY1. The initiator records its own state. When it needs to send a message m to another process, it sends the message (m, red).

LY2. When a process receives a message (m, red), it records its state if it has not already done so, and then accepts the message m.

Question 1. Why does it work?

Question 2. Are there any limitations of this approach?
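A minimal sketch of the LY1/LY2 rules as stated above (channel states are not accounted for here); the class layout and message format are illustrative assumptions.

```python
class LYProcess:
    """Sketch of the Lai-Yang rules LY1/LY2 described above:
    processes piggyback their color on every outgoing message."""
    def __init__(self, pid, state=0):
        self.pid, self.state = pid, state
        self.red = False
        self.recorded_state = None

    def record(self):
        self.red = True
        self.recorded_state = self.state    # LY1: record local state, turn red

    def send(self, payload):
        # every outgoing message carries the sender's current color
        return (payload, "red" if self.red else "white")

    def receive(self, message):
        payload, color = message
        if color == "red" and not self.red:
            self.record()                   # LY2: record before accepting a red message
        self.state += payload               # then accept the message

p, q = LYProcess(0, state=10), LYProcess(1, state=20)
p.record()                                  # p initiates the snapshot
q.receive(p.send(5))                        # q records (20) before accepting the message
print(p.recorded_state, q.recorded_state)   # 10 20
```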

Slide60

Food for thought

Distributed snapshot = distributed read. Distributed reset = distributed write.

How difficult is distributed reset?

Slide61

Distributed debugging

(Marzullo and Neiger, 1991)

(Figure: an observer receives each event e together with its vector clock VC(e) from the distributed system.)

Slide62

Distributed debugging

It uses vector clocks. S_ij is the global state after the i-th action by process 0 and the j-th action by process 1.

Slide63

Distributed debugging

Possibly φ: At least one consistent global state S is reachable from the initial global state such that φ(S) = true.

Definitely φ: All computations pass through some consistent global state S such that φ(S) = true.

Never φ: No computation passes through a consistent global state S such that φ(S) = true.

Definitely φ ⇒ Possibly φ

Slide64

Examples

φ = (x + y = 12), true at S_21: Possibly φ

φ = (x + y > 15), true at S_31: Definitely φ

φ = (x = y = 5), true at S_40 and S_22: Never φ  *Neither S_40 nor S_22 is a consistent state*

Slide65

Distributed Snapshot: Second case

The snapshot consists of the external observations of the recording processes -- distributed snapshots of shared external objects.

How many cars are on the streets now? How many trees have been downed by the storm?

Slide66

Distributed snapshot of shared objects

Slide67

The first algorithm

(Figure: a shared array with locations 0, 1, 2, ..., i.)

Slide68

Algorithm double collect

function read
    while true
        X[0..n-1] := collect;
        Y[0..n-1] := collect;
        if ∀i ∈ {0, .., n-1}: location i was not changed between the two collects
            then return Y
    end while
end

function update(i, v)
    M[i] := v
end
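A Python sketch of the double-collect idea (not from the slides). To detect that "location i was not changed between the two collects," each location here carries a sequence number, which is one common way to implement the check; all names are illustrative.

```python
import threading

class Snapshot:
    """Sketch of the double-collect algorithm above: each location stores
    (value, seqno); read returns the second collect if the two collects
    agree on every seqno, i.e. no location changed in between."""
    def __init__(self, n):
        self.M = [(0, 0) for _ in range(n)]   # (value, sequence number) per location
        self.lock = threading.Lock()          # makes each single-location update atomic

    def collect(self):
        return list(self.M)

    def update(self, i, v):
        with self.lock:
            _, seq = self.M[i]
            self.M[i] = (v, seq + 1)

    def read(self):
        while True:
            x = self.collect()
            y = self.collect()
            if all(xs == ys for (_, xs), (_, ys) in zip(x, y)):
                return [val for val, _ in y]  # a clean double collect: return the values

s = Snapshot(4)
s.update(2, 7)
print(s.read())   # [0, 0, 7, 0]
```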

Slide69

Limitations of double collect

The read may never terminate! Why? We need a better algorithm that guarantees termination.

Slide70

Coordinated snapshot

Engage multiple readers and ask them to record snapshots at the same time. This will work if the writer is sluggish and the clocks are accurately synchronized.

Slide71

Faulty recorder

Assume that there are n recorders. Each records a snapshot and shares it with the others, so that each can form a complete snapshot.

This is easy when all recorders record correctly and transmit the information reliably. But what if one or more recorders are faulty, or the communication is error-prone?

Slide72

Distributed Consensus

Consensus is very important for taking coordinated action. How can the recorders reach consensus in the presence of communication failure?

This reduces to the classic Byzantine Generals Problem.

Slide73

Byzantine Generals Problem

It describes and solves the consensus problem in the synchronous model of communication. The network topology is a completely connected graph.

Processes undergo Byzantine failures, the worst possible kind of failure. This shows the power of the adversary.

Slide74

Byzantine Generals Problem

n generals {0, 1, 2, ..., n-1} decide whether to "attack" or to "retreat" during a particular phase of a war. The goal is to agree upon the same plan of action.

Some generals may be "traitors" and therefore send either no input, or conflicting inputs, to prevent the "loyal" generals from reaching an agreement.

Devise a strategy by which every loyal general eventually agrees upon the same plan, regardless of the actions of the traitors.

Slide75

Byzantine Generals

(Figure: four generals 0-3 with inputs Attack = 1 and Retreat = 0; the collected vectors {1, 1, 0, 0} and {1, 1, 0, 1} differ because the traitor sends conflicting values.)

Every general broadcasts his/her judgment to everyone else. These are the inputs to the consensus protocol.

A traitor may send conflicting input values to different generals.

Slide76

Byzantine Generals

We need to devise a protocol so that every peer (call it a lieutenant) receives the same value from any given general (call it the commander). Clearly, the lieutenants will have to use secondary information.

Note that the roles of the commander and the lieutenants rotate among the generals.

Slide77

Interactive consistency specifications

IC1. Every loyal lieutenant receives the same order from the commander.

IC2. If the commander is loyal, then every loyal lieutenant receives the order that the commander sends.

(Figure: a commander issuing orders to its lieutenants.)

Slide78

The Communication Model

Oral Messages:

1. Messages are not corrupted in transit. (Why? If a message gets altered, blame the sender.)

2. Messages can be lost, but the absence of a message can be detected.

3. When a message is received (or its absence is detected), the receiver knows the identity of the sender (or the defaulter).

OM(m) represents an interactive consistency protocol in the presence of at most m traitors.

Slide79

An Impossibility Result

Using oral messages, no solution to the Byzantine Generals problem exists with three or fewer generals and one traitor.

Consider the two cases: in (a), to satisfy IC2, lieutenant 1 must trust the commander; but in (b), the same reasoning leads to a violation of IC1.

Slide80

Impossibility result (continued)

Using oral messages, no solution to the Byzantine Generals problem exists with 3m or fewer generals and m traitors (m > 0).

The proof is by contradiction. Assume that such a solution exists. Now divide the 3m generals into three groups of m generals each, such that all the traitors belong to one group. Let one general simulate each of these three groups. This scenario is equivalent to the case of three generals and one traitor. We already know that such a solution does not exist.

Slide81

The OM(m) algorithm

It is a recursive algorithm: OM(m) calls OM(m-1), which calls OM(m-2), ..., down to OM(0).

OM(m) = consensus algorithm with oral messages in the presence of up to m traitors.

OM(0) = direct broadcast.

Slide82

The OM(m) algorithm

1. Commander i sends out a value v (0 or 1).

2. If m > 0, then every lieutenant j ≠ i, after receiving v, acts as a commander and initiates OM(m-1) with everyone except i.

3. Every lieutenant collects (n-1) values: the (n-2) values received from the other lieutenants using OM(m-1), and one value received directly from the commander. Then he picks the majority of these values as the order from i.
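A compact simulation sketch of OM(m) as described above (not code from the slides). The traitor model, in which a traitorous commander sends alternating values, is an illustrative assumption; the function name and signature are made up.

```python
from collections import Counter

def om(commander, lieutenants, value, m, traitors):
    """Sketch of OM(m) with oral messages. Returns {lieutenant: decided value}.
    A traitorous commander sends alternating values to its lieutenants."""
    # Step 1: the commander sends a value to every lieutenant
    sent = {}
    for idx, j in enumerate(lieutenants):
        sent[j] = idx % 2 if commander in traitors else value
    if m == 0:
        return dict(sent)                # OM(0): accept the value directly

    # Step 2: each lieutenant relays its value to the others via OM(m-1)
    relayed = {j: {} for j in lieutenants}      # relayed[j][k] = value j heard about k's order
    for k in lieutenants:
        others = [j for j in lieutenants if j != k]
        sub = om(k, others, sent[k], m - 1, traitors)
        for j in others:
            relayed[j][k] = sub[j]

    # Step 3: each lieutenant takes the majority of the direct and relayed values
    decision = {}
    for j in lieutenants:
        values = [sent[j]] + [relayed[j][k] for k in lieutenants if k != j]
        decision[j] = Counter(values).most_common(1)[0][0]
    return decision

# n = 4 generals, m = 1 traitor (general 3): the loyal lieutenants 1 and 2 agree on 1
print(om(commander=0, lieutenants=[1, 2, 3], value=1, m=1, traitors={3}))
```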

Slide83

Example of OM(1)

Slide84

Example of OM(2)

(Figure: the recursion tree OM(2) → OM(1) → OM(0).)

Slide85

Proof of OM(m)

Lemma. Let the commander be loyal, and n > 2m + k, where m = maximum number of traitors. Then OM(k) satisfies IC2.

Slide86

Proof of OM(m)

Proof. If k = 0, then the result trivially holds. Let it hold for k = r (r > 0), i.e. OM(r) satisfies IC2. We have to show that it holds for k = r + 1 too.

By definition n > 2m + r + 1, so n - 1 > 2m + r. So OM(r) holds for the lieutenants in the bottom row.

Each loyal lieutenant collects n - m - 1 identical good values and m bad values. So the bad values are voted out (n - m - 1 > m + r implies n - m - 1 > m).

("OM(r) holds" means that each loyal lieutenant receives identical values from every loyal commander.)

Slide87

The final theorem

Theorem. If n > 3m, where m is the maximum number of traitors, then OM(m) satisfies both IC1 and IC2.

Proof. Consider two cases:

Case 1. The commander is loyal. The theorem follows from the previous lemma (substitute k = m).

Case 2. The commander is a traitor. We prove it by induction.

Base case: m = 0 is trivial.

(Induction hypothesis) Let the theorem hold for m = r.

(Inductive step) We have to show that it holds for m = r + 1 too.

Slide88

Proof (continued)

There are n > 3(r + 1) generals and r + 1 traitors. Excluding the commander, there are > 3r + 2 generals, of which r are traitors. So > 2r + 2 lieutenants are loyal. Since 3r + 2 > 3r, OM(r) satisfies IC1 and IC2.

(Figure: the > 2r + 2 loyal lieutenants and the r traitorous lieutenants.)

Slide89

Proof (continued)

In OM(r+1), a loyal lieutenant chooses the majority from:

(1) the > 2r + 1 values obtained from the loyal lieutenants via OM(r),

(2) the r values from the traitors, and

(3) the value received directly from the commander.

The set of values collected in parts (1) & (3) is the same for all loyal lieutenants -- it is the same set of values that these lieutenants received from the commander. Also, by the induction hypothesis, in part (2) each loyal lieutenant receives identical values from each traitor. So every loyal lieutenant eventually collects the same set of values.

Slide90

Conclusion

Distributed snapshots of shared objects can be tricky when the writer does not cooperate.

Approximate snapshots are useful for a rough view.

Failures add a new twist to the recording of snapshots.

Much work remains to be done for the upper layers of snapshot integration. (What can you make out from a trail of Twitter data with not much correlation?)