
Bob-the-Builder vs. Fix-it-Felix - PowerPoint Presentation

Uploaded On 2016-05-06



Presentation Transcript

Slide1

Bob-the-Builder vs. Fix-it-Felix

Maintaining Overlays in Dynamic Graphs

Seth Gilbert

Slide2

What is an overlay?

Overlay Networks

Given a collection of (changing) servers:

Choose a good subset of the edges.

Slide3

What is an overlay?

Overlay Networks

Given a collection of (changing) servers:

Choose a good subset of the edges.

Subgraph has low degree.

Subgraph has low diameter.

Slide4

What is an overlay?

Overlay Networks

Given a collection of (changing) servers:

Choose a good subset of the edges.

Subgraph has low degree.

Subgraph has low diameter.

Maintain edges as servers come and go.

Can we build it?

Yes we can!

We can fix it!

Slide5

Bitcoin is broken

(So why are you wasting your time here when you could be busy hacking bitcoins?)

Slide6

What goes wrong?

Bitcoin is a peer-to-peer overlay network.

Overlay is used for all communication.

Overlay is assumed to be reliable.

Slide7

What goes wrong?

Bitcoin is a peer-to-peer overlay network.

Overlay is used for all communication.

Overlay is assumed to be reliable.

Nodes arrive and leave all the time.

Malicious/greedy users can partition the network.

Slide8

What goes wrong?

Basic overlay maintenance idea:

Connect to a few arbitrary neighbors.

Maintain a fixed number of neighbors.

Accept incoming connection requests.

Details: “Eclipse Attacks on Bitcoin’s Peer-to-Peer Network” by Heilman, Kendler, Zohar, Goldberg

Bitcoin is a peer-to-peer overlay network.

Overlay is used for all communication.

Overlay is assumed to be reliable.

Nodes arrive and leave all the time.

Malicious/greedy users can partition the network.

Slide9

Bitcoin

Why is this a hard problem?

Rapid and significant churn

Users constantly arriving and departing.

Network changes continuously.

No stable state.

Malicious / greedy users

Some users may not follow the protocol.

Or all users may be greedy!

Long-lived system

Slide10

I am not going to solve the Bitcoin problem today.

Slide11

A few existing overlay networks…

Chord, Kademlia, Pastry, Tapestry, Skip+, HSkip+, Patricia Tries, DeX, XHeal, Forgiving Tree, PGrid, Skipnet, RN Protocol, dHamiltonianCycles, Avatar, Ca-Re-Chord, HyperCubes, HyperRing, Chameleon, Re-Chord, Tiara, Corona

Slide12

A Workshop Talk

Today’s Goals

What do we know?

Discussion of existing approaches.

Is the problem solved?

Are the existing solutions practical?

Ongoing work

Some ideas that we are thinking about.

Challenge for us to work on

To work on after the workshop is over.

Collaborators: Gopal Pandurangan, Peter Robinson, and Amitabh Trehan

Slide13

Overlay Networks

Slide14

Overlay Networks

Ground Rules

Underlying network:

Collection of nodes.

Nodes arrive (join).

Joining node is connected to someone.

Nodes leave (fail).

Precise model TBA.

Slide15

Overlay Networks

Ground Rules

Communication:

Every node has an address.

Every node has an address book.

A node can send a message to every node in the address book.

A node can send an address to another node.
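The communication rules above can be sketched as a tiny in-memory model. This is only an illustration under stated assumptions: the `Node` class, the `network` dictionary, and the extra node name "Ann" are hypothetical and not part of the talk.

```python
class Node:
    """A node that may only message addresses in its own address book."""

    def __init__(self, address):
        self.address = address
        self.address_book = set()   # addresses this node knows
        self.inbox = []             # (sender, payload) pairs received

    def send(self, network, to, payload):
        # Sending is allowed only to addresses in the address book.
        if to not in self.address_book:
            raise ValueError(f"{self.address} does not know {to}")
        network[to].receive(self.address, payload)

    def introduce(self, network, to, other):
        # Send the address `other` to `to`; the sender must know both.
        if other not in self.address_book:
            raise ValueError(f"{self.address} does not know {other}")
        self.send(network, to, ("address", other))

    def receive(self, sender, payload):
        self.inbox.append((sender, payload))
        # A received address is added to the receiver's address book.
        if isinstance(payload, tuple) and payload[0] == "address":
            self.address_book.add(payload[1])


network = {name: Node(name) for name in ("Joe", "Sue", "Ann")}
network["Joe"].address_book.update({"Sue", "Ann"})
network["Joe"].introduce(network, "Ann", "Sue")   # Ann learns < Sue >
```

After the call, Ann can message Sue directly, while Sue still cannot message Ann because nobody has sent her that address.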

[Figure: nodes Joe and Sue; the address < Sue > is sent in a message.]

Slide16

Overlay Networks

Goals

Overlay network:

Low degree: constant or logarithmic.

Low diameter: logarithmic or polylogarithmic.

Note: every existing solution guarantees these properties.

Slide17

Overlay Networks

Goals

Routable:

There exist short paths…

… and we can find short paths.

Note: random graphs may not be good!

Slide18

Overlay Networks

Goals

Other properties:

Good expansion

Good conductance

Random walks converge quickly

Diameter is small

Slide19

Overlay Networks

Issues

Churn:

Nodes arrive.

Nodes leave.

Note: Not considering Byzantine/malicious failures today!

Slide20

Overlay Networks

Issues

Synchrony vs. Asynchrony:

Synchronous: computation proceeds in rounds.

Asynchronous: arbitrary message/computation delays

Today: assume computation proceeds in rounds.

Slide21

Overlay Networks

Issues

Oblivious vs. Adaptive scheduler/adversary:

Oblivious: schedule/arrivals/failures fixed in advance.

Adaptive: schedule/arrivals/failures depends on execution.

Questions:

Do actions of the algorithm correlate with crashes?

Is there an attacker using knowledge of the system?

Attacker can crash critical nodes?

Sending many messages can overload a link?

Slide22

Overlay Networks

Metrics

Rate of churn:

How fast can nodes join and leave?

What is the maximum rate that an algorithm can tolerate?

Slide23

Overlay Networks

Metrics

Rate of recovery:

What happens when something goes wrong?

How fast does the algorithm reconstruct a good overlay?

Slide24

Overlay Networks

Metrics

Costs:

Message complexity: how many messages per round?

Communication complexity: how many bits of communication?

Quiescent complexity: what happens if changes stop?

Adaptive complexity: how do costs relate to the changes?

Slide25

Overlay Networks

Overlay Networks

Given a collection of (changing) servers:

Choose a good subset of the edges.

Subgraph has low degree.

Subgraph has low diameter.

Maintain edges as servers come and go.

Slide26

Overlays

A Play in Three Acts

Act I : Fix-it-Felix and the Half-Life-Hobgoblin

Wherein we meet our hero Felix, as he races to keep up with the Half-Life-Hobgoblin, fixing the overlay as fast as it is being destroyed.

Act II : Bob-the-Builder and the Destabilizing-Demon

Wherein we meet our hero Bob, as he counters the Destabilizing-Demon, rebuilding the overlay no matter how badly damaged.

Act III : Here Be Dragons

Wherein we explore the dangerous and unknown path forward and away from the hobgoblins and demons of our despair.

Slide27

Introductions

Fix-it-Felix

Basic Statistics

Home: Niceland

Role: Good Guy

Antagonist: Ralph

Tool: magic hammer

Powers: fixing anything

I can fix it!

Slide28

Fix-it-Felix Approach

How to cope with changes?

Fix it:

On insert: fix it!

On departure: fix it!

On any change: fix it!

Basic philosophy:

Fix changes as soon as they occur.

Fix it right then and there.

Fixes are local and immediate.

I can fix it!

Slide29

Fix-it-Felix Approach

How to cope with changes?

Approach:

Start with good overlay.

Bound rate of churn.

Fix the overlay faster than the churn destroys it.

Maintain good properties throughout.

I can fix it!

Slide30

Measuring Churn

Metric: per-period-churn

In every round:

At most k nodes join every r rounds.

At most k nodes leave every r rounds.

Ex: k=1, r=log n: at most 1 node joins / leaves every log(n) rounds.

[Figure: timeline of rounds 0 to 55, marked with node arrivals and departures.]

Slide31

Measuring Churn

Metric: per-round-churn

In every round:

At most k nodes join.

At most k nodes leave.

Ex: k=log n: at most log n nodes join / leave per round.

Ex: k=1: at most 1 node joins / leaves per round.
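As a rough illustration, both churn metrics can be checked against a recorded schedule of arrivals and departures. The function name and the list-based schedule format are assumptions made here, and "every r rounds" is read as "every window of r consecutive rounds", which is one possible interpretation.

```python
def respects_churn_bound(joins, leaves, k, r=1):
    """Check: at most k joins and at most k leaves in every window of r rounds.

    joins[i] / leaves[i] are the numbers of nodes joining / leaving in round i.
    r=1 gives the per-round metric; r>1 gives the per-period metric.
    """
    for counts in (joins, leaves):
        window = 0                      # events in the last r rounds
        for i, c in enumerate(counts):
            window += c
            if i >= r:
                window -= counts[i - r] # drop the round that left the window
            if window > k:
                return False
    return True
```

For example, one join in each of two consecutive rounds satisfies the per-round bound with k=1 but violates the per-period bound with k=1, r=2.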

[Figure: timeline of rounds 0 to 55, marked with node arrivals and departures.]

Slide32

Measuring Churn

Half-life of a System

Minimum time in which either:

Number of nodes doubles.

Number of nodes halves.

If half life is H and there are n nodes at time t, then between [t, t+H], there are at least (n/2) and at most 2n nodes in the system.
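A minimal sketch of the half-life as stated on this slide, computed from a trace of node counts. It is an assumption-laden simplification: it looks only at the net number of nodes per round (all counts positive), not at which individual nodes arrived or departed, and the function name is hypothetical.

```python
def half_life(counts):
    """Smallest H such that, for some start round t, the node count at round
    t+H is at least double, or at most half, the count at round t.

    counts[t] is the (positive) number of nodes in round t.
    Returns None if the trace never doubles or halves.
    """
    n = len(counts)
    for h in range(1, n):               # try the shortest gap first
        for t in range(n - h):
            doubled = counts[t + h] >= 2 * counts[t]
            halved = 2 * counts[t + h] <= counts[t]
            if doubled or halved:
                return h
    return None
```

On the trace `[6, 6, 7, 8, 9, 12]` the count first doubles (6 to 12) over a gap of 4 rounds, so the function returns 4.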

[Figure: timeline of rounds 0 to 55, marked with node arrivals and departures; example with H = 5, n = 6.]

Slide33

Measuring Churn

Half-life of a System

Ideal Theorem:

If the half-life of the system is Ω(log n), then the overlay is always good.

If half life is H and there are n nodes at time t, then between [t, t+H], there are at least (n/2) and at most 2n nodes in the system.

Minimum time in which either:

Number of nodes doubles.

Number of nodes halves.

Slide34

Tolerating Churn

Chord

Advantages:

Tolerates high rate of churn.

Overlay has many good properties.

Simple to implement.

Disadvantages:

Oblivious adversary

Fragile

Limitations on where nodes can join?

Only supports one topology.

Once the ring is disrupted, all is lost!

Easy to attack, at risk of correlated failures.

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide35

Tolerating Churn

Chord

Basic overlay:

Ring topology

Nodes distributed randomly on ring.

Edges:

successor

predecessor

“fingers”

Approximates a hypercube.
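The ring structure can be sketched as follows. This is an illustrative construction, not the protocol itself: it assumes global knowledge of all node identifiers, an identifier space of size 2**m, and the common convention that finger j points to the first node at clockwise distance at least 2**j. The function and field names are hypothetical.

```python
import random


def build_ring(num_nodes, m, seed=0):
    """Place nodes at random distinct positions on a ring of size 2**m and
    return {node_id: {"successor", "predecessor", "fingers"}}."""
    rng = random.Random(seed)
    ids = sorted(rng.sample(range(2 ** m), num_nodes))

    def successor_of(point):
        # First node at or after `point`, wrapping around the ring.
        for node in ids:
            if node >= point:
                return node
        return ids[0]

    ring = {}
    for i, node in enumerate(ids):
        ring[node] = {
            "successor": ids[(i + 1) % len(ids)],
            "predecessor": ids[i - 1],
            # Finger j: first node at clockwise distance >= 2**j.
            "fingers": [successor_of((node + 2 ** j) % 2 ** m) for j in range(m)],
        }
    return ring
```

With this convention finger 0 coincides with the successor, and the remaining fingers give the exponentially spaced shortcuts that make the ring resemble a hypercube.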

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide36

Tolerating Churn

Chord

Maintenance:

Ignore “fingers.”

Only matters for routing/lookup.

Easy to rebuild.

Maintain successor/predecessor.

Ensures that ring remains a ring.

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide37

Tolerating Churn

Chord

Assume adversary is oblivious:

Adversary does not know where nodes live on the ring.

New nodes are inserted “randomly.”

Nodes are deleted “randomly.”

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide38

Tolerating Churn

Chord

Think about crashes:

Assume at most n/2 nodes crash.

Ring is disconnected?

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide39

Tolerating Churn

Chord

Think about crashes:

Assume at most n/2 nodes crash.

Maintain connection to log(n) successors.

With high probability, at least 1 successor survives.
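A back-of-the-envelope version of the successor-list argument, under a simplifying assumption that is not stated on the slide: each node crashes independently with probability 1/2. A node with s = c·log2(n) successors then loses all of them with probability 2^(-s) = n^(-c), and a union bound over n nodes gives n^(1-c). The function name and the constant c are hypothetical.

```python
import math


def prob_some_node_isolated(n, c=2.0, p_crash=0.5):
    """Union bound on the probability that some node loses all its successors,
    assuming independent crashes with probability p_crash (an assumption)."""
    s = math.ceil(c * math.log2(n))   # successor-list length
    p_all_crash = p_crash ** s        # one node loses every successor
    return min(1.0, n * p_all_crash)  # union bound over all n nodes
```

With n = 1024 and c = 2 the bound is 2^(-10), and it shrinks further as c grows.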

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide40

Tolerating Churn

Chord

Think about joining:

Assume at most n/2 nodes join.

Joining node is attached “randomly.”

Creates appendages hanging off ring.

At most O(log n) joining nodes in one place.

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide41

Tolerating Churn

Chord

Repair mechanism:

Query successors for their successors.

Query successors for their predecessor.

Update local state.

Notify successors of changes.

Note: I’m being a bit imprecise.
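One way to picture the repair loop is the following single-successor sketch, loosely modeled on the stabilize/notify pattern commonly associated with Chord. It is deliberately simplified and rests on assumptions: it keeps only one successor (no successor list), ignores crashes, and uses hypothetical names (`ChordNode`, `between`). It should not be read as the exact protocol analyzed in the cited paper.

```python
def between(x, a, b):
    """True if x lies strictly inside the clockwise ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b   # interval wraps past zero (a == b: all but a)


class ChordNode:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def stabilize(self):
        # Query the successor for its predecessor; adopt it if it is closer.
        p = self.successor.predecessor
        if p is not None and between(p.id, self.id, self.successor.id):
            self.successor = p
        # Notify the (possibly new) successor that we exist.
        self.successor.notify(self)

    def notify(self, candidate):
        # Update local state if the candidate is a better predecessor.
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate


# Two-node ring, then a third node joins by attaching to b.
a, b, c = ChordNode(10), ChordNode(50), ChordNode(30)
a.successor, b.successor = b, a
a.stabilize(); b.stabilize()
c.successor = b              # joining node hangs off the ring
c.stabilize(); a.stabilize(); b.stabilize()
```

After these rounds the appendage has been spliced in and the ring reads 10, 30, 50 in successor order.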

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide42

Tolerating Churn

Chord

Repair mechanism:

Query successors for their successors.

Query successors for their predecessor.

Update local state.

Notify successors of changes.

Note: I’m being a bit imprecise.

Beware: what if this node is deleted?

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide43

Tolerating Churn

Chord

Time analysis:

Every O(1) steps (in expectation), one insert/delete is resolved.

Within O(log n) steps, with high probability, all inserts/deletes from the last half-life are resolved.

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide44

Tolerating Churn

Chord

Analysis ideas:

Define an almost good state.

Show that in the almost good state, the overlay has good properties.

Show that if the half-life length is Ω(log n), then the overlay is always almost good.

[Liben-Nowell, Balakrishnan, Karger, PODC‘02]

Slide45

Tolerating Churn

Chord Summary

Advantages:

Tolerates high rate of churn.

Overlay has many good properties.

Simple to implement.

Disadvantages:

Oblivious adversary

Fragile

Limitations on where nodes can join.

Only supports one topology.

Once the ring is disrupted, all is lost!

Easy to attack, at risk of correlated failures.

*Re-Chord (KKS11) yields O(n log n) stabilization.

Slide46

Tolerating Churn

Chord is fragile?

Think about crashes:

Within O(n^c) steps, log(n) consecutive nodes crash.

If you wait long enough, bad events happen.

Then the ring is disconnected.

It will take a long time to fix!

Slide47

Tolerating Churn

Chord details are tricky

Recent work by Pamela Zave has found bugs/imprecision in original Chord descriptions:

“Using lightweight modeling to understand Chord”

“Why the Chord Ring-Maintenance Protocol is not Correct”

“How to make Chord correct”

Conclusion: Chord works, but only if you really get it right.

Slide48

Fix-it-Felix Approach

Chord tolerates churn.

Approach:

Start with good overlay.

Bound rate of churn.

Fix the overlay faster than the churn destroys it.

Maintain good properties throughout.

I can fix it!

Slide49

Tolerating Churn

Alternate Fix-it-Felix approach

Load-balanced hypercube:

At most O(log n) arrivals/departures per round.

Basic idea: map O(log n) nodes to each vertex of a hypercube.

Load-balance among hypercube vertices to cope with churn.

Grow/shrink hypercube as needed.

[Kuhn, Schmid, Wattenhofer, IPTPS’05]

Slide50

Tolerating Churn

Alternate Fix-it-Felix approach

Load-balanced hypercube:

At most O(log n) arrivals/departures per round.

Basic idea: map O(log n) nodes to each vertex of a hypercube.

Load-balance among hypercube vertices to cope with churn.

Grow/shrink hypercube as needed.

[Kuhn, Schmid, Wattenhofer, IPTPS’05]

Interesting ideas:

Map real nodes to a virtual topology.

Use load balancing to keep real nodes well distributed.

Tolerates adaptive (and omniscient) adversary.

Note: (unavoidable) weaker per-round adversarial limit.

Slide51

Tolerating Churn

Alternate Fix-it-Felix approach

Random (dynamic) graph:

Maintain a random constant-degree graph.

Up to θ(n/log n) churn per round.

Based on random walks.

[Augustine, Pandurangan, Robinson, Roche, Upfal, FOCS’15]

Disadvantages:

Not routable.

Oblivious adversary

Network size remains fixed.

Limitation on where new nodes can be attached.

Slide52

Tolerating Churn

Alternate Fix-it-Felix approach

XHeal:

Fixes any graph that has deletions (and insertions).

Carefully replaces missing nodes with small expanders.

Preserves good properties of original graph:

Bounded stretch: log(n)

Bounded degree-growth

[Pandurangan, Trehan, PODC’11]

Slide53

Tolerating Churn

Alternate Fix-it-Felix approach

XHeal:

Fixes any graph that has deletions (and insertions).

Carefully replaces missing nodes with small expanders.

Preserves good properties of original graph:

Bounded stretch: log(n)

Bounded degree-growth

[Pandurangan, Trehan, PODC’11]

Interesting aspects:

One change at a time.

Local repair.

Works for any topology/graph.

Slide54

Fix-it-Felix Approach

Moral of the story:

General ideas:

Local repair.

Simple updates.

Maintain approximate structure.

Several existing solutions:

Oblivious adversary.

Varying rates of churn.

Subtle dependence on the joining model.

Challenges:

Adaptive adversary

Tricky analysis, since failures are ongoing.

Hard to keep costs proportional to changes.

I can fix it!

Slide55

Overlays

A Play in Three Acts

Act I : Fix-it-Felix and the Half-Life-Hobgoblin

Wherein we meet our hero Felix, as he races to keep up with the Half-Life-Hobgoblin, fixing the overlay as fast as it is being destroyed.

Act II : Bob-the-Builder and the Destabilizing-Demon

Wherein we meet our hero Bob, as he counters the Destabilizing-Demon, rebuilding the overlay no matter how badly damaged.

Act III : Here Be Dragons

Wherein we explore the dangerous and unknown path forward and away from the hobgoblins and demons of our despair.

Slide56

Introductions

Bob-the-Builder

Basic Statistics

Home: Sunflower Valley

Role: Building Contractor

Antagonist: None.

Tool: Distributed team of building machinery.

Powers: Teamwork.

Can we build it? Yes we can!

Slide57

Bob-the-Builder Approach

How to cope with changes?

Rebuild good overlay:

On insert: rebuild!

On departure: rebuild!

On any change: rebuild!

Basic philosophy:

Assume network is in some arbitrary topology.

Build a good overlay.

Build it... fast!

Can we build it? Yes we can!

Slide58

Bob-the-Builder Approach

How to build an overlay?

From any initial state:

Build a good overlay.

Fast construction.

No ongoing churn.

Self-stabilization:

Arbitrary initial state corruption.

Converges to a good state.

Can we build it? Yes we can!

Slide59

Dynamic Graph Model

Assume an arbitrary initial state:

Initially:

Graph is given in an arbitrary connected topology.

State of the nodes may be corrupted.

Slide60

Dynamic Graph Model

Construct a good overlay:

In every round:

Exchange messages with neighbors.

Adjust edges.

Improve the overlay.

No further joins/leaves allowed (until overlay is constructed).

Slide61

Dynamic Graph Model

Stabilization:

Eventually:

Overlay is constructed.

Good properties are guaranteed.

In good state, joins and leaves may be supported.

Slide62

Bob-the-Builder Approach

How to build an overlay?

From any initial state:

Build a good overlay.

Fast construction.

No ongoing churn.

Self-stabilization:

Arbitrary initial state corruption.

Converges to a good state.

Can we build it? Yes we can!

Slide63

Self-Stabilizing Overlays

Skip+ Overlay Graphs:

Advantages:

Stabilizes quickly.

Supports efficient joins and leaves.

Simple rules.

Disadvantages:

Large message / communication complexity (in the worst case).

Large degree during construction (in the worst case).

Oblivious adversary (for departures).

Churn??

Only supports one topology.

[Jacob, Richa, Scheideler, Schmid, Taubig, PODC’09]

Slide64

Skip+ Overlays

Classical Skip List

[Figure: classical skip list; nodes labeled with bit strings.]

Slide65

Skip+ Overlays

Classical Skip List

[Figure: classical skip list; nodes labeled with bit strings.]

Advantages:

Fast search / insert in log(n) time.

Fault-tolerant

Slide66

Skip+ Overlays

Classical Skip List

[Figure: classical skip list; nodes labeled with bit strings.]

Disadvantage: congestion

Only one root.

Load is not balanced over nodes.

Slide67

Skip+ Overlays

Skip Graph

[Figure: skip graph; nodes labeled with bit strings.]

Slide68

Skip+ Overlays

Skip+ Graph

[Figure: Skip+ graph; nodes labeled with bit strings.]

Slide69

Skip+ Overlays

How to build it?

Overlay guarantees:

Fast searches / routing.

Fast joins / leaves.

Small diameter.

Small degree.

Overlay construction issues:

Initially, diameter of the graph may be large.

How do nodes find their neighbors efficiently?

How do nodes sort themselves properly into linked lists?

Leverage parallelism? Not one insertion at a time!

Slide70

Skip+ Overlays

Simple trick: pointer doubling

In every round:

Send your entire address book to all your neighbors.

In every round, your “knowledge diameter” doubles.

Note: each round squares the adjacency matrix.

Within log(n) rounds, graph is a clique.
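Pointer doubling is easy to simulate. The sketch below assumes a connected, undirected starting graph given as a dictionary of neighbor sets, and counts synchronous rounds until every address book is complete. The function name and input format are assumptions for illustration.

```python
def pointer_doubling_rounds(adjacency):
    """adjacency: {node: set of neighbours} for a connected undirected graph.
    Returns the number of rounds until every node knows every other node."""
    # Each address book starts with the node itself and its neighbours.
    known = {u: set(vs) | {u} for u, vs in adjacency.items()}
    n = len(known)
    rounds = 0
    while any(len(book) < n for book in known.values()):
        updated = {u: set(book) for u, book in known.items()}
        for u, book in known.items():
            for v in book:
                updated[v] |= book   # u sends its entire address book to v
        known = updated
        rounds += 1
    return rounds


# A path with 9 nodes has diameter 8: knowledge radius grows 1, 2, 4, 8.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 9} for i in range(9)}
rounds = pointer_doubling_rounds(path)
```

The simulation also makes the cost visible: in the final rounds every node holds, and sends, an address book of size close to n.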

[Figure: pointer-doubling example; labels 4, 2, 8, 12.]

Slide71

Skip+ Overlays

Simple trick: pointer doubling

Once you have a clique:

Delete edges not in the overlay.

Final graph matches good topology.

Within log(n)+1 rounds, graph is Skip+.

[Example: Skip List]

Slide72

Skip+ Overlays

Simple trick: pointer doubling

Bad news:

Degree grows very large: n-1.

Messages grow very large: θ(n).

Number of messages per round is large: θ(n^2).

Conclusion: fast, but inefficient, overlay construction.

[Figure: pointer-doubling example; labels 4, 2, 8, 12.]

Slide73

Skip+ Overlay

More efficient construction:

Basic idea:

Minimize expensive doubling steps.

Route nodes directly using existing overlay, if possible.

Local view:

Each node uses its local view to determine who its neighbors should be in the final overlay.

Stable edges:

An edge (u,v) is stable if it appears to be in the final Skip+ overlay according to both the views of u and v.

Otherwise, an edge is temporary.

Slide74

Skip+ Overlays

Simple rules:

Rule 1: Introduce Friends

Notify stable neighbors of neighbors they should know about (including yourself).

Rule 2: Forward Temporary Edges (Routing)

Forward a temporary neighbor to a stable neighbor with the largest shared prefix.

Rule 3a: Introduce All (Pointer Doubling)

If your set of stable neighbors changes (e.g., due to a change in views), then introduce all your neighbors to each other.

Rule 3b: Linearize

Help organize your neighbors at the same “level” into a linked list.

Slide75

Skip+ Overlay

How does it work?

In general:

Guarantees fast O(log n) overlay construction.

Best effort to restrain message size / degree during construction.

Locally checkable:

Nodes can locally determine whether the overlay is correct.

If every node thinks it is correct, then it is!

Bad news:

Still can lead to large degree, large messages, and many messages per round, in the worst case.

Slide76

Self-Stabilizing Overlays

Skip+ Overlay Graphs:

Advantages:

Stabilizes quickly.

Supports efficient joins and leaves.

Simple rules.

Disadvantages:

Large message / communication complexity (in the worst case).

Large degree during construction (in the worst case).

Oblivious adversary (for departures).

Churn??

Only supports one topology.

Slide77

Self-Stabilizing Overlays

Skip+ Overlay Graphs:

Many improvements, similar ideas:

HSkip+ : Heterogeneous bandwidth

Corona : Deterministic

TCF-Skip+: Generalization, local detection analysis, etc.

[Figure: Skip+ graph; nodes labeled with bit strings.]

Slide78

Self-Stabilizing Overlays

Alternate Bob-the-Builder approach

Patricia-Tree-like Overlay:

Forms a tree from a weakly connected graph.

Asynchronous.

Communication-efficient:

Small messages, O(1) per round per node.

Low contention

[Angluin, Aspnes, Chen, Wu, Yin, SPAA’05]

Disadvantages:

Initially: every node has low degree.

Builds a tree.

Later version (SSS’07) builds better overlay.

Slide79

Bob-the-Builder Approach

Moral of the story:

General ideas:

Deterministic, exact final structure.

Converge to final structure.

Graph square or component merging to converge fast.

Several existing solutions:

Oblivious adversary.

No joins/leaves during stabilization.

Challenges:

How to keep messages small?

How to avoid graph squaring?

How to tolerate ongoing changes?

Can we build it? Yes we can!

Slide80

Overlays

A Play in Three Acts

Act I : Fix-it-Felix and the Half-Life-Hobgoblin

Wherein we meet our hero Felix, as he races to keep up with the Half-Life-Hobgoblin, fixing the overlay as fast as it is being destroyed.

Act II : Bob-the-Builder and the Destabilizing-Demon

Wherein we meet our hero Bob, as he counters the Destabilizing-Demon, rebuilding the overlay no matter how badly damaged.

Act III : Here Be Dragons

Wherein we explore the dangerous and unknown path forward and away from the hobgoblins and demons of our despair.

Slide81

Big Picture

Goals:

Good overlay:

Small diameter / small degree: O(log n).

Routable / good expansion.

Fast and robust construction:

Rapid formation from any initial state.

Self-stabilizing.

Churn tolerant:

Tolerates large fractions of nodes joining and leaving every round.

Oblivious adversary (?).

Efficient construction/maintenance:

Small messages (e.g., size O(log n)).

Message-efficient (e.g., 1 message per node per round).

Slide82

Collaborators: Gopal Pandurangan, Peter Robinson, and Amitabh Trehan

Slide83

Overview

Plan:

Virtual overlays:

Define good topologies.

Map virtual nodes to real servers.

Merging two good overlays:

Start with two good overlays, connected by one link.

Build new good overlay.

Construction algorithm:

Divide-and-conquer.

Merge and collapse components.

Robustness:

Self-stabilization.

Churn.

Slide84

Virtual Overlays

What is a good topology?

Sequence of graphs G_1, G_2, …, G_n

Graph G_j has [j, 2j] nodes.

Graph G_j has O(log j) degree and diameter.

Expandable:

There exists a mapping from each node in G_j to 1 or 2 nodes in G_2j.

Random sampling:

Supports a mechanism for randomly sampling nodes.

E.g., fast random walks.

Permutation routing:

Supports efficiently routing permutations.

Note: many (deterministic) expanders satisfy these requirements.

Slide85

Virtual Overlays

Map virtual topology to real nodes

For each real node:

Assign O(1) virtual nodes to it.

Connect two real nodes if any of their virtual nodes are connected.

Inspiration: The Forgiving Tree [HST’12], DeX [PRT’14] and [KSW’05]
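The virtual-to-real mapping can be sketched in a few lines. The input format (a list of virtual edges and a `host` dictionary from virtual node to real node) and the function name are assumptions for illustration; the rule itself is the one on the slide: two real nodes are connected if any of their virtual nodes are connected.

```python
def real_overlay(virtual_edges, host):
    """virtual_edges: iterable of (a, b) virtual-node pairs.
    host: {virtual node: real node}. Returns the set of real edges."""
    real_edges = set()
    for a, b in virtual_edges:
        u, v = host[a], host[b]
        if u != v:   # both endpoints on the same server need no real edge
            real_edges.add(frozenset((u, v)))
    return real_edges


# A virtual 4-cycle 0-1-2-3-0 hosted on three servers.
virtual_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
host = {0: "A", 1: "A", 2: "B", 3: "C"}
edges = real_overlay(virtual_edges, host)
```

Because each real node hosts only O(1) virtual nodes, the real degree stays within a constant factor of the virtual degree.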

[Figure: virtual nodes mapped to real nodes.]

Slide86

Overview

Plan:

Virtual overlays:

Define good topologies.

Map virtual nodes to real servers.

Merging two good overlays:

Start with two good overlays, connected by one link.

Build new good overlay.

Construction algorithm:

Divide-and-conquer.

Merge and collapse components.

Robustness:

Self-stabilization.

Churn.

Slide87

Merging

Combining two overlay networks

Assume two overlay networks:

Each overlay is a properly mapped virtual topology.

The two overlays are connected by one edge.

Key challenge: merge the two overlay networks.

Slide88

Merging

Combining two overlay networks

Step 1: Grow the virtual topology

Double one overlay (using topology mapping G_n → G_2n).

Creates excess virtual nodes.

Slide89

Merging

Combining two overlay networks

Step 1: Grow the virtual topology

Double one overlay (using topology mapping G_n → G_2n).

Creates excess virtual nodes.

Step 2: Create new edges

Use permutation routing.

Slide90

Merging

Combining two overlay networks

Step 3: Distribute excess virtual nodes

Send new virtual nodes on a random walk of old topology.

(Use random sampling of target topology.)

Slide91

Merging

Combining two overlay networks

Step 3: Distribute excess virtual nodes

Send new virtual nodes on a random walk of old topology.

(Use random sampling of target topology.)

Problem: only one bridge.

Slide92

Merging

Combining two overlay networks

Step 3: Distribute excess virtual nodes

Send new virtual nodes on a random walk of old topology.

(Use random sampling of target topology.)

Every success creates a new bridge.

Number of bridges doubles (i.e., exponential growth).

Slide93

Merging

Combining two overlay networks

Step 3: Distribute excess virtual nodes

Send new virtual nodes on a random walk of old topology.

(Use random sampling of target topology.)

Problem: eventually, hard to find an empty node.

Slide94

Merging

Combining two overlay networks

Step 3: Distribute excess virtual nodes

Full nodes send new virtual nodes on random walk of old topology.

If many nodes are empty, they are easy to find.

Empty nodes send requests on random walk of the old topology.

If most nodes are full, then there are lots of bridges.

Slide95

Step 4: Rebalancing and clean up

Balance/reduce the virtual nodes (maybe).

Drop the old topology. (Or keep it as a backup.)

End result: a new instantiation of the virtual topology.

Merging

Combining two overlay networks

Slide96

Overview

Plan:

Virtual overlays:

Define good topologies.

Map virtual nodes to real servers.

Merging two good overlays:

Start with two good overlays, connected by one link.

Build new good overlay.

Construction algorithm:

Divide-and-conquer.

Merge and collapse components.

Robustness:

Self-stabilization.

Churn.

Slide97

Overlay Construction

Divide-and-Conquer Algorithm

1. Collections: maintain a collection of overlays.

Initially, each node is its own collection.

Collections are connected by edges between (real) nodes.

2. Matching: find a matching in the graph of collections.

Pair up collections.

3. Merge: combine pairs of collections.

Merge components.

Combine two components into a single new collection.

4. Repeat: while there is > 1 collection, go to Step 2.

Inspired by [Angluin, Aspnes, Chen, Wu, Yin, SPAA’05]

Slide98

Overlay Construction

Divide-and-Conquer Algorithm

1. Collections: maintain a collection of overlays.

Initially, each node is its own collection.

Collections are connected by edges between (real) nodes.

2. Matching: find a matching in the graph of collections.

Pair up collections.

3. Merge: combine pairs of collections.

Merge components.

Combine two components into a single new collection.

4. Repeat: while there is > 1 collection, go to Step 2.

Problem: matching may be small!

Slide99

Overlay Construction

Divide-and-Conquer Algorithm

1. Collections: maintain a collection of overlays.

Initially, each node is its own collection.

Collections are connected by edges between (real) nodes.

2. Sparsify: reduce degree of graph of collections.

Create constant degree graph.

Each node organizes its “children” into a line.

3. Matching: find a matching in the graph of collections.

Pair up collections.

4. Merge: combine pairs of collections.

Merge components.

Combine two components into a single new collection.

5. Repeat: while there is > 1 collection, go to Step 2.

Slide100

Overlay Construction

Divide-and-Conquer Algorithm
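The collect, match, merge, repeat loop of the divide-and-conquer algorithm can be sketched as below. This is a toy model under explicit assumptions: nodes are integers, the graph is connected, merging two collections is abstracted to relabeling (the real merge of virtual topologies is not modeled), the matching is a simple greedy one, and the sparsify step is omitted. The function name is hypothetical.

```python
def construct_by_merging(nodes, edges):
    """Repeat match-and-merge until one collection remains; return rounds used.
    Assumes the graph (nodes, edges) is connected and nodes are integers."""
    collection = {v: v for v in nodes}   # node -> label of its collection
    rounds = 0
    while len(set(collection.values())) > 1:
        matched = {}                     # collection label -> partner label
        for u, v in edges:
            cu, cv = collection[u], collection[v]
            # Greedy maximal matching on the graph of collections.
            if cu != cv and cu not in matched and cv not in matched:
                matched[cu], matched[cv] = cv, cu
        for v in nodes:
            c = collection[v]
            if c in matched:
                collection[v] = min(c, matched[c])   # merged pair shares a label
        rounds += 1
    return rounds


path_rounds = construct_by_merging(range(4), [(0, 1), (1, 2), (2, 3)])
star_rounds = construct_by_merging(range(5), [(0, 1), (0, 2), (0, 3), (0, 4)])
```

On a path the number of collections halves each round, but on a star only one leaf can be matched with the center per round. That is the "matching may be small" problem, and it is what the sparsify step is meant to address.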

Initially: connected graph

Slide101

Overlay Construction

Divide-and-Conquer Algorithm

Sparsify: reduce the degree

Slide102

Overlay Construction

Divide-and-Conquer Algorithm

Repeat: match and merge

Slide103

Overlay Construction

Divide-and-Conquer Algorithm

Repeat: match and merge

Slide104

Overlay Construction

Divide-and-Conquer Algorithm

Repeat: match and merge

Slide105

Overlay Construction

Divide-and-Conquer Algorithm

Repeat: match and merge

Slide106

Overlay Construction

Divide-and-Conquer Algorithm

Repeat: match and merge

Slide107

Overlay Construction

Efficiency Analysis

Divide-and-Conquer:

Collection has small diameter: O(log n) cost to coordinate.

Collection merging: O(polylog n) cost per merge step.

Number of iterations: O(log n) matchings

Overall: O(polylog n) time to form overlay.

Merging:

Random walks: O(log n) cost

Bridge doubling: O(log n) iterations

Overall: O(polylog n) time to merge topologies.

Slide108

Overview

Plan:

Virtual overlays:

Define good topologies.

Map virtual nodes to real servers.

Merging two good overlays:

Start with two good overlays, connected by one link.

Build new good overlay.

Construction algorithm:

Divide-and-conquer.

Merge and collapse components.

Robustness:

Self-stabilization.

Churn.

Slide109

Self-Stabilization

Detecting Trouble

Slide110

Self-Stabilization

Detecting Trouble

Local detection:

If the topology is bad, some node can detect it.

Virtual topology is deterministic.

Can verify assignment of virtual nodes to real nodes, since each real node is assigned 1 or 2 virtual nodes.

Slide111

Self-Stabilization

Detecting Trouble

Local detection:

If the topology is bad, some node can detect it.

Virtual topology is deterministic.

Can verify assignment of virtual nodes to real nodes, since each real node is assigned 1 or 2 virtual nodes.

Detection diameter:

Assume there is a topology problem.

Detection diameter = max distance of any node from a node that can detect the problem.

In any expander, detection diameter is O(log n), even if real graph diameter is much larger!

[Defined by Berns, Ghosh, Pemmaraju: SSS’11]

Slide112

Self-Stabilization

Detecting Trouble

Local detection:

If the topology is bad, some node can detect it.

Virtual topology is deterministic.

Can verify assignment of virtual nodes to real nodes, since each real node is assigned 1 or 2 virtual nodes.

Detection diameter:

Assume there is a topology problem.

Detection diameter = max distance of any node from a node that can detect the problem.

In any expander, detection diameter is O(log n), even if real graph diameter is much larger!

Conclusion: Within O(log n) time, every node learns of a topology error.

Slide113

Self-stabilization

Basic idea:

If a node detects an error:

Flood notification to everyone.

Time: O(log n).

On notification of error:

Restart overlay construction from scratch.

Within O(polylog n) time, new overlay.

Slide114

Self-stabilization

Basic idea:

If a node detects an error:

Flood notification to everyone.

Time: O(log n).

On notification of error:

Restart overlay construction from scratch.

Within O(polylog n) time, new overlay.

Remaining issue: synchronization

Different nodes begin rebuilds at different times.

Components synchronize as they merge.

Slide115

Overview

Plan:

Virtual overlays:

Define good topologies.

Map virtual nodes to real servers.

Merging two good overlays:

Start with two good overlays, connected by one link.

Build new good overlay.

Construction algorithm:

Divide-and-conquer.

Merge and collapse components.

Robustness:

Self-stabilization.

Churn.Slide116

Churn

Departures and FailuresSlide117

What types of failures?

Graceful: execute an exit protocol.

Oblivious: adversary decides failures in advance.

Random: adversary fails nodes at random.

Adaptive: adversary chooses failure on-line.

Connected: adversary never disconnects graph.Slide118

Overlay tolerates failures

Expanders are highly fault-tolerant.

For “good” target topologies, overlay maintains good properties, even if nodes fail.

As long as not too many nodes fail, all is good.Slide119

Churn

Repairing failures

Option 1. Local repair

If a node leaves gracefully, hand off virtual nodes.

If a node crashes, neighbors regenerate virtual nodes.

Rebalance (via random walks).

Conjecture: tolerates small changes.

[as in Kuhn, Schmid, Wattenhofer, IPTPS’05]Slide120
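
The local-repair option (hand off or regenerate virtual nodes, then rebalance via random walks) might look roughly as follows. This is a sketch under stated assumptions, not the scheme of Kuhn, Schmid, and Wattenhofer: the names `hosts`, `remove_node`, `rebalance`, and `MAX_LOAD` are hypothetical, a graceful hand-off and a regeneration after a crash are modelled as the same reassignment to a neighbour, and the real graph is assumed to stay connected.

```python
import random

MAX_LOAD = 2  # from the slides: each real node hosts 1 or 2 virtual nodes


def hosted(hosts, real):
    """Virtual nodes currently mapped to the real node `real`."""
    return [v for v, r in hosts.items() if r == real]


def remove_node(adj, hosts, gone, rng):
    """Local repair after `gone` leaves or crashes.

    adj:   dict real node -> set of real neighbours (mutated in place)
    hosts: dict virtual node -> real node (mutated in place)
    Returns the virtual nodes that had to be reassigned.
    """
    neighbours = sorted(adj.pop(gone))
    for nb in neighbours:
        adj[nb].discard(gone)
    orphans = hosted(hosts, gone)
    for v in orphans:
        hosts[v] = rng.choice(neighbours)  # assumes `gone` had a neighbour
    return orphans


def rebalance(adj, hosts, rng, max_steps=10_000):
    """Move excess virtual nodes along random walks to under-loaded hosts."""
    for real in list(adj):
        while len(hosted(hosts, real)) > MAX_LOAD:
            v = hosted(hosts, real)[0]
            cur = real
            for _ in range(max_steps):
                cur = rng.choice(sorted(adj[cur]))
                if len(hosted(hosts, cur)) < MAX_LOAD:
                    hosts[v] = cur
                    break
            else:
                return False  # walk gave up; a full rebuild would be needed
    return True


# Hypothetical example: 6 real nodes in a ring hosting 8 virtual nodes.
rng = random.Random(0)
adj = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
hosts = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 3, 6: 4, 7: 5}
remove_node(adj, hosts, 0, rng)
rebalance(adj, hosts, rng)
print({r: hosted(hosts, r) for r in adj})
```

Because a virtual node is only ever moved onto a host with spare capacity, the rebalancing step never creates a new overloaded host.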

Option 2. Periodic Rebuild

Every so often, initiate a complete rebuild.

Assume failures oblivious or random (in some sense).

As long as “half-life” > “rebuild time”, everything continues to work.Slide121
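
The periodic-rebuild condition “half-life” > “rebuild time” can be made concrete with a small calculation. The slides do not define either quantity precisely, so everything here is an assumption for illustration: each node is taken to fail independently with probability `p_fail` per round, half-life is the number of rounds after which a node has survived with probability 1/2, and rebuild time is modelled as `c * (log2 n)^k` rounds with hypothetical constants `c` and `k`.

```python
import math


def half_life(p_fail):
    """Rounds until a node has survived with probability 1/2,
    assuming independent failure probability p_fail per round."""
    return math.log(0.5) / math.log(1.0 - p_fail)


def rebuild_time(n, c=1.0, k=2):
    """Hypothetical polylogarithmic rebuild cost: c * (log2 n)^k rounds."""
    return c * math.log2(n) ** k


def periodic_rebuild_ok(n, p_fail, c=1.0, k=2):
    """True when the assumed half-life exceeds the assumed rebuild time."""
    return half_life(p_fail) > rebuild_time(n, c, k)


# With n = 1024 the assumed rebuild time is 100 rounds.
print(periodic_rebuild_ok(1024, 0.001))  # prints True  (half-life about 693 rounds)
print(periodic_rebuild_ok(1024, 0.01))   # prints False (half-life about 69 rounds)
```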

Churn

Key Question:

Does the construction procedure work when there are ongoing failures?

1. Merging fails if links connecting collections fail:

With random/oblivious failures, not too many links fail per iteration.

2. Collection coordination is more difficult:

Can no longer rely on a leader or an aggregation tree in the collection.

Matching is still doable.

Sparsification is harder.Slide122

Joining is easy:

Find an extra virtual node.

Route virtual node to new entrant.

Or wait until the next rebuild.

Churn

Joining nodes

How do nodes join?

Random: introduced to a random node.

Oblivious: introduced to a node chosen by the adversary in advance.

Limited adversarial: introduced to a node, but adversary cannot introduce more than one at each location.Slide123

Overview

Plan:

Virtual overlays:

Define good topologies.

Map virtual nodes to real servers.

Merging two good overlays:

Start with two good overlays, connected by one link.

Build new good overlay.

Construction algorithm:

Divide-and-conquer.

Merge and collapse components.

Robustness:

Self-stabilization.

Churn.Slide124

Overlays

A Play in Three Acts

Act I: Fix-it-Felix and the Half-Life-Hobgoblin

Wherein we meet our hero Felix, as he races to keep up with the Half-Life-Hobgoblin, fixing the overlay as fast as it is being destroyed.

Act II: Bob-the-Builder and the Destabilizing-Demon

Wherein we meet our hero Bob, as he counters the Destabilizing-Demon, rebuilding the overlay no matter how badly damaged.

Act III: Here Be Dragons

Wherein we explore the dangerous and unknown path forward and away from the hobgoblins and demons of our despair.Slide125

Wrap-up

Overlay Networks

Many approaches:

One zillion different base topologies.

Many different repair techniques.

Real-world implementations.

Algorithms and theory.Slide126

A few existing overlay networks…

Chord, Kademlia, Pastry, Tapestry, Skip+, HSkip+, Patricia Tries, Dex, XHeal, Forgiving Tree, PGrid, Skipnet, RN Protocol, dHamiltonianCycles, Avatar, Ca-Re-Chord, HyperCubes, HyperRing, Chameleon, Re-Chord, Tiara, CoronaSlide127

Wrap-up

A few things I have learned

Churn analysis is hard:

Relatively few papers really allow failures to happen at any time.

Analysis is handwavy, and sometimes wrong (see, e.g., Chord).

Can we do a better job of understanding / proving results regarding churn?

We can fix it!Slide128

Wrap-up

A few things I have learned

Self-stabilization is also hard:

System can start in any state!

Easy to ignore corner cases.

Still, easier because there are no ongoing failures.

Tends to be expensive, due to lots of local checking and rebuilding.

Can we build it?

Yes we can!Slide129

Wrap-up

A few things I have learned

Integrating both churn and self-stabilization:

Hard to stabilize if failures are ongoing!

Competing demands:

Self-stabilization treats every failure as a disaster!

Churn-tolerance tries to ignore/repair small failures.Slide130

Wrap-up

A few things I have learned

Communication costs:

Many existing solutions rely on “graph squaring” to handle worst-case.

Churn-tolerance is relatively cheap (when it works).

Self-stabilization tends to be much more (communication) expensive.Slide131

Wrap-up

A few things I have learned

No one has any idea how to deal with Byzantine/malicious or greedy participants in a dynamic overlay network!

Exception: see “Brahms: Byzantine resilient random membership sampling” by Bortnikov, Gurevich, Keidar, Kliot, Shraer

Exception: see “Self-stabilizing and Byzantine-Tolerant Overlay Network” by Dolev, Hoch, and van RenesseSlide132

Wrap-up

A few things I have learned

Many interesting ideas.

Many interesting techniques.

No one “right” solution yet.Slide133

Can we build it?

Yes we can!

We can fix it!