Slide1
Bob-the-Builder vs. Fix-it-Felix
Maintaining Overlays in Dynamic Graphs
Seth Gilbert
Slide2
What is an overlay?
Overlay Networks
Given a collection of (changing) servers:
Choose a good subset of the edges.
Slide3
What is an overlay?
Overlay Networks
Given a collection of (changing) servers:
Choose a good subset of the edges.
Subgraph has low degree.
Subgraph has low diameter.
Slide4
What is an overlay?
Overlay Networks
Given a collection of (changing) servers:
Choose a good subset of the edges.
Subgraph has low degree.
Subgraph has low diameter.
Maintain edges as servers come and go.
Can we build it?
Yes we can!
We can fix it!
Slide5
Bitcoin is broken
(So why are you wasting your time here when you could be busy hacking bitcoins?)
Slide6
What goes wrong?
Bitcoin is a peer-to-peer overlay network
Overlay is used for all communication.
Overlay is assumed to be reliable.
Slide7
What goes wrong?
Bitcoin is a peer-to-peer overlay network
Overlay is used for all communication.
Overlay is assumed to be reliable.
Nodes arrive and leave all the time.
Malicious/greedy users can partition the network.
Slide8
What goes wrong?
Basic overlay maintenance idea:
Connect to a few arbitrary neighbors.
Maintain a fixed number of neighbors.
Accept incoming connection requests.
Details: “Eclipse Attacks on Bitcoin’s Peer-to-Peer Network” by Heilman, Kendler, Zohar, Goldberg
Bitcoin is a peer-to-peer overlay network
Overlay is used for all communication.
Overlay is assumed to be reliable.
Nodes arrive and leave all the time.
Malicious/greedy users can partition the network.
Slide9
Why is this a hard problem?
Bitcoin
Rapid and significant churn
Users constantly arriving and departing.
Network changes continuously.
No stable state.
Malicious / greedy users
Some users may not follow the protocol.
Or all users may be greedy!
Long-lived system
Slide10
I am not going to solve the Bitcoin problem today.
Slide11
A few existing overlay networks…
Chord
Kademlia
Pastry
Tapestry
Skip+
HSkip+
Patricia Tries
DeX
XHeal
Forgiving Tree
PGrid
Skipnet
RN Protocol
dHamiltonianCycles
Avatar
Ca-Re-Chord
HyperCubes
HyperRing
Chameleon
Re-Chord
Tiara
Corona
Slide12
Today’s Goals
A Workshop Talk
What do we know?
Discussion of existing approaches.
Is the problem solved?
Are the existing solutions practical?
Ongoing work
Some ideas that we are thinking about.
Challenge for us to work on
To work on after the workshop is over.
Collaborators: Gopal Pandurangan, Peter Robinson, and Amitabh Trehan
Slide13
Overlay Networks
Slide14
Overlay Networks
Ground Rules
Underlying network:
Collection of nodes.
Nodes arrive (join).
Joining node is connected to someone.
Nodes leave (fail).
Precise model TBA.
Slide15
Overlay Networks
Ground Rules
Communication:
Every node has an address.
Every node has an address book.
A node can send a message to every node in the address book.
A node can send an address to another node.
[Figure: nodes Joe and Sue; a message carrying the address < Sue >.]
Slide16
Overlay Networks
Goals
Overlay network:
Low degree: constant or logarithmic.
Low diameter: logarithmic or polylogarithmic.
Note: every existing solution guarantees these properties.
Slide17
Overlay Networks
Goals
Routable:
There exist short paths…
… and we can find short paths.
Note: random graphs may not be good!
Slide18
Overlay Networks
Goals
Other properties:
Good expansion
Good conductance
Random walks converge quickly
Diameter is small
Slide19
Overlay Networks
Issues
Churn:
Nodes arrive.
Nodes leave.
Note: Not considering Byzantine/malicious failures today!
Slide20
Overlay Networks
Issues
Synchrony vs. Asynchrony:
Synchronous: computation proceeds in rounds.
Asynchronous: arbitrary message/computation delays
Today: assume computation proceeds in rounds.
Slide21
Overlay Networks
Issues
Oblivious vs. Adaptive scheduler/adversary:
Oblivious: schedule/arrivals/failures fixed in advance.
Adaptive: schedule/arrivals/failures depends on execution.
Questions:
Do actions of the algorithm correlate with crashes?
Is there an attacker using knowledge of the system?
Attacker can crash critical nodes?
Sending many messages can overload a link?
Slide22
Overlay Networks
Metrics
Rate of churn:
How fast can nodes join and leave?
What is the maximum rate that an algorithm can tolerate?
Slide23
Overlay Networks
Metrics
Rate of recovery:
What happens when something goes wrong?
How fast does the algorithm reconstruct a good overlay?
Slide24
Overlay Networks
Metrics
Costs:
Message complexity: how many messages per round?
Communication complexity: how many bits of communication?
Quiescent complexity: what happens if changes stop?
Adaptive complexity: how do costs relate to the changes?
Slide25
Overlay Networks
Overlay Networks
Given a collection of (changing) servers:
Choose a good subset of the edges.
Subgraph has low degree.
Subgraph has low diameter.
Maintain edges as servers come and go.
Slide26
Overlays
A Play in Three Acts
Act I : Fix-it-Felix and the Half-Life-Hobgoblin
Wherein we meet our hero Felix, as he races to keep up with the Half-Life-Hobgoblin, fixing the overlay as fast as it is being destroyed.
Act II : Bob-the-Builder and the Destabilizing-Demon
Wherein we meet our hero Bob, as he counters the Destabilizing-Demon, rebuilding the overlay no matter how badly damaged.
Act III : Here Be Dragons
Wherein we explore the dangerous and unknown path forward and away from the hobgoblins and demons of our despair.
Slide27
Introductions
Fix-it-Felix
Basic Statistics
Home: Niceland
Role: Good Guy
Antagonist: Ralph
Tool: magic hammer
Powers: fixing anything
I can fix it!
Slide28
Fix-it-Felix Approach
How to cope with changes?
Fix it:
On insert: fix it!
On departure: fix it!
On any change: fix it!
Basic philosophy:
Fix changes as soon as they occur.
Fix it right then and there.
Fixes are local and immediate.
I can fix it!
Slide29
Fix-it-Felix Approach
How to cope with changes?
Approach:
Start with good overlay.
Bound rate of churn.
Fix the overlay faster than the churn destroys it.
Maintain good properties throughout.
I can fix it!
Slide30
Measuring Churn
Metric: per-period-churn
In every round:
At most k nodes join every r rounds.
At most k nodes leave every r rounds.
Ex: k=1, r=log n
At most 1 node joins / leaves every log(n) rounds.
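One way to make the per-period metric concrete is a small checker over a recorded schedule. This is only a sketch: the function name `respects_churn_bound` and the sliding-window reading of "every r rounds" are assumptions, not something the slides pin down. Setting r = 1 gives a per-round bound.

```python
def respects_churn_bound(joins, leaves, k, r):
    """Check a churn schedule against a (k, r) bound.

    joins[t] and leaves[t] are the numbers of nodes joining / leaving in round t.
    Returns True if every window of r consecutive rounds contains at most k joins
    and at most k leaves (sliding-window interpretation, an assumption).
    """
    rounds = len(joins)
    # If the schedule is shorter than r rounds, a single window covers all of it.
    for start in range(max(1, rounds - r + 1)):
        window = slice(start, start + r)
        if sum(joins[window]) > k or sum(leaves[window]) > k:
            return False
    return True
```

For example, one join and one leave every three rounds passes with k=1, r=3, while two joins in adjacent rounds does not.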
[Figure: timeline of rounds 0 to 55 marked with arrive and depart events]
Slide31
Measuring Churn
Metric: per-round-churn
In every round:
At most k nodes join.
At most k nodes leave.
Ex: k=log n
At most log n nodes join / leave per round.
Ex: k=1
At most 1 node joins / leaves per round.
[Figure: timeline of rounds 0 to 55 marked with arrive and depart events]
Slide32
Measuring Churn
Half-life of a System
Minimum time in which either:
Number of nodes doubles.
Number of nodes halves.
If the half-life is H and there are n nodes at time t, then during [t, t+H] there are at least (n/2) and at most 2n nodes in the system.
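The half-life can be estimated from a trace of system sizes. The sketch below is an illustration under stated assumptions: it works on net node counts per round (as the slide words the definition), and the name `half_life` and the list-of-sizes input are hypothetical choices, not part of the talk.

```python
def half_life(sizes):
    """Smallest number of rounds h such that, for some start round t,
    the system size at t+h is at least double or at most half the size at t.

    sizes[t] is the number of live nodes in round t.
    Returns None if the size never doubles or halves within the trace.
    """
    best = None
    for t, start in enumerate(sizes):
        for h in range(1, len(sizes) - t):
            now = sizes[t + h]
            if now >= 2 * start or 2 * now <= start:
                if best is None or h < best:
                    best = h
                break  # a later h from the same start cannot be smaller
    return best
```

A trace that grows from 6 to 12 nodes over five rounds has half-life 5 under this definition.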
[Figure: timeline of rounds 0 to 55 marked with arrive and depart events; example with H = 5, n = 6]
Slide33
Measuring Churn
Half-life of a System
Ideal Theorem:
If the half-life of the system is Ω(log n), then the overlay is always good.
If the half-life is H and there are n nodes at time t, then during [t, t+H] there are at least (n/2) and at most 2n nodes in the system.
Minimum time in which either:
Number of nodes doubles.
Number of nodes halves.
Slide34
Tolerating Churn
Chord
Advantages:
Tolerates high rate of churn.
Overlay has many good properties.
Simple to implement.
Disadvantages:
Oblivious adversary
Fragile
Limitations on where nodes can join?
Only supports one topology.
Once the ring is disrupted, all is lost!
Easy to attack, at risk of correlated failures.
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide35
Tolerating Churn
Chord
Basic overlay:
Ring topology
Nodes distributed randomly on ring.
Edges:
successor
predecessor
“fingers”
Approximates a hypercube.
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide36
Tolerating Churn
Chord
Maintenance:
Ignore “fingers.”
Only matters for routing/lookup.
Easy to rebuild.
Maintain successor/predecessor.
Ensures that ring remains a ring.
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide37
Tolerating Churn
Chord
Assume adversary is oblivious:
Adversary does not know where nodes live on the ring.
New nodes are inserted “randomly.”
Nodes are deleted “randomly.”
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide38
Tolerating Churn
Chord
Think about crashes:
Assume at most n/2 nodes crash.
Ring is disconnected?
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide39
Tolerating Churn
Chord
Think about crashes:
Assume at most n/2 nodes crash.
Maintain connection to log(n) successors.
With high probability, at least 1 successor survives.
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide40
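A back-of-the-envelope version of the successor-list argument for crashes, as a sketch only. It assumes each node crashes independently with probability at most 1/2 (one way to model an oblivious adversary that crashes at most n/2 nodes) and that every node keeps c log_2 n successors; the constant c is an assumption, not a value given in the talk.

```latex
\begin{align*}
\Pr[\text{all successors of a fixed node crash}]
  &\le \left(\tfrac{1}{2}\right)^{c \log_2 n} = n^{-c}, \\
\Pr[\text{some node loses all of its successors}]
  &\le n \cdot n^{-c} = n^{1-c} \quad \text{(union bound over $n$ nodes)}.
\end{align*}
```

With c = 2, for example, the ring stays connected except with probability at most 1/n under these assumptions.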
Tolerating Churn
Chord
Think about joining:
Assume at most n/2 nodes join.
Joining node is attached “randomly.”
Creates appendages hanging off ring.
At most O(log n) joining nodes in one place.
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide41
Tolerating Churn
Chord
Repair mechanism:
Query successors for their successors.
Query successors for their predecessor.
Update local state.
Notify successors of changes.
Note: I’m being a bit imprecise.
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide42
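The predecessor-query and notify steps of the Chord repair mechanism can be sketched as a toy, single-process simulation. This is a simplified illustration, not the protocol as analyzed in the cited paper: it keeps one successor per node rather than a list of log(n) successors, ignores crashes and message delays, and the class and helper names are hypothetical.

```python
def between(x, a, b):
    """True if identifier x lies strictly inside the clockwise ring interval (a, b)."""
    if a < b:
        return a < x < b
    # The interval wraps around the end of the identifier space.
    return x > a or x < b


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self      # a lone node is its own successor
        self.predecessor = None

    def stabilize(self):
        # Query the successor for its predecessor; adopt it if it sits between us.
        candidate = self.successor.predecessor
        if candidate is not None and between(candidate.id, self.id, self.successor.id):
            self.successor = candidate
        # Notify the (possibly new) successor that we believe we precede it.
        self.successor.notify(self)

    def notify(self, other):
        # Update local state if `other` is a closer predecessor than the current one.
        if self.predecessor is None or between(other.id, self.predecessor.id, self.id):
            self.predecessor = other
```

In this toy setting, a node that joins by pointing at its correct successor is spliced into the ring after a couple of stabilization rounds.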
Tolerating Churn
Chord
Repair mechanism:
Query successors for their successors.
Query successors for their predecessor.
Update local state.
Notify successors of changes.
Note: I’m being a bit imprecise.
Beware: what if this node is deleted?
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide43
Tolerating Churn
Chord
Time analysis:
Every O(1) steps (in expectation), one insert/delete is resolved.
Within O(log n) steps, with high probability, all inserts/deletes from the last half-life are resolved.
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide44
Tolerating Churn
Chord
Analysis ideas:
Define an almost good state.
Show that in the almost good state, the overlay has good properties.
Show that if the half-life is Ω(log n), then the overlay is always almost good.
[Liben-Nowell, Balakrishnan, Karger, PODC’02]
Slide45
Tolerating Churn
Chord Summary
Advantages:
Tolerates high rate of churn.
Overlay has many good properties.
Simple to implement.
Disadvantages:
Oblivious adversary
Fragile
Limitations on where nodes can join.
Only supports one topology.
Once the ring is disrupted, all is lost!
Easy to attack, at risk of correlated failures.
*Re-Chord (KKS11) yields O(n log n) stabilization.
Slide46
Tolerating Churn
Chord is fragile?
Think about crashes:
Within O(n^c) steps, log(n) consecutive nodes crash.
If you wait long enough, bad events happen.
Then the ring is disconnected.
It will take a long time to fix!
Slide47
Tolerating Churn
Chord details are tricky
Recent work by Pamela Zave has found bugs/imprecision in original Chord descriptions:
Using lightweight modeling to understand Chord
Why the Chord Ring-Maintenance Protocol is not Correct
How to make Chord correct
Conclusion: Chord works, but only if you really get it right.
Slide48
Fix-it-Felix Approach
Chord tolerates churn.
Approach:
Start with good overlay.
Bound rate of churn.
Fix the overlay faster than the churn destroys it.
Maintain good properties throughout.
I can fix it!
Slide49
Tolerating Churn
Alternate Fix-it-Felix approach
Load-balanced hypercube:
At most O(log n) arrivals/departures per round.
Basic idea: map O(log n) nodes to each vertex of a hypercube.
Load-balance among hypercube vertices to cope with churn.
Grow/shrink hypercube as needed.
[Kuhn, Schmid, Wattenhofer, IPTPS’05]
Slide50
Tolerating Churn
Alternate Fix-it-Felix approach
Load-balanced hypercube:
At most O(log n) arrivals/departures per round.
Basic idea: map O(log n) nodes to each vertex of a hypercube.
Load-balance among hypercube vertices to cope with churn.
Grow/shrink hypercube as needed.
[Kuhn, Schmid, Wattenhofer, IPTPS’05]
Interesting ideas:
Map real nodes to a virtual topology.
Use load balancing to keep real nodes well distributed.
Tolerates adaptive (and omniscient) adversary.
Note: (unavoidable) weaker per-round adversarial limit.
Slide51
Tolerating Churn
Alternate Fix-it-Felix approach
Random (dynamic) graph:
Maintain a random constant-degree graph.
Up to θ(n/log n) churn per round.
Based on random walks.
[Augustine, Pandurangan, Robinson, Roche, Upfal, FOCS’15]
Disadvantages:
Not routable.
Oblivious adversary
Network size remains fixed.
Limitation on where new nodes can be attached.
Slide52
Tolerating Churn
Alternate Fix-it-Felix approach
XHeal:
Fixes any graph that has deletions (and insertions).
Carefully replaces missing nodes with small expanders.
Preserves good properties of original graph:
Bounded stretch: log(n)
Bounded degree-growth
[Pandurangan, Trehan, PODC’11]
Slide53
Tolerating Churn
Alternate Fix-it-Felix approach
XHeal:
Fixes any graph that has deletions (and insertions).
Carefully replaces missing nodes with small expanders.
Preserves good properties of original graph:
Bounded stretch: log(n)
Bounded degree-growth
[Pandurangan, Trehan, PODC’11]
Interesting aspects:
One change at a time.
Local repair.
Works for any topology/graph.
Slide54
Fix-it-Felix Approach
Moral of the story:
General ideas:
Local repair.
Simple updates.
Maintain approximate structure.
Several existing solutions:
Oblivious adversary.
Varying rates of churn.
Subtle dependence on the joining model.
Challenges:
Adaptive adversary
Tricky analysis, since failures are ongoing.
Hard to keep costs proportional to changes.
I can fix it!
Slide55
Overlays
A Play in Three Acts
Act I : Fix-it-Felix and the Half-Life-Hobgoblin
Wherein we meet our hero Felix, as he races to keep up with the Half-Life-Hobgoblin, fixing the overlay as fast as it is being destroyed.
Act II : Bob-the-Builder and the Destabilizing-Demon
Wherein we meet our hero Bob, as he counters the Destabilizing-Demon, rebuilding the overlay no matter how badly damaged.
Act III : Here Be Dragons
Wherein we explore the dangerous and unknown path forward and away from the hobgoblins and demons of our despair.
Slide56
Introductions
Bob-the-Builder
Basic Statistics
Home: Sunflower Valley
Role: Building Contractor
Antagonist: None.
Tool: Distributed team of building machinery.
Powers: Teamwork.
Can we build it? Yes we can!
Slide57
Bob-the-Builder Approach
How to cope with changes?
Rebuild good overlay:
On insert: rebuild!
On departure: rebuild!
On any change: rebuild!
Basic philosophy:
Assume network is in some arbitrary topology.
Build a good overlay.
Build it... fast!
Can we build it? Yes we can!
Slide58
Bob-the-Builder Approach
How to build an overlay?
From any initial state:
Build a good overlay.
Fast construction.
No ongoing churn.
Self-stabilization:
Arbitrary initial state corruption.
Converges to a good state.
Can we build it? Yes we can!
Slide59
Dynamic Graph Model
Assume an arbitrary initial state:
Initially:
Graph is given in an arbitrary connected topology.
State of the nodes may be corrupted.
Slide60
Dynamic Graph Model
Construct a good overlay:
In every round:
Exchange messages with neighbors.
Adjust edges.
Improve the overlay.
No further joins/leaves allowed (until overlay is constructed).
Slide61
Dynamic Graph Model
Stabilization:
Eventually:
Overlay is constructed.
Good properties are guaranteed.
In good state, joins and leaves may be supported.
Slide62
Bob-the-Builder Approach
How to build an overlay?
From any initial state:
Build a good overlay.
Fast construction.
No ongoing churn.
Self-stabilization:
Arbitrary initial state corruption.
Converges to a good state.
Can we build it? Yes we can!
Slide63
Self-Stabilizing Overlays
Skip+ Overlay Graphs:
Advantages:
Stabilizes quickly.
Supports efficient joins and leaves.
Simple rules.
Disadvantages:
Large message / communication complexity (in the worst case).
Large degree during construction (in the worst case).
Oblivious adversary (for departures).
Churn??
Only supports one topology.
[Jacob, Richa, Scheideler, Schmid, Taubig, PODC’09]
Slide64
Skip+ Overlays
Classical Skip List
[Figure: classical skip list; nodes labeled with short bit strings such as 000, 001, 010, 011]
Slide65
Skip+ Overlays
Classical Skip List
[Figure: classical skip list; nodes labeled with short bit strings such as 000, 001, 010, 011]
Advantages:
Fast search / insert in log(n) time.
Fault-tolerant
Slide66
Skip+ Overlays
Classical Skip List
[Figure: classical skip list; nodes labeled with short bit strings such as 000, 001, 010, 011]
Disadvantage: congestion
Only one root.
Load is not balanced over nodes.
Slide67
Skip+ Overlays
Skip Graph
[Figure: skip graph; nodes labeled with short bit strings such as 000, 001, 010, 011]
Slide68
Skip+ Overlays
Skip+ Graph
[Figure: Skip+ graph; nodes labeled with short bit strings such as 000, 001, 010, 011]
Slide69
Skip+ Overlays
How to build it?
Overlay guarantees:
Fast searches / routing.
Fast joins / leaves.
Small diameter.
Small degree.
Overlay construction issues:
Initially, diameter of the graph may be large.
How do nodes find their neighbors efficiently?
How do nodes sort themselves properly into linked lists?
Leverage parallelism? Not one insertion at a time!
Slide70
Skip+ Overlays
Simple trick: pointer doubling
In every round:
Send your entire address book to all your neighbors.
In every round, your “knowledge diameter” doubles.
Note: each round squares the adjacency matrix.
Within log(n) rounds, graph is a clique.
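Pointer doubling is easy to simulate. The sketch below assumes a synchronous model with symmetric address books (if u knows v, then v knows u) and represents each address book as a Python set; the function names are hypothetical and chosen only for this illustration.

```python
def pointer_doubling_round(book):
    """One synchronous round: every node sends its whole address book,
    plus its own address, to every node currently in that book.

    book maps each node to the set of addresses it knows.
    """
    new_book = {u: set(known) for u, known in book.items()}
    for u, known in book.items():
        for v in known:
            new_book[v] |= known | {u}
    for u in new_book:
        new_book[u].discard(u)  # a node does not list itself
    return new_book


def rounds_until_clique(book):
    """Count rounds until every node knows every other node."""
    n = len(book)
    rounds = 0
    while any(len(known) < n - 1 for known in book.values()):
        updated = pointer_doubling_round(book)
        if updated == book:
            raise ValueError("no progress: the initial graph is not connected")
        book = updated
        rounds += 1
    return rounds
```

Starting from a path of 8 nodes (diameter 7), the distance each node can "see" goes 1, 2, 4, 8, so three rounds suffice. The price is that every address book ends up holding n-1 entries.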
[Figure: example graph with values 4, 2, 8, 12]
Slide71
Skip+ Overlays
Simple trick: pointer doubling
Once you have a clique:
Delete edges not in the overlay.
Final graph matches good topology.
Within log(n)+1 rounds, graph is Skip+.
[Example: Skip List]
Slide72
Skip+ Overlays
Simple trick: pointer doubling
Bad news:
Degree grows very large: n-1.
Messages grow very large: θ(n).
Number of messages per round is large: θ(n^2).
Conclusion: fast, but inefficient, overlay construction.
[Figure: example graph with values 4, 2, 8, 12]
Slide73
Skip+ Overlay
More efficient construction:
Basic idea:
Minimize expensive doubling steps.
Route nodes directly using existing overlay, if possible.
Local view:
Each node uses its local view to determine who its neighbors should be in the final overlay.
Stable edges:
An edge (u,v) is stable if it appears to be in the final Skip+ overlay according to both the views of u and v.
Otherwise, an edge is temporary.
Slide74
Skip+ Overlays
Simple rules:
Rule 1: Introduce Friends
Notify stable neighbors of neighbors they should know about (including yourself).
Rule 2: Forward Temporary Edges (Routing)
Forward a temporary neighbor to a stable neighbor with the largest shared prefix.
Rule 3a: Introduce All (Pointer Doubling)
If your set of stable neighbors changes (e.g., due to a change in views), then introduce all your neighbors to each other.
Rule 3b: Linearize
Help organize your neighbors at the same “level” into a linked list.
Slide75
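The forwarding choice in the Skip+ rule "Forward Temporary Edges" can be illustrated with a few lines of code. This sketch assumes node identifiers are bit strings compared from the left; the function names and the tie-breaking (first best match wins) are hypothetical and not taken from the Skip+ paper.

```python
def shared_prefix_len(a, b):
    """Length of the longest common prefix of two identifier strings."""
    length = 0
    for x, y in zip(a, b):
        if x != y:
            break
        length += 1
    return length


def forward_target(temporary_id, stable_ids):
    """Pick the stable neighbor sharing the longest prefix with the temporary neighbor."""
    return max(stable_ids, key=lambda s: shared_prefix_len(s, temporary_id))
```

For instance, a temporary neighbor `0110` would be forwarded to stable neighbor `0101` rather than `0000` or `1110`, since they agree on the first two bits.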
Skip+ Overlay
How does it work?
In general:
Guarantees fast O(log n) overlay construction.
Best effort to restrain message size / degree during construction.
Locally checkable:
Nodes can locally determine whether the overlay is correct.
If every node thinks it is correct, then it is!
Bad news:
Still can lead to large degree, large messages, and many messages per round, in the worst case.
Slide76
Self-Stabilizing Overlays
Skip+ Overlay Graphs:
Advantages:
Stabilizes quickly.
Supports efficient joins and leaves.
Simple rules.
Disadvantages:
Large message / communication complexity (in the worst case).
Large degree during construction (in the worst case).
Oblivious adversary (for departures).
Churn??
Only supports one topology.
Slide77
Self-Stabilizing Overlays
Skip+ Overlay Graphs:
Many improvements, similar ideas:
HSkip+ : Heterogeneous bandwidth
Corona : Deterministic
TCF-Skip+: Generalization, local detection analysis, etc.
[Figure: overlay with nodes labeled with short bit strings such as 000, 001, 010, 011]
Slide78
Self-Stabilizing Overlays
Alternate Bob-the-Builder approach
Patricia-Tree-like Overlay:
Forms a tree from a weakly connected graph.
Asynchronous.
Communication-efficient:
Small messages, O(1) per round per node.
Low contention
[Angluin, Aspnes, Chen, Wu, Yin, SPAA’05]
Disadvantages:
Initially: every node has low degree.
Builds a tree.
Later version (SSS’07) builds better overlay.
Slide79
Bob-the-Builder Approach
Moral of the story:
General ideas:
Deterministic, exact final structure.
Converge to final structure.
Graph square or component merging to converge fast.
Several existing solutions:
Oblivious adversary.
No joins/leaves during stabilization.
Challenges:
How to keep messages small?
How to avoid graph squaring?
How to tolerate ongoing changes?
Can we build it? Yes we can!
Slide80
Overlays
A Play in Three Acts
Act I : Fix-it-Felix and the Half-Life-Hobgoblin
Wherein we meet our hero Felix, as he races to keep up with the Half-Life-Hobgoblin, fixing the overlay as fast as it is being destroyed.
Act II : Bob-the-Builder and the Destabilizing-Demon
Wherein we meet our hero Bob, as he counters the Destabilizing-Demon, rebuilding the overlay no matter how badly damaged.
Act III : Here Be Dragons
Wherein we explore the dangerous and unknown path forward and away from the hobgoblins and demons of our despair.
Slide81
Big Picture
Goals:
Good overlay:
Small diameter / small degree: O(log n).
Routable / good expansion.
Fast and robust construction:
Rapid formation from any initial state.
Self-stabilizing.
Churn tolerant:
Tolerates large fractions of nodes joining and leaving every round.
Oblivious adversary (?).
Efficient construction/maintenance:
Small messages (e.g., size O(log n)).
Message-efficient (e.g., 1 message per node per round).
Slide82
Collaborators: Gopal Pandurangan, Peter Robinson, and Amitabh Trehan
Slide83
Overview
Plan:
Virtual overlays:
Define good topologies.
Map virtual nodes to real servers.
Merging two good overlays:
Start with two good overlays, connected by one link.
Build new good overlay.
Construction algorithm:
Divide-and-conquer.
Merge and collapse components.
Robustness:
Self-stabilization.
Churn.
Slide84
Virtual Overlays
What is a good topology?
Sequence of graphs G_1, G_2, …, G_n
Graph G_j has [j, 2j] nodes.
Graph G_j has O(log j) degree and diameter.
Expandable:
There exists a mapping from each node in G_j to 1 or 2 nodes in G_2j.
Random sampling:
Supports a mechanism for randomly sampling nodes.
E.g., fast random walks.
Permutation routing:
Supports efficiently routing permutations.
Note: many (deterministic) expanders satisfy these requirements.
Slide85
Virtual Overlays
Map virtual topology to real nodes
For each real node:
Assign O(1) virtual nodes to it.
Connect two real nodes if any of their virtual nodes are connected.
Inspiration: The Forgiving Tree [HST’12], DeX [PRT’14] and [KSW’05]
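The rule "connect two real nodes if any of their virtual nodes are connected" amounts to projecting the virtual graph through the assignment. A minimal sketch, assuming the virtual topology is given as an edge list and the assignment as a dictionary `host` from virtual node to real node (both names are hypothetical):

```python
def real_overlay(virtual_edges, host):
    """Derive the real overlay from a virtual topology.

    virtual_edges: iterable of (a, b) pairs of virtual nodes.
    host: dict mapping each virtual node to the real node simulating it.
    Returns a set of undirected real edges, each stored as a sorted tuple.
    """
    real_edges = set()
    for a, b in virtual_edges:
        u, v = host[a], host[b]
        if u != v:  # both virtual endpoints on the same server need no link
            real_edges.add(tuple(sorted((u, v))))
    return real_edges
```

If each real node hosts O(1) virtual nodes of degree d, its real degree is at most O(d), which is why the degree bound of the virtual topology carries over.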
[Figure: virtual topology mapped onto real nodes]
Slide86
Overview
Plan:
Virtual overlays:
Define good topologies.
Map virtual nodes to real servers.
Merging two good overlays:
Start with two good overlays, connected by one link.
Build new good overlay.
Construction algorithm:
Divide-and-conquer.
Merge and collapse components.
Robustness:
Self-stabilization.
Churn.
Slide87
Merging
Combining two overlay networks
Assume two overlay networks:
Each overlay is a properly mapped virtual topology.
The two overlays are connected by one edge.
Key challenge: merge the two overlay networks.
Slide88
Merging
Combining two overlay networks
Step 1: Grow the virtual topology
Double one overlay (using topology mapping G_n → G_2n).
Creates excess virtual nodes.
Slide89
Merging
Combining two overlay networks
Step 1: Grow the virtual topology
Double one overlay (using topology mapping G_n → G_2n).
Creates excess virtual nodes.
Step 2: Create new edges
Use permutation routing.
Slide90
Merging
Combining two overlay networks
Step 3: Distribute excess virtual nodes
Send new virtual nodes on a random walk of old topology.
(Use random sampling of target topology.)
Slide91
Merging
Combining two overlay networks
Step 3: Distribute excess virtual nodes
Send new virtual nodes on a random walk of old topology.
(Use random sampling of target topology.)
Problem: only one bridge.
Slide92
Merging
Combining two overlay networks
Step 3: Distribute excess virtual nodes
Send new virtual nodes on a random walk of old topology.
(Use random sampling of target topology.)
Every success creates a new bridge.
Number of bridges doubles (i.e., exponential growth).
Slide93
Merging
Combining two overlay networks
Step 3: Distribute excess virtual nodes
Send new virtual nodes on a random walk of old topology.
(Use random sampling of target topology.)
Problem: eventually, hard to find an empty node.
Slide94
Merging
Combining two overlay networks
Step 3: Distribute excess virtual nodes
Full nodes send new virtual nodes on random walk of old topology.
If many nodes are empty, they are easy to find.
Empty nodes send requests on random walk of the old topology.
If most nodes are full, then there are lots of bridges.
Slide95
Merging
Combining two overlay networks
Step 4: Rebalancing and clean up
Balance/reduce the virtual nodes (maybe).
Drop the old topology. (Or keep it as a backup.)
End result: a new instantiation of the virtual topology.
Slide96
Overview
Plan:
Virtual overlays:
Define good topologies.
Map virtual nodes to real servers.
Merging two good overlays:
Start with two good overlays, connected by one link.
Build new good overlay.
Construction algorithm:
Divide-and-conquer.
Merge and collapse components.
Robustness:
Self-stabilization.
Churn.
Slide97
Overlay Construction
Divide-and-Conquer Algorithm
1. Collections: maintain a collection of overlays.
Initially, each node is its own collection.
Collections are connected by edges between (real) nodes.
2. Matching: find a matching in the graph of collections.
Pair up collections.
3. Merge: combine pairs of collections.
Merge components.
Combine two components into a single new collection.
4. Repeat: while there is > 1 collection, go to Step 2.
Inspired by [Angluin, Aspnes, Chen, Wu, Yin, SPAA’05]
Slide98
Overlay Construction
Divide-and-Conquer Algorithm
1. Collections: maintain a collection of overlays.
Initially, each node is its own collection.
Collections are connected by edges between (real) nodes.
2. Matching: find a matching in the graph of collections.
Pair up collections.
3. Merge: combine pairs of collections.
Merge components.
Combine two components into a single new collection.
4. Repeat: while there is > 1 collection, go to Step 2.
Problem: matching may be small!
Slide99
Overlay Construction
Divide-and-Conquer Algorithm
1. Collections: maintain a collection of overlays.
Initially, each node is its own collection.
Collections are connected by edges between (real) nodes.
2. Sparsify: reduce degree of graph of collections.
Create constant degree graph.
Each node organizes its “children” into a line.
3. Matching: find a matching in the graph of collections.
Pair up collections.
4. Merge: combine pairs of collections.
Merge components.
Combine two components into a single new collection.
5. Repeat: while there is > 1 collection, go to Step 2.
Slide100
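The match-and-merge loop of the divide-and-conquer construction can be simulated centrally to see why the size of the matching matters. This is a sketch under simplifying assumptions: collections are just labels (no overlay is actually built inside them), the matching is a greedy maximal matching over the edge list in the given order, and the sparsify step is deliberately left out. The function name is hypothetical.

```python
def match_and_merge(n, edges):
    """Count match-and-merge iterations until a single collection remains.

    n: number of real nodes, labeled 0..n-1.
    edges: list of (u, v) pairs of the initial connected graph.
    """
    leader = list(range(n))  # leader[v] = label of the collection containing v
    iterations = 0
    while len(set(leader)) > 1:
        matched = set()
        pairs = []
        # Greedy matching in the graph of collections.
        for u, v in edges:
            cu, cv = leader[u], leader[v]
            if cu != cv and cu not in matched and cv not in matched:
                matched.update((cu, cv))
                pairs.append((cu, cv))
        if not pairs:
            raise ValueError("the initial graph is not connected")
        # Merge each matched pair into one collection.
        for cu, cv in pairs:
            leader = [cu if c == cv else c for c in leader]
        iterations += 1
    return iterations
```

On a path of 8 nodes the number of collections halves every iteration, so 3 iterations suffice. On a star of 8 nodes every edge touches the center's collection, the matching has size 1, and 7 iterations are needed, which is the situation a degree-reducing step is meant to avoid.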
Overlay Construction
Divide-and-Conquer Algorithm
Initially: connected graph
Slide101
Overlay Construction
Divide-and-Conquer Algorithm
Sparsify: reduce the degree
Slide102
Overlay Construction
Divide-and-Conquer Algorithm
Repeat: match and merge
Slide103
Overlay Construction
Divide-and-Conquer Algorithm
Repeat: match and merge
Slide104
Overlay Construction
Divide-and-Conquer Algorithm
Repeat: match and merge
Slide105
Overlay Construction
Divide-and-Conquer Algorithm
Repeat: match and merge
Slide106
Overlay Construction
Divide-and-Conquer Algorithm
Repeat: match and merge
Slide107
Overlay Construction
Efficiency Analysis
Divide-and-Conquer:
Collection has small diameter: O(log n) cost to coordinate.
Collection merging: O(polylog n) cost per merge step.
Number of iterations: O(log n) matchings
Overall: O(polylog n) time to form overlay.
Merging:
Random walks: O(log n) cost
Bridge doubling: O(log n) iterations
Overall: O(polylog n) time to merge topologies.
Slide108
Overview
Plan:
Virtual overlays:
Define good topologies.
Map virtual nodes to real servers.
Merging two good overlays:
Start with two good overlays, connected by one link.
Build new good overlay.
Construction algorithm:
Divide-and-conquer.
Merge and collapse components.
Robustness:
Self-stabilization.
Churn.
Slide109
Self-Stabilization
Detecting Trouble
Slide110
Self-Stabilization
Detecting Trouble
Local detection:
If the topology is bad, some node can detect it.
Virtual topology is deterministic.
Can verify assignment of virtual nodes to real nodes, since each real node is assigned 1 or 2 virtual nodes.
Slide111
Self-Stabilization
Detecting Trouble
Local detection:
If the topology is bad, some node can detect it.
Virtual topology is deterministic.
Can verify assignment of virtual nodes to real nodes, since each real node is assigned 1 or 2 virtual nodes.
Detection diameter:
Assume there is a topology problem.
Detection diameter = max distance of any node from a node that can detect the problem.
In any expander, detection diameter is O(log n), even if real graph diameter is much larger!
[Defined by Berns, Ghosh, Pemmaraju, SSS’11]
Slide112
Self-Stabilization
Detecting Trouble
Local detection:
If the topology is bad, some node can detect it.
Virtual topology is deterministic.
Can verify assignment of virtual nodes to real nodes, since each real node is assigned 1 or 2 virtual nodes.
Detection diameter:
Assume there is a topology problem.
Detection diameter = max distance of any node from a node that can detect the problem.
In any expander, detection diameter is O(log n), even if real graph diameter is much larger!
Conclusion: Within O(log n) time, every node learns of a topology error.
Slide113
Self-stabilization
Basic idea:
If a node detects an error:
Flood notification to everyone.
Time: O(log n).
On notification of error:
Restart overlay construction from scratch.
Within O(polylog n) time, new overlay.
Slide114
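The time to flood an error notification equals the distance from the detecting node to the farthest node, which a breadth-first search computes directly. The sketch below assumes a synchronous model where a message crosses one edge per round; the function name and the adjacency-dictionary input are hypothetical.

```python
from collections import deque


def flood_rounds(adj, detector):
    """Rounds needed for a notification flooded from `detector` to reach
    every node reachable from it, one hop per round.

    adj maps each node to an iterable of its neighbors.
    """
    dist = {detector: 0}
    queue = deque([detector])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())
```

As an illustration (the hypercube is chosen here only because its diameter is easy to state), flooding from any node of a 16-node hypercube takes 4 = log2(16) rounds.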
Self-stabilization
Basic idea:
If a node detects an error:
Flood notification to everyone.
Time: O(log n).
On notification of error:
Restart overlay construction from scratch.
Within O(polylog n) time, new overlay.
Remaining issue: synchronization
Different nodes begin rebuilds at different times.
Components synchronize as they merge.
Slide115
Overview
Plan:
Virtual overlays:
Define good topologies.
Map virtual nodes to real servers.
Merging two good overlays:
Start with two good overlays, connected by one link.
Build new good overlay.
Construction algorithm:
Divide-and-conquer.
Merge and collapse components.
Robustness:
Self-stabilization.
Churn.
Slide116
Churn
Departures and Failures
Slide117
Churn
Departures and Failures
What types of failures?
Graceful: execute an exit protocol.
Oblivious: adversary decides failures in advance.
Random: adversary fails nodes at random.
Adaptive: adversary chooses failure on-line.
Connected: adversary never disconnects graph.
Slide118
Churn
Departures and Failures
What types of failures?
Graceful: execute an exit protocol.
Oblivious: adversary decides failures in advance.
Random: adversary fails nodes at random.
Adaptive: adversary chooses failure on-line.
Connected: adversary never disconnects graph.
Overlay tolerates failures
Expanders are highly fault-tolerant.
For “good” target topologies, overlay maintains good properties, even if nodes fail.
As long as not too many nodes fail, all is good.
Slide119
Churn
Repairing failures
Option 1. Local repair
If a node leaves gracefully, hand off virtual nodes.
If a node crashes, neighbors regenerate virtual nodes.
Rebalance (via random walks).
Conjecture: tolerates small changes.
[as in Kuhn, Schmid, Wattenhofer, IPTPS’05]
Slide120
Churn
Repairing failures
Option 1. Local repair
If a node leaves gracefully, hand off virtual nodes.
If a node crashes, neighbors regenerate virtual nodes.
Rebalance (via random walks).
Conjecture: tolerates small changes.
[as in Kuhn, Schmid, Wattenhofer, IPTPS’05]
Option 2. Periodic Rebuild
Every so often, initiate a complete rebuild.
Assume failures oblivious or random (in some sense).
As long as “half-life” > “rebuild time”, everything continues to work.
Slide121
Churn
Key Question:
Does the construction procedure work when there are ongoing failures?
1. Merging fails if links connecting collections fail:
Random/oblivious failures: not too many links fail per iteration.
2. Collection coordination is more difficult
Can no longer rely on a leader or an aggregation tree in the collection.
Matching is still doable.
Sparsification is harder.
Slide122
Churn
Joining nodes
How do nodes join?
Random: introduced to a random node.
Oblivious: introduced to a node chosen by the adversary in advance.
Limited adversarial: introduced to a node, but adversary cannot introduce more than one at each location.
Joining is easy
Find an extra virtual node.
Route virtual node to new entrant.
Or wait until the next rebuild.
Slide123
Overview
Plan:
Virtual overlays:
Define good topologies.
Map virtual nodes to real servers.
Merging two good overlays:
Start with two good overlays, connected by one link.
Build new good overlay.
Construction algorithm:
Divide-and-conquer.
Merge and collapse components.
Robustness:
Self-stabilization.
Churn.
Slide124
Overlays
A Play in Three Acts
Act I : Fix-it-Felix and the Half-Life-Hobgoblin
Wherein we meet our hero Felix, as he races to keep up with the Half-Life-Hobgoblin, fixing the overlay as fast as it is being destroyed.
Act II : Bob-the-Builder and the Destabilizing-Demon
Wherein we meet our hero Bob, as he counters the Destabilizing-Demon, rebuilding the overlay no matter how badly damaged.
Act III : Here Be Dragons
Wherein we explore the dangerous and unknown path forward and away from the hobgoblins and demons of our despair.
Slide125
Wrap-up
Overlay Networks
Many approaches:
One zillion different base topologies.
Many different repair techniques.
Real-world implementations.
Algorithms and theory.
Slide126
A few existing overlay networks…
Chord
Kademlia
Pastry
Tapestry
Skip+
HSkip+
Patricia Tries
DeX
XHeal
Forgiving Tree
PGrid
Skipnet
RN Protocol
dHamiltonianCycles
Avatar
Ca-Re-Chord
HyperCubes
HyperRing
Chameleon
Re-Chord
Tiara
Corona
Slide127
Wrap-up
A few things I have learned
Churn analysis is hard:
Relatively few papers really allow failures to happen at any time.
Analysis is handwavy, and sometimes wrong (see, e.g., Chord).
Can we do a better job of understanding / proving results regarding churn?
We can fix it!
Slide128
Wrap-up
A few things I have learned
Self-stabilization is also hard:
System can start in any state! Easy to ignore corner cases.
Still, easier because there are no ongoing failures.
Tends to be expensive, due to lots of local checking and rebuilding.
Can we build it?
Yes we can!
Slide129
Wrap-up
A few things I have learned
Integrating both churn and self-stabilization:
Hard to stabilize if failures are ongoing!
Competing demands:
Self-stabilization treats every failure as a disaster!
Churn-tolerance tries to ignore/repair small failures.
Slide130
Wrap-up
A few things I have learned
Communication costs:
Many existing solutions rely on “graph squaring” to handle worst-case.
Churn-tolerance is relatively cheap (when it works).
Self-stabilization tends to be much more (communication) expensive.
Slide131
Wrap-up
A few things I have learned
No one has any idea how to deal with Byzantine/malicious or greedy participants in a dynamic overlay network!
Exception: see “Brahms: Byzantine resilient random membership sampling” by Bortnikov, Gurevitch, Keidar, Kliot, Shraer
Exception: see “Self-stabilizing and Byzantine-Tolerant Overlay Network” by Dolev, Hoch, and van Renesse
Slide132
Wrap-up
A few things I have learned
Many interesting ideas.
Many interesting techniques.
No one “right” solution yet.
Slide133
Can we build it?
Yes we can!
We can fix it!