Slide1
SCRIBE
A large-scale and decentralized application-level multicast infrastructure
Slide2
Overview
Pastry
PAST: distributed file system layered on top of Pastry
SCRIBE: decentralized publish/subscribe system
Slide3
Pastry – Quick Review
Chord-like routing
Consistent hashing
Prefix routing
Leaf set
Slide4
Pastry – locality properties
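Short routes
Total distance traveled by a message
Average route distance is 1.59 to 2.2 times the direct distance between source and destination
Route convergence
Distance traveled by each of two messages sent to the same key before their routes converge
Roughly equal to the distance between the two source nodes
Slide5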
Pastry API
nodeID = pastryInit(Credentials)
Causes the node to join the Pastry network
route(msg, key)
send(msg, IP-addr)
Applications must export:
deliver(msg, key)
forward(msg, key, nextID)
newLeafs(leafSet)
Slide6
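For the Pastry API slide above: the three calls that applications must export can be read as a callback interface. The following is a minimal sketch of that shape only, not the actual Pastry interface. The class names `PastryApplication` and `LoggingApp` are hypothetical, and the method names are adapted to Python style.

```python
from abc import ABC, abstractmethod


class PastryApplication(ABC):
    """Callbacks an application exports to Pastry (names adapted from the slide)."""

    @abstractmethod
    def deliver(self, msg, key):
        """Called on the node whose nodeID is numerically closest to key."""

    @abstractmethod
    def forward(self, msg, key, next_id):
        """Called just before msg is forwarded to the node with ID next_id."""

    @abstractmethod
    def new_leafs(self, leaf_set):
        """Called whenever the node's leaf set changes."""


class LoggingApp(PastryApplication):
    """Toy application (hypothetical) that only records the callbacks it receives."""

    def __init__(self):
        self.log = []

    def deliver(self, msg, key):
        self.log.append(("deliver", msg, key))

    def forward(self, msg, key, next_id):
        self.log.append(("forward", msg, key, next_id))

    def new_leafs(self, leaf_set):
        self.log.append(("newLeafs", tuple(leaf_set)))
```

SCRIBE plugs into Pastry through exactly these hooks: `forward` is where a JOIN can be intercepted on its way to the root, and `deliver` is where a message reaches the rendez-vous point.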
SCRIBE
Built on top of Pastry
Support large number of groups
Handle a high rate of membership turnover
SCRIBE nodes can:
Create groups
Join groups
Multicast messages to groups
Slide7
SCRIBE API
create(credentials, groupID)
join(credentials, groupID)
leave(credentials, groupID)
multicast(credentials, groupID, message)
Slide8
SCRIBE – Creating a Group
Pastry route(msg, key)
SCRIBE route(CREATE, groupID)
groupID => hash of the group's textual name concatenated with its creator's name
Message is delivered to the node whose nodeID is closest to groupID, which becomes the rendez-vous point for the group (root of the group's multicast tree)
Adds the group to its local list of groups
Stores credentials
Alternative: the creator uses itself as root
Good choice if the creator sends to the group often
Slide9
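For the group creation slide above: a minimal sketch of how a groupID could be derived and how the rendez-vous point follows from it. The choice of SHA-1, the truncation to 128 bits, and the function names `group_id` and `rendezvous_node` are assumptions for illustration. A real Pastry network finds the closest node by routing, not by scanning a list of all nodes.

```python
import hashlib

ID_BITS = 128            # assumed identifier width
ID_SPACE = 1 << ID_BITS


def group_id(group_name: str, creator_name: str) -> int:
    """Hash of the group's textual name concatenated with its creator's name."""
    data = (group_name + creator_name).encode("utf-8")
    digest = hashlib.sha1(data).digest()           # SHA-1 is an assumption
    return int.from_bytes(digest[:ID_BITS // 8], "big")


def rendezvous_node(gid: int, node_ids):
    """Node whose ID is numerically closest to gid on the circular ID space."""
    def circular_distance(node_id):
        d = abs(node_id - gid)
        return min(d, ID_SPACE - d)
    return min(node_ids, key=circular_distance)
```

Because the groupID depends only on the two names, any node that knows them can compute the same key and reach the same rendez-vous point.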
SCRIBE – Joining a Group
Pastry route(msg, key)
SCRIBE route(JOIN, groupID)
Routed toward the rendez-vous point
Along the way, the multicast tree is formed
Slide10
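For the join slide above: a minimal sketch of how the tree can form along the JOIN route. It assumes that each node on the route that is not yet in the tree becomes a forwarder, and that the JOIN stops at the first node already in the tree. The route is passed in as a plain list, and the function name `join` and the dictionary layout are hypothetical.

```python
def join(route, children, members):
    """Process one JOIN.

    route:    node IDs from the joining node to the rendez-vous point (inclusive)
    children: dict mapping a tree node to the set of its children
    members:  set of nodes that are members of the group
    """
    members.add(route[0])
    for child, parent in zip(route, route[1:]):
        already_in_tree = parent in children
        # The next hop records the previous hop in its children table.
        children.setdefault(parent, set()).add(child)
        if already_in_tree:
            break  # an existing forwarder (or the root) absorbs the JOIN
```

Usage, with the rendez-vous point `"R"` present from group creation:

```python
children = {"R": set()}
members = set()
join(["A", "X", "R"], children, members)  # X becomes a forwarder
join(["B", "X", "R"], children, members)  # stops at X, never reaches R
```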
SCRIBE – Leaving a Group
Remove the leaving node from the local children list for the group
If the list becomes empty, forward the leave message to the parent
Part of the multicast tree may be removed
Slide11
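For the leave slide above: a minimal sketch of how a leave can prune the tree upward. It assumes a forwarder only passes the leave on when it has no children left and is not itself a member. The function name `leave` and the `parent` map are hypothetical bookkeeping for the simulation.

```python
def leave(node, members, children, parent):
    """Remove node from the group and prune forwarders that become useless.

    children: dict mapping a tree node to the set of its children
    parent:   dict mapping a tree node to its parent (the root has no entry)
    """
    members.discard(node)
    current = node
    # Walk up while the current node has no children and is not a member.
    while (current in parent
           and not children.get(current)
           and current not in members):
        p = parent.pop(current)
        children[p].discard(current)   # parent drops it from its children list
        children.pop(current, None)    # its own (empty) table is discarded
        current = p
```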
SCRIBE – Sending a multicast message
route(MULTICAST, groupID)
Ask for the rendez-vous point's IP address
If the rendez-vous point fails, re-request the rendez-vous point
Pastry handles node duplication
All messages are sent through the rendez-vous point
Slide12
SCRIBE – Repairing the Multicast Tree
Messages are delivered on a best-effort basis only
Delivery may be out of order
Each parent periodically sends a heartbeat message to all its children
A child that suspects its parent has failed rejoins the tree by sending a new JOIN message
The rendez-vous point can be repaired too
Pastry handles node duplication in leaf nodes
Child nodes JOIN the new root when a missing heartbeat is detected
Slide13
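For the tree repair slide above: a minimal sketch of the child-side failure detection. The timeout value, the class name `ChildState`, and the `send_join` callback are assumptions; the slide does not give the heartbeat period or the detection rule.

```python
class ChildState:
    """Tracks heartbeats from the current parent for one group."""

    def __init__(self, parent, timeout):
        self.parent = parent
        self.timeout = timeout        # seconds without a heartbeat before suspecting failure
        self.last_heartbeat = 0.0

    def on_heartbeat(self, now):
        """Record that the parent was heard from at time now."""
        self.last_heartbeat = now

    def check(self, now, send_join):
        """Rejoin through send_join() if the parent has been silent too long.

        send_join is a hypothetical callback that routes a new JOIN and
        returns the node that accepted this node as a child.
        """
        if now - self.last_heartbeat > self.timeout:
            self.parent = send_join()
            self.last_heartbeat = now
            return True
        return False
```

The same check covers a failed rendez-vous point: the new JOIN is routed to whichever live node is now closest to the groupID.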
SCRIBE – Forming a Multicast Tree
Rendez-vous point (root)
Forwarders
may or may not be members of the group
maintain a children table (IP and nodeID) for the group
Slide14
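For the multicast tree slide above: a minimal sketch of how a message spreads from the rendez-vous point down the children tables. Forwarders that are not members pass the message on without delivering it locally. The function name `disseminate` and the data layout are hypothetical, and children tables hold bare node names here instead of IP and nodeID pairs.

```python
from collections import deque


def disseminate(root, children, members):
    """Return (nodes that deliver the message, number of forwarding hops)."""
    delivered = []
    forwarded = 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node in members:
            delivered.append(node)          # only group members deliver locally
        for child in sorted(children.get(node, ())):
            forwarded += 1                  # one message per children table entry
            queue.append(child)
    return delivered, forwarded
```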
SCRIBE - Strengths
Pastry handles root duplication
Rendez-vous point does not handle all join requests
Locality properties of Pastry
short routes
delay from rendez-vous point to member is short
route convergence
load imposed on physical network is small
Slide15
SCRIBE – Experimental Evaluation
Experimental results from simulation
Focus on three metrics:
delay to deliver events to group members
stress on each node
stress on each physical network link
Slide16
SCRIBE – Simulator Evaluation
5050 routers and 100,000 end nodes
1,500 groups of different sizes
10 different runs using same parameters but different random seeds
Averaged all results
Compared results with IP multicast
Slide17
SCRIBE – Delay Penalty
RMD: ratio between the maximum delay using SCRIBE and the maximum delay using IP multicast
RAD: ratio between the average delay using SCRIBE and the average delay using IP multicast
Slide18
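For the delay penalty slide above: a minimal sketch that computes both ratios for one group from per-member delays. The function name `delay_penalties` is hypothetical and the sample delays in the usage lines are made up, not taken from the evaluation.

```python
def delay_penalties(scribe_delays, ip_delays):
    """Return (RMD, RAD) for one group.

    scribe_delays: delay to each member when the message is sent via SCRIBE
    ip_delays:     delay to each member when the message is sent via IP multicast
    """
    rmd = max(scribe_delays) / max(ip_delays)
    scribe_avg = sum(scribe_delays) / len(scribe_delays)
    ip_avg = sum(ip_delays) / len(ip_delays)
    rad = scribe_avg / ip_avg
    return rmd, rad


# Illustrative numbers only (milliseconds).
rmd, rad = delay_penalties([30, 60, 90], [20, 30, 40])
```

A value of 1.0 for either ratio would mean SCRIBE matches IP multicast on that measure; larger values are the penalty paid for application-level routing.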
SCRIBE – Node Stress
Average node is responsible for forwarding a small number of multicast messages
Slide19
SCRIBE – Link Stress
Total number of links = 1,035,295
SCRIBE = 2,489,824 messages (mean 2.4)
IP multicast = 758,853 messages (mean 0.7)
Slide20
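For the link stress slide above: the reported means are consistent with dividing the total message count by the total number of links. A short check of that arithmetic, using only the figures on the slide (the variable names are hypothetical):

```python
total_links = 1_035_295
scribe_messages = 2_489_824
ip_messages = 758_853

# Mean number of messages per physical link.
scribe_mean = scribe_messages / total_links
ip_mean = ip_messages / total_links

print(round(scribe_mean, 1))  # 2.4
print(round(ip_mean, 1))      # 0.7
```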
SCRIBE – Bottleneck remover
Bottlenecks
Low capacity nodes
High capacity nodes with an extremely large number of children table entries
Drop children if over capacity
Select a child to drop and send it a message containing the children table
The child chooses a new parent node and sends it a JOIN message
Result
Removes the long tail in the node stress graph
Increases average link stress
Slide21
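For the bottleneck remover slide above: a minimal sketch of one drop step. The slide does not say how the child to drop or the new parent is chosen, so both policies here are placeholders: the node drops the child with the largest ID, and the dropped child picks a new parent through a hypothetical `pick_parent` callback. The function name `shed_child` is also hypothetical.

```python
def shed_child(node, children, capacity, pick_parent):
    """Drop one child of node if it has more than capacity children.

    pick_parent(dropped, candidates) returns the new parent chosen by the
    dropped child from the children table it was sent.
    Returns (dropped, new_parent), or None if node is within capacity.
    """
    kids = children[node]
    if len(kids) <= capacity:
        return None
    dropped = max(kids)                  # placeholder selection policy
    kids.remove(dropped)
    candidates = sorted(kids)            # children table sent with the drop message
    new_parent = pick_parent(dropped, candidates)
    # Models the JOIN the dropped child sends to its new parent.
    children.setdefault(new_parent, set()).add(dropped)
    return dropped, new_parent
```

Moving a child under one of its former siblings lengthens its path, which fits the slide's note that average link stress goes up while the node stress tail disappears.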
SCRIBE – Scalability Small Groups
50,000 nodes
30,000 groups with 11 members each
SCRIBE performs poorly for a large number of small groups
SCRIBE collapse
Removes long paths
by removing nodes that are not members of a group and have only one entry in their children table
Slide22
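For the collapse slide above: a minimal sketch of the rule as stated, applied to an in-memory tree. A node is spliced out when it is not a member and has exactly one child, and that child is handed to the node's parent. The function name `collapse` and the `parent` map are hypothetical; the root has no entry in `parent`, so it is never removed.

```python
def collapse(children, parent, members):
    """Splice out non-member nodes that have exactly one children table entry.

    children: dict mapping a tree node to the set of its children
    parent:   dict mapping a tree node to its parent (the root has no entry)
    """
    for node in list(parent):            # snapshot, since parent is modified
        kids = children.get(node, set())
        if node not in members and len(kids) == 1:
            (only_child,) = kids
            p = parent.pop(node)
            children[p].discard(node)
            children[p].add(only_child)  # grandparent adopts the child directly
            parent[only_child] = p
            del children[node]
```

A single pass is enough here because splicing a node out replaces one child of its parent with another, so no other node's child count changes.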
SCRIBE – Scalability Small Groups
Average link stress drops from 6.1 to 3.3
Average number of children drops from 21.2 to 8.5
Slide23
SCRIBE - Conclusion
Fully decentralized
Support large number of groups
Support large group size
Multiple multicast sources per group
QUESTIONS