Peer-to-Peer Systems CNT 5517-5564
Dr. Sumi Helal & Dr. Choonhwa Lee
Computer & Information Science & Engineering Department
University of Florida, Gainesville, FL 32611
{helal, chl}@cise.ufl.edu
Schedule
- Introduction to peer-to-peer networking protocols (Nov. 9)
- BitTorrent protocol (Nov. 9)
- Peer-to-peer streaming protocols (Nov. 18)
Reading Materials
- I. Stoica, R. Morris, D. Karger, F. Kaashoek, and H. Balakrishnan, "Chord: A Peer-to-Peer Lookup Service for Internet Applications," In Proc. of the ACM SIGCOMM Conference, September 2001.
- B. Cohen, "Incentives Build Robustness in BitTorrent," In Proc. of the Workshop on Economics of Peer-to-Peer Systems, 2003.
- M. Piatek, T. Isdal, T. Anderson, A. Krishnamurthy, and A. Venkataramani, "Do Incentives Build Robustness in BitTorrent?" In Proc. of the 4th USENIX Symposium on Networked Systems Design and Implementation (NSDI), April 2007.
- J. Liu, S. G. Rao, B. Li, and H. Zhang, "Opportunities and Challenges of Peer-to-Peer Internet Video Broadcast," Proc. of the IEEE, vol. 96, no. 1, pp. 11-24, January 2008.
- M. Zhang, Q. Zhang, L. Sun, and S. Yang, "Understanding the Power of Pull-Based Streaming Protocol: Can We Do Better?" IEEE Journal on Selected Areas in Communications, vol. 25, no. 9, pp. 1678-1694, December 2007.
Slide courtesy:
- Prof. Jehn-Ruey Jiang, National Central University, Taiwan
- Prof. Dah Ming Chiu, Chinese University of Hong Kong, China
- Chun-Hsin, National University of Kaohsiung, Taiwan
- Prof. Shiao-Li Tsao, National Chiao Tung University, Taiwan
- Prof. Shuigeng Zhou, Fudan University, China
Introduction to P2P Networking Protocols
P2P Protocols and Applications
- P2P file sharing: Napster, FreeNet, Gnutella, KaZaA, eDonkey/eMule, ezPeer, Kuro, BT
- P2P communication: NetNews (NNTP), Instant Messaging (IM), Skype (VoIP)
- P2P lookup services and applications (DHTs and global repositories): IRIS, Chord/CFS, Tapestry/OceanStore, Pastry/PAST, CAN
- P2P multimedia streaming: CoopNet, Zigzag, Narada, P2Cast, Joost, PPStream
- Proxies and Content Distribution Networks: Squid, Akamai, Limelight
- Overlay testbeds: PlanetLab, NetBed/EmuLab
- Other areas: P2P gaming, grid computing
P2P Internet Traffic
Some Statistics
- More than 200 million users registered with Skype and around 10 million online users (2007)
- Around 4.7M hosts participate in SETI@home (2006)
- BitTorrent accounts for 1/3 of Internet traffic (2007)
- More than 200,000 simultaneous online users on PPLive (2007)
- More than 3,000,000 users have downloaded PPStream (2008)
Paradigm Shift of Computing Models
Client/Server Architecture
- A well-known, powerful, reliable server is the data source
- Clients request data from the server
- Very successful model: WWW (HTTP), FTP, Web Services, etc.
Client/Server Limitations
- Scalability
- A single point of failure
- System administration
- Unused resources at the network edge
Peer-to-Peer Model
"Peer-to-Peer (P2P) is a way of structuring distributed applications such that individual nodes have symmetric roles. Rather than being divided into clients and servers each with quite distinct roles, in P2P applications a node may act as both a client and a server."
  (Excerpt from the Charter of the Peer-to-Peer Research Group, IETF/IRTF, June 24, 2003)
- Peers play similar roles
- No distinction of responsibilities
Peer-to-Peer Systems
- In a P2P network, every node is both a client and a server
  - Provides and consumes data
  - Any node can initiate a connection
- No centralized data source
  - "The ultimate form of democracy on the Internet"
- As the number of clients increases, the number of servers also increases
  - Perfectly scalable
- Distributed costs
- Increased privacy
P2P Network Benefits
- Efficient use of resources
  - Bandwidth, storage, and processing power at the edge of the network
- Scalability
  - Consumers of resources also donate resources
  - Aggregate resources grow naturally as more peers join
- Reliability
  - Replicas, geographic distribution, no single point of failure
- Ease of administration
  - Self-organization; no need for server deployment and provisioning
  - Built-in fault tolerance, replication, and load balancing
Short History of P2P Networking Protocols
- 1999: Napster
- 2000: Gnutella, eDonkey
- 2001: Kazaa
- 2002: eMule, BitTorrent
- 2003: Skype
- 2004: Coolstreaming, GridMedia, PPLive
- 2004~: TVKoo, TVAnts, PPStream, SopCast, ...
P2P Protocol Classification
- Whether or not the protocol relies on central indexing servers to facilitate the interactions between peers
  - Centralized / Hybrid / Decentralized
- Whether the overlay network has some structure or is created in an ad-hoc fashion
  - Unstructured / Structured (i.e., precise control over network topology or data placement)
P2P Protocol Classification

                 Unstructured Networks    Structured Networks
  Centralized    Napster                  -
  Decentralized  Gnutella                 Chord, Pastry, CAN
  Hybrid         KaZaA, Gnutella          -
Napster
- First P2P file sharing application
- Centralized directory to help find content
- History
  - In 1999, S. Fanning launches Napster
  - Peaked at 1.5 million simultaneous users
  - July 2001, Napster shuts down
Napster: Publish
[Figure] A peer at 123.2.21.23 announces "I have X, Y, and Z!" by publishing insert(X, 123.2.21.23) to the central index server.
Napster: Search
[Figure] A client asks the central server "Where is file A?"; the server answers the query with search(A) --> 123.2.0.18, and the client fetches the file directly from the peer at 123.2.0.18.
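The publish/search interaction above can be sketched as a dictionary kept at a central server, which is why both operations cost O(1); the class and method names below are illustrative, not Napster's actual wire protocol.

```python
# Minimal sketch of a Napster-style central index (illustrative names,
# not the real protocol): the server maps each file name to the set of
# peer addresses hosting it, so publish and search are O(1) dict lookups.

class CentralIndex:
    def __init__(self):
        self.files = {}  # file name -> set of peer addresses

    def publish(self, filename, peer_addr):
        """A peer announces that it hosts `filename`."""
        self.files.setdefault(filename, set()).add(peer_addr)

    def search(self, filename):
        """Return the peers that host `filename` (empty set if unknown)."""
        return self.files.get(filename, set())

index = CentralIndex()
index.publish("X", "123.2.21.23")
addrs = index.search("X")   # the client then fetches X directly from a peer
```

The server-side dictionary is exactly the O(N) state the "Discussion" slide criticizes: one entry per published file across all N peers.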
Napster: Discussion
- Pros:
  - Simple
  - Search cost is O(1)
  - Controllable (pro or con?)
- Cons:
  - Server maintains O(N) state
  - Server does all the processing
  - A single point of failure
Gnutella
- Completely distributed P2P file sharing
- Each peer floods its request to all other peers - prohibitive overheads
- History
  - In 2000, J. Frankel and T. Pepper from Nullsoft released Gnutella
  - Soon many other clients: Bearshare, Morpheus, LimeWire, etc.
  - In 2001, many protocol enhancements including "UltraPeers"
The 'Animal' GNU
Gnutella = GNU + Nutella
- GNU: a recursive acronym, "GNU's Not Unix"
- Nutella: a hazelnut chocolate spread produced by the Italian confectioner Ferrero
Gnutella: Search
[Figure] A peer floods the query "Where is file A?" to its neighbors, which forward it onward; peers holding file A ("I have file A") send replies back along the query path.
Gnutella: Discussion
- Pros:
  - Fully decentralized
  - Search cost distributed
- Cons:
  - Search cost is O(N)
  - Search time is O(???)
  - Nodes leave often, making the network unstable
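The flooded search described above can be simulated with a TTL-limited breadth-first traversal; the graph shape, TTL default, and function names below are illustrative assumptions, not the Gnutella wire protocol.

```python
# Toy simulation of Gnutella-style flooded search over an unstructured
# overlay (simplified: synchronous rounds, duplicate queries suppressed
# with a visited set, queries limited to `ttl` hops).

def flood_search(graph, start, filename, files, ttl=7):
    """Return peers holding `filename` reachable within `ttl` hops.

    graph: peer -> list of neighbor peers
    files: peer -> set of file names held by that peer
    """
    hits, visited, frontier = [], {start}, [start]
    for _ in range(ttl):
        next_frontier = []
        for peer in frontier:
            if filename in files.get(peer, set()):
                hits.append(peer)            # this peer replies with a hit
            for nbr in graph.get(peer, []):
                if nbr not in visited:       # each peer handles a query once
                    visited.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return hits

graph = {"p1": ["p2", "p3"], "p2": ["p4"], "p3": [], "p4": []}
files = {"p4": {"A"}}
found = flood_search(graph, "p1", "A", files)
```

Even with duplicate suppression, every reachable peer within the TTL processes the query, which is the O(N) search cost noted above.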
Hybrid P2P Systems: FastTrack/KaZaA
- Hierarchical supernodes (i.e., ultra-peers)
  - Assigned the task of servicing a small sub-part of the network
  - Indexing and caching of files in the assigned part
  - Have sufficient bandwidth and processing power
- KaZaA and Morpheus are proprietary systems
- Hybrid protocol
  - More efficient than old Gnutella
  - More robust than Napster
KaZaA: Network Design
[Figure] Ordinary peers connect to "super nodes," which form an overlay among themselves.
KaZaA: File Insert
[Figure] A peer at 123.2.21.23 announces "I have X!" by publishing insert(X, 123.2.21.23) to its super node.
KaZaA: File Search
[Figure] A peer asks its super node "Where is file A?"; the query is forwarded among super nodes, and replies such as search(A) --> 123.2.0.18 and search(A) --> 123.2.22.50 are returned to the requester.
DHT: Distributed Hash Table
- So far:
  - Centralized: directory size O(n), number of hops O(1)
  - Flooded queries: directory size O(1), number of hops O(n)
- We want:
  - Efficiency: O(log n) messages per lookup
  - Scalability: O(log n) state per node
  - Robustness: surviving massive failures
  (n: number of participating nodes)
DHT: Basic Idea
- Objects have hash keys: object "y" gets key H(y)
- Peer nodes also have hash keys in the same hash space: peer "x" gets key H(x)
- A peer joins the P2P network with Join(H(x)); an object is published with Publish(H(y))
- Each object is placed at the peer with the closest hash key
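The placement rule above (hash peers and objects into one space, store each object at the peer with the closest larger key) can be sketched directly; the function names and the 7-bit space size are assumptions chosen to match the later Chord examples.

```python
# Sketch of DHT-style placement with consistent hashing: peers and
# objects are hashed into the same m-bit space, and an object is stored
# at the first peer whose key is >= the object's key (wrapping around).

import hashlib

M = 2 ** 7  # 7-bit identifier space, as in the Chord ring examples

def hkey(name):
    """Hash a peer address or object name into the identifier space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % M

def place(object_name, peer_keys):
    """Return the peer key responsible for `object_name` (its successor)."""
    k = hkey(object_name)
    candidates = sorted(peer_keys)
    for p in candidates:
        if p >= k:
            return p
    return candidates[0]  # wrap around the ring

peers = sorted(hkey(addr) for addr in ["198.10.10.1", "198.10.10.2", "198.10.10.3"])
owner = place("LetItBe", peers)
```

Because both sides use the same hash function, any peer can compute where an object lives without consulting a directory.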
Mapping an Object to the Closest Node with a Larger Key
[Figure] Data objects and nodes are placed along the identifier space [0, M); each object is stored at the first node whose key is larger than the object's key.
Viewed as a Distributed Hash Table
[Figure] The identifier space [0, 2^128 - 1] is a hash table whose buckets are partitioned among the peer nodes across the Internet.
How to Find an Object?
[Figure] The same hash table over [0, 2^128 - 1], partitioned among peer nodes; the question is how to reach the node responsible for a given key.
Better Idea
- Track peers that allow us to move quickly across the hash space
- A peer p tracks the peers responsible for hash keys (p + 2^(i-1)), i = 1, ..., m
[Figure] In the hash table [0, 2^128 - 1], peer p keeps pointers to the peers responsible for keys such as p + 2^2, p + 2^4, and p + 2^8.
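The keys a peer tracks are exponentially spaced, which is what lets a lookup halve the remaining distance each hop; a one-line sketch (the function name is illustrative):

```python
# The finger keys a peer p tracks in an m-bit space are
# (p + 2**(i-1)) mod 2**m for i = 1..m: exponentially spaced
# pointers across the ring.

def finger_keys(p, m):
    return [(p + 2 ** (i - 1)) % (2 ** m) for i in range(1, m + 1)]

keys = finger_keys(8, 6)  # peer 8 in a 6-bit (mod 64) space
# keys == [9, 10, 12, 16, 24, 40]
```

These are the same targets as the later N8 finger-table example: 8 + 2^0 = 9 up to 8 + 2^5 = 40.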
Chord Protocol
- Frans Kaashoek et al., MIT, 2001
- Identifiers
  - m-bit identifier space for both keys and nodes
  - Key identifier = SHA-1(key), e.g., Key = "LetItBe" -> SHA-1 -> ID = 5
  - Node identifier = SHA-1(IP address), e.g., IP = "198.10.10.1" -> SHA-1 -> ID = 105
  - Both are uniformly distributed
- How to map key IDs to node IDs?
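Deriving both kinds of identifier with SHA-1 can be sketched as below; the slide's example IDs (5 and 105) are illustrative, so real SHA-1 outputs will differ, and the function name is an assumption.

```python
# Chord derives key IDs and node IDs with the same SHA-1 hash, truncated
# to an m-bit identifier space, so keys and nodes are directly comparable.

import hashlib

def chord_id(s, m=160):
    """m-bit Chord identifier of a key string or an IP address."""
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** m)

key_id = chord_id("LetItBe", m=7)       # a song title used as the key
node_id = chord_id("198.10.10.1", m=7)  # a node ID from its IP address
```

Uniformity of SHA-1 is what spreads keys and nodes evenly around the ring, balancing load without coordination.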
Chord: Basic Routing
- Circular 7-bit ID space
- As nodes enter the network, they are assigned unique IDs by hashing their IP address (e.g., IP = "198.10.10.1")
- A key is stored at its successor: the node with the next-higher ID
[Figure] On a ring with N32, N90, and N105, keys K5 and K20 are stored at N32, and K80 at N90.
Chord: Basic Routing
- Every node knows its successor in the ring
[Figure] On a ring with N10, N32, N60, N90, N105, and N120, the query "Where is key 80?" is forwarded successor by successor until the answer comes back: "N90 has K80."
Chord: Finger-Table Routing
- Finger table (FT): m additional entries
  - The i-th entry points to the successor of node n + 2^(i-1)
- To look up key k at node n:
  - In the FT, identify the highest node n' whose ID is between n and k
  - If such a node exists, the lookup is repeated starting from n'
  - Otherwise, the successor of n is returned
Chord: Finger-Table Routing
- finger[k]: the first node on the circle that succeeds (n + 2^(k-1)) mod 2^m, 1 <= k <= m
- Example (m = 6, node N8): N14 is the first node that succeeds (8 + 2^0) mod 2^6 = 9; N42 is the first node that succeeds (8 + 2^5) mod 2^6 = 40
Chord: Finger-Table Routing

  Lookup(my-id, key-id)
    look in local finger table for highest node n s.t. my-id < n < key-id
    if n exists
      call Lookup(key-id) on node n   // next hop
    else
      return my successor             // done
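The lookup pseudocode above can be exercised on the 7-bit example ring (N5, N10, N20, N32, N60, N80, N99, N110) by simulating every node in one process; a real Chord node would issue the recursive call as a network RPC, and the helper names here are assumptions.

```python
# Simulated finger-table lookup on the example ring: each "remote call"
# is just a local recursive call, with fingers recomputed on the fly.

M = 7
IDS = [5, 10, 20, 32, 60, 80, 99, 110]

def succ(k):
    """ID of the first node at or after key k on the ring (with wrap)."""
    k = k % (2 ** M)
    for n in sorted(IDS):
        if n >= k:
            return n
    return min(IDS)

def between(x, a, b):
    """True if x lies in the half-open ring interval (a, b]."""
    x, a, b = x % 2 ** M, a % 2 ** M, b % 2 ** M
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(my_id, key_id, hops=0):
    successor = succ(my_id + 1)
    if between(key_id, my_id, successor):
        return successor, hops + 1                 # successor stores the key
    fingers = [succ(my_id + 2 ** (i - 1)) for i in range(1, M + 1)]
    for n in reversed(fingers):                    # highest n: my_id < n < key_id
        if between(n, my_id, key_id - 1):
            return lookup(n, key_id, hops + 1)     # "next hop"
    return successor, hops + 1                     # "done"

node, hops = lookup(32, 19)   # Lookup(K19) starting at N32 resolves to N20
```

Tracing this call reproduces the slide's example path N32 -> N99 -> N5 -> N10 -> N20, each hop roughly halving the remaining ring distance.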
Chord: Finger-Table Routing
Example: Lookup(K19) on a ring with N5, N10, N20, N32, N60, N80, N99, N110. Finger tables:
  N32: N60, N80, N99
  N99: N110, N5, N60
  N5:  N10, N20, N32, N60, N80
  N10: N20, N32, N60, N80
  N20: N32, N60, N99
The query for K19 hops from finger to finger until it reaches N20, the successor of key 19.
P2P Protocol Summary
- Centralized / decentralized / hybrid
  - Napster, Gnutella, KaZaA
- Unstructured / structured
  - Unstructured P2P - no control over topology and file placement: Gnutella, Morpheus, KaZaA, etc.
  - Structured P2P - topology is tightly controlled and placement of files is not random: Chord, CAN, Pastry, Tapestry, etc.
Research Issues
- P2P overlay topology
- Search - full index, partial index, semantic search
- Free riding - incentive mechanisms
- Topological awareness
  - ISP-friendliness
- NAT traversal
- Fault resilience
- P2P traffic monitoring and detection
- Security
  - Spurious content, anonymity, trust & reputation management
- Non-technical issues
  - Copyright infringement, intellectual property
Questions?
BitTorrent

Slide courtesy:
- Prof. Dah Ming Chiu, Chinese University of Hong Kong, Hong Kong
- Dr. Iqbal Mohomed, University of Toronto, Canada
Content Distribution
- IP multicast
- CDN (Content Distribution Network)
- Application layer multicast
  - Overlay structures: tree-based (push), data-driven (pull)
- P2P swarming
  - BitTorrent, CoolStreaming
BitTorrent
- Released in the summer of 2001
- Uses basic ideas from game theory to largely eliminate the free-rider problem
  - None of the preceding systems dealt with this problem well
- No strong guarantees, unlike DHTs
- Works extremely well in practice, unlike DHTs
Basic Idea - Swarming Protocol
- A file is chopped into small pieces, called chunks
- Pieces are disseminated over the network
- As soon as a peer acquires a piece, it can trade it for missing pieces with other peers
- Each peer hopes to be able to assemble the entire file in the end
Basic Components
- Web server
- The .torrent file
- Tracker
- Peers
Web Server
- Content discovery (i.e., file search) is handled outside of BitTorrent, using a Web server
- Provides the "meta-info" file over HTTP
  - For example, http://bt.btchina.net
  - The information about each movie or content item is stored in a metafile such as "supergirl.torrent"
The .torrent File
- A static file storing the necessary meta-information
  - Name, size, checksums
  - The content is divided into many "chunks" (e.g., 1/4 megabyte each)
  - Each chunk is hashed to a checksum value
  - When a peer later gets a chunk (from other peers), it can check its authenticity by comparing the checksum
  - IP address and port of the tracker
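The per-chunk checksum mechanism described above can be sketched as follows; the function names, chunk size constant, and use of SHA-1 digests here are a simplified assumption (real .torrent files bencode the piece hashes), not the exact file format.

```python
# Sketch of per-chunk integrity checking: the metafile stores a SHA-1
# digest per chunk, and a downloader recomputes the digest of each chunk
# it receives before accepting it.

import hashlib

CHUNK_SIZE = 256 * 1024  # 256 KB chunks, a typical BitTorrent piece size

def make_checksums(data):
    """Split `data` into fixed-size chunks and hash each one."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return [hashlib.sha1(c).digest() for c in chunks]

def verify_chunk(chunk, index, checksums):
    """Accept a received chunk only if its digest matches the metafile."""
    return hashlib.sha1(chunk).digest() == checksums[index]

data = b"x" * (CHUNK_SIZE + 100)          # a file slightly over one chunk
sums = make_checksums(data)
ok = verify_chunk(data[:CHUNK_SIZE], 0, sums)
bad = verify_chunk(b"corrupted", 1, sums)
```

This is what lets peers download from complete strangers: a corrupted or forged chunk fails the hash check and is simply re-requested.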
Tracker
- Keeps track of peers
  - Allows peers to find one another
  - Returns a random list of active peers
Peers
Two types of peers:
- Downloader (leecher): a peer who has only a part (or none) of the file
- Seeder: a peer who has the complete file and chooses to stay in the system to allow other peers to download
BitTorrent in Action
[Figure] Bob fetches Matrix.torrent from the Web server, contacts the tracker, and then exchanges pieces with seeder Chris and downloaders David and Alice.
Overview - System Components
[Figure sequence] A Web page links to the .torrent file; a new peer (leech) and existing peers (a seed and another leech) coordinate through the tracker:
1. The peer downloads the .torrent file from the Web server
2. The peer sends a get-announce request to the tracker
3. The tracker responds with a peer list
4. The peer shakes hands with the listed peers
5. The peers exchange pieces with one another; announces and piece exchanges repeat as peers come and go
Chunks
- A file is split into chunks of fixed size, typically 256 KB
- Each peer maintains a bitmap that indicates which chunks it has
- Each peer reports to all of its neighboring peers (obtained from the tracker) which chunks it has
- This is the information used to build the implicit delivery trees
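The bitmap exchange above lets each peer compute what a neighbor can offer it; a minimal sketch, with the list-of-bits representation and the function name chosen for illustration:

```python
# Sketch of the per-peer chunk bitmap: a peer compares its own bitmap
# with a neighbor's advertised bitmap to find chunks it can request.

def missing_from(mine, theirs):
    """Chunk indices the neighbor has and we still need."""
    return [i for i, (m, t) in enumerate(zip(mine, theirs)) if t and not m]

mine   = [1, 1, 0, 0, 1, 0]   # we hold chunks 0, 1, 4
theirs = [1, 0, 1, 1, 0, 1]   # neighbor holds chunks 0, 2, 3, 5
wanted = missing_from(mine, theirs)   # chunks we can request: 2, 3, 5
```

Real clients pack this as a bitfield message, but the comparison logic is the same.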
Swarming Example
[Figure] Seeder Alice holds all chunks {1, ..., 10}; downloaders Bob and Joe start with {}. After each fetches {1, 2, 3} from Alice, Bob and Joe trade their differing chunks with each other (e.g., {1,2,3,4} and {1,2,3,5} both become {1,2,3,4,5}).
Rarest First
- Rarer pieces are given priority in downloading, with the rarest being the first candidate
- The most common pieces are postponed towards the end
- This policy ensures that a variety of pieces is downloaded from the seeder, resulting in quicker chunk propagation
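The rarest-first policy can be sketched by counting, among the neighbors' advertised bitmaps, how many copies exist of each piece we still need; the names and tie-breaking below are illustrative assumptions (real clients also randomize among equally rare pieces).

```python
# Sketch of rarest-first piece selection: count how many neighbors hold
# each piece we still need, and request the least-replicated one first.

from collections import Counter

def rarest_first(mine, neighbor_bitmaps):
    counts = Counter()
    for bitmap in neighbor_bitmaps:
        for i, has in enumerate(bitmap):
            if has and not mine[i]:
                counts[i] += 1
    if not counts:
        return None                              # nothing useful on offer
    return min(counts, key=lambda i: counts[i])  # rarest needed piece

mine = [1, 0, 0, 0]
neighbors = [[1, 1, 1, 0],
             [0, 1, 1, 1],
             [0, 1, 1, 1]]
pick = rarest_first(mine, neighbors)   # piece 3 has only two copies
```

Requesting the scarce piece first keeps every piece well replicated, so the swarm does not stall waiting on the seeder.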
Peer Selection
Basic idea of the tit-for-tat strategy in BitTorrent:
- Maintain 4-5 "friends" with which to exchange chunks
- If a friend is not exchanging enough chunks, get rid of him/her
  - Known as "choking" in BT
- Periodically, randomly select a new friend
  - Known as "optimistic unchoking" in BT
- If you have no friends, randomly select several new friends
  - Known as "anti-snubbing" in BT
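The choking and optimistic-unchoking rules above can be sketched as a periodic selection over observed upload rates; the function name, the 4-regular-slot default, and the example rates are illustrative assumptions, not the exact client algorithm.

```python
# Sketch of tit-for-tat peer selection: keep the peers that upload to us
# fastest unchoked, plus one randomly chosen "optimistic unchoke" that
# gives a new or slow peer a chance to prove itself.

import random

def choose_unchoked(upload_rates, n_regular=4, rng=random):
    """upload_rates: peer -> observed upload rate from that peer (kb/s)."""
    ranked = sorted(upload_rates, key=lambda p: upload_rates[p], reverse=True)
    unchoked = ranked[:n_regular]         # reciprocate with the fastest peers
    choked = ranked[n_regular:]
    if choked:
        unchoked.append(rng.choice(choked))   # optimistic unchoke
    return unchoked

rates = {"David": 100, "Ed": 70, "Bob": 40, "Chris": 30, "Joe": 10}
peers = choose_unchoked(rates)
```

Run periodically (every 10 seconds in practice), this rewards peers who upload and still probes the rest of the swarm for better partners.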
Example of Optimistic Unchoking
[Figure] Alice exchanges chunks with downloaders Bob, Chris, David, and Ed at various rates (e.g., 40 kb/s, 30 kb/s, 100 kb/s, 70 kb/s); she periodically unchokes a randomly chosen peer such as downloader Joe, despite his low rate (e.g., 5-10 kb/s), to discover potentially faster partners.
Questions?