Peer-to-Peer Systems and Distributed Hash Tables

Presentation Transcript

Slide 1: Peer-to-Peer Systems and Distributed Hash Tables

COS 518: Advanced Computer Systems, Lecture 16
Michael Freedman

[Credit: Slides adapted from Kyle Jamieson and Daniel Suo]

Slide 2: Today

- Peer-to-Peer Systems: Napster, Gnutella, BitTorrent, challenges
- Distributed Hash Tables
- The Chord Lookup Service
- Concluding thoughts on DHTs, P2P

Slide 3: What is a Peer-to-Peer (P2P) system?

A distributed system architecture:
- No centralized control
- Nodes are roughly symmetric in function
- Large number of unreliable nodes

[Diagram: many nodes connected to one another over the Internet]

Slide 4: Why might P2P be a win?

High capacity for services through parallelism:
- Many disks
- Many network connections
- Many CPUs

Absence of a centralized server may mean:
- Less chance of service overload as load increases
- Easier deployment
- A single failure won't wreck the whole system
- System as a whole is harder to attack

Slide 5: P2P adoption

Successful adoption in some niche areas:
- Client-to-client (legal and illegal) file sharing
- Digital currency: no natural single owner (Bitcoin)
- Voice/video telephony: user-to-user anyway
  - Issues: privacy and control

Slide 6: Example: Classic BitTorrent

1. User clicks on a download link
   - Gets a torrent file with the content hash and the IP address of the tracker
2. User's BitTorrent (BT) client talks to the tracker
   - Tracker tells it the list of peers who have the file
3. User's BT client downloads the file from peers
4. User's BT client tells the tracker it now has a copy, too
5. User's BT client serves the file to others for a while

Provides huge download bandwidth, without expensive server or network links
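
To make the control flow above concrete, here is a minimal sketch of those steps; tracker_announce, download_pieces_from, and seed_to_others are hypothetical helpers standing in for the real tracker protocol and piece-level exchange, not actual BitTorrent APIs.

    # Minimal sketch of the classic, tracker-based BitTorrent flow above.
    # tracker_announce(), download_pieces_from(), and seed_to_others() are
    # hypothetical stand-ins for the real tracker protocol and piece exchange.
    def classic_bittorrent_download(torrent):
        info_hash = torrent["info_hash"]       # content hash from the .torrent file
        tracker = torrent["tracker_url"]       # address of the tracker

        # Ask the tracker for peers that already have (pieces of) the file
        peers = tracker_announce(tracker, info_hash, event="started")

        # Download the file from those peers
        data = download_pieces_from(peers, info_hash)

        # Tell the tracker we now have a copy, and keep seeding for a while
        tracker_announce(tracker, info_hash, event="completed")
        seed_to_others(info_hash, data)
        return data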

Slide 7: The lookup problem

[Diagram: nodes N1-N6 connected over the Internet; publisher N4 does put("Pacific Rim.mp4", [content]); a client asks get("Pacific Rim.mp4") -- which node has the data?]

Slide 8: Centralized lookup (Napster)

[Diagram: central DB node; publisher N4 sends SetLoc("Pacific Rim.mp4", IP address of N4); client sends Lookup("Pacific Rim.mp4") to the DB, then fetches key="Pacific Rim.mp4", value=[content] from N4]

Simple, but O(N) state and a single point of failure

Slide 9: Flooded queries (original Gnutella)

[Diagram: client floods Lookup("Pacific Rim.mp4") across peers N1-N6, which forward the query; one node holds key="Star Wars.mov", value=[content]]

Robust, but O(N = number of peers) messages per lookup

Slide 10: Routed DHT queries (Chord)

[Diagram: client's Lookup(H(data)) is routed hop by hop across peers N1-N6 toward publisher N4, which stores key="H(audio data)", value=[content]]

Can we make it robust, with reasonable state and a reasonable number of hops?

Slide 11: Today

- Peer-to-Peer Systems
- Distributed Hash Tables
- The Chord Lookup Service
- Concluding thoughts on DHTs, P2P

Slide 12: What is a DHT (and why)?

Local hash table:
    key = Hash(name)
    put(key, value)
    get(key) -> value

Service: constant-time insertion and lookup

How can I do (roughly) this across millions of hosts on the Internet?
=> Distributed Hash Table (DHT)
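
The "local hash table" half of the slide, as a small runnable Python sketch; a DHT keeps this same put/get interface but spreads the table across many hosts. The names here are illustrative, not from the slides.

    # Local hash table with the slide's interface: key = Hash(name), put, get.
    import hashlib

    table = {}

    def put(key, value):
        table[key] = value                  # constant-time insertion

    def get(key):
        return table.get(key)               # constant-time lookup

    key = hashlib.sha1("Pacific Rim.mp4".encode()).hexdigest()   # key = Hash(name)
    put(key, b"[content]")
    assert get(key) == b"[content]"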

Slide 13: What is a DHT (and why)?

Distributed Hash Table:
    key = hash(data)
    lookup(key) -> IP addr               (Chord lookup service)
    send-RPC(IP address, put, key, data)
    send-RPC(IP address, get, key) -> data

Partitioning data in large-scale distributed systems:
- Tuples in a global database engine
- Data blocks in a global file system
- Files in a P2P file-sharing system
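
A sketch of the layering on this slide: a lookup service maps a key to a node, and a generic RPC delivers put/get to that node. The NODES table, lookup(), and send_rpc() below are toy in-process stand-ins for illustration, not Chord's real lookup or RPC layer.

    # Toy DHT built from "lookup" plus "RPC", mirroring the slide's interface.
    import hashlib

    NODES = {"10.0.0.1": {}, "10.0.0.2": {}, "10.0.0.3": {}}    # addr -> local store

    def lookup(key):                         # stand-in for Chord lookup(key) -> IP addr
        addrs = sorted(NODES)
        return addrs[key % len(addrs)]

    def send_rpc(addr, op, key, value=None): # stand-in for send-RPC(IP, op, key, ...)
        store = NODES[addr]
        if op == "put":
            store[key] = value
        else:
            return store.get(key)

    def dht_put(data: bytes):
        key = int.from_bytes(hashlib.sha1(data).digest(), "big")   # key = hash(data)
        send_rpc(lookup(key), "put", key, data)
        return key

    def dht_get(key: int):
        return send_rpc(lookup(key), "get", key)

    k = dht_put(b"a data block")
    assert dht_get(k) == b"a data block"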

Slide 14: Cooperative storage with a DHT

- App may be distributed over many nodes
- DHT distributes data storage over many nodes

[Diagram: a distributed application calls put(key, data) / get(key) -> data on the distributed hash table (DHash), which calls lookup(key) -> node IP address on the lookup service (Chord), spanning many nodes]

Slide 15: BitTorrent over DHT

BitTorrent can use a DHT instead of (or with) a tracker.

BT clients use the DHT:
- Key = file content hash ("infohash")
- Value = IP address of a peer willing to serve the file
- Can store multiple values (i.e., IP addresses) for a key

Client does:
- get(infohash) to find other clients willing to serve
- put(infohash, my-ipaddr) to identify itself as willing
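
A toy illustration of the DHT-as-tracker usage above, emphasizing that one infohash key maps to multiple peer addresses. A plain in-process dict stands in for the real distributed table; the function name is illustrative.

    # DHT maps infohash -> set of peer addresses (multiple values per key).
    from collections import defaultdict

    dht = defaultdict(set)

    def announce_and_find_peers(infohash: bytes, my_addr: str):
        peers = set(dht[infohash])            # get(infohash): peers willing to serve
        dht[infohash].add(my_addr)            # put(infohash, my-ipaddr): advertise ourselves
        return peers

    announce_and_find_peers(b"\x12" * 20, "198.51.100.7:6881")
    assert announce_and_find_peers(b"\x12" * 20, "203.0.113.9:6881") == {"198.51.100.7:6881"}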

Slide 16: Why might DHT be a win for BitTorrent?

- The DHT comprises a single giant tracker, less fragmented than many trackers
  - So peers are more likely to find each other
- A classic tracker is too exposed to legal and copyright attacks

Slide 17: Why the put/get DHT interface?

- API supports a wide range of applications
  - DHT imposes no structure/meaning on keys
- Key/value pairs are persistent and global
  - Can store keys in other DHT values
  - And thus build complex data structures

Slide 18: Why might DHT design be hard?

- Decentralized: no central authority
- Scalable: low network traffic overhead
- Efficient: find items quickly (latency)
- Dynamic: nodes fail, new nodes join

Slide 19: Today

- Peer-to-Peer Systems
- Distributed Hash Tables
- The Chord Lookup Service
  - Basic design
  - Integration with DHash DHT, performance

Slide 20: Chord lookup algorithm properties

- Interface: lookup(key) -> IP address
- Efficient: O(log N) messages per lookup
  - N is the total number of servers
- Scalable: O(log N) state per node
- Robust: survives massive failures
- Simple to analyze

Slide 21: Chord identifiers

- Key identifier = SHA-1(key)
- Node identifier = SHA-1(IP address)
- SHA-1 distributes both uniformly
- How does Chord partition data?
  - i.e., map key IDs to node IDs
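
A small sketch of how such identifiers are derived: keys and nodes are hashed with SHA-1 into one circular ID space. The ID is truncated to 7 bits here to match the slides' toy ring; real Chord uses the full 160-bit SHA-1 output, and the example key and IP are illustrative.

    # Keys and nodes hash into the same circular ID space.
    import hashlib

    M = 7                                    # bits in the toy ID space (2^7 = 128 IDs)

    def chord_id(s: str) -> int:
        return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big") % (2 ** M)

    key_id = chord_id("Pacific Rim.mp4")     # key identifier = SHA-1(key)
    node_id = chord_id("192.0.2.17")         # node identifier = SHA-1(IP address)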

Slide 22: Consistent hashing [Karger '97]

Key is stored at its successor: the node with the next-higher ID

[Diagram: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80; e.g., K80 is stored at its successor N90. "Key 5" and "Node 105" illustrate the naming convention.]
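
A minimal runnable sketch of the consistent-hashing rule: a key lives at its successor, the first node whose ID is greater than or equal to the key's ID, wrapping around the ring. The helper name is illustrative.

    # Successor rule for consistent hashing on a sorted ring of node IDs.
    import bisect

    def successor(node_ids, key_id):
        """node_ids: sorted list of node IDs on the ring."""
        i = bisect.bisect_left(node_ids, key_id)
        return node_ids[i % len(node_ids)]   # wrap past the highest ID back to the start

    ring = sorted([32, 90, 105])
    assert successor(ring, 80) == 90         # K80 -> N90 (next-higher ID)
    assert successor(ring, 20) == 32         # K20 -> N32
    assert successor(ring, 120) == 32        # K120 wraps around to N32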

Slide 23: Chord: Successor pointers

[Diagram: ring of nodes N10, N32, N60, N90, N105, N120, each pointing to its successor; key K80 sits between N60 and N90]

Slide 24: Basic lookup

[Diagram: a node asks "Where is K80?"; the query is forwarded along successor pointers around the ring N10, N32, N60, N90, N105, N120 until the answer "N90 has K80" comes back]

Slide 25: Simple lookup algorithm

    Lookup(key-id)
      succ <- my successor
      if my-id < succ < key-id       // next hop (comparison is on the circular ID space)
        call Lookup(key-id) on succ
      else                           // done
        return succ

Correctness depends only on successors
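
A runnable sketch of the successor-only lookup on a toy 7-bit ring; the circular "my-id < succ < key-id" test is implemented by in_interval, and the ring and names here are illustrative.

    # Successor-only lookup: O(n) hops, correctness depends only on successors.
    def in_interval(x, a, b):
        """True if x lies in the open circular interval (a, b) on the ring."""
        if a < b:
            return a < x < b
        return x > a or x < b                 # interval wraps past zero

    def lookup(node, key_id, successor):
        """successor: dict mapping each node ID to its successor's ID."""
        succ = successor[node]
        if in_interval(succ, node, key_id):   # key lies beyond my successor: next hop
            return lookup(succ, key_id, successor)
        return succ                           # done: succ is responsible for key_id

    ring = {10: 32, 32: 60, 60: 90, 90: 105, 105: 120, 120: 10}
    assert lookup(10, 80, ring) == 90         # K80 is stored at N90
    assert lookup(105, 5, ring) == 10         # K5 wraps around to N10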

Slide 26: Improving performance

- Problem: Forwarding through successors is slow
  - The data structure is a linked list: O(n)
- Idea: Can we make it more like a binary search?
  - Need to be able to halve the distance at each step

Slide 27: "Finger table" allows log N-time lookups

[Diagram: node N80's fingers point 1/2, 1/4, 1/8, 1/16, 1/32, and 1/64 of the way around the ring]

Slide 28: Finger i points to successor of n + 2^i

[Diagram: N80's fingers as above; the finger targeting K112 = 80 + 2^5 points to N120, the successor of 112]
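
A sketch of building a finger table in an m-bit ID space: finger[i] is the successor of n + 2^i mod 2^m, reusing the successor rule sketched after Slide 22. The ring below is illustrative.

    # Finger table for one node: finger[i] = successor of n + 2^i (mod 2^m).
    import bisect

    def successor(ring, key_id):
        i = bisect.bisect_left(ring, key_id)
        return ring[i % len(ring)]

    def finger_table(n, node_ids, m=7):
        ring = sorted(node_ids)
        return [successor(ring, (n + 2**i) % 2**m) for i in range(m)]

    # For N80 on the ring {10, 32, 60, 80, 90, 105, 120}, finger[5] targets
    # (80 + 2**5) % 128 = 112, whose successor is N120.
    print(finger_table(80, [10, 32, 60, 80, 90, 105, 120]))
    # -> [90, 90, 90, 90, 105, 120, 32]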

Slide 29: Implication of finger tables

- A binary lookup tree rooted at every node
  - Threaded through other nodes' finger tables
- Better than arranging nodes in a single tree
  - Every node acts as a root
    - So there's no root hotspot
    - No single point of failure
  - But a lot more state in total

Slide 30: Lookup with finger table

    Lookup(key-id)
      look in local finger table for
        highest n: my-id < n < key-id
      if n exists
        call Lookup(key-id) on node n    // next hop
      else
        return my successor              // done
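
As a sanity check, here is a runnable toy version of finger-table routing, using the 7-bit ring and node IDs shown on the next slide (N5 ... N110); the helper names are illustrative, not code from the deck.

    # Finger-table routing: pick the highest finger strictly between my ID and
    # the key (circularly); if none exists, my successor is the answer.
    import bisect

    M, NODES = 7, sorted([5, 10, 20, 32, 60, 80, 99, 110])

    def succ_of_id(i):
        """First node whose ID is >= i, wrapping around the ring."""
        return NODES[bisect.bisect_left(NODES, i % 2**M) % len(NODES)]

    def in_interval(x, a, b):
        return a < x < b if a < b else (x > a or x < b)   # circular open interval

    def fingers(n):
        return [succ_of_id(n + 2**k) for k in range(M)]

    def lookup(n, key_id):
        for f in reversed(fingers(n)):        # try the farthest (highest) fingers first
            if in_interval(f, n, key_id):
                return lookup(f, key_id)      # next hop
        return succ_of_id(n + 1)              # done: my successor

    assert lookup(32, 19) == 20               # Lookup(K19) resolves to N20 in a few hops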

Slide 31: Lookups take O(log N) hops

[Diagram: Lookup(K19) is routed via finger pointers across the ring N5, N10, N20, N32, N60, N80, N99, N110, reaching K19's successor N20 in O(log N) hops]

Slide 32: An aside: Is log(n) fast or slow?

- For a million nodes, it's 20 hops
- If each hop takes 50 ms, lookups take a second
- If each hop has a 10% chance of failure, it's a couple of timeouts
- So in practice log(n) is better than O(n), but not great
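
Checking the arithmetic behind these numbers (an illustration, not from the original deck): log2(10^6) ≈ 19.9, so about 20 hops; 20 hops × 50 ms ≈ 1 s per lookup; and with a 10% per-hop failure rate, roughly 20 × 0.1 = 2 hops per lookup can be expected to time out.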

Slide 33: Joining: Linked list insert

[Diagram: new node N36 joins between N25 and N40; step 1: N36 does Lookup(36) to find its place on the ring; N40 currently holds K30 and K38]

Slide 34: Join (2)

[Diagram: step 2: N36 sets its own successor pointer to N40; N40 still holds K30 and K38]

Slide 35: Join (3)

[Diagram: step 3: copy keys 26..36 (here K30) from N40 to N36; N40 keeps K38]

Slide 36: Notify maintains predecessors

[Diagram: N36 sends "notify N36" to N40, and N25 sends "notify N25" to N36, so each node learns about its predecessor]

Slide 37: Stabilize message fixes successor

[Diagram: N25 runs stabilize against its successor N40; N40 replies "My predecessor is N36," so N25 updates its successor pointer to N36]

Slide 38: Joining: Summary

- Predecessor pointer allows the ring to link in the new node
- Update finger pointers in the background
- Correct successors produce correct lookups

[Diagram: ring after N36 joins between N25 and N40; K30 is now at N36 (N40 may retain a copy), K38 stays at N40]
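
A hedged sketch of the join sequence on Slides 33-38, following the Chord paper's stabilize/notify pseudocode; the Node class and helpers below are illustrative, not code from the deck.

    # Join + stabilize + notify on a toy ring of node IDs.
    def in_interval(x, a, b, inclusive_right=False):
        """x in the circular interval (a, b), or (a, b] if inclusive_right."""
        if a == b:
            return x != a or inclusive_right
        if a < b:
            return a < x < b or (inclusive_right and x == b)
        return x > a or x < b or (inclusive_right and x == b)

    class Node:
        def __init__(self, node_id):
            self.id, self.successor, self.predecessor = node_id, self, None

        def lookup(self, key_id):
            """Naive successor walk: find the node responsible for key_id."""
            n = self
            while not in_interval(key_id, n.id, n.successor.id, inclusive_right=True):
                n = n.successor
            return n.successor

        def join(self, existing):
            # Steps 1-2: find our place on the ring and set our successor pointer.
            # (Step 3, copying keys in (predecessor, self.id], is the DHT layer's job.)
            self.successor = existing.lookup(self.id)

        def stabilize(self):
            # Ask our successor for its predecessor; adopt it if it sits between us.
            x = self.successor.predecessor
            if x is not None and in_interval(x.id, self.id, self.successor.id):
                self.successor = x
            self.successor.notify(self)          # "I might be your predecessor"

        def notify(self, n):
            if self.predecessor is None or in_interval(n.id, self.predecessor.id, self.id):
                self.predecessor = n

    # N25 and N40 form a ring; N36 joins, and a couple of stabilize rounds fix pointers.
    n25, n40 = Node(25), Node(40)
    n25.successor, n25.predecessor = n40, n40
    n40.successor, n40.predecessor = n25, n25
    n36 = Node(36)
    n36.join(n25)
    for _ in range(2):
        for n in (n25, n36, n40):
            n.stabilize()
    assert n25.successor is n36 and n36.successor is n40
    assert n40.predecessor is n36 and n36.predecessor is n25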

Slide 39: Failures may cause incorrect lookup

[Diagram: N10 issues Lookup(K90) on a ring including N80, N85, N102, N113, N120; several nodes have failed, so N80 does not know its correct successor and the lookup is incorrect]

Slide 40: Successor lists

- Each node stores a list of its r immediate successors
  - After a failure, it will know the first live successor
- Correct successors guarantee correct lookups
  - Guarantee is with some probability

Slide 41: Choosing successor list length

- Assume one half of the nodes fail
- P(successor list all dead) = (1/2)^r
  - i.e., P(this node breaks the Chord ring)
  - Depends on independent failures
- Successor list of size r = O(log N) makes this probability 1/N: low for large N
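
Working the numbers (an illustration, not from the deck): with r = log2 N, P(all r successors dead) = (1/2)^r = (1/2)^(log2 N) = 1/N. For N = 1,000 nodes, r = 10 list entries give (1/2)^10 = 1/1024 ≈ 0.1% per node.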

Slide 42: Lookup with fault tolerance

    Lookup(key-id)
      look in local finger table and successor list
        for highest n: my-id < n < key-id
      if n exists
        call Lookup(key-id) on node n        // next hop
        if call failed,
          remove n from finger table and/or successor list
          return Lookup(key-id)
      else
        return my successor                  // done

Slide 43: Today

- Peer-to-Peer Systems
- Distributed Hash Tables
- The Chord Lookup Service
  - Basic design
  - Integration with DHash DHT, performance

Slide 44: The DHash DHT

- Builds key/value storage on Chord
- Replicates blocks for availability
  - Stores k replicas at the k successors after the block on the Chord ring
- Caches blocks for load balancing
  - Client sends a copy of the block to each of the servers it contacted along the lookup path
- Authenticates block contents

Slide 45: DHash data authentication

Two types of DHash blocks:
- Content-hash: key = SHA-1(data)
- Public-key: data signed by the corresponding private key

Chord File System example: [figure not included in the transcript]
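
A sketch of why content-hash blocks are self-authenticating: the key must equal the SHA-1 of the data, so any client can check a fetched block. The function name below is illustrative.

    # Verify a content-hash block: key == SHA-1(data).
    import hashlib

    def verify_content_hash_block(key: bytes, data: bytes) -> bool:
        return hashlib.sha1(data).digest() == key

    data = b"some block contents"
    key = hashlib.sha1(data).digest()
    assert verify_content_hash_block(key, data)
    assert not verify_content_hash_block(key, data + b"tampered")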

Slide 46: DHash replicates blocks at r successors

- Replicas are easy to find if the successor fails
- Hashed node IDs ensure independent failure

[Diagram: Block 17 is stored at its successor and replicated at the next several successors on the ring N5, N10, N20, N32, N40, N50, N60, N68, N80, N99, N110]

Slide 47: Experimental overview

Goal: Experimentally confirm theoretical results
- Quick lookups in large systems
- Low variation in lookup costs
- Robust despite massive failure

Slide 48: Chord lookup cost is O(log N)

[Plot: average messages per lookup vs. number of nodes; grows logarithmically, with the constant 1/2, i.e., about (1/2) log2 N messages per lookup]

Slide 49: Failure experiment setup

- Start 1,000 Chord servers
  - Each server's successor list has 20 entries
  - Wait until they stabilize
- Insert 1,000 key/value pairs
  - Five replicas of each
- Stop X% of the servers, immediately make 1,000 lookups

Slide 50: Massive failures have little impact

[Plot: failed lookups (percent) vs. failed nodes (percent); even with many failed nodes, few lookups fail; the annotation notes (1/2)^6 ≈ 1.6%]

Slide 51: Today

- Peer-to-Peer Systems
- Distributed Hash Tables
- The Chord Lookup Service
  - Basic design
  - Integration with DHash DHT, performance
- Concluding thoughts on DHTs, P2P

Slide 52: DHTs: Impact

- Original DHTs (CAN, Chord, Kademlia, Pastry, Tapestry) proposed in 2001-02
- The next 5-6 years saw a proliferation of DHT-based apps:
  - Filesystems (e.g., CFS, Ivy, OceanStore, Pond, PAST)
  - Naming systems (e.g., SFR, Beehive)
  - DB query processing [PIER, Wisc]
  - Content distribution systems (e.g., CoralCDN)
  - Distributed databases (e.g., PIER)

Slide 53: Why don't all services use P2P?

- High latency and limited bandwidth between peers (vs. intra/inter-datacenter links)
- User computers are less reliable than managed servers
- Lack of trust in peers' correct behavior
  - Securing DHT routing is hard, unsolved in practice

Slide 54: DHTs in retrospect

- Seem promising for finding data in large P2P systems
- Decentralization seems good for load, fault tolerance
- But: the security problems are difficult
- But: churn is a problem, particularly if log(n) is big
- DHTs have not had the hoped-for impact

Slide 55: What DHTs got right

Consistent hashing:
- Elegant way to divide a workload across machines
- Very useful in clusters: actively used today in Amazon Dynamo and other systems
- Replication for high availability, efficient recovery
- Incremental scalability
- Self-management: minimal configuration
- Unique trait: no single server to shut down/monitor