Causal Consistency
COS 418: Distributed Systems
Lecture 14
Michael Freedman

Consistency models, from strongest to weakest:
Linearizability
Sequential
Causal
Eventual

Recall the use of logical clocks (Lecture 4):
Lamport clocks: C(a) < C(z). Conclusion: none.
Vector clocks: V(a) < V(z). Conclusion: a → … → z.
Distributed bulletin board application: each post gets sent to all other users. Consistency goal: no user sees a reply before the corresponding original message post. Conclusion: deliver a message only after all messages that causally precede it have been delivered.
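The clock comparison above can be sketched in a few lines. This is an illustrative vector-clock check, not part of the lecture's materials; the example clock values are assumptions.

```python
# Vector clocks are compared component-wise: V(a) < V(z) lets us conclude
# a -> ... -> z, whereas a Lamport comparison C(a) < C(z) permits no
# conclusion. The example clocks below are made up for illustration.

def happens_before(a, b):
    """a -> b iff every component of a is <= b's and the clocks differ."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def concurrent(a, b):
    """Neither event happens-before the other."""
    return not happens_before(a, b) and not happens_before(b, a)

V_a = [1, 0, 0]   # event a at process 0
V_z = [2, 1, 0]   # event z, causally after a
V_w = [0, 0, 1]   # an event concurrent with a

assert happens_before(V_a, V_z)   # V(a) < V(z): conclude a -> ... -> z
assert concurrent(V_a, V_w)       # incomparable clocks: concurrent
```

Causal delivery on the bulletin board then amounts to holding back a message until every message whose vector clock is smaller has already been delivered.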
Causal Consistency
Writes that are potentially causally related must be seen by all machines in the same order.
Concurrent writes may be seen in a different order on different machines.
Concurrent: operations that are not causally related.

Causal Consistency
[Figure: processes P1, P2, P3 performing operations a–g over physical time.]
Writes that are potentially causally related must be seen by all machines in the same order.
Concurrent writes may be seen in a different order on different machines.
Concurrent: operations that are not causally related.

Causal Consistency
[Figure: the same execution of P1, P2, P3 with operations a–g over physical time.]

Operations   Concurrent?
a, b         N
b, f         Y
c, f         Y
e, f         Y
e, g         N
a, c         Y
a, e         N
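The Y/N answers above follow mechanically from happens-before reachability over process order plus message edges. A hedged sketch: the edge set below is one reconstruction consistent with the table; the exact message arrows in the figure are an assumption.

```python
# Happens-before as graph reachability. EDGES is an assumed event graph that
# reproduces the table's answers: process orders a->b->d, c->e, f->g, plus
# message edges a->e and e->g (these particular arrows are a reconstruction).

EDGES = {
    'a': {'b', 'e'},   # P1 order a->b; assumed message a -> e
    'b': {'d'},        # P1 order
    'c': {'e'},        # assumed P2 order c->e
    'e': {'g'},        # assumed message e -> g
    'f': {'g'},        # assumed P3 order f->g
}

def happens_before(x, y, edges=EDGES):
    """True if there is a causal path x -> ... -> y."""
    stack, seen = [x], set()
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt == y:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def concurrent(x, y):
    return not happens_before(x, y) and not happens_before(y, x)

# Matches the table: (a,b) N, (e,g) N, (a,e) N; the rest Y.
assert not concurrent('a', 'b') and not concurrent('e', 'g')
assert concurrent('b', 'f') and concurrent('a', 'c')
```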
Causal Consistency: Quiz
Valid under causal consistency. Why? W(x)b and W(x)c are concurrent, so all processes need not see them in the same order. P3 and P4 read the values a and b in order because those writes are potentially causally related; there is no causality constraint on c.

Sequential Consistency: Quiz
Invalid under sequential consistency. Why? P3 and P4 see b and c in different orders. But this is fine for causal consistency: b and c are not causally dependent. A write after a write has no dependencies; a write after a read does.

Causal Consistency
A: Violation: W(x)b is potentially dependent on W(x)a.
B: Correct: P2 does not read the value of a before writing.

Causal consistency within replication systems

Implications of laziness on consistency
Linearizability / sequential: eager replication trades off low latency for consistency.
[Figure: three replicas, each with a log (add, jmp, mov, shl), a consensus module, and a state machine; the consensus modules agree on the log before entries are applied.]

Implications of laziness on consistency
Causal consistency: lazy replication trades off consistency for low latency. Local ordering is maintained when replicating, but operations may be lost if a failure occurs before replication.
[Figure: three replicas, each with a log (add, jmp, mov, shl) and a state machine, but no consensus module; logs are replicated lazily.]
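A minimal sketch of the lazy scheme, assuming a simple key-value store (the LazyReplica class is illustrative, not the slide's actual system): writes are acknowledged locally and the log is shipped to replicas in local order in the background.

```python
# Illustrative lazy replication: put() applies locally and acks immediately;
# replicate_to() ships the unshipped log prefix later, preserving local order.
# A crash before shipping would lose the unreplicated writes -- the
# consistency / low-latency trade-off described on this slide.

class LazyReplica:
    def __init__(self):
        self.store = {}
        self.log = []          # writes in local order
        self.shipped = 0       # length of the log prefix already replicated

    def put(self, key, val):
        self.store[key] = val  # apply locally and ack immediately (low latency)
        self.log.append((key, val))

    def replicate_to(self, other):
        # Background step: ship unreplicated writes in local order.
        for key, val in self.log[self.shipped:]:
            other.store[key] = val
        self.shipped = len(self.log)

primary, backup = LazyReplica(), LazyReplica()
primary.put('x', 1)
primary.put('x', 2)            # acked before any replication happens
primary.replicate_to(backup)   # lazy: order preserved, but lost on crash
assert backup.store['x'] == 2
```

Contrast with the previous slide: an eager scheme would run a consensus round before acknowledging, paying latency up front for consistency.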
Don't Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
W. Lloyd, M. Freedman, M. Kaminsky, D. Andersen. SOSP 2011.

Wide-Area Storage: Serve requests quickly

Inside the Datacenter
[Figure: a web tier sends requests to a storage tier partitioned by key range (A–F, G–L, M–R, S–Z); the same layout is replicated to a remote datacenter.]

Trade-offs
Availability
Low Latency
Partition Tolerance
Scalability
vs.
Consistency (stronger)

Scalability through partitioning
[Figure: the keyspace split across progressively more nodes per datacenter: one node (A–Z), two (A–L, M–Z), four (A–F, G–L, M–R, S–Z), eight (A–C, D–F, G–J, K–L, M–O, P–S, T–V, W–Z).]

Causality By Example
Remove boss from friends group.
Post to friends: “Time for a new job!”
Friend reads post.
Causality arises from thread-of-execution, gets-from, and transitivity.
[Figure: the “New Job!” post, the Friends group, and the Boss, connected by these causal edges.]

Previous Causal Systems
Bayou ’94, TACT ’00, PRACTI ’06: log-exchange based. The log is a single serialization point that implicitly captures and enforces causal order. This limits scalability, or gives up cross-server causality.

Scalability Key Idea
Dependency metadata explicitly captures causality.
Distributed verifications replace the single serialization point.
Delay exposing replicated puts until all their dependencies are satisfied in the datacenter.

COPS architecture
[Figure: a local datacenter in which a client library fronts several nodes that each store all data; causal replication connects datacenters.]

Reads
[Figure: the client library serves get operations directly from the local datacenter.]

Writes
[Figure: the client library converts a put into a put_after (a put plus ordering metadata); the local datacenter stores K:V and places the put_after on the replication queue.]

Dependencies
Dependencies are explicit metadata on values. The client library tracks them and attaches them to put_afters.

Dependencies
(Thread-of-Execution Rule) Client 1 calls put(key, val); the library issues put_after(key, val, deps), and the version assigned to the write becomes a dependency for the client’s later operations.
Dependencies are explicit metadata on values. The client library tracks them and attaches them to put_afters.

Dependencies
(Gets-From Rule) Client 2 calls get(K); the library receives the value, its version, and its dependencies deps′ (here L at version 337 and M at version 195), and adds K’s version to the client’s dependencies.
(Transitivity Rule) The returned deps′ are merged into the client’s dependencies as well.
Dependencies are explicit metadata on values. The client library tracks them and attaches them to put_afters.
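The three rules can be sketched together. The Server and ClientContext classes below are illustrative assumptions; only the put_after/get/deps vocabulary comes from the slides, and the version numbers differ from the slide's example.

```python
# Illustrative client-library dependency tracking for the three rules:
# thread-of-execution (a put depends on the client's prior ops), gets-from
# (a get adds the read value's version), transitivity (it also inherits the
# read value's own deps). The classes here are toy stand-ins, not COPS code.

class Server:
    def __init__(self):
        self.ver = 0
        self.data = {}                        # key -> (version, value, deps)

    def put_after(self, key, val, deps):
        self.ver += 1
        self.data[key] = (self.ver, val, deps)
        return self.ver

    def get(self, key):
        return self.data[key]

class ClientContext:
    def __init__(self, server):
        self.server = server
        self.deps = {}                        # key -> version this client depends on

    def put(self, key, val):
        # Thread-of-execution rule: the put depends on all prior ops.
        ver = self.server.put_after(key, val, dict(self.deps))
        self.deps = {key: ver}                # the new put subsumes earlier deps

    def get(self, key):
        ver, val, deps = self.server.get(key)
        self.deps.update(deps)                # transitivity rule
        self.deps[key] = ver                  # gets-from rule
        return val

srv = Server()
alice = ClientContext(srv)
alice.put('L', 'login')       # L gets version 1
alice.put('M', 'post')        # M gets version 2, depending on L:1
bob = ClientContext(srv)
bob.get('M')                  # bob now depends on M:2 and, transitively, L:1
bob.put('K', 'reply')         # the reply carries deps {M: 2, L: 1}
assert srv.data['K'][2] == {'M': 2, 'L': 1}
```

Note the subsumption in put: once a client's own write carries its earlier dependencies, later operations need only depend on that write.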
Causal Replication
[Figure: a put_after(K, V, deps) is taken from the replication queue and sent to the remote datacenter, which stores K:V along with its deps.]

Causal Replication
[Figure: the remote datacenter receives put_after(K, V, deps) with deps L at version 337 and M at version 195, and issues dep_check(L337) and dep_check(M195) to the nodes responsible for L and M.]
dep_check blocks until satisfied. Once all checks return, all dependencies are visible locally, and thus causal consistency is satisfied.
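A sketch of the receiving side, assuming versions are comparable integers. Only the name dep_check comes from the slides; the other function names are illustrative.

```python
# Illustrative dep_check logic at the remote datacenter: a replicated
# put_after is exposed only once every dependency is locally visible.
# In the real system dep_check blocks; here it simply reports readiness.

def dep_check(local_versions, key, version):
    """Satisfied once key is locally visible at version >= the dependency."""
    return local_versions.get(key, 0) >= version

def try_apply(store, local_versions, key, val, version, deps):
    # Expose the put only after all dep_checks pass; otherwise keep it queued.
    if all(dep_check(local_versions, k, v) for k, v in deps.items()):
        store[key] = val
        local_versions[key] = version
        return True
    return False               # retried once the missing deps arrive

store, versions = {}, {}
# K:V depends on L at version 337 and M at version 195 (the slide's deps).
ok = try_apply(store, versions, 'K', 'V', 400, {'L': 337, 'M': 195})
assert not ok                  # L and M are not yet visible locally
versions.update({'L': 337, 'M': 195})
assert try_apply(store, versions, 'K', 'V', 400, {'L': 337, 'M': 195})
assert store['K'] == 'V'
```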
System So Far
ALPS + Causal: serve operations locally and replicate in the background; partition the keyspace onto many nodes; control replication with dependencies.
Proliferation of dependencies reduces efficiency: it produces lots of metadata and requires lots of verification. We need to reduce metadata and dep_checks, via nearest dependencies and dependency garbage collection.

Many Dependencies
[Figure: a chain of puts and gets accumulating dependencies.]
Dependencies grow with client lifetimes.

Nearest Dependencies
The nearest dependencies transitively capture all ordering constraints.

The Nearest Are Few
The nearest dependencies transitively capture all ordering constraints.

The Nearest Are Few
Only check the nearest dependencies when replicating.
COPS tracks only the nearest dependencies; COPS-GT also tracks non-nearest dependencies for read transactions, and dependency garbage collection tames the metadata in COPS-GT.
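Why the nearest are few: any dependency reachable from another dependency is implied transitively, so only the frontier of the dependency graph needs checking. A hedged sketch; the nearest helper and the graph encoding are assumptions, not the COPS implementation.

```python
# Illustrative computation of the "nearest" dependency set: drop every
# dependency that is reachable from some other dependency, since checking
# the remaining frontier transitively implies all the rest.

def nearest(deps, dep_graph):
    """deps: set of ops; dep_graph: op -> set of ops it directly depends on."""
    def reachable(src):
        out, stack = set(), [src]
        while stack:
            for nxt in dep_graph.get(stack.pop(), ()):
                if nxt not in out:
                    out.add(nxt)
                    stack.append(nxt)
        return out

    covered = set()
    for d in deps:
        covered |= reachable(d)       # everything implied by some dep
    return deps - covered             # the frontier: nothing else implies it

# c depends on b, and b depends on a: checking c alone suffices.
graph = {'c': {'b'}, 'b': {'a'}}
assert nearest({'a', 'b', 'c'}, graph) == {'c'}
```

This is why replication can issue dep_checks for the nearest dependencies only, as the slide states.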
Experimental Setup
[Figure: clients drive N COPS servers in the local datacenter, which replicate to a remote COPS datacenter.]

Performance
All-put workload, 4 servers per datacenter.
High per-client write rates result in thousands of dependencies (people tweeting 1000 times/sec); low per-client write rates are the expected case (people tweeting once per second).

COPS Scaling

COPS summary
ALPS: handle all reads and writes locally.
Causality: explicit dependency tracking and verification with decentralized replication, plus optimizations to reduce metadata and checks.
What about fault tolerance? Each partition uses linearizable replication within a datacenter.

Monday lecture: Concurrency Control