Slide1
Causal Consistency
COS 418: Distributed Systems
Lecture 16
Michael Freedman
Slide2
Consistency models: Linearizability, Sequential, Causal, Eventual
Slide3
Recall use of logical clocks

Lamport clocks: C(a) < C(z). Conclusion: none.
Vector clocks: V(a) < V(z). Conclusion: a → … → z.

Distributed bulletin board application:
- Each post gets sent to all other users.
- Consistency goal: no user sees a reply before the corresponding original post.
- Conclusion: deliver a message only after all messages that causally precede it have been delivered.
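To make the delivery rule concrete, here is a minimal sketch (not from the slides; the process indexing and helper names are illustrative) of vector-clock comparison and the causal-delivery check a bulletin-board node could use:

    def happens_before(va, vb):
        # a -> ... -> z iff V(a) < V(z): every entry <=, and the clocks differ
        return all(x <= y for x, y in zip(va, vb)) and va != vb

    def concurrent(va, vb):
        # Neither precedes the other: the operations are concurrent
        return not happens_before(va, vb) and not happens_before(vb, va)

    def can_deliver(msg_clock, sender, local_clock):
        # Deliver a message only after everything that causally precedes it:
        # it must be the next message from its sender, and we must already
        # have delivered everything from other processes the sender had seen.
        return (msg_clock[sender] == local_clock[sender] + 1 and
                all(msg_clock[i] <= local_clock[i]
                    for i in range(len(local_clock)) if i != sender))

    # V(a) < V(z) lets us conclude a -> ... -> z; incomparable clocks are concurrent.
    assert happens_before([1, 0, 0], [2, 1, 0])
    assert concurrent([1, 0, 0], [0, 1, 0])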
Slide4
Causal Consistency

- Writes that are potentially causally related must be seen by all machines in the same order.
- Concurrent writes may be seen in a different order on different machines.
- Concurrent: operations that are not causally related.
Slide5
Causal Consistency

[Timeline diagram: processes P1, P2, P3 issue operations a-g as physical time advances downward.]

- Writes that are potentially causally related must be seen by all machines in the same order.
- Concurrent writes may be seen in a different order on different machines.
- Concurrent: operations that are not causally related.
Slide6
Causal Consistency

[Same timeline diagram: P1, P2, P3 issue operations a-g over physical time.]

Operations   Concurrent?
a, b         N
b, f         Y
c, f         Y
e, f         Y
e, g         N
a, c         Y
a, e         N
Slide8
Causal Consistency: Quiz

Valid under causal consistency. Why?
- W(x)b and W(x)c are concurrent, so all processes don't (need to) see them in the same order.
- P3 and P4 read the values 'a' and 'b' in order, as they are potentially causally related. There is no causality for 'c'.
Slide9
Sequential Consistency: Quiz

Invalid under sequential consistency. Why?
- P3 and P4 see b and c in different orders.
- But this is fine for causal consistency: b and c are not causally dependent.
- A write after a write has no dependency; a write after a read does.
Slide10
Causal Consistency

A: Violation: W(x)b is potentially dependent on W(x)a.
B: Correct: P2 doesn't read the value of a before writing b.
Slide11
Causal consistency within replication systems
Slide12
Implications of laziness on consistency

Linearizability / sequential: eager replication
- Trades off low latency for consistency.

[Diagram: three replicas, each with a Consensus Module, a Log (add, jmp, mov, shl), and a State Machine; a pending shl command.]
Slide13
Implications of laziness on consistency

Causal consistency: lazy replication
- Trades off consistency for low latency.
- Maintain local ordering when replicating.
- Operations may be lost if there is a failure before replication.

[Diagram: three replicas, each with a Log (add, jmp, mov, shl) and a State Machine, but no consensus module; a pending shl command.]
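A minimal sketch of this lazy write path (the queue and function names are assumptions, not from the slides): writes are acknowledged once applied locally and shipped to replicas in the background, so anything still queued when the node fails is lost.

    import queue

    local_log = []                     # local ordering preserved by append order
    replication_queue = queue.Queue()  # ops waiting to be replicated lazily

    def put(op):
        local_log.append(op)           # apply locally first
        replication_queue.put(op)      # replicate in the background
        return "ack"                   # low latency: don't wait for other replicas

    def replicate(send_to_replica):
        # Background loop: drains the queue in local order.
        # Ops still queued when this node fails are lost (weaker consistency).
        while True:
            send_to_replica(replication_queue.get())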
Slide14
Don't Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
W. Lloyd, M. Freedman, M. Kaminsky, D. Andersen. SOSP 2011.
Slide15
Wide-Area Storage: Serve requests quickly
Slide16
Inside the Datacenter

[Diagram: a local datacenter with a Web Tier and a Storage Tier partitioned across key ranges A-F, G-L, M-R, S-Z; a remote DC with the same structure; replication between the two.]
Slide17
Trade-offs

Availability, Low Latency, Partition Tolerance, Scalability
vs.
Consistency (stronger), Partition Tolerance
Slide18
Scalability through partitioning

[Diagram: the keyspace A-Z split across progressively more servers: A-Z; A-L, M-Z; A-F, G-L, M-R, S-Z; A-C, D-F, G-J, K-L, M-O, P-S, T-V, W-Z.]
Slide19
Causality By Example

1. Remove boss from friends group
2. Post to friends: "Time for a new job!"
3. Friend reads post

Causality arises from: Thread-of-Execution, Gets-From, Transitivity.

[Diagram: the "New Job!" post, the Friends group, and the Boss, with causal arrows between the events.]
Slide20
Previous Causal Systems

Bayou '94, TACT '00, PRACTI '06
- Log-exchange based.
- The log is a single serialization point.
- Implicitly captures and enforces causal order.
- Limits scalability OR provides no cross-server causality.
Slide21
Scalability Key Idea

- Dependency metadata explicitly captures causality.
- Distributed verifications replace single serialization.
- Delay exposing replicated puts until all dependencies are satisfied in the datacenter.
Slide22
COPS architecture

[Diagram: the Client Library talks to the Local Datacenter; each datacenter stores all data; causal replication between datacenters.]
Slide23
Reads

[Diagram: the Client Library issues get operations directly to the Local Datacenter.]
Slide24
Writes

[Diagram: the Client Library turns a put into a put_after (put + ordering metadata); the local datacenter stores K:V and places the put_after on the Replication Queue.]
Slide25
Dependencies

- Dependencies are explicit metadata on values.
- The library tracks them and attaches them to put_afters.
Slide26
Dependencies

- Dependencies are explicit metadata on values.
- The library tracks them and attaches them to put_afters.

(Thread-of-Execution Rule) Client 1 calls put(key, val); the library issues put_after(key, val, deps) and records the returned version of K in its dependency list.
Slide27
Dependencies

- Dependencies are explicit metadata on values.
- The library tracks them and attaches them to put_afters.

(Gets-From Rule) Client 2 calls get(K); the store returns the value, its version, and its dependencies deps' (e.g., L at version 337, M at version 195). The version of K is added to the client's dependencies.
(Transitivity Rule) The returned deps' are also added to the client's dependencies.
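A minimal, hypothetical sketch of the client-library bookkeeping these three rules imply (the store interface and names are assumptions, not the exact COPS API; versions are assumed to be comparable integers, e.g., Lamport timestamps):

    class CopsClientLibrary:
        def __init__(self, store):
            self.store = store   # handle to the local datacenter
            self.deps = {}       # key -> version this client currently depends on

        def put(self, key, val):
            # Attach current dependencies; the store assigns and returns a version.
            version = self.store.put_after(key, val, dict(self.deps))
            # Thread-of-execution rule: later operations depend on this write.
            self.deps[key] = version
            return version

        def get(self, key):
            val, version, their_deps = self.store.get(key)
            # Gets-from rule: we now depend on the version we just read.
            self.deps[key] = max(self.deps.get(key, 0), version)
            # Transitivity rule: we also depend on what that value depended on.
            for k, v in their_deps.items():
                self.deps[k] = max(self.deps.get(k, 0), v)
            return val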
Slide28
Causal Replication

[Diagram: the Replication Queue sends put_after(K, V, deps) to the remote datacenter, which stores K:V together with its deps.]
Slide29
Causal Replication

[Diagram: the remote datacenter receives put_after(K, V, deps) with deps {L: 337, M: 195}; before exposing K:V it issues dep_check(L337) and dep_check(M195) to the nodes responsible for L and M.]

- dep_check blocks until satisfied.
- Once all checks return, all dependencies are visible locally.
- Thus, causal consistency is satisfied.
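A hypothetical sketch of the remote datacenter's replication handler under this scheme, assuming a blocking dep_check primitive (the store interface and names here are illustrative, not COPS's actual code):

    def handle_put_after(store, key, val, version, deps):
        # Check every dependency; each dep_check blocks until that write is
        # visible in this datacenter.
        for dep_key, dep_version in deps.items():
            dep_check(store, dep_key, dep_version)
        # All causally preceding writes are now locally visible, so exposing
        # this write cannot violate causal consistency.
        store.commit(key, val, version, deps)

    def dep_check(store, key, version):
        # Blocks until the local replica of `key` has reached at least `version`.
        store.wait_for_version(key, version)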
Slide30
System So Far

ALPS + Causal:
- Serve operations locally, replicate in the background.
- Partition the keyspace onto many nodes.
- Control replication with dependencies.

Proliferation of dependencies reduces efficiency:
- Results in lots of metadata.
- Requires lots of verification.

We need to reduce metadata and dep_checks:
- Nearest dependencies
- Dependency garbage collection
Slide31
Many Dependencies

[Diagram: a client's sequence of puts and gets, each adding to its dependency set.]

Dependencies grow with client lifetimes.
Slide32
Nearest Dependencies

Transitively capture all ordering constraints.
Slide33
The Nearest Are Few

Transitively capture all ordering constraints.
Slide34
The Nearest Are Few

- Only check the nearest dependencies when replicating.
- COPS only tracks the nearest dependencies.
- COPS-GT ("with get transactions") tracks non-nearest dependencies for read transactions.
- Dependency garbage collection tames metadata in COPS-GT.
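One way to see why only the nearest dependencies need checking: any dependency that is itself (transitively) depended on by another dependency is already implied. A small illustrative sketch of that filtering, with a hypothetical deps_of map from each (key, version) pair to the dependencies recorded with that write:

    def nearest_dependencies(deps, deps_of):
        """Return only the dependencies not already implied by another one.

        deps:    set of (key, version) pairs an operation depends on
        deps_of: dict mapping (key, version) -> iterable of its own dependencies
        """
        covered = set()
        for d in deps:
            # Everything d transitively depends on is implied by checking d itself.
            stack = list(deps_of.get(d, ()))
            while stack:
                x = stack.pop()
                if x not in covered:
                    covered.add(x)
                    stack.extend(deps_of.get(x, ()))
        return {d for d in deps if d not in covered}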
Slide35
Experimental Setup

[Diagram: clients connect to N COPS servers in the local datacenter, which replicate to a remote COPS datacenter.]
Slide36
Performance

All-Put Workload - 4 Servers / Datacenter
- High per-client write rates (people tweeting 1000 times/sec) result in thousands of dependencies.
- Low per-client write rates (people tweeting 1 time/sec) are what we expect.
Slide37
COPS Scaling
Slide38
COPS summary

- ALPS: handle all reads/writes locally.
- Causality: explicit dependency tracking and verification with decentralized replication.
- Optimizations to reduce metadata and checks.
- What about fault tolerance? Each partition uses linearizable replication within the DC.
Slide39
Wednesday lecture

Concurrency Control: Locking and Recovery