EC-Cache: Load-balanced, Low-latency Cluster Caching with Online Erasure Coding
K. V. Rashmi, Mosharaf Chowdhury, Jack Kosaian, Ion Stoica, Kannan Ramchandran
Presented by Haoran Wang
EECS 582 – F16
Background
- The trend of in-memory caching for object stores
  - Greatly reduces disk I/O, e.g. Alluxio (Tachyon)
- Load imbalance
  - Skew of object popularity
  - Background network traffic congestion
Erasure coding in a nutshell
- Fault tolerance in a space-efficient fashion
- Trades data availability for space efficiency
Replication
[Figure: three data units X, Y, Z]
Replication
[Figure: X, Y, Z each stored twice]
Replication factor = 2
Erasure Coding
[Figure: original data units X, Y, Z]
Erasure Coding
[Figure: data units X, Y, Z encoded into parity units W, T]
Linear parity:
W = a11·X + a12·Y + a13·Z
T = a21·X + a22·Y + a23·Z
k = 3, r = 2 (three data units, two parity units)
A well-known choice of coefficient matrix (e.g. a Vandermonde matrix, as used in Reed-Solomon codes) makes any k of the k + r stored units sufficient to recover the data. A sketch of this encoding follows.
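To make the algebra concrete, here is a minimal encoding sketch in Python/NumPy. It works over the reals purely for illustration; production systems (and EC-Cache, per the implementation slide) use Reed-Solomon codes over finite fields such as GF(2^8). The specific coefficient matrix and all names are illustrative assumptions, not the paper's code.

```python
import numpy as np

# Toy (k = 3, r = 2) linear-parity encoder over the reals.
# Real systems use Reed-Solomon codes over GF(2^8); this only
# illustrates the algebra from the slides.
k, r = 3, 2

# Vandermonde-style coefficients: rows are (a11, a12, a13) and
# (a21, a22, a23). With these, any 3 of the 5 stored units are
# enough to recover the data (see the decoding sketch later).
A = np.vander(np.arange(1, r + 1), k, increasing=True).astype(float)
# A == [[1, 1, 1],
#       [1, 2, 4]]

def encode(data):
    """data: length-k vector (x, y, z) -> parity units (w, t)."""
    return A @ data

data = np.array([7.0, 11.0, 13.0])  # x, y, z
w, t = encode(data)
print(w, t)                          # 31.0 81.0
```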
Fault tolerance - Replication
[Figure: X, Y, Z replicated twice; one copy lost]
Survived 1 failure
Fault tolerance - Replication
[Figure: two copies lost, belonging to different units]
Survived 2 failures
Fault tolerance - Replication
[Figure: both copies of the same unit lost]
Cannot survive
Fault tolerance - Replication
Total storage is 6 units, yet it is only guaranteed to survive one failure.
Fault tolerance - EC
[Figure: the same linear-parity encoding as above, with k = 3, r = 2]
Fault tolerance - EC
Notation: an uppercase letter (e.g. X) is the symbol that represents a data unit; the corresponding lowercase letter (e.g. x) is its actual stored value.
Translated into linear equations:
X = x
Y = y
Z = z
a11·X + a12·Y + a13·Z = w
a21·X + a22·Y + a23·Z = t
Fault tolerance - EC
Translated into linear equations, the system is still solvable, with a unique solution (set).
Fault tolerance - EC
Translated into linear equations: 5 equations, 3 variables.
Can survive ANY 2 failures, because N variables can be solved from any N linearly independent equations.
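Continuing the toy example, recovery is literally solving the surviving equations. A minimal sketch, assuming the same illustrative coefficient matrix as in the encoding sketch above:

```python
import numpy as np

# Recovery = solving the surviving linear equations (toy, over the reals).
# Generator G = [I; A]: rows 0-2 "store" x, y, z directly; rows 3-4
# store the parities w, t.
k = 3
A = np.vander(np.array([1, 2]), k, increasing=True).astype(float)
G = np.vstack([np.eye(k), A])          # 5 equations in 3 unknowns

data = np.array([7.0, 11.0, 13.0])     # x, y, z
stored = G @ data                       # the 5 stored units

# Lose ANY two units, e.g. unit 0 (x) and unit 3 (w):
alive = [1, 2, 4]
recovered = np.linalg.solve(G[alive], stored[alive])
assert np.allclose(recovered, data)     # x, y, z recovered exactly
```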
Fault tolerance – EC
Needs only 5 storage units to survive any 2 failures, i.e. (k + r)/k = 5/3 ≈ 1.67× storage overhead. Compare replication: (6, 1), i.e. 6 units at 2× overhead, yet only 1 guaranteed survivable failure.
What’s the tradeoff?
Saving space vs. the time needed to reconstruct (solving linear equations).
Switch of context
- EC for fault tolerance → EC for load balancing and low latency
- Granularity: a single data object → splits of an object
- What about the tradeoff? EC-Cache deals primarily with in-memory caches, so reconstruction time is not a problem
Load balancing
4 pieces of data: Hot, Hot, Cold, Cold
Load balancing
[Figure: one Hot and one Cold piece paired on each server]
Seemingly balanced
Load balancing
What if some data is super hot? Super hot >> hot + cold + cold.
[Figure: the server holding the super-hot piece is overloaded]
Load balancing
- Introducing splits: "distribute" the hotness of a single object
- Also good for read latency: read splits in parallel (a toy simulation of the effect follows)
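A back-of-the-envelope simulation of how splitting flattens load. The parameters (5 servers, k = 3 units fetched per read) are illustrative assumptions, not the paper's setup:

```python
import random
from collections import Counter

servers, k, n_reads = 5, 3, 10_000   # illustrative parameters

# Whole-object caching: every read of the hot object hits one server.
whole_max = n_reads

# Split caching: each read fetches any k of the k + r = 5 units,
# which live on 5 distinct servers.
load = Counter()
for _ in range(n_reads):
    load.update(random.sample(range(servers), k))

print("max per-server load, whole object:", whole_max)           # 10000
print("max per-server load, with splits :", max(load.values()))  # ~6000
```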
But what’s wrong with split-level replication?
[Figure: splits X, Y, Z of an object, replicated across Servers 1-6, one split per server]
Need at least one copy of each split (X, Y, Z) to retrieve the whole object.
The flexibility of EC
[Figure: units X, Y, Z, W, T spread across Servers 1-5]
Any k (= 3) of the k + r (= 5) units suffice to reconstruct the original object.
EC-Cache: Dealing with straggler/failures
Read k + Δ (Δ < r) out of the k + r units; the read finishes as soon as the first k reads complete.
[Figure: one server failed/slow, but it's OK: the remaining reads still yield k units]
A sketch of this read path follows.
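A minimal sketch of this late-binding read path, assuming hypothetical fetch_unit() and decode() helpers (not EC-Cache's actual API):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def read_object(unit_ids, k, delta, fetch_unit, decode):
    """Issue k + delta parallel reads; decode from the first k to finish."""
    issued = unit_ids[: k + delta]             # k + delta of the k + r units
    pool = ThreadPoolExecutor(max_workers=len(issued))
    futures = [pool.submit(fetch_unit, u) for u in issued]
    done = []
    for f in as_completed(futures):
        try:
            done.append(f.result())
        except Exception:
            continue                           # a failed unit: rely on the rest
        if len(done) == k:                     # first k are enough to decode
            break
    pool.shutdown(wait=False)                  # abandon the stragglers
    return decode(done)
```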
Writes
- Create k splits
- Encode them into r parity units
- Distribute the k + r units uniformly at random to distinct servers (a write-path sketch follows)
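A minimal write-path sketch under the same assumptions; encode_parities() and put() are hypothetical placeholders for the coding and RPC layers:

```python
import random

def write_object(obj: bytes, k, r, servers, encode_parities, put):
    """Split, encode, and place k + r units on distinct random servers."""
    size = -(-len(obj) // k)                     # ceil(len(obj) / k)
    splits = [obj[i * size:(i + 1) * size] for i in range(k)]
    units = splits + encode_parities(splits, r)  # k data + r parity units
    targets = random.sample(servers, k + r)      # uniformly random, distinct
    for server, unit in zip(targets, units):
        put(server, unit)
    return targets                               # placement goes to metadata
```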
Implementation
- Built on top of Alluxio (Tachyon)
- Transparent to the backend storage/caching server
- Reed-Solomon codes instead of the simple linear parities shown earlier
- Intel ISA-L for faster encoding/decoding
- ZooKeeper to manage metadata: location of splits, mapping of objects to splits, which splits to write
- The backend storage server takes care of "ultimate" fault tolerance; EC-Cache deals only with in-memory caches
Evaluation
Metrics:
- Load imbalance
- Latency improvement
(the formulas were lost in extraction; an assumed reconstruction follows)

Setting:
- Selective Replication (SR) vs. EC-Cache
- Zipf distribution for object popularity, highly skewed (p = 0.9)
- Both systems allowed 15% memory overhead
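As an assumption about the paper's definitions (not verbatim from it), the two metrics are plausibly the percent imbalance between the most-loaded server and the average, and the ratio of SR's latency to EC-Cache's:

```latex
\lambda = \frac{L_{\max} - L_{\mathrm{avg}}}{L_{\mathrm{avg}}} \times 100
\qquad
\text{Latency improvement} = \frac{\text{latency}_{\mathrm{SR}}}{\text{latency}_{\text{EC-Cache}}}
```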
Evaluation – Read latency
[Figure: read latency results]
Evaluation – Load Balancing
[Figure: load balancing results]
Evaluation – Varying object sizes
[Figures: median latency and tail latency for varying object sizes]
Evaluation – Tail latency improvement from additional reads
[Figure: tail latency improvement from additional reads]
Summary
- EC used for in-memory object caching
- Effective for load balancing and reducing read latency
- Suitable workloads: immutable objects, because an update to any data unit would require re-encoding a new set of parities
- Cut-off for small objects: the coding overhead cannot be amortized, so selective replication is still used for them
Discussions
- EC-Cache adds overheads for encoding/decoding, metadata, and networking; in what contexts can they be tolerated or amortized?
- What design would be good for frequent in-place updates? Any existing solutions?
- What policy could minimize the eviction of splits needed by current read requests?