EC-Cache: Load-balanced, Low-latency Cluster Caching with Online Erasure Coding
K. V. Rashmi, Mosharaf Chowdhury, Jack Kosaian, Ion Stoica, Kannan Ramchandran
Presented by Haoran Wang
EECS 582 – F16
Background
- The trend of in-memory caching for object stores
  - Greatly reduces disk I/O, e.g. Alluxio (Tachyon)
- Load imbalance
  - Skew of object popularity
  - Background network traffic congestion
Erasure coding in a nutshell
- Fault tolerance in a space-efficient fashion
- Trades data availability for space efficiency
Replication
[Figure: three data units X, Y, Z]
Replication
[Figure: X, Y, Z each stored twice]
Replication factor = 2
Erasure Coding
[Figure: original data units X, Y, Z]
Erasure Coding
[Figure: data units X, Y, Z encoded into parity units W, T]
Linear parity:
W = a11·X + a12·Y + a13·Z
T = a21·X + a22·Y + a23·Z
k = 3, r = 2 (three data units, two parity units)
A well-known choice of coefficient matrix (e.g. a Vandermonde matrix, as used in Reed-Solomon codes) makes any k of the k + r stored units sufficient to recover the data. A sketch of this encoding follows.
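To make the algebra concrete, here is a minimal encoding sketch in Python/NumPy. It works over the reals purely for illustration; production systems (and EC-Cache, per the implementation slide) use Reed-Solomon codes over finite fields such as GF(2^8). The specific coefficient matrix and all names are illustrative assumptions, not the paper's code.

```python
import numpy as np

# Toy (k = 3, r = 2) linear-parity encoder over the reals.
# Real systems use Reed-Solomon codes over GF(2^8); this only
# illustrates the algebra from the slides.
k, r = 3, 2

# Vandermonde-style coefficients: rows are (a11, a12, a13) and
# (a21, a22, a23). With these, any 3 of the 5 stored units are
# enough to recover the data (see the decoding sketch later).
A = np.vander(np.arange(1, r + 1), k, increasing=True).astype(float)
# A == [[1, 1, 1],
#       [1, 2, 4]]

def encode(data):
    """data: length-k vector (x, y, z) -> parity units (w, t)."""
    return A @ data

data = np.array([7.0, 11.0, 13.0])  # x, y, z
w, t = encode(data)
print(w, t)                          # 31.0 81.0
```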
Fault tolerance - Replication
[Figure: X, Y, Z replicated twice; one copy lost]
Survived 1 failure
Fault tolerance - Replication
[Figure: two copies lost, belonging to different units]
Survived 2 failures
Fault tolerance - Replication
[Figure: both copies of the same unit lost]
Cannot survive
Fault tolerance - Replication
Total storage is 6 units, yet it is only guaranteed to survive one failure.
Fault tolerance - EC
[Figure: the same linear-parity encoding as above, with k = 3, r = 2]
Fault tolerance - EC
Notation: an uppercase letter (e.g. X) is the symbol that represents a data unit; the corresponding lowercase letter (e.g. x) is its actual stored value.
Translated into linear equations:
X = x
Y = y
Z = z
a11·X + a12·Y + a13·Z = w
a21·X + a22·Y + a23·Z = t
Fault tolerance - EC
Translated into linear equations, the system is still solvable, with a unique solution (set).
Fault tolerance - EC
Translated into linear equations: 5 equations, 3 variables.
Can survive ANY 2 failures, because N variables can be solved from any N linearly independent equations.
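Continuing the toy example, recovery is literally solving the surviving equations. A minimal sketch, assuming the same illustrative coefficient matrix as in the encoding sketch above:

```python
import numpy as np

# Recovery = solving the surviving linear equations (toy, over the reals).
# Generator G = [I; A]: rows 0-2 "store" x, y, z directly; rows 3-4
# store the parities w, t.
k = 3
A = np.vander(np.array([1, 2]), k, increasing=True).astype(float)
G = np.vstack([np.eye(k), A])          # 5 equations in 3 unknowns

data = np.array([7.0, 11.0, 13.0])     # x, y, z
stored = G @ data                       # the 5 stored units

# Lose ANY two units, e.g. unit 0 (x) and unit 3 (w):
alive = [1, 2, 4]
recovered = np.linalg.solve(G[alive], stored[alive])
assert np.allclose(recovered, data)     # x, y, z recovered exactly
```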
Fault tolerance – EC
Needs only 5 storage units to survive any 2 failures, i.e. (k + r)/k = 5/3 ≈ 1.67× storage overhead. Compare replication: (6, 1), i.e. 6 units at 2× overhead, yet only 1 guaranteed survivable failure.
What’s the tradeoff?
Saving space vs. the time needed to reconstruct (solving linear equations).
Switch of context
- EC for fault tolerance → EC for load balancing and low latency
- Granularity: a single data object → splits of an object
- What about the tradeoff? EC-Cache deals primarily with in-memory caches, so reconstruction time is not a problem
Load balancing
4 pieces of data: Hot, Hot, Cold, Cold
Load balancing
[Figure: one Hot and one Cold piece paired on each server]
Seemingly balanced
Load balancing
What if some data is super hot? Super hot >> hot + cold + cold.
[Figure: the server holding the super-hot piece is overloaded]
Load balancing
- Introducing splits: "distribute" the hotness of a single object
- Also good for read latency: read splits in parallel (a toy simulation of the effect follows)
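A back-of-the-envelope simulation of how splitting flattens load. The parameters (5 servers, k = 3 units fetched per read) are illustrative assumptions, not the paper's setup:

```python
import random
from collections import Counter

servers, k, n_reads = 5, 3, 10_000   # illustrative parameters

# Whole-object caching: every read of the hot object hits one server.
whole_max = n_reads

# Split caching: each read fetches any k of the k + r = 5 units,
# which live on 5 distinct servers.
load = Counter()
for _ in range(n_reads):
    load.update(random.sample(range(servers), k))

print("max per-server load, whole object:", whole_max)           # 10000
print("max per-server load, with splits :", max(load.values()))  # ~6000
```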
But what’s wrong with split-level replication?
[Figure: splits X, Y, Z of an object, replicated across Servers 1-6, one split per server]
Need at least one copy of each split (X, Y, Z) to retrieve the whole object.
The flexibility of EC
[Figure: units X, Y, Z, W, T spread across Servers 1-5]
Any k (= 3) of the k + r (= 5) units suffice to reconstruct the original object.
EC-Cache: Dealing with straggler/failures
Read k + Δ (Δ < r) out of the k + r units; the read finishes as soon as the first k reads complete.
[Figure: one server failed/slow, but it's OK: the remaining reads still yield k units]
A sketch of this read path follows.
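A minimal sketch of this late-binding read path, assuming hypothetical fetch_unit() and decode() helpers (not EC-Cache's actual API):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def read_object(unit_ids, k, delta, fetch_unit, decode):
    """Issue k + delta parallel reads; decode from the first k to finish."""
    issued = unit_ids[: k + delta]             # k + delta of the k + r units
    pool = ThreadPoolExecutor(max_workers=len(issued))
    futures = [pool.submit(fetch_unit, u) for u in issued]
    done = []
    for f in as_completed(futures):
        try:
            done.append(f.result())
        except Exception:
            continue                           # a failed unit: rely on the rest
        if len(done) == k:                     # first k are enough to decode
            break
    pool.shutdown(wait=False)                  # abandon the stragglers
    return decode(done)
```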
Writes
- Create k splits
- Encode them into r parity units
- Distribute the k + r units uniformly at random to distinct servers (a write-path sketch follows)
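A minimal write-path sketch under the same assumptions; encode_parities() and put() are hypothetical placeholders for the coding and RPC layers:

```python
import random

def write_object(obj: bytes, k, r, servers, encode_parities, put):
    """Split, encode, and place k + r units on distinct random servers."""
    size = -(-len(obj) // k)                     # ceil(len(obj) / k)
    splits = [obj[i * size:(i + 1) * size] for i in range(k)]
    units = splits + encode_parities(splits, r)  # k data + r parity units
    targets = random.sample(servers, k + r)      # uniformly random, distinct
    for server, unit in zip(targets, units):
        put(server, unit)
    return targets                               # placement goes to metadata
```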
Implementation
- Built on top of Alluxio (Tachyon)
- Transparent to the backend storage/caching server
- Reed-Solomon codes instead of the simple linear parities shown earlier
- Intel ISA-L for faster encoding/decoding
- ZooKeeper to manage metadata: location of splits, mapping of objects to splits, which splits to write
- The backend storage server takes care of "ultimate" fault tolerance; EC-Cache deals only with in-memory caches
Evaluation
Metrics:
- Load imbalance
- Latency improvement
(the formulas were lost in extraction; an assumed reconstruction follows)

Setting:
- Selective Replication (SR) vs. EC-Cache
- Zipf distribution for object popularity, highly skewed (p = 0.9)
- Both systems allowed 15% memory overhead
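As an assumption about the paper's definitions (not verbatim from it), the two metrics are plausibly the percent imbalance between the most-loaded server and the average, and the ratio of SR's latency to EC-Cache's:

```latex
\lambda = \frac{L_{\max} - L_{\mathrm{avg}}}{L_{\mathrm{avg}}} \times 100
\qquad
\text{Latency improvement} = \frac{\text{latency}_{\mathrm{SR}}}{\text{latency}_{\text{EC-Cache}}}
```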
Evaluation – Read latency
[Figure: read latency results]
Evaluation – Load Balancing
[Figure: load balancing results]
Evaluation – Varying object sizes
[Figures: median latency and tail latency for varying object sizes]
Evaluation – Tail latency improvement from additional reads
[Figure: tail latency improvement from additional reads]
Summary
- EC used for in-memory object caching
- Effective for load balancing and reducing read latency
- Suitable workloads: immutable objects, because an update to any data unit would require re-encoding a new set of parities
- Cut-off for small objects: the coding overhead cannot be amortized, so selective replication is still used for them
Discussions
- EC-Cache adds overheads for encoding/decoding, metadata, and networking; in what contexts can they be tolerated or amortized?
- What design would be good for frequent in-place updates? Any existing solutions?
- What policy could minimize the eviction of splits needed by current read requests?