Presentation Transcript


Availability in Globally Distributed Storage Systems

Daniel Ford, François Labelle, Florentina I. Popovici, Murray Stokely, Van-Anh Truong, Luiz Barroso, Carrie Grimes, and Sean Quinlan

Presented by Nabeel


Distributed Storage System

- Exponential increase in storage needs
- Uses a shared-nothing architecture
- Uses low-cost commodity hardware
- A software layer provides fault tolerance
- Suitable for data-parallel, I/O-bound computations
- Highly scalable and cost-effective

Data Center

[Figure: a data center]

Data Center Components

- Server components
- Racks
- Interconnects
- Cluster of racks

ALL THESE COMPONENTS CAN FAIL

Google File System

[Figure: Google File System architecture]

Cell, Stripe and Chunk

A cell is a single GFS instance; files are divided into chunks, and a stripe is the set of chunks that together encode one unit of data and are spread across nodes.

[Diagram: two cells, each served by its own GFS instance (GFS Instance 1 → CELL 1, GFS Instance 2 → CELL 2); each cell contains stripes (Stripe 1, Stripe 2), and each stripe consists of chunks stored in the cell]

Failure Sources and Events

Failure sources:
- Hardware: disks, memory, etc.
- Software: chunk server process
- Network interconnect
- Power distribution unit

Failure events:
- Node restart
- Planned reboot
- Unplanned reboot

Fault Tolerance Mechanisms

Replication (R = n):
- n identical chunks (replication factor n) are placed on storage nodes in different racks, cells, or datacenters

Erasure coding (RS(n, m)):
- n distinct data blocks and m code blocks per stripe
- Can tolerate the loss of at most m blocks: any n of the n + m blocks suffice to reconstruct the data (see the single-parity sketch below)
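As a minimal illustration of the erasure-coding idea, the sketch below implements only the simplest possible code: a single XOR parity block, i.e., the m = 1 case. Production systems like the one studied here use Reed-Solomon codes with m > 1 (e.g., RS(6, 3)), which this toy deliberately does not implement; all names are illustrative.

```python
# Toy erasure code: n data blocks + 1 XOR parity block (the m = 1 case).
# Any single lost block can be rebuilt from the n surviving blocks.

def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def encode(data_blocks):
    """Return the stripe: n data blocks plus one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def recover(stripe, lost_index):
    """Rebuild the chunk at lost_index by XOR-ing the survivors."""
    survivors = [c for i, c in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

data = [b"AAAA", b"BBBB", b"CCCC"]       # n = 3 data blocks
stripe = encode(data)                    # 4 chunks: 3 data + 1 parity
assert recover(stripe, 1) == b"BBBB"     # lose a data block, rebuild it
assert recover(stripe, 3) == stripe[3]   # lose the parity block, rebuild it
```

Replication is then the trivial special case of storing verbatim copies: R-way replication costs R times the data size, while RS(n, m) costs only (n + m)/n times, at the price of more expensive encoding and reconstruction.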

Replication

[Diagram: one chunk stored as 5 replicas (R = 5)]

Fast encoding/decoding
Very space inefficient

Erasure Coding

[Diagram: n data blocks → Encode → n + m blocks (n data + m code blocks)]


Erasure Coding

Highly space efficient
Slow encoding/decoding

[Diagram: n data blocks → Encode → n + m blocks (n data + m code blocks) → Decode → n data blocks]

Goal of the Paper

- Characterizes the availability properties of cloud storage systems
- Proposes an availability model that informs data placement and replication strategies

Agenda

- Introduction
- Findings from the fleet
- Correlated failures
- Modeling availability data
- Conclusion

CDF of Node Unavailability

[Figure: cumulative distribution of node unavailability durations]

Average Availability and MTTF

Average availability of N nodes:

A_N = (sum of node uptimes) / (sum of node uptimes + downtimes)

Mean Time To Failure:

MTTF = uptime / no. of failures

[Timeline diagram: uptime measured as the interval between Failure 1 and Failure 2]
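A minimal sketch of the two formulas above, assuming a hypothetical per-node event log of alternating (uptime, downtime) periods in hours; the data and layout are invented for illustration.

```python
# Average availability A_N and MTTF from per-node (uptime, downtime) periods.
# Each inner list is one node's history in hours; each tuple ends in a failure.
node_periods = [
    [(700.0, 2.0), (300.0, 0.5)],   # node 1: two failures
    [(995.0, 5.0)],                 # node 2: one failure
]

total_up = sum(up for node in node_periods for up, _ in node)
total_down = sum(down for node in node_periods for _, down in node)
num_failures = sum(len(node) for node in node_periods)

availability = total_up / (total_up + total_down)   # A_N
mttf = total_up / num_failures                      # MTTF in hours

print(f"A_N  = {availability:.4f}")    # -> 0.9963
print(f"MTTF = {mttf:.1f} hours")      # -> 665.0 hours
```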

CDF of Node Unavailability by Cause

[Figure: CDF of node unavailability durations, broken down by cause]

Node Unavailability

[Figure: node unavailability over time]

Correlated Failures

Failure domain:
- A set of machines that fail simultaneously from a common source of failure

Failure burst:
- A sequence of node failures, each occurring within a time window w of the next (see the grouping sketch below)
- 37% of all failures are part of a burst of at least 2 nodes
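A minimal sketch of this burst definition: sort the failure timestamps and start a new burst whenever the gap to the previous failure exceeds w. The paper uses a window of w = 120 s; the timestamps below are invented.

```python
# Group node failure timestamps (seconds) into failure bursts: consecutive
# failures at most `w` seconds apart belong to the same burst.
def failure_bursts(timestamps, w=120.0):
    bursts, current = [], []
    for t in sorted(timestamps):
        if current and t - current[-1] > w:
            bursts.append(current)
            current = []
        current.append(t)
    if current:
        bursts.append(current)
    return bursts

events = [0.0, 30.0, 90.0, 1000.0, 1050.0, 5000.0]   # invented failure times
print([len(b) for b in failure_bursts(events)])      # -> [3, 2, 1]
# Bursts of size >= 2 are the (potentially) correlated failures.
```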

Failure Burst

[Figure sequence (slides 22-29): timeline examples of failure bursts]

Domain Related Failures

- Domain-related issues are causes of correlated failures
- A metric is devised:
  - To determine whether a failure burst is domain-related or random
  - To evaluate the importance of domain diversity in cell design and data placement

Rack Affinity

Rack affinity score:
- Measures the rack concentration of a burst
- Defined as the number of ways of choosing 2 nodes of the burst within the same rack: the sum over affected racks of k_i(k_i - 1)/2, where k_i is the number of nodes affected in the i-th affected rack
- Example: for bursts with per-rack counts (1, 4) and (1, 1, 1, 2), the scores are 6 and 1; the first burst is far more rack-concentrated

Rack affinity:
- The probability that a burst of the same size affecting randomly chosen nodes has a smaller affinity score (a sketch of the score follows)
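A minimal sketch of the score as defined above, representing a burst simply by its per-rack node counts:

```python
# Rack affinity score of a burst: sum over affected racks of C(k_i, 2),
# where k_i is the number of failed nodes in rack i.
def affinity_score(rack_counts):
    return sum(k * (k - 1) // 2 for k in rack_counts)

print(affinity_score([1, 4]))        # -> 6  (rack-concentrated burst)
print(affinity_score([1, 1, 1, 2]))  # -> 1  (nearly rack-diverse burst)
```

Comparing the score of an observed burst against the score distribution of random bursts of the same size yields the rack affinity probability itself.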

Stripe MTTF vs. Burst Size

[Figure sequence (slides 32-34): stripe MTTF as a function of failure burst size]

Trace Simulation

[Figure: trace-based simulation results]

Markov Model

To model & understand the impact of hardware and software changes in availabilityFocused on the availability of a stripeState : No. of available chunks (in the stripe)Transition : Rates by which a stripe moves to the next state due to:Chunk Failure ( reduces available chunks)Chunk Recoveries ( increases available chunks)36Slide37

Markov Chain

[Figure: Markov chain over stripe states, with failure and recovery transitions]

Markov Model Findings

RS(6, 3) with no correlated failures:
- A 10% reduction in recovery time yields a 19% reduction in unavailability

With correlated failures:
- MTTF drops once correlated failures are modeled
- A 90% reduction in recovery time yields only a 6% reduction in unavailability
- Correlation also reduces the benefit of increased data redundancy

Markov Model Findings (cont.)

[Figure: Markov model results]

Markov Model Findings (cont.)

R = 3, effect on availability:
- A 10% reduction in disk latency errors: a negative effect (?)
- A 10% reduction in disk failure rate: a 1.5% improvement
- A 10% reduction in node failure rate: an 18% improvement
- Improvements below the node layer of the storage stack do not significantly improve data availability

Single Cell vs. Multi-Cell

- Trade-off between availability and inter-cell recovery bandwidth
- Multi-cell replication yields a higher MTTF

Single Cell vs. Multi-Cell (cont.)

[Diagram: in the single-cell layout, replicas R1-R4 all reside in CELL A; in the multi-cell layout, the replicas are split between CELL A and CELL B]

Conclusion

- Correlation among node failures is important
- Correlated failures share common failure domains
- Most unavailability periods are transient and differ significantly by cause
- Reduce reboot times for kernel upgrades
- The findings provide feedback for improving:
  - Replication and encoding schemes
  - Data placement strategies
  - Understanding of the primary causes of data unavailability
