Uploaded On 2018-09-21

Presentation Transcript

Slide1

Zettabyte Reliability with Flexible End-to-end Data Integrity

Yupu Zhang, Daniel Myers, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau
University of Wisconsin - Madison

5/9/2013

Slide2

Data Corruption

Imperfect hardware
  Disk, memory, controllers [Bairavasundaram07, Schroeder09, Anderson03]
Buggy software
  Kernel, file system, firmware [Engler01, Yang04, Weinberg04]
Techniques to maintain data integrity
  Detection: Checksums [Stein01, Bartlett04]
  Recovery: RAID [Patterson88, Corbett04]

Slide3

In Reality

Corruption still occurs and goes undetected
  Existing checks are usually isolated
  High-level checks are limited (e.g., ZFS)
Comprehensive protection is needed

[Figure labels: Disk ECC, Memory ECC (Isolated Protection); Limited Protection]

Slide4

Previous State of the Art

End-to-end Data Integrity
  A checksum for each data block is generated and verified by the application
  The same checksum protects the data throughout the entire stack
  A strong checksum is usually preferred
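The end-to-end idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dict stands in for the whole storage stack, SHA-256 stands in for "a strong checksum", and the names write_block and read_block are hypothetical, not part of any system described in the talk.

```python
import hashlib


def write_block(store, key, data):
    """Application-side write: generate the checksum before the data enters the stack."""
    checksum = hashlib.sha256(data).digest()  # stand-in for a strong checksum
    store[key] = (bytes(data), checksum)      # the same checksum travels with the data


def read_block(store, key):
    """Application-side read: verify the checksum after the data leaves the stack."""
    data, checksum = store[key]
    if hashlib.sha256(data).digest() != checksum:
        raise IOError("checksum mismatch for block %r" % (key,))
    return data
```

Because only the application generates and verifies the checksum, a corruption anywhere in between is caught, but only when the block is finally read back.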

[Figure labels: Write Path, Read Path]

Slide5

Two Drawbacks

Performance
  Repeatedly accessing data from in-memory cache
  Strong checksum means high overhead
Timeliness
  It is too late to recover from corruption that occurs before a block is written to disk

[Figure labels: Write Path, Generate Checksum; unbounded time; Read Path, Verify Checksum, FAIL]

Slide6

Flexible End-to-end Data Integrity

Goal: balance performance and reliability
  Change checksum across components or over time
Performance
  Fast but weaker checksum for in-memory data
  Slow but stronger checksum for on-disk data
Timeliness
  Each component is aware of the checksum
  Verification can catch corruption in time

Slide7

Our Contribution

Modeling
  Framework to reason about reliability of storage systems
  Reliability goal: Zettabyte Reliability
    At most one undetected corruption per Zettabyte read
Design and implementation
  Zettabyte-Reliable ZFS (Z2FS)
    ZFS with flexible end-to-end data integrity

Slide8

Results

Reliability
  Z2FS is able to provide Zettabyte reliability
    ZFS: Petabyte at best
  Z2FS detects and recovers from corruption in time
Performance
  Comparable to ZFS (less than 10% overhead)
  Overall faster than the straightforward end-to-end approach (up to 17% in some cases)

Slide9

Outline

Introduction
Analytical Framework
  Overview
  Example
From ZFS to Z2FS
Implementation
Evaluation
Conclusion

Slide10

Overview of the Framework

Goal
  Analytically evaluate and compare reliability of storage systems
Silent Data Corruption
  Corruption that is undetected by existing checks
Metric
  Probability of undetected data corruption when reading a data block from the system (per I/O)
  Reliability Score = -log10(P), where P is that probability

Slide11

Models for the Framework

Hard disk
  Undetected Bit Error Rate ([symbol not preserved])
  Stable, not related to time
  Disk Reliability Index = [formula not preserved]
Memory
  Failure in Time (FIT) / Mbit ([symbol not preserved])
  Longer residency time, more likely corrupted
  Memory Reliability Index = [formula not preserved]
Checksum
  Probability of undetected corruption on a device with a checksum: [formula not preserved]

Slide12

Calculating the Probability of Undetected Corruption

Focus on the lifetime of a block
  From it being generated to it being read
  Across multiple components
Find all silent corruption scenarios
  The probability is the sum of the probabilities of each silent corruption scenario during the lifetime of the block in the storage system
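The summation can be illustrated with a small Python sketch. The scenario names and probabilities below are hypothetical placeholders, not values from the talk, and the score is assumed to be -log10 of the summed probability, which is consistent with the 17.5 quoted for Zettabyte reliability.

```python
import math


def system_probability(scenario_probs):
    """Sum the probabilities of all silent corruption scenarios in a block's lifetime."""
    return sum(scenario_probs)


def reliability_score(p_undetected):
    """Assumed form of the score: -log10 of the undetected-corruption probability."""
    return -math.log10(p_undetected)


# Hypothetical per-scenario probabilities, chosen only for illustration.
scenarios = {
    "in memory before flush": 1e-14,
    "on disk": 1e-12,
    "in memory after read": 1e-14,
}

p_sys = system_probability(scenarios.values())
score = reliability_score(p_sys)
```

With these made-up inputs the on-disk scenario contributes almost all of the total, so the score stays just below 12: the weakest component dominates the sum.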

 

Slide13

Reliability Goal

Ideally, the probability of undetected corruption should be 0
  It's impossible
Goal: Zettabyte Reliability
  At most one SDC when reading one Zettabyte of data from a storage system
  Assuming a data block is 4KB, the Reliability Score is 17.5
  100MB/s => 2.8 x 10^-6 SDC/year
  17 nines
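The numbers on this slide can be checked with a short calculation. The sketch assumes a binary Zettabyte (2^70 bytes), a 4KB block of 4096 bytes, 100MB/s meaning 100 x 2^20 bytes per second, a 365-day year, and a Reliability Score defined as -log10 of the per-block probability. None of these assumptions is spelled out in the transcript, but together they reproduce the quoted figures.

```python
import math

ZETTABYTE = 2 ** 70              # bytes; binary Zettabyte (assumption)
BLOCK_SIZE = 4 * 2 ** 10         # 4KB data block
SECONDS_PER_YEAR = 365 * 24 * 3600
THROUGHPUT = 100 * 2 ** 20       # 100MB/s in bytes per second (assumption)


def reliability_score(p_undetected):
    """Assumed form of the score: -log10 of the undetected-corruption probability."""
    return -math.log10(p_undetected)


# One silent corruption per Zettabyte read means one bad block in 2**58 blocks.
blocks_per_zettabyte = ZETTABYTE // BLOCK_SIZE
p_goal = 1 / blocks_per_zettabyte
score_goal = reliability_score(p_goal)

# Expected silent corruptions per year when reading continuously at 100MB/s.
sdc_per_year = THROUGHPUT * SECONDS_PER_YEAR / ZETTABYTE
```

Here score_goal comes out at about 17.46, which rounds to the 17.5 on the slide, and sdc_per_year is about 2.8 x 10^-6.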

 

Slide14

Outline

Introduction
Analytical Framework
  Overview
  Example
From ZFS to Z2FS
Implementation
Evaluation
Conclusion

Slide15

Sample Systems

Name      Memory Reliability Index  Disk Reliability Index  Description
Worst     13.4                      10                      Worst memory & worst disk
Consumer  14.2                      12                      Non-ECC memory & regular disk
Server    18.8                      12                      ECC memory & regular disk
Best      18.8                      20                      ECC memory & best disk

Disk Reliability Index = [formula not preserved]
  Regular disk: 12
Memory Reliability Index = [formula not preserved]
  Non-ECC memory: 14.2
  ECC memory: 18.8

Slide16

Example

[Figure: block lifetime across MEM and DISK, from write() to read(), with time points t0, t1, t2, t3]

Assuming there is only one corruption in each scenario
  Each time period is a scenario
  The overall probability = sum of probabilities of each time period
Assuming [value not preserved] seconds (flushing interval)
Residency Time: [value not preserved]

Slide17

Example (cont.)

[Figure: Reliability Score of the Worst, Consumer, Server, and Best systems]

Goal: Zettabyte Reliability
  Score: 17.5
  None achieves the goal
Server & Consumer
  Disk corruption dominates
  Need to protect disk data

Slide18

Outline

Introduction
Analytical Framework
From ZFS to Z2FS
  Original ZFS
  End-to-end ZFS
  Z2FS: ZFS with flexible end-to-end data integrity
Implementation
Evaluation
Conclusion

Slide19

ZFS

[Figure: block lifetime across MEM and DISK, from write() to read(), with time points t0, t1, t2, t3; labels: Fletcher, Generate, Verify]

Only on-disk blocks are protected

Slide20

ZFS (cont.)

[Figure: Reliability Score of the Worst, Consumer, Server, and Best systems]

Goal: Zettabyte Reliability
  Score: 17.5
  Best: only Petabyte
Now memory corruption dominates
  Need end-to-end protection

Slide21

Outline

Introduction
Analytical Framework
From ZFS to Z2FS
  Original ZFS
  End-to-end ZFS
  Z2FS: ZFS with flexible end-to-end data integrity
Implementation
Evaluation
Conclusion

Slide22

End-to-end ZFS

[Figure: block lifetime across MEM and DISK, from write() to read(), with time points t0, t1, t2, t3; label: Fletcher / xor]

Checksum is generated and verified only by the application
Only one type of checksum is used (Fletcher or xor)
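To make the difference between the two checksum types concrete, here is a minimal Python sketch. The Fletcher variant follows the four-accumulator style used by ZFS's fletcher4 but is written from scratch for illustration; it is not the ZFS code, and the zero padding of short inputs is an assumption of this sketch.

```python
import struct

MASK64 = (1 << 64) - 1


def _words32(data):
    """Split data into little-endian 32-bit words, zero-padding the tail (assumption)."""
    if len(data) % 4:
        data = data + b"\x00" * (4 - len(data) % 4)
    return struct.unpack("<%dI" % (len(data) // 4), data)


def xor_checksum(data):
    """Fast but weak: xor of all 32-bit words. Blind to word reordering."""
    acc = 0
    for word in _words32(data):
        acc ^= word
    return acc


def fletcher4(data):
    """Slower but stronger: four running sums, so word position matters."""
    a = b = c = d = 0
    for word in _words32(data):
        a = (a + word) & MASK64
        b = (b + a) & MASK64
        c = (c + b) & MASK64
        d = (d + c) & MASK64
    return (a, b, c, d)


# Two blocks that contain the same words in a different order.
block_1 = struct.pack("<2I", 1, 2)
block_2 = struct.pack("<2I", 2, 1)
same_xor = xor_checksum(block_1) == xor_checksum(block_2)    # True: xor misses the swap
same_fletcher = fletcher4(block_1) == fletcher4(block_2)     # False: Fletcher catches it
```

The swapped-word case is one simple example of why xor is considered the weaker of the two.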

[Figure labels: Generate, Verify]

Slide23

End-to-end ZFS (cont.)

[Figure: Reliability Score of the Worst, Consumer, Server, and Best systems, one panel for Fletcher and one for xor]

Fletcher: provides the best reliability
xor: just falls short of the goal

Slide24

Performance Issue

End-to-end ZFS (Fletcher) is 15% slower than ZFS
End-to-end ZFS (xor) has only 3% overhead
  xor is optimized by the checksum-on-copy technique [Chu96]

System                     Throughput (MB/s)  Normalized
Original ZFS               656.67             100%
End-to-end ZFS (Fletcher)  558.22             85%
End-to-end ZFS (xor)       639.89             97%
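The normalized column and the quoted overheads follow directly from the throughput numbers. A quick check in Python, using only the three measurements above:

```python
# Throughput in MB/s, copied from the table.
throughput = {
    "Original ZFS": 656.67,
    "End-to-end ZFS (Fletcher)": 558.22,
    "End-to-end ZFS (xor)": 639.89,
}

baseline = throughput["Original ZFS"]

# Normalized throughput as a whole percentage of the original ZFS.
normalized = {name: round(100 * mbps / baseline) for name, mbps in throughput.items()}

# Overhead is the percentage of baseline throughput that is lost.
overhead = {name: 100 - pct for name, pct in normalized.items()}
```

This gives 85% and 97% for the two end-to-end variants, that is, 15% and 3% overhead.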

Read 1GB Data from Page Cache

Slide25

Outline

Introduction
Analytical Framework
From ZFS to Z2FS
  Original ZFS
  End-to-end ZFS
  Z2FS: ZFS with flexible end-to-end data integrity
Implementation
Evaluation
Conclusion

Slide26

Z2FS Overview

Goal
  Reduce performance overhead
  Still achieve Zettabyte reliability
Implementation of flexible end-to-end
  Static mode: change checksum across components
    xor as memory checksum and Fletcher as disk checksum
  Dynamic mode: change checksum over time
    For memory checksum, switch from xor to Fletcher after a certain period of time
    Longer residency time => data more likely to be corrupted

Slide27

Static Mode

[Figure: block lifetime across MEM and DISK, from write() to read(), with time points t0, t1, t2, t3; labels: xor, Fletcher, Checksum Chaining, Generate, Verify]

Slide28

Static Mode (cont.)

[Figure: Reliability Score of the Worst, Consumer, Server, and Best systems]

Worst
  Use Fletcher all the way
Server & Best
  xor is good enough as memory checksum
Consumer
  May drop below the goal as residency time increases

Slide29

Evolving to Dynamic Mode

Reliability Score vs residency time for consumer

[Figure: two plots, Static and Dynamic, each marked at 92 sec]

Switching the memory checksum from xor to Fletcher after 92 sec

Slide30

Dynamic Mode

[Figure: block lifetime across MEM and DISK, from write() to read(), with time points t0, t1, t2, t3, t4 and a switch point t_switch; labels: xor, Fletcher, Generate, Verify]

Slide31

Outline

Introduction
Analytical Framework
From ZFS to Z2FS
Implementation
Evaluation
Conclusion

Slide32

Implementation

Attach checksum to all buffers
  User buffer, data page and disk block
Checksum handling
  Checksum chaining & checksum switching
Interfaces
  Checksum-aware system calls (for better protection)
  Checksum-oblivious APIs (for compatibility)
LOC: 6500
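Checksum chaining and switching can be sketched as follows. The transcript does not define chaining, so this sketch assumes it means generating the new checksum before verifying the old one, so that the data is never without a valid checksum. zlib.adler32 (a Fletcher variant) stands in for Fletcher, the byte-wise xor is a simplification, and all function names are hypothetical. The 92-second threshold is the value quoted earlier for the consumer configuration.

```python
import zlib
from functools import reduce

SWITCH_AFTER_SEC = 92  # threshold quoted for the consumer configuration


class CorruptionError(Exception):
    """Raised when an existing checksum no longer matches the data."""


def xor_checksum(data):
    """Weak, fast checksum: xor of all bytes (simplified for illustration)."""
    return reduce(lambda acc, byte: acc ^ byte, data, 0)


def strong_checksum(data):
    """Stand-in for Fletcher; Adler-32 is a Fletcher variant from the standard library."""
    return zlib.adler32(data)


CHECKSUMS = {"xor": xor_checksum, "fletcher": strong_checksum}


def convert(data, old_kind, old_value, new_kind):
    """Chaining (assumed semantics): generate the new checksum, then verify the old one."""
    new_value = CHECKSUMS[new_kind](data)       # 1. new protection is in place first
    if CHECKSUMS[old_kind](data) != old_value:  # 2. only then is the old one checked
        raise CorruptionError("data corrupted under the %s checksum" % old_kind)
    return new_kind, new_value


def memory_checksum_kind(residency_sec, switch_after=SWITCH_AFTER_SEC):
    """Switching: use xor for young pages, Fletcher once a page has stayed long enough."""
    return "xor" if residency_sec < switch_after else "fletcher"
```

In static mode, convert would run when a page moves between memory and disk; in dynamic mode, it would also run in memory once memory_checksum_kind reports that the page has crossed the threshold.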

 

Slide33

Outline

Introduction
Analytical Framework
From ZFS to Z2FS
Evaluation
Conclusion

Slide34

Evaluation

Q1: How does Z2FS handle data corruption?
  Fault injection experiment
Q2: What's the overall performance of Z2FS?
  Micro and macro benchmarks

Slide35

Fault Injection: Z2FS

[Figure: block in MEM after write(), with time points t0, t1 before reaching DISK; labels: xor, Fletcher, Generate, Verify, FAIL]
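A fault-injection run of this kind can be imitated in a few lines. The sketch is an assumption-laden toy, not the experiment from the talk: a dict plays the disk, a byte-wise xor plays the memory checksum, the disk checksum is left out, and flip_bit and flush_to_disk are hypothetical helpers.

```python
from functools import reduce


class CorruptionDetected(Exception):
    """Raised when a cached page fails verification before it reaches the disk."""


def xor_checksum(data):
    """Byte-wise xor, standing in for the in-memory checksum."""
    return reduce(lambda acc, byte: acc ^ byte, data, 0)


def flip_bit(data, bit_index):
    """Inject a fault: flip a single bit in a copy of the data."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def flush_to_disk(disk, key, page, page_checksum):
    """Verify the memory checksum before the page is allowed onto the disk."""
    if xor_checksum(page) != page_checksum:
        raise CorruptionDetected("page %r is corrupt, not written" % (key,))
    disk[key] = page


disk = {}
data = b"block contents"
checksum = xor_checksum(data)   # generated at write() time
cached = flip_bit(data, 3)      # fault injected while the page sits in memory

try:
    flush_to_disk(disk, "b0", cached, checksum)
    detected = False
except CorruptionDetected:
    detected = True
    flush_to_disk(disk, "b0", data, checksum)  # the application rewrites the block
```

The corrupt page never reaches the disk; the failure surfaces at flush time, while the application can still supply a good copy.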

Ask the application to rewrite

Slide36

Overall Performance

[Figure labels: read a 1 GB file; Warm Read-intensive; Dominated by Random I/Os]

Better protection usually means higher overhead
Z2FS helps to reduce the overhead, especially for warm reads

Slide37

Outline

Introduction
Analytical Framework
From ZFS to Z2FS
Evaluation
Conclusion

Slide38

Summary

Problem of straightforward end-to-end data integrity
  Slow performance
  Untimely detection and recovery
Solution: Flexible end-to-end data integrity
  Change checksums across components or over time
Analytical Framework
  Provides insight into the reliability of storage systems
Implementation of Z2FS
  Reduces overhead while still achieving Zettabyte reliability
  Offers early detection and recovery

Slide39

Conclusion

End-to-end data integrity provides comprehensive data protection
One “checksum” may not always fit all
  e.g., strong checksum => high overhead
Flexibility balances reliability and performance
  Every device is different
  Choose the best checksum based on device reliability

Slide40

Thank you!

Questions?

Advanced Systems Lab (ADSL)

University of Wisconsin-Madison

http://www.cs.wisc.edu/adsl

Wisconsin Institute on Software-defined Datacenters in Madison

http://wisdom.cs.wisc.edu/