Slide 1: Zettabyte Reliability with Flexible End-to-end Data Integrity
Yupu Zhang, Daniel Myers, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau
University of Wisconsin - Madison
5/9/2013
Slide 2: Data Corruption
- Imperfect hardware: disks, memory, controllers [Bairavasundaram07, Schroeder09, Anderson03]
- Buggy software: kernel, file system, firmware [Engler01, Yang04, Weinberg04]
- Techniques to maintain data integrity:
  - Detection: checksums [Stein01, Bartlett04]
  - Recovery: RAID [Patterson88, Corbett04]
Slide 3: In Reality
- Corruption still occurs and goes undetected
- Existing checks are usually isolated (e.g., disk ECC and memory ECC each protect only one device)
- High-level checks are limited (e.g., ZFS)
- Comprehensive protection is needed
[Figure: disk ECC and memory ECC provide isolated protection; ZFS provides limited protection]
Slide 4: Previous State of the Art: End-to-end Data Integrity
- A checksum for each data block is generated and verified by the application
- The same checksum protects data throughout the entire stack
- A strong checksum is usually preferred
[Figure: write path and read path through the storage stack]
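The flow just described can be sketched as follows. This is a minimal illustration, not code from the paper: `fletcher32` is a simplified Fletcher variant standing in for the strong application-level checksum, and the dict-based `store` stands in for the whole storage stack.

```python
def fletcher32(data: bytes) -> int:
    """Simplified Fletcher-32 over 16-bit little-endian words."""
    s1 = s2 = 0
    for i in range(0, len(data), 2):
        s1 = (s1 + int.from_bytes(data[i:i + 2], "little")) % 65535
        s2 = (s2 + s1) % 65535
    return (s2 << 16) | s1

def app_write(store: dict, key: str, data: bytes) -> None:
    # The application generates the checksum once, at the top of the stack.
    store[key] = (data, fletcher32(data))

def app_read(store: dict, key: str) -> bytes:
    # The application verifies the same checksum at read time; corruption
    # introduced anywhere in between is detected here.
    data, csum = store[key]
    if fletcher32(data) != csum:
        raise IOError("silent corruption detected")
    return data
```

Note that nothing between `app_write` and `app_read` looks at the checksum; that is exactly the property the next slide criticizes.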
Slide 5: Two Drawbacks
- Performance
  - Data is repeatedly accessed from the in-memory cache
  - A strong checksum means high overhead
- Timeliness
  - It is too late to recover from corruption that occurs before a block is written to disk
[Figure: the checksum is generated on the write path and verified on the read path; an unbounded time may pass before verification FAILs]
Slide 6: Flexible End-to-end Data Integrity
- Goal: balance performance and reliability by changing the checksum across components or over time
- Performance
  - A fast but weaker checksum for in-memory data
  - A slow but stronger checksum for on-disk data
- Timeliness
  - Each component is aware of the checksum
  - Verification can catch corruption in time
Slide 7: Our Contribution
- Modeling
  - A framework to reason about the reliability of storage systems
  - Reliability goal: Zettabyte Reliability, i.e., at most one undetected corruption per Zettabyte read
- Design and implementation
  - Zettabyte-Reliable ZFS (Z2FS): ZFS with flexible end-to-end data integrity
Slide 8: Results
- Reliability
  - Z2FS provides Zettabyte reliability; ZFS achieves Petabyte reliability at best
  - Z2FS detects and recovers from corruption in time
- Performance
  - Comparable to ZFS (less than 10% overhead)
  - Overall faster than the straightforward end-to-end approach (up to 17% in some cases)
Slide 9: Outline
- Introduction
- Analytical Framework
  - Overview
  - Example
- From ZFS to Z2FS
- Implementation
- Evaluation
- Conclusion
Slide 10: Overview of the Framework
- Goal: analytically evaluate and compare the reliability of storage systems
- Silent data corruption (SDC): corruption that is undetected by existing checks
- Metric: P_sdc, the probability of undetected data corruption when reading a data block from the system (per I/O)
- Reliability Score = -log10(P_sdc)
Slide 11: Models for the Framework
- Hard disk
  - Undetected Bit Error Rate (UBER); stable, not related to time
  - Disk Reliability Index = -log10(UBER)
- Memory
  - Failure rate in FIT (Failures in Time) per Mbit; the longer the residency time, the more likely the data is corrupted
  - Memory Reliability Index = -log10(per-bit corruption probability per second)
- Checksum
  - Probability of undetected corruption on a device protected by a checksum
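As a worked example of the memory model: the index can be derived from a FIT rate, where 1 FIT is one failure per 10^9 device-hours. The FIT/Mbit values below are representative figures chosen to reproduce the 14.2 and 18.8 indices used later in the talk; they are assumptions, not numbers from the slides.

```python
import math

def memory_reliability_index(fit_per_mbit: float) -> float:
    # 1 FIT = 1 failure per 10^9 hours; convert FIT/Mbit to a
    # per-bit, per-second corruption probability (1 Mbit = 10^6 bits here).
    per_bit_per_second = fit_per_mbit / (1e9 * 3600 * 1e6)
    return -math.log10(per_bit_per_second)

print(round(memory_reliability_index(25000), 1))  # non-ECC DRAM -> 14.2
print(round(memory_reliability_index(0.6), 1))    # ECC DRAM     -> 18.8
```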
Slide 12: Calculating P_sdc
- Focus on the lifetime of a block: from the time it is generated to the time it is read, across multiple components
- Find all silent corruption scenarios
- P_sdc is the sum of the probabilities of all silent corruption scenarios during the lifetime of the block in the storage system
Slide 13: Reliability Goal
- Ideally, P_sdc would be 0, but that is impossible
- Goal: Zettabyte Reliability, i.e., at most one SDC when reading one Zettabyte of data from a storage system
- Assuming a 4KB data block, the target Reliability Score is 17.5 (about 17 nines)
- At 100 MB/s, this corresponds to 2.8 x 10^-6 SDC/year
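A quick check of the arithmetic behind these numbers, assuming binary units (1 MB = 2^20 bytes, 1 ZB = 2^70 bytes):

```python
import math

BLOCK = 4096   # 4 KB data block
ZB = 2**70     # one Zettabyte

# One SDC per Zettabyte read means one bad block out of ZB/BLOCK block reads.
p_sdc_goal = BLOCK / ZB
score = -math.log10(p_sdc_goal)
print(round(score, 1))          # -> 17.5

# Reading continuously at 100 MB/s for a year covers only a tiny
# fraction of a Zettabyte, hence very few expected SDCs per year.
bytes_per_year = 100 * 2**20 * 365 * 24 * 3600
sdc_per_year = bytes_per_year / ZB
print(f"{sdc_per_year:.1e}")    # -> 2.8e-06
```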
Slide 14: Outline
- Introduction
- Analytical Framework
  - Overview
  - Example
- From ZFS to Z2FS
- Implementation
- Evaluation
- Conclusion
Slide 15: Sample Systems

  Name       Memory Index   Disk Index   Description
  Worst      13.4           10           Worst memory & worst disk
  Consumer   14.2           12           Non-ECC memory & regular disk
  Server     18.8           12           ECC memory & regular disk
  Best       18.8           20           ECC memory & best disk

(Disk Reliability Index: regular disk = 12. Memory Reliability Index: non-ECC memory = 14.2; ECC memory = 18.8.)
Slide 16: Example
[Figure: timeline of a block's lifetime: write() places the block in MEM at t0, it is flushed to DISK at t1, read back into MEM at t2, and returned by read() at t3]
- Assume there is only one corruption in each scenario
- Each time period is a scenario; P_sdc is the sum of the probabilities of each time period
- Residency time: assume t1 - t0 = ... seconds (the flushing interval)
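A minimal sketch of this per-period summation for a single write-then-read lifetime. The three periods, the 30-second flush interval, the 1-second read-side residency, and the use of raw (unchecksummed) device rates are illustrative assumptions, not the paper's exact model:

```python
import math

BITS = 4096 * 8  # 4 KB block

def p_mem(index: float, seconds: float) -> float:
    """Memory corruption probability: per-bit-per-second rate x bits x time."""
    return 10**(-index) * BITS * seconds

def p_disk(index: float) -> float:
    """Disk corruption probability: per-bit UBER x bits (time-independent)."""
    return 10**(-index) * BITS

def reliability_score(mem_index: float, disk_index: float,
                      mem_seconds: float) -> float:
    # Sum the probability of each period of the block's lifetime:
    # in memory before the flush, on disk, and briefly in memory after read.
    p = p_mem(mem_index, mem_seconds) + p_disk(disk_index) + p_mem(mem_index, 1)
    return -math.log10(p)

# Consumer-like system (indices 14.2 / 12), 30 s flush interval.
print(round(reliability_score(14.2, 12, 30), 1))  # -> 7.4 with these placeholders
```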
Slide 17: Example (cont.)
[Figure: Reliability Scores for the Worst, Consumer, Server, and Best systems against the Zettabyte Reliability goal of 17.5]
- None of the systems achieves the goal
- For Server & Consumer, disk corruption dominates: on-disk data needs protection
Slide 18: Outline
- Introduction
- Analytical Framework
- From ZFS to Z2FS
  - Original ZFS
  - End-to-end ZFS
  - Z2FS: ZFS with flexible end-to-end data integrity
- Implementation
- Evaluation
- Conclusion
Slide 19: ZFS
[Figure: timeline: write() at t0; a Fletcher checksum is generated at t1 when the block is flushed to DISK and verified at t2 when it is read back into MEM; read() at t3]
- Only on-disk blocks are protected
Slide 20: ZFS (cont.)
[Figure: Reliability Scores for the Worst, Consumer, Server, and Best systems against the goal of 17.5]
- Even Best achieves only Petabyte reliability
- Now memory corruption dominates: end-to-end protection is needed
Slide 21: Outline
- Introduction
- Analytical Framework
- From ZFS to Z2FS
  - Original ZFS
  - End-to-end ZFS
  - Z2FS: ZFS with flexible end-to-end data integrity
- Implementation
- Evaluation
- Conclusion
Slide 22: End-to-end ZFS
[Figure: timeline: the application generates the checksum at write() (t0) and verifies it at read() (t3); the block passes through MEM and DISK unverified in between]
- The checksum is generated and verified only by the application
- Only one type of checksum is used (Fletcher or xor)
Slide 23: End-to-end ZFS (cont.)
[Figure: Reliability Scores for the Worst, Consumer, Server, and Best systems, with Fletcher and with xor]
- Fletcher provides the best reliability
- xor just falls short of the goal
Slide 24: Performance Issue
- End-to-end ZFS (Fletcher) is 15% slower than ZFS
- End-to-end ZFS (xor) has only 3% overhead; xor is optimized by the checksum-on-copy technique [Chu96]

Reading 1 GB of data from the page cache:

  System                      Throughput (MB/s)   Normalized
  Original ZFS                656.67              100%
  End-to-end ZFS (Fletcher)   558.22              85%
  End-to-end ZFS (xor)        639.89              97%
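The checksum-on-copy technique [Chu96] folds the checksum computation into a data copy that must happen anyway. A hedged sketch of the idea (a real implementation would do this word-by-word inside the kernel's copy routine, not in Python):

```python
def copy_with_xor(src: bytes) -> tuple[bytearray, int]:
    """Copy src while folding a 64-bit xor checksum into the same loop."""
    dst = bytearray(len(src))
    csum = 0
    for i in range(0, len(src), 8):
        word = src[i:i + 8]           # the last word may be short; fine for xor
        dst[i:i + len(word)] = word   # the copy the system performs anyway
        csum ^= int.from_bytes(word, "little")
    return dst, csum
```

Since each word is already in hand during the copy, the extra xor costs almost nothing, which is consistent with the 3% overhead shown above.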
Slide 25: Outline
- Introduction
- Analytical Framework
- From ZFS to Z2FS
  - Original ZFS
  - End-to-end ZFS
  - Z2FS: ZFS with flexible end-to-end data integrity
- Implementation
- Evaluation
- Conclusion
Slide 26: Z2FS Overview
- Goals: reduce the performance overhead while still achieving Zettabyte reliability
- Implementation of flexible end-to-end integrity:
  - Static mode: change the checksum across components (xor as the memory checksum, Fletcher as the disk checksum)
  - Dynamic mode: change the checksum over time (switch the memory checksum from xor to Fletcher after a certain period, since the longer the residency time, the more likely the data has been corrupted)
Slide 27: Static Mode
[Figure: timeline: the application generates an xor checksum at write() (t0); when the block is flushed to DISK at t1, checksum chaining verifies the xor checksum and generates a Fletcher checksum; the Fletcher checksum is verified when the block is read back into MEM at t2; the application verifies the xor checksum at read() (t3)]
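Checksum chaining at the memory/disk boundary can be sketched as follows. `xor64` and `fletcher32` are simplified stand-ins for the real checksums, and the structure is illustrative rather than Z2FS's actual code:

```python
def xor64(data: bytes) -> int:
    """Weak, fast memory checksum: xor of 64-bit words."""
    csum = 0
    for i in range(0, len(data), 8):
        csum ^= int.from_bytes(data[i:i + 8], "little")
    return csum

def fletcher32(data: bytes) -> int:
    """Stronger disk checksum: simplified Fletcher over 16-bit words."""
    s1 = s2 = 0
    for i in range(0, len(data), 2):
        s1 = (s1 + int.from_bytes(data[i:i + 2], "little")) % 65535
        s2 = (s2 + s1) % 65535
    return (s2 << 16) | s1

def chain_to_disk(data: bytes, mem_csum: int) -> int:
    # Generate the new (disk) checksum first, then verify the old (memory)
    # one, so the block is never covered by neither checksum.
    disk_csum = fletcher32(data)
    if xor64(data) != mem_csum:
        raise IOError("corruption caught at the memory/disk boundary")
    return disk_csum
```

Generating the new checksum before verifying the old one means the two checksums' coverage windows overlap, leaving no unprotected gap during the handoff.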
Slide 28: Static Mode (cont.)
[Figure: Reliability Scores for the Worst, Consumer, Server, and Best systems]
- Worst: must use Fletcher all the way
- Server & Best: xor is good enough as the memory checksum
- Consumer: may drop below the goal as the residency time increases
Slide 29: Evolving to Dynamic Mode
[Figure: Reliability Score vs. residency time for the Consumer system; the Static curve drops below the goal at 92 sec, while the Dynamic curve stays above it]
- Dynamic mode: switch the memory checksum from xor to Fletcher after 92 sec
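The dynamic-mode policy on this slide is essentially a threshold check. A sketch, reusing a simplified `xor64`/`fletcher32` pair; the checksum functions and structure are illustrative, and the 92-second threshold is the Consumer figure from this slide (it would differ per system):

```python
def xor64(data: bytes) -> int:
    csum = 0
    for i in range(0, len(data), 8):
        csum ^= int.from_bytes(data[i:i + 8], "little")
    return csum

def fletcher32(data: bytes) -> int:
    s1 = s2 = 0
    for i in range(0, len(data), 2):
        s1 = (s1 + int.from_bytes(data[i:i + 2], "little")) % 65535
        s2 = (s2 + s1) % 65535
    return (s2 << 16) | s1

SWITCH_AFTER_S = 92  # residency time at which xor no longer meets the goal

def memory_checksum(data: bytes, residency_s: float) -> tuple[str, int]:
    # Recently written pages keep the cheap xor; pages resident longer
    # than the threshold are re-checksummed with the stronger Fletcher.
    if residency_s < SWITCH_AFTER_S:
        return ("xor", xor64(data))
    return ("fletcher", fletcher32(data))
```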
Slide 30: Dynamic Mode
[Figure: timeline: the application generates an xor checksum at write() (t0); when the block is flushed to DISK at t1, the xor checksum is verified and a Fletcher checksum is generated; when the block is read back into MEM at t2, the Fletcher checksum is verified and an xor checksum is regenerated; at t_switch the memory checksum switches from xor to Fletcher; the application verifies the checksum at read() (t4)]
Slide 31: Outline
- Introduction
- Analytical Framework
- From ZFS to Z2FS
- Implementation
- Evaluation
- Conclusion
Slide 32: Implementation
- Attach a checksum to all buffers: user buffer, data page, and disk block
- Checksum handling: checksum chaining and checksum switching
- Interfaces
  - Checksum-aware system calls (for better protection)
  - Checksum-oblivious APIs (for compatibility)
- LOC: 6500
Slide 33: Outline
- Introduction
- Analytical Framework
- From ZFS to Z2FS
- Evaluation
- Conclusion
Slide 34: Evaluation
- Q1: How does Z2FS handle data corruption? Fault injection experiments.
- Q2: What is the overall performance of Z2FS? Micro- and macro-benchmarks.
Slide 35: Fault Injection: Z2FS
[Figure: timeline: the application generates an xor checksum at write() (t0); when the block is flushed to DISK at t1, the xor verification FAILs, catching the injected corruption before the block reaches disk]
- On a verification failure during the write path, Z2FS asks the application to rewrite the block
Slide 36: Overall Performance
[Figure: throughput of reading a 1 GB file for a warm, read-intensive workload, and a workload dominated by random I/Os]
- Better protection usually means higher overhead
- Z2FS helps reduce the overhead, especially for warm reads
Slide 37: Outline
- Introduction
- Analytical Framework
- From ZFS to Z2FS
- Evaluation
- Conclusion
Slide 38: Summary
- Problems with straightforward end-to-end data integrity: slow performance; untimely detection and recovery
- Solution: flexible end-to-end data integrity, changing checksums across components or over time
- Analytical framework: provides insight into the reliability of storage systems
- Implementation of Z2FS: reduces overhead while still achieving Zettabyte reliability, and offers early detection and recovery
Slide 39: Conclusion
- End-to-end data integrity provides comprehensive data protection
- One checksum may not always fit all (e.g., a strong checksum implies high overhead)
- Flexibility balances reliability and performance: every device is different, so choose the best checksum based on device reliability
Slide 40: Thank You!
Questions?

Advanced Systems Lab (ADSL)
University of Wisconsin-Madison
http://www.cs.wisc.edu/adsl

Wisconsin Institute on Software-defined Datacenters in Madison
http://wisdom.cs.wisc.edu/