Slide1
A Case for Redundant Arrays of Inexpensive Disks (RAID)
Ofir Weisse
EECS 582 – W16
Slide2
About the paper
Published in 1988
Summarized existing technologies
Slide3
Motivation – What’s In the Box? (That we call a computer)
Slide4
Motivation – What’s In the Box? (2)
CPU
Slide5
Motivation – What’s In the Box? (2)
CPU
Performance measured in MIPS
Estimated to grow as fast as MIPS = 2^(Year - 1984) (Joy's law, as quoted in the paper)
Slide6
Motivation – What’s In the Box? (3)
Memory
1. Capacity measured in bytes
2. Estimated to grow as fast as No. of transistors per chip = 2^(Year - 1964) (Moore's law, as quoted in the paper)
3. Speed increases 40%-100% a year (as of 1988…)
Slide7
Memory Bandwidth
Slide8
Motivation – What’s In the Box? (4)
Storage
Slide9
Motivation – What’s In the Box? (4)
Storage
1. Capacity measured in bytes
2. Density estimated to grow as fast as MAD = 10^((Year - 1971) / 10), where MAD is the maximal areal density (as quoted in the paper)
3. What about speed?
Slide10
Motivation – What’s In the Box? (4)
Storage
Slide11
The Speedup Formula
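Effective speedup = 1 / ((1 - f) + f / k)
where f is the fraction of time spent in the component being improved and k is how many times faster that component becomes
Slide12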
The Speedup Formula – let’s plug in!
Effective speedup = 1 / ((1 - f) + f / k)
Time spent in processing = 10%: f = 0.1
MIPS are doubled: k = 2
Actual speedup: ???
Slide13
The Speedup Formula – let’s plug in!
Effective speedup = 1 / ((1 - f) + f / k)
Time spent in processing = 10%: f = 0.1
MIPS are doubled: k = 2
Actual speedup: 1 / (0.9 + 0.1 / 2) = 1.05
Slide14
The Speedup Formula – let’s plug in!
Effective speedup = 1 / ((1 - f) + f / k)
Time spent in accessing memory = 20%: f = 0.2
Speed is doubled: k = 2
Actual speedup: 1 / (0.8 + 0.2 / 2) = 1.11
Slide15
The Speedup Formula – let’s plug in!
Effective speedup = 1 / ((1 - f) + f / k)
Time spent in accessing storage = 70%: f = 0.7
Speed improved by 40%: k = 1.4
Actual speedup: ????
Slide16
The Speedup Formula – let’s plug in!
Effective speedup = 1 / ((1 - f) + f / k)
Time spent in accessing storage = 70%: f = 0.7
Speed improved by 40%: k = 1.4
Actual speedup: 1 / (0.3 + 0.7 / 1.4) = 1.25
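The three plug-in exercises can be checked with a few lines of Python. This is only a sketch: it assumes the speedup formula is the Amdahl's-law form 1 / ((1 - f) + f / k), which reproduces the 1.05, 1.11 and 1.25 shown on the slides. The function name `effective_speedup` is made up for illustration.

```python
def effective_speedup(f: float, k: float) -> float:
    """Overall speedup when a fraction f of the run time becomes k times faster."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 / ((1.0 - f) + f / k)


# The three cases worked through on the slides: (label, f, k)
cases = [
    ("CPU: 10% of the time, 2x faster", 0.1, 2.0),
    ("Memory: 20% of the time, 2x faster", 0.2, 2.0),
    ("Storage: 70% of the time, 1.4x faster", 0.7, 1.4),
]

for label, f, k in cases:
    print(f"{label}: speedup = {effective_speedup(f, k):.2f}")
```

The storage case stands out: a modest 40% improvement to the component that dominates run time helps more than doubling the speed of the CPU or the memory.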
Slide17
The Idea
Replace a Single Large Expensive Disk (SLED) with multiple small, cheap drives
Slide18
Level 0: AID – Array of Inexpensive Disks
Be wise – parallelize!
Slide19
Level 0: AID – Array of Inexpensive Disks
The good:
A write striped across all disks takes only as long as the single slowest disk
A read striped across all disks takes only as long as the single slowest disk
Slide20
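Level 0: AID – Array of Inexpensive Disks
The good:
A write striped across all disks takes only as long as the single slowest disk
A read striped across all disks takes only as long as the single slowest disk
The bad:
MTTF is terrible: the array loses data as soon as any one disk fails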
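A rough way to see why the MTTF collapses is sketched below. It assumes disks fail independently at a constant rate, so the array MTTF is the single-disk MTTF divided by the number of disks. The 30,000-hour figure and the function name `array_mttf` are illustrative assumptions, not values from the slides.

```python
def array_mttf(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of an array with no redundancy: any single disk failure loses data.

    Assumes independent disks with a constant failure rate, so failure rates
    add up and the array MTTF is the single-disk MTTF divided by N.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


disk_mttf = 30_000.0  # hours; illustrative value, not taken from the slides

for n in (1, 10, 100):
    hours = array_mttf(disk_mttf, n)
    print(f"{n:>3} disks: MTTF = {hours:,.0f} hours ({hours / 24:.1f} days)")
```

Under these assumptions a 100-disk array would be expected to lose data within a couple of weeks, which is what motivates adding redundancy in the levels that follow.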
Slide21
Level 1: RAID – Redundant Array of Inexpensive Disks
Duplicate everything!
Slide22
Level 1: RAID – Redundant Array of Inexpensive Disks
The good:
On read – use the disk with the shorter queue and minimum seek time
Can read 2 blocks in “one” read
Mean time to failure (MTTF) is sky-high
Slide23
Level 1: RAID – Redundant Array of Inexpensive Disks
The good:
On read – use the disk with the shorter queue and minimum seek time
Can read 2 blocks in “one” read
Mean time to failure (MTTF) is sky-high
The bad:
Costs twice as much
Writing is always a bit slower, since every write has to complete on both disks
Slide24
Error Correction Codes
Data bit 0 | Data bit 1 | Data bit 2 | Data bit 3 | Data bit 4 | Data bit 5 | Data bit 6 | ECC bit 7 | ECC bit 8 | ECC bit 9 | ECC bit 10
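Seven data bits protected by four check bits matches the shape of a Hamming(11,7) code, so a small sketch of that code may help show how a flipped bit is located and repaired. Two assumptions to keep in mind: the slides do not name the code they have in mind, and this sketch places the check bits at positions 1, 2, 4 and 8 (the textbook layout) rather than at the end of the word as drawn above. All function names are made up for illustration.

```python
PARITY_POSITIONS = (1, 2, 4, 8)            # 1-based positions of the check bits
DATA_POSITIONS = (3, 5, 6, 7, 9, 10, 11)   # 1-based positions of the data bits


def encode(data_bits):
    """Return an 11-bit Hamming codeword (list of 0/1) for 7 data bits."""
    if len(data_bits) != 7:
        raise ValueError("expected exactly 7 data bits")
    word = [0] * 12                        # index 0 is unused
    for pos, bit in zip(DATA_POSITIONS, data_bits):
        word[pos] = bit
    for p in PARITY_POSITIONS:
        # Check bit p covers every position whose index has bit p set.
        parity = 0
        for pos in range(1, 12):
            if pos != p and pos & p:
                parity ^= word[pos]
        word[p] = parity
    return word[1:]


def syndrome(codeword):
    """XOR of the 1-based positions of all set bits; 0 means no error detected."""
    s = 0
    for pos, bit in enumerate(codeword, start=1):
        if bit:
            s ^= pos
    return s


def correct(codeword):
    """Return a copy of the codeword with a single flipped bit (if any) repaired."""
    fixed = list(codeword)
    s = syndrome(fixed)
    if 1 <= s <= len(fixed):
        fixed[s - 1] ^= 1
    return fixed


def extract_data(codeword):
    """Pull the 7 data bits back out of an 11-bit codeword."""
    return [codeword[pos - 1] for pos in DATA_POSITIONS]


data = [1, 0, 1, 1, 0, 0, 1]
sent = encode(data)
damaged = list(sent)
damaged[5] ^= 1                            # flip one bit to simulate an error
print("syndrome points at position", syndrome(damaged))
print("data recovered:", extract_data(correct(damaged)) == data)
```

The syndrome is the position of the flipped bit, which is why four check bits are enough to locate an error anywhere in an 11-bit word.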
(Paper also mentioned 20:5 ECC ratio)
Slide25
RAID Level 2 & bit interleaving
D bit 0  | D bit 1  | D bit 2  | D bit 3  | D bit 4  | D bit 5  | ECC 1 | ECC 2  | ECC 3  | ECC 4
D bit 6  | D bit 7  | D bit 8  | D bit 9  | D bit 10 | D bit 11 | ECC 5 | ECC 6  | ECC 7  | ECC 8
D bit 12 | D bit 13 | D bit 14 | D bit 15 | D bit 16 | D bit 17 | ECC 9 | ECC 10 | ECC 11 | ECC 12
…
Slide26
RAID Level 2 & bit interleaving
D bit 0  | D bit 1  | D bit 2  | D bit 3  | D bit 4  | D bit 5  | ECC 1 | ECC 2  | ECC 3  | ECC 4
D bit 6  | D bit 7  | D bit 8  | D bit 9  | D bit 10 | D bit 11 | ECC 5 | ECC 6  | ECC 7  | ECC 8
D bit 12 | D bit 13 | D bit 14 | D bit 15 | D bit 16 | D bit 17 | ECC 9 | ECC 10 | ECC 11 | ECC 12
…
The good:
Reading 6 sectors takes about as long as reading a single sector from one drive
Redundancy cost is only 4/6 ≈ 67%
Mean time to failure (MTTF) is still very high
Slide27
RAID Level 2 & bit interleaving
D bit 0  | D bit 1  | D bit 2  | D bit 3  | D bit 4  | D bit 5  | ECC 1 | ECC 2  | ECC 3  | ECC 4
D bit 6  | D bit 7  | D bit 8  | D bit 9  | D bit 10 | D bit 11 | ECC 5 | ECC 6  | ECC 7  | ECC 8
D bit 12 | D bit 13 | D bit 14 | D bit 15 | D bit 16 | D bit 17 | ECC 9 | ECC 10 | ECC 11 | ECC 12
…
The bad:
Reading 1 sector requires reading all the other disks to verify correctness
Writing 1 sector requires:
  Reading all 10 disks
  Modifying
  Writing all 10 disks
Slide28
Errors in Practice
Why do we need error correction in hard drives???
Slide29
Checksum
All we need is 1 bit of information
Slide30
RAID Level 3
D bit 0  | D bit 1  | D bit 2  | D bit 3  | Parity 0
D bit 6  | D bit 7  | D bit 8  | D bit 9  | Parity 1
D bit 12 | D bit 13 | D bit 14 | D bit 15 | Parity 2
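The parity column in the layout above can be computed with XOR, and the same operation rebuilds a failed drive from the survivors. The sketch below assumes plain XOR (even) parity over equal-sized blocks. The function names and the block contents are made up for illustration.

```python
from functools import reduce


def parity_of(blocks):
    """XOR equal-length byte blocks together, position by position."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("blocks must all have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks):
    """Rebuild the one missing block (data or parity) from all the others."""
    return parity_of(surviving_blocks)


data_disks = [b"RAID", b"1988", b"disk", b"data"]   # 4 data disks, made-up contents
parity_disk = parity_of(data_disks)

lost = 2                                            # pretend data disk 2 has failed
survivors = [d for i, d in enumerate(data_disks) if i != lost] + [parity_disk]
print("rebuilt disk", lost, "->", reconstruct(survivors))
```

Because XOR is its own inverse, the controller only needs to know which drive failed; the single parity drive then carries enough information to recover it.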
Slide31
RAID Level 3
The good news:
Redundancy cost can be as low as 20% or even less
Mean time to failure (MTTF) is still very high
Large reads of 4 sectors are as fast as reading 1 sector from a single drive (the slowest one)
To write a single bit we don’t have to read from all disks; parity can be updated just by reading and writing 2 drives
Any broken drive can be easily reconstructed
Slide32
RAID Level 3
The bad news:
The bits are interleaved!
To read an entire sector we have to read from 4 disks!
Also, writing a sector requires writing to 5 disks
D bit 0  | D bit 1  | D bit 2  | D bit 3  | Parity 0
D bit 6  | D bit 7  | D bit 8  | D bit 9  | Parity 1
D bit 12 | D bit 13 | D bit 14 | D bit 15 | Parity 2
Slide33
Gather the Bits = RAID Level 4
Using chunk interleaving instead of bit interleaving
Chunk 0  | Chunk 1  | Chunk 2  | Chunk 3  | Parity 0
Chunk 6  | Chunk 7  | Chunk 8  | Chunk 9  | Parity 1
Chunk 12 | Chunk 13 | Chunk 14 | Chunk 15 | Parity 2
Slide34
RAID Level 4
The good news:
MTTF is the same
Can still reconstruct any drive from the other 4
To read a chunk we only need to read a single drive
To write a chunk we write to 2 drives instead of 5
We can read 4 chunks in parallel
We can write 4 chunks in parallel, if they are on the same “line”
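The "2 drives instead of 5" claim rests on a parity update that needs only the old data and the old parity. A minimal sketch, assuming XOR parity: new parity = old parity XOR old data XOR new data. The helper names and chunk contents are made up for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Return the new parity after overwriting one chunk in a line.

    Only the target data drive and the parity drive are read and written:
    new_parity = old_parity XOR old_data XOR new_data.
    """
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# One line of 4 chunks plus its parity (contents are made up for illustration).
line = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
parity = bytes(4)
for chunk in line:
    parity = xor_bytes(parity, chunk)

new_chunk = b"ZZZZ"
parity = small_write(line[1], parity, new_chunk)   # touches 2 drives only
line[1] = new_chunk

# Cross-check against recomputing parity from all four data chunks.
full = bytes(4)
for chunk in line:
    full = xor_bytes(full, chunk)
print("parity matches full recompute:", parity == full)
```

The other three data drives are never touched, so they stay free to serve reads while the write is in progress.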
Chunk 0  | Chunk 1  | Chunk 2  | Chunk 3  | Parity 0
Chunk 6  | Chunk 7  | Chunk 8  | Chunk 9  | Parity 1
Chunk 12 | Chunk 13 | Chunk 14 | Chunk 15 | Parity 2
Slide35
RAID Level 4 – The Choke Point
What if we want to write chunk 0 and chunk 7 in parallel?
We can’t!
The parity drive is involved in every write – this is a choke point!
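The conflict can be made concrete by listing which drives each small write touches. This sketch assumes the 5-drive layout used in these slides: 4 data drives numbered 0 to 3 and a dedicated parity drive, with chunk 0 on the first drive and chunk 7 on the second. The function names are made up for illustration.

```python
NUM_DISKS = 5
PARITY_DISK = NUM_DISKS - 1   # RAID 4: parity always lives on the last drive


def raid4_write_set(data_disk: int) -> set:
    """Drives touched by a small write: the data drive plus the parity drive."""
    if not 0 <= data_disk < PARITY_DISK:
        raise ValueError("not a data drive")
    return {data_disk, PARITY_DISK}


def can_run_in_parallel(drive_sets) -> bool:
    """Operations can overlap only if no drive appears in two of the sets."""
    seen = set()
    for drives in drive_sets:
        if seen & drives:
            return False
        seen |= drives
    return True


write_chunk_0 = raid4_write_set(0)   # chunk 0 is on drive 0 in the slide layout
write_chunk_7 = raid4_write_set(1)   # chunk 7 is on drive 1, one line further down

print("write chunk 0 touches drives", sorted(write_chunk_0))
print("write chunk 7 touches drives", sorted(write_chunk_7))
print("parallel writes possible:", can_run_in_parallel([write_chunk_0, write_chunk_7]))

# Reads touch only the data drive, so the same two chunks can be read together.
print("parallel reads possible:", can_run_in_parallel([{0}, {1}]))
```

Every write set contains the parity drive, so no two small writes can ever be disjoint, regardless of which data drives they target.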
Chunk 0  | Chunk 1  | Chunk 2  | Chunk 3  | Parity 0
Chunk 6  | Chunk 7  | Chunk 8  | Chunk 9  | Parity 1
Chunk 12 | Chunk 13 | Chunk 14 | Chunk 15 | Parity 2
Slide36
Relieving the Pressure – RAID 5
Chunk 0  | Chunk 1  | Chunk 2  | Chunk 3  | Parity 0
Chunk 6  | Chunk 7  | Chunk 8  | Parity 1 | Chunk 9
Chunk 12 | Chunk 13 | Parity 2 | Chunk 14 | Chunk 15
Chunk 16 | Parity 3 | Chunk 17 | Chunk 18 | Chunk 19
Parity 4 | Chunk 20 | Chunk 21 | Chunk 22 | Chunk 23
Slide37
Relieving the Pressure – RAID 5
MTTF is the same
Can still recover from a failed drive
Less stress on any single drive, so fewer failures!
Can read 2 chunks in parallel
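Rotating the parity across drives is what lets two small writes proceed at once. The sketch below assumes the 5-drive layout drawn on these slides, where the parity chunk starts on the last drive and moves one drive to the left on each successive line. Real implementations may rotate differently, and the function names are made up for illustration.

```python
NUM_DISKS = 5


def raid5_parity_disk(line: int) -> int:
    """Drive holding parity for a given line, rotating right to left."""
    return (NUM_DISKS - 1 - line) % NUM_DISKS


def raid5_write_set(line: int, data_disk: int) -> set:
    """Drives touched by a small write: the data drive plus that line's parity drive."""
    parity = raid5_parity_disk(line)
    if data_disk == parity:
        raise ValueError("that drive holds parity for this line")
    return {data_disk, parity}


# Show where parity lands on each of the first five lines.
for line in range(NUM_DISKS):
    row = ["P" if d == raid5_parity_disk(line) else "D" for d in range(NUM_DISKS)]
    print(f"line {line}: {' '.join(row)}")

# Chunk 0 is on line 0, drive 0; chunk 7 is on line 1, drive 1.
write_chunk_0 = raid5_write_set(0, 0)
write_chunk_7 = raid5_write_set(1, 1)
print("write chunk 0 touches drives", sorted(write_chunk_0))
print("write chunk 7 touches drives", sorted(write_chunk_7))
print("parallel writes possible:", not (write_chunk_0 & write_chunk_7))
```

With the dedicated parity drive of Level 4 these two writes would collide; here they use four different drives and can overlap.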
Chunk 0  | Chunk 1  | Chunk 2  | Chunk 3  | Parity 0
Chunk 6  | Chunk 7  | Chunk 8  | Parity 1 | Chunk 9
Chunk 12 | Chunk 13 | Parity 2 | Chunk 14 | Chunk 15
Chunk 16 | Parity 3 | Chunk 17 | Chunk 18 | Chunk 19
Parity 4 | Chunk 20 | Chunk 21 | Chunk 22 | Chunk 23
Slide38
Discussion & Future Work
RAID levels 6 and 10 also exist
This is a good lesson in combining many slow components to create a better utility
Multi-core processors follow a similar paradigm
Can inspire future work
What did we ignore?
Controlling hardware/software
Different drives might have different latencies
Anything else?
Slide39
Discussion
How did the trends affect the relevance of RAID?
Slide40
Memory Bandwidth