Slide 1: Gecko: Contention-Oblivious Disk Arrays for Cloud Storage
Ji-Yong Shin (Cornell University)
In collaboration with Mahesh Balakrishnan (MSR SVC), Tudor Marian (Google), and Hakim Weatherspoon (Cornell)
FAST 2013

Slide 2: Cloud and Virtualization
- What happens to storage?
[Diagram: guest VMs running on a VMM issue sequential I/O streams that arrive at the shared disks (4-disk RAID-0) as random I/O]
- I/O contention causes throughput collapse

Slide 3: Existing Solutions for I/O Contention?
- I/O scheduling: reordering I/Os
  - Entails increased latency for certain workloads
  - May still require seeking
- Workload placement: positioning workloads to minimize contention
  - Requires prior knowledge or dynamic prediction
  - Predictions may be inaccurate
  - Limits freedom of placing VMs in the cloud

Slide 4: Log-Structured File System to the Rescue? [Rosenblum et al. 91]
- Log all writes to the tail (see the sketch below)
[Diagram: logical address space 0..N mapped onto a shared disk; every write is appended at the log tail]
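
A minimal sketch of the log-structured idea on this slide: every write, regardless of its logical address, is appended at the current log tail, and a map records where each logical block now lives. The class and names below are illustrative only, not the paper's code; wrap-around and cleaning are omitted.

```python
# Minimal sketch of log-structured writes: all writes append at the log tail.
class LogStructuredStore:
    def __init__(self, num_blocks):
        self.disk = [None] * num_blocks      # physical blocks on the shared disk
        self.l2p = {}                        # logical address -> physical address
        self.tail = 0                        # next physical block to write

    def write(self, logical_addr, data):
        # Sequential append, no matter which logical address is written.
        self.disk[self.tail] = data
        self.l2p[logical_addr] = self.tail
        self.tail += 1                       # wrap-around and GC omitted here

    def read(self, logical_addr):
        return self.disk[self.l2p[logical_addr]]

store = LogStructuredStore(num_blocks=8)
store.write(5, "a"); store.write(5, "b")    # overwriting address 5 still appends
assert store.read(5) == "b"
```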

Slide 5: Challenges of Log-Structured File System
- Garbage collection is the Achilles' heel of LFS [Seltzer et al. 93, 95; Matthews et al. 97]
[Diagram: garbage collection (GC) reads from the log head while new writes go to the log tail on the same shared disk]
- GC and reads still cause disk seeks

Slide 6: Challenges of Log-Structured File System (cont.)
- Garbage collection is the Achilles' heel of LFS
- LFS GC on a 2-disk RAID-0 under a write-only synthetic workload (RAID-0 + LFS)
[Chart: throughput over time compared against the max sequential throughput]
- Throughput falls by 10X during GC

Slide 7: Problem
- Increased virtualization leads to increased disk seeks and kills performance
- RAID and LFS do not solve the problem

Slide 8: Rest of the Talk
- Motivation
- Gecko: contention-oblivious disk storage
  - Sources of I/O contention
  - New technique: chained logging
  - Implementation
- Evaluation
- Conclusion

Slide 9: What Causes Disk Seeks?
- Write-write contention
- Read-read contention
- Write-read contention
[Diagram: reads and writes from VM 1 and VM 2 interleave on the shared disk and force seeks]

Slide 10: What Causes Disk Seeks? (cont.)
- Write-write, read-read, and write-read contention
- With logging: write-GC read and read-GC read contention
[Diagram: logging serializes the writes from VM 1 and VM 2, but GC reads still contend with reads and writes]

Slide 11: Principle
A single sequentially accessed disk is better than multiple randomly seeking disks.

Slide 12: Gecko's Chained Logging Design
- Separate the log tail from the body (see the sketch below)
- GC reads do not interrupt the sequential write
- 1 uncontended drive >> faster >> N contended drives
[Diagram: the physical address space is chained across Disk 0, Disk 1, and Disk 2; the log tail sits on Disk 2 while garbage collection moves data from the log head toward the tail]
- Eliminates write-write contention and reduces write-read contention
- Eliminates write-GC read contention
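
A rough sketch of the chained-logging idea, assuming fixed-size disks chained into one physical address space; the disk holding the tail is the only one that receives new writes, while GC reads come from the head disk. All names and sizes are hypothetical, not from the Gecko implementation.

```python
# Sketch of a chained log across several disks (illustrative sizes).
DISK_SIZE = 4  # blocks per disk, tiny for illustration

class ChainedLog:
    def __init__(self, num_disks):
        self.disks = [[None] * DISK_SIZE for _ in range(num_disks)]
        self.tail = 0            # global physical offset of the log tail
        self.head = 0            # global physical offset of the log head

    def tail_disk(self):
        return self.tail // DISK_SIZE   # only this disk sees new writes

    def head_disk(self):
        return self.head // DISK_SIZE   # GC reads come from here

    def append(self, data):
        d, off = divmod(self.tail, DISK_SIZE)
        self.disks[d][off] = data
        self.tail += 1                  # wrap-around omitted in this sketch

    def gc_read(self):
        d, off = divmod(self.head, DISK_SIZE)
        data = self.disks[d][off]
        self.head += 1
        return data                     # live data is rewritten at the tail or in the body
```

With three disks, writes touch only `tail_disk()` while cleaning reads come from `head_disk()`, a different spindle, which is what lets the single uncontended tail drive run at full sequential speed.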

Slide 13: Gecko's Chained Logging Design (cont.)
- Smarter "compact-in-body" garbage collection (see the sketch below)
[Diagram: instead of moving live data from the log head to the tail on Disk 2, GC compacts it within the body (Disk 0 and Disk 1), leaving the tail free for new writes]
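
One reading of the "compact-in-body" idea, written as an assumption rather than the paper's algorithm: live blocks cleaned from the log head are rewritten into free slots inside the body disks instead of being re-appended at the tail, so cleaning consumes no tail-write bandwidth. All names below are hypothetical.

```python
# Hypothetical sketch of compact-in-body GC.
def compact_in_body(live_blocks, body, body_free_slots):
    """live_blocks: list of (logical_addr, data) read from the log head.
    body: flat list modelling the body disks' physical blocks.
    body_free_slots: list of free physical offsets inside the body."""
    relocations = {}                       # logical_addr -> new physical offset
    for logical_addr, data in live_blocks:
        if not body_free_slots:
            break                          # out of body space: fall back to tail appends
        slot = body_free_slots.pop()
        body[slot] = data                  # stays off the tail disk entirely
        relocations[logical_addr] = slot
    return relocations                     # caller patches the logical-to-physical map
```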

Slide 14: Gecko Caching
- What happens to reads going to the tail drive?
- Tail cache (LRU): reduces write-read contention; RAM holds hot data, MLC SSD holds warm data (sketch below)
- Body cache (flash): reduces read-read contention
[Diagram: reads to the tail disk are served from the tail cache (RAM + SSD); reads to the body disks (Disk 0, Disk 1) are served from the body cache or the disks]
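
A simplified sketch of a two-tier LRU tail cache as described on this slide (RAM in front of an MLC SSD). Capacities, promotion policy, and names are assumptions for illustration only.

```python
from collections import OrderedDict

class TieredTailCache:
    """Two-tier LRU sketch: hot data in RAM, warm data spilled to SSD.
    A miss in both tiers means the read must go to the tail disk."""
    def __init__(self, ram_entries, ssd_entries):
        self.ram = OrderedDict()                 # hot data
        self.ssd = OrderedDict()                 # warm data
        self.ram_cap, self.ssd_cap = ram_entries, ssd_entries

    def put(self, addr, data):                   # called on every write to the tail
        self.ram[addr] = data
        self.ram.move_to_end(addr)
        if len(self.ram) > self.ram_cap:         # spill the coldest RAM entry to SSD
            old_addr, old_data = self.ram.popitem(last=False)
            self.ssd[old_addr] = old_data
            self.ssd.move_to_end(old_addr)
            if len(self.ssd) > self.ssd_cap:
                self.ssd.popitem(last=False)     # drops out of the cache entirely

    def get(self, addr):                         # None means: go to the tail disk
        if addr in self.ram:
            self.ram.move_to_end(addr)
            return self.ram[addr]
        if addr in self.ssd:                     # promotion back to RAM omitted
            return self.ssd[addr]
        return None
```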

Slide 15: Gecko Properties Summary
- No write-write contention
- No GC-write contention
- Reduced read-write and read-read contention (tail cache + body cache)
- Mirroring/striping onto Disk 0', Disk 1', Disk 2' adds fault tolerance and read performance
- Power saving without consistency concerns

Slide 16: Gecko Implementation
- 4 KB pages
- Primary map (logical-to-physical, in memory): 4-byte entries, less than 8 GB of RAM for 8 TB of storage (sketch below)
- Inverse map (physical-to-logical, in flash): 8 GB of flash for 8 TB of storage, written every 1024 writes
[Diagram: reads and writes are translated through the in-memory logical-to-physical map; the physical log spans Disk 0-2 with head and tail pointers, and the physical-to-logical map is persisted in flash]
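
The arithmetic behind the map sizes on this slide, plus a toy sketch of the two maps: with 4 KB pages, 8 TB of storage is about 2 billion pages, and 4-byte entries give a logical-to-physical map of just under 8 GB; batching the inverse map to flash every 1024 writes keeps metadata writes sequential and cheap. The structure and names below are illustrative, not the in-kernel implementation.

```python
# Map-size arithmetic from the slide.
PAGE = 4 * 1024                       # 4 KB pages
CAPACITY = 8 * 10**12                 # 8 TB of storage (decimal TB assumed)
entries = CAPACITY // PAGE            # ~2 billion pages
print(entries * 4 / 2**30, "GiB")     # ~7.3 GiB of 4-byte entries -> "< 8 GB of RAM"

class GeckoMaps:                      # illustrative only
    BATCH = 1024                      # inverse map flushed every 1024 writes

    def __init__(self, num_entries):
        self.l2p = [0] * num_entries  # logical -> physical, kept in RAM
        self.pending_p2l = []         # physical -> logical, buffered for flash

    def record_write(self, logical, physical, flush_to_flash):
        self.l2p[logical] = physical
        self.pending_p2l.append((physical, logical))
        if len(self.pending_p2l) >= self.BATCH:
            flush_to_flash(self.pending_p2l)   # one sequential metadata write
            self.pending_p2l = []

maps = GeckoMaps(num_entries=4096)                      # tiny map for illustration
maps.record_write(logical=7, physical=42, flush_to_flash=lambda batch: None)
```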

Slide 17: Evaluation
- How well does Gecko handle GC?
- Performance of Gecko under real workloads?
- Effect of varying Gecko chain length?
- Effectiveness of the tail cache?
- Durability of the flash-based tail cache?

Slide 18: Evaluation Setup
- In-kernel version
  - Implemented as a block device for portability
  - Similar to software RAID
- Hardware
  - WD 600 GB HDD (512 GB of 600 GB used), 2.5" 10K RPM, SATA-600
  - Intel MLC (multi-level cell) SSD, 240 GB, SATA-600
- User-level emulator
  - For fast prototyping
  - Runs block traces
  - Tail cache support

Slide 19: How Well Does Gecko Handle GC?
- 2-disk setting; write-only synthetic workload
[Chart: Gecko vs. Log + RAID-0; Gecko's aggregate throughput stays at the max sequential throughput of one disk, while Log + RAID-0 collapses during GC despite a 2-disk max]
- Gecko's aggregate throughput always remains high
- 3X higher aggregate and 4X higher application throughput

Slide 20: How Well Does Gecko Handle GC? (cont.)
[Chart: Gecko with smarter GC]
- Application throughput can be preserved using smarter GC

Slide 21: MS Enterprise and MSR Cambridge Traces [Narayanan et al. 08, 09]

| Trace Name    | Estimated Addr Space | Total Data Accessed (GB) | Total Data Read (GB) | Total Data Written (GB) | Total IO Req | Num Read Req | Num Write Req |
|---------------|----------------------|--------------------------|----------------------|-------------------------|--------------|--------------|---------------|
| prxy          | 136                  | 2,076                    | 1,297                | 779                     | 181,157,932  | 110,797,984  | 70,359,948    |
| src1          | 820                  | 3,107                    | 2,224                | 884                     | 85,069,608   | 65,172,645   | 19,896,963    |
| proj          | 4,102                | 2,279                    | 1,937                | 342                     | 65,841,031   | 55,809,869   | 10,031,162    |
| Exchange      | 4,822                | 760                      | 300                  | 460                     | 61,179,165   | 26,324,163   | 34,855,002    |
| usr           | 2,461                | 2,625                    | 2,530                | 96                      | 58,091,915   | 50,906,183   | 7,185,732     |
| LiveMapsBE    | 6,737                | 2,344                    | 1,786                | 558                     | 44,766,484   | 35,310,420   | 9,456,064     |
| MSNFS         | 1,424                | 303                      | 201                  | 102                     | 29,345,085   | 19,729,611   | 9,615,474     |
| DevDivRelease | 4,620                | 428                      | 252                  | 176                     | 18,195,701   | 12,326,432   | 5,869,269     |
| prn           | 770                  | 271                      | 194                  | 77                      | 16,819,297   | 9,066,281    | 7,753,016     |

Slide 22: What Is the Performance of Gecko under Real Workloads?
- Gecko: mirrored chain of length 3, tail cache (2 GB RAM + 32 GB SSD), body cache (32 GB SSD)
- Log + RAID-10: mirrored, log + 3-disk RAID-0, LRU cache (2 GB RAM + 64 GB SSD)
- Mix of 8 workloads: prn, MSNFS, DevDivRelease, proj, Exchange, LiveMapsBE, prxy, and src1
- 6-disk configuration with 200 GB of data prefilled
[Chart: Gecko vs. Log + RAID-10 throughput]
- Gecko showed less read-write contention and a higher cache hit rate
- Gecko's throughput is 2X-3X higher

Slide 23: What Is the Effect of Varying Gecko Chain Length?
- Same 8 workloads with 200 GB of data prefilled
[Chart: throughput vs. chain length, compared against RAID-0; error bars = standard deviation]
- A single uncontended disk and separating reads from writes yield better performance

Slide 24: How Effective Is the Tail Cache?
- Read hit rate of the tail cache (2 GB RAM + 32 GB SSD) on a 512 GB disk
- 21 combinations of 4 to 8 MSR Cambridge and MS Enterprise traces
[Chart: read hit rate per trace combination]
- The tail cache can effectively resolve read-write contention
- At least 86% read hit rate; RAM handles most of the hot data
- The amount of data changes the hit rate, but the average stays above 80%

Slide 25: How Durable Is the Flash-Based Tail Cache?
- Static analysis of SSD lifetime based on cache hit rate (worked example below)
[Chart: SSD lifetime at a 40 MB/s write rate]
- Use of 2 GB of RAM extends SSD lifetime by 2X-8X
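
A back-of-the-envelope version of the kind of static lifetime analysis this slide refers to. The P/E endurance figure and the fraction of writes absorbed by the RAM tier are assumed numbers for illustration, not the paper's; only the 32 GB SSD size and 40 MB/s rate come from the slides.

```python
# Back-of-the-envelope SSD lifetime estimate (inputs are assumptions).
def ssd_lifetime_days(ssd_gb, pe_cycles, write_mb_s, ram_absorb_fraction):
    """Writes that hit the RAM tier never reach the SSD, stretching its lifetime."""
    total_write_budget_mb = ssd_gb * 1024 * pe_cycles
    ssd_write_mb_s = write_mb_s * (1 - ram_absorb_fraction)
    return total_write_budget_mb / ssd_write_mb_s / 86400

# Example: 32 GB MLC SSD, 3,000 P/E cycles (assumed), 40 MB/s incoming writes.
print(ssd_lifetime_days(32, 3000, 40, 0.0))    # RAM absorbs nothing
print(ssd_lifetime_days(32, 3000, 40, 0.75))   # RAM absorbs 75% of writes -> 4X longer
```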

Slide 26: Conclusion
- Gecko enables fast storage in the cloud
  - Scales with increasing virtualization and number of cores
  - Oblivious to I/O contention
- Gecko's technical contribution
  - Separates the log tail from its body; separates reads and writes
  - Tail cache absorbs reads going to the tail
- A single sequentially accessed disk is better than multiple randomly seeking disks

Slide 27: Questions?