Slide1
Split-Level I/O Scheduling
Suli Yang, Tyler Harter, Nishant Agrawal, Samer Al-Kiswany, Salini Selvaraj Kowsalya, Anand Krishnamurthy, Rini T. Kaushik, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Slide2
…yet another I/O scheduling paper?
CFQ (2003), BFQ (2010), Deadline (2002), mClock (2011), Token-Bucket (2008), Libra (2014), pClock (2007), Fahrrad (2008), YFQ (1999), Facade (2003)
Slide3
Some mistakes we have been making for decades…
(in trying to build better schedulers)
Slide4
Problem: Current Frameworks Are Fundamentally Limited
- CFQ, Deadline, Token-Bucket
- Important policies cannot be realized: fairness, latency guarantees, isolation
- Wasted effort trying to build new schedulers without fixing the framework
Slide5
Can we design a simple and effective framework that lets us build schedulers to correctly realize important I/O policies?
Slide6
Solution: Split-Level Framework
- Control: allow scheduling at multiple levels (block level, system-call level, page-cache level)
- Information: tag requests to identify their origin
- Simplicity: small set of hooks at key junctions within the storage stack
Slide7
Results
- Three distinct policies implemented: priority, deadline, isolation
- Large performance improvements: fairness 12x, tail latency 4x, isolation 6x
- Good foundation for applications: reduced transaction latency for databases, improved isolation for virtual machines, effective rate limiting for HDFS
Slide8
Overview
- How I/O scheduling frameworks work
- Split-Level Scheduling Framework: Design
- Split-Level Scheduler Case Study
- Conclusion
Slide9
Framework vs. Scheduler
- Framework: a running environment (mechanism)
- Schedulers: implement different policies
- How it works: the framework provides callbacks to the schedulers
Slide10
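The framework/scheduler split described above can be sketched in Python (all class and method names are hypothetical illustrations; the real framework lives inside the kernel): the framework owns the mechanism and invokes whatever callbacks the scheduler supplies.

```python
# Hypothetical sketch: the framework defines the callback points;
# a scheduler is just a set of callbacks implementing one policy.

class Scheduler:
    """Callbacks a framework invokes on a scheduler."""
    def add_req(self, req): ...        # a new request enters the framework
    def dispatch_req(self): ...        # framework asks: which request next?
    def complete_req(self, req): ...   # the device finished a request

class Framework:
    def __init__(self, scheduler):
        self.scheduler = scheduler

    def submit(self, req):
        self.scheduler.add_req(req)

    def run_once(self):
        req = self.scheduler.dispatch_req()
        if req is not None:
            # ...hand the request to the device, wait for completion...
            self.scheduler.complete_req(req)
        return req

class FIFO(Scheduler):
    """Trivial policy: serve requests in arrival order."""
    def __init__(self):
        self.queue = []
    def add_req(self, req):
        self.queue.append(req)
    def dispatch_req(self):
        return self.queue.pop(0) if self.queue else None
    def complete_req(self, req):
        pass

fw = Framework(FIFO())
fw.submit("r1")
fw.submit("r2")
assert fw.run_once() == "r1"   # FIFO policy dispatches the oldest request
```

Swapping `FIFO` for another `Scheduler` subclass changes the policy without touching the framework, which is exactly the separation the slide describes.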
Traditional Approach: Block-Level I/O Scheduling
[Diagram: apps issue I/O through the page cache and file system into block-level queues; the block-level scheduler controls the device through the add_req, dispatch_req, and req_complete hooks]
Slide11
Block-Level I/O Scheduling
Simplified Completely Fair Queueing (CFQ) implementation:

add_req(r) {
  p = r.submit_process
  q = get_queue(p)
  enqueue(q, r)
}

dispatch_req() {
  q = get_high_prio_queue()
  r = dequeue(q)
  dispatch(r)
}

complete_req(r) {
  // clean up
}
Slide12
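The CFQ pseudocode above can be made into a small runnable sketch (Python for illustration; the field and function names follow the slide): one queue per submitting process, dispatch from the highest-priority non-empty queue.

```python
from collections import defaultdict, deque

# Runnable sketch of the slide's simplified CFQ: per-process queues,
# highest-priority non-empty queue served first (names follow the slide).

class SimpleCFQ:
    def __init__(self, prio_of):
        self.prio_of = prio_of              # process -> priority (lower = better)
        self.queues = defaultdict(deque)    # process -> its request queue

    def add_req(self, req):
        p = req["submit_process"]
        self.queues[p].append(req)          # enqueue(get_queue(p), r)

    def dispatch_req(self):
        live = [p for p, q in self.queues.items() if q]
        if not live:
            return None
        p = min(live, key=self.prio_of)     # get_high_prio_queue()
        return self.queues[p].popleft()     # dequeue(q); dispatch(r)

    def complete_req(self, req):
        pass                                # clean up

s = SimpleCFQ(prio_of=lambda p: {"A": 0, "B": 1}[p])
s.add_req({"submit_process": "B", "block": 10})
s.add_req({"submit_process": "A", "block": 99})
assert s.dispatch_req()["submit_process"] == "A"   # higher-priority queue wins
```

Note that attribution is by `submit_process`, which is precisely what breaks once a write-back daemon submits requests on behalf of applications, as later slides show.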
Overview
- What is an I/O scheduling framework
- Split-Level Scheduling Framework: Design
  - The reordering problem
  - The cause-mapping problem
  - The cost-estimation problem
- Split-Level Scheduler Case Study
- Conclusion
Slide13
Reordering
Scheduling is just reordering I/O requests.
Slide14
Data Entanglement
The file system tangles data from different applications (App1, App2) into one bundle, such as a journal transaction or a shared metadata block, making it impossible for a block-level scheduler to reorder.
Slide15
Write Dependencies
File systems carefully order writes (e.g., transaction tx1 before tx2); schedulers cannot reorder them unless the file system allows it.
Slide16
Fundamental Limitation #1 (of block-level scheduling)
- The file system imposes ordering requirements contrary to the scheduling goals
- The scheduler cannot reorder: it is too late once the data is in the file system
- Need admission control
Slide17
Split-Level I/O Scheduling: Multi-Layer Hooks
The split-level scheduler hooks in at the system-call level (write(), fsync()), above the file system, to avoid data entanglement and ordering, as well as at the page-cache level and the block level (add_req, dispatch_req, req_complete).
Slide18
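A minimal sketch of the multi-layer idea, assuming a hypothetical hook interface (the real hooks live in the Linux kernel): one scheduler object observes all layers, so it can act above the file system, before entanglement happens.

```python
# Hypothetical sketch: one scheduler sees hooks at every layer of the
# storage stack, so it can apply admission control above the file system.

class SplitScheduler:
    def __init__(self):
        self.events = []

    # system-call level: can delay or admit before data enters the FS
    def on_write(self, proc, nbytes):
        self.events.append(("write", proc))
        return True                         # admit (could block instead)

    def on_fsync(self, proc):
        self.events.append(("fsync", proc))
        return True

    # page-cache level: early notification of dirtied memory
    def on_dirty(self, proc, page):
        self.events.append(("dirty", proc))

    # block level: the classic hook
    def add_req(self, req):
        self.events.append(("add_req", req["cause"]))

sched = SplitScheduler()
sched.on_write("app1", 4096)   # decision point above the file system
sched.on_dirty("app1", page=7)
sched.add_req({"cause": "app1"})
assert [e[0] for e in sched.events] == ["write", "dirty", "add_req"]
```

The key design point the slide makes is visible here: by the time `add_req` fires it is too late to reorder entangled data, but `on_write`/`on_fsync` run before the file system ever sees it.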
Cause Mapping
A scheduler needs to map an I/O request to the originating application.
Slide19
Write Delegation
App1 and App2 write() into the page cache, but the write-back daemon submits all block-level requests: cause information is lost! The same happens with journaling, delayed allocation, and so on.
Slide20
Fundamental Limitation #2 (of block-level scheduling)
Cause-mapping information is lost within the framework: it is impossible to map an I/O request back to its originating application, no matter how the scheduler is implemented.
Slide21
Split-Level I/O Scheduling: Tags
Requests are tagged to identify their origin, and the tags pass across layers: even requests submitted by the write-back daemon still carry the tag of the application (1 or 2) that caused them.
Slide22
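How tags survive write delegation can be sketched as follows (names hypothetical). Per the backup slides, a tag identifies a *set* of processes, since several applications may be responsible for one request (e.g., a shared page).

```python
# Sketch of cross-layer tagging: the cause is recorded when a page is
# dirtied, so when the write-back daemon later submits the block request,
# the scheduler still recovers the originating application(s).

page_tags = {}   # page -> set of causes (a tag names a set of processes)

def app_write(app, page):
    # tag at dirty time, in application context
    page_tags.setdefault(page, set()).add(app)

def writeback_submit(page, submitter="writeback-daemon"):
    # without tags the cause would appear to be the daemon;
    # with tags the original applications are recovered
    return {"submitter": submitter, "causes": page_tags.get(page, {submitter})}

app_write("app1", page=3)
app_write("app2", page=3)             # shared page: two responsible apps
req = writeback_submit(page=3)
assert req["submitter"] == "writeback-daemon"
assert req["causes"] == {"app1", "app2"}
```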
Cost Estimation
- A scheduler needs to estimate the cost of I/O
- Memory-level notification for a timely estimate
- Block-level notification for an accurate estimate
- Details in the paper
Slide23
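One way the two notifications could combine, sketched under assumed semantics (the paper has the actual details): charge a rough cost as soon as memory is dirtied, then correct it once the real block-level request is known.

```python
# Hypothetical sketch: a timely (memory-level) estimate, later corrected
# by the accurate (block-level) cost. Names and semantics are illustrative.

class CostEstimator:
    def __init__(self):
        self.charged = {}                  # app -> bytes charged so far

    def on_dirty(self, app, nbytes):
        # memory level: timely but rough; assume cost ~ bytes dirtied
        self.charged[app] = self.charged.get(app, 0) + nbytes

    def on_block_request(self, app, actual_bytes, estimated_bytes):
        # block level: accurate; replace the earlier estimate
        self.charged[app] += actual_bytes - estimated_bytes

est = CostEstimator()
est.on_dirty("app1", 4096)                           # timely estimate
est.on_block_request("app1", actual_bytes=8192,      # accurate correction
                     estimated_bytes=4096)
assert est.charged["app1"] == 8192
```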
Split-Level I/O Scheduling Framework: Summary
Three key pieces:
- Multi-layer hooks to prevent adverse file-system interaction
- Tags to track causes across layers
- Early memory-level notification of write work
Easy implementation: ~300 LOC in Linux, little added complexity for building schedulers
Slide24
Overview
- How I/O scheduling frameworks work
- Split-Level Scheduling Framework: Design
- Split-Level Scheduler Case Study
- Conclusion
Slide25
Challenge #1: Priority Scheduler
Fairly allocate I/O resources based on the processes' priorities.
Slide26
Block-Level: CFQ
Workload: eight processes with different priorities (0-7), each sequentially writing its own file.

add_req(r) {
  p = r.submit_process
  q = get_queue(p)
  enqueue(q, r)
}
Slide27
Block-Level: CFQ
All delegated requests arrive from the write-back thread, so CFQ attributes them to that thread rather than to the originating processes.

add_req(r) {
  p = r.submit_process
  q = get_queue(p)
  enqueue(q, r)
}
Slide28
Split-Level: AFQ
CFQ deviates from the goal by 82%, AFQ by only 7%: a 12x improvement.

add_req(r) {
  p = r.tagged_cause
  q = get_queue(p)
  enqueue(q, r)
}
Slide29
Challenge #2: Deadline Scheduler
Provide guaranteed latency for I/O requests.
Slide30
Block-Deadline
Block-Deadline cannot serve the low-latency requests until the previous file-system transaction (tx1) has completed.
Slide31
Block-Deadline
Workload: flush 4KB of data to disk, with or without background writes.
Expected result: the operation finishes within the deadline (100ms).
Slide32
Split-Deadline
Split-Deadline suspends write() and fsync() to prevent many high-latency requests from accumulating in one transaction: writes are blocked before high-latency data can enter the file system.
Slide33
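The admission-control idea can be illustrated with a toy model (the deadline and flush-rate numbers, and the whole accounting scheme, are hypothetical simplifications, not the paper's actual algorithm): admit a write() only if the next transaction can still be flushed within the deadline.

```python
# Toy sketch of deadline-driven admission control: block write() when
# admitting more dirty data would make the next transaction miss its
# deadline. All parameters and the cost model are illustrative.

class SplitDeadline:
    def __init__(self, deadline_ms, flush_rate_bytes_per_ms):
        self.deadline_ms = deadline_ms
        self.rate = flush_rate_bytes_per_ms
        self.pending_dirty = 0        # bytes bound for the next transaction

    def flush_time_ms(self, extra=0):
        return (self.pending_dirty + extra) / self.rate

    def try_write(self, nbytes):
        # admit only if the transaction still fits within the deadline;
        # otherwise the caller would be suspended instead of admitted
        if self.flush_time_ms(nbytes) > self.deadline_ms:
            return False
        self.pending_dirty += nbytes
        return True

sd = SplitDeadline(deadline_ms=100, flush_rate_bytes_per_ms=1000)
assert sd.try_write(50_000)        # 50 ms of flushing: admitted
assert not sd.try_write(80_000)    # would need 130 ms: writer is blocked
```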
Split-Level: Split-Deadline
Split-Deadline maintains the deadline regardless of background writes.
Slide34
The Fsync-Freeze Problem
"During checkpointing, the system begins writing out the data that needs to be fsync()'d so aggressively that the service time for I/O requests from other processes goes through the roof." --Robert Haas (PostgreSQL)
Slide35
The Fsync-Freeze Problem
Workload: SQLite transactions with different checkpoint intervals.
Expected result: consistent transaction latency.
Split-Deadline solves the fsync-freeze problem, with a 4x tail-latency reduction.
Slide36
Other Evaluation Results
- Low overhead: <1% runtime overhead, <50 MB memory overhead
- Other schedulers: token-bucket for performance isolation
- Other applications: PostgreSQL (latency guarantees for TPC-B workloads), QEMU (isolation across VMs), HDFS (effective I/O rate limiting)
Slide37
Overview
- What is an I/O scheduling framework and how does it work
- Split-Level Scheduling Framework: Design
- Split-Level Scheduler Case Study
- Conclusion
Slide38
Conclusion
- For decades, people have been trying to build better block-level schedulers, an effort bound to fail without appropriate framework support
- The split-level framework enables correct scheduler implementations: cross-layer tags, multi-level hooks, memory-level notification
Source code and more information: http://research.cs.wisc.edu/adsl/Software/split/
Slide39
Backup slides
Slide40
Slide41
File System Write Dependencies
Modern file systems maintain data consistency by carefully ordering writes (e.g., transaction tx1 before tx2). Schedulers cannot reorder them unless the file system allows it.
Slide42
Split-Level I/O Scheduling: Multi-Layer Hooks
System-call scheduling (read(), write(), fsync()) above the file system avoids data entanglement; block-level scheduling (add_req, dispatch_req, req_complete) below the file system maximizes performance.
[Diagram: apps → page cache → file system (write-back) → block-level queues → disk/SSD, with the scheduler attached at each layer]
Slide43
Split-Level I/O Scheduling: Tags
Write-heavy HDFS workload on a machine with 8GB of RAM.
Slide44
Slide45
Split-Level Framework Overhead
I/O performance with the noop scheduler:
Slide46
Split-Level I/O Scheduling: Tags
Write-heavy HDFS workload on a machine with 8GB of RAM. Worst-case memory overhead of tags: 50MB.
Slide47
Block-Level: Windows
Slide48
Performance Isolation
Sequential readers: A unthrottled, B throttled to 10MB/s.
Slide49
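A token bucket of the kind used for throttling above can be sketched as follows (rate and burst parameters are hypothetical, not taken from the evaluation): requests spend tokens, tokens refill at the configured rate, and a request that cannot pay is delayed.

```python
# Sketch of a token-bucket rate limiter (parameters hypothetical):
# tokens accumulate at `rate` up to `capacity`; each request spends
# tokens equal to its size, and is delayed when the bucket is empty.

class TokenBucket:
    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, now, nbytes):
        # refill proportionally to elapsed time, capped at the burst size
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False                  # request is delayed, not dropped

tb = TokenBucket(rate_bytes_per_s=10 * 1024 * 1024, burst_bytes=1024 * 1024)
assert tb.allow(0.0, 512 * 1024)          # within the initial burst
assert not tb.allow(0.0, 1024 * 1024)     # bucket too empty: delayed
assert tb.allow(0.1, 1024 * 1024)         # refilled after 0.1 s
```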
Real Applications
Slide50
Write Delegation
App1 and App2 write() into the page cache, but write-back submits the block-level requests: loss of cause information! The process that submits a block-level request may not be the process that issued the I/O (write-back, journaling, delayed allocation, …).
Slide51
Split-Level I/O Scheduling: Tags
Tags track I/O requests across layers and identify the originating application; a tag identifies the set of processes responsible for an I/O request. Even requests submitted by write-back carry the tags (1 or 2) of the applications that caused them.
Slide52
Myth #1 in I/O Scheduling: I don't have to care about I/O scheduling. It is someone else's problem…
Slide53
Why Is I/O Scheduling Relevant (to You)?
- Bottleneck of many systems, from phones to servers ["…our servers appear to freeze for tens of seconds during disk writes…"]
- Foundation of performance isolation ["…the interference as a result of competing I/Os remains problematic in a virtualized environment…"]
- Pain point for databases, hypervisors, key-value stores, and more ["…one customer reported that just changing cfq to noop solved their InnoDB IO problems…"]
Slide54
Myth #1 in I/O Scheduling: I don't have to care about I/O scheduling. It is someone else's problem…
Fact #1: If you care about performance, you should care about I/O scheduling.
Slide55
Myth #2 in I/O Scheduling: Can't the disk (or SSD) handle all I/O scheduling? (Do I still need I/O scheduling in the era of SSDs?)
Slide56
Why Should the OS Do I/O Scheduling?
- The device is powerless when handed the "wrong" requests from the OS: the file system may withhold requests
- Devices rely on OS-provided information but lack the mechanisms to obtain it
- Other common reasons: more contextual information, OS-level isolation units, multi-device I/O scheduling
Slide57
Why Should the OS Do I/O Scheduling?
- The device is powerless when handed the "wrong" requests from the OS
- Isolation can only be done at the OS level, as only the OS knows about the isolation unit (processes, containers, or virtual machines)
- The OS has more contextual information to assist I/O scheduling (e.g., file-based prefetching)
- Multi-device I/O scheduling can only be done at the OS level
Slide58
Myth #2 in I/O Scheduling: Shouldn't the disk (or SSD) handle all the I/O scheduling?
Fact #2: The OS has to issue the right request at the right time.
Slide59
Why Is I/O Scheduling Still an Open Problem?
- Current I/O scheduling frameworks are fundamentally limited (as is any scheduler built on them)
- Important policies (isolation, fairness, meeting deadlines, …) cannot be realized in current frameworks
- Applications suffer as a result (databases, hypervisors, and more) ["…one customer reported that just changing cfq to noop solved their InnoDB IO problems…"]
Slide60
Myth #3 in I/O Scheduling: Isn't it a solved problem? After all, we have many different I/O schedulers.
Slide61
Myth #3 in I/O Scheduling: Isn't it a solved problem? After all, we have many different I/O schedulers.
Fact #3: Fundamental limitations in the framework mean important policies cannot be realized.
Slide62
What Is I/O Scheduling?
- Applications submit I/O requests to storage devices
- I/O scheduling decides which requests, and when, to send to the device
- Different schedulers have different scheduling goals and strategies
Slide63
Block-Level I/O Scheduling
Simplified Completely Fair Queueing (CFQ) implementation:

add_req(r) {
  p = r.submit_process
  q = get_queue(p)
  enqueue(q, r)
}

dispatch_req() {
  q = get_high_prio_queue()
  r = dequeue(q)
  dispatch(r)
}

complete_req(r) {
  // clean up
}