Achieving High Performance and Fairness

Uploaded On 2016-02-21

Presentation Transcript

Slide1

Achieving High Performance and Fairness at Low Cost

Lavanya Subramanian, Donghyuk Lee, Vivek Seshadri, Harsha Rastogi, Onur Mutlu

The Blacklisting Memory Scheduler

Slide2

Main Memory Interference Problem

[Figure: four cores sharing main memory]

Causes interference between applications’ requests

Slide3

Inter-Application Interference Results in Performance Degradation

High application slowdowns due to interference

Slide4

Tackling Inter-Application Interference: Application-aware Memory Scheduling

[Figure: the memory controller monitors applications and ranks them; to enforce ranks, the highest-ranked AID is compared (=) against the App. ID (AID) of each request in the request buffer]

Full ranking increases critical path latency and area significantly to improve performance and fairness

Slide5

Performance vs. Fairness vs. Simplicity

[Figure: schedulers compared along three axes (Performance, Fairness, Simplicity), with the Ideal point marked]

App-unaware: FRFCFS (very simple; low performance and fairness)
App-aware (Ranking): PARBS, ATLAS, TCM (complex)
Our Solution (No Ranking): Blacklisting

Is it essential to give up simplicity to optimize for performance and/or fairness?

Our solution achieves all three goals

Slide6

Outline

- Introduction
- Problems with Application-aware Schedulers
- Key Observations
- The Blacklisting Memory Scheduler Design
- Evaluation
- Conclusion

Slide8

Problems with Previous Application-aware Memory Schedulers

1. Full ranking increases hardware complexity
2. Full ranking causes unfair slowdowns

Slide10

Ranking Increases Hardware Complexity

[Figure: the memory controller monitors applications and ranks them; to enforce ranks, the highest-ranked AID is compared (=) against the App. ID (AID) of each request in the request buffer, followed by the next-highest-ranked AID]
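
To make the enforcement cost concrete, here is a minimal Python sketch of rank-based request selection. It is an illustration under our own assumptions, not the RTL that was synthesized: `Request`, `pick_by_rank`, `comparisons_needed`, and the example rank orders are hypothetical. Hardware would perform the AID comparisons in parallel; the loop only mirrors the logic of checking each rank level, from highest down, against every request-buffer entry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    req_id: int  # position/age in the buffer (hypothetical field)
    aid: int     # application ID (AID) of the requesting application


def pick_by_rank(buffer: List[Request], ranked_aids: List[int]) -> Optional[Request]:
    """Return the oldest request of the highest-ranked application present.

    ranked_aids lists AIDs from highest to lowest rank. Each rank level
    needs one AID comparison per buffer entry, so the comparison work
    grows with the number of applications.
    """
    for aid in ranked_aids:  # highest rank first, then the next highest
        matches = [r for r in buffer if r.aid == aid]  # the "=" comparators
        if matches:
            return min(matches, key=lambda r: r.req_id)
    return None


def comparisons_needed(buffer_entries: int, num_apps: int) -> int:
    """Worst-case AID comparisons when every rank level must be checked."""
    return buffer_entries * num_apps
```

For example, with a four-entry buffer holding AIDs 1, 4, 1, 3 and the rank order 4, 3, 2, 1, the sketch selects the request from AID 4 even though older requests from AID 1 are waiting.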

Hardware complexity increases with application/core count

Slide11

Ranking Increases Hardware Complexity

[Figure: hardware cost comparison; annotations: 8x and 1.8x]

Ranking-based application-aware schedulers incur high hardware cost

From synthesis of RTL implementations using a 32 nm library

Slide12

Problems with Previous Application-aware Memory Schedulers

1. Full ranking increases hardware complexity
2. Full ranking causes unfair slowdowns

Slide13

Ranking Causes Unfair Slowdowns

[Figure: panels for GemsFDTD (high memory intensity) and sjeng (low memory intensity)]

With a full ordered ranking of applications, GemsFDTD is denied request service

Slide14

Ranking Causes Unfair Slowdowns

[Figure: GemsFDTD (high memory intensity) and sjeng (low memory intensity)]

Ranking-based application-aware schedulers cause unfair slowdowns

Slide15

Problems with Previous Application-aware Memory Schedulers

1. Full ranking increases hardware complexity
2. Full ranking causes unfair slowdowns

Our Goal: Design a memory scheduler with Low Complexity, High Performance, and Fairness

Slide16

Outline

- Introduction
- Problems with application-aware schedulers
- Key Observations
- The Blacklisting Memory Scheduler Design
- Evaluation
- Conclusion

Slide17

Key Observation 1: Group Rather Than Rank

Observation 1: Sufficient to separate applications into two groups, rather than do full ranking

[Figure: instead of Monitor and Rank (a full ordering of applications), Monitor and Group into two groups: Vulnerable > Interference-Causing]

Benefit 1: Low complexity compared to ranking

Slide18

Key Observation 1: Group Rather Than Rank

[Figure: GemsFDTD (high memory intensity) and sjeng (low memory intensity)]

No denial of request service

Slide19

Key Observation 1: Group Rather Than Rank

[Figure: GemsFDTD (high memory intensity) and sjeng (low memory intensity)]

Benefit 2: Lower slowdowns than ranking

Slide20

Key Observation 1: Group Rather Than Rank

Observation 1: Sufficient to separate applications into two groups, rather than do full ranking

[Figure: instead of Monitor and Rank (a full ordering of applications), Monitor and Group into two groups: Vulnerable > Interference-Causing]

How to classify applications into groups?

Slide21

Key Observation 2

Observation 2: Serving a large number of consecutive requests from an application causes interference

Basic Idea:
- Group applications with a large number of consecutive requests as interference-causing → Blacklisting
- Deprioritize blacklisted applications
- Clear blacklist periodically (1000s of cycles)

Benefits:
- Lower complexity
- Finer grained grouping decisions → Lower unfairness
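
The grouping rule in the basic idea can be illustrated with a short Python sketch. It assumes a stream of served application IDs and a hypothetical `threshold` of 4 consecutive requests; the function name `interference_causing` and the default value are illustrative choices, not taken from the slides. Periodic clearing is left out, so the sketch covers a single interval between clears.

```python
from typing import Iterable, Optional, Set


def interference_causing(served_aids: Iterable[int], threshold: int = 4) -> Set[int]:
    """Return the AIDs that had `threshold` or more requests served back to back.

    served_aids is the order in which requests were served, given as
    application IDs. The threshold default is an assumed value.
    """
    blacklist: Set[int] = set()
    last_aid: Optional[int] = None
    streak = 0
    for aid in served_aids:
        # Extend the streak if the same application is served again,
        # otherwise start a new streak for the new application.
        streak = streak + 1 if aid == last_aid else 1
        last_aid = aid
        if streak >= threshold:
            blacklist.add(aid)
    return blacklist
```

Only one "last served AID" value and one counter are needed for the whole stream, regardless of how many applications are running, which is where the lower complexity relative to full ranking comes from.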

Slide22

Outline

- Introduction
- Problems with application-aware schedulers
- Key Observations
- The Blacklisting Memory Scheduler Design
- Evaluation
- Conclusion

Slide23

The Blacklisting Memory Scheduler (BLISS)

[Figure: memory controller with a per-application blacklist bit (indexed by AID), a "Last Req AID" register, a "# Consecutive Requests" counter, and a request buffer whose entries each carry a blacklist bit. The animation shows four consecutive requests from AID1 being served, the counter reaching 4, and the blacklist bit of AID1 being set.]

1. Monitor
2. Blacklist
3. Prioritize
4. Clear Periodically

Simple and scalable design
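
The four steps on this slide can be put together in a compact Python sketch. Several details are assumptions rather than facts from the slides: the class and method names, the default `threshold` of 4, the default `clear_interval` of 10,000 cycles, and the tie-breaking among requests of the same group (row-buffer hits first, then older requests). The slides state only that blacklisted applications are deprioritized.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Request:
    arrival: int   # arrival time; smaller means older
    aid: int       # application ID (AID)
    row_hit: bool  # whether the request hits the open row (assumed input)


class BlacklistingScheduler:
    """Illustrative model of Monitor, Blacklist, Prioritize, Clear."""

    def __init__(self, threshold: int = 4, clear_interval: int = 10_000):
        self.threshold = threshold            # assumed value
        self.clear_interval = clear_interval  # assumed value
        self.blacklist: Set[int] = set()
        self.last_aid: Optional[int] = None
        self.consecutive = 0

    def served(self, req: Request) -> None:
        """Steps 1 and 2: count consecutive requests, blacklist at threshold."""
        if req.aid == self.last_aid:
            self.consecutive += 1
        else:
            self.last_aid = req.aid
            self.consecutive = 1
        if self.consecutive >= self.threshold:
            self.blacklist.add(req.aid)

    def pick(self, buffer: List[Request]) -> Optional[Request]:
        """Step 3: non-blacklisted first, then (assumed) row hits, then age."""
        if not buffer:
            return None
        return min(
            buffer,
            key=lambda r: (r.aid in self.blacklist, not r.row_hit, r.arrival),
        )

    def tick(self, cycle: int) -> None:
        """Step 4: clear all blacklist bits every clear_interval cycles."""
        if cycle > 0 and cycle % self.clear_interval == 0:
            self.blacklist.clear()
```

The state is one blacklist bit per application plus a single AID register and counter, and the priority check per request is a single-bit test rather than a comparison against a rank order.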

Slide25

Outline

- Introduction
- Problems with application-aware schedulers
- Key Observations
- The Blacklisting Memory Scheduler Design
- Evaluation
- Conclusion

Slide26

Methodology

Configuration of our simulated baseline system:
- 24 cores
- 4 channels, 8 banks/channel
- DDR3 1066 DRAM
- 512 KB private cache/core

Workloads:
- SPEC CPU2006, TPC-C, Matlab, NAS
- 80 multiprogrammed workloads

Slide27

Metrics

- System Performance: [formula not preserved in the transcript]
- Fairness: [formula not preserved in the transcript]
- Complexity: Critical path latency and area from synthesis with 32 nm library

Slide28

Previous Memory Schedulers

- FRFCFS [Zuravleff and Robinson, US Patent 1997; Rixner et al., ISCA 2000]: Prioritizes row-buffer hits and older requests
- FRFCFS-Cap [Mutlu and Moscibroda, MICRO 2007]: Caps number of consecutive row-buffer hits
- PARBS [Mutlu and Moscibroda, ISCA 2008]: Batches oldest requests from each application; prioritizes batch. Employs ranking within a batch
- ATLAS [Kim et al., HPCA 2010]: Prioritizes applications with low memory-intensity
- TCM [Kim et al., MICRO 2010]: Always prioritizes low memory-intensity applications. Shuffles thread ranks of high memory-intensity applications
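
As a point of reference for the two application-unaware policies in this list, the following Python sketch models them for a single bank. It is a simplification under stated assumptions: `Request`, `frfcfs_pick`, `frfcfs_cap_pick`, and the default `cap` of 4 are hypothetical, and the cap is modeled as falling back to oldest-first once that many consecutive row hits have been served, which may differ in detail from the cited designs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    arrival: int  # arrival time; smaller means older
    row: int      # DRAM row targeted (a single bank is modeled)


def frfcfs_pick(buffer: List[Request], open_row: int) -> Optional[Request]:
    """FRFCFS: prefer row-buffer hits, then older requests."""
    if not buffer:
        return None
    return min(buffer, key=lambda r: (r.row != open_row, r.arrival))


def frfcfs_cap_pick(
    buffer: List[Request], open_row: int, hits_in_a_row: int, cap: int = 4
) -> Optional[Request]:
    """FRFCFS with a cap on consecutive row-buffer hits (assumed cap value).

    Once `cap` consecutive hits have been served, the oldest request is
    picked regardless of whether it hits the open row.
    """
    if not buffer:
        return None
    if hits_in_a_row >= cap:
        return min(buffer, key=lambda r: r.arrival)
    return frfcfs_pick(buffer, open_row)
```

Neither function looks at which application issued a request, which is why these policies are cheap to build but cannot protect an application from a neighbor that streams row hits.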

Application-unaware
+ Low complexity
- Low performance and fairness

Application-aware
+ High performance and fairness
- High complexity

Slide29

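Performance and Fairness

[Figure: performance and fairness of the evaluated schedulers, with the Ideal point marked; annotations: 5% and 21%]

1. Blacklisting achieves the highest performance
2. Blacklisting balances performance and fairness

Slide30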

Complexity

[Figure: complexity of the evaluated schedulers, with the Ideal point marked; annotations: 43% and 70%]

Blacklisting reduces complexity significantly

Slide31

Performance vs. Fairness vs. Simplicity

[Figure: FRFCFS, FRFCFS-Cap, PARBS, ATLAS, TCM, and Blacklisting compared along three axes (Performance, Fairness, Simplicity), with the Ideal point marked. Blacklisting is labeled: highest performance, close to simplest, close to fairest]

Blacklisting is the closest scheduler to ideal

Slide32

Summary

- Applications’ requests interfere at main memory
- Prevalent solution approach: Application-aware memory request scheduling
- Key shortcoming of previous schedulers: Full ranking
  - High hardware complexity
  - Unfair application slowdowns
- Our Solution: Blacklisting memory scheduler
  - Sufficient to group applications rather than rank
  - Group by tracking number of consecutive requests
  - Much simpler than application-aware schedulers at higher performance and fairness

Slide33

Achieving High Performance and Fairness at Low Cost

Lavanya Subramanian, Donghyuk Lee, Vivek Seshadri, Harsha Rastogi, Onur Mutlu

The Blacklisting Memory Scheduler