Slide1
The Blacklisting Memory Scheduler
Achieving High Performance and Fairness at Low Cost
Lavanya Subramanian, Donghyuk Lee, Vivek Seshadri, Harsha Rastogi, Onur Mutlu
Slide2
Main Memory Interference Problem
[Figure: four cores sharing one main memory]
Sharing main memory causes interference between applications’ requests
Slide3
Inter-Application Interference Results in Performance Degradation
High application slowdowns due to interference
Slide4
Tackling Inter-Application Interference: Application-aware Memory Scheduling
[Figure: the memory controller monitors applications, ranks them, and enforces the ranks by comparing the App. ID (AID) of each request in the request buffer against the highest ranked AID]
Full ranking increases critical path latency and area significantly to improve performance and fairness
Slide5
Performance vs. Fairness vs. Simplicity
[Figure: FRFCFS, PARBS, ATLAS, TCM, and Blacklisting compared against the ideal along three axes: performance, fairness, and simplicity]
App-unaware (FRFCFS): very simple, but low performance and fairness
App-aware, ranking (PARBS, ATLAS, TCM): complex
Our solution, no ranking (Blacklisting)
Is it essential to give up simplicity to optimize for performance and/or fairness?
Our solution achieves all three goals
Slide6
Outline
Introduction
Problems with Application-aware Schedulers
Key Observations
The Blacklisting Memory Scheduler Design
Evaluation
Conclusion
Slide7
Outline (current section: Problems with Application-aware Schedulers)
Slide8
Problems with Previous Application-aware Memory Schedulers
1. Full ranking increases hardware complexity
2. Full ranking causes unfair slowdowns
Slide9
Problem 1: Full ranking increases hardware complexity
Slide10
Ranking Increases Hardware Complexity
[Figure: a monitor ranks the applications; the enforce-ranks logic compares the App. ID (AID) of every request in the request buffer against the highest ranked AID, then against the next highest ranked AID]
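The rank-enforcement logic on this slide can be sketched in software. This is a minimal illustration, not the hardware design: the function name, the rank order passed in, and the oldest-first tie-break are assumptions. The comparison count stands in for the number of AID comparators, which real hardware evaluates in parallel.

```python
from dataclasses import dataclass


@dataclass
class Request:
    req_id: int  # lower id is treated as older (assumption)
    aid: int     # application ID carried by the request


def pick_by_rank(request_buffer, ranked_aids):
    """Return (chosen request, number of AID comparisons performed).

    ranked_aids lists application IDs from highest to lowest rank.
    Each pass models one row of comparators that matches every
    buffered request's AID against a single ranked AID.
    """
    comparisons = 0
    for aid in ranked_aids:
        matches = []
        for req in request_buffer:
            comparisons += 1
            if req.aid == aid:
                matches.append(req)
        if matches:
            # Several requests from the same application: take the oldest.
            return min(matches, key=lambda r: r.req_id), comparisons
    return None, comparisons


# Request buffer contents as on the slide; the rank order is made up.
buffer = [Request(1, 1), Request(2, 4), Request(3, 1), Request(4, 1),
          Request(5, 3), Request(7, 1), Request(8, 3)]
chosen, cost = pick_by_rank(buffer, ranked_aids=[2, 4, 3, 1])
```

In the worst case the comparison count is the buffer size times the number of ranked applications, so it grows as applications are added.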
Hardware complexity increases with application/core count
Slide11
Ranking Increases Hardware Complexity
[Figure: hardware cost of ranking-based schedulers, with callouts 8x and 1.8x]
Ranking-based application-aware schedulers incur high hardware cost
From synthesis of RTL implementations using a 32nm library
Slide12
Problem 2: Full ranking causes unfair slowdowns
Slide13
Ranking Causes Unfair Slowdowns
[Figure: request service under a full ordered ranking of applications, for GemsFDTD (high memory intensity) and sjeng (low memory intensity)]
GemsFDTD denied request service
Slide14
Ranking Causes Unfair Slowdowns
[Figure: slowdowns of GemsFDTD (high memory intensity) and sjeng (low memory intensity)]
Ranking-based application-aware schedulers cause unfair slowdowns
Slide15
Problems with Previous Application-aware Memory Schedulers
1. Full ranking increases hardware complexity
2. Full ranking causes unfair slowdowns
Our Goal: Design a memory scheduler with Low Complexity, High Performance, and Fairness
Slide16
Outline (current section: Key Observations)
Slide17
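Key Observation 1: Group Rather Than Rank
Observation 1: Sufficient to separate applications into two groups, rather than do full ranking
[Figure: instead of a monitor producing a full rank order, a monitor places each application in one of two groups, with the vulnerable group prioritized over the interference-causing group (Vulnerable > Interference Causing)]
Benefit 1: Low complexity compared to ranking
Slide18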
Key Observation 1: Group Rather Than Rank
[Figure: request service with grouping, for GemsFDTD (high memory intensity) and sjeng (low memory intensity)]
No denial of request service
Slide19
Key Observation 1: Group Rather Than Rank
[Figure: slowdowns of GemsFDTD (high memory intensity) and sjeng (low memory intensity)]
Benefit 2: Lower slowdowns than ranking
Slide20
Key Observation 1: Group Rather Than Rank
Observation 1: Sufficient to separate applications into two groups, rather than do full ranking
How to classify applications into groups?
Slide21
Key Observation 2
Observation 2: Serving a large number of consecutive requests from an application causes interference
Basic Idea:
  Group applications with a large number of consecutive requests as interference-causing (Blacklisting)
  Deprioritize blacklisted applications
  Clear blacklist periodically (1000s of cycles)
Benefits:
  Lower complexity
  Finer grained grouping decisions, hence lower unfairness
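The basic idea can be sketched as a small monitor. This is a sketch under stated assumptions: the class and method names are hypothetical, and the threshold of 4 requests, the 10,000-cycle clearing interval, and the counter reset after blacklisting are illustrative choices (the slide only says "1000s of cycles").

```python
class BlacklistMonitor:
    """Tracks consecutive requests served per application and blacklists heavy ones."""

    def __init__(self, num_apps, threshold=4, clear_interval=10_000):
        self.blacklist = [False] * num_apps   # one bit per application ID
        self.threshold = threshold            # assumed value
        self.clear_interval = clear_interval  # assumed value, in cycles
        self.last_aid = None                  # AID of the last served request
        self.consecutive = 0                  # consecutive requests from last_aid

    def on_request_served(self, aid):
        """Update the counter; blacklist the application once it hits the threshold."""
        if aid == self.last_aid:
            self.consecutive += 1
        else:
            self.last_aid = aid
            self.consecutive = 1
        if self.consecutive >= self.threshold:
            self.blacklist[aid] = True
            self.consecutive = 0  # assumption: start counting afresh

    def on_cycle(self, cycle):
        """Clear every blacklist bit at each clearing interval."""
        if cycle % self.clear_interval == 0:
            self.blacklist = [False] * len(self.blacklist)


monitor = BlacklistMonitor(num_apps=4)
for served_aid in [1, 1, 1, 1, 2, 3]:
    monitor.on_request_served(served_aid)
blacklisted_after_burst = list(monitor.blacklist)
monitor.on_cycle(10_000)
```

The state is one counter, one AID register, and one bit per application, with no rank order to maintain.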
Slide22
Outline (current section: The Blacklisting Memory Scheduler Design)
Slide23
The Blacklisting Memory Scheduler (BLISS)
1. Monitor: the memory controller tracks the AID of the last request and the number of consecutive requests from that AID
2. Blacklist: an application with a large number of consecutive requests has its bit set in a per-AID blacklist
3. Prioritize: each request in the request buffer carries its application's blacklist bit, and blacklisted applications are deprioritized
4. Clear Periodically: the blacklist bits are reset to 0
[Figure: worked example in which consecutive requests from AID1 raise the counter to 4 and AID1's blacklist bit is then set]
Simple and scalable design
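Step 3 (Prioritize) can be sketched as a single selection function. This is a minimal sketch: the slide only states that blacklisted applications are deprioritized, so the tie-breaks used here (row-buffer hits first, then older requests, as in FRFCFS) are an assumption, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    req_id: int            # lower id is treated as older (assumption)
    aid: int               # application ID
    row_hit: bool = False  # would this request hit the open row?


def pick_request(request_buffer, blacklisted_aids):
    """Pick the next request: non-blacklisted first, then row hits, then oldest."""
    if not request_buffer:
        return None
    return min(
        request_buffer,
        key=lambda r: (r.aid in blacklisted_aids, not r.row_hit, r.req_id),
    )


buffer = [
    Request(1, aid=1),
    Request(2, aid=1, row_hit=True),
    Request(3, aid=2),
    Request(4, aid=3, row_hit=True),
]
with_blacklist = pick_request(buffer, blacklisted_aids={1})
without_blacklist = pick_request(buffer, blacklisted_aids=set())
```

Each request needs one blacklist-bit check, independent of how many applications are running.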
Slide25
Outline (current section: Evaluation)
Slide26
Methodology
Configuration of our simulated baseline system
  24 cores
  4 channels, 8 banks/channel
  DDR3 1066 DRAM
  512 KB private cache/core
Workloads
  SPEC CPU2006, TPC-C, Matlab, NAS
  80 multiprogrammed workloads
Slide27
Metrics
System Performance: [formula not preserved]
Fairness: [formula not preserved]
Complexity: Critical path latency and area from synthesis with 32 nm library
Slide28
Previous Memory Schedulers
FRFCFS [Zuravleff and Robinson, US Patent 1997; Rixner et al., ISCA 2000]
  Prioritizes row-buffer hits and older requests
FRFCFS-Cap [Mutlu and Moscibroda, MICRO 2007]
  Caps number of consecutive row-buffer hits
PARBS [Mutlu and Moscibroda, ISCA 2008]
  Batches oldest requests from each application; prioritizes batch
  Employs ranking within a batch
ATLAS [Kim et al., HPCA 2010]
  Prioritizes applications with low memory-intensity
TCM [Kim et al., MICRO 2010]
  Always prioritizes low memory-intensity applications
  Shuffles thread ranks of high memory-intensity applications
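The two application-unaware policies in the list above can be sketched together. This is a simplified sketch, not the published designs: the names are hypothetical, the hit streak is tracked by the caller, and a `cap` of `None` stands for plain FRFCFS.

```python
from dataclasses import dataclass


@dataclass
class Request:
    arrival: int  # arrival time; smaller is older
    bank: int
    row: int


def frfcfs_pick(request_buffer, open_rows, hit_streak=0, cap=None):
    """Pick row-buffer hits first, then the oldest request.

    open_rows maps bank -> currently open row.
    hit_streak is the number of consecutive row hits already served.
    With a cap, row hits lose their priority once the streak reaches it.
    """
    def is_hit(req):
        return open_rows.get(req.bank) == req.row

    hits_allowed = cap is None or hit_streak < cap
    return min(
        request_buffer,
        key=lambda r: (not (hits_allowed and is_hit(r)), r.arrival),
        default=None,
    )


buffer = [Request(0, bank=0, row=5), Request(1, bank=0, row=7), Request(2, bank=0, row=7)]
open_rows = {0: 7}
plain = frfcfs_pick(buffer, open_rows)                        # FRFCFS
capped = frfcfs_pick(buffer, open_rows, hit_streak=4, cap=4)  # FRFCFS-Cap
```

Without the cap, the older request to row 5 waits behind every hit to row 7. With the cap reached, it is served next.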
Application-unaware
+ Low complexity
- Low performance and fairness
Application-aware
+ High performance and fairness
- High complexity
Slide29
Performance and Fairness
[Figure: performance versus fairness of each scheduler, with the ideal point marked; callouts: 5% and 21%]
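The metric formulas did not survive in this transcript. As an assumption, the sketch below uses two metrics that are common in memory scheduling evaluations: weighted speedup for system performance and maximum slowdown for fairness. The function names and IPC values are illustrative only.

```python
def weighted_speedup(ipc_shared, ipc_alone):
    """Sum over applications of IPC when sharing memory divided by IPC when run alone."""
    return sum(shared / alone for shared, alone in zip(ipc_shared, ipc_alone))


def maximum_slowdown(ipc_shared, ipc_alone):
    """Largest per-application slowdown; lower means fairer."""
    return max(alone / shared for shared, alone in zip(ipc_shared, ipc_alone))


# Made-up IPC values for two applications.
ipc_alone = [2.0, 1.0]
ipc_shared = [1.0, 0.5]
ws = weighted_speedup(ipc_shared, ipc_alone)
ms = maximum_slowdown(ipc_shared, ipc_alone)
```

Under these definitions, higher weighted speedup means better performance and lower maximum slowdown means better fairness.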
1. Blacklisting achieves the highest performance
2. Blacklisting balances performance and fairness
Slide30
Complexity
[Figure: complexity of each scheduler, with the ideal point marked; callouts: 43% and 70%]
Blacklisting reduces complexity significantly
Slide31
Performance vs. Fairness vs. Simplicity
[Figure: FRFCFS, FRFCFS-Cap, PARBS, ATLAS, TCM, and Blacklisting compared against the ideal along three axes: performance, fairness, and simplicity]
Blacklisting: highest performance, close to simplest, close to fairest
Blacklisting is the closest scheduler to ideal
Slide32
Summary
Applications’ requests interfere at main memory
Prevalent solution approach
  Application-aware memory request scheduling
Key shortcoming of previous schedulers: Full ranking
  High hardware complexity
  Unfair application slowdowns
Our Solution: Blacklisting memory scheduler
  Sufficient to group applications rather than rank
  Group by tracking number of consecutive requests
Much simpler than application-aware schedulers at higher performance and fairness
Slide33
The Blacklisting Memory Scheduler
Achieving High Performance and Fairness at Low Cost
Lavanya Subramanian, Donghyuk Lee, Vivek Seshadri, Harsha Rastogi, Onur Mutlu