Slide1
Integration for Heterogeneous SoC Modeling
Y. Sophia Shao, Sam Xi, Gu-Yeon Wei, David Brooks
Harvard University
Slide2
More accelerators.
Out-of-Core Accelerators
[Die photo from Chipworks]
[Accelerators annotated by Sophia Shao @ Harvard]
Maltiel Consulting estimates
[Shao, et al., IEEE Micro]
Slide3
Accelerator-CPU Integration: Today’s Conventional SoCs
Easy to integrate lots of IP, simple accelerator design
Hard to program and share data
[Diagram labels: Core, L2 $, …, Core, L2 $, L3 $, DMA, On-Chip System Bus, Acc #1 … Acc #n, each with a Scratchpad]
Slide4
Accelerator Integration Trend
Users design application-specific hardware accelerators.
System vendors provide Host Service Layer with virtual memory and cache coherence support
Intel QuickAssist QPI-Based FPGA Accelerator Platform (QAP)
IBM POWER8’s Coherent Accelerator Processor Interface (CAPI)
[Diagram labels: Core, L2 $, …, Core, L2 $, L3 $, Acc Agent, Host Service Layer, Accelerator; Main CPU/SoC; FPGA or user-defined ASIC]
Slide5
Aladdin: A pre-RTL, Power-Performance Accelerator Simulator
[Diagram: Unmodified C-Code and Accelerator Design Parameters (e.g., # FU, mem. BW) feed Aladdin, which reports Power/Area and Performance. Blocks shown: Accelerator-Specific Datapath, Private L1/Scratchpad, Shared Memory/Interconnect Models]
“Accelerator Simulator”: Design Accelerator-Rich SoC Fabrics and Memory Systems
Slide6
Aladdin: A pre-RTL, Power-Performance Accelerator Simulator
[Diagram: Unmodified C-Code and Accelerator Design Parameters (e.g., # FU, mem. BW) feed Aladdin, which reports Power/Area and Performance. Blocks shown: Accelerator-Specific Datapath, Private L1/Scratchpad, Shared Memory/Interconnect Models]
“Accelerator Simulator”: Design Accelerator-Rich SoC Fabrics and Memory Systems
“Design Assistant”: Understand Algorithmic-HW Design Space before RTL
Design Cost, Flexibility, Programmability
Slide8
Aladdin Overview
[Flow diagram]
Inputs: C Code, Acc Design Parameters
Optimization Phase: Optimistic IR, Initial DDDG, Idealistic DDDG
Realization Phase: Program Constrained DDDG, Resource Constrained DDDG, Power/Area Models
Other labels: Power/Area, Performance, Activity
DDDG: Dynamic Data Dependence Graph
Slide9
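The flow on the Aladdin Overview slide (trace, then DDDG, then a resource-constrained DDDG) can be illustrated with a toy sketch. This is not Aladdin's code. The trace format and the names `TRACE`, `RESOURCE`, `build_dddg`, and `schedule` are hypothetical, and every operation is assumed to take one cycle.

```python
from collections import defaultdict

# Hypothetical dynamic trace: (node id, op, destination, sources).
TRACE = [
    (0, "load", "a", []),
    (1, "load", "b", []),
    (2, "mul", "t0", ["a", "b"]),
    (3, "load", "c", []),
    (4, "add", "t1", ["t0", "c"]),
    (5, "store", "out", ["t1"]),
]

# Assumed mapping from op to the resource class it occupies.
RESOURCE = {"load": "mem", "store": "mem", "mul": "alu", "add": "alu"}


def build_dddg(trace):
    """Return {node: set of predecessor nodes} using read-after-write dependences."""
    last_writer = {}
    preds = {}
    for node, _op, dst, srcs in trace:
        preds[node] = {last_writer[s] for s in srcs if s in last_writer}
        last_writer[dst] = node
    return preds


def schedule(trace, preds, units):
    """Greedy cycle-by-cycle schedule; returns the total cycle count.

    units maps a resource class to how many ops of that class may issue per cycle.
    """
    if any(count < 1 for count in units.values()):
        raise ValueError("every resource class needs at least one unit")
    res = {node: RESOURCE[op] for node, op, _dst, _srcs in trace}
    finish = {}
    pending = [node for node, *_rest in trace]
    cycle = 0
    while pending:
        used = defaultdict(int)
        issued = []
        for node in pending:
            ready = all(p in finish and finish[p] <= cycle for p in preds[node])
            if ready and used[res[node]] < units[res[node]]:
                used[res[node]] += 1
                issued.append(node)
        for node in issued:
            finish[node] = cycle + 1  # single-cycle ops (toy assumption)
        pending = [n for n in pending if n not in issued]
        cycle += 1
    return cycle


preds = build_dddg(TRACE)
one_port = schedule(TRACE, preds, {"mem": 1, "alu": 1})
two_ports = schedule(TRACE, preds, {"mem": 2, "alu": 1})
```

With this trace, adding a second memory port shortens the schedule because the first two loads no longer serialize. The same graph gives a different cycle count for each set of design parameters, which is the idea behind the resource-constrained DDDG.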
Aladdin Take-Away
Compared to HLS and hand-written RTL for SHOC benchmarks and custom accelerator designs:
Cycle counts: within 2%
Power: within 5%
Area: within 7%
Large design space exploration (DSE) in minutes instead of hours/days with unmodified C/C++ algorithm description
Limitations
Dynamic approach: Aladdin depends on realistic workload inputs
Algorithm dependent: Aladdin enables DSE/algorithm exploration
Slide10
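The design space exploration mentioned on the Take-Away slide amounts to sweeping accelerator design parameters and keeping the designs that are not beaten on every metric. A minimal sketch follows, assuming a made-up cost model. `toy_model`, its constants, and the parameter ranges are hypothetical stand-ins for what a simulator such as Aladdin would report, not its actual models.

```python
from itertools import product
from math import ceil


def toy_model(num_fu, mem_bw, ops=1024, bytes_moved=4096):
    """Made-up cost model for illustration only (arbitrary units)."""
    cycles = max(ceil(ops / num_fu), ceil(bytes_moved / mem_bw))
    power = 1.0 * num_fu + 0.5 * mem_bw
    return cycles, power


def sweep(fu_options, bw_options):
    """Evaluate every combination of functional-unit count and memory bandwidth."""
    designs = []
    for num_fu, mem_bw in product(fu_options, bw_options):
        cycles, power = toy_model(num_fu, mem_bw)
        designs.append(
            {"num_fu": num_fu, "mem_bw": mem_bw, "cycles": cycles, "power": power}
        )
    return designs


def pareto_front(points):
    """Keep designs that no other design matches or beats on both cycles and power."""
    front = []
    for p in points:
        dominated = any(
            q["cycles"] <= p["cycles"]
            and q["power"] <= p["power"]
            and (q["cycles"] < p["cycles"] or q["power"] < p["power"])
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


designs = sweep([1, 2, 4, 8], [4, 8, 16, 32])
front = pareto_front(designs)
```

In a real flow, `toy_model` would be replaced by one simulator run per design point. The sweep and the Pareto filter stay the same.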
Aladdin enables pre-RTL simulation of accelerators with the rest of the SoC.
[Diagram: SoC with Big Cores, Small Cores, GPU, Sea of Fine-Grained Accelerators, Shared Resources, Memory Interface]
[Simulators shown: GPGPU-Sim, gem5, Ruby/GARNET, DRAMSim2]
Slide11
gem5-Aladdin Integration
[Diagram labels: CPU, Cache, DMA Engine, LLC, DRAM, and an Acc Datapath with Scratchpad, TLB, and Cache]
Slide12
gem5-Aladdin Integration
[Diagram labels: CPU, Cache, DMA Engine, LLC, DRAM, multiple Acc Datapaths (each with Scratchpad, TLB, and Cache) …, Acc Shared Cache]
Slide13
[Diagram labels: Acc, Cache, Memory; CPU, Cache, Memory]
Slide14
An increasing number of accelerators are integrated into both mobile SoCs and servers.
gem5-Aladdin integration enables rapid design space exploration of future accelerator-centric platforms.
Download Aladdin at http://vlsiarch.eecs.harvard.edu/aladdin