Rapid Exploration of Accelerator-Rich Architectures

Uploaded On 2017-03-14



Presentation Transcript

Slide1

Rapid Exploration of Accelerator-Rich Architectures: Automation from Concept to Prototyping

David Brooks, Jason Cong, Zhenman Fang, Yakun Sophia Shao, and Sam Xi

Harvard University & UCLA

Slide2

Tutorial Outline

Time | Topic | Speaker
8:30 am – 9:00 am | Accelerator Research Infrastructure Overview | Sophia Shao
9:00 am – 9:30 am | Aladdin: Accelerator Pre-RTL Modeling | Sophia Shao
9:30 am – 10:00 am | Rapid Hardware Specialization with HLS: Glass Half Full | Prof. Zhiru Zhang
10:00 am – 10:30 am | PARADE: HLS-Based Accelerator-Rich Architecture Simulation | Zhenman Fang
10:30 am – 11:00 am | Break
11:00 am – 11:30 am | gem5-Aladdin: Accelerator System Co-Design | Sam Xi
11:30 am – 12:00 pm | ARAPrototyper: FPGA Prototyping | Zhenman Fang
12:00 pm – 1:30 pm | Lunch
1:30 pm – 2:00 pm | Virtual Machine Setup | Sophia Shao & Sam Xi
2:00 pm – 2:30 pm | Hands-on: Accelerator Design Space Exploration using Aladdin | Sophia Shao
2:30 pm – 3:00 pm | Hands-on: SoC Design Space Exploration using gem5-Aladdin | Sam Xi

Slide3

Moore’s Law

Slide4

CMOS Scaling is Slowing Down

Source: http://www.anandtech.com/show/9447/intel-10nm-and-kaby-lake

Process nodes shown: 180 nm, 130 nm, 90 nm, 65 nm, 45 nm, 32 nm, 22 nm, 14 nm, 10 nm

Slide5

CMOS Technology Scaling

Technological Fallow Period

Slide6

Potential for Specialized Architectures [Zhang and Brodersen]

Figure labels (index: application): 16: Encryption; 17: Hearing Aid; 18: FIR for disk read; 19: MPEG Encoder; 20: 802.11 Baseband

Slide7

Cores, GPUs, and Accelerators: Apple A8 SoC

Out-of-Core Accelerators

Slide8

Cores, GPUs, and Accelerators: Apple A8 SoC

Out-of-Core Accelerators

Slide9

Cores, GPUs, and Accelerators: Apple A8 SoC

Out-of-Core Accelerators

Figure labels: Maltiel Consulting estimates; Our estimates

Slide10

Challenges in Accelerators

Flexibility: Fixed-function accelerators are only designed for the target applications.

Programmability: Today’s accelerators are explicitly managed by programmers.

Slide11

Today’s SoC: OMAP 4 SoC

Slide12

Today’s SoC: OMAP 4 SoC

Block diagram labels: ARM Cores, GPU, DSP (×2), System Bus, Secondary Bus (×2), Tertiary Bus, DMA (×2), SD, USB (×2), Audio, Video, Face, Imaging

Slide13

Challenges in Accelerators

Flexibility: Fixed-function accelerators are only designed for the target applications.

Programmability: Today’s accelerators are explicitly managed by programmers.

Design Cost: Accelerator (and RTL) implementation is inherently tedious and time-consuming.

Slide14

Today’s SoC

Block diagram labels: GPU/DSP, CPU (×2), Buses, Mem Interface, Acc (×9)

Slide15

Future Accelerator-Centric Architectures

Challenges: Flexibility, Design Cost, Programmability

How to decompose applications into accelerators?

How to rapidly design lots of accelerators?

How to design and manage the shared resources?

Block diagram labels: GPU/DSP, Big Cores, Small Cores, Shared Resources, Memory Interface, Sea of Fine-Grained Accelerators

Slide16

PARADE: Platform for Accelerator-Rich Architectural Design & Exploration [ICCAD 15]

Extended gem5 (McPAT) for X86 CPU, with OS

Auto-generated accelerators based on HLS (AutoPilot)

Added SPM, DMA, GAM & TLB model

Extended Ruby (CACTI) for coherent cache hierarchy

gem5 memory model [ISPASS 14]

Extended Garnet (DSENT) for NoC

Slide17

ARAPrototyper: Prototyping an ARA on FPGA

Using Xilinx Zynq SoC (FPGA fabrics + ARM)

Major components of an ARA:

General processor cores

A sea of heterogeneous accelerators

Memory system + interconnects (NoC)

Slide18

Contributions

Block diagram labels: GPU/DSP, Big Cores, Small Cores, Shared Resources, Memory Interface, Sea of Fine-Grained Accelerators

gem5-Aladdin: Accelerator-System Co-Design [MICRO’16]

Accelerator Design w/ High-Level Synthesis [ISLPED’13_1]

Aladdin: Accelerator Pre-RTL, Power-Performance Simulator [ISCA’14, TopPicks’15]

MachSuite: Accelerator Benchmark Suite [IISWC’14]

WIICA: Accelerator Workload Characterization [ISPASS’13]

Slide19

Tutorial Outline

Time | Topic | Speaker
8:30 am – 9:00 am | Accelerator Research Infrastructure Overview | Sophia Shao
9:00 am – 9:30 am | Aladdin: Accelerator Pre-RTL Modeling | Sophia Shao
9:30 am – 10:00 am | Rapid Hardware Specialization with HLS: Glass Half Full | Prof. Zhiru Zhang
10:00 am – 10:30 am | PARADE: HLS-Based Accelerator-Rich Architecture Simulation | Zhenman Fang
10:30 am – 11:00 am | Break
11:00 am – 11:30 am | gem5-Aladdin: Accelerator System Co-Design | Sam Xi
11:30 am – 12:00 pm | ARAPrototyper: FPGA Prototyping | Zhenman Fang
12:00 pm – 1:30 pm | Lunch
1:30 pm – 2:00 pm | Virtual Machine Setup | Sophia Shao & Sam Xi
2:00 pm – 2:30 pm | Hands-on: Accelerator Design Space Exploration using Aladdin | Sophia Shao
2:30 pm – 3:00 pm | Hands-on: SoC Design Space Exploration using gem5-Aladdin | Sam Xi