
21st Century Computer Architecture - PowerPoint Presentation

brambani . @brambani
Uploaded On 2020-06-19



Presentation Transcript

Slide1

21st Century Computer Architecture

A community white paper
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf

Technion, Haifa, Israel, June 2013

Information & Commun. Tech’s Impact
Semiconductor Technology’s Challenges
Computer Architecture’s Future
Example: Bypassing Paged Virtual Memory

Slide2

White Paper Participants

“*” contributed prose; “**” effort coordinator

Thanks to CCC, Erwin Gianchandani & Ed Lazowska for guidance and Jim Larus & Jeannette Wing for feedback

Sarita Adve, U Illinois *
David H. Albonesi, Cornell U
David Brooks, Harvard U
Luis Ceze, U Washington *
Sandhya Dwarkadas, U Rochester
Joel Emer, Intel/MIT
Babak Falsafi, EPFL
Antonio Gonzalez, Intel/UPC
Mark D. Hill, U Wisconsin *,**
Mary Jane Irwin, Penn State U *
David Kaeli, Northeastern U *
Stephen W. Keckler, NVIDIA/U Texas
Christos Kozyrakis, Stanford U
Alvin Lebeck, Duke U
Milo Martin, U Pennsylvania
José F. Martínez, Cornell U
Margaret Martonosi, Princeton U *
Kunle Olukotun, Stanford U
Mark Oskin, U Washington
Li-Shiuan Peh, M.I.T.
Milos Prvulovic, Georgia Tech
Steven K. Reinhardt, AMD
Michael Schulte, AMD/U Wisconsin
Simha Sethumadhavan, Columbia U
Guri Sohi, U Wisconsin
Daniel Sorin, Duke U
Josep Torrellas, U Illinois *
Thomas F. Wenisch, U Michigan *
David Wood, U Wisconsin *
Katherine Yelick, UC Berkeley/LBNL *

Slide3

20th Century ICT Set Up

Information & Communication Technology (ICT) Has Changed Our World
<long list omitted>

Required innovations in algorithms, applications, programming languages, …, & system software

Key (invisible) enablers of (cost-)performance gains
Semiconductor technology (“Moore’s Law”)
Computer architecture (~80x per Danowitz et al.)

Slide4

Enablers: Technology + Architecture

[Figure: Danowitz et al., CACM 04/2012, Figure 1; curves labeled Technology and Architecture]

Slide5

21st Century Promise

ICT Promises Much More
Data-centric personalized health care
Computation-driven scientific discovery
Human network analysis
Much more: known & unknown

Characterized by
Big Data
Always Online
Secure/Private
…

Whither enablers of future (cost-)performance gains?

Slide6

Technology’s Challenges 1/2

Late 20th Century: Moore’s Law: 2× transistors/chip
The New Reality: Transistor count still 2× BUT…

Late 20th Century: Dennard Scaling: ~constant power/chip
The New Reality: Gone. Can’t repeatedly double power/chip

Slide7

Classic CMOS Dennard Scaling: the Science behind Moore’s Law

National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)

Scaling:
Voltage: V/a
Oxide: t_OX/a

Results:
Power/ckt: 1/a^2
Power Density: ~Constant

(Finding 2)

Source: Future of Computing Performance: Game Over or Next Level?, National Academy Press, 2011

Slide8

Post-classic CMOS Dennard Scaling

National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)

Post Dennard CMOS Scaling Rule

Scaling:
Voltage: V/a becomes V
Oxide: t_OX/a

Results:
Power/ckt: 1/a^2 becomes 1
Power Density: ~Constant becomes a^2

TODO: Chips w/ higher power (no), smaller (), dark silicon (), or other (?)
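The two scaling rules can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the textbook dynamic-power model P = C·V²·f, with capacitance scaling as 1/a, frequency as a, and circuit area as 1/a^2 per generation. None of these assumptions is stated on the slides, and the function name `scale_circuit` is hypothetical.

```python
from fractions import Fraction


def scale_circuit(a, voltage_scales=True):
    """Relative (power per circuit, power density) after one shrink by factor a.

    Assumed model (not from the slides): P = C * V^2 * f,
    with C -> C/a, f -> a*f, area -> area/a^2.
    """
    a = Fraction(a)
    capacitance = 1 / a
    frequency = a
    # Classic Dennard scaling lowers the voltage to V/a; post-Dennard keeps it at V.
    voltage = 1 / a if voltage_scales else Fraction(1)
    area = 1 / a**2
    power = capacitance * voltage**2 * frequency
    density = power / area
    return power, density


classic = scale_circuit(2, voltage_scales=True)
post_dennard = scale_circuit(2, voltage_scales=False)
print("classic:     ", classic)       # power/ckt 1/4 (= 1/a^2), density 1 (constant)
print("post-Dennard:", post_dennard)  # power/ckt 1, density 4 (= a^2)
```

With a = 2, the classic rule gives power per circuit of 1/a^2 at constant power density, while holding voltage fixed leaves power per circuit unchanged and raises density by a^2. Under this model, that is why the chip must run hotter, get smaller, or leave silicon dark.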

Slide9

Technology’s Challenges 2/2

Late 20th Century: Moore’s Law: 2× transistors/chip
The New Reality: Transistor count still 2× BUT…

Late 20th Century: Dennard Scaling: ~constant power/chip
The New Reality: Gone. Can’t repeatedly double power/chip

Late 20th Century: Modest (hidden) transistor unreliability
The New Reality: Increasing transistor unreliability can’t be hidden

Late 20th Century: Focus on computation over communication
The New Reality: Communication (energy) more expensive than computation

Late 20th Century: 1-time costs amortized via mass market
The New Reality: One-time cost much worse & want specialized platforms

How should architects step up as technology falters?

Slide10

21st Century Comp Architecture

20th Century: Single-chip in generic computer
21st Century: Architecture as Infrastructure: Spanning sensors to clouds. Performance plus security, privacy, availability, programmability, …

20th Century: Performance via invisible instr.-level parallelism
21st Century: Energy First: Parallelism, Specialization, Cross-layer design

20th Century: Predictable technologies: CMOS, DRAM, & disks
21st Century: New technologies (non-volatile memory, near-threshold, 3D, photonics, …). Rethink: memory & storage, reliability, communication

Cross-Cutting: Break current layers with new interfaces

Slide11

21st Century Comp Architecture

20th Century: Single-chip in stand-alone computer
21st Century: Architecture as Infrastructure: Spanning sensors to clouds. Performance plus security, privacy, availability, programmability, …

20th Century: Performance via invisible instr.-level parallelism
21st Century: Energy First: Parallelism, Specialization, Cross-layer design

20th Century: Predictable technologies: CMOS, DRAM, & disks
21st Century: New technologies (non-volatile memory, near-threshold, 3D, photonics, …). Rethink: memory & storage, reliability, communication

Cross-Cutting: Break current layers with new interfaces

Slide12

What Research Exactly?

Research areas in white paper (& backup slides)
Architecture as Infrastructure: Spanning Sensors to Clouds
Energy First
Technology Impacts on Architecture
Cross-Cutting Issues & Interfaces

Much more research developed by future PIs!

E.g.: Efficient Virtual Memory for Big Memory Servers
Basu, Gandhi, Chang, Hill, & Swift [ISCA 2013]
Big Memory: graph500, memcached, databases
Self-manage most memory (e.g., bufferpool)

Slide13

Execution Time Overhead: TLB Misses (10/5/12)

[Figure: execution time overhead due to TLB misses; lower is better]

Significant waste
Larger memory?
Byte-addr NVM?

Slide14

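Hardware: Direct Segment

[Diagram: a virtual address (VA) is translated to a physical address (PA) by (1) Conventional Paging or (2) the Direct Segment, using BASE, LIMIT, and OFFSET registers]

Why Direct Segment?
Matches Big Memory Workload needs
NO Paging => NO TLB Miss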
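The two translation paths on this slide can be sketched in software. This is a toy model, not the hardware design. It assumes a virtual address inside [BASE, LIMIT) maps to VA + OFFSET and that all other addresses go through a page table. The class name `DirectSegmentMMU`, the 4 KiB page size, and the dictionary page table are hypothetical choices for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages; the slide does not give a page size


class DirectSegmentMMU:
    """Toy model: one direct segment in front of a conventional page table."""

    def __init__(self, base, limit, offset, page_table):
        self.base = base
        self.limit = limit
        self.offset = offset
        # Hypothetical page table: virtual page number -> physical frame number.
        self.page_table = page_table

    def translate(self, va):
        """Return (physical address, name of the path that produced it)."""
        # Path 2: addresses inside [BASE, LIMIT) are translated by adding
        # OFFSET, with no page-table walk and therefore no TLB miss.
        if self.base <= va < self.limit:
            return va + self.offset, "direct segment"
        # Path 1: everything else uses conventional paging.
        vpn, page_offset = divmod(va, PAGE_SIZE)
        frame = self.page_table[vpn]  # a KeyError here stands in for a page fault
        return frame * PAGE_SIZE + page_offset, "paging"


mmu = DirectSegmentMMU(
    base=0x1000_0000,
    limit=0x5000_0000,
    offset=0x2000_0000,
    page_table={0x100: 0x7},
)
print(mmu.translate(0x1234_5678))  # inside the segment
print(mmu.translate(0x0010_0ABC))  # outside the segment, paged
```

A big-memory workload that keeps most of its data (e.g., a buffer pool) inside the segment would take the first branch for nearly every access. That is the intuition behind "NO Paging => NO TLB Miss".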

Slide15

Execution Time Overhead: TLB Misses

92-100% TLB “misses” to direct segment

Requires: Both small SW + small HW changes

Slide16

21st Century Computer Architecture

A community white paper
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf

Technion, Haifa, Israel, June 2013

Information & Commun. Tech’s Impact
Semiconductor Technology’s Challenges
Computer Architecture’s Future
Example: Bypassing Paged Virtual Memory

Slide17

Pre-Competitive Research Justified

Retain (cost-)performance enabler to ICT revolution
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf

Successful companies cannot do this by themselves
Lack needed long-term focus
Don’t want to pay for what benefits all
Resist transcending interfaces that define their products

Slide18

White Paper Process

Late March 2012
CCC contacts coordinator & forms group

April 2012
Brainstorm (meetings/online doc)
Read related docs (PCAST, NRC Game Over, ACAR1/2, …)
Use online doc for intro & outline then parallel sections
Rotated authors to revise sections

May 2012
Brainstorm list of researchers in/out of comp. architecture
Solicit researcher feedback/endorsement
Do distributed revision & redo of intro
Release May 25 to CCC & via email

Kudos to participants on executing on a tight timetable

Slide19

Back Up Slides

Detailed research areas in white paper
Architecture as Infrastructure: Spanning Sensors to Clouds
Energy First
Technology Impacts on Architecture
Cross-Cutting Issues & Interfaces
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf

Findings on National Academy “Game Over” Study

Glimpse at DARPA/ISAT Workshop “Advancing Computer Systems without Technology Progress”

Slide20

1. Architecture as Infrastructure: Spanning Sensors to Clouds

Beyond a chip in a generic computer
To pillar of 21st century societal infrastructure.

Computation in context (sensor, mobile, …, data center)
Systems often large & distributed
Communication issues can dominate computation
Goals beyond performance (battery life, form factor)

Opportunities (not exhaustive)
Reliable sensors harvesting (intermittent) energy
Smart phones to Star Trek’s medical “tricorder”
Cloud infrastructure suitable for both “Big Data” streams & low-latency quality-of-service with stragglers
Analysis & design tools that scale

Slide21

2. Energy First

Beyond single-core performance computer
To (cost-)performance per watt/joule

Energy across the layers
Circuit/technology (near-threshold CMOS, 3D stacking)
Architecture (reducing unnecessary data movement)
Software (communication-reducing algorithms)

Parallelism to save energy
Vast (fine-grained) homogeneous & heterogeneous
Improved SW stack
Applications focus (beyond graphics processing units)

Specialization for performance & energy efficiency
Abstractions for specialization (reducing 1-time cost)
Energy-efficient memory hierarchies
Reconfigurable logic structures

Slide22

3. Technology Impacts on Architecture

Beyond CMOS, DRAM, & Disks of last 3+ decades to

Using replacement circuit technologies
Sub/near-threshold CMOS, QWFETs, TFETs, and QCAs

Non-volatile storage
Beyond flash memory to STT-RAM, PCRAM, & memristor

3D die stacking & interposers
logic, cache, small main memory

Photonic interconnects
Inter- & even intra-chip

Design automation
from circuit-design w/ new technologies to pre-RTL functional, performance, power, area modeling of heterogeneous chips & systems

Slide23

4. Cross-Cutting Issues & Interfaces

Beyond performance w/ stable interfaces to

New design goals (for pillar of societal infrastructure)
Verifiability (bugs kill)
Reliability (“dependability” computing base?)
Security/Privacy (w/ non-volatile memory?)
Programmability (time to correct-performant solution)

Better Interfaces
High-level information (quality of service, provenance)
Parallelism ((in)dependence, (lack of) side-effects)
Orchestrating communication ((recursive) locality)
Security/Reliability (fine-grain protection)

Slide24

Executive summary (Added to National Academy Slides)

Highlights of National Academy Findings
(F1) Computer hardware has transitioned to multicore
(F2) Dennard scaling of CMOS has broken down
(F3) Parallelism and locality must be exploited by software
(F4) Chip power will soon limit multicore scaling
Eight recommendations from algorithms to education

We know all of this at some level, BUT:
Are we all acting on this knowledge or hoping for business as usual?
Thinking beyond next paper to where future value will be created?

Questions Asked but Not Answered Embedded in NA Talk

Briefly Close with Kübler-Ross Stages of Grief:
Denial
Acceptance

Source: Future of Computing Performance: Game Over or Next Level?, National Academy Press, 2011
Mark Hill talk (http://www.cs.wisc.edu/~markhill/NRCgameover_wisconsin_2011_05.pptx)

Slide25

The Graph

[Figure: System Capability (log) by decade, 80s through 50s; regions labeled CMOS, Fallow Period, and New Technology; “Our Focus” marked]

Source: Advancing Computer Systems without Technology Progress, ISAT Outbrief (http://www.cs.wisc.edu/~markhill/papers/isat2012_ACSWTP.pdf), Mark D. Hill and Christos Kozyrakis, DARPA/ISAT Workshop, March 26-27, 2012.

Approved for Public Release, Distribution Unlimited

The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.

Slide26

Surprise 1 of 2

Can Harvest in the “Fallow” Period!
2 decades of Moore’s Law-like perf./energy gains
Wring out inefficiencies used to harvest Moore’s Law

HW/SW Specialization/Co-design (3-100x)
Reduce SW Bloat (2-1000x)
Approximate Computing (2-500x)
---------------------------------------------------
~1000x = 2 decades of Moore’s Law!
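The "~1000x = 2 decades" equivalence can be sanity-checked with simple arithmetic. The sketch below assumes a doubling every two years, a common reading of Moore's Law that the slide does not itself state. The helper name `moores_law_gain` is hypothetical. Multiplying the ends of the three ranges only brackets the combined gain; it is not a claim about how the workshop arrived at ~1000x.

```python
import math

DOUBLING_PERIOD_YEARS = 2  # assumption: one doubling every two years


def moores_law_gain(years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Cumulative gain from repeated doubling over the given number of years."""
    return 2 ** (years / doubling_period)


# Ranges as listed on the slide: (low, high) improvement factors.
ranges = {
    "HW/SW specialization/co-design": (3, 100),
    "Reduce SW bloat": (2, 1000),
    "Approximate computing": (2, 500),
}

low = math.prod(lo for lo, _ in ranges.values())
high = math.prod(hi for _, hi in ranges.values())

print("two decades of doubling:", moores_law_gain(20))  # 2**10
print("product of low ends:   ", low)
print("product of high ends:  ", high)
```

Under these assumptions, twenty years of doubling is 2^10 = 1024, and the slide's ~1000x falls between the product of the low ends and the product of the high ends.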

Slide27

“Surprise” 2 of 2

Systems must exploit LOCALITY-AWARE parallelism
Parallelism Necessary, but not Sufficient
As communication’s energy costs dominate
Shouldn’t be a surprise, but many are in denial

Both surprises hard, requiring “vertical cut” thru SW/HW