Slide 1
21st Century Computer Architecture: A community white paper
NSF Outbrief on June 22, 2012

Information & Commun. Tech's Impact
Semiconductor Technology's Challenges
Computer Architecture's Future
Pre-Competitive Research Justified
Process, Participants & Backups
Slide 2
20th Century ICT Set Up
Information & Communication Technology (ICT) has changed our world <long list omitted>
Required innovations in algorithms, applications, programming languages, …, & system software
Key (invisible) enablers of (cost-)performance gains:
Semiconductor technology ("Moore's Law")
Computer architecture (~80x per Danowitz et al.)
Slide 3
Enablers: Technology + Architecture
[Figure: performance gains attributed to Technology vs. Architecture]
Danowitz et al., CACM 04/2012, Figure 1
Slide 4
21st Century Promise
ICT promises much more:
Data-centric personalized health care
Computation-driven scientific discovery
Human network analysis
Much more: known & unknown
Characterized by: Big Data, Always Online, Secure/Private, …
Whither enablers of future (cost-)performance gains?
Slide 5
Technology's Challenges 1/2

Late 20th Century → The New Reality
Moore's Law — 2× transistors/chip → Transistor count still 2×, BUT…
Dennard Scaling — ~constant power/chip → Gone. Can't repeatedly double power/chip
Slide 6
Classic CMOS Dennard Scaling: the Science behind Moore's Law
National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)

Scaling (feature sizes shrink by factor a):
Oxide: t_OX/a
Voltage: V/a
Results:
Power/ckt: 1/a²
Power Density: ~Constant (Finding 2)

Source: Future of Computing Performance: Game Over or Next Level?, National Academy Press, 2011
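The classic scaling rule can be checked with a few lines of arithmetic. This is a sketch, not the slide's exact derivation: a is the linear shrink factor, dynamic power per circuit is P = C·V²·f, and each quantity is relative to the previous process generation.

```python
# Classic Dennard scaling: shrink every dimension and the supply
# voltage by a factor a, and power density stays ~constant.
a = 1.4                    # ~one generation: sqrt(2) linear shrink
C = 1 / a                  # capacitance scales with feature size
V = 1 / a                  # supply voltage scales as V/a
f = a                      # switching frequency rises as a

power_per_ckt = C * V**2 * f                   # = 1/a^2, as on the slide
area_per_ckt = 1 / a**2                        # circuit area shrinks as 1/a^2
power_density = power_per_ckt / area_per_ckt   # ~1.0: constant
```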
Slide 7
Post-classic CMOS Dennard Scaling

Scaling:
Oxide: t_OX/a
Voltage: V/a → ~V
Results:
Power/ckt: 1/a² → ~1/a
Power Density: ~Constant → grows each generation

Post-Dennard CMOS scaling rule. TODO: chips w/ higher power (no), smaller chips, dark silicon, or other (?)
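The post-Dennard contrast is just a change of assumptions in the same arithmetic. A minimal sketch, assuming (as roughly happened after the mid-2000s) that supply voltage and clock frequency stop scaling while capacitance and area keep shrinking:

```python
# Relative power density per process generation under classic Dennard
# scaling vs. the post-Dennard reality (dynamic power P = C * V^2 * f).
def relative_power_density(a, dennard):
    C = 1 / a                       # capacitance still shrinks
    V = 1 / a if dennard else 1.0   # post-Dennard: voltage stays ~flat
    f = a if dennard else 1.0       # ...and frequency stops rising
    power_per_ckt = C * V**2 * f
    area_per_ckt = 1 / a**2
    return power_per_ckt / area_per_ckt

print(relative_power_density(1.4, dennard=True))   # ~1.0: constant density
print(relative_power_density(1.4, dennard=False))  # ~1.4: density grows ~a
```

With a fixed chip power budget, a density that grows each generation means a shrinking fraction of transistors can switch at once, which is the "dark silicon" option named on the slide.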
Slide 8
Technology's Challenges 2/2

Late 20th Century → The New Reality
Moore's Law — 2× transistors/chip → Transistor count still 2×, BUT…
Dennard Scaling — ~constant power/chip → Gone. Can't repeatedly double power/chip
Modest (hidden) transistor unreliability → Increasing transistor unreliability can't be hidden
Focus on computation over communication → Communication (energy) more expensive than computation
One-time costs amortized via mass market → One-time cost much worse & want specialized platforms

How should architects step up as technology falters?
Slide 9
21st Century Comp Architecture

20th Century → 21st Century
Single-chip in generic computer → Architecture as Infrastructure: spanning sensors to clouds; performance plus security, privacy, availability, programmability, …
Performance via invisible instr.-level parallelism → Energy First: parallelism, specialization, cross-layer design
Predictable technologies: CMOS, DRAM, & disks → New technologies (non-volatile memory, near-threshold, 3D, photonics, …); rethink memory & storage, reliability, communication
Cross-Cutting: break current layers with new interfaces
Slide 11
What Research Exactly? Example
Dream of sensor/phone apps that burn >> the 1W package
Note: short-term use often followed by long idle period
Exploit "Computational Sprinting" (to idle!):
16 cores (32W) for 100s of ms in a 1W package
Use phase-change material as a thermal "capacitor"
Research areas in white paper (& backup slides):
Architecture as Infrastructure: Spanning Sensors to Clouds
Energy First
Technology Impacts on Architecture
Cross-Cutting Issues & Interfaces
Computational Sprinting funded by NSF CCF-1161505
Much more research developed by future PIs!
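The sprinting numbers above imply how much heat the thermal "capacitor" must buffer. A back-of-envelope sketch, assuming a single ~100 ms sprint and negligible idle power while the chip cools back down:

```python
# Energy budget for one computational sprint, using the slide's numbers:
# 16 cores at 32 W for ~100 ms inside a 1 W-sustainable package.
p_sprint = 32.0    # W drawn during the sprint
p_sustain = 1.0    # W the package can dissipate continuously
t_sprint = 0.1     # s, one ~100 ms sprint

# Heat the phase-change thermal "capacitor" must absorb:
buffered_joules = (p_sprint - p_sustain) * t_sprint   # ~3.1 J
# Idle time needed to drain that heat at the sustainable rate:
t_idle = buffered_joules / p_sustain                  # ~3.1 s
```

So each brief sprint buys seconds of mandatory idle, which matches the slide's note that short-term use is followed by a long idle period.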
Slide 12
Validating Thermal Models
Source: Computational Sprinting (talk), NSF CCF-1161505
Raghavan, Luo, Chandawalla, Papaefthymiou, Pipe, Wenisch & Martin, HPCA 2012 (http://www.cis.upenn.edu/acg/papers/DaSi-talk.pptx)
Slide 13
Pre-Competitive Research Justified
Retain the (cost-)performance enabler of the ICT revolution
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf
Successful companies cannot do this by themselves:
Lack needed long-term focus
Don't want to pay for what benefits all
Resist transcending interfaces that define their products
Slide 14
White Paper Process
Late March 2012: CCC contacts coordinator & forms group
April 2012:
Brainstorm (meetings/online doc)
Read related docs (PCAST, NRC Game Over, ACAR1/2, …)
Use online doc for intro & outline, then parallel sections
Rotated authors to revise sections
May 2012:
Brainstorm list of researchers in/out of comp. architecture
Solicit researcher feedback/endorsement
Do distributed revision & redo of intro
Release May 25 to CCC & via email
Kudos to participants on executing on a tight timetable
Slide 15
White Paper Participants
"*" contributed prose; "**" effort coordinator
Thanks to CCC, Erwin Gianchandani & Ed Lazowska for guidance and Jim Larus & Jeannette Wing for feedback

Sarita Adve, U Illinois *
David H. Albonesi, Cornell U
David Brooks, Harvard U
Luis Ceze, U Washington *
Sandhya Dwarkadas, U Rochester
Joel Emer, Intel/MIT
Babak Falsafi, EPFL
Antonio Gonzalez, Intel/UPC
Mark D. Hill, U Wisconsin *,**
Mary Jane Irwin, Penn State U *
David Kaeli, Northeastern U *
Stephen W. Keckler, NVIDIA/U Texas
Christos Kozyrakis, Stanford U
Alvin Lebeck, Duke U
Milo Martin, U Pennsylvania
José F. Martínez, Cornell U
Margaret Martonosi, Princeton U *
Kunle Olukotun, Stanford U
Mark Oskin, U Washington
Li-Shiuan Peh, M.I.T.
Milos Prvulovic, Georgia Tech
Steven K. Reinhardt, AMD
Michael Schulte, AMD/U Wisconsin
Simha Sethumadhavan, Columbia U
Guri Sohi, U Wisconsin
Daniel Sorin, Duke U
Josep Torrellas, U Illinois *
Thomas F. Wenisch, U Michigan *
David Wood, U Wisconsin *
Katherine Yelick, UC Berkeley/LBNL *
Slide 16
Back Up Slides
Detailed research areas in white paper:
Architecture as Infrastructure: Spanning Sensors to Clouds
Energy First
Technology Impacts on Architecture
Cross-Cutting Issues & Interfaces
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf
Findings on National Academy "Game Over" Study
Glimpse at DARPA/ISAT Workshop "Advancing Computer Systems without Technology Progress"
Slide 17
1. Architecture as Infrastructure: Spanning Sensors to Clouds
Beyond a chip in a generic computer to a pillar of 21st-century societal infrastructure:
Computation in context (sensor, mobile, …, data center)
Systems often large & distributed
Communication issues can dominate computation
Goals beyond performance (battery life, form factor)
Opportunities (not exhaustive):
Reliable sensors harvesting (intermittent) energy
Smart phones to Star Trek's medical "tricorder"
Cloud infrastructure suitable for both "Big Data" streams & low-latency quality-of-service with stragglers
Analysis & design tools that scale
Slide 18
2. Energy First
Beyond single-core performance to (cost-)performance per watt/joule
Energy across the layers:
Circuit/technology (near-threshold CMOS, 3D stacking)
Architecture (reducing unnecessary data movement)
Software (communication-reducing algorithms)
Parallelism to save energy:
Vast (fine-grained) homogeneous & heterogeneous
Improved SW stack
Applications focus (beyond graphics processing units)
Specialization for performance & energy efficiency:
Abstractions for specialization (reducing one-time cost)
Energy-efficient memory hierarchies
Reconfigurable logic structures
Slide 19
3. Technology Impacts on Architecture
Beyond the CMOS, DRAM, & disks of the last 3+ decades to:
Using replacement circuit technologies: sub/near-threshold CMOS, QWFETs, TFETs, and QCAs
Non-volatile storage: beyond flash memory to STT-RAM, PCRAM, & memristors
3D die stacking & interposers: logic, cache, small main memory
Photonic interconnects: inter- & even intra-chip
Design automation: from circuit design w/ new technologies to pre-RTL functional, performance, power, area modeling of heterogeneous chips & systems
Slide 20
4. Cross-Cutting Issues & Interfaces
Beyond performance w/ stable interfaces to:
New design goals (for pillar of societal infrastructure):
Verifiability (bugs kill)
Reliability ("dependability" computing base?)
Security/Privacy (w/ non-volatile memory?)
Programmability (time to correct-performant solution)
Better interfaces:
High-level information (quality of service, provenance)
Parallelism ((in)dependence, (lack of) side-effects)
Orchestrating communication ((recursive) locality)
Security/Reliability (fine-grain protection)
Slide 21
Executive Summary (Added to National Academy Slides)
Highlights of National Academy findings:
(F1) Computer hardware has transitioned to multicore
(F2) Dennard scaling of CMOS has broken down
(F3) Parallelism and locality must be exploited by software
(F4) Chip power will soon limit multicore scaling
Eight recommendations from algorithms to education
We know all of this at some level, BUT:
Are we all acting on this knowledge or hoping for business as usual?
Thinking beyond the next paper to where future value will be created?
Questions asked but not answered embedded in NA talk
Briefly close with Kübler-Ross stages of grief: Denial … Acceptance
Source: Future of Computing Performance: Game Over or Next Level?, National Academy Press, 2011; Mark Hill talk (http://www.cs.wisc.edu/~markhill/NRCgameover_wisconsin_2011_05.pptx)
Slide 22
The Graph
[Figure: System Capability (log) vs. time, 80s through 50s: the CMOS curve flattens into a "Fallow Period" before a New Technology curve takes over; "Our Focus" marks the fallow period]
Source: Advancing Computer Systems without Technology Progress, ISAT Outbrief (http://www.cs.wisc.edu/~markhill/papers/isat2012_ACSWTP.pdf), Mark D. Hill and Christos Kozyrakis, DARPA/ISAT Workshop, March 26-27, 2012. Approved for Public Release, Distribution Unlimited. The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.
Slide 23
Surprise 1 of 2
Can harvest in the "Fallow" period!
2 decades of Moore's Law-like perf./energy gains
Wring out inefficiencies used to harvest Moore's Law:
HW/SW Specialization/Co-design (3-100x)
Reduce SW Bloat (2-1000x)
Approximate Computing (2-500x)
~1000x = 2 decades of Moore's Law!
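The closing equivalence above checks out arithmetically, assuming one performance/energy doubling roughly every two years:

```python
# Arithmetic behind "~1000x = 2 decades of Moore's Law": ten doublings
# in twenty years compound to just over a thousandfold.
years = 20
doublings = years / 2        # one doubling per ~2 years
gain = 2 ** doublings
print(gain)                  # 1024.0, i.e. the slide's "~1000x"
```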
Slide 24
"Surprise" 2 of 2
Systems must exploit LOCALITY-AWARE parallelism
Parallelism necessary, but not sufficient
As communication's energy costs dominate
Shouldn't be a surprise, but many are in denial
Both surprises hard, requiring "vertical cut" thru SW/HW