
Analysis of a Chip Multiprocessor Using Scientific Applications
  Gilbert Hendry, Aleksandr Biberman, Johnnie Chan, Benjamin G. Lee, Luca P. Carloni, Keren Bergman, Shoaib Kamil, Marghoob Mohiyuddin, Ankit Jain, Leonid Oliker, John Kubiatowicz, John Shalf

Motivation
  CMPs of the future = 3D stacking
  Lots of data on chip
  Photonics offers key advantages
  Figure labels: network layer, memory layers, multi-core processor layer
  5/21/2009, International Symposium on Networks-on-Chip

Why Photonics?
  Photonics changes the rules for bandwidth, energy, and distance.
  ELECTRONICS: Buffer, receive, and re-transmit at every router. Each bus lane is routed independently ($P \propto N_{\mathrm{LANES}}$). Off-chip BW is pin-limited and power hungry.
  OPTICS: Modulate/receive a high-bandwidth data stream once per communication event. A broadband switch routes the entire multi-wavelength stream. Off-chip BW = on-chip BW for nearly the same power.

Silicon Photonic Integration
  MIT, 2008; IBM, 2007; Cornell, 2005; Sandia, 2008; Ghent, 2007; Columbia, 2008

Related Work
  Shacham, NOCS '07; Vantrease, ISCA '08; Batten, HOTI '08

Hybrid Photonic Network
  Figure labels: Compute, Electronic Control, Photonic Transmission

Contributions
  This work achieves:
  Accurate simulation
  Application-based workloads
  Comparison of electronic and photonic networks

NanoPhotonic Devices
  Figure labels: Laser, Photodetectors, Electronic data, Silicon waveguides, Ring resonator (modulator), Ring resonator (filter)
  Source: L. Chen, OE, 2008

Switching Building Blocks
  Broadband 2×2 Switch (B. G. Lee, ECOC 2008)
  Figure labels: Cross State, Bar State, Transmission

Switch Characterization
  Broadband 1×2 Switch
  [A. Biberman et al., LEOS, 2007]
  Figure labels: ER, IL
  Loss Parameter                   Value
  Waveguide propagation            0.5 dB/cm
  Waveguide crossing               0.05 dB
  Waveguide bend                   0.005 dB/90°
  Passing by Micro-Ring (OFF)      0 dB
  Coupling into Micro-Ring (ON)    0.5 dB
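The loss parameters on this slide can be combined into a per-path insertion loss by summing the contribution of each element a signal meets. The sketch below is only an illustration of that arithmetic: the function name and the example path (length, crossing count, bend count, ring counts) are hypothetical and do not come from the slides.

```python
# Loss parameters copied from the Switch Characterization table (all in dB).
PROPAGATION_DB_PER_CM = 0.5
CROSSING_DB = 0.05
BEND_DB_PER_90_DEG = 0.005
RING_PASS_OFF_DB = 0.0
RING_COUPLE_ON_DB = 0.5


def path_insertion_loss_db(length_cm, crossings, bends_90, rings_passed, rings_coupled):
    """Sum the per-element losses along one optical path.

    All arguments describe a hypothetical path; the slides do not list them.
    """
    return (length_cm * PROPAGATION_DB_PER_CM
            + crossings * CROSSING_DB
            + bends_90 * BEND_DB_PER_90_DEG
            + rings_passed * RING_PASS_OFF_DB
            + rings_coupled * RING_COUPLE_ON_DB)


# Hypothetical path: 2 cm of waveguide, 40 crossings, 10 bends,
# 30 rings passed in the OFF state, 4 rings coupled into (ON state).
example_loss_db = path_insertion_loss_db(2.0, 40, 10, 30, 4)
print(f"{example_loss_db:.2f} dB")
```

With these made-up counts, crossings and ON-state rings contribute as much loss as the waveguide itself, which is why the worst-case path matters for the power budget.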

Higher Order Switches
  Figure labels: N, E, S, W

Simulation Environment
  Built in OMNeT++
  Processing Plane: Random, Trace, Execution
  Electronic Plane, Routers: XY routing; Bubble Flow Control; 4 VCs; Pipelined (input, arbitration, output); ORION (energy); Circuit path setup logic
  Electronic Plane, Wires: Custom lengths
  Photonic Plane: Switches, modulators, detectors, filters, waveguides
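The electronic routers use XY routing, which is dimension-ordered: a packet first travels along X until its column matches the destination, then along Y. The following is a minimal sketch of that rule for a mesh without wraparound links. It is not the OMNeT++ model from the talk, and the function name and coordinate convention are assumptions made for illustration.

```python
def xy_route(src, dst):
    """Return the list of (x, y) nodes visited from src to dst.

    Dimension-ordered routing on a mesh: resolve the X offset first,
    then the Y offset. No wraparound (torus) links are considered.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Because every packet turns at most once, from X to Y, the routing function itself cannot create cyclic channel dependencies on a mesh.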

Optical Loss Analysis
  Figure labels: optical power, detector sensitivity, nonlinear effects, total injected power, received power, worst-case insertion loss, injected power per wavelength
  $P_I = p_i \times N_\lambda$
  Signal chain: Laser, Modulators, Switch, Switch, Detectors
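One way to read this slide is as a power budget. The total injected power (per-wavelength power times the number of wavelengths) must stay below the level where nonlinear effects appear, and each wavelength must still arrive above the detector sensitivity after the worst-case insertion loss. Under that reading, which is an assumption about the figure, the number of usable wavelengths follows from the gap between the two limits. The function name and the numeric values below are hypothetical and are not taken from the slides.

```python
import math


def max_wavelengths(nonlinear_limit_dbm, detector_sensitivity_dbm, worst_case_loss_db):
    """Largest wavelength count that fits in the optical power budget.

    Each wavelength needs at least (sensitivity + worst-case loss) at injection,
    and the sum over all wavelengths must not exceed the nonlinear limit.
    In dB terms: 10*log10(N) <= limit - sensitivity - loss.
    """
    budget_db = nonlinear_limit_dbm - detector_sensitivity_dbm - worst_case_loss_db
    if budget_db < 0:
        return 0  # even a single wavelength cannot close the link
    return math.floor(10 ** (budget_db / 10))


# Hypothetical numbers: 20 dBm nonlinear limit, -20 dBm sensitivity, 27 dB loss.
print(max_wavelengths(20, -20, 27))  # 13 dB of headroom -> 19 wavelengths
```

Every additional 3 dB of worst-case insertion loss roughly halves the number of wavelengths that fit, so loss translates directly into lost bandwidth.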

Insertion Loss Analysis
  [Figure only]

Experiment setup
  Outline: Networks, Parameters, Traffic, Results

Experiment setup: Networks
  Electronic Mesh
  Concentrated Electronic Mesh
  Concentrated Electronic Torus

Experiment setup: Networks
  Photonic Torus

Experiment setup: Networks
  Selective Photonic Torus
  Figure labels: Msg size, Bandwidth, 256B, Ph. Torus, El. Mesh

Experiment setup: Networks
  Concentrated Photonic Torus
  Figure labels: Gateway, Core, Core, Core, Core
  Network legend: Mesh, Conc. Mesh, Conc. Torus (electronic); Ph. Torus, Selective, Conc. Torus, Conc. Sel. (photonic)

Simulation Parameters
  Parameter          Value
  Cores              64
  Clock Frequency    5 GHz
  Data rate          10 Gb/s

  Network                     Channel Width    Buffer Size (b)
  Electronic Mesh             128              1024
  Conc. Electronic Mesh       128              2048
  Conc. Electronic Torus      128              2048
  Photonic Torus              32               512
  Selective Photonic Torus    64               1024
  Conc. Photonic Torus        32               1024
  Selective Conc. Ph. Tor.    64               2048

  Energy Parameter             Value
  PSE dynamic energy           375 fJ
  PSE static (OFF) power       400 µW
  Modulation dynamic energy    25 fJ/bit
  Modulation static power      30 µW
  Detector energy              50 fJ/bit
  Wire energy                  ~50 fJ/bit/mm
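To get a feel for the energy parameters, the sketch below compares the dynamic energy of sending one message over a photonic path with the energy of driving the same bits over an electronic wire. It is a rough illustration only. It ignores static power, router energy (modeled with ORION in the talk), and path-setup traffic, and it assumes the PSE dynamic energy is paid once per switch that changes state. The function names, message size, switch count, and wire length are hypothetical.

```python
# Energy parameters copied from the Simulation Parameters slide.
MODULATION_FJ_PER_BIT = 25
DETECTOR_FJ_PER_BIT = 50
PSE_DYNAMIC_FJ = 375        # assumed here to be per switching event
WIRE_FJ_PER_BIT_MM = 50     # listed as approximate (~50) on the slide


def photonic_dynamic_energy_fj(message_bits, pse_switch_events):
    """Modulate and detect every bit once, plus the cost of switching each PSE."""
    per_bit = MODULATION_FJ_PER_BIT + DETECTOR_FJ_PER_BIT
    return message_bits * per_bit + pse_switch_events * PSE_DYNAMIC_FJ


def wire_dynamic_energy_fj(message_bits, distance_mm):
    """Wire energy grows with both the bit count and the distance travelled."""
    return message_bits * WIRE_FJ_PER_BIT_MM * distance_mm


bits = 256 * 8  # a hypothetical 256-byte message
print(photonic_dynamic_energy_fj(bits, pse_switch_events=4))  # 155100 fJ
print(wire_dynamic_energy_fj(bits, distance_mm=10))           # 1024000 fJ
```

The photonic term does not depend on distance while the wire term does, which is consistent with the talk's claim that photonics changes the rules for distance.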

Synthetic Benchmarks
  Patterns: Neighbor, Random, Bitreverse, Tornado
  Each transfer occurs 100 times
  Two versions: small (96B), large (128kB)
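The slides name the four synthetic patterns without defining them. The sketch below uses the conventional textbook definitions on an 8×8 grid of 64 nodes, which is an assumption and may differ from the mapping used in the study. For simplicity, the neighbor and tornado offsets are applied along the X dimension only. All function names are hypothetical.

```python
import random

K = 8                        # nodes per dimension (8 x 8 = 64 cores)
N = K * K
BITS = N.bit_length() - 1    # 6 address bits for 64 nodes


def to_xy(node):
    return node % K, node // K


def to_id(x, y):
    return y * K + x


def neighbor(src):
    """Send to the next node along X, wrapping at the edge."""
    x, y = to_xy(src)
    return to_id((x + 1) % K, y)


def tornado(src):
    """Send almost halfway around the X ring: offset ceil(K/2) - 1."""
    x, y = to_xy(src)
    return to_id((x + (K + 1) // 2 - 1) % K, y)


def bitreverse(src):
    """Destination is the source address with its bits in reverse order."""
    return int(format(src, f"0{BITS}b")[::-1], 2)


def uniform_random(src, rng=random):
    """Pick any node other than the source with equal probability."""
    dst = rng.randrange(N - 1)
    return dst if dst < src else dst + 1


print(neighbor(7), tornado(0), bitreverse(1))  # 0 3 32
```

Neighbor traffic keeps every transfer one hop long, while tornado and bit-reverse create long paths, so together the patterns exercise both short and long distances.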

Scientific Applications
  Gyrokinetic Toroidal Code (GTC)
  Cactus
  PARAllel Total Energy Code (PARATEC)
  MADbench
  Profiled by overloading communication functions in Linux
  Traces broken into phases to preserve order
  Random mapping
  Application    Num Phases    Num Msgs    Avg. Msg. Size (b)
  Cactus         2             285         25600
  GTC            2             63          129796
  MADbench       195           15414       5613
  PARATEC        34            126059      43.3

Results: Synthetic (Small)
  Legend: Conc. Mesh, Conc. Torus (electronic); Photonic Torus, Selective, Conc. Torus, Conc. Selective (photonic)

Results: Synthetic (Large)
  Legend: Conc. Mesh, Conc. Torus (electronic); Photonic Torus, Selective, Conc. Torus, Conc. Selective (photonic)

Results: Applications
  Legend: Conc. Mesh, Conc. Torus (electronic); Photonic Torus, Selective, Conc. Torus, Conc. Selective (photonic)

Conclusions
  Detailed, physically accurate simulations of future networks are informative.
  Photonics wins on energy consumption.
  Significant difference in performance across different apps. Large messages/distances do well.
  Synergistic co-design of electronic and photonic planes may be beneficial.