On Enhancing the Performance of Bufferless Network-on-Chip. By Mohamed Assem Abd ElMohsen Ibrahim, under the supervision of Dr. Hatem M. El-Boghdadi. Master's Thesis, Computer Engineering Department, 2016.
Publications: M. A. Abd ElMohsen and H. M. El-Boghdadi, "Investigating the Viability of Maximum Flexibility Selection Function in Bufferless 2D Meshes," in International Workshop on Many-core Embedded Systems, Portland, 2015, pp. 52-55.
Outline: Introduction (Background, Contribution, Scope of Thesis, Motivation); Congestion Management; Ranking Policies; Selection Function (MaxFlex Fixed Step Size, MaxFlex Variable Step Size); Conclusion & Future Work.
Introduction
Background: A Network-on-Chip (NoC) is a group of switches connecting homogeneous or heterogeneous nodes in a multiple point-to-point fashion. [Figure: switches, nodes, and bidirectional links.]
Background: Topology is the arrangement of links and nodes/switches. [Figure: 2D mesh, 2D torus, fat tree.]
Background: Routing. The routing function takes a flit's (source, destination) and returns the set of productive output port(s); the selection function then picks one productive output port from that set.
Background: Selection functions. [Figure: paths from source S to destination D on the X-Y mesh under the Straight Line selection function and under the Maximum Flexibility selection function.]
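The routing function and the two selection functions named in the Background slides can be sketched in Python. This is only an illustration under stated assumptions, not the thesis implementation: the port names, the coordinate convention (E increases x, N increases y), the rule used in `max_flex` (prefer the dimension with the larger remaining offset, so that both dimensions stay productive for as long as possible), and the rule used in `straight_line` (keep the previous direction while it is still productive) are all assumed here.

```python
def productive_ports(cur, dst):
    """Routing function: output ports that bring a flit closer to dst."""
    (x, y), (dx, dy) = cur, dst
    ports = []
    if dx > x:
        ports.append("E")
    elif dx < x:
        ports.append("W")
    if dy > y:
        ports.append("N")
    elif dy < y:
        ports.append("S")
    return ports


def straight_line(cur, dst, last_port=None):
    """Assumed Straight Line rule: keep going the same way while productive."""
    ports = productive_ports(cur, dst)
    if not ports:
        return None  # flit is at its destination and should be ejected
    return last_port if last_port in ports else ports[0]


def max_flex(cur, dst):
    """Assumed MaxFlex rule: move in the dimension with more hops left."""
    ports = productive_ports(cur, dst)
    if not ports:
        return None
    if len(ports) == 1:
        return ports[0]
    x_left = abs(dst[0] - cur[0])
    y_left = abs(dst[1] - cur[1])
    # ports[0] is the X port and ports[1] the Y port; ties go to X.
    return ports[0] if x_left >= y_left else ports[1]
```

For a flit at (0, 0) heading to (2, 3), `productive_ports` returns both `"E"` and `"N"`, and this `max_flex` picks `"N"` because three hops remain in Y against two in X.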
Background: Flit ranking policy. [Figure: flits A, B, and C contend for the same link; the ranking policy orders them, shown as A < B < C.]
Background On Enhancing the Performance of Bufferless Network-on-Chip11Switch Routing + Ranking N in S in E in W in N out S out E out W out Injection Ejection N N N E W
Background Problems ?On Enhancing the Performance of Bufferless Network-on-Chip12
Background On Enhancing the Performance of Bufferless Network-on-Chip13Switch Routing + Ranking N in S in E in W in N out S out E out W out Injection Ejection N N N E W N N
Background Problems ?On Enhancing the Performance of Bufferless Network-on-Chip14
Background On Enhancing the Performance of Bufferless Network-on-Chip15Switch Routing + Ranking N in S in E in W in N out S out E out W out Injection Ejection N N
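The interplay of routing and ranking inside a bufferless switch can be sketched as a port allocation loop. The sketch below assumes a generic deflection scheme: flits are served in rank order, each takes a free productive port if one exists, and otherwise it is deflected to any free port. The function name, the tuple layout, and the choice of the first free port for a deflection are hypothetical, not taken from the thesis.

```python
def allocate_ports(ranked_flits, ports=("N", "S", "E", "W")):
    """Assign one output port per flit, highest-ranked flit first.

    ranked_flits: list of (flit_id, productive_ports), already sorted by
    the ranking policy. Returns {flit_id: (port, was_deflected)}.
    """
    if len(ranked_flits) > len(ports):
        raise ValueError("a bufferless switch needs one output per flit")
    free = list(ports)
    assignment = {}
    for flit_id, productive in ranked_flits:
        # Take the first productive port that is still free, if any.
        choice = next((p for p in productive if p in free), None)
        deflected = choice is None
        if deflected:
            choice = free[0]  # no buffer to wait in, so take any free port
        free.remove(choice)
        assignment[flit_id] = (choice, deflected)
    return assignment
```

With two flits both requesting N, the higher-ranked one gets N and the other is deflected, which is the contention case the "Problems?" slides point at.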
Motivation: Buffers elimination. Pros: power, area. Cons: traffic volume versus available link bandwidth as the injection rate grows. Under a high injection rate: buffered NoC > bufferless NoC.
Scope of the Thesis: Enhance bufferless NoC through congestion prevention, deflection-based ranking policies, and the selection function (MaxFlex fixed step size, MaxFlex variable step size).
Contribution: Push the injection rate boundary for bufferless NoCs, making them feasible in a wider range of practical latency-sensitive applications.
Modified Fixed Step Size MaxFlex Selection Function
Motivation: MaxFlex selection function. Required: MaxFlex freedom, relaxed contention.
Proposed Approach: Straight Line selection function. Solution: MaxFlex + Straight Line.
Proposed Approach: MMaxFlex (step size > 1). Factors shown: borders traffic, central switches concentration, contention, deflection, packet latency.
Analysis of Fixed Step Size MMaxFlex: Study the effect of step size on the distribution of packets through the mesh network. Assumptions: each node sends only one packet to each other node (i.e. $n^2 - 1$ packets); packet length is one flit; no deflections.
Analysis of Fixed Step Size MMaxFlex: For an $n \times n$ mesh, W = switch (i, j), where $1 \le i, j \le n$, and P = packet going from source node S (Xsrc, Ysrc) to destination D (Xdst, Ydst). Any packet passing through switch W falls under one of 12 types.
Type 1 & 2 Packets: [Figure: switch W and node W; injection and ejection.]
Type 3 & 4 Packets: [Figure: switches A, W, B, C, D; count = 6.]
Type 5 Packets: [Figure: switches A, W, B, C, D; count = 6; down traffic and up traffic.]
Type 6 Packets: [Figure: switches A, W, B, C; down traffic and up traffic.]
Type 7 Packets: [Figure: switches A, W, B, C; down traffic and up traffic.]
Type 8 Packets: [Figure: switches A to E around W; up traffic and down traffic.]
Type 9 Packets: [Figure: switches A, B, W, C, D; up traffic.]
Type 10 Packets: [Figure: switches A, B, W, C, D; up traffic.]
Type 11 Packets: [Figure: switches A to E around W; up traffic and down traffic.]
Type 12 Packets: [Figure: switches A to D around W; up traffic and down traffic.]
Proof of Packet Types Completeness: In an $n \times n$ mesh, under MMaxFlex, any packet going from a source node to a destination node falls under one of the 12 traffic types mentioned.
Proof of Packet Types Completeness:
Injection: Type 2. Ejection: Type 1.
P moves on a column from S to D: Type 4.
P moves on a row from S to D: Type 3.
P moves on the diagonal from S to D: Type 5. The movement on the diagonal leads the packet to pass through switches on nearby diagonals: Type 8.
P moves on a row till [condition not preserved], then moves on a diagonal: Types 6, 11, 9.
P moves on a column till [condition not preserved], then moves on a diagonal: Types 7, 12, 10.
The previous cases cover the 12 mentioned patterns, proving the lemma.
Packets Distribution Analysis Results: Study the effect of increasing the step size for MaxFlex by counting the packets passing through each switch of a 10x10 mesh NoC. Representative switches: border switches (0,0), (0,3), (0,6); core switches (3,3), (3,6), (5,5).
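The counting procedure behind this analysis can be sketched as follows: every node sends one single-flit packet to every other node, no packet is deflected, and each switch on a packet's path is counted once. The MMaxFlex path rule is not spelled out in the slides, so the sketch plugs in plain dimension-ordered XY routing as a stand-in; `xy_path` and `switch_loads` are hypothetical names, and the resulting numbers are not the thesis results.

```python
from collections import Counter
from itertools import product


def xy_path(src, dst):
    """Stand-in route: travel along X first, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def switch_loads(n, route=xy_path):
    """Count packets seen by each switch of an n x n mesh (all-to-all)."""
    nodes = list(product(range(n), repeat=2))
    loads = Counter()
    for src in nodes:
        for dst in nodes:
            if src != dst:
                # Source and destination switches are counted too
                # (injection and ejection).
                loads.update(route(src, dst))
    return loads
```

Passing a different `route` function, for example one that follows a given step size, would allow the same loop to compare border and core switches such as (0,0) and (5,5) in a 10x10 mesh.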
Experimental Setup: Simulator: gpNoCsim. Topology: 10x10 mesh. Traffic: uniform. Performance metrics: packet latency, deflection count.
Experimental Results: [Chart; annotated values: 98%, 31%.]
Experimental Results: [Chart; annotated values: 95%, 99%, 38%, 53%.]
Experimental Results: Which step size value suits a given mesh NoC? Estimate: a step size of 60% to 80% of the 2D mesh dimension gives better network performance.
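As a worked example of the 60% to 80% estimate, the helper below lists the integer step sizes that fall inside that band for a given mesh dimension. The function name and the rounding (round the lower bound up, the upper bound down) are assumptions; the slides only give the percentage range.

```python
def estimated_step_sizes(n, low_pct=60, high_pct=80):
    """Integer step sizes between low_pct% and high_pct% of dimension n."""
    lowest = -((-n * low_pct) // 100)   # ceiling of n * low_pct / 100
    highest = (n * high_pct) // 100     # floor of n * high_pct / 100
    return list(range(lowest, highest + 1))
```

For the 10x10 mesh used in the experiments this gives step sizes 6, 7, and 8.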
Variable Step Size MaxFlex Selection Function
Motivation: Goal: increase NoC links utilization and enhance traffic distribution. Approach: assign a different step size for each flit (variable step size MaxFlex).
NoC Regions: [Figure: mesh divided into Region (1,1), Region (2,1), Region (1,2), and Region (2,2).]
Proposed Approaches: Variable step size: NMDVS, ORMDVS, IORVS, RMDVS.
NMDVS: Using the Manhattan distance between NoC nodes. [Formula not preserved.]
Experimental Results: [Chart; annotated values: 98%, 32%.]
RMDVS: Using the Manhattan distance between NoC regions. [Formula not preserved.]
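The two distances that NMDVS and RMDVS rely on can be sketched as below. The step-size formulas themselves did not survive in the slides and are not reconstructed here; only the distance inputs are shown. The sketch assumes square, equally sized regions, 0-based node coordinates, and 1-based region indices to match the Region (1,1) style labels; `region_size` is a hypothetical parameter.

```python
def manhattan(a, b):
    """Manhattan (hop) distance between two (x, y) coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def region_of(node, region_size):
    """Map a 0-based node coordinate to its 1-based region index."""
    return (node[0] // region_size + 1, node[1] // region_size + 1)


def region_distance(src, dst, region_size):
    """Manhattan distance between the regions holding src and dst."""
    return manhattan(region_of(src, region_size), region_of(dst, region_size))
```

In a 10x10 mesh split into four 5x5 regions, nodes (0, 0) and (7, 9) are 16 hops apart as nodes but only 2 apart as regions.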
RMDVS`: Difference between the distance between the source and destination regions. [Formula not preserved.]
Experimental Results: [Chart; annotated values: 96%, 18%.]
IORVS: Using in-region and out-region routing. Freedom: near-nodes traffic versus far-nodes traffic.
ORMDVS: Using the Manhattan distance between NoC nodes for out-region routing. [Formula not preserved.]
Experimental Results: [Chart; annotated values: 33%, 4%.]
Experimental Results: [Chart; annotated values: 29%, 30%, 7%, 8%, 33%, 7%, 8%, 2%, 2%, 8%.]
New Flit Ranking Policies
Motivation: Deflection count affects performance. To decrease deflections, favor the flits with higher deflection counts for better network performance.
Flit Ranking Policies: Oldest First (OF), Most Deflections First (MDF), Deflection Age Ratio (DAR), Deflection Distance Ratio (DDR), Last Dimension (LD).
Flit Ranking Policies: OF, MDF, DAR, DDR, LD.
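Four of these policies can be sketched as sort keys. The slides give only the policy names, so the definitions below are read off the names and are assumptions: OF ranks by age, MDF by deflection count, DAR by deflections divided by age, and DDR by deflections divided by a hop distance (which distance is meant is also assumed). LD is left out because its definition cannot be recovered from the name alone. The `Flit` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Flit:
    name: str
    age: int          # cycles since injection
    deflections: int  # times the flit has been deflected so far
    distance: int     # hop distance used by DDR (assumed meaning)


# Higher key value means higher priority. max(..., 1) avoids division by zero.
RANK_KEYS = {
    "OF": lambda f: f.age,
    "MDF": lambda f: f.deflections,
    "DAR": lambda f: f.deflections / max(f.age, 1),
    "DDR": lambda f: f.deflections / max(f.distance, 1),
}


def rank(flits, policy):
    """Return flits ordered from highest to lowest priority."""
    return sorted(flits, key=RANK_KEYS[policy], reverse=True)
```

The same three flits can come out in a different order under each key, which is why the policy choice changes which flit gets deflected at a contended port.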
Experimental Results: [Chart; annotated values: 52%, 35%, 50%, 46%.]
Latency-Sensitive Congestion Management Mechanisms
Motivation: High injection rate: traffic volume versus link bandwidth, contention, deflection, source nodes starvation. Buffered versus bufferless.
Congestion Management: detection, prevention, control; heuristics, actions, extra resources.
Proposed Prevention Approaches: Congestion prevention goals: decrease traffic volume, more link bandwidth, more freedom for the flits. Approaches: using larger NoCs, using sequential injection.
Using Larger NoCs: Larger NoC (LNoC): extra links, extra space for the flits. Factors shown: contention, performance.
Using Larger NoCs: [Figure: 3x3 mesh and 4x4 mesh.]
Experimental Results: [Chart; annotated values: 98%, 31%.]
Experimental Results: [Chart; annotated values: 99%, 89%.]
Using Sequential Injection: Sequential Injection (SI): divide the application mix into a group of smaller mixes and run the smaller mixes in sequence. Factors shown: injected data, traffic volume, performance.
Using Sequential Injection: [Figure: Phase 1 and Phase 2.]
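The idea of sequential injection, splitting one application mix into smaller mixes that run one after another, can be sketched as a simple partition. How the thesis chooses which applications go into which phase is not stated in the slides; the round-robin split and the names below are assumptions for illustration only.

```python
def split_into_phases(app_mix, num_phases):
    """Partition an application mix into num_phases smaller mixes."""
    if num_phases < 1:
        raise ValueError("num_phases must be at least 1")
    # Round-robin assignment keeps the phases close to equal in size.
    return [app_mix[i::num_phases] for i in range(num_phases)]


def run_sequentially(app_mix, num_phases, run_phase):
    """Run each smaller mix in turn; run_phase is supplied by the caller."""
    return [run_phase(phase) for phase in split_into_phases(app_mix, num_phases)]
```

Only one phase injects traffic at a time, so the traffic volume in the network at any moment is roughly that of one smaller mix rather than the whole mix.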
Experimental Results: [Chart; annotated values: 99%, 72%.]
Conclusion
Conclusion: Pushing the boundaries of bufferless NoCs. Selection functions: investigated using larger and variable step sizes under the MaxFlex selection function. Flit ranking policies: targeted decreasing the flits' deflections. Congestion management: congestion prevention. Our work showed a huge enhancement in both packet latency and deflection count.
Future Work: Extension: investigate the proper size for the regions based on the overall NoC size; extend the regions concept to other aspects in NoC; extend our approaches to consider throughput-sensitive applications. Investigate the effect of absorbing and re-injecting the NoC traffic via sink nodes. Silicon interposer. Random topologies.
Questions
Types Packets Count: Type 1 through Type 12. [Count formulas not preserved.]
Type 1 & Type 2 Packets: Type 1: packets destined to node (i,j) (ejection). Type 2: packets injected by node (i,j) (injection).
Type 3 Packets: Packets passing through W injected by node (i,k) and destined to node (i,m), where $1 \le k, m \le n$ and j, k, m satisfy a relation whose symbols were not preserved (same row communication).
Type 4 Packets: Packets passing through W injected by node (k,j) and destined to node (m,j), where $1 \le k, m \le n$ and i, k, m satisfy a relation whose symbols were not preserved (same column communication).
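Under the analysis assumptions (one packet per ordered node pair, no deflections), the Type 3 count for a switch in column j of an n-switch row can be worked out directly: a same-row packet passes through W exactly when W lies strictly between its source and destination, giving $2(j-1)(n-j)$ ordered pairs. This closed form is a derivation under that reading of the definition, not a formula taken from the slides. It agrees with the "Count = 6" shown on the Type 3 & 4 figure if that figure is read as a five-switch row with W in the second position. By symmetry the same expression with i in place of j applies to Type 4.

```python
from itertools import permutations


def row_pass_through(n, j):
    """Closed form: same-row packets crossing column j (1-based)."""
    return 2 * (j - 1) * (n - j)


def row_pass_through_brute(n, j):
    """Brute-force check over all ordered (source, destination) columns."""
    return sum(
        1
        for k, m in permutations(range(1, n + 1), 2)
        if min(k, m) < j < max(k, m)
    )
```

The count peaks for the middle column and drops to zero at the row ends, which matches the expectation that central switches carry more pass-through traffic than border switches.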
Type 5 Packets: Packets passing through W as a result of communication between nodes on the same diagonal as node (i,j) (same diagonal communication).
Type 6 Packets: Packets passing through W as a result of communication destined to nodes on the same diagonal as node (i,j), from nodes with [condition not preserved]; these move on a row first, then on the diagonal. Two cases: the diagonal is below the main diagonal, or otherwise. [Count formulas not preserved.]
Type 7 Packets: Packets passing through W as a result of communication destined to nodes on the same diagonal as node (i,j), from nodes with [condition not preserved]; these move on a column first, then on the diagonal. Two cases: the diagonal is above the main diagonal, or otherwise. [Count formulas not preserved.]
Type 8 Packets: Packets passing through W as a result of communication between nodes on a diagonal other than the diagonal of node (i,j) (effect of Type 5).
Type 9 Packets: Packets passing through W as a result of communication destined to nodes on a diagonal other than the diagonal of node (i,j), from nodes with [condition not preserved] (effect of Type 6). Two cases: the diagonal is below the main diagonal, or otherwise. [Count formulas not preserved.]
Type 10 Packets: Packets passing through W as a result of communication destined to nodes on a diagonal other than the diagonal of node (i,j), from nodes with [condition not preserved] (effect of Type 7). Two cases: the diagonal is above the main diagonal, or otherwise. [Count formulas not preserved.]
Type 11 Packets: Packets passing through W as a result of communication between node (i,k), on the same row as node (i,j), and nodes on the diagonal of node (i,m), where $1 \le k, m \le n$ and j, k, m satisfy a relation whose symbols were not preserved (effect of Type 6). Two cases: the diagonal is below the main diagonal, or otherwise. [Count formulas not preserved.]
Type 12 Packets: Packets passing through W as a result of communication between node (k,j), on the same column as node (i,j), and nodes on the diagonal of node (m,j), where $1 \le k, m \le n$ and i, k, m satisfy a relation whose symbols were not preserved (effect of Type 7). Two cases: the diagonal is above the main diagonal, or otherwise. [Count formulas not preserved.]