Slide1
ICARO: Congestion Isolation in Networks-On-Chip
José Vicente Escamilla, José Flich, Pedro Javier García
Slide2
Outline
Introduction / Motivation
ICARO overview
ICARO description
  Detection
  Notification
  Isolation
Results
Conclusions
Questions
Slide3
Introduction
Computing power demand
Power saving
Cost constraints
CMP
MPSoC
CMPs and MPSoCs use a network to interconnect nodes
Network performance degradation due to:
  Power saving mechanisms (DVFS)
  Bursty traffic patterns
  Heterogeneous system designs
Performance degradation may lead to congestion
[Figure: Tile-Gx (72 cores)]
Slide4
ICARO Overview
Premise: Congestion is NOT a problem by itself.
ICARO does not remove congestion. ICARO isolates it.
Two types of traffic:
  Congested
  Non-congested
Goal: To isolate congested traffic from non-congested traffic in order to avoid HoL (head-of-line) blocking.
Slide5
Related work
RCA, P. Gratz et al.
  Redirects traffic at each router based on congestion metrics.
  Metrics are piggybacked.
  Vicious cycles may be created.
"Prediction-based Flow Control for Network-on-Chip Traffic", U. Ogras et al.
  Injection control based on prediction models.
  The prediction model uses link status sent through a dedicated network.
  Injection throttling may produce performance oscillations.
AVADA/FVADA, Yi Xu et al.
  Maps different flows to different queues based on the output port requested in the next router (lookahead routing).
  Requires lookahead routing and credit-based flow control.
  Congested and non-congested flows may share queues, generating some degree of HoL-blocking, since the mapping policy only considers one hop of the message path.
Slide6
ICARO Overview
HoL-Blocking
[Figure: HoL-blocking example with link labels Credits=2 and Credits=0]
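The HoL-blocking situation can be illustrated with a minimal sketch. It assumes a toy router in which only the head of each FIFO may advance, and only while its output port still has credits. The port names `hot` and `cold`, the function `forwarded`, and the queue contents are hypothetical; the credit values mirror the Credits=0 and Credits=2 labels.

```python
from collections import deque

def forwarded(queues, credits):
    """Return the packets that can leave a toy router.

    Each queue is a FIFO of destination-port names. Only the head of a
    queue may be forwarded, and only while its output port has credits.
    """
    credits = dict(credits)            # work on a copy of the credit counters
    sent = []
    progress = True
    while progress:
        progress = False
        for q in queues:
            if q and credits[q[0]] > 0:
                credits[q[0]] -= 1
                sent.append(q.popleft())
                progress = True
    return sent

credits = {"hot": 0, "cold": 2}        # hypothetical ports: one blocked, one free

# One shared queue: the blocked head packet stalls the packets behind it.
shared = [deque(["hot", "cold", "cold"])]
# Two queues: congested and non-congested traffic are kept apart.
isolated = [deque(["hot"]), deque(["cold", "cold"])]

print(forwarded(shared, credits))      # []
print(forwarded(isolated, credits))    # ['cold', 'cold']
```

In the shared case nothing moves even though the `cold` port has credits available, which is the effect ICARO aims to avoid by separating the two traffic types.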
Slide7
ICARO Overview
ICARO uses two types of Virtual Networks (VNs):
  Regular VN: Non-congested traffic
  Extra VN: Congested traffic
Three stages:
  Detection: Congestion is detected at routers.
  Notification: Routers notify all Network Interfaces (NIs).
  Isolation: NIs isolate congested traffic from non-congested traffic.
Slide8
ICARO Overview
[Figure: routers SW0 to SW15, each with a network interface (NI 0 to NI 15), connected by the Congestion Notification Network (CNN). Legend: Regular VN queue, Extra VN queue]
Slide9
ICARO Description
Congestion Detection
Performed at routers.
Detects congestion points (CPs), i.e. {router, port} pairs.
When a message arrives or leaves:
Buffer saturation check
  If buffer.level > HIGH_THR, the buffer is marked as saturated.
  If buffer.level < LOW_THR, the buffer is marked as NOT saturated (hysteresis).
  If any of the buffers of an input port is marked as saturated, the whole input port is marked as saturated as well.
Congestion check
  Requests from saturated input ports to each output port are counted.
  Each output port requested by more than one saturated input port is marked as congested.
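The two detection checks can be sketched as follows. This is an illustration under assumptions, not the router's actual logic: the threshold values, the `Buffer` class, and the `congested_outputs` helper are hypothetical, and each input port is modelled as a list of buffers plus the set of output ports it currently requests.

```python
HIGH_THR = 6   # assumed threshold values, in flits
LOW_THR = 2

class Buffer:
    def __init__(self):
        self.level = 0
        self.saturated = False

    def update(self, level):
        """Apply the hysteresis rule after a message arrives or leaves."""
        self.level = level
        if level > HIGH_THR:
            self.saturated = True
        elif level < LOW_THR:
            self.saturated = False
        # Between the two thresholds the previous state is kept.
        return self.saturated

def congested_outputs(input_ports):
    """input_ports maps a port name to (buffers, requested output ports)."""
    requests = {}
    for buffers, wanted in input_ports.values():
        # An input port is saturated if any of its buffers is saturated.
        if any(b.saturated for b in buffers):
            for out in wanted:
                requests[out] = requests.get(out, 0) + 1
    # Congested: requested by more than one saturated input port.
    return {out for out, count in requests.items() if count > 1}

north, west, south = Buffer(), Buffer(), Buffer()
north.update(7)
west.update(7)
west.update(4)     # still saturated: 4 is not below LOW_THR
south.update(3)    # never exceeded HIGH_THR, so not saturated

ports = {
    "N": ([north], {"E"}),
    "W": ([west], {"E", "S"}),
    "S": ([south], {"E"}),
}
print(congested_outputs(ports))   # {'E'}
```

Output port E is requested by two saturated input ports (N and W), so it is reported as a congestion point; port S is requested by only one and is not.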
Slide10
ICARO Description
Congestion Notification Network (CNN)
Segmented ring connecting routers and NIs
Network width (wires): $\log_2(N) + p + 1$
  N = number of nodes
  p = router radix
Process:
  Notifications are injected into the register (when it is free).
  Notifications are delivered from one register to the next at each cycle.
  Notifications are discarded when they reach their origin register.
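A minimal sketch of the CNN follows, reading the width expression as log2(N) + p + 1 wires and modelling the ring as one register per node. Several points are assumptions made only for illustration: the logarithm is rounded up for non-power-of-two N, the radix of 5 in the usage lines is hypothetical, every node observes each notification that passes through its register, and the names `cnn_width` and `simulate_ring` do not come from the slides.

```python
def cnn_width(n_nodes, radix):
    """Wires per CNN segment, read as log2(N) + p + 1 (log rounded up)."""
    return (n_nodes - 1).bit_length() + radix + 1

def simulate_ring(n_regs, injections, cycles):
    """Simulate the ring; injections maps cycle -> [(origin, payload), ...].

    Returns, for every register, the payloads that passed through it.
    """
    regs = [None] * n_regs
    waiting = []                          # notifications waiting for a free register
    seen = {i: [] for i in range(n_regs)}
    for cycle in range(cycles):
        # Each notification advances to the next register.
        regs = [regs[(i - 1) % n_regs] for i in range(n_regs)]
        for i, note in enumerate(regs):
            if note is None:
                continue
            if note[0] == i:
                regs[i] = None            # back at its origin: discard
            else:
                seen[i].append(note[1])
        # Inject pending notifications where the local register is free.
        waiting.extend(injections.get(cycle, []))
        still_waiting = []
        for origin, payload in waiting:
            if regs[origin] is None:
                regs[origin] = (origin, payload)
            else:
                still_waiting.append((origin, payload))
        waiting = still_waiting
    return seen

print(cnn_width(64, 5))                            # 12
print(simulate_ring(4, {0: [(1, "CP")]}, 5))       # {0: ['CP'], 1: [], 2: ['CP'], 3: ['CP']}
```

With 64 nodes and an assumed radix of 5, the expression gives 6 + 5 + 1 = 12 wires. In the ring run, the notification injected at register 1 visits registers 2, 3 and 0 in successive cycles and is dropped when it returns to register 1.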
Slide11
ICARO Description
Congestion Notification Network (CNN)
[Figure: CNN ring across routers SW0 to SW15, showing a register and a notification. Detail of SW 7 and NI 7: CNN in, CNN out, register (Reg) with inputs in1 and in2 and output out, notification injection and notification reception]
Slide12
ICARO Description Congestion notification Network (CNN)
Slide13
ICARO Description
Notification storing
Notifications are stored in a cache memory.
Useless notifications are discarded:
  Unreachable CPs
  Redundant notifications (merge)
SW    Port
5     E
10    S
Slide14
ICARO Description
Notification storing – Unreachable CPs
[Figure: routers SW0 to SW15, XY routing]
NI 0
SW    Port
10    S
--    --
NI 4
SW    Port
5     E
10    S
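The reachability filter can be sketched as below. The sketch assumes a 4x4 mesh with routers numbered row by row from SW0, port S pointing toward higher-numbered rows, and a CP counted as reachable when at least one XY route from the NI's router uses that output port. The helper names are hypothetical. Under these assumptions the result matches the two tables on this slide.

```python
WIDTH = 4   # assumed 4x4 mesh, routers numbered row by row starting at SW0

def xy_route(src, dst):
    """Return the (router, output port) pairs used by an XY route."""
    x, y = src % WIDTH, src // WIDTH
    dx, dy = dst % WIDTH, dst // WIDTH
    hops = []
    while x != dx:                         # X dimension first
        hops.append((y * WIDTH + x, "E" if dx > x else "W"))
        x += 1 if dx > x else -1
    while y != dy:                         # then the Y dimension
        hops.append((y * WIDTH + x, "S" if dy > y else "N"))
        y += 1 if dy > y else -1
    return hops

def reachable(ni, cp):
    """True if some route starting at router `ni` crosses congestion point `cp`."""
    return any(cp in xy_route(ni, dst) for dst in range(WIDTH * WIDTH))

def store(ni, notifications):
    """Keep only the notifications that can affect traffic injected at `ni`."""
    return [cp for cp in notifications if reachable(ni, cp)]

notifications = [(5, "E"), (10, "S")]
print(store(0, notifications))   # [(10, 'S')]
print(store(4, notifications))   # [(5, 'E'), (10, 'S')]
```

NI 0 drops {SW5, Port E} because its routes only travel east along the top row, so they can enter SW5 only from the north, and XY routing never turns from Y back to X.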
Slide15
ICARO Description
Notification storing – Redundant CPs
[Figure: routers SW0 to SW15, XY routing]
NI 4
SW    Port
5     E
10    S
{SW10, Port S} notification is IGNORED
{SW5, Port E} and {SW10, Port S} notifications are MERGED
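One possible reading of the redundancy rule is sketched here: a CP is redundant for an NI when every destination whose route crosses it also crosses a CP that is already stored. This interpretation, the 4x4 row-by-row numbering, and the names `affected` and `merge` are assumptions. The example reproduces the NI 4 case with {SW5, Port E} and {SW10, Port S}.

```python
WIDTH = 4   # assumed 4x4 mesh, routers numbered row by row starting at SW0

def xy_route(src, dst):
    """Return the (router, output port) pairs used by an XY route."""
    x, y = src % WIDTH, src // WIDTH
    dx, dy = dst % WIDTH, dst // WIDTH
    hops = []
    while x != dx:
        hops.append((y * WIDTH + x, "E" if dx > x else "W"))
        x += 1 if dx > x else -1
    while y != dy:
        hops.append((y * WIDTH + x, "S" if dy > y else "N"))
        y += 1 if dy > y else -1
    return hops

def affected(ni, cp):
    """Destinations whose route from `ni` crosses congestion point `cp`."""
    return {dst for dst in range(WIDTH * WIDTH) if cp in xy_route(ni, dst)}

def merge(ni, cache, new_cp):
    """Return the cache after receiving new_cp, without redundant entries."""
    new = affected(ni, new_cp)
    if not new:
        return list(cache)                 # unreachable CP: nothing to store
    if any(new <= affected(ni, cp) for cp in cache):
        return list(cache)                 # covered by a stored CP: ignored
    # Stored CPs covered by the new one are replaced by it.
    return [cp for cp in cache if not affected(ni, cp) <= new] + [new_cp]

print(merge(4, [(5, "E")], (10, "S")))     # [(5, 'E')]
print(merge(4, [(10, "S")], (5, "E")))     # [(5, 'E')]
```

From NI 4, the only destination reached through {SW10, Port S} is SW14, and that route already leaves SW5 through port E, so a single cache entry covers both CPs regardless of arrival order.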
Slide16
ICARO Description
Traffic separation
Performed at NIs.
Process:
  Initially, all traffic is allocated to regular VNs.
  At each cycle, the post-processor module checks the messages at the head of all regular VNs in parallel.
  If a message's route crosses any of the CPs stored in the CP cache, the message is reallocated to an extra VN.
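The post-processing step can be sketched as follows, assuming a 4x4 mesh numbered row by row, XY routing, and messages represented only by their destination router. The function `post_process` and the queue layout are hypothetical. The example uses NI 4, the cached CP {SW 5, Port E}, and destinations 12, 15 and 6, values that appear in the Network Interface 4 figure.

```python
from collections import deque

WIDTH = 4   # assumed 4x4 mesh, routers numbered row by row starting at SW0

def xy_route(src, dst):
    """Return the (router, output port) pairs used by an XY route."""
    x, y = src % WIDTH, src // WIDTH
    dx, dy = dst % WIDTH, dst // WIDTH
    hops = []
    while x != dx:
        hops.append((y * WIDTH + x, "E" if dx > x else "W"))
        x += 1 if dx > x else -1
    while y != dy:
        hops.append((y * WIDTH + x, "S" if dy > y else "N"))
        y += 1 if dy > y else -1
    return hops

def post_process(ni, regular_vns, extra_vn, cp_cache):
    """One cycle: inspect the head message of every regular VN queue."""
    for vn in regular_vns:
        if vn and any(cp in xy_route(ni, vn[0]) for cp in cp_cache):
            extra_vn.append(vn.popleft())  # route crosses a CP: move to the extra VN

regular = [deque([12]), deque([15]), deque([6])]
extra = deque()
post_process(4, regular, extra, cp_cache=[(5, "E")])

print(list(extra))                 # [15, 6]
print([list(q) for q in regular])  # [[12], [], []]
```

The message to SW12 travels straight down the first column and stays in its regular VN, while the messages to SW15 and SW6 both leave SW5 through port E and are moved to the extra VN.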
Slide17
ICARO Description
Traffic separation
[Figure: Network Interface 4 with post-processor, CPs cache (entry: SW 5, Port E), Regular-VN and Extra-VN queues, and arbiter, attached to Router 4 (in, out1, out2) with its own Regular-VN and Extra-VN queues. Messages shown: dst:12, dst:15, dst:6]
Slide18
Results
Configuration and tools
Simulation:
  NoC simulator developed in our research group.
  Compared against FVADA/AVADA with different numbers of virtual queues
    FVADA: Restricted to 4 VCs
    ICARO: Uses <x> VNs instead of VCs
Overheads analysis:
  Tools used:
    Synthesis: Design Vision (Synopsys)
    Place & Route: Encounter (Cadence)
    Library: 45nm Nangate Open Cell (typical conditions)
Parameter       Value
Topology        8x8 2D mesh
Routing         XY
Switching       Wormhole (flit-level switching)
Flow control    Credits
Flit size       128 bits
Message size    5 flits
Traffic         0.3 f/c (background) + 1 f/c (hotspot 4-to-1, from cycle 10k to 20k)
Slide19
Results
Improvement
[Figure: results for 2VC/VN, 4VC/VN and 8VC/VN]
Slide20
Results
Overheads - Router
Area overhead: ~6%.
Power overhead: varies from 6% to 10%.
Slide21
Results
Overheads – Network Interface
Area overhead: varies from 3.8% to 6%.
Power overhead: varies from 4.5% to 5.4%.
Slide22
Conclusions and Future Work
Conclusions:
  A mechanism to avoid HoL-blocking in networks-on-chip has been presented.
  ICARO isolates harmful traffic from non-harmful traffic by using VNs, achieving an overall latency improvement of up to 82%.
Future work:
  To analyze a hierarchical CNN to improve scalability.
  To implement in-order delivery support.
Slide23
Questions?