Slide1
NetAI 2018, Budapest, Hungary
IFS-RL: An Intelligent Forwarding Strategy Based on Reinforcement Learning in Named-Data Networking
Yi Zhang¹, Bo Bai², Kuai Xu³, Kai Lei¹,*
¹ ICNLAB, SECE, Peking University
² Future Network Theory Lab, 2012 Labs, Huawei
³ Arizona State University
Slide2
Outline
- Introduction
  - Named-Data Networking (NDN)
  - Intelligent Forwarding Strategy
  - Reinforcement Learning (RL)
- Methodology
  - Basic Training Algorithm
  - Learning Granularity
  - Enhancement for Topology Change
- Preliminary Experiments
- Conclusions
Slide3
Introduction
Slide4
Introduction
Named-Data Networking (NDN)
- An Information-Centric Networking (ICN) architecture
- Pull-based data delivery process
  - Triggered by user requests, i.e., Interest Pkt.
- Request forwarding is driven by forwarding engines
  - Reachability information about different content items
  - Forwarding Information Base (FIB)
Slide5
Introduction (Cont.)
Interest Forwarding Process in NDN
- The forwarding plane enables each router to
  - Utilize multiple alternative interfaces
  - Measure the performance of each path
- Forwarding Strategy
  - For each Interest Pkt., select the optimal interface from multiple alternative interfaces
[Figure: a router forwards each Interest Pkt. to one of interface 1, interface 2, …, interface k]
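To make the forwarding decision concrete, here is a minimal sketch of the simplest kind of fixed-rule strategy: always pick the interface with the lowest measured average RTT. This is an illustrative baseline only, not IFS-RL and not any of the strategies compared later; the function name and the RTT values are hypothetical.

```python
def select_interface(avg_rtt_ms):
    """Return the index of the interface with the lowest measured average RTT.

    avg_rtt_ms: list of per-interface average RTTs in milliseconds
    (hypothetical measurements kept by the router's forwarding plane).
    Ties are broken in favor of the lowest index.
    """
    if not avg_rtt_ms:
        raise ValueError("router has no alternative interfaces")
    return min(range(len(avg_rtt_ms)), key=lambda i: avg_rtt_ms[i])


# Three alternative interfaces with made-up RTT measurements.
print(select_interface([40.0, 10.0, 7.0]))
```

A rule like this never adapts its own decision logic, which is the limitation of fixed control rules that motivates a learned policy.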
Slide6
Introduction (Cont.)
Existing forwarding strategies
- Fixed control rules
- Simplified models of the deployed environment
- Fail to achieve optimal performance across a broad set of network conditions & application demands
Propose IFS-RL: An intelligent forwarding strategy based on RL
- Determine a self-adaptive learning granularity
- Enhance the basic model to handle topology changes
Slide7
Methodology
Slide8
Basic Training Algorithm
Reinforcement Learning (RL) Framework
- Consists of Agent & Environment
- For a certain time step t
  - Observe state s_t
  - Choose action a_t
  - Receive reward r_t
  - Transit state s_t → s_{t+1}
- The goal: maximize the expected cumulative discounted reward
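As a small illustration of the RL objective, the sketch below computes the cumulative discounted reward for one observed sequence of rewards. The discount factor `gamma` is an assumption (the slides do not give a value), and the reward values are made up.

```python
def discounted_return(rewards, gamma=0.99):
    """Cumulative discounted reward: r_0 + gamma*r_1 + gamma^2*r_2 + ...

    rewards: sequence of per-step rewards r_t.
    gamma:   discount factor in [0, 1]; 0.99 is an assumed default.
    """
    total = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical rewards for three consecutive time steps.
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```

The agent maximizes the expectation of this quantity over the trajectories its policy produces, not the return of any single trajectory.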
Slide9
Basic Training Algorithm (Cont.)
The IFS-RL Model
- Agent: Router
  - Implemented by Neural Networks (NNs)
  - Observe the network state (e.g., RTT & # of Pkt. for each interface)
  - Determine the optimal forwarding interface
  - Use reward information to train the NNs
- Environment: Network
Slide10
Basic Training Algorithm (Cont.)
The IFS-RL Model (Cont.)
- State: s_t = (D_t, N_t) (Average Delay, # of Interest Pkt.)
  - D_t = (d_1, d_2, …, d_K); d_i: Avg. delay of interface i (approximated by RTT)
  - N_t = (n_1, n_2, …, n_K); n_i: # of Interest Pkt. forwarded by interface i
[Figure labels: D_t, N_t]
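The state s_t = (D_t, N_t) can be represented as a pair of per-interface vectors. The sketch below is one possible encoding; the function names, and the choice to concatenate D_t and N_t into a single flat input, are assumptions for illustration rather than details from the slides.

```python
def build_state(delays_ms, interest_counts):
    """Build s_t = (D_t, N_t) from per-interface measurements.

    delays_ms:       d_1..d_K, average delay per interface (approximated by RTT).
    interest_counts: n_1..n_K, Interest packets forwarded per interface.
    """
    if len(delays_ms) != len(interest_counts):
        raise ValueError("D_t and N_t must both have K entries")
    d_t = tuple(float(d) for d in delays_ms)
    n_t = tuple(int(n) for n in interest_counts)
    return d_t, n_t


def flatten(state):
    """Concatenate D_t and N_t into one flat list (an assumed NN input layout)."""
    d_t, n_t = state
    return list(d_t) + [float(n) for n in n_t]


# Hypothetical measurements for K = 3 interfaces.
s_t = build_state([40, 10, 7], [0, 12, 30])
print(flatten(s_t))
```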
Slide11
Basic Training Algorithm (Cont.)
The IFS-RL Model (Cont.)
- Action
  - Choose an interface based on the learned policy μ
- Reward
  - Negative average RTT of all packets between two consecutive actions
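The reward definition can be written down directly. In this minimal sketch, the function name and the handling of an interval with no returned packets (reward 0.0) are assumptions; the slides only state that the reward is the negative average RTT.

```python
def reward(rtts_ms):
    """Negative average RTT of the packets seen between two consecutive actions.

    rtts_ms: RTT samples (ms) collected since the previous action.
    Returns 0.0 when no samples exist; this fallback is an assumption.
    """
    if not rtts_ms:
        return 0.0
    return -sum(rtts_ms) / len(rtts_ms)


# Hypothetical RTT samples: lower delay yields a higher (less negative) reward.
print(reward([10.0, 20.0, 30.0]))  # -20.0
```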
Slide12
Basic Training Algorithm (Cont.)
The IFS-RL Model (Cont.)
- Policy π(s_t, a_t) (continuous domain)
- Deep Deterministic Policy Gradient (DDPG) [T. P. Lillicrap et al. '15]
- Actor-critic method
[Figure: Actor Net. and Critic Net.; layers labeled 1-D Conv. Layer, Dense Hid. Layer, Output Layer]
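For reference, the standard DDPG updates from the DDPG paper are sketched below. They are the generic formulation, not IFS-RL specifics: the minibatch size N, the discount factor γ, and the target networks Q' and μ' belong to standard DDPG and are not stated on the slides.

```latex
% Critic: regress Q toward a bootstrapped target built from target networks
y_i = r_i + \gamma \, Q'\!\left(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right)
\qquad
L(\theta^{Q}) = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - Q(s_i, a_i \mid \theta^{Q}) \right)^2

% Actor: follow the deterministic policy gradient through the critic
\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_{i=1}^{N}
  \nabla_{a} Q(s, a \mid \theta^{Q}) \Big|_{s = s_i,\, a = \mu(s_i)}
  \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \Big|_{s = s_i}
```

Here the actor μ maps a state to an action and the critic Q scores the state-action pair, matching the Actor Net. and Critic Net. named on the slide.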
Slide13
Learning Granularity
Setting of learning granularity
- Massive numbers of packets to be processed
- Let calculation keep up with Pkt. arrival
- Put the learning granularity as a part of the action space
  - Use the combination of Selected interface & Num. of time intervals
  - Action: (Interface, # Time intervals)
Slide14
Learning Granularity (Cont.)
IFS-RL Algorithm (considering the learning granularity)
- Observe state information s_t = (D_t, N_t)
- Take action a_t according to the learned policy μ
  - Selected interface i
  - Learning granularity T_lg
- During the period of time T_lg
  - Forward all the Interest Pkt. through interface i
  - Calculate reward r_t
- Update the NNs' parameters according to (s_t, a_t, r_t)
- Start the next round of learning
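The steps above can be sketched as one learning round. Everything outside `run_round` is a hypothetical stand-in so the example runs without a simulator: the stub policy, the fixed RTT table, and the five packets per interval are made up, and `update` only logs the transition instead of training NNs.

```python
def run_round(observe, policy, forward, update):
    """One IFS-RL learning round with the granularity as part of the action."""
    state = observe()                        # s_t = (D_t, N_t)
    interface, n_intervals = policy(state)   # a_t = (interface i, T_lg)
    rtts = []
    for _ in range(n_intervals):             # during the period T_lg ...
        rtts.extend(forward(interface))      # ... all Interests use interface i
    # r_t: negative average RTT over the round (0.0 if nothing came back).
    reward = -sum(rtts) / len(rtts) if rtts else 0.0
    action = (interface, n_intervals)
    update(state, action, reward)            # learn from (s_t, a_t, r_t)
    return state, action, reward


# --- Hypothetical stand-ins, not part of IFS-RL ---
BASE_RTT_MS = {0: 40.0, 1: 10.0}
transitions = []

def observe():
    return ([40.0, 10.0], [0, 0])

def policy(state):
    delays, _ = state
    best = min(range(len(delays)), key=lambda i: delays[i])
    return best, 3                           # hold the choice for 3 intervals

def forward(interface):
    return [BASE_RTT_MS[interface]] * 5      # five packets per interval

def update(state, action, reward):
    transitions.append((state, action, reward))

print(run_round(observe, policy, forward, update))
```

Because the interface is held for the whole period T_lg, the policy is evaluated once per round rather than once per packet, which is how the computation keeps up with packet arrival.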
Slide15
Enhancement for Topo. Change
Network Topology Changes
- Lead to dimensional changes of s_t and a_t
- Set input and output formats to span the max. # of interfaces
  - E.g., ordinary routers with a max. # of interfaces of 48
  - Zero out unavailable interfaces
- Interpretation of actor network's output
  - Apply a mask to the (softmax) actor net.'s output layer
  - 0-1 vector [m_1, m_2, …, m_k]
  - p_i: normalized probability for action i
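One common way to apply a 0-1 mask to a softmax output is p_i = m_i·exp(z_i) / Σ_j m_j·exp(z_j), where z_i are the raw output-layer values. The sketch below implements that form; the exact formula used in IFS-RL is not recoverable from the slide text, so treat this as an assumed variant with hypothetical names.

```python
import math

def masked_softmax(logits, mask):
    """Softmax over available interfaces only.

    logits: raw actor outputs z_1..z_k (padded to the max. # of interfaces).
    mask:   0-1 vector m_1..m_k; 0 marks an unavailable interface.
    Returns p_1..p_k with p_i = 0 wherever m_i = 0 and sum(p) = 1.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("no available interface")
    # Subtract the largest available logit for numerical stability.
    peak = max(z for z, m in zip(logits, mask) if m)
    weights = [math.exp(z - peak) if m else 0.0 for z, m in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


# Interface 2 (index 1) is unavailable, so it receives probability 0.
print(masked_softmax([1.0, 2.0, 3.0], [1, 0, 1]))
```

With this form, the network's input and output sizes stay fixed while the set of usable interfaces changes with the topology.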
Slide16
Preliminary Experiments
Slide17
Experiment Results
Experiment setting
- Simulation experiments in ndnSIM
- Metrics: Throughput & Drop rate
- Comp. with BestRoute [A. Afana et al. '12] & EPF [K. Lei et al. '15]
- Simulation topology:
[Figure: simulation topology with Consumer, Producer, and routers R1 to R6; link bandwidths of 4 Mbps, 7 Mbps, and 10 Mbps]
Slide18
Experiment Results (Cont.)
Simulation experiment
- Simulation topology
  - 4 links between consumer & producer
  - With 1 link having smaller delay: R1-R3-R6
- Pkt. Size
  - Interest Pkt.: 40 bytes
  - Data Pkt.: 1024 bytes
[Figure: simulation topology with Consumer, Producer, and routers R1 to R6; link delays of 40 ms, 10 ms, and 7 ms]
Slide19
Experiment Results (Cont.)
Experimental Results
- Consumer sends Interest Pkt. at a constant rate of 1500 Pkt./sec for 50 sec
[Figure: two panels, Throughput and Drop Rate, with the IFS-RL curves labeled]
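As a rough back-of-the-envelope check of the workload, the sketch below converts the stated Interest rate and the packet sizes from the simulation setup (40-byte Interest, 1024-byte Data) into offered load. It assumes every Interest is answered by exactly one Data packet and ignores any extra header overhead; both are simplifying assumptions.

```python
INTEREST_RATE_PPS = 1500   # Interest Pkt. per second (from the slide)
DURATION_S = 50            # sending duration in seconds (from the slide)
INTEREST_BYTES = 40        # Interest Pkt. size (from the simulation setup)
DATA_BYTES = 1024          # Data Pkt. size (from the simulation setup)

def mbps(packets_per_s, bytes_per_packet):
    """Convert a packet rate and packet size into megabits per second."""
    return packets_per_s * bytes_per_packet * 8 / 1e6

total_interests = INTEREST_RATE_PPS * DURATION_S
interest_load = mbps(INTEREST_RATE_PPS, INTEREST_BYTES)
data_load = mbps(INTEREST_RATE_PPS, DATA_BYTES)

print(total_interests)   # 75000 Interests over the whole run
print(interest_load)     # about 0.48 Mbps toward the producer
print(data_load)         # about 12.288 Mbps back, if every Interest is satisfied
```

Under these assumptions the returning Data traffic would be about 12.3 Mbps, above the largest link bandwidth listed for the topology (10 Mbps), so no single link could carry the full load.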
Slide20
Experiment Results (Cont.)
Link Utilization
- Load balance of IFS-RL is not the best
  - Maximize throughput & minimize Pkt. drop rate
  - Tend to choose the interface with minimum RTT
[Figure: link utilization of BestRoute, EPF, and IFS-RL]
Slide21
Conclusion
Slide22
Conclusion
IFS-RL
- An intelligent forwarding strategy
  - Deep Reinforcement Learning (DRL)
  - Deep Deterministic Policy Gradient (DDPG)
- Learning granularity
  - Incorporate learning granularity into the action space
- Network topology changes
  - Set input and output formats to span the max. # of interfaces
  - Introduce a softmax mask
- Simulation experiment
  - Achieve higher throughput & lower drop rate
  - Need improvement in load balancing
Slide23
Thank You!
Q&A
For implementation details, please contact Yi Zhang (1601214039@sz.pku.edu.cn)