A Comparative Study of Network Link Emul...
Nussbaum, Olivier Richard imagfr
Keywords: network emulation, software, accuracy
Abstract: Between discrete event simulation and evaluation within real networks, network emulation is a useful tool to study and evaluate the behaviour of applications. Using a r
a network emulation facility built into Linux's Traffic Control (TC) subsystem. A network emulator named hxbt [14] is also available in OpenSolaris. Finally, hardware solutions exist, such as GtrcNET-1 [15] (using an FPGA) or the products from Anué [16].

In the remainder of this work, we focus on Dummynet, NISTNet and TC/Netem, for three main reasons:
- Firstly, those three solutions are of production quality, and are no longer prototypes. They are ready to be used by researchers;
- Secondly, they are freely available on operating systems (Linux and FreeBSD) that are commonly available in laboratories;
- Finally, they are already being used by the research community, either directly, or integrated into virtual network emulators. For example, Emulab uses Dummynet on its FreeBSD nodes and Linux/TC on its Linux nodes, while V-em uses NISTNet.

3. FEATURES

Dummynet, NISTNet and TC/Netem use the same principle. They capture incoming or outgoing packets, and use a set of rules and queues to store the packets until they determine that the packet can be released to the operating system (in the case of incoming packets) or to the network (in the case of outgoing packets). However, their implementations and features differ.

Table 1 presents the features of Dummynet, NISTNet, and TC/Netem. NISTNet and Netem have very similar features, and actually share some code, but their designs are totally different. While NISTNet is built as a standalone module and relies on the real-time clock for timing (which is normally only used to keep track of time when a computer is powered off), Netem is tightly integrated within the Linux Traffic Control subsystem (usually used to enforce quality of service inside networks), and relies on the same timing source as the rest of the kernel. Also, Netem is distributed with Linux, while NISTNet is distributed separately. NISTNet is currently only available for versions of Linux lower than 2.6.14 (we successfully used it on Linux 2.6.13.5), but we ported it to Linux 2.6.26 (see footnote 1).

Dummynet uses a totally different codebase, and has been integrated into FreeBSD since FreeBSD 4. Its main advantage over NISTNet and Netem is that it works on both incoming and outgoing packets. However, Dummynet cannot emulate degraded network conditions (packet duplication or corruption).
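The shared principle described above (capture a packet, hold it in a queue, release it once its delay has elapsed) can be illustrated with a minimal sketch. The `DelayLine` class, its method names and the microsecond time unit are hypothetical and chosen for illustration only; none of the three emulators is implemented this way in detail.

```python
import heapq
from itertools import count


class DelayLine:
    """Hypothetical emulator core: holds packets until their release time."""

    def __init__(self, delay_us):
        self.delay_us = delay_us   # configured one-way delay, in microseconds
        self._queue = []           # min-heap ordered by release time
        self._seq = count()        # tie-breaker that preserves arrival order

    def enqueue(self, packet, now_us):
        """Capture a packet and schedule its release."""
        release_at = now_us + self.delay_us
        heapq.heappush(self._queue, (release_at, next(self._seq), packet))

    def release_due(self, now_us):
        """Return every packet whose release time has been reached."""
        released = []
        while self._queue and self._queue[0][0] <= now_us:
            released.append(heapq.heappop(self._queue)[2])
        return released


line = DelayLine(delay_us=10_000)        # 10 ms of emulated latency
line.enqueue("p1", now_us=0)
line.enqueue("p2", now_us=2_000)
print(line.release_due(now_us=9_000))    # nothing is due yet
print(line.release_due(now_us=10_000))   # p1 has waited 10 ms
print(line.release_due(now_us=12_000))   # p2 has waited 10 ms
```

In this sketch the caller decides when `release_due` runs. In a real emulator that decision is driven by a timing source, so a packet can only leave at the moments the queue is actually examined.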
Footnote 1: Our patch is available at http://perso.ens-lyon.fr/lucas.nussbaum/#nistnet.

4. PERFORMANCE EVALUATION

In this section, we study the performance of Dummynet, NISTNet and Linux/TC. We investigate how closely the emulated network's characteristics match the parameters provided by the user. For given latency and bandwidth parameters, we measure the resulting latency and bandwidth on the emulated network.

4.1. Experimental setup

Figure 1. Experimental setup

The following experiments all use the same network and system configuration, shown in Figure 1. The platform consists of 3 nodes (Dual-Opteron 2.0 GHz with 2 GB of RAM) of the GridExplorer cluster (part of the French nation-wide project Grid'5000). The Router is configured to route packets between Node 1 and Node 2. Nodes 1 and 2 are running Linux 2.6.26, while the Router uses Linux 2.6.22 or 2.6.26 (for TC/Netem), Linux 2.6.26 (for NISTNet), or FreeBSD 6.1 or 7.0 (for Dummynet). Network cards are dual-port Broadcom BCM5780 Gigabit Ethernet controllers integrated in the nodes' motherboard. Without configuring network emulation on the router, we measured a maximum bandwidth of 943 Mbps and a Round Trip Time (RTT) of about 180 µs between nodes 1 and 2.

4.2. Time source and accuracy of latency emulation

"The entire focus of the industry is on bandwidth, but the true killer is latency."
Prof. M. Satyanarayanan, Keynote at ACM Mobicom '96

Latency emulation is an important aspect of network emulation. On today's networks, most of the latency is often caused directly by physical constants such as the speed of light in optical fiber, and can't be expected to improve in the near future. How applications deal with latency is increasingly important for performance, since the available bandwidth keeps increasing.
Table 1. Features of Dummynet, NISTNet, and TC/Netem

| Feature | Dummynet | NISTNet | TC/Netem |
|---|---|---|---|
| Availability | Included in FreeBSD | Available for Linux 2.4 and 2.6 (< 2.6.14), patch available for more recent versions | Included in Linux 2.6 |
| Time resolution | System clock (up to 10 kHz) | Real-time clock | System clock (up to 1 kHz) or high resolution timers |
| Interception point | Input and output | Input only | Output only |
| Latency | Yes, constant value | Yes, with optionally correlated jitter following uniform, normal, Pareto, or normal+Pareto distributions | Yes, with optionally correlated jitter following uniform, normal, Pareto, or normal+Pareto distributions |
| BW limitation | Yes, delay to add to packets is computed when they enter Dummynet | Yes, delay to add to packets is computed when they enter NISTNet | Yes, using the Token Bucket Filter from TC |
| Packet drop | Yes, but without correlation | Yes, optionally correlated | Yes, optionally correlated |
| Packet reordering | No | Yes, optionally correlated | Yes, optionally correlated |
| Packet duplication | No | Yes, optionally correlated | Yes, optionally correlated |
| Packet corruption | No | Yes, optionally correlated | Yes, optionally correlated |

The accuracy of the emulation depends heavily on the time source used by the software. While NISTNet uses the real-time clock configured at 8192 Hz, both Dummynet and Netem use the same timers as the rest of the kernel. On FreeBSD (Dummynet), the timer interrupt frequency is configured by the HZ variable of the kernel configuration, whose default value is 100 Hz.

The situation is different on Linux. In older kernel versions (until Linux 2.6.22 on i386 and 2.6.24 on x86_64), Netem was using the timer interrupts (configured at 250 Hz by default), like Dummynet on FreeBSD. But in addition to being examined at each timer interrupt, Netem's queue was also examined each time a packet entered Netem, which, under heavy traffic, could hide problems caused by a low timer frequency. In newer kernel versions, Netem uses a new subsystem called High Resolution Timers [17], which allows much more precise timing of interrupts.

We evaluate those different solutions by measuring the RTT over time, by sending pings with a high frequency. If the frequency of timer interrupts is not high enough, we would observe variations in the measured RTT. Since packets can only be released by the emulator when a timer interrupt occurs, they might be released slightly too early, or slightly too late, depending on how the rounding happens. This will cause variations in the emulated latency.

The accuracy of standard ping implementations, which use gettimeofday() to measure the time, was not sufficient for our purposes. We modified a ping implementation to use the CPU Timestamp Counter (RDTSC assembler instruction), to achieve both high measurement frequency (10-20 kHz) and microsecond precision.

We measured the latency over time between nodes 1 and 2 (Figure 1) when configuring the emulators to delay packets from node 1 to node 2 for 10 ms, and evaluated the following configurations for the router:
- Linux 2.6.22 with Linux/TC on x86_64, using timer interrupts, with a frequency of 100 Hz, 250 Hz (the default value on Linux) and 1000 Hz;
- Linux 2.6.26 with Linux/TC on x86_64, using High Resolution Timers. We also verified that changing the timer interrupt frequency (100 Hz, 250 Hz, 1000 Hz) didn't change our results with this configuration;
- Linux 2.6.26 with NISTNet;
- FreeBSD 7.0, with a frequency of 100 Hz, 1 kHz, and 10 kHz. For some experiments, we also compared the results with FreeBSD 6.1.

Figures 2 and 3 show the results for all of those configurations. The configurations are split into 3 groups, each providing similar results, to ease comparisons. For each configuration, the plot on the left (Figure 2) gives the evolution of latency over time, measured using pings sent with a very high frequency, while the plot on the right (Figure 3) gives the distribution function of latency, measured with pings sent at random intervals.

Several configurations exhibit a sawtooth behaviour, which can easily be explained: since packets can only be dequeued when a timer interrupt happens, the duration of their stay in the emulator's queue depends on their arrival time. Packets arriving long before the next timer interrupt will stay longer in the queue than packets arriving just before a timer interrupt.

This sawtooth behaviour could create a bias in experimental results. At the network level, equipment (routers, switches) might not be able to handle a sudden burst of packets, and might drop packets. At the application level, those bursts of packets will increase the need for large buffers, and might desynchronize processes that would otherwise be synchronized.

With Linux 2.6.22 and FreeBSD 7.0, one can clearly see the influence of the timer frequency. Increasing the frequency makes the emulation more accurate. With a low frequency (100 Hz or 250 Hz), the variations of latency are very large. For example, on Linux, with a clock configured at 100 Hz, the emulated latency varies between 13 ms and 30 ms when the user configures a latency of 10 ms. With FreeBSD 7.0, one can also see that the accuracy doesn't improve when the timer frequency is changed from 1 kHz to 10 kHz. With FreeBSD 6.1 (Figure 2, group 3), this is not the case: a frequency of 10 kHz provides latency emulation that is 10 times more precise than with a clock at 1 kHz.

Because of differences in the algorithms used to emulate latency, one can also see that with Linux TC the emulated latency is always higher than the configured one. With Dummynet, on the contrary, it is lower than the configured latency most of the time.

Finally, 3 solutions provide reasonable performance (most of the remaining difference between the emulated latency and the configured latency can be explained by the physical latency of the experiment's network):
- NISTNet, because it doesn't use the same timer interrupts as the rest of the system, but a clock configured at 8192 Hz;
- FreeBSD 6.1 configured with a timer frequency of 10 kHz;
- Linux with High Resolution Timers.

However, increasing the interrupt frequency with NISTNet and FreeBSD 6.1 is not cost-free, because it implies that the interrupt handling routine is executed much more often, causing useless context switches between user space and kernel space, and cache thrashing.

We used a simple CPU-intensive program (an extremely simple calculation with no memory pressure, Ackermann's function, performed 50 billion times) to show the influence of the timer frequency on performance. The execution times of this program on FreeBSD using different timer frequency settings are detailed in Table 2. At 10 kHz, one can see a 3.3% slowdown in the system's speed, which, in some circumstances, could become a problem.

Table 2. Average execution time of a CPU-intensive program on FreeBSD using different timer interrupt frequency settings.

| HZ value | Execution time | Overhead |
|---|---|---|
| 100 Hz | 81.5 s | |
| 1000 Hz | 81.7 s | 0.2% |
| 10000 Hz | 84.2 s | 3.3% |

It is also worth noting that NISTNet suffers from the same problem, even though it uses a separate clock for timing. After loading the NISTNet kernel module, the execution of the same program took 86.8 s (overhead of 6.3%).

High Resolution Timers don't suffer from the same problems. When they are enabled, but not used, they don't slow down the rest of the system. However, when they are used, they increase the number of interrupts. Since they are more precise, packets won't be sent in groups, but separately, with a different timer interrupt for each packet.

4.3. Bandwidth limitation

Bandwidth limitation is the other important aspect of network emulation. Many of today's network links have very limited bandwidth, or an asymmetric bandwidth, such as broadband or 3G networks. Most experimental testbeds don't include systems hosted on such connections, so it is important for researchers to be able to emulate such links.

The implementation of bandwidth limitation differs between network emulators. While NISTNet and Dummynet simply compute the delay to add to a specific packet based on the configured bandwidth and the current state of the queue, TC uses a Token-Bucket algorithm to shape traffic.

In this experiment, we compared the desired rate with the one measured using iperf. Using iperf adds a small bias to the measurement, because iperf measures the available bandwidth using a TCP stream, while the bandwidth limitation sets the bandwidth available for IP packets. The experiment was carried out on Ethernet, so the interface MTU (Maximum Transmission Unit) was set to 1500 bytes. The IP and TCP headers use 52 bytes, so the measured bandwidth was corrected by 3.6% to include the IP and TCP headers.

Figures 4 and 5 compare the corrected measured bandwidth using Dummynet (with a timer interrupt frequency of 10 kHz), NISTNet, and TC/Netem (with Linux 2.6.26). Differences between the achieved bandwidths are limited, but Dummynet and NISTNet are slightly more accurate than TC. When looking more closely at the results when the desired bandwidth is high (between 500 Mbps and 1 Gbps, second
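The 3.6% header correction mentioned in Section 4.3 can be reproduced with a short calculation. The sketch below assumes that every packet is a full-size segment and that the correction is the ratio of the MTU to the TCP payload size; the text gives the figures (1500-byte MTU, 52 bytes of headers) but not the exact formula, and the function name `ip_level_bandwidth` is hypothetical.

```python
MTU_BYTES = 1500      # Ethernet MTU used in the experiment
HEADER_BYTES = 52     # IP + TCP headers, as stated in the text


def ip_level_bandwidth(tcp_rate_mbps, mtu=MTU_BYTES, headers=HEADER_BYTES):
    """Scale a TCP payload rate (as reported by iperf) up to the IP packet rate.

    Assumes every packet carries a full (mtu - headers)-byte payload.
    """
    payload = mtu - headers
    return tcp_rate_mbps * mtu / payload


# Relative correction applied to the measured bandwidth.
correction = MTU_BYTES / (MTU_BYTES - HEADER_BYTES) - 1
print(f"correction: {correction:.1%}")   # correction: 3.6%
```

Under these assumptions, each 1448-byte payload occupies 1500 bytes at the IP level, which is where the 3.6% figure comes from.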
Configuration of TC's Token Bucket Filter

The Token Bucket Filter used with TC to limit the bandwidth is also a source of discrepancies between the configured rate and the measured rate. Since its original goal was to provide Quality of Service (QoS) inside networks, it uses a complex algorithm. This algorithm allows bursts of packets to go through at a rate faster than the configured rate; if the line is idle, there is no need to delay a very short but very intensive connection. This is of course not a good idea with network emulation, but a work-around exists with the peakrate parameter, which adds a second TBF with a very small bucket, to avoid bursting.

However, this second token bucket adds complexity to the configuration of the Token Bucket Filter. This makes it very difficult to determine settings that will be emulated at the desired bandwidth. By contrast, configuration of Dummynet or NISTNet is easier. It is important to verify that the settings are correct before conducting the experiment.

5. USER INTERFACES

While the performance and the accuracy of network emulators are important, their usability is also an important aspect to consider.

Both Dummynet and NISTNet use a rule-based configuration, similar to the configuration of firewalls, which makes them easy to understand, especially for users already familiar with firewall configurations. However, they lack support for complex hierarchical sets of rules, which could be a problem if the user is trying to emulate a complex network topology.

Linux TC uses another approach. Its configuration is done with a hierarchical set of qdiscs (queueing disciplines) and classes. It is more powerful, but also more difficult to understand.

6. INTERCEPTION POINT

One important advantage of Dummynet over NISTNet and TC is that it can capture both incoming and outgoing packets. NISTNet only allows emulation of incoming packets, while TC only allows emulation of outgoing packets, which is logical since it was designed as a traffic shaper, not as an emulator. However, in many cases, it is necessary to perform emulation of incoming packets as well, for example if the user wants to perform emulation on the system where the application is running (without using an intermediate router).

A solution exists for TC with the ifb device (Intermediate Functional Block), which is a dummy (software-only) network device. It is possible to redirect all incoming packets to the ifb device, and to apply emulation parameters when packets exit the ifb device. Figure 6 shows how to apply 50 ms of latency to incoming packets. However, one can question the overhead caused by such a convoluted solution.

    # initialize ifb
    modprobe ifb
    ifconfig ifb0 up
    # add an ingress qdisc to process
    # incoming packets
    tc qdisc add dev eth0 ingress
    # redirect everything to ifb0
    tc filter add dev eth0 parent ffff: \
        protocol ip prio 10 \
        u32 match ip src 0.0.0.0/0 flowid ffff: \
        action mirred egress redirect dev ifb0
    # set up netem on ifb0
    tc qdisc add dev ifb0 root \
        netem delay 50ms

Figure 6. Using the ifb dummy device to apply emulation parameters on incoming packets

Using a GtrcNET-1, an FPGA-based hardware network emulator and measurement tool, we measured the time taken by packets to traverse a computer acting as a router. In the first case, no TC configuration was used. In the second case, an IFB device was added, and incoming packets were redirected to it, but the IFB device didn't perform any emulation. Figure 7 shows that the difference between the two cases is minor (about 5.2 µs), and probably negligible in most cases.

Figure 7. ECDF of the traversal time of a computer acting as a router, with and without redirecting incoming packets to an IFB device

The CPU overhead is unfortunately more significant. Under very heavy network load (1 Gbps, small UDP packets generated using iperf), our test system showed that the CPU was used about 40% of the time without IFB. When IFB was added, it increased to about 50%. This could be a problem
when the application is running on the same node as the emulator, since increased CPU usage of the emulator might affect the application's performance.

7. FUTURE WORK

One aspect that was deliberately ignored by this study is the cost of network emulation on the router. If the router is dedicated to network emulation, this is unlikely to become a problem. However, in many cases, it is interesting to execute the application under study directly on the system on which network emulation takes place. In that case, network emulation could affect the results significantly.

Secondly, for some experiments, it might be necessary to configure many concurrent queues (for example, on a router emulating the network links of a high number of systems). The performance of network emulators might become a problem when used to emulate a high number of different links. In particular, the algorithm for matching packets and queues will then be of high importance, and should be examined.

8. CONCLUSION

Network emulators allow one to easily perform experiments under various network conditions, enabling researchers to evaluate their algorithms in different environments. However, the fact that different solutions exist, and that they had never been compared before, limited their widespread use.

This work focuses on three network link emulators: Dummynet, NISTNet and the Linux Traffic Control (TC) subsystem, which are freely available in widely used operating systems. Those three emulators have also been used as building blocks for large-scale emulation platforms like Emulab. We contribute a detailed comparison of those tools, including a study of the accuracy of latency and bandwidth emulation.

Our work pinpoints several issues. First, latency emulation exhibits a sawtooth behaviour that could create a bias in experiments. A high-frequency timer source mitigates this problem, but increasing the timer frequency causes an overhead which might be a problem in some experiments. We demonstrate how recent changes in the Linux kernel (high resolution timers) improve that situation. Second, we describe how bandwidth emulation, while being of reasonable quality in all three emulators, also suffers from problems. Dummynet doesn't allow one to achieve very high emulated bandwidth, and the timer frequency might lead to burstiness if the emulated bandwidth is high, leading to unrealistic traffic.

Finally, we provide a set of configurations that, for several reasons, don't exhibit some of those problems. It is important that users are aware of those problems, and validate their emulators' settings before performing experiments. Network emulators are powerful tools, but should not be treated as black boxes.

REFERENCES

[1] Jong Suk Ahn, Peter B. Danzig, Zhen Liu, and Limin Yan. Evaluation of TCP Vegas: emulation and experiment. SIGCOMM Comput. Commun. Rev., 25(4), 1995.
[2] H. J. Song, X. Liu, D. Jakobsen, R. Bhagwan, X. Zhang, K. Taura, and A. Chien. The MicroGrid: a scientific tool for modeling computational grids. In Supercomputing '00: Proceedings of the 2000 ACM/IEEE conference on Supercomputing, 2000.
[3] Amin Vahdat, Ken Yocum, Kevin Walsh, Priya Mahadevan, Dejan Kostic, Jeff Chase, and David Becker. Scalability and accuracy in a large-scale network emulator. In OSDI '02, 2002.
[4] Brian White, Jay Lepreau, Leigh Stoller, Robert Ricci, Shashi Guruprasad, Mac Newbold, Mike Hibler, Chad Barb, and Abhijeet Joglekar. An integrated experimental environment for distributed systems and networks. SIGOPS Oper. Syst. Rev., 36(SI), 2002.
[5] Pei Zheng and Lionel M. Ni. Empower: A cluster architecture supporting network emulation. IEEE Trans. Parallel Distrib. Syst., 15(7), 2004.
[6] M. Zec and M. Mikuc. Operating system support for integrated network emulation in IMUNES. In Proceedings of the 1st Workshop on Operating System and Architectural Support for the on demand IT InfraStructure, 2004.
[7] George Apostolopoulos and Constantinos Hassapis. V-em: A cluster of virtual machines for robust, detailed, and high-performance network emulation. In MASCOTS '06, 2006.
[8] P. Vicat-Blanc Primet, R. Takano, Y. Kodama, T. Kudoh, O. Gluck, and C. Otal. Large scale gigabit emulated testbed for grid transport evaluation. In PFLDnet 2006, 2006.
[9] David B. Ingham and Graham D. Parrington. Delayline: A wide-area network emulation tool. Computing Systems, 7(3), 1994.
[10] Mark Allman, Adam Caldwell, and Shawn Ostermann. ONE: The Ohio network emulator. Technical Report TR-19972, Ohio University, August 1997.
[11] Luigi Rizzo. Dummynet: a simple approach to the evaluation of network protocols. ACM Computer Communication Review, 27(1), 1997.
[12] Mark Carson and Darrin Santay. NISTNet: a Linux-based network emulation tool. SIGCOMM Comput. Commun. Rev., 33(3), 2003.
[13] Stephen Hemminger. Network emulation with NetEm. In linux.conf.au 2005, 2005.
[14] Hxbt: WAN emulator for Solaris. http://www.opensolaris.org/os/community/networking/readme.hxbt.txt.
[15] Y. Kodama, T. Kudoh, R. Takano, H. Sato, O. Tatebe, and S. Sekiguchi. Gnet-1: gigabit ethernet network testbed. In CLUSTER '04, 2004.
[16] Anué Systems. http://www.anuesystems.com.
[17] Thomas Gleixner and Douglas Niehaus. Hrtimers and beyond: Transforming the Linux time subsystems. In Proceedings of the Ottawa Linux Symposium, 2006.