Cloud Control with Distributed Rate Limiting, by Barath Raghavan, Kashi Vishwanath, and Sriram Ramabhadran


Cloud Control with Distributed Rate Limiting

Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren
Department of Computer Science and Engineering
University of California, San Diego

ABSTRACT
Today's cloud-based services integrate globally distributed resources into seamless computing platforms. Provisioning and accounting for the resource usage of these Internet-scale applications presents a challenging technical problem. This paper presents the design and implementation of distributed rate limiters, which work together to enforce a global rate limit across traffic aggregates at multiple sites, enabling the coordinated policing of a cloud-based service's network traffic. Our abstraction not only enforces a global limit, but also ensures that congestion-responsive transport-layer flows behave as if they traversed a single, shared limiter. We present two designs, one general purpose and one optimized for TCP, that allow service operators to explicitly trade off between communication costs and system accuracy, efficiency, and scalability. Both designs are capable of rate limiting thousands of flows with negligible overhead (less than 3% in the tested configuration). We demonstrate that our TCP-centric design is scalable to hundreds of nodes while robust to both loss and communication delay, making it practical for deployment in nationwide service providers.

Categories and Subject Descriptors
C.2.3 [Computer Communication Networks]: Network management

General Terms
Algorithms, Management, Performance

Keywords
Rate Limiting, Token Bucket, CDN

1. INTRODUCTION
Yesterday's version of distributed computing was a self-contained, co-located server farm. Today, applications are increasingly deployed on third-party resources hosted across the Internet. Indeed, the rapid spread of open protocols and standards like Web 2.0 has fueled an explosion of compound services that script together third-party components to deliver a sophisticated service [27, 29]. These specialized services are just the beginning: flagship consumer and enterprise applications are increasingly being delivered in the software-as-a-service model [9]. For example, Google Documents, Groove Office, and Windows Live are early examples of desktop applications provided in a hosted environment, and represent the beginning of a much larger trend.

Aside from the functionality and management benefits Web-based services afford the end user, hosted platforms present significant benefits to the service provider as well. Rather than deploy each component of a multi-tiered application within a particular data center, so-called "cloud-based services" can transparently leverage widely distributed computing infrastructures. Google's service, for example, reportedly runs on nearly half-a-million servers distributed around the world [8]. Potential world-wide scale need not be limited to a few large corporations, however. Recent offerings like Amazon's Elastic Compute Cloud (EC2) promise to provide practically infinite resources to anyone willing to pay [3].

One of the key barriers to moving traditional applications to the cloud, however, is the loss of cost control [17]. In the cloud-based services model, cost recovery is typically accomplished through metered pricing. Indeed, Amazon's EC2 charges incrementally per gigabyte of traffic consumed [3]. Experience has shown, however, that ISP customers prefer flat fees to usage-based pricing [30]. Similarly, at a corporate level, IT expenditures are generally managed as fixed-cost overheads, not incremental expenses [17]. A flat-fee model requires the ability for a provider to limit consumption to control costs. Limiting global resource consumption in a distributed environment, however, presents a significant technical challenge. Ideally, resource providers would not require services to specify the resource demands of each distributed component a priori; such fine-grained measurement and modeling can be challenging for rapidly evolving services. Instead, they should provide a fixed price for an aggregate, global usage, and allow services to consume resources dynamically across various locations, subject to the specified aggregate limit.

In this paper, we focus on a specific instance of this problem: controlling the aggregate network bandwidth used by a cloud-based service, or distributed rate limiting (DRL). Our goal is to allow a set of distributed traffic rate limiters to collaborate to subject a class of network traffic (for example, the traffic of a particular cloud-based service) to a single, aggregate global limit. While traffic policing is common in data centers and widespread in today's networks, such limiters typically enforce policy independently at each location [1]. For example, a resource provider with 10 hosting centers may wish to limit the total amount of traffic it carries for a particular service to 100 Mbps. Its current options are to either limit the service to 100 Mbps at each hosting center (running the risk that they may all use this limit simultaneously, resulting in 1 Gbps of total traffic), or to limit each center to a fixed portion (i.e., 10 Mbps), which over-constrains the service traffic aggregate and is unlikely to allow the service to consume its allocated budget unless traffic is perfectly balanced across the cloud.

The key challenge of distributed rate limiting is to allow individual flows to compete dynamically for bandwidth not only with flows traversing the same limiter, but with flows traversing other limiters as well. Thus, flows arriving at different limiters should achieve the same rates as they would if they all were traversing a single, shared rate limiter. Fairness between flows inside a traffic aggregate depends critically on accurate limiter assignments, which in turn depend upon the local packet arrival rates, numbers of flows, and up/down-stream bottleneck capacities. We address this issue by presenting the illusion of passing all of the traffic through a single token-bucket rate limiter, allowing flows to compete against each other for bandwidth in the manner prescribed by the transport protocol(s) in use. For example, TCP flows in a traffic aggregate will share bandwidth in a flow-fair manner [6]. The key technical challenge to providing this abstraction is measuring the demand of the aggregate at each limiter, and apportioning capacity in proportion to that demand.

This paper makes three primary contributions:

Rate Limiting Cloud-based Services. We identify a key challenge facing the practical deployment of cloud-based services and identify the chief engineering difficulties: how to effectively balance accuracy (how well the system bounds demand to the aggregate rate limit), responsiveness (how quickly limiters respond to varying traffic demands), and communication between the limiters. A distributed limiter cannot be simultaneously perfectly accurate and responsive; the communication latency between limiters bounds how quickly one limiter can adapt to changing demand at another.

Distributed Rate Limiter Design. We present the design and implementation of two distributed rate limiting algorithms. First, we consider an approach, global random drop (GRD), that approximates the number, but not the precise timing, of packet drops. Second, we observe that applications deployed using Web services will almost exclusively use TCP. By incorporating knowledge about TCP's congestion control behavior, we design another mechanism, flow proportional share (FPS), that provides improved scalability.

Evaluation and Methodology. We develop a methodology to evaluate distributed rate limiters under a variety of traffic demands and deployment scenarios using both a local-area testbed and an Internet testbed, PlanetLab. We demonstrate that both GRD and FPS exhibit long-term behavior similar to a centralized limiter for both mixed and homogeneous TCP traffic in low-loss environments. Furthermore, we show that FPS scales well, maintaining near-ideal 50-Mbps rate enforcement and fairness up to 490 limiters with a modest communication budget of 23 Kbps per limiter.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGCOMM'07, August 27-31, 2007, Kyoto, Japan.
Copyright 2007 ACM 978-1-59593-713-1/07/0008 ...$5.00.

2. CLASSES OF CLOUDS
Cloud-based services come in varying degrees of complexity; as the constituent services become more numerous and dynamic, resource provisioning becomes more challenging. We observe that the distributed rate limiting problem arises in any service composed of geographically distributed sites. In this section we describe three increasingly mundane applications, each illustrating how DRL empowers service providers to enforce heretofore unrealizable traffic policies, and how it offers a new service model to customers.

2.1 Limiting cloud-based services
Cloud-based services promise a "utility" computing abstraction in which clients see a unified service and are unaware that the system stitches together independent physical sites to provide cycles, bandwidth, and storage for a uniform purpose. In this context, we are interested in rate-based resources that clients source from a single provider across many sites or hosting centers.

For clouds, distributed rate limiting provides the critical ability for resource providers to control the use of network bandwidth as if it were all sourced from a single site. A provider runs DRL across its sites, setting global limits on the total network usage of particular traffic classes or clients. Providers are no longer required to migrate requests to accommodate static bandwidth limits. Instead, the available bandwidth gravitates towards the sites with the most demand. Alternatively, clients may deploy DRL to control aggregate usage across providers as they see fit. DRL removes the artificial separation of access metering and geography that results in excess cost for the client and/or wasted resources for service providers.

2.2 Content distribution networks
While full-scale cloud-based computing is in its infancy, simple cloud-based services such as content-distribution networks (CDNs) are prevalent today and can benefit from distributed rate limiting. CDNs such as Akamai provide content replication services to third-party Web sites. By serving Web content from numerous geographically diverse locations, CDNs improve the performance, scalability, and reliability of Web sites. In many scenarios, CDN operators may wish to limit resource usage either based on the content served or the requesting identity. Independent rate limiters are insufficient, however, as content can be served from any of numerous mirrors around the world according to fluctuating demand.

Using DRL, a content distribution network can set per-customer limits based upon service-level agreements. The CDN provides service to all sites as before, but simply applies DRL to all out-bound traffic for each site. In this way, the bandwidth consumed by a customer is constrained, as is the budget required to fund it, avoiding the need for CDNs to remove content for customers who cannot pay for their popularity.¹ Alternatively, the CDN can use DRL as a protective mechanism. For instance, the CoDeeN content distribution network was forced to deploy an ad-hoc approach to rate limit nefarious users across proxies [37]. DRL makes it simple to limit the damage on the CDN due to such behavior by rate limiting traffic from an identified set of users. In summary, DRL provides CDNs with a powerful tool for managing access to their clients' content.

2.3 Internet testbeds
PlanetLab supports the deployment of long-lived service prototypes. Each PlanetLab service runs in a slice, essentially a fraction of the entire global testbed consisting of 1/N of each machine. Currently PlanetLab provides work-conserving bandwidth limits at each individual site, but the system cannot coordinate bandwidth demands across multiple machines [18].

DRL dynamically apportions bandwidth based upon demand, allowing PlanetLab administrators to set bandwidth limits on a per-slice granularity, independent of which nodes a slice uses. In the context of a single PlanetLab service, the service administrator may limit service to a particular user. In Section 5.7 we show that DRL provides effective limits for a PlanetLab service distributed across North America. In addition, while we focus upon network rate limiting in this paper, we have begun to apply our techniques to control other important rate-based resources such as CPU.

¹ For example, Akamai customers are typically not rate limited and billed in arrears for actual aggregate usage, leaving them open to potentially large bills. If demand dramatically exceeds expectation and/or their willingness to pay, manual intervention is required [2].
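The demand-driven apportionment that DRL provides, in contrast to the static per-site splits described in the introduction, can be sketched in a few lines. The following is an illustrative sketch only, not code from the paper; the function names and the demand figures are invented for the example.

```python
def static_split(global_limit, n_sites):
    """Fixed per-site limits: simple, but over-constrains skewed demand."""
    return [global_limit / n_sites] * n_sites

def demand_proportional(global_limit, demands):
    """Per-site limits in proportion to current demand estimates."""
    total = sum(demands)
    if total == 0:
        # No demand anywhere; fall back to an even split.
        return static_split(global_limit, len(demands))
    return [global_limit * d / total for d in demands]

# A provider with 10 hosting centers and a 100-Mbps aggregate limit,
# where demand happens to be concentrated at two sites:
demands = [40, 35, 5, 5, 5, 2, 2, 2, 2, 2]   # Mbps, illustrative
limits = demand_proportional(100, demands)    # busy sites get most of it
```

Under the static split each site would be pinned to 10 Mbps regardless of load; under the proportional rule the aggregate still sums to the global limit while capacity follows demand.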
2.4 Assumptions and scope
Like centralized rate limiting mechanisms, distributed rate limiting does not provide QoS guarantees. Thus, when customers pay for a given level of service, providers must ensure the availability of adequate resources for the customer to attain its given limit. In the extreme, a provider may need to provision each limiter with enough capacity to support a service's entire aggregate limit. Nevertheless, we expect many deployments to enjoy considerable benefits from statistical multiplexing. Determining the most effective provisioning strategy, however, is outside the scope of this paper.

Furthermore, we assume that mechanisms are already in place to quickly and easily identify traffic belonging to a particular service [25]. In many cases such facilities, such as simple address or protocol-based classifiers, already exist and can be readily adopted for use in a distributed rate limiter. In others, we can leverage recent work on network capabilities [32, 39] to provide unforgeable means of attribution. Finally, without loss of generality, we discuss our solutions in the context of a single service; multiple services can be limited in the same fashion.

3. LIMITER DESIGN
We are concerned with coordinating a set of topologically distributed limiters to enforce an aggregate traffic limit while retaining the behavior of a centralized limiter. That is, a receiver should not be able to tell whether the rate limit is enforced at one or many locations simultaneously. Specifically, we aim to approximate a centralized token-bucket traffic-policing mechanism. We choose a token bucket as a reference mechanism for a number of reasons: it is simple, reasonably well understood, and commonly deployed in Internet routers. Most importantly, it makes instantaneous decisions about a packet's fate (packets are either forwarded or dropped) and so does not introduce any additional queuing.

We do not assume anything about the distribution of traffic across limiters. Thus, traffic may arrive at any or all of the limiters at any time. We use a peer-to-peer limiter architecture: each limiter is functionally identical and operates independently. The task of a limiter can be split into three separate subtasks: estimation, communication, and allocation. Every limiter collects periodic measurements of the local traffic arrival rate and disseminates them to the other limiters. Upon receipt of updates from other limiters, each limiter computes its own estimate of the global aggregate arrival rate that it uses to determine how to best service its local demand to enforce the global rate limit.

3.1 Estimation and communication
We measure traffic demand in terms of bytes per unit time. Each limiter maintains an estimate of both local and global demand. Estimating local arrival rates is well-studied [15, 34]; we employ a strategy that computes the average arrival rate over fixed time intervals and applies a standard exponentially-weighted moving average (EWMA) filter to these rates to smooth out short-term fluctuations. The estimate interval length and EWMA smoothing parameter directly affect the ability of a limiter to quickly track and communicate local rate changes; we determine appropriate settings in Section 5.

At the end of each estimate interval, local changes are merged with the current global estimate. In addition, each limiter must disseminate changes in local arrival rate to the other limiters. The simplest form of communication fabric is a broadcast mesh. While fast and robust, a full mesh is also extremely bandwidth-intensive (requiring O(N²) update messages per estimate interval). Instead, we implement a gossip protocol inspired by Kempe et al. [22]. Such "epidemic" protocols have been widely studied for distributed coordination; they require little to no communication structure, and are robust to link and node failures [10]. At the end of each estimate interval, limiters select a fixed number of randomly chosen limiters to update; limiters use any received updates to update their global demand estimates. The number of limiters contacted (the gossip branching factor) is a parameter of the system. We communicate updates via a UDP-based protocol that is resilient to loss and reordering; for now we ignore failures in the communication fabric and revisit the issue in Section 5.6. Each update packet is 48 bytes, including IP and UDP headers. More sophisticated communication fabrics may reduce coordination costs using structured approaches [16]; we defer an investigation to future work.

GRD-HANDLE-PACKET(P : Packet)
1  demand ← Σ_i r_i
2  bytecount ← bytecount + LENGTH(P)
3  if demand > limit then
4      dropprob ← (demand − limit) / demand
5      if RAND() < dropprob then
6          DROP(P)
7          return
8  FORWARD(P)

Figure 1: Pseudocode for GRD. Each value r_i corresponds to the current estimate of the rate at limiter i.

3.2 Allocation
Having addressed estimation and communication mechanisms, we now consider how each limiter can combine local measurements with global estimates to determine an appropriate local limit to enforce. A natural approach is to build a global token bucket (GTB) limiter that emulates the fine-grained behavior of a centralized token bucket. Recall that arriving bytes require tokens to be allowed passage; if there are insufficient tokens, the token bucket drops packets. The rate at which the bucket regenerates tokens dictates the traffic limit. In GTB, each limiter maintains its own global estimate and uses reported arrival demands at other limiters to estimate the rate of drain of tokens due to competing traffic.

Specifically, each limiter's token bucket refreshes tokens at the global rate limit, but removes tokens both when bytes arrive locally and to account for expected arrivals at other limiters. At the end of every estimate interval, each limiter computes its local arrival rate and sends this value to other limiters via the communication fabric. Each limiter sums the most recent values it has received for the other limiters and removes tokens from its own bucket at this "global" rate until a new update arrives. As shown in Section 4, however, GTB is highly sensitive to stale observations that continue to remove tokens at an outdated rate, making it impractical to implement at large scale or in lossy networks.

3.2.1 Global random drop
Instead of emulating the precise behavior of a centralized token bucket, we observe that one may instead emulate the higher-order behavior of a central limiter. For example, we can ensure the rate of drops over some period of time is the same as in the centralized case, as opposed to capturing the burstiness of packet drops; in this way, we emulate the rate enforcement of a token bucket but not its burst limiting. Figure 1 presents the pseudocode for a global random drop (GRD) limiter that takes this approach. Like GTB, GRD monitors the aggregate global input demand, but uses it to calculate a packet drop probability. GRD drops packets with a probability proportional to the excess global traffic demand in the previous interval (line 4). Thus, the number of drops is expected to be the same as in a single token bucket; the aggregate forwarding rate should be no greater than the global limit.

GRD somewhat resembles RED queuing in that it increases its drop probability as the input demand exceeds some threshold [14]. Because there are no queues in our limiter, however, GRD requires no tuning parameters of its own (besides the estimator's EWMA and estimate interval length). In contrast to GTB, which attempts to reproduce the packet-level behavior of a centralized limiter, GRD tries to achieve accuracy by reproducing the number of losses over longer periods of time. It does not, however, capture short-term effects. For inherently bursty protocols like TCP, we can improve short-term fairness and responsiveness by exploiting information about the protocol's congestion control behavior.

3.2.2 Flow proportional share
One of the key properties of a centralized token bucket is that it retains inter-flow fairness inherent to transport protocols such as TCP. Given the prevalence of TCP in the Internet, and especially in modern cloud-based services, we design a flow proportional share (FPS) limiter that uses domain-specific knowledge about TCP to emulate a centralized limiter without maintaining detailed packet arrival rates. Each FPS limiter uses a token bucket for rate limiting; thus, each limiter has a local rate limit. Unlike GTB, which renews tokens at the global rate, FPS dynamically adjusts its local rate limit in proportion to a set of weights computed every estimate interval. These weights are based upon the number of live flows at each limiter and serve as a proxy for demand; the weights are then used to enforce max-min fairness between congestion-responsive flows [6].

The primary challenge in FPS is estimating TCP demand. In the previous designs, each rate limiter estimates demand by measuring packets' sizes and the rate at which it receives them; this accurately reflects the byte-level demand of the traffic sources. In contrast, FPS must determine demand in terms of the number of TCP flows present, which is independent of arrival rate. Furthermore, since TCP always attempts to increase its rate, a single flow consuming all of a limiter's rate is nearly indistinguishable from 10 flows doing the same.² However, we would like that a 10-flow aggregate receive 10 times the weight of a single flow.

Our approach to demand estimation in FPS is shown in Figure 2. Flow aggregates are in one of two states. If the aggregate under-utilizes the allotted rate (local limit) at a limiter, then all constituent flows must be bottlenecked. In other words, the flows are all constrained elsewhere. On the other hand, if the aggregate either meets or exceeds the local limit, we say that one or more of the constituent flows is unbottlenecked; for these flows the limiter is the bottleneck. We calculate flow weights with the function FPS-ESTIMATE. If flows were max-min fair, then each unbottlenecked flow would receive approximately the same rate. We therefore count a weight of 1 for every unbottlenecked flow at every limiter. Thus, if all flows were unbottlenecked, then the demand at each limiter is directly proportional to its current flow count. Setting the local weight to this number results in max-min fair allocations. We use the computed weight on line 10 of FPS-ESTIMATE to proportionally set the local rate limit.

² There is a slight difference between these scenarios: larger flow aggregates have smaller demand oscillations when desynchronized [4]. Since TCP is periodic, we considered distinguishing TCP flow aggregates based upon the component frequencies in the aggregate via the FFT. However, we found that the signal produced by TCP demands is not sufficiently stationary.
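Sections 3.1 and 3.2.1 together suggest a compact limiter loop: an EWMA-smoothed local rate estimate, gossiped rate reports from peers, and the GRD drop rule of Figure 1. The Python sketch below is our own rendering under stated assumptions; the class and field names, the `peer_rates` bookkeeping, and the particular EWMA convention are ours, not the paper's implementation.

```python
import random

class GRDLimiter:
    """Sketch of one GRD limiter node (illustrative, not the paper's code)."""

    def __init__(self, global_limit, interval, alpha=0.1):
        self.global_limit = global_limit  # aggregate rate limit (bytes/sec)
        self.interval = interval          # estimate interval (sec)
        self.alpha = alpha                # EWMA weight on the new sample
        self.local_rate = 0.0             # smoothed local arrival-rate estimate
        self.bytecount = 0                # bytes observed in current interval
        self.peer_rates = {}              # limiter id -> last gossiped rate

    def end_interval(self):
        """Fold this interval's arrivals into the EWMA local estimate.

        Returns the value that would be gossiped to randomly chosen peers.
        """
        sample = self.bytecount / self.interval
        self.local_rate = (1 - self.alpha) * self.local_rate + self.alpha * sample
        self.bytecount = 0
        return self.local_rate

    def handle_packet(self, length):
        """Return True to forward the packet, False to drop it (Figure 1)."""
        self.bytecount += length
        demand = self.local_rate + sum(self.peer_rates.values())
        if demand > self.global_limit:
            # Drop probability proportional to the excess global demand.
            drop_prob = (demand - self.global_limit) / demand
            if random.random() < drop_prob:
                return False
        return True
```

With global demand at 1.5x the limit, the drop rule discards roughly one third of arriving packets, so the expected forwarding rate matches the global limit, mirroring the aggregate (but not the burst-level) behavior of a single token bucket.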
FPS-ESTIMATE()1foreachowfinsampleset2ESTIMATE(f)3localdemand ri4iflocaldemandlocallimitthen5maxowrate MAXRATE(sampleset)6idealweight locallimit=maxowrate7else8remoteweights nPj=iwj9idealweight localdemandremoteweights limitlocaldemand10locallimit idealweightlimit remoteweights+idealweight11PROPAGATE(idealweight)FPS-HANDLE-PACKET(P:Packet)1ifRAND()resampleprobthen2addFLOW(P)tosampleset3TOKEN-BUCKET-LIMIT(P) Figure2:PseudocodeforFPS.wicorrespondstotheweightateachlimiterithatrepresentsthenormalizedowcount(asopposedtoratesriasinGRD).AseeminglynaturalapproachtoweightcomputationistocountTCPowsateachlimiter.However,owcountingfailstoaccountforthedemandsofTCPowsthatarebottlenecked:10bottle-neckedowsthatshareamodemdonotexertthesamedemandsuponalimiterasasingleowonanOC-3.Thus,FPSmustcom-putetheequivalentnumberofunbottleneckedTCPowsthatanaggregatedemandrepresents.OurprimaryinsightisthatwecanuseTCPitselftoestimatedemand:inanaggregateofTCPows,eachowwilleventuallyconvergetoitsfair-sharetransmissionrate.Thisapproachleadstotherstoftwooperatingregimes:Localarrivalratelocalratelimit.Whenthereisatleastoneunbottleneckedowatthelimiter,theaggregateinputrateisequalto(orslightlymorethan)thelocalratelimit.Inthiscase,wecomputetheweightbydividingthelocalratelimitbythesendingrateofanunbottleneckedow,asshownonlines5and6ofFPS-ESTIMATE.Intuitively,thisallowsustouseaTCPow'sknowl-edgeofcongestiontodeterminetheamountofcompetingdemand.Inparticular,ifalltheowsattheproviderareunbottlenecked,thisyieldsaowcountwithoutactualowcounting.Thus,tocomputetheweight,alimitermustestimateanunbottle-neckedowrate.Wecanavoidper-owstatebysamplingpacketsatalimiterandmaintainingbytecountersforaconstant-sizeowset.Weassumethattheowwiththemaximumsendingrateisunbottlenecked.However,itispossiblethatoursamplesetwillcontainonlybottleneckedows.Thus,wecontinuouslyresam-pleanddiscardsmallowsfromourset,therebyensuringthatthesamplesetcontainsanunbottleneckedow.Itislikelythatwewillselectanunbottleneckedowinthelongrunfortworeasons.First
,sinceweuniformlysamplepackets,anunbottleneckedowismorelikelytobepickedthanabottleneckedow.Second,asamplesetthatcontainsonlybottleneckedowsresultsintheweightbeingoverestimated,whichincreasesthelocalratelimit,causesunbot-tleneckedowstogrow,andmakesthemmorelikelytobechosensubsequently.Toaccountforbottleneckedows,FPSimplicitlynormalizestheweightbyscalingdownthecontributionofsuchowsproportionaltotheirsendingrates.Abottleneckedowonlycontributesafrac-tionequaltoitssendingratedividedbythatofanunbottleneckedow.Forexample,ifabottleneckedowsendsat10Kbps,andthe fairshareofanunbottleneckedowis20Kbps,thebottleneckedowcountsforhalftheweightofanunbottleneckedow.Localarrivalratelocalratelimit.Whenallowsatthelimiterarebottlenecked,thereisnounbottleneckedowwhoseratecanbeusedtocomputetheweight.Sincetheowaggregateisunabletousealltherateavailableatthelimiter,wecomputeaweightthat,basedoncurrentinformation,setsthelocalratelimittobeequaltothelocaldemand(line9ofFPS-ESTIMATE).Alimitermayoscillatebetweenthetworegimes:enteringthesecondtypicallyreturnsthesystemtotherst,sincetheaggregatemaybecomeunbottleneckedduetothechangeinlocalratelimit.Asaresult,thelocalratelimitisincreasedduringthenextalloca-tion,andthecyclerepeats.Wenotethatthisoscillationisnecessarytoallowbottleneckedowstobecomeunbottleneckedshouldad-ditionalcapacitybecomeavailableelsewhereinthenetwork;liketheestimator,weapplyanEWMAtosmooththisoscillation.WehaveprovedthatFPSisstable—givenstableinputdemands,FPSremainsatthecorrectallocationofweightsamonglimitersonceitarrivesinthatstate.(WeincludetheproofintheAppendix.)Itremainsanopenquestion,however,whetherFPSconvergesunderallconditions,andifso,howquickly.Finally,TCP'sslowstartbehaviorcomplicatesdemandestima-tion.Considerthearrivalofaowatalimiterthathasacurrentratelimitofzero.Withoutbuffering,theow'sSYNwillbelostandtheowcannotestablishitsdemand.Thus,weallowburstingofthetokenbucketwhenthelocalratelimitiszerotoallowaTCPowinslowstarttosendafewpacketsbeforelossesoccur.Whentheallocatordetectsnon
zeroinputdemand,ittreatsthedemandasabottleneckedowfortherstestimateinterval.Asaresult,FPSallocatesratetotheowequivalenttoitsinstantaneousrateduringthebeginningofslowstart,thusallowingittocontinuetogrow.4.EVALUATIONMETHODOLOGYOurnotionofagooddistributedratelimiterisonethataccuratelyreplicatescentralizedlimiterbehavior.Trafcpolicingmecha-nismscanaffectpacketsandowsonseveraltimescales;partic-ularly,wecanaimtoemulatepacket-levelbehaviororow-levelbehavior.However,packet-levelbehaviorisnon-intuitive,sinceapplicationstypicallyoperateattheowlevel.Eveninasinglelimiter,anyonemeasureofpacket-levelbehavioructuatesduetorandomnessinthephysicalsystem,thoughtransport-layerowsmayachievethesamerelativefairnessandthroughput.Thisim-pliesaweaker,buttractablegoaloffunctionallyequivalentbehav-ior.Tothisend,wemeasurelimiterperformanceusingaggregatemetricsoverrealtransport-layerprotocols.4.1MetricsWestudythreemetricstodeterminethedelityoflimiterde-signs:utilization,owfairness,andresponsiveness.Thebasicgoalofadistributedratelimiteristoholdaggregatethroughputacrossalllimitersbelowaspeciedgloballimit.Toestablishdelityweneedtoconsiderutilizationoverdifferenttimescales.Achievablethroughputinthecentralizedcasedependscriticallyonthetrafcmix.Differentowarrivals,durations,roundtriptimes,andproto-colsimplythataggregatethroughputwillvaryonmanytimescales.Forexample,TCP'sburstinesscausesitsinstantaneousthroughputoversmalltimescalestovarygreatly.Alimiter'slong-termbe-haviormayyieldequivalentaggregatethroughput,butmayburstonshorttimescales.Notethat,sinceourlimitersdonotqueuepackets,someshort-termexcessisunavoidabletomaintainlong-termthroughput.Particularly,weaimtoachievefairnessequaltoorbetterthanthatofacentralizedtokenbucketlimiter.Fairnessdescribesthedistributionofrateacrossows.Weem-ployJain'sfairnessindextoquantifythefairnessacrossaowset[20].Theindexconsiderskowswherethethroughputofowiisxi.Thefairnessindexfisbetween0and1,where1iscom-pletelyfair(allowssharebandwidthequally):f=Pki=1xi2 
kPki=1x2iWemustbecarefulwhenusingthismetrictoascertainow-leveldelity.ConsiderasetofidenticalTCPowstraversingasinglelimiter.Betweenruns,thefairnessindexwillshowconsiderablevariation;establishingtheow-levelbehaviorforoneormorelim-itersrequiresustomeasurethedistributionoftheindexacrossmul-tipleexperiments.AdditionalcaremustbetakenwhenmeasuringJain'sindexacrossmultiplelimiters.Thoughtheindexapproaches1asowsreceivetheirfairshare,skewedthroughputdistributionscanyieldseeminglyhighindices.Forexample,consider10owswhere9achievesimilarthroughputwhile1getsnothing;thisre-sultsintheseeminglyhighfairnessindexof0.9.Ifweconsiderthedistributionofowsacrosslimiters—the9owsgothroughonelimiterandthe1owthroughanother—thefairnessindexdoesnotcapturethepoorbehaviorofthealgorithm.Nevertheless,suchametricisnecessarytohelpestablishtheow-levelbehaviorofourlimiters,andthereforeweuseitasastandardmeasureoffairnesswiththeabovecaveat.Wepointoutdiscrepancieswhentheyarise.4.2ImplementationToperformratelimitingonrealowswithoutproxying,weuseuser-spacequeuinginiptablesonLinuxtocapturefullIPpack-etsandpassthemtothedesignatedratelimiterwithoutallowingthemtoproceedthroughkernelpacketprocessing.Eachratelimitereitherdropsthepacketorforwardsitontothedestinationthrougharawsocket.Weusesimilar,butmorerestrictedfunctionalityforVNETrawsocketsinPlanetLabtocaptureandtransmitfullIPpackets.RatelimiterscommunicatewitheachotherviaUDP.Eachgossipmessagesentoverthecommunicationfabriccontainsasequencenumberinadditiontorateupdates;thereceivinglimiterusesthesequencenumbertodetermineifanupdateislost,andifso,compensatesbyscalingthevalueandweightofthenewestup-datebythenumberoflostpackets.Notethatallofourexperimentsratelimittrafcinonedirection;limitersforwardreturningTCPACKsirrespectiveofanyratelimits.4.3EvaluationframeworkWeevaluateourlimitersprimarilyonalocal-areaemulationtestbedusingModelNet[35],whichweuseonlytoemulatelinklatencies.AModelNetemulationtestsreal,deployableproto-typesoverunmodied,commodityoperatingsystemsandnetworks
tacks,whileprovidingalevelofrepeatabilityunachievableinanInternetexperiment.Runningourexperimentsinacontrolleden-vironmenthelpsusgainintuition,ensuresthattransientnetworkcongestion,failures,andunknownintermediatebottlenecklinksdonotconfuseourresults,andallowsdirectcomparisonacrossexper-iments.Weruntheratelimiters,trafcsources,andtrafcsinksonseparateendpointsinourModelNetnetworktopology.Allsource,sink,andratelimitermachinesrunLinux2.6.9.TCPsourcesuseNewRenowithSACKenabled.Weuseasimplemeshtopologytoconnectlimitersandrouteeachsourceandsinkpairthroughasinglelimiter.Thevirtualtopologyconnectsallnodesusing100-Mbpslinks. 0 2500 5000 7500 10000 12500 15000 0 5 10 15 Rate (Kbps)Time (sec) (a)Centralizedtokenbucket. 0 5 10 15 Time (sec) (b)Globaltokenbucket. 0 5 10 15 Time (sec) (c)Globalrandomdrop. 0 5 10 15 Time (sec)Aggregate Departures Limiter 1 Departures Limiter 2 Departures (d)Flowproportionalshare.Figure3:Timeseriesofforwardingrateforacentralizedlimiterandourthreelimitingalgorithmsinthebaselineexperiment—3TCPowstraverselimiter1and7TCPowstraverselimiter2. 0 0.05 0.1 0.15 0.2 0.25 5 10 15 Relative frequencyForwarding rate (Mbps) (a)CTB1sec. 5 10 15 Forwarding rate (Mbps) (b)GTB1sec. 5 10 15 Forwarding rate (Mbps) (c)GRD1sec. 5 10 15 Forwarding rate (Mbps) (d)FPS1sec. 0 0.015 0.03 0.045 0.06 0.075 5 10 15 Relative frequencyForwarding rate (Mbps) (e)CTB0.1sec. 5 10 15 Forwarding rate (Mbps) (f)GTB0.1sec. 5 10 15 Forwarding rate (Mbps) (g)GRD0.1sec. 
(h) FPS, 0.1 sec.

Figure 4: Delivered forwarding rate for the aggregate at different time scales—each row represents one run of the baseline experiment across two limiters with the "instantaneous" forwarding rate computed over the stated time period.

5. EVALUATION

Our evaluation has two goals. The first is to establish the ability of our algorithms to reproduce the behavior of a single limiter in meeting the global limit and delivering flow-level fairness. These experiments use only 2 limiters and a set of homogeneous TCP flows. Next we relax this idealized workload to establish fidelity in more realistic settings. These experiments help achieve our second goal: to determine the effective operating regimes for each design. For each system we consider responsiveness, performance across various traffic compositions, and scaling, and vary the distribution of flows across limiters, flow start times, protocol mix, and traffic characteristics. Finally, as a proof of concept, we deploy our limiters across PlanetLab to control a mock-up of a simple cloud-based service.

5.1 Baseline

The baseline experiment consists of two limiters configured to enforce a 10-Mbps global limit. We load the limiters with 10 unbottlenecked TCP flows; 3 flows arrive at one limiter while 7 arrive at the other. We choose a 3-to-7 flow skew to avoid scenarios that would result in apparent fairness even if the algorithm fails. The reference point is a centralized token-bucket limiter (CTB) servicing all 10 flows. We fix flow and inter-limiter round-trip times (RTTs) to 40 ms, and token bucket depth to 75,000 bytes—slightly greater than the bandwidth-delay product—and, for now, use a loss-free communication fabric. Each experiment lasts 60 seconds (enough time for TCP to stabilize), the estimate interval is 50 ms, and the 1-second EWMA parameter is 0.1; we consider alternative values in the next section.

Figure 3 plots the packet forwarding rate at each limiter as well as the achieved throughput of the flow aggregate. In all cases, the aggregate utilization is approximately 10 Mbps. We look at smaller time scales to determine the extent to which the limit is enforced. Figure 4 shows histograms of delivered "instantaneous" forwarding rates computed over two different time periods, thus showing whether a limiter is bursty or consistent in its limiting. All designs deliver the global limit over 1-second intervals; both GTB and GRD, however, are bursty in the short term. By contrast, FPS closely matches the rates of CTB at both time scales. We believe this is because FPS uses a token bucket to enforce local limits. It appears that when enforcing the same aggregate limit, the forwarding rate of multiple token buckets approximates that of a single token bucket even at short time scales.

Returning to Figure 3, the aggregate forwarding rate should be apportioned between limiters in about a 3-to-7 split. GTB clearly fails to deliver in this regard, but both GRD and FPS appear approximately correct upon visual inspection. We use Jain's fairness index to quantify the fairness of the allocation. For each run of an experiment, we compute one fairness value across all flows, irrespective of the limiter at which they arrive. Repeating this experiment 10 times yields a distribution of fairness values. We use quantile-quantile plots to compare the fairness distribution of each

(a) Central token bucket  (b) Global random drop with 500-ms estimate interval  (c) Global random drop with 50-ms estimate interval
(d) Flow proportional share with 500-ms estimate interval

Figure 6: Time series of forwarding rate for a flow join experiment. Every 10 seconds, a flow joins at an unused limiter.

limiter to the centralized token bucket (CTB). Recall that an important benchmark of our designs is their ability to reproduce a distribution of flow fairness at least as good as that of CTB. If they do, their points will closely follow the x = y line; points below the line are less fair, indicating poor limiter behavior, and points above the line indicate that the rate limiting algorithm produced better fairness than CTB.

Figure 5 compares distributions for all algorithms in our baseline experiment. GTB has fairness values around 0.7, which corresponds to the 7-flow aggregate unfairly dominating the 3-flow aggregate. This behavior is clearly visible in Figure 3(b), where the 7-flow limiter receives almost all the bandwidth. GRD and FPS, on the other hand, exhibit distributions that are at or above that of CTB. GRD, in fact, has a fairness index close to 1.0—much better than CTB. We verify this counter-intuitive result by comparing the performance of CTB with that of a single GRD limiter (labeled "Central Random Drop" in the figure). It is not surprising, then, that FPS is less fair than GRD, since it uses a token bucket at each limiter to enforce the local rate limit.³

Additionally, with homogeneous flows across a wide range of parameters—estimate intervals from 10 ms to 500 ms and EWMA from 0 to 0.75—we find that GTB and GRD are sensitive to estimate intervals, as they attempt to track packet-level behaviors (we omit the details for space). In general, GTB exhibits poor fairness for almost all choices of EWMA and estimate interval, and performs well only when the estimate interval is small and the EWMA is set to 0 (no filter). We conjecture that GTB needs to sample the short-term behavior of TCP in congestion avoidance, since considering solely aggregate demand over long time intervals fails to capture the increased aggressiveness of a larger flow aggregate.

³In future work, we plan to experiment with a local GRD-like random drop mechanism instead of a token bucket in FPS; this will improve the fairness of FPS in many scenarios.
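The fairness metric used throughout this section is Jain's index, which for rates x_1, ..., x_n equals (Σx_i)² / (n·Σx_i²): it is 1.0 when all flows receive equal rates and approaches 1/n as one flow dominates. The following minimal sketch computes it; the skewed rate vector is purely illustrative (it is not a measurement from the paper), chosen to mimic a 7-flow aggregate dominating a 3-flow aggregate:

```python
def jain_fairness(rates):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 when all rates are equal; lower values indicate skew.
    """
    n = len(rates)
    total = sum(rates)
    return total * total / (n * sum(r * r for r in rates))

# Perfectly fair split of 10 Mbps among 10 flows.
fair = [1.0] * 10

# Hypothetical skewed outcome: 7 flows at one limiter crowd out
# the 3 flows at the other (illustrative numbers only).
skewed = [0.3] * 3 + [1.3] * 7

print(jain_fairness(fair))              # 1.0
print(round(jain_fairness(skewed), 2))
```

Repeating an experiment yields a distribution of such index values, which is what the quantile-quantile plots in Figure 5 compare against the single-token-bucket reference.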
Figure 5: Quantile-quantile plots of a single token bucket vs. distributed limiter implementations. For each point (x, y), x represents a quantile value for fairness with a single token bucket and y represents the same quantile value for fairness for the limiter algorithm.

We verified that GTB provides better fairness if we lengthen TCP's periodic behavior by growing its RTT. Since all results show that GTB fails with anything but the smallest estimate interval, we do not consider it further.

GRD is sensitive to the estimate interval, but in terms of short-term utilization, not flow fairness, since it maintains the same drop probability until it receives new updates. Thus, it occasionally drops at a higher-than-desired rate, causing congestion-responsive flows to back off significantly. While its long-term fairness remains high even for 500-ms estimate intervals, short-term utilization becomes exceedingly poor. By contrast, for homogeneous flows, FPS appears insensitive to the estimate interval, since flow-level demand is constant. Both GRD and FPS require an EWMA to smooth input demand to avoid over-reacting to short-term burstiness.⁴

5.2 Flow dynamics

We now investigate responsiveness (time to convergence and stability) by observing the system as flows arrive and depart. We sequentially add flows to a system of 5 limiters and observe the convergence to fair share of each flow. Figure 6(a) shows the reference time-series behavior for a centralized token bucket. Note that even through a single token bucket, the system is not completely fair or stable as flows arrive or depart due to TCP's burstiness. With a 500-ms estimate interval, GRD (Figure 6(b)) fails to capture the behavior of the central token bucket. Only with an order-of-magnitude smaller estimate interval (Figure 6(c)) is GRD able to approximate the central token bucket, albeit with increased fairness. FPS (Figure 6(d)), on the other hand, exhibits the least amount of variation in forwarded rate even with a 500-ms estimate interval, since flow-level demand is sufficiently constant over half-second intervals. This experiment illustrates that the behavior GRD must observe occurs at a packet-level time scale: large estimate intervals cause GRD to lose track of the global demand, resulting in chaotic packet drops. FPS, on the other hand, only requires updates as flows arrive, depart, or change their behavior.

⁴Though neither is particularly sensitive to EWMA, we empirically determined that a reasonable setting of the 1-second EWMA is 0.1. We use this value unless otherwise noted.

                           CTB       GRD       FPS
Goodput (bulk mean)     6900.90   7257.87   6989.76
  (std. dev.)            125.45     75.87    219.55
Goodput (web mean)      1796.06   1974.35   2090.25
  (std. dev.)            104.32     93.90     57.98
Web rate (h-mean)
  [0, 5000)                28.17     25.84     25.71
  [5000, 50000)           276.18    342.96    335.80
  [50000, 500000)         472.09    612.08    571.40
  [500000, ∞)             695.40    751.98    765.26
Fairness (bulk mean)       0.971     0.997     0.962

Table 1: Goodput and delivered rates (Kbps), and fairness for bulk flows over 10 runs of the Web flow experiment. We use mean values for goodput across experiments and use the harmonic mean of rates (Kbps) delivered to Web flows of size (in bytes) within the specified ranges.

5.3 Traffic distributions

While TCP dominates cloud-based service traffic, the flows themselves are far from regular in their size, distribution, and duration. Here we evaluate the effects of varying traffic demands by considering Web requests that contend with long-running TCP flows for limiter bandwidth. To see whether our rate limiting algorithms can detect and react to Web-service demand, we assign 10 long-lived (bulk) flows to one limiter and the service requests to the other; this represents the effective worst case for DRL since short and long flows cannot exert ordinary congestive pressures upon each other when isolated. We are interested in the ability of both traffic pools to attain the correct aggregate utilization, the long-term fairness of the stable flows, and the service rates for the Web flows.

Since we do not have access to traffic traces from deployed cloud-based services, we use a prior technique to derive a distribution of Web object sizes from a CAIDA Web trace for a high-speed OC-48 MFN (Metropolitan Fiber Network) Backbone 1 link (San Jose to Seattle) that follows a heavy-tailed distribution [36]. We fetch objects in parallel from an Apache Web server using httperf via a limiter. We distribute requests uniformly over objects in the trace distribution. Requests arrive according to a Poisson process with an average of 15.

Table 1 gives the delivered rates for the Web flows of different sizes and the delivered rates for the 10-flow aggregates in each scenario across 10 runs. This shows that the 10-flow aggregate achieved a comparable allocation in each scenario. When seen in conjunction with the Web download service rates, it also indicates that the Web traffic aggregate in the other limiter received the correct allocation. Considering the Web flow service rates alone, we see that both GRD and FPS exhibit service rates close to that of a single token bucket, even for flows of significantly different sizes. The fairness index of the long-lived flows once again shows that GRD exhibits higher fairness than either CTB or FPS. FPS does not benefit from the fact that it samples flow-level behavior, which, in this context, is no more stable than the packet-level behavior observed by GRD.

5.4 Bottlenecked TCP flows

So far, the limiters represent the bottleneck link for each TCP flow. Here we demonstrate the ability of FPS to correctly allocate rate across aggregates of bottlenecked and unbottlenecked flows. The experiment in Figure 7 begins as our baseline 3-to-7 flow skew experiment where 2 limiters enforce a 10-Mbps limit. Around 15 seconds, the 7-flow aggregate experiences an upstream 2-Mbps bottleneck, and FPS quickly re-apportions the remaining 8 Mbps of rate across the 3 flows at limiter 1. Then, at time 31, a single unbottlenecked flow arrives at limiter 2. FPS realizes that an unbottlenecked flow exists at limiter 2, and increases the allocation for the (7+1)-flow aggregate. In a single pipe, the 4 unbottlenecked flows would now share the remaining 8 Mbps. Thus, limiter 2 should get 40% of the global limit: 2 Mbps from the 7 bottlenecked flows, and 2 Mbps from the single unbottlenecked flow. By time 39, FPS apportions the rate in this ratio.

Figure 7: FPS rate limiting correctly adjusting to the arrival of bottlenecked flows.

                     CTB     GRD     FPS
Aggregate (Mbps)   10.57   10.63   10.43
Short RTT (Mbps)    1.41    1.35    0.92
  (std. dev.)       0.16    0.71    0.15
Long RTT (Mbps)     0.10    0.16    0.57
  (std. dev.)       0.01    0.03    0.05

Table 2: Average throughput for 7 short (10-ms RTT) flows and 3 long (100-ms RTT) flows distributed across 2 limiters.

5.5 Mixed TCP flow round-trip times

TCP is known to be unfair to long-RTT flows. In particular, short-RTT flows tend to dominate flows with longer RTTs when competing at the same bottleneck, as their tighter control loops allow them to more quickly increase their transmission rates. FPS, on the other hand, makes no attempt to model this bias. Thus, when the distribution of flow RTTs across limiters is highly skewed, one might be concerned that limiters with short-RTT flows would artificially throttle them to the rate achieved by longer-RTT flows at other limiters. We conduct a slight variant of the baseline experiment, with two limiters and a 3-to-7 flow split. In this instance, however, all 7 flows traversing limiter 2 are "short" (10-ms RTT), and the 3 flows traversing limiter 1 are "long" (100-ms RTT), representing a worst-case scenario. Table 2 shows the aggregate delivered throughput, as well as the average throughput for short and long-RTT flows for the different allocators. As expected, FPS provides a higher degree of fairness between RTTs, but all three limiters deliver equivalent aggregate rates.

5.6 Scaling

We explore scalability along two primary dimensions: the number of flows, and the number of limiters. First we consider a 2-limiter setup similar to the baseline experiment, but with a global rate limit of 50 Mbps. We send 5000 flows to the two limiters in a 3-to-7 ratio: 1500 flows to the first and 3500 to the second. GRD and FPS produce utilization of 53 and 46 Mbps and flow fairness of 0.44 and 0.33 respectively. This is roughly equal to that of a single token bucket with 5000 flows (which yielded 51 Mbps and 0.34). This poor fairness is not surprising, as each flow has only 10 Kbps, and prior work has shown that TCP is unfair under such conditions [28]. Nevertheless, our limiters continue to perform well with many flows.

Next, we investigate rate limiting with a large number of limiters and different inter-limiter communication budgets, in an environment in which gossip updates can be lost. We consider a topology with up to 490 limiters; our testbed contains 7 physical machines with 70 limiters each. Flows travel from the source through different limiter nodes, which then forward the traffic to the sink. (We consider TCP flows here and use symmetric paths for the forward and reverse directions of a flow.) We set the global rate limit to 50 Mbps and the inter-limiter and source-sink RTTs to 40 ms. Our experiment setup has the number of flows arriving at each limiter chosen uniformly at random from 0 to 5. For experiments with the same number of limiters, the distribution and number of flows is the same. We start 1 random flow from the above distribution every 100 ms; each flow lives for 60 seconds. To explore the effect of communication budget, we vary the branching factor (br) of the gossip protocol from 1 to 7; for a given value, each extra limiter incurs a fixed communication cost.

Figure 8 shows the behavior of FPS in this scaling experiment. At its extreme there are 1249 flows traversing 490 limiters. (We stop at 490 not due to a limitation of FPS, but due to a lack of testbed resources.) When br = 3, each extra limiter consumes 48 · 20 · 3 = 2.88 Kbps. Thus, at 490 limiters, the entire system consumes a total of 1.4 Mbps of bandwidth for control communication—less than 3% overhead relative to the global limit. We find that beyond a branching factor of 3, there is little benefit either in fairness or utilization. Indeed, extremely high branching factors lead to message and ultimately information loss. Beyond 50 limiters, GRD fails to limit the aggregate rate (not shown), but this is not assuaged by an increasing communication budget (increasing br). Instead, it indicates GRD's dependence on swiftly converging global arrival rate estimates. In contrast, FPS, because it depends on more slowly moving estimates of the number of flows at each limiter, maintains the limit even at 490 limiters.

This experiment shows that limiters rely upon up-to-date summaries of global information, and these summaries may become stale when delayed or dropped by the network. In particular, our concern lies with stale under-estimates that cause the system to overshoot the global rate; a completely disconnected system—due to either congestion, failure, or attack—could over-subscribe the global limit by a factor of N. We can avoid these scenarios by initializing limiters with the number of peers, N, and running a lightweight membership protocol [24] to monitor the current number of connected peers. For each disconnected peer, the limiter can reduce the global limit by 1/N, and set each stale estimate to zero. This conservative policy drives the limiters toward a 1/N limiter (where each limiter enforces an equal fraction of the global aggregate) as disconnections occur. More generally, though, we defer analysis of DRL under adversarial or Byzantine conditions to future work.

(a) FPS fairness.
(b) FPS delivered rate.

Figure 8: Fairness and delivered rate vs. number of limiters in the scaling experiment.

5.7 Limiting cloud-based services

Finally, we subject FPS to inter-limiter delays, losses, and TCP arrivals and flow lengths similar to those experienced by a cloud-based service. As in Section 5.3, we are not concerned with the actual service being provided by the cloud or its computational load—we are only interested in its traffic demands. Hence, we emulate a cloud-based service by using generic Web requests as a stand-in for actual service calls. We co-locate distributed rate limiters with 10 PlanetLab nodes distributed across North America configured to act as component servers. Without loss of generality, we focus on limiting only outbound traffic from the servers; we could just as easily limit inbound traffic as well, but that would complicate our experimental infrastructure. Each PlanetLab node runs Apache and serves a series of Web objects; an off-test-bed client machine generates requests for these objects using wget. The rate limiters enforce an aggregate global rate limit of 5 Mbps on the response traffic using a 100-ms estimate interval and a gossip branching factor of 4, resulting in a total control bandwidth of 38.4 Kbps. The inter-limiter control traffic experienced 0.47% loss during the course of the experiment.

Figure 9 shows the resulting time-series plot. Initially each content server has demands to serve 3 requests simultaneously for 30 seconds, and then the total system load shifts to focus on only 4 servers for 30 seconds, emulating a change in the service's request load, perhaps due to a phase transition in the service, or a flash crowd of user demand. Figure 9(a) shows the base case, where a static 1/N limiting policy cannot take advantage of unused capacity at the other 6 sites. In contrast, FPS, while occasionally bursting above the limit, accommodates the demand swing and delivers the full rate limit.

6. RELATED WORK

The problem of online, distributed resource allocation is not a new one, but to our knowledge we are the first to present a concrete realization of distributed traffic rate limiting. While there has been considerable work to determine the optimal allocation of bandwidth between end-point pairs in virtual private networks (VPNs), the goal is fundamentally different. In the VPN hose model [23], the challenge is to meet various quality-of-service guarantees by provisioning links in the network to support any traffic distribution that does not exceed the bandwidth guarantees at each endpoint in the VPN. Conversely, the distributed rate limiting problem is to control the aggregate bandwidth utilization at all limiters in the network, regardless of the available capacity at the ingress or egress points.

Distributed rate limiting can be viewed as a continuous form of distributed admission control. Distributed admission control allows participants to test for and acquire capacity across a set of network paths [21, 40]; each edge router performs flow-admission tests to ensure that no shared hop is over-committed. While our limiters similarly "admit" traffic until the virtual limiter has reached capacity, they do so in an instantaneous, reservation-free fashion.

Ensuring fairness across limiters can be viewed as a distributed instance of the link-sharing problem [15]. A number of packet scheduling techniques have been developed to enforce link-sharing policies, which provide bandwidth guarantees for different classes of traffic sharing a single link. These techniques, such as weighted fair queuing [11], apportion link capacity among traffic classes according to some fixed weights. These approaches differ from ours in two key respects. First, by approximating generalized processor sharing [31], they allocate excess bandwidth across backlogged classes in a max-min fair manner; we avoid enforcing any explicit type of fairness between limiters, though FPS tries to ensure max-min fairness between flows. Second, most class-based fair-queuing schemes aim to provide isolation between packets of different classes. In contrast, we expose traffic at each limiter to all other traffic in the system, preserving whatever implicit notion of fairness would have existed in the single-limiter case.

As discussed in Section 3, we use a token bucket to define the reference behavior of a single limiter. There is a broad range of active queue management schemes that could serve equally well as a centralized reference [13, 14]. Determining whether similar distributed versions of these sophisticated AQM schemes exist is a subject of future work.

The general problem of using and efficiently computing aggregates across a distributed set of nodes has been studied in a number of other contexts. These include distributed monitoring [12], triggering [19], counting [33, 38], and data stream querying [5, 26]. Two systems in particular also estimate aggregate demand to apportion shared resources at multiple points in a network. The first is a token-based admission architecture that considers the problem of parallel flow admissions across edge routers [7]. Their goal is to divide the total capacity fairly across allocations at edge routers by setting an edge router's local allocation quota in proportion to its share of the request load. However, they must revert to a first-come first-served allocation model if ever forced to "revoke" bandwidth to maintain the right shares. Zhao et al. use a similar protocol to enforce service level agreements between server clusters [41]. A

(a) 1/N limiting.
(b) Flow Proportional Share

Figure 9: A time-series graph of rate limiting at 10 PlanetLab sites across North America. Each site is a Web server, fronted by a rate limiter. Every 30 seconds total demand shifts to four servers and then back to all 10 nodes. The top line represents aggregate throughput; other lines represent the served rates at each limiter.

set of layer-7 switches employs a "coordinated" request queuing algorithm to distribute service requests in proportion to the aggregate sum of switch queue lengths.

7. CONCLUSION

As cloud-based services transition from marketing vaporware to real, deployed systems, the demands on traditional Web-hosting and Internet service providers are likely to shift dramatically. In particular, current models of resource provisioning and accounting lack the flexibility to effectively support the dynamic composition and rapidly shifting load enabled by the software-as-a-service paradigm. We have identified one key aspect of this problem, namely the need to rate limit network traffic in a distributed fashion, and provided two novel algorithms to address this pressing need.

Our experiments show that naive implementations based on packet arrival information are unable to deliver adequate levels of fairness and, furthermore, are unable to cope with the latency and loss present in the wide area. We presented the design and implementation of two limiters: a protocol-agnostic global random drop algorithm, and a flow proportional share algorithm, appropriate for deployment in TCP-based Web-services environments, that is robust to long delays and lossy inter-limiter communication. By translating local arrival rate into a flow weight, FPS communicates a unit of demand that is inherently more stable than packet arrivals. Thus, it is possible for the local arrival rates to fluctuate, but for the flow weight to remain unchanged.

Our results demonstrate that it is possible to recreate, at distributed points in the network, the flow behavior that end users and network operators expect from a single centralized rate limiter. Moreover, it is possible to leverage knowledge of TCP's congestion avoidance algorithm to do so using little bandwidth, hundreds of limiters, thousands of flows, and realistic Internet delays and losses. While our experience with GRD indicates it may be difficult to develop a robust protocol-agnostic limiter, it is likely that UDP-based protocols deployed in a cloud will have their own congestion-control algorithms. Hence, FPS could be extended to calculate a flow weight for these as well.

Acknowledgements

We are grateful to Alberto Blanc, Jay Chen, Yu-Chung Cheng, Stefan Savage, Scott Shenker, Amin Vahdat, George Varghese, and Geoff Voelker for their valuable comments. This work is supported by the National Science Foundation through grant CNS-627167.

8. REFERENCES

[1] Packeteer. http://www.packeteer.com.
[2] Akamai Technologies. Personal communication, June 2007.
[3] Amazon. Elastic compute cloud. http://aws.amazon.com/ec2.
[4] G. Appenzeller, I. Keslassy, and N. McKeown. Sizing router buffers. In Proceedings of ACM SIGCOMM, 2004.
[5] B. Babcock and C. Olston. Distributed top-k monitoring. In Proceedings of ACM SIGMOD, 2003.
[6] D. Bertsekas and R. Gallager. Data Networks. Prentice Hall, 1987.
[7] S. Bhatnagar and B. Nath. Distributed admission control to support guaranteed services in core-stateless networks. In Proceedings of IEEE INFOCOM, 2003.
[8] D. F. Carr. How Google works. Baseline Magazine, July 2006.
[9] G. Carraro and F. Chong. Software as a service (SaaS): An enterprise perspective. MSDN Solution Architecture Center, Oct. 2006.
[10] A. Demers, D. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart, and D. Terry. Epidemic algorithms for replicated database maintenance. In Proceedings of ACM PODC, 1987.
[11] A. Demers, S. Keshav, and S. Shenker. Analysis and simulation of a fair queueing algorithm. In Proceedings of ACM SIGCOMM, 1989.
[12] M. Dilman and D. Raz. Efficient reactive monitoring. In Proceedings of IEEE INFOCOM, 2001.
[13] W. Feng, K. Shin, D. Kandlur, and D. Saha. The BLUE active queue management algorithms. IEEE/ACM Transactions on Networking, 10(4), 2002.
[14] S. Floyd and V. Jacobson. Random early detection gateways for congestion avoidance. IEEE/ACM Transactions on Networking, 1(4), 1993.
[15] S. Floyd and V. Jacobson. Link-sharing and resource management models for packet networks. IEEE/ACM Transactions on Networking, 3(4), 1995.
[16] I. Gupta, A.-M. Kermarrec, and A. J. Ganesh. Efficient epidemic-style protocols for reliable and scalable multicast. In Proceedings of IEEE SRDS, 2002.
[17] D. Hinchcliffe. 2007: The year enterprises open their SOAs to the Internet? Enterprise Web 2.0, Jan. 2007.
[18] M. Huang. PlanetLab bandwidth limits. http://www.planet-lab.org/doc/BandwidthLimits.php.
[19] A. Jain, J. M. Hellerstein, S. Ratnasamy, and D. Wetherall. A wakeup call for Internet monitoring systems: The case for distributed triggers. In Proceedings of HotNets-III, 2004.
[20] R. Jain, D. M. Chiu, and W. Hawe. A quantitative measure of fairness and discrimination for resource allocation in shared computer systems. Technical report, DEC Research Report TR-301, 1984.
[21] S. Jamin, P. Danzig, S. Shenker, and L. Zhang. A measurement-based admission control algorithm for integrated services packet networks. In Proceedings of ACM SIGCOMM, 1995.
[22] D. Kempe, A. Dobra, and J. Gehrke. Gossip-based computation of aggregate information. In Proceedings of IEEE FOCS, 2003.
[23] A. Kumar, R. Rastogi, A. Silberschatz, and B. Yener. Algorithms for provisioning virtual private networks in the hose model. IEEE/ACM Transactions on Networking, 10(4), 2002.
[24] J. Liang, S. Y. Ko, I. Gupta, and K. Nahrstedt. MON: On-demand overlays for distributed system management. In Proceedings of USENIX WORLDS, 2005.
[25] J. Ma, K. Levchenko, C. Kreibich, S. Savage, and G. M. Voelker. Automated protocol inference: Unexpected means of identifying protocols. In Proceedings of ACM/USENIX IMC, 2006.
[26] A. Manjhi, V. Shkapenyuk, K. Dhamdhere, and C. Olston. Finding (recently) frequent items in distributed data streams. In Proceedings of IEEE ICDE, 2005.
[27] P. Marks. 'Mashup' websites are a hacker's dream come true. New Scientist magazine, 2551:28, May 2006.
[28] R. Morris. TCP behavior with many flows. In Proceedings of IEEE ICNP, 1997.
[29] J. Musser. Programmable Web. http://www.programmableweb.com.
[30] A. M. Odlyzko. Internet pricing and the history of communications. Computer Networks, 36:493–517, 2001.
[31] A. K. Parekh and R. G. Gallager. A generalized processor sharing approach to flow control in integrated services networks: the single-node case. IEEE/ACM Transactions on Networking, 1(3), 1993.
[32] B. Raghavan and A. C. Snoeren. A system for authenticated policy-compliant routing. In Proceedings of ACM SIGCOMM, 2004.
[33] N. Shavit and A. Zemach. Diffracting trees. ACM Transactions on Computer Systems, 14(4), 1996.
[34] I. Stoica, S. Shenker, and H. Zhang. Core-stateless fair queueing: a scalable architecture to approximate fair bandwidth allocations in high speed networks. In Proceedings of ACM SIGCOMM, 1998.
[35] A. Vahdat, K. Yocum, K. Walsh, P. Mahadevan, D. Kostić, J. Chase, and D. Becker. Scalability and accuracy in a large-scale network emulator. In Proceedings of USENIX OSDI, 2002.
[36] K. Vishwanath and A. Vahdat. Realistic and responsive network traffic generation. In Proceedings of ACM SIGCOMM, 2006.
[37] L. Wang, K. Park, R. Pang, V. S. Pai, and L. Peterson. Reliability and security in the CoDeeN content distribution network. In Proceedings of USENIX, 2004.
[38] R. Wattenhofer and P. Widmayer. An inherent bottleneck in distributed counting. In Proceedings of ACM PODC, 1997.
[39] X. Yang, D. Wetherall, and T. Anderson. A DoS-limiting network architecture. In Proceedings of ACM SIGCOMM, 2005.
[40] Z.-L. Zhang, Z. Duan, and Y. T. Hou. On scalable design of bandwidth brokers. IEICE Transactions on Communications, E84-B(8), 2001.
[41] T. Zhao and V. Karamcheti. Enforcing resource sharing agreements among distributed server clusters. In Proceedings of IEEE IPDPS, 2002.

APPENDIX

Here we show that FPS correctly stabilizes to the "correct" allocations at all limiters in the presence of both unbottlenecked and bottlenecked flows. First, we present a model of TCP estimation over n limiters. Let a_1, a_2, ..., a_n be the number of unbottlenecked flows at limiters 1 to n respectively. Similarly, let B_1, B_2, ..., B_n be the local bottlenecked flow rates (which may include multiple flows). At the i-th limiter, there exists a local rate limit, l_i. These limits are subject to the constraint that l_1 + l_2 + ... + l_n = L, where L is the global rate limit. Let U = L − Σ_i B_i represent the total amount of rate available for unbottlenecked flows. Let A = Σ_i a_i represent the total number of unbottlenecked flows across all limiters. Given these values, a
TCP estimator outputs a tuple of weights (w_1, w_2, ..., w_n) that are used by FPS to assign rate limits at all limiters.

Suppose we are given perfect global knowledge and are tasked to compute the correct allocations at all limiters. The allocation would be

    I = (U·a_1/A + B_1, U·a_2/A + B_2, ..., U·a_n/A + B_n).

Note that these weights are also equal to the actual rate limits assigned at each node. This corresponds to an allocation which would result for each limiter's flow aggregate had all flows (globally) been forced through a single pipe of capacity L.

FPS first estimates the rate of a single unbottlenecked flow at each limiter. Once stabilized, such a flow at limiter i will receive a rate f (where l_i is the current rate limit at limiter i):

    f = (l_i − B_i) / a_i.

Given these flow rates, FPS will compute a new weight w_i at each limiter:

    w_i = l_i · a_i / (l_i − B_i).

Once FPS arrives at the ideal allocation, it will remain at the ideal allocation in the absence of any demand changes. That is, suppose

    (l_1, ..., l_n) = (I_1, ..., I_n).

We claim that the newly computed weights (w_1, ..., w_n) result in the same allocation; equivalently,

    w_1 / (w_1 + ... + w_n) = I_1 / (I_1 + ... + I_n).

The weights computed given this starting state are, for each i,

    w_i = (U·a_i/A + B_i) · a_i / ((U·a_i/A + B_i) − B_i) = (U·a_i + A·B_i) / U.

Thus, considering the allocation at limiter 1,

    w_1 / (w_1 + ... + w_n) = ((U·a_1 + A·B_1)/U) / (((U·a_1 + A·B_1)/U) + ... + ((U·a_n + A·B_n)/U)),

which is equal to

    (U·a_1 + A·B_1) / ((U·a_1 + A·B_1) + ... + (U·a_n + A·B_n)) = I_1 / (I_1 + ... + I_n),

the ideal allocation fraction for limiter 1. The allocations at other limiters are analogous.
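The fixed-point property argued above is straightforward to check numerically. The sketch below uses arbitrary made-up values for the per-limiter flow counts a_i, bottlenecked rates B_i, and global limit L (they are illustrative, not taken from the paper's experiments): it computes the ideal allocation I, applies one round of the FPS weight update w_i = l_i·a_i/(l_i − B_i), and verifies that apportioning L in proportion to the resulting weights reproduces I.

```python
def fps_weights(limits, a, B):
    """One FPS weight update: w_i = l_i * a_i / (l_i - B_i)."""
    return [l * ai / (l - bi) for l, ai, bi in zip(limits, a, B)]

def allocate(weights, L):
    """Apportion the global limit L in proportion to the weights."""
    total = sum(weights)
    return [L * w / total for w in weights]

# Arbitrary example: 3 limiters, 10-Mbps global limit.
L = 10.0
a = [3, 7, 2]        # unbottlenecked flows at each limiter (hypothetical)
B = [0.0, 2.0, 1.0]  # bottlenecked flow rates at each limiter (hypothetical)

U = L - sum(B)  # rate available to unbottlenecked flows
A = sum(a)      # total unbottlenecked flows

# Ideal allocation: a fair share of U for each unbottlenecked flow,
# plus the limiter's bottlenecked rate.
ideal = [U * ai / A + bi for ai, bi in zip(a, B)]

# Starting from the ideal allocation, one FPS update leaves it unchanged.
next_alloc = allocate(fps_weights(ideal, a, B), L)
print(all(abs(x - y) < 1e-9 for x, y in zip(ideal, next_alloc)))  # True
```

With these numbers U = 7 and A = 12, so the ideal allocation is (1.75, 6.083..., 2.166...), and the update returns exactly that tuple, matching the algebraic argument that the ideal allocation is a fixed point of the FPS update.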