Opus: an Overlay Peer Utility Service

Rebecca Braynard, Dejan Kostić, Adolfo Rodriguez, Jeff Chase, and Amin Vahdat
Department of Computer Science, Duke University
{rebecca, dkostic, razor, chase, vahdat}@cs.duke.edu

This research is supported in part by the National Science Foundation (EIA-9972879, ITR-0082912), Hewlett-Packard, IBM, Intel, and Microsoft. Braynard is supported by an NSF graduate fellowship and Vahdat is also supported by an NSF CAREER award (CCR-9984328).

Today, an increasing number of important network services, […]

[…] The extreme stance of associating code with every packet that travels across the network has not seen wide deployment […]

QoS Guarantees: Finally, more traditional unicast applications are also using distributed network resources to achieve better performance and reliability than would be delivered by the underlying IP network. One initial study [31] used network traces to determine that, in many cases, default IP routing results in paths with inferior reliability and performance relative to indirect routing through one of a set of intermediate nodes. Recent work on Resilient Overlay Networks [3] quantifies such improvements in a 13-node testbed running across the Internet. Another recent study [32] advocates using multiple intermediate points in an overlay to redundantly transmit the same data from source to destination, reducing end-to-end loss rates and performance variability.

Today, all of these services must redundantly acquire and administer nodes across the Internet to provide the requisite functionality. This approach forces services to reimplement substantially similar functionality, such as tracking changing network characteristics, building appropriate topologies, failure detection, or IP topology matching. Further, given the highly bursty nature of Internet traffic and service access patterns, individual services must overprovision for peak levels of demand that are often a factor of 3-10 higher than the average case.

We believe that a common system infrastructure to support the requirements of a broad range of applications will improve the performance of existing applications by exporting common best practices and will also effect a qualitative shift in the ease with which novel distributed applications can be deployed. Thus, we propose Opus, an Overlay Peer Utility Service, to automatically configure server network overlays with the goal of dynamically meeting the performance and availability requirements of a broad range of competing applications. By observing changing network conditions and application access patterns, Opus will: i) allocate a portion of global resources to each application, ii) place these replicas at appropriate points in the network, and iii) create overlays that satisfy the requirements of individual distributed applications. Key to our approach is building scalable structures to track changing system characteristics and developing a common abstraction for prioritizing among competing applications.

The rest of this paper is organized as follows. Section 2 describes the Opus system architecture. Next, Section 3 presents individual challenges we are addressing to realize this model and some initial results. Section 4 compares our work to related efforts and Section 5 presents our conclusions.

2 Architecture

The Opus service allocates overlays of server and network resources from a shared pool, as needed to meet service quality goals efficiently in the face of dynamically changing global characteristics. Service workloads are streams of requests originating from clients throughout the network, and requiring varying amounts of computation, shared data access, and data transfer to and from each client. We assume that service applications show stable average per-request behavior; loads are defined by offered request rates that may vary continuously through time.

Figure 1 depicts the high-level Opus system model. We envision a collection of server sites (e.g., small clusters or data centers) colocated with switching centers at the interior of the Internet “cloud.” Opus manages these Points-of-Presence (PoPs) in a coordinated fashion as a shared physical infrastructure for distributed Internet applications. Applications consist of components running on selected Opus nodes. Opus configures these nodes to run application software and organizes them as an application-specific overlay network.

Opus resource allocators cooperate to assign resources to each overlay application. These resources include “slices” of the server and network capacity on some subset of the Opus PoPs, together with an overlay topology configured for that application. (The pie charts in Figure 1 represent per-region application demand levels that ideally correspond to resource allocation levels in nearby Opus PoPs.) Our approach is to describe applications abstractly in terms of their service quality goals, then generate candidate allotments and overlay topologies that balance service quality with network performance and cost. A request routing infrastructure directs external traffic (e.g., client requests) destined for each application service to selected nodes assigned to that application. Request routing may leverage DNS redirection [10], anycast [5, 21, 38], or an Opus naming interface.

The service overlay. Each Opus PoP runs an instance of the Opus site manager, which coordinates resource usage at that site and exchanges status summaries with other Opus sites. Opus uses its own overlay services internally to disseminate status information and related metadata through a service overlay that interconnects all active nodes. The service overlay forms the “backbone” for coordinated, decentralized resource allocation and resource control, as described below. Thus the service overlay must be dynamic and self-healing: if a network path is lost or degraded, then the service overlay must reconfigure to reroute traffic through a different path.
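The self-healing requirement on the service overlay can be illustrated with a small sketch. This is only a centralized toy, assuming a hypothetical four-PoP overlay with made-up link delays and hypothetical helper names (`shortest_path`, `drop_link`); Opus itself is described as decentralized, and the text does not specify how rerouting is computed.

```python
import heapq

def shortest_path(links, src, dst):
    """Dijkstra over an overlay given as {node: {neighbor: delay}}.
    Returns (total_delay, path), or (inf, []) if dst is unreachable."""
    best = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            # Rebuild the path by following predecessor pointers.
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > best[u]:
            continue  # stale heap entry
        for v, w in links.get(u, {}).items():
            if d + w < best.get(v, float("inf")):
                best[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    return float("inf"), []

def drop_link(links, a, b):
    """Return a copy of the overlay with the a-b link removed (a lost path)."""
    copy = {u: dict(nbrs) for u, nbrs in links.items()}
    copy[a].pop(b, None)
    copy[b].pop(a, None)
    return copy

# Hypothetical overlay: delays are illustrative, not measurements.
overlay = {
    "pop1": {"pop2": 10, "pop3": 25},
    "pop2": {"pop1": 10, "pop4": 10},
    "pop3": {"pop1": 25, "pop4": 5},
    "pop4": {"pop2": 10, "pop3": 5},
}

before = shortest_path(overlay, "pop1", "pop4")
after = shortest_path(drop_link(overlay, "pop2", "pop4"), "pop1", "pop4")
print(before)  # route via pop2
print(after)   # route via pop3 once the pop2-pop4 link is lost
```

In this toy the reroute costs 10 extra units of delay; a degraded (rather than lost) path could be modeled the same way by raising a link's delay instead of removing it.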
Figure 1: Opus system model. (Labels: application demand per network region; overlay node; application overlays.)

Scalability is a key concern in the design of the service overlay, as we expect Opus to scale to 10,000 or more nodes across the wide area. We take a decentralized approach in which local site managers “think globally but act locally,” making local resource allocation choices to converge on desirable global outcomes based on information disseminated through the service overlay. This status information includes summaries of resource availability, network conditions, load, and delivered performance at each site. The schemes for configuring overlays also require estimates of link capacities, delays, and failure probabilities for the physical network interconnecting the Opus sites. A key premise of the Opus architecture is that effective resource allocation and control requires only approximate information about global conditions. Section 3.3 presents some results from our initial approach to scalable dissemination of system metadata through the service overlay.

Adaptive per-application overlays. A primary task of the service overlay is to assist in the construction and maintenance of application overlays. Individual applications use these overlays to route internal application traffic, disseminate content, and/or synchronize their state information efficiently. For example, a video delivery system would use its overlay to disseminate its content to participating sites, which in turn transmit the data to end clients. A replicated database system would use its overlay to maintain replica consistency by propagating updates among active replica sites.

At the core of the Opus architecture are algorithms to select the number and placement of site locations for each application, allocate global resource shares, and configure overlays to link the selected sites. These inter-related aspects of the overlay construction problem interact in a complex way to determine end-to-end application behavior. As an example, consider an Internet service using dynamic replication for scalability and availability. For a replicated service, the need to propagate updates across replicas imposes new network load and may compromise availability. Previous work in the TACT project has shown that the availability of a replica configuration depends on the application’s consistency demands, as well as the number and placement of replicas and the reliability of their interconnections [39]. While adding server sites can improve performance and availability, more is not necessarily better: we find that in some cases additional sites can actually reduce overall performance and availability depending on application consistency requirements. In fact, a small set of well-placed and well-provisioned replica sites generally outperforms a larger set of poorly-placed replicas. The techniques developed for TACT allow the Opus allocators to predict performance and quantify availability as a function of the candidate overlay characteristics and the application’s consistency targets.

After instantiating an overlay for an application, the Opus resource allocators dynamically adapt the overlay topology and site allotments to respond to observed load and network conditions. For example, if many accesses are observed for an application in a given network region, the system may reallocate additional resources at a PoP close to that location, possibly adding a new site presence to the application overlay. The system continuously monitors local and global conditions through the service overlay, and uses feedback control as a basis for incremental, adaptive resource provisioning. In addition to enabling dynamic adaptation, the feedback loop enables the system to continuously refine resource allotments so that the quality of an initial solution is less critical.

Resource allocation and service quality. Opus strives for resource allotments that are both effective and efficient. An effective allotment meets service quality goals; an efficient allotment conserves resources. One approach is to
strive for least-cost allotments that satisfy fixed application service quality bounds under existing traffic conditions. We take a more general approach to enable the system to prioritize applications under resource constraint. Although we expect that the utility is adequately provisioned and employs admission control to avoid overcommitting its resources, the Internet environment is adversarial, and large-scale services should design in “fallback” positions for extreme scenarios involving site or link failures, flash crowds, or attacks on the system or its physical infrastructure.

For this reason, our approach emphasizes dynamic tradeoffs of service quality and cost. This can enable the system to match resource demand with dynamically varying levels of resource supply, in order to maximize the global good under the full range of conditions and constraints that it might encounter in operation. Indeed, a key benefit of the utility approach is that it can reallocate shared infrastructure to respond to adverse conditions. Such reallocation may take place based on economic considerations (e.g., who is willing to pay the most?) or based on relative application priority (e.g., which services must absolutely stay up and running during a denial of service attack?).

A major technical challenge for flexible resource allocation in an overlay utility service is to generate candidate overlay configurations with varying tradeoffs of cost and service quality (benefit). Section 3.1 presents two overlay structures we are investigating to support this objective. Dynamic cost/benefit tradeoffs also depend on models to predict and quantify the benefit of each candidate configuration along multiple dimensions of service quality. The models must consider non-traditional quality measures such as availability or consistency, as well as more traditional performance measures such as response time, fairness, or throughput. The units to quantify different dimensions of service quality and cost are arbitrary: the system may scale these measures arbitrarily before comparing or combining them to balance competing objectives.

Given measures for service quality and cost for candidate overlay configurations, the system needs flexible criteria to establish customer priority. Our initial approach is to present service quality goals to the resource allocators as Service Level Agreements (SLAs) representing a continuum of service quality tradeoffs. Opus SLAs specify service quality goals as continuous utility functions specifying values associated with various levels of service volume and quality for each customer. The utility functions may represent arbitrary criteria for establishing customer priority. In our previous work on server provisioning in individual data centers, we found that generalized utility functions are a flexible means to guide dynamic tradeoffs of service quality and cost [8]. Modest constraints on the form of the utility functions enable the resource allocators to identify utility-maximizing allocations efficiently, and refine them incrementally in a feedback loop. Section 3.2 outlines our approach in more detail.

Security and isolation. Security is an important consideration for any general-purpose utility. Opus allocates resources to applications at the granularity of individual nodes, eliminating a subset of the security and isolation issues associated with simultaneously hosting multiple applications. In the future, we plan to investigate the use of virtual machine technology to isolate services running on the same physical host [37]. On the network side, we must still isolate traffic on the wire from different applications running at the same Opus site. VLANs, available in most modern switches, provide such functionality, and it should be straightforward to automate the requisite isolation in response to dynamic reallocation of local site resources. Finally, policy-based sharing of physical resources depends on accurate measures of application resource demand. In some cases it may be useful for the customer itself to provide the load and QoS measures. If so, Opus relies on simple economics to encourage customers to deploy efficient software and accurately represent their resource needs for a given demand level: customers conserve resources when they are asked to pay for their usage.

3 System Components

This section describes four of the principal challenges that we must address to deploy a general-purpose and large-scale service utility: constructing overlays, allocating resources, propagating status, and improving reliability through multi-path routing. A common theme running across all system components is that local decisions must be made to approximate the global good based on partial and uncertain information.

3.1 Overlay Topology Construction

As discussed above, Opus must build and maintain two separate types of overlays. The service overlay maintains and distributes overall service metadata among participating sites. The service overlay also facilitates the construction of smaller-scale application overlays designed to meet the performance and reliability requirements of a broad range of network services. A number of efforts have investigated techniques for building proper overlay topologies to match a particular application’s requirements [3, 18, 20]. However, current techniques require global knowledge and do not scale beyond a few tens of nodes. We must devise solutions that scale to thousands of nodes for application overlays and tens of thousands of nodes for the service overlay.

Our initial work focuses on developing a general overlay topology that enables dynamic tradeoffs between network performance/reliability and cost. Note that the cost of an overlay link can be assigned arbitrarily, but is likely to depend upon the cost of the individual physical links that compose an overlay link. This cost may reflect current congestion levels on a link, the price paid to an ISP to use a link, etc. The actual assignment of cost to individual links is beyond the scope of this paper, though we do assume that no individual Opus nodes are aware of this global cost metric and that the metric changes dynamically over time.

One of the key initial goals in our work is to build application overlays to enable flexible and dynamic tradeoffs between overlay cost (logically a measure of total network resources consumed in transmitting data across the overlay) and the associated performance and reliability characteristics of the overlay. To quantify the benefits of competing structures, we need a set of metrics to compare the quality of candidate overlay topologies. Initially, we focus on network cost and relative delay penalty (RDP) to characterize overlay topologies. RDP measures the relative increase of delay incurred from using a particular overlay relative to direct transmission in the underlying IP network. Network cost is the sum of all the link weights associated with a given overlay topology.

We have identified two
candidate overlay topologies that enable such flexible tradeoffs [23]. A k-spanner [7] ensures that all paths in an overlay have an RDP no worse than k. Lower values of k result in higher cost for building the overlay. Because k-spanners attempt to guarantee low-latency paths between all pairs of hosts, they are more appropriate for multi-sender applications. A second structure, LAST [22] (lightweight approximate shortest-path tree), enables similar tradeoffs for single-sender applications. With a LAST, a configuration parameter, α, bounds the RDP of all paths from a designated source to all destinations: each has an RDP no worse than α. For instance, a LAST with α = 1.5 ensures that all destinations receive data with delay at most 50% higher than transmission through IP.

We now present the results of some initial experiments to quantify the benefits of k-spanners and LASTs. The principal goal here is to enable Opus to use overlay-specific tuning parameters to match application requirements. For example, Opus can adapt to changing conditions by turning a knob (such as the k or the α value) to reallocate resources to adjust the balance of cost and performance. To quantify the benefits of dynamically trading network cost for performance in overlays, we ran some simulations of both k-spanners and LASTs. For our experiments, we constructed a 200-node overlay randomly distributed among a 600-node GT-ITM generated topology [6]. Edge delay was assigned based on default GT-ITM parameters. For these experiments, we equate edge cost with delay, though we are currently investigating techniques to allow simultaneous, bi-criteria network optimization [25].

In the case of a LAST, Figure 2(a) shows how the α parameter affects the cost of the resulting overlay, relative to both a shortest path tree (RDP = 1.0) and a minimum cost spanning tree (with an unbounded RDP). At α = 1, the overlay cost is high, comparable to a shortest path tree. However, as demonstrated in Figure 2(b), this same point corresponds to the best performance (comparable to shortest-path routing of packets in the underlying network). As α increases, the network cost of the LAST overlay decreases, eventually matching the cost of a minimum cost spanning tree at α = […]. Of course, Figure 2(b) also shows that such a low-cost overlay also results in relatively poor performance. One nice quality of the tradeoffs expressed above is that it is possible to build distribution structures that balance cost and RDP. For example, with α = […], we are able to obtain a cost within […] of an MST and an RDP within […] of an SPT for our target topology and edge weights. This result shows promise for our ability to build overlays that match application requirements with relatively low cost overhead (for all but the most demanding applications).

A key next challenge is to develop scalable distributed algorithms for building and maintaining k-spanners or LASTs. To support our goal of scalability, we must avoid the necessity of global knowledge, excessive network probing, and distributed locking to build and maintain such topologies. Our approach is to use probabilistic techniques and hierarchy to selectively probe the characteristics of various network regions. Key to our approach is having each node gradually migrate to its (approximately) “proper” location in the overlay. This is relatively straightforward assuming the presence of global group membership and pairwise probing. However, this requires unscalable memory and network overhead ([…] and […], respectively). Recent proposals in peer-to-peer networking address scalability concerns by building randomized overlays [28, 30, 33] requiring only […] per-node state. In contrast, our goal is to investigate the practicality of constructing overlays with specific performance characteristics using partial, approximate, and probabilistic knowledge of network information.
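The cost versus RDP tradeoff of Section 3.1 can be made concrete with a small, centralized sketch. It assumes the construction of Khuller, Raghavachari, and Young for light approximate shortest-path trees (walk the minimum spanning tree depth-first and splice in a shortest path whenever a node's delay exceeds α times its shortest-path delay); the text does not say which construction Opus uses, and the five-node graph, its weights, and all function names below are hypothetical. Edge cost is equated with delay, as in the experiments above.

```python
import heapq

def shortest_path_tree(adj, root):
    """Dijkstra: return (dist, parent) for the SPT rooted at root."""
    dist, parent = {root: 0}, {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, parent

def mst_children(adj, root):
    """Prim: return the MST as a {node: [children]} map rooted at root."""
    children = {u: [] for u in adj}
    seen = {root}
    heap = [(w, root, v) for v, w in adj[root].items()]
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)
        if v in seen:
            continue
        seen.add(v)
        children[u].append(v)
        for x, wx in adj[v].items():
            if x not in seen:
                heapq.heappush(heap, (wx, v, x))
    return children

def build_last(adj, root, alpha):
    """Return parent pointers of a tree in which every root-to-node delay is
    at most alpha times the shortest-path delay (weights must be positive)."""
    spt_dist, spt_parent = shortest_path_tree(adj, root)
    children = mst_children(adj, root)
    d = {u: float("inf") for u in adj}
    parent = {u: None for u in adj}
    d[root] = 0

    def relax(u, v):
        if d[u] + adj[u][v] < d[v]:
            d[v], parent[v] = d[u] + adj[u][v], u

    def add_path(v):
        # Splice in the shortest path from the root to v.
        if d[v] > spt_dist[v]:
            add_path(spt_parent[v])
            relax(spt_parent[v], v)

    def dfs(u):
        # Walk the MST; repair any node whose delay bound is violated.
        if d[u] > alpha * spt_dist[u]:
            add_path(u)
        for v in children[u]:
            relax(u, v)
            dfs(v)
            relax(v, u)

    dfs(root)
    return parent

def network_cost(adj, parent):
    """Sum of the link weights used by the tree."""
    return sum(adj[p][v] for v, p in parent.items() if p is not None)

def root_rdp(adj, parent, root):
    """Worst ratio of tree delay to shortest-path delay from the root."""
    spt_dist, _ = shortest_path_tree(adj, root)
    worst = 1.0
    for v in adj:
        delay, node = 0, v
        while parent[node] is not None:
            delay += adj[parent[node]][node]
            node = parent[node]
        if v != root:
            worst = max(worst, delay / spt_dist[v])
    return worst

# Hypothetical topology: a chain r-a-b-c-d plus two direct links from r.
edges = [("r", "a", 2), ("a", "b", 2), ("b", "c", 2),
         ("c", "d", 2), ("r", "c", 3), ("r", "d", 3)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

results = {}
for alpha in (1, 2, 3):
    tree = build_last(graph, "r", alpha)
    results[alpha] = (network_cost(graph, tree), root_rdp(graph, tree, "r"))
    print(f"alpha={alpha}: cost={results[alpha][0]}, root RDP={results[alpha][1]:.2f}")
```

On this toy graph the sweep moves from the shortest path tree (α = 1, cost 10, RDP 1.00) through an intermediate tree (α = 2, cost 9, RDP 1.67) to the minimum spanning tree (α = 3, cost 8, RDP 2.67), which mirrors the qualitative shape described for Figure 2. The sketch relies on global knowledge of the graph, which is exactly what a scalable distributed version would need to avoid.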
Figure 2: Dynamically trading network cost for relative delay penalty using a lightweight approximate shortest path tree. (a) Network cost versus LAST alpha (tunable property), with SPT, LAST, and MST curves. (b) Root RDP versus LAST alpha (tunable property), with MST, LAST, and SPT curves.

3.2 Resource Allocation

One of the key components of Opus is resource allocation among competing applications. This principally requires determining the relative priority for competing applications and the proportion of global resources that should be allocated based on current system conditions. We will use SLAs as the basis for economic prioritization, building on our initial success with using an economic model for prioritization and resource allocation in a cluster setting [8].

The basic resource mapping challenge is to establish a matrix of allotments from system resources across customers (applications). The system resources include servers in the Opus PoPs and network links interconnecting them. The system strives to balance the service quality of the selected allotments with their costs. In Opus, our challenge is to allocate these resources in a decentralized manner, based on partial information about resource supply and demand collected through the service overlay.

Opus uses a generalized measure of benefit or utility as a basis for flexible SLAs representing dynamic tradeoffs between service quality and value. Customers are associated with utility functions specifying the value of any given level of service volume and service quality predicted to result from a candidate allotment. Opus makes resource allocation decisions by comparing the expected utility of a set of candidate configurations, with the goal of maximizing global utility. The system uses models to predict the effects of candidate resource allotments on service quality, then evaluates the SLA functions to determine the expected value of the predicted behavior. Informally, the domains of these composite functions are continuous measures of the cost of resources assigned, e.g., the aggregate amount of server resources assigned to the application at the Opus PoPs, or the network cost of a LAST with a given α parameter. The units of value are arbitrary, as long as the system can combine values assigned to multiple measures of service quality, and compare the total values of candidate configurations to determine which of the alternatives is preferable.

The resulting optimization problems fall into a classical economic framework for resource allocation. Computing optimal resource allocations from sets of utility functions and service quality estimates is a linearly constrained non-linear optimization problem. To make the problem tractable, we constrain the composite utility functions to be concave. This means that the marginal benefit of assigning additional resources, e.g., servers or network links, to a configuration declines steadily and approaches zero: adding resources beyond some point does not result in meaningful improvement of service quality. More formally, the utility gradient (the derivative if the function is differentiable over a real-valued domain) is non-negative and monotonically non-increasing. This reduces the optimization problem to a simple convex programming problem with an efficient solution based on gradient climbing [19]. If there are sufficient resources to avoid starving any customer, then there exists a unique optimal solution with the property that the marginal utility of an additional resource unit is in equilibrium across all customers. This equilibrium marginal utility is equivalent to the equilibrium price that matches resource demand with available supply in an economic market for allocating resources.

Figure 3: Example of gradient climbing to determine resource allocation. (Axes: allocated resources versus throughput; curves: App 1, App 2.)

A simple example helps to illustrate this point. Suppose that an Opus system hosts two application services with a constant level of offered load. Figure 3 shows concave curves that qualitatively represent the throughput of the two hypothetical applications as a function of the resources allocated to each. If the SLA functions for these customers define value as linear with delivered throughput, and they have equal priority (their utility functions have the same slope), then Opus will seek a resource allotment that maximizes global throughput. Note that while we use throughput in this simple example, the y-axis could just as easily represent availability, reliability, latency, or some other service quality metric, e.g., a composite metric representing expected customer revenue.

The curves show that adding resources significantly improves throughput when allotments are low and the customers are starved. However, as more resources are added, the marginal gain in throughput declines and approaches zero (trivially, for an offered load of 100 small file requests per second, changing allocation from 10 to 11 machines is not likely to measurably improve throughput). The marginal benefit of an additional resource unit can be measured by the first derivative or gradient of each application’s utility curve. In this example, the gradients at particular points on the x-axis represent the current allocation of resources to each application and the expected benefit of allocating an additional unit of resource to application 1 versus application 2. Here, application 1 would enjoy a greater estimated boost in throughput from an additional unit of resource because it has a larger gradient than application 2. Opus gives preference to application 1 until its marginal gain equalizes.

Thus, the Opus resource allocators strive to maximize global value across all applications. In the general case, the SLA functions may specify utility as a combination of service quality metrics in a common currency of value. The utility functions may also incorporate priority by valuing service quality for some applications higher than others. For example, the value metrics would prioritize, say, dissemination of tactical information over distribution of training videos, enabling the system to provision resources rationally if faced with an unexpected crisis and resource shortage. The value of the allotments changes dynamically with changing conditions and offered load. Our challenge then is to estimate the changing shape and gradient for these curves to respond to dynamic changes, based on partial knowledge propagated through the service overlay.

Overall, the concavity constraint allows the system to adjust equilibrium allotments incrementally to adapt to changing conditions. The system continuously monitors load and resource status, and propagates status information through the service overlay. This status information constitutes a feedback signal to trigger adaptive resource reallocation. Rather than computing a new allocation from scratch, the system responds to changes by incrementally adapting an existing configuration to restore equilibrium. This can be done using an efficient greedy algorithm whose cost scales with the magnitude of shifts in load or resource availability from one interval to the next [8].

Economic resource allocation scales naturally using a decentralized federation of autonomous local markets exchanging information to converge toward a global equilibrium. Our initial design centers around a hierarchical structure to aggregate related resources into cells that plan their internal allocations locally. A cell might be an entire Opus PoP or a portion of a large PoP, e.g., an array of generic servers sharing a redirecting switch node in close proximity. Cells cooperate to trade load or resources in order to balance resource usage across the system. To derive the magnitude of resource shifts, cells exchange information about the supply and demand for resources in each cell. This can be captured compactly as the marginal utility gained by adding resources to that cell or shifting load away from that cell to free up resources for some other use; this marginal utility is equivalent to the price for resources in that cell. We believe that this cellular structure is the key to scalable resource provisioning in large data centers and networks of server sites.

A key tenet of this work is that service quality must be measured in
an application-specific manner. Thus, one important question involves incorporating multiple dimensions of service quality, including reliability, performance, and data consistency, into a single utility function. One option is to define a unified measure incorporating all aspects of service quality, with a single utility function for each customer. An alternative is to define each dimension of service quality as a separate utility function, and represent tradeoffs by optimizing the sum of the individual value measures.

Figure 4: Hierarchical data dissemination in dicast. (Labels: cluster; agent; cluster overlay network.)

3.3 Scalable Tracking of System Characteristics

As discussed above, a primary challenge to building and maintaining large scale utilities involves maintaining distributed state about global system characteristics. Consider the requirements of the following Opus tasks. First, for request routing, clients spread across the network must choose the replica most likely to deliver the best performance, reliability, security, etc. To achieve such functionality, the request routing infrastructure must track dynamically changing replica characteristics, for instance, available bandwidth and load information. Second, building and maintaining overlays requires probing the network characteristics among all participating replicas. In general, nodes near one another in the underlying topology (i.e., displaying strong pair-wise performance and reliability characteristics) should peer together in the overlay. Finally, the system must track dynamic group membership information to retire nodes that fail or fall behind long-lasting network partitions.

Thus, an Opus node requires an abstraction to communicate its local state and local observations (e.g., network probes) to other system nodes. Similarly, Opus nodes must receive updates about global system characteristics from remote sites. In a large-scale utility, it is impractical to maintain accurate global system characteristics. Our challenge then is to balance communication costs with data accuracy as a function of system size and global characteristics.

To develop a communication abstraction able to scale to large numbers of nodes, we draw inspiration from Internet routing protocols [16, 26, 29], perhaps the best example of distributed protocols that scale to global proportions. The fundamental lesson we draw is that aggregation, hierarchy, and approximation are fundamental to wide-area scalability. We apply these design ideas to a generic communication library within Opus, called dicast, designed to distribute approximate data in a scalable fashion. Thus, not all updates originating at a given node will be (or even need be) delivered to all participants. Further, individual updates may be aggregated together to increasing degrees as data moves through the network.

The use of aggregation in dicast naturally leads to the construction of a tree-based structure, as depicted in Figure 4. Nodes are partitioned into clusters of size c, where c determines the height of the tree (for n nodes, a cluster size of c implies a tree height of approximately log_c n). Each cluster elects an agent, a speaker responsible for disseminating local cluster information to the rest of the dicast tree. Agents from adjacent clusters form second-level clusters. This process is repeated until an h-th level cluster is formed, where h is the height of the tree. Note that all physical nodes in the dicast tree are at the leaves (first-level clusters) and intermediate nodes in the tree are elected members from the leaf set who serve multiple responsibilities. Deriving good performance from such an approach requires assigning nodes to clusters with other topologically nearby nodes. We plan to leverage existing work on clustering [24] to aid in this process where possible.

In dicast, data travels up the tree, potentially being aggregated with data from other dicast nodes. At each level of the tree, an overlay (as discussed in Section 3.1 above) propagates the data among all participating cluster members. Associated with each level of the tree is a target ([…]-specific) level of accuracy for either aggregated or individual node information. Once a particular update reaches a level of the tree where aggregate accuracy requirements are not violated, it will be buffered awaiting the arrival of further updates that will eventually force the propagation of an aggregate update to nodes higher up the tree. As data spreads to higher-level clusters, it is in turn transmitted back toward the leaves because each agent is a member of at least two adjacent levels in the tree.

One example use of dicast is to propagate per-region resource consumption information to
influence local resource allocation decisions. Thus, a local node may have exact information about per-application resource consumption for nodes in the same cluster. However, it may only have aggregate (and somewhat inaccurate) information about resource consumption in remote clusters. Still, such approximate and aggregate data is likely to be sufficient to set local allocation levels to meet global allocation targets. Similarly, information on per-cluster load imbalances may be used to make a decision to reallocate a given replica from one application to another to better meet target SLAs or to maximize global system throughput.

3.4 Reliability QoS Guarantees

For many emerging Internet services, reliability and availability are more important metrics than raw service performance. There are a number of potential definitions for service availability; we define availability to be the percentage of requests that can be satisfied within individual client performance requirements. Many existing metrics for availability consider a service available if it is currently satisfying client requests, with availability reducing to a simple measure of uptime, or the amount of time without hardware/software failures. For our approach, availability is measured by integrating across all client requests, with those requests that return too slowly (e.g., based on an expected distribution or even on per-client performance expectations) marked as unavailable.

In the context of a replicated utility, an individual hosted service may be considered unavailable for a number of reasons, including failures in the request routing infrastructure, in network links, or, in our more general model, because insufficient resources were allocated to meet target performance characteristics. One approach we are pursuing for addressing failures at the network level, called restricted flooding, is to build overlay topologies that redundantly transmit the same data over multiple logical paths [32]. However, we use a variant of anti-entropy [34] to minimize the overhead associated with redundant transmission for certain application classes. Here particular overlay nodes may choose to forward an application-layer frame redundantly along multiple paths to a single destination, especially if any given path does not meet aggregate reliability requirements. As the data travels toward its destination, certain downstream nodes may receive multiple copies of the same frame (as identified by a unique identifier). In this case, the downstream node will re-evaluate the estimated reliability of the remainder of the path and suppress duplicate frames if reliability targets are likely to be achieved with the propagation of a single frame. This manner of restricted flooding provides two principal advantages. First, restricted flooding means that the overlay does not have to […] necessarily prevent overlay construction. Next, multiple independent paths to the destinations mean that individual delays, failures, or packet drops will not necessarily prevent the timely delivery of data given available redundancy in the distribution […].

Figure 5: Using restricted flooding to control cost versus reliability tradeoffs. (Link reliabilities shown: .96, .98, .97, .97, .99; target reliability = .98.)

A primary challenge to developing such an approach is ensuring that the overlay topology matches the failure characteristics of the underlying network. For instance, if separate logical links in an overlay correspond to a common failure-prone link in the underlying physical network, a failed physical link can result in failures in multiple logical overlay links. Thus, it is important to construct overlays with disjoint paths, where the failure correlation among logical overlay links is low. We determine the loss correlation among multiple potential links by collecting statistical information about loss correlations and by using network topology information where available. Our use of multiple redundant paths enables immediate failover rather than relying on the underlying network to converge to new routes in the face of failure. Further, we hope that the combination of restricted flooding and careful construction of overlay topologies will result in only minimal traffic overhead relative to single-path routing.

Consider the simple example depicted in Figure 5 where a source wishes to transmit data to a destination, […], with an end-to-end reliability of 98% and where all links are disjoint. Omitting the details of the simple calculations, transmitting the data through either […] toward […] results in reliability of 93.1%. However, transmitting data through both […] means that at least one copy of the data arrives at the j
oinpoint,,99.6%ofthetime,withtwocopiesofthedataarrivingwithan88.5%probability.Whennodeforwardsonecopyofthedata(suppressingthesecondshoulditarrivelater),resultingend-to-endreli-abilityis98.7%,whichmeetsthetargetyield.Forwardingbothcopiesresultsin99.8%reliability.Thegoalofourworkinrestrictedoodingistopro-videeachnodewithenoughinformationtodeterminehow manysimultaneousroutestomaintainforagivencommu-nicationstreamtoachieveagivenlevelofreliability.In-termediatenodesmustthendetermineifitisfeasibletosuppresssubsequenttransmissionofthesamedataandstillmaintaintargetreliability.Inthisexample,restrictedingmustdetermineifanapproximately88%increaseintheutilizationoftheoverlayedgeforthisparticularcom-municationstreamisworththepotential1.1%improve-mentinend-to-endreliability.Ofcourse,thisevaluationmustbemadeinresponsetochangingnetworkconditionsandapplicationdemands.Finally,inSection3.1,wediscussedtechniquesforal-lowingapplicationdeveloperstodynamicallytradeforperformance.Ourapproachtoprovidinghighreliabil-itythroughredundanttransmissionanddisjointpathsaddsanotherdimensiontothistradeoff:itallowsapplicationstospecifybothperformanceandreliabilitytargets.Opusthenstrivestobuildthelowestcost(orlowestoverhead)overlaytomeetthespeciedgoals.4RelatedWorkOurworkonOpusisinspiredbyrelatedeffortsinanumberofdifferentelds.ResearchintoActiveNet-works[1,17,27,36]proposesmovingcomputationintothenetworkonaper-packetlevel.WeviewourutilitymodelasalogicalculminationoftheActiveNetworkphi-losophy.Thatis,overlayspushapplication-levelfunction-alitytospecicintermediatenodesinthenetwork.How-ever,thegranularityofcomputationinoverlaysiscoarsergrainedthaninActiveNetworks,operatingonapplication-layerframes[9]ratherthanindividualnetworkpackets.Indesigningtheabstractionsforourutilityenvironment,wewillbuildontheworkalreadyperformedinthecontextofActiveNetworks.WorkintoActiveServices[2]investigatesasimilarin-termediatepointofpushingapplicationfunctionalityintothenetwork.Relativetothiseffort,wefocusonthewide-areaissuesa
ssociatedwithsimultaneouslydeployingandallocatingresourcesamongcompetingapplicationsinascalableutility.Wherepossible,weintendtoleveragethesetofabstractionsdevelopedforactiveservicesrunningwithinaclusterenvironment(analogoustoourindividualOpussites).Anumberofeffortsareinvestigatingautilitymodelforwide-areacomputing.Akamai[10]hostsalargenumberofserversacrosstheInternet.Globus[13]andLegion[14]investigateresourceallocationinthecontextofawide-areacomputationalgrid.WebOS[35]investigatessystemsupportforwide-areaservices.Withinasinglemachineroom,ClusterReservesenforcesaglobalallocationofre-sourcesamongmultipleresourceprincipals[4].Relativetotheseefforts,ourgoalistosimultaneouslyinvestigateissuesofresourceallocation,replicaplacement,andover-layconstructionbasedonaneconomicmodeltodetermineper-applicationprioritylevels.WebelievethatourworkinOpuswillbecomplementarytotheseexistingefforts.Ourutilitymodelinvestigatestechniquesforallocatingnetworkresourcestocompetingapplications.Welever-ageoverlaynetworksbothtotrackthecharacteristicsoftheutilityasawhole,aswellastopropagateupdatesamongindividualapplicationnodes.Theideaofanoverlaynet-workisnotnew,havingbeenleveragedtoeasethede-ploymentofbothmulticastintheMbone[12]andIPv6inthe6bone[15].Untilrecently,overlayswereviewedasatransitiontechnology.However,recentacademicandcom-mercialeffortsareadvocatingtheuseofoverlaysasafun-damentalapproachforbothdeployingnewnetworkfunc-tionality(e.g.,multicast[18,20])andforimprovingtheperformanceandavailabilityofexistingapplications(e.g.,improvedapplication-layerrouting[3,31,32]).Relativetoexistingapproaches,ourworkisageneralutilityinfras-tructuretoallocatenodesamongcompetingapplications.Further,weinvestigatefundamentaltechniquesforscalingoverlaynetworkstothousandsofnodesandfordesign-ing,implementing,andevaluatingdistributedalgorithmsforbuildingandmaintainingoverlayscapableofmatchingapplicationperformanceandavailabilityrequirements.Recently,therehasbeentremendousinterestinscalablepeer-to-peerlookupservices
[11, 28, 30]. At a high level, these systems hash an object name to a key within some address space and randomly assign cooperating peers to be responsible for some region of this address space. An end client wishing to look up a particular object performs the hash and uses the lookup infrastructure to route its request to the appropriate peer in application-level hops. The system is scalable in that peers maintain only bounded state in facilitating this lookup. These elegant designs provide significant scalability benefits at the cost of loss of control over exactly how nodes are interconnected, the cost of resulting overlays, etc. Our work on resource allocation and managing inexact information across the wide area is orthogonal to these efforts. However, one explicit goal of this work is to determine the relative performance benefits and computational/communication overhead of explicit versus implicit overlay construction and maintenance in large scale distributed systems.

5 Conclusions

This paper presents a novel model for wide-area computing where a collection of server sites distributed across the Internet simultaneously support the requirements of a broad range of decentralized Internet applications. Rather than forcing individual applications to reimplement significant functionality and to redundantly administer distributed service resources, an overlay peer utility service, Opus, dynamically allocates resources among competing applications. This paper describes our approach to realizing this vision and some of the specific research issues we are addressing. In particular, we present: i) the system architecture and abstractions necessary for diverse applications to push functionality to intermediate nodes, ii) models for resource allocation and replica placement for competing applications based on dynamically changing system characteristics, iii) constructing dynamic per-application scalable overlays that both match application performance/availability requirements and that make efficient use of underlying network resources, and iv) decentralized and scalable techniques for tracking global system characteristics through aggressive use of hierarchy, aggregation, and approximation.

References

[1] D. Scott Alexander, William A. Arbaugh, Michael W. Hicks, Pankaj Kakkar, Angelos D. Keromytis, Jonathan T. Moore, Carl A. Gunter, Scott M. Nettles, and Jonathan M. Smith. The SwitchWare Active Network Architecture. IEEE Network, 12(3):29-36, May/June 1998.
[2] Elan Amir, Steven McCanne, and Randy Katz. An Active Service Framework and its Application to Real-Time Multimedia Transcoding. In Proceedings of SIGCOMM, September 1998.
[3] David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek, and Robert Morris. Resilient Overlay Networks. In Proceedings of SOSP 2001, October 2001.
[4] Mohit Aron, Peter Druschel, and Willy Zwaenepoel. Cluster Reserves: A Mechanism for Resource Management in Cluster-based Network Servers. In Proceedings of the ACM Sigmetrics 2000 International Conference on Measurement and Modeling of Computer Systems, June 2000.
[5] S. Bhattacharjee, M. Ammar, E. Zegura, V. Shah, and Z. Fei. Application-Layer Anycasting. In Proceedings of IEEE Infocom, April 1997.
[6] Ken Calvert, Matt Doar, and Ellen W. Zegura. Modeling Internet Topology. IEEE Communications Magazine, June
[7] Barun Chandra, Gautam Das, Giri Narasimhan, and Jose Soares. New Sparseness Results on Graph Spanners. In Symposium on Computational Geometry, pages 192
[8] Jeffrey S. Chase, Darrell C. Anderson, Prachi N. Thakar, Amin M. Vahdat, and Ronald P. Doyle. Managing energy and server resources in hosting centers. In Proceedings of the 18th ACM Symposium on Operating System Principles, October 2001.
[9] David D. Clark and David L. Tennenhouse. Architectural Considerations for a New Generation of Protocols. In Proceedings of SIGCOMM, September 1990.
[10] Akamai Corporation, 1999.
[11] Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica. Wide-area Cooperative Storage with CFS. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP), October 2001.
[12] H. Eriksson. Mbone: The Multicast Backbone. Communications of the ACM, 37(8):54-60, 1994.
[13] Ian Foster and Carl Kesselman. Globus: A Metacomputing Infrastructure Toolkit. In International Journal of Supercomputer Applications, volume 11(2), pages 115
[14] Andrew S. Grimshaw, William A. Wulf, and the Legion team. The Legion Vision of a Worldwide Virtual Computer. Communications of the ACM, 40(1), January
[15] I. Guardini, P. Fasano, and G. Girardi. IPv6 Operational Experience within the 6bone. In Proceedings of the Internet Society Conference, July 2000.
[16] Roch Guerin and Ariel Orda. QoS-based Routing in Networks with Inaccurate Information. In Proceedings of IEEE INFOCOM, 1997.
[17] Michael Hicks, Pankaj Kakkar, Jonathan T. Moore, Carl A. Gunter, and Scott Nettles. PLAN: A Packet Language for Active Networks. In Proceedings of the Third ACM SIGPLAN International Conference on Functional Programming Languages, pages 86-93, 1998.
[18] Yanghua Chu, Sanjay Rao, and Hui Zhang. A Case For End System Multicast. In Proceedings of ACM Sigmetrics, June 2000.
[19] Toshihide Ibaraki and Naoki Katoh, editors. Resource Allocation Problems: Algorithmic Approaches. MIT Press, Cambridge, MA, 1988.
[20] John Jannotti, David K. Gifford, Kirk L. Johnson, M. Frans Kaashoek, and James W. O'Toole, Jr. Overcast: Reliable Multicasting with an Overlay Network. In Proceedings of Operating Systems Design and Implementation (OSDI), October 2000.
[21] Dina Katabi and John Wroclawski. A Framework for Scalable Global IP-Anycast. In Proceedings of Sigcomm, August 2000.
[22] S. Khuller, B. Raghavachari, and N. Young. Balancing Minimum Spanning and Shortest Path Trees. In Proc. ACM/SIAM Symp. on Discrete Algorithms, January 1993.
[23] Dejan Kostic and Amin Vahdat. Latency versus Cost Optimizations in Hierarchical Overlay Networks. Technical Report CS-2001-04, Duke University, January 2002.
[24] Balachander Krishnamurthy and Jia Wang. On Network-Aware Clustering of Web Clients. In Proceedings of ACM SIGCOMM 2000, August 2000.
[25] Adam Meyerson, Kamesh Munagala, and Serge Plotkin. Cost-Distance: Two Metric Network Design. In Proceedings of the Symposium on the Foundations of Computer Science (FOCS), November 2000.
[26] J. Moy. OSPF Version 2. Technical Report RFC 2178, Internet Engineering Task Force, Network Working Group, July 1997.
[27] Erik L. Nygren, Stephen Garland, and M. Frans Kaashoek. PAN: A High-Performance Active Network Node Supporting Multiple Mobile Code Systems. In Proceedings IEEE OpenArch 1999, March 1999.
[28] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker. A Content Addressable Network. In Proceedings of SIGCOMM 2001, August 2001.
[29] Y. Rekhter and T. Li. A Border Gateway Protocol 4 (BGP-4). Technical Report RFC 1771, Internet Engineering Task Force, Network Working Group, March 1995.
[30] Antony Rowstron and Peter Druschel. Storage Management and Caching in PAST, a Large-Scale, Persistent Peer-to-Peer Storage Utility. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP), October 2001.
[31] Stefan Savage, Thomas Anderson, Amit Aggarwal, David Becker, Neal Cardwell, Andy Collins, Eric Hoffman, John Snell, Amin Vahdat, Geoff Voelker, and John Zahorjan. Detour: A Case for Informed Internet Routing and Transport. IEEE Micro, 19(1), January 1999.
[32] Alex C. Snoeren, Kenneth Conley, and David K. Gifford. Mesh-Based Content Routing Using XML. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP), October 2001.
[33] Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan. Chord: A Scalable Peer to Peer Lookup Service for Internet Applications. In Proceedings of the 2001 SIGCOMM, August 2001.
[34] Douglas B. Terry, Marvin M. Theimer, Karin Petersen, Alan J. Demers, Mike J. Spreitzer, and Carl H. Hauser. Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System. In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, December 1995.
[35] Amin Vahdat, Thomas Anderson, Michael Dahlin, Eshwar Belani, David Culler, Paul Eastham, and Chad Yoshikawa. WebOS: Operating System Services for Wide-Area Applications. In Proceedings of the Seventh IEEE Symposium on High Performance Distributed Systems, Chicago, Illinois, July 1998.
[36] David Wetherall. Active Network Vision and Reality: Lessons From a Capsule-based System. In Proceedings of the 17th Symposium on Operating Systems Principles, December 1999.
[37] Andrew Whitaker, Marianne Shaw, and Steven D. Gribble. Denali: Lightweight Virtual Machines for Distributed and Networked Applications. Technical Report 02-02-01, University of Washington, 2002.
[38] Chad Yoshikawa, Brent Chun, Paul Eastham, Amin Vahdat, Thomas Anderson, and David Culler. Using Smart Clients to Build Scalable Services. In Proceedings of the USENIX Technical Conference, January 1997.
[39] Haifeng Yu and Amin Vahdat. Design and Evaluation of a Continuous Consistency Model for Replicated Services. In Proceedings of Operating Systems Design and Implementation (OSDI), October 2000.
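
As a worked check of the restricted flooding example in Section 3.4 (Figure 5), the following minimal Python sketch recomputes the quoted reliabilities. It rests on assumptions that are not stated in the text: the five link reliabilities are taken to form two disjoint two-hop branches, (0.96, 0.98) and (0.97, 0.97), that merge at a join node followed by a single 0.99 link to the destination, and link failures are treated as independent. This layout is inferred from the quoted percentages rather than read from the figure, and the function names are hypothetical. Under these assumptions the results agree with the quoted 93.1%, 99.6%, 88.5%, and 98.7% to within rounding. The forward-both case is left out because it depends on figure details that are not recoverable here.

```python
from math import prod


def path_reliability(links):
    """Probability that a frame survives every link on a path,
    assuming independent link failures."""
    return prod(links)


def at_least_one(p1, p2):
    """Probability that at least one of two copies sent over
    disjoint paths reaches the join node."""
    return 1.0 - (1.0 - p1) * (1.0 - p2)


def should_suppress_duplicate(p_join, downstream_links, target):
    """Hypothetical suppression rule: drop the duplicate frame when
    forwarding a single copy is already expected to meet the target."""
    return p_join * path_reliability(downstream_links) >= target


# Assumed layout, inferred from the quoted percentages (not from the figure).
branch_a = [0.96, 0.98]
branch_b = [0.97, 0.97]
tail = [0.99]
target = 0.98

p_a = path_reliability(branch_a)
p_b = path_reliability(branch_b)

single_path = p_a * path_reliability(tail)   # one branch, no redundancy
p_join = at_least_one(p_a, p_b)              # at least one copy at the join
p_both = p_a * p_b                           # both copies at the join
one_copy = p_join * path_reliability(tail)   # join forwards a single copy

print(f"single path end-to-end:   {single_path:.4f}")
print(f"at least one copy at join: {p_join:.4f}")
print(f"both copies at join:       {p_both:.4f}")
print(f"forwarding one copy:       {one_copy:.4f}")
print("suppress duplicate:", should_suppress_duplicate(p_join, tail, target))
```

In this sketch a single path falls short of the 0.98 target while redundant transmission with duplicate suppression at the join node meets it, which is the cost versus reliability decision the example describes.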