USENIX Association, 10th International Conference on Autonomic Computing (ICAC ’13)

Working Set-based Physical Memory Ballooning

Jui-Hao Chiang, Stony Brook University
Han-Lin Li and Tzi-cker Chiueh, Industrial Technology Research Institute

[...] pages that have been accessed at least once in the observation window. However, this scheme is infeasible because the overhead of trapping every memory read/write is too high to be acceptable in practice.

To get around this problem, VMware’s ESX uses a sampling approach to estimate the working set size of a VM. Periodically it marks a randomly sampled subset of the VM’s guest physical pages as invalid, counts the number of pages in the subset that are accessed whenever a protection fault against any of these pages occurs, and uses the resulting count to infer the VM’s working set size.

Another way to estimate a VM’s working set size, used by the self-ballooning mechanism [15] in the Xen hypervisor, is to directly use the Committed_AS statistic maintained by the Linux kernel, which corresponds to the total number of memory pages consumed by all processes on a VM.

For page reclamation, Linux maintains two LRU (Least Recently Used) lists, Active and Inactive, for each of the following two types of memory pages: (1) Anonymous Memory, which corresponds to the heaps and stacks of user processes, and (2) Page Cache, which corresponds to the kernel’s memory to buffer and cache the payloads of disk reads and writes. Utilizing the hardware reference bit, Linux puts pages that are accessed more frequently into the Active list and leaves pages that are accessed less frequently in the Inactive list. The page reclamation mechanism traverses the Inactive list to free its pages and possibly re-allocate them. If a reclaimed page belongs to anonymous memory, the kernel marks the page’s page table entry as non-present, and swaps out the page’s content to the swap disk. When the page is later accessed, a swapin event occurs and it is swapped in. If a reclaimed page belongs to page cache, the kernel flushes its content to disk if it has been dirtied. If the page is later accessed, a refault event occurs and it is brought back in.

When a VM’s physical memory allocation is larger than or equal to its working set size, the number of swapin and refault events should be close to zero. This observation inspires the third way to estimate a VM’s working set size: gradually decrease the balloon target of the balloon driver in the VM until the VM’s swapin and refault counts start to become non-zero. The amount of physical memory allocated to the VM at that instant is the VM’s working set size.

More concretely, a 3-state finite-state machine, as shown in Figure 1, is used to adaptively track a VM’s working set size (WSS). Any time the WSS changes, we adjust the VM’s balloon target accordingly. The finite-state machine starts in the FAST state and initializes the VM’s WSS to the VM’s Committed_AS. While in the FAST state, the finite-state machine iteratively lowers the VM’s WSS by 5% of the current Committed_AS value at the end of every epoch (epoch size set to 1 second currently), until swapin or refault events occur within the current epoch, which suggests the finite-state machine may have overshot the WSS adjustment. As soon as swapin/refault events arise in an epoch, the finite-state machine raises the VM’s current WSS estimate by the sum of the observed swapin and refault event counts, and enters the COOL_DOWN state, regardless of whether the finite-state machine was originally in the FAST, COOL_DOWN or SLOW state. While in the COOL_DOWN state, the finite-state machine initializes a cool-down counter to a default time-out value (currently set at 8 seconds) and waits for it to expire, and resets the cool-down counter to the same default value if additional swapin/refault events arise. In the SLOW state, the finite-state machine applies the same logic as in the FAST state except that the VM’s WSS is iteratively lowered by 1% of the current Committed_AS value in each epoch. Whenever the tracked VM’s Committed_AS changes, the finite-state machine considers that the VM’s working set size has changed significantly, and resets itself by entering the FAST state and re-initializing the VM’s WSS to the new Committed_AS.

[Figure 1: The finite-state machine used to track a VM’s working set size. States: FAST, SLOW, COOL_DOWN. Transition labels: "Committed_AS changes", "swapin/refault detected", "Cool_down counter reaches" (value missing in the source).]
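The state machine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors’ kernel module: the class name WssTracker and its fields are hypothetical, all quantities are counted in pages, a Committed_AS change is checked before fault events within an epoch, and the transition from COOL_DOWN to SLOW on counter expiry is inferred from the labels of Figure 1 rather than stated in the text.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    FAST = "FAST"
    SLOW = "SLOW"
    COOL_DOWN = "COOL_DOWN"


@dataclass
class WssTracker:
    """Hypothetical per-VM working set size (WSS) tracker, in pages."""
    committed_as: int            # last observed Committed_AS
    fast_pct: int = 5            # FAST state lowers WSS by 5% per epoch
    slow_pct: int = 1            # SLOW state lowers WSS by 1% per epoch
    cool_down_epochs: int = 8    # 8 s time-out with 1 s epochs
    state: State = State.FAST
    wss: int = 0
    counter: int = 0

    def __post_init__(self):
        # The tracker starts in FAST with WSS initialised to Committed_AS.
        self.wss = self.committed_as

    def step(self, committed_as, swapin, refault):
        """Process one epoch and return the updated WSS estimate."""
        if committed_as != self.committed_as:
            # Committed_AS changed: reset to FAST and re-initialise WSS.
            # (Assumption: this check takes priority over fault events.)
            self.committed_as = committed_as
            self.wss = committed_as
            self.state = State.FAST
            return self.wss
        faults = swapin + refault
        if faults > 0:
            # Overshoot detected: raise WSS by the fault count, (re)start
            # the cool-down counter, and enter COOL_DOWN from any state.
            self.wss += faults
            self.state = State.COOL_DOWN
            self.counter = self.cool_down_epochs
            return self.wss
        if self.state is State.COOL_DOWN:
            self.counter -= 1
            if self.counter <= 0:
                # Assumption: expiry leads to SLOW (inferred from Figure 1).
                self.state = State.SLOW
            return self.wss
        pct = self.fast_pct if self.state is State.FAST else self.slow_pct
        self.wss = max(0, self.wss - self.committed_as * pct // 100)
        return self.wss
```

A caller would invoke step() once per epoch with the latest counters and then set the VM’s balloon target from the returned estimate.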
3 TWS-based Memory Ballooning

Memory ballooning [21, 8] is a technique that reclaims physical memory from a VM by installing inside the VM a balloon driver that allocates memory pages from the VM’s kernel via the standard APIs, pins them down, and returns them to the hypervisor. The balloon target of a balloon driver is the difference between the VM’s configured memory requirement and the amount of memory it allocates from the VM.

How to correctly set a VM’s balloon target is an important issue. When a balloon driver allocates more than the host VM’s free memory pool, the VM OS’s page reclamation mechanism is triggered to evict cold pages. The upper bound on a VM’s balloon target is the VM’s configured memory requirement, and the lower bound is the VM’s minimum memory requirement that prevents Out-of-Memory exceptions. The optimal way to set a VM’s balloon target is to set it to the VM’s working set size, because this allows the hypervisor to reclaim the maximum amount of physical memory from a VM while reducing the performance impact on the VM to the minimum.

The self-ballooning mechanism in the Xen hypervisor sets a Linux VM’s balloon target to its current Committed_AS value. This approach guarantees that applications consuming anonymous memory do not suffer from any swap-in delay because all their stacks and heaps are likely to be memory-resident. However, compared with the working set-based approach to setting the balloon target, this approach has two deficiencies.
First, Committed_AS does not factor the page cache into a VM’s physical memory demand, and thus may cause substantial performance degradation for applications with intensive disk I/O activities, which could significantly benefit from the page cache. In contrast, the working set approach keeps a counter for refault events, and incorporates this counter into the calculation of a VM’s working set size and thus balloon target. Second, Committed_AS captures only the pages that are allocated but not those that are actually used recently. More specifically, Committed_AS is incremented upon the first access to each newly allocated anonymous memory page and is decremented only when the owner process explicitly frees the page. For example, if a program allocates and accesses a memory page only once when the program starts but leaves it untouched until the program exits, the Linux kernel cannot exclude this cold page from a VM’s Committed_AS even though it is clearly outside the VM’s working set. In contrast, the working set approach actively forces the VM OS to invoke its page reclamation mechanism to pinpoint and evict cold pages.

4 Performance Evaluation

In this paper, we report the results of a performance evaluation study of TWS-based memory ballooning. The test machine used in this study contains an Intel Core i7 quad-core processor with VT and EPT enabled and 16GB physical memory, and runs Xen-4.1 with 64-bit vanilla Linux 3.2.6 as the kernel. All the VMs in this study are configured with 1 virtual CPU and 2GB memory, and run the Linux 3.2.6 64-bit kernel with our kernel module for memory ballooning. This module runs as a kernel thread that wakes up every second to collect relevant information, such as Committed_AS, swapin count and refault count, and make adjustments to the balloon target.

To verify the effectiveness of the TWS-based ballooning algorithm, we first compared it with the self-ballooning mechanism in the Xen hypervisor. Then we compared it with the latest VMware ESXi 5.0 server.
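The once-per-second collection step mentioned above can be approximated from user space for experimentation. The sketch below is an assumption-laden illustration rather than the paper’s kernel module: it parses the text of /proc/meminfo (which reports Committed_AS in kB) and /proc/vmstat (whose pswpin counter counts pages swapped in). The refault count is left out because the text does not say how the module obtains it. The function names are hypothetical.

```python
def parse_counters(text):
    """Parse lines like 'Committed_AS:  524288 kB' or 'pswpin 10' into a dict."""
    counters = {}
    for line in text.splitlines():
        fields = line.replace(":", " ").split()
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def take_sample(meminfo_text, vmstat_text):
    """Extract the two counters of interest from the raw file contents."""
    meminfo = parse_counters(meminfo_text)
    vmstat = parse_counters(vmstat_text)
    return {
        "committed_as_kb": meminfo["Committed_AS"],
        "pswpin": vmstat["pswpin"],
    }


def epoch_delta(previous, current):
    """Return (did Committed_AS change?, pages swapped in during the epoch)."""
    changed = current["committed_as_kb"] != previous["committed_as_kb"]
    return changed, current["pswpin"] - previous["pswpin"]


def read_proc_sample():
    """Read a live sample; only works on a Linux system with /proc mounted."""
    with open("/proc/meminfo") as meminfo, open("/proc/vmstat") as vmstat:
        return take_sample(meminfo.read(), vmstat.read())
```

Calling read_proc_sample() once per epoch and feeding consecutive samples to epoch_delta() yields the inputs that a working set tracker needs, apart from the refault count.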
In the ESXi comparison, we used two identical test machines where one runs the Xen hypervisor with the TWS-based memory virtualization optimizations and the other runs the ESXi server. The memory given to each VM does not include anything owned by the hypervisor.

Benchmark    TWS Ballooning              Self Ballooning
Used         Degradation    Target       Degradation    Target
SPECweb      0%             263.3MB      0%             263.3MB
SPECcpu      3.08%          783.6MB      4.11%          922.6MB
OLTP         3.31%          350.8MB      17.99%         328.8MB

Table 1: Comparison between TWS-based ballooning and self-ballooning in terms of performance degradation and balloon target for the three benchmarks, SPECweb Banking, SPEC CPU 401 and OLTP. The performance degradation is calculated based on a comparison with the performance of the same VM that is configured with 2GB memory.

4.1 Effectiveness of TWS-based Ballooning

We evaluate the effectiveness of TWS-based ballooning by comparing the performance degradation and balloon target of a VM running a set of benchmark programs when TWS-based ballooning is used with those when Xen’s self-ballooning is used. The balloon target of a VM is the amount of physical memory that a memory ballooning scheme allocates to the VM. The performance degradation of a memory ballooning scheme is the performance difference between a benchmark program running in a VM whose physical memory allocation is controlled by the ballooning scheme in question and the same benchmark program running in a VM that is configured with and indeed given 2GB memory, or the Baseline configuration. The following three benchmark programs are used: SPECweb Banking [3] running against Apache [1], SPEC CPU, and OLTP from the Sysbench suite [4] running against MySQL [2].

Table 1 shows the performance degradation and balloon target comparison between TWS-based ballooning and self-ballooning for the three benchmark programs. The memory requirement of the SPECweb Banking benchmark is smaller than the minimum physical memory allocation to the test VM, 263.3MB. As a result, both TWS-based ballooning and self-ballooning produce the same balloon target, which is the same as the minimum physical memory allocation, and the benchmark program does not experience any performance degradation under either scheme when compared with the Baseline configuration.

For the SPEC CPU 401 benchmark, the
average balloon target of TWS-based ballooning is 15.07% (783.6MB vs. 922.6MB) smaller than that of self-ballooning, and yet the performance degradation of TWS-based ballooning is smaller than that of self-ballooning (3.08% vs. 4.11%). The superiority of TWS-based ballooning comes from the fact that the working set size it produces effectively removes pages that are allocated but unused, as shown by the gap between the two balloon target curves in Figure 2. Despite allocating a smaller amount of physical memory to the test VM, TWS-based ballooning degrades performance less than self-ballooning because it reacts faster to sudden changes in the VM’s demand, e.g. at time points 320 seconds, 460 seconds, and 630 seconds of Figure 2. During these transitions, TWS-based ballooning is able to allocate more physical memory than Committed_AS, and thus cuts down unnecessary swapin and refault events.

[Figure 2: The balloon targets produced by TWS-based ballooning and self-ballooning over time, and the resulting combined swapin and refault count over time under TWS-based ballooning, when the SPEC CPU 401 benchmark is used as the test workload.]

[Figure 3: The balloon targets produced by TWS-based ballooning and self-ballooning over time, and the resulting combined swapin and refault count over time under TWS-based ballooning, when the Sysbench OLTP benchmark is used as the test workload.]

Because the OLTP benchmark performs intensive disk I/O accesses and thus requires a larger page cache, Committed_AS is not an accurate estimate of the benchmark’s working set as it does not take into account the page cache. As a result, the average balloon target produced by TWS-based ballooning is 6.70% higher than that of self-ballooning, and justifiably so, because the performance degradation of TWS-based ballooning is only 3.31%, which is significantly smaller than the 17.99% of self-ballooning. As shown in Figure 3, TWS-based ballooning detects refault events and increases the test VM’s balloon target accordingly, and as a result produces a balloon target that is more in line with the VM’s working set size and more capable of reducing the performance overhead of memory ballooning to the minimum.

We also run two VMs, one with a constant working set size of 300MB and the other with a constant working set size of 1200MB, on the Xen hypervisor with TWS-based ballooning and on VMware’s ESXi 5.0. Each VM is configured with 2GB memory but given only 263.3MB at start-up time. After these two VMs start to run, it takes TWS-based ballooning 10 seconds to reach the ideal physical memory allocation, i.e., giving 300MB to the 300MB VM and giving 1200MB to the 1200MB VM. However, for the same set-up, it takes VMware ESXi 136 seconds to reach the same ideal physical memory allocation. VMware ESXi takes longer because it uses a sampling approach to probe a VM’s working set size.

5 Related Work

Standard operating systems estimate the active portion of the buffer cache or page cache by maintaining LRU-like statistics [19, 12, 5] to implement page replacement logic. Lu et al. [14] proposed to allocate a small portion of memory to each VM while leaving the remaining memory as an exclusive cache managed by the hypervisor. Thus, the memory accesses of VMs can be intercepted within the exclusive cache, and the LRU miss ratio curve [5] is derived to measure the working set size. Zhao et al. [24, 23] track the memory accesses of VMs by changing the user/supervisor privilege bit of guest page table entries to supervisor mode, so that all memory accesses of a VM are trapped because the VM runs in user mode. Similarly, the LRU miss ratio curve is also derived for working set size prediction.

To reduce the overhead of trapping memory accesses, the VMware ESX server [21] uses a sampling-based mechanism to predict the working set size of VMs. To perform the sampling, the ESX server periodically chooses a random set of memory pages, e.g., the default setting is to choose 100 pages every 60 seconds for each VM. However, this mechanism only gives a rough estimate of the VM’s working set size, and it cannot reflect a working set size exceeding the currently allocated memory.

When it comes to reclamation mechanisms, the Clock algorithm [9] is commonly used in guest OSs, and several research efforts [17, 22, 7, 11] aimed to estimate the working set size by monitoring the changes of the access bit in the hardware page table. This approach requires modifications to the guest OS. In contrast, our approach leverages the guest OS’s page reclamation mechanism and does not require any guest OS modifications.

6 Conclusion

Making efficient use of the physical memory available on a virtualized server is a key technical challenge for modern hypervisors. Possible solutions include memory de-duplication, which allows different VMs to share common pages, and memory ballooning, which reclaims unused pages from a VM when its physical memory allocation is larger than its working set size. This paper describes and evaluates techniques that exploit the knowledge of each VM’s working set to deliver more efficient memory ballooning. More concretely, the specific research contributions of this work are

  - A low-overhead active probing mechanism that could accurately sense the working set of each VM and track it dynamically,
  - An intelligent memory ballooning algorithm that could detect allocated but unused pages and reclaim them, and
  - [...]

Compared with VMware’s ESXi, which is a state-of-the-art hypervisor, the proposed working set estimation scheme is more accurate and more responsive to working set changes, but incurs a slight probing overhead; the proposed memory ballooning algorithm is able to quickly reclaim more memory pages without incurring additional performance penalty.

References

[1] Apache HTTP server project. http://httpd.apache.org/.
[2] MySQL: open source database server. http://www.mysql.com/.
[3] SPECweb2009. http://www.spec.org/web2009/.
[4] Sysbench: a system performance benchmark. http://sysbench.sourceforge.net/.
[5] Almasi, G., Cascaval, C., and Padua, D. A. Calculating stack distances efficiently. MSP ’02, ACM, pp. 37–43.
[6] Arcangeli, A., Eidus, I., and Wright, C. Increasing memory density by using KSM. Linux Symposium, 2009, pp. 19–28.
[7] Bansal, S., and Modha, D. S. CAR: Clock with adaptive replacement. FAST ’04, USENIX Association, pp. 187–200.
[8] Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., and Warfield, A. Xen and the art of virtualization, vol. 37. ACM, 2003, pp. 164–177.
[9] Corbato, F. J. A paging experiment with the Multics system. In Honor of P. M. Morse, MIT Press (1969), pp. 217–228.
[10] Gupta, D., Lee, S., Vrable, M., Savage, S., Snoeren, A. C., Varghese, G., Voelker, G. M., and Vahdat, A. Difference engine: Harnessing memory redundancy in virtual machines. OSDI ’08.
[11] Jiang, S., Chen, F., and Zhang, X. CLOCK-Pro: an effective improvement of the CLOCK replacement. ATEC ’05, USENIX Association, pp. 35–35.
[12] Jiang, S., and Zhang, X. LIRS: an efficient low inter-reference recency set replacement policy to improve buffer cache performance. SIGMETRICS ’02, ACM, pp. 31–42.
[13] Chiang, J.-H., Li, H.-L., and Chiueh, T.-c. Introspection-based memory de-duplication and migration. VEE ’13.
[14] Lu, P., and Shen, K. Virtual machine memory access tracing with hypervisor exclusive cache. USENIX ATC ’07, USENIX Association, pp. 3:1–3:15.
[15] Magenheimer, D. Add self-ballooning to balloon driver. Discussion on Xen development mailing list and personal communication, April 2008.
[16] Magenheimer, D. Transcendent Memory on Xen. Xen Summit, February 2009, p. 3.
[17] Mauerer, W. Professional Linux Kernel Architecture. Wrox Press Ltd., Birmingham, UK, 2008.
[18] Murray, D. G., Hand, S., and Fetterman, M. A. Satori: Enlightened page sharing. ATEC ’09.
[19] O’Neil, E. J., O’Neil, P. E., and Weikum, G. The LRU-K page replacement algorithm for database disk buffering. SIGMOD Rec. 22, 2 (June 1993), 297–306.
[20] Schopp, J. H., Fraser, K., and Silbermann, M. J. Resizing memory with balloons and hotplug. Linux Symposium 2 (2006), 313–319.
[21] Waldspurger, C. A. Memory resource management in VMware ESX server. SIGOPS Oper. Syst. Rev. 36 (2002), 181–194.
[22] Zhang, I., Garthwaite, A., Baskakov, Y., and Barr, K. C. Fast restore of checkpointed memory using working set estimation. SIGPLAN Not. 46, 7 (Mar. 2011), 87–98.
[23] Zhao, W., Jin, X., Wang, Z., Wang, X., Luo, Y., and Li, X. Low cost working set size tracking. USENIX ATC ’11, USENIX Association, pp. 17–17.
[24] Zhao, W., and Wang, Z. Dynamic memory balancing for virtual machines. In VEE ’09.

Note: VMware ESXi 5.0.0 build-623860.