
KVM/ARM: The Design and Implementation of the Linux ARM Hypervisor

Christoffer Dall
Department of Computer Science, Columbia University
cdall@cs.columbia.edu

Jason Nieh
Department of Computer Science, Columbia University
nieh@cs.columbia.edu

Abstract

As ARM CPUs become increasingly common in mobile devices and servers, there is a growing demand for providing the benefits of virtualization for ARM-based devices. We present our experiences building the Linux ARM hypervisor, KVM/ARM, the first full system ARM virtualization solution that can run

[...]

kernel. For example, standard OS mechanisms in Linux would have to be significantly redesigned to run in Hyp mode. Split-mode virtualization makes it possible to take advantage of the benefits of a hosted hypervisor design by running the hypervisor in normal privileged CPU modes to leverage existing OS mechanisms without modification while at the same time still using Hyp mode to leverage ARM hardware virtualization features.

Second, we designed and implemented KVM/ARM from the ground up as an open source project that would be easy to maintain and integrate into the Linux kernel. This is especially important in the context of ARM systems which lack standard ways to integrate hardware components, features for hardware discovery such as a standard BIOS or PCI bus, and standard mechanisms for installing low-level software. A standalone bare metal hypervisor would need to be ported to each and every supported hardware platform, a huge maintenance and development burden. Linux, however, is supported across almost all ARM platforms and by integrating KVM/ARM with Linux, KVM/ARM is automatically available on any device running a recent version of the Linux kernel. By using split-mode virtualization, we can leverage the existing Kernel-based Virtual Machine (KVM) [22] hypervisor interface in Linux and can reuse substantial pieces of existing kernel code and interfaces to reduce implementation complexity. KVM/ARM requires adding less than 6,000 lines of ARM code to Linux, a much smaller code base to maintain than standalone hypervisors. KVM/ARM was accepted as the ARM hypervisor of the mainline Linux kernel as of the Linux 3.9 kernel, ensuring its wide adoption and use given the dominance of Linux on ARM platforms. Based on our open source experiences, we offer some useful hints on transferring research ideas into implementations likely to be adopted by the open source community.

Third, we demonstrate the effectiveness of KVM/ARM on real multicore ARM hardware. Our results are the first measurements of a hypervisor using ARM virtualization support on real hardware. We compare against the standard widely-used Linux KVM x86 hypervisor and evaluate its performance overhead for running application workloads in virtual machines (VMs) versus native non-virtualized execution. Our results show that KVM/ARM achieves comparable performance overhead in most cases, and significantly lower performance overhead for two important applications, Apache and MySQL, on multicore platforms. These results provide the first comparison of ARM and x86 virtualization extensions on real hardware to quantitatively demonstrate how the different design choices affect virtualization performance. We show that KVM/ARM also provides power efficiency benefits over Linux KVM x86.

Finally, we make several recommendations regarding future hardware support for virtualization based on our experiences building and evaluating a complete ARM hypervisor. We identify features that are important and helpful to reduce the software complexity of hypervisor implementations, and discuss mechanisms useful to maximize hypervisor performance, especially in the context of multicore systems.

This paper describes the design and implementation of KVM/ARM. Section 2 presents an overview of the ARM virtualization extensions and a comparison with x86. Section 3 describes the design of the KVM/ARM hypervisor. Section 4 discusses the implementation of KVM/ARM and our experiences releasing it to the Linux community and having it adopted into the mainline Linux kernel. Section 5 presents experimental results quantifying the performance and energy efficiency of KVM/ARM, as well as a quantitative comparison of real ARM and x86 virtualization hardware. Section 6 makes recommendations for designing future hardware virtualization support. Section 7 discusses related work. Finally, we present some concluding remarks.

2. ARM Virtualization Extensions

Because the ARM architecture is not classically virtualizable [27], ARM introduced hardware virtualization support as an optional extension in the latest ARMv7 and ARMv8 architectures. For example, the Cortex-A15 is a current ARMv7 CPU with hardware virtualization support. We present a brief overview of the ARM virtualization extensions.

CPU Virtualization. Figure 1 shows the CPU modes on the ARMv7 architecture, including TrustZone (Security Extensions) and a new CPU mode called Hyp mode. TrustZone splits the modes into two worlds, secure and non-secure, which are orthogonal to the CPU modes. A special mode, monitor mode, is provided to switch between the secure and non-secure worlds. Although ARM CPUs always power up starting in the secure world, ARM bootloaders typically transition to the non-secure world at an early stage. The secure world is only used for specialized use cases such as digital rights management. TrustZone may appear useful for virtualization by using the secure world for hypervisor execution, but this does not work because trap-and-emulate is not supported. There is no means to trap operations executed in the non-secure world to the secure world. Non-secure software can therefore freely configure, for example, virtual memory. Any software running at the highest non-secure privilege level therefore has access to all non-secure physical memory, making it impossible to isolate multiple VMs running in the non-secure world.

Figure 1: ARMv7 Processor Modes

Hyp mode was introduced as a trap-and-emulate mechanism to support virtualization in the non-secure world. Hyp mode is a
CPU mode that is strictly more privileged than other CPU modes, user and kernel modes. Software running in Hyp mode can configure the hardware to trap from kernel mode into Hyp mode on various sensitive instructions and hardware interrupts. To run VMs, the hypervisor must at least partially reside in Hyp mode. The VM will execute normally in user and kernel mode until some condition is reached that requires intervention of the hypervisor. At this point, the hardware traps into Hyp mode giving control to the hypervisor, which can then manage the hardware and provide the required isolation across VMs. Once the condition is processed by the hypervisor, the CPU can be switched back into user or kernel mode and the VM can continue executing.

The ARM architecture allows each trap to be configured to trap directly into a VM's kernel mode instead of going through Hyp mode. For example, traps caused by system calls or page faults from user mode can be configured to trap to a VM's kernel mode directly so that they are handled by the guest OS without intervention of the hypervisor. This avoids going to Hyp mode on each system call or page fault, reducing virtualization overhead. Additionally, all traps into Hyp mode can be disabled and a single non-virtualized kernel can run in kernel mode and have complete control of the system.

ARM designed the virtualization support around a separate CPU mode distinct from existing kernel mode, because they envisioned a standalone hypervisor underneath a more complex rich OS kernel [14]. They wanted to make it simpler for hypervisor developers to implement the hypervisors, and therefore reduced the number of control registers available in Hyp mode compared to kernel mode. Similarly, they mandated certain bits to be set in the page table entries, because they did not envision a hypervisor sharing page tables with software running in user space, which is for example what the Linux kernel does with kernel mode.

Memory Virtualization. ARM also provides hardware support to virtualize physical memory. When running a VM, the physical addresses managed by the VM are actually Intermediate Physical Addresses (IPAs), also known as guest physical addresses, and need to be translated into physical addresses (PAs), also known as host physical addresses. Similarly to nested page tables on x86, ARM provides a second set of page tables,
Stage-2 page tables, which translate from IPAs to PAs corresponding to guest and host physical addresses, respectively. Stage-2 translation can be completely disabled and enabled from Hyp mode. Stage-2 page tables use ARM's new LPAE page table format, with subtle differences from the page tables used by kernel mode.

Interrupt Virtualization. ARM defines the Generic Interrupt Controller (GIC) architecture. The GIC routes interrupts from devices to CPUs and CPUs query the GIC to discover the source of an interrupt. The GIC is especially important in multicore configurations, because it is used to generate Inter-Processor Interrupts (IPIs) from one CPU core to another. The GIC is split in two parts, the distributor and the CPU interfaces. There is only one distributor in a system, but each CPU core has a GIC CPU interface. Both the CPU interfaces and the distributor are accessed over a Memory-Mapped interface (MMIO). The distributor is used to configure the GIC, for example, to configure the CPU core affinity of an interrupt, to completely enable or disable interrupts on a system, or to send an IPI to another CPU core. The CPU interface is used to acknowledge (ACK) and to signal End-Of-Interrupt (EOI). For example, when a CPU core receives an interrupt, it will read a special register on the GIC CPU interface, which ACKs the interrupt and returns the number of the interrupt. The interrupt will not be raised to the CPU again before the CPU writes to the EOI register of the CPU interface with the value retrieved from the ACK register.

Interrupts can be configured to trap to either Hyp or kernel mode. Trapping all interrupts to kernel mode and letting OS software running in kernel mode handle them directly is efficient, but does not work in the context of VMs, because the hypervisor loses control over the hardware. Trapping all interrupts to Hyp mode ensures that the hypervisor retains control, but requires emulating virtual interrupts in software to signal events to VMs. This is cumbersome to manage and expensive because each step of interrupt and virtual interrupt processing, such as ACKing and EOIing, must go through the hypervisor.

The GIC v2.0 includes hardware virtualization support in the form of a virtual GIC (VGIC) so that receiving virtual interrupts does not need to be emulated in software by the hypervisor. The VGIC introduces a VGIC CPU interface for each CPU and a corresponding hypervisor control interface for each CPU. VMs are configured to see the VGIC CPU interface instead of the GIC CPU interface. Virtual interrupts are generated by writing to special registers, the list registers, in the VGIC hypervisor control interface, and the VGIC CPU interface raises the virtual interrupts directly to a VM's kernel mode. Because the VGIC CPU interface includes support for ACK and EOI, these operations no longer need to trap to the hypervisor to be emulated in software, reducing overhead for receiving interrupts on a CPU. For example, emulated virtual devices typically raise virtual interrupts through a software API to the hypervisor, which can leverage the VGIC by writing the virtual interrupt number for the emulated device into the list registers. This causes the VGIC to interrupt the VM directly to kernel mode and lets the guest OS ACK and EOI the virtual interrupt without trapping to the hypervisor. Note that the distributor must still be emulated in software and all accesses to the distributor by a VM must still trap to the hypervisor. For example, when a virtual CPU sends a virtual IPI to another virtual CPU, this will cause a trap to the hypervisor, which emulates the distributor access in software and programs the list registers on the receiving CPU's GIC hypervisor control interface.

Timer Virtualization. ARM defines the Generic Timer Architecture which includes support for timer virtualization. Generic timers provide a counter that measures passing of time in real-time, and a timer for each CPU, which is programmed to raise an interrupt to the CPU after a certain amount of time has passed. Timers are likely to be used by both hypervisors and guest OSes, but to provide isolation and retain control, the timers used by the hypervisor cannot be directly configured and manipulated by guest OSes. Such timer accesses from a guest OS would need to
trap to Hyp mode, incurring additional overhead for a relatively frequent operation for some workloads. Hypervisors may also wish to virtualize VM time, which is problematic if VMs have direct access to counter hardware.

ARM provides virtualization support for the timers by introducing a new counter, the virtual counter, and a new timer, the virtual timer. A hypervisor can be configured to use physical timers while VMs are configured to use virtual timers. VMs can then access, program, and cancel virtual timers without causing traps to Hyp mode. Access to the physical timer and counter from kernel mode is controlled from Hyp mode, but software running in kernel mode always has access to the virtual timers and counters.

Comparison with x86. There are a number of similarities and differences between the ARM virtualization extensions and hardware virtualization support for x86 from Intel and AMD. Intel and AMD extensions are very similar, so we limit our comparison to ARM and Intel. ARM supports virtualization through a separate CPU mode, Hyp mode, which is a separate and strictly more privileged CPU mode than previous user and kernel modes. In contrast, Intel has root and non-root mode [20], which are orthogonal to the CPU protection modes. While sensitive operations on ARM trap to Hyp mode, sensitive operations can trap from non-root mode to root mode while staying in the same protection level on Intel. A crucial difference between the two hardware designs is that Intel's root mode supports the same full range of user and kernel mode functionality as its non-root mode, whereas ARM's Hyp mode is a strictly different CPU mode with its own set of features. A hypervisor using ARM's Hyp mode has an arguably simpler set of features to use than the more complex options available with Intel's root mode.

Both ARM and Intel trap into their respective Hyp and root modes, but Intel provides specific hardware support for a VM control block which is automatically saved and restored when switching to and from root mode using only a single instruction. This is used to automatically save and restore guest state when switching between guest and hypervisor execution. In contrast, ARM provides no such hardware support and any state that needs to be saved and restored must be handled explicitly in software. This provides some flexibility in what is saved and restored in switching to and from Hyp mode. For example, trapping to ARM's Hyp mode is potentially faster than trapping to Intel's root mode if there is no additional state to save.

ARM and Intel are quite similar in their support for virtualizing physical memory. Both introduce an additional set of page tables to translate guest to host physical addresses. ARM benefited from hindsight in including Stage-2 translation whereas Intel did not include its equivalent Extended Page Table (EPT) support until its second generation virtualization hardware.

ARM's support for virtual timers has no real x86 counterpart, and until the recent introduction of Intel's virtual APIC support [20], ARM's support for virtual interrupts also had no x86 counterpart. Without virtual APIC support, EOIing interrupts in an x86 VM requires traps to root mode, whereas ARM's virtual GIC avoids the cost of trapping to Hyp mode for those interrupt handling mechanisms. Executing similar timer functionality by a guest OS on x86 will incur additional traps to root mode compared to the number of traps to Hyp mode required for ARM. Reading a counter, however, is not a privileged operation on x86 and does not trap, even without virtualization support in the counter hardware.

3. Hypervisor Architecture

Instead of reinventing and reimplementing complex core functionality in the hypervisor, and potentially introducing tricky and fatal bugs along the way, KVM/ARM builds on KVM and leverages existing infrastructure in the Linux kernel. While a standalone bare metal hypervisor design approach has the potential for better performance and a smaller Trusted Computing Base (TCB), this approach is less practical on ARM. ARM hardware is in many ways much more diverse than x86. Hardware components are often tightly integrated in ARM devices in non-standard ways by different device manufacturers. ARM hardware lacks features for hardware discovery such as a standard BIOS or a PCI bus, and there is no established mechanism for installing low-level software on a wide variety of ARM platforms. Linux, however, is supported across almost all ARM platforms and by integrating KVM/ARM with Linux, KVM/ARM is automatically available on any device running a recent version of the Linux kernel. This is in contrast to bare metal approaches such as Xen [32], which must actively support every platform on which they wish to install the Xen hypervisor. For example, for every new SoC that Xen needs to support, the developers must implement a new serial device driver in the core Xen hypervisor.

While KVM/ARM benefits from its integration with Linux in terms of portability and hardware support, a key problem we had to address was that the ARM hardware virtualization extensions were designed to support a standalone hypervisor design where the hypervisor is completely separate from any standard kernel functionality, as discussed in Section 2. In the following, we describe how KVM/ARM's novel design makes it possible to benefit from integration with an existing kernel and at the same time take advantage of the hardware virtualization features.

3.1 Split-mode Virtualization

Simply running a hypervisor entirely in ARM's Hyp mode is attractive since it is the most privileged level. However, since KVM/ARM leverages existing kernel infrastructure such as the scheduler, running KVM/ARM in Hyp mode implies running the Linux kernel in Hyp mode. This is problematic for at least two reasons. First, low-level architecture dependent code in Linux is written to work in kernel mode, and would not run unmodified in Hyp mode, because Hyp mode is a completely different CPU mode from normal kernel mode. The significant changes that would be required to run the kernel in Hyp mode would be very unlikely to be accepted by the Linux kernel community. More importantly, to preserve compatibility with hardware without Hyp mode and to run Linux as a guest OS, low-level code would have to be written to work in both modes, potentially resulting
in slow and convoluted code paths. As a simple example, a page fault handler needs to obtain the virtual address causing the page fault. In Hyp mode this address is stored in a different register than in kernel mode.

Second, running the entire kernel in Hyp mode would adversely affect native performance. For example, Hyp mode has its own separate address space. Whereas kernel mode uses two page table base registers to provide the familiar 3GB/1GB split between user address space and kernel address space, Hyp mode uses a single page table register and therefore cannot have direct access to the user space portion of the address space. Frequently used functions to access user memory would require the kernel to explicitly map user space data into kernel address space and subsequently perform necessary teardown and TLB maintenance operations, resulting in poor native performance on ARM.

These problems with running a Linux hypervisor using ARM Hyp mode do not occur for x86 hardware virtualization. x86 root mode is orthogonal to its CPU privilege modes. The entire Linux kernel can run in root mode as a hypervisor because the same set of CPU modes available in non-root mode are available in root mode. Nevertheless, given the widespread use of ARM and the advantages of Linux on ARM, finding an efficient virtualization solution for ARM that can leverage Linux and take advantage of the hardware virtualization support is of crucial importance.

KVM/ARM introduces split-mode virtualization, a new approach to hypervisor design that splits the core hypervisor so that it runs across different privileged CPU modes to take advantage of the specific benefits and functionality offered by each CPU mode. KVM/ARM uses split-mode virtualization to leverage the ARM hardware virtualization support enabled by Hyp mode, while at the same time leveraging existing Linux kernel services running in kernel mode. Split-mode virtualization allows KVM/ARM to be integrated with the Linux kernel without intrusive modifications to the existing code base.

This is done by splitting the hypervisor into two components, the lowvisor and the highvisor, as shown in Figure 2. The lowvisor is designed to take advantage of the hardware virtualization support available in Hyp mode to provide three key functions. First, the lowvisor sets up the correct execution context by appropriate configuration of the hardware, and enforces protection and isolation between different execution contexts. The lowvisor directly interacts with hardware protection features and is therefore highly critical and the code base is kept to an absolute minimum. Second, the lowvisor switches from a VM execution context to the host execution context and vice-versa. The host execution context is used to run the hypervisor and the host Linux kernel. We refer to an execution context as a world, and switching from one world to another as a world switch, because the entire state of the system is changed. Since the lowvisor is the only component that runs in Hyp mode, only it can be responsible for the hardware reconfiguration necessary to perform a world switch. Third, the lowvisor provides a virtualization trap handler, which handles interrupts and exceptions that must trap to the hypervisor. The lowvisor performs only the minimal amount of processing required and defers the bulk of the work to be done to the highvisor after a world switch to the highvisor is complete.

Figure 2: KVM/ARM System Architecture

The highvisor runs in kernel mode as part of the host Linux kernel. It can therefore directly leverage existing Linux functionality such as the scheduler, and can make use of standard kernel software data structures and mechanisms to implement its functionality, such as locking mechanisms and memory allocation functions. This makes higher-level functionality easier to implement in the highvisor. For example, while the lowvisor provides a low-level trap-handler and the low-level mechanism to switch from one world to another, the highvisor handles Stage-2 page faults from the VM and performs instruction emulation. Note that parts of the VM run in kernel mode, just like the highvisor, but with Stage-2 translation and trapping to Hyp mode enabled.

Because the hypervisor is split across kernel mode and Hyp mode, switching between a VM and the highvisor involves multiple mode transitions. A trap to the highvisor while running the VM will first trap to the lowvisor running in Hyp mode. The lowvisor will then cause another trap to run the highvisor. Similarly, going from the highvisor to a VM requires trapping from kernel mode to Hyp mode, and then switching to the VM. As a result, split-mode virtualization incurs a double trap cost in switching to and from the highvisor. On ARM, the only way to perform these mode transitions to and from Hyp mode is by trapping. However, as shown in Section 5, this extra trap is not a significant performance cost on ARM.

KVM/ARM uses a memory mapped interface to share data between the highvisor and lowvisor as necessary. Because memory management can be complex, we leverage the highvisor's ability to use the existing memory management subsystem in Linux to manage memory for both the highvisor and lowvisor. Managing the lowvisor's memory involves additional challenges though, because it requires managing Hyp mode's separate address space. One simplistic approach would be to reuse the host kernel's page tables and also use them in Hyp mode to make the address spaces identical. This unfortunately does not work, because Hyp mode uses a different page table format from kernel mode. Therefore, the highvisor explicitly manages the Hyp mode page tables to map any code executed in Hyp mode and any data structures shared between the highvisor and the lowvisor to the same virtual addresses in Hyp mode and in kernel mode.

Action            Nr.  State
Context Switch    38   General Purpose (GP) Registers
                  26   Control Registers
                  16   VGIC Control Registers
                  4    VGIC List Registers
                  2    Arch. Timer Control Registers
                  32   64-bit VFP registers
                  4    32-bit VFP Control Registers
Trap-and-Emulate  -    CP14 Trace Registers
                  -    WFI Instructions
                  -    SMC Instructions
                  -    ACTLR Access
                  -    Cache ops. by Set/Way
                  -    L2CTLR / L2ECTLR Registers
Table 1: VM and Host State on a Cortex-A15

3.2 CPU Virtualization

To virtualize the CPU, KVM/ARM must present an interface to the VM which is essentially identical to the underlying real hardware CPU, while ensuring that the hypervisor remains in control of the hardware. This involves ensuring that software running in the VM has persistent access to the same register state as software running on the physical CPU, as well as ensuring that physical hardware state associated with the hypervisor and its host kernel is persistent across running VMs. Register state not affecting VM isolation can simply be context switched by saving the VM state and restoring the host state from memory when switching from a VM to the host and vice versa. KVM/ARM configures access to all other sensitive state to trap to Hyp mode, so it can be emulated by the hypervisor.

Table 1 shows the CPU register state visible to software running in kernel and user mode, and KVM/ARM's virtualization method for each register group. The lowvisor has its own dedicated configuration registers only for use in Hyp mode, which are not shown in Table 1. KVM/ARM context switches registers during world-switches whenever the hardware supports it, because it allows the VM direct access to the hardware. For example, the VM can directly program the Stage-1 page table base register without trapping to the hypervisor, a fairly common operation in most guest OSes. KVM/ARM performs trap and emulate on sensitive instructions and when accessing hardware state that could affect the hypervisor or would leak information about the hardware to the VM that violates its virtualized abstraction. For example, KVM/ARM traps if a VM executes the WFI instruction, which causes the CPU to power down, because such an operation should only be performed by the hypervisor to maintain control of the hardware. KVM/ARM defers switching certain register state until absolutely necessary, which slightly improves performance under certain workloads.

The difference between running inside a VM in kernel or user mode and running the hypervisor in kernel or user mode is determined by how the virtualization extensions have been configured by Hyp mode during the world switch. A world switch from the host to a VM performs the following actions: (1) store all host GP registers on the Hyp stack, (2) configure the VGIC for the VM,
(3) configure the timers for the VM, (4) save all host-specific configuration registers onto the Hyp stack, (5) load the VM's configuration registers onto the hardware, which can be done without affecting current execution, because Hyp mode uses its own configuration registers, separate from the host state, (6) configure Hyp mode to trap floating-point operations for lazy context switching, trap interrupts, trap CPU halt instructions (WFI/WFE), trap SMC instructions, trap specific configuration register accesses, and trap debug register accesses, (7) write VM-specific IDs into shadow ID registers, (8) set the Stage-2 page table base register (VTTBR) and enable Stage-2 address translation, (9) restore all guest GP registers, and (10) trap into either user or kernel mode.

The CPU will stay in the VM world until an event occurs, which triggers a trap into Hyp mode. Such an event can be caused by any of the traps mentioned above, a Stage-2 page fault, or a hardware interrupt. Since the event requires services from the highvisor, either to emulate the expected hardware behavior for the VM or to service a device interrupt, KVM/ARM must perform another world switch back into the highvisor and its host. The world switch back to the host from a VM performs the following actions: (1) store all VM GP registers, (2) disable Stage-2 translation, (3) configure Hyp mode to not trap any register access or instructions, (4) save all VM-specific configuration registers, (5) load the host's configuration registers onto the hardware, (6) configure the timers for the host, (7) save VM-specific VGIC state, (8) restore all host GP registers, and (9) trap into kernel mode.

3.3 Memory Virtualization

KVM/ARM provides memory virtualization by enabling Stage-2 translation for all memory accesses when running in a VM. Stage-2 translation can only be configured in Hyp mode, and its use is completely transparent to the VM. The highvisor manages the Stage-2 translation page tables to only allow access to memory specifically allocated for a VM; other accesses will cause Stage-2 page faults which trap to the hypervisor. This mechanism ensures that a VM cannot access memory belonging to the hypervisor or other VMs, including any sensitive data. Stage-2 translation is disabled when running in the highvisor and lowvisor because the highvisor has full control of the complete system and directly manages the host physical addresses. When the hypervisor performs a world switch to a VM, it enables Stage-2 translation and configures the Stage-2 page table base register accordingly. Although both the highvisor and VMs share the same CPU modes, Stage-2 translations ensure that the highvisor is protected from any access by the VMs.

KVM/ARM uses split-mode virtualization to leverage existing kernel memory allocation, page reference counting, and page table manipulation code. KVM/ARM handles Stage-2 page faults by considering the IPA of the fault, and if that address belongs to normal memory in the VM memory map, KVM/ARM allocates a page for the VM by simply calling an existing kernel function, such as get_user_pages, and maps the allocated page to the VM in the Stage-2 page tables. In comparison, a bare metal hypervisor would be forced to either statically allocate memory to VMs or write an entire new memory allocation subsystem.

3.4 I/O Virtualization

KVM/ARM leverages existing QEMU and Virtio [29] user space device emulation to provide I/O virtualization. At a hardware level, all I/O mechanisms on the ARM architecture are based on load/store operations to MMIO device regions. With the exception of devices directly assigned to VMs, all hardware MMIO regions are inaccessible from VMs. KVM/ARM uses Stage-2 translations to ensure that physical devices cannot be accessed directly from VMs. Any access outside of RAM regions allocated for the VM will trap to the hypervisor, which can route the access to a specific emulated device in QEMU based on the fault address. This is somewhat different from x86, which uses x86-specific hardware instructions such as inl and outl for port I/O operations in addition to MMIO. As we show in Section 5, KVM/ARM achieves low I/O performance overhead with very little implementation effort.

3.5 Interrupt Virtualization

KVM/ARM leverages its tight integration with Linux to reuse existing device drivers and related functionality, including handling interrupts. When running in a VM, KVM/ARM configures the CPU to trap all hardware interrupts to Hyp mode. On each interrupt, it performs a world switch to the highvisor and the host handles the interrupt, so that the hypervisor remains in complete control of hardware resources. When running in the host and the highvisor, interrupts trap directly to kernel mode, avoiding the overhead of going through Hyp mode. In both cases, all hardware interrupt processing is done in the host by reusing Linux's existing interrupt handling functionality.

However, VMs must receive notifications in the form of virtual interrupts from emulated devices and multicore guest OSes must be able to send virtual IPIs from one virtual core to another. KVM/ARM uses the VGIC to inject virtual interrupts into VMs to reduce the number of traps to Hyp mode. As described in Section 2, virtual interrupts are raised to virtual CPUs by programming the list registers in the VGIC hypervisor CPU control interface. KVM/ARM configures the Stage-2 page tables to prevent VMs from accessing the control interface and to allow access only to the VGIC virtual CPU interface, ensuring that only the hypervisor can program the control interface and that the VM can access the VGIC virtual CPU interface directly. However, guest OSes will still attempt to access a GIC distributor to configure the GIC and to send IPIs from one virtual core to another. Such accesses will trap to the hypervisor and the hypervisor must emulate the distributor.

KVM/ARM introduces the virtual distributor, a software model of the GIC distributor as part of the highvisor. The virtual distributor exposes an interface to user space, so emulated devices in user space can raise virtual interrupts to the virtual distributor, and exposes an MMIO interface to the VM identical to that of the physical GIC distributor. The virtual distributor keeps internal software state about the state of each interrupt and uses this state whenever a VM is scheduled, to program the list registers to inject virtual interrupts. For example, if virtual CPU0 sends an IPI to virtual CPU1, the distributor will program the list registers for virtual CPU1 to raise a virtual IPI interrupt the next time virtual CPU1 runs.

Ideally, the virtual distributor only accesses the hardware list registers when necessary, since device MMIO operations are typically significantly slower than cached memory accesses. A complete context switch of the list registers is required when scheduling a different VM to run on a physical core, but not necessarily required when simply switching between a VM and the hypervisor. For example, if there are no pending virtual interrupts, it is not necessary to access any of the list registers. Note that once the hypervisor writes a virtual interrupt to a list register when switching to a VM, it must also read the list register back when switching back to the hypervisor, because the list register describes the state of the virtual interrupt and indicates, for example, if the VM has ACKed the virtual interrupt. The initial unoptimized version of KVM/ARM uses a simplified approach which completely context switches all VGIC state including the list registers on each world switch.

3.6 Timer Virtualization

Reading counters and programming timers are frequent operations in many OSes for process scheduling and to regularly poll device state. For example, Linux reads a counter to determine if a process has expired its time slice, and programs timers to ensure that processes don't exceed their allowed time slices. Application workloads also often leverage timers for various reasons. Trapping to the hypervisor for each such operation is likely to incur noticeable performance overheads, and allowing a VM direct access to the time-keeping hardware typically implies giving up timing control of the hardware resources as VMs can disable timers and control the CPU for extended periods of time.

KVM/ARM leverages ARM's hardware virtualization features of the generic timers to allow VMs direct access to reading counters and programming timers without trapping to Hyp mode while at the same time ensuring the hypervisor remains in control of the hardware. Since access to the physical timers is controlled using Hyp mode, any software controlling Hyp mode has access to the physical timers. KVM/ARM maintains hardware control by using the physical timers in the hypervisor and disallowing access to physical timers from the VM. The Linux kernel running as a guest OS only accesses the virtual timer and can therefore directly access timer hardware without trapping to the hypervisor.

Unfortunately, due to architectural limitations, the virtual timers cannot directly raise virtual interrupts, but always raise hardware interrupts, which trap to the hypervisor. KVM/ARM detects when a VM virtual timer expires, and injects a corresponding virtual interrupt to the VM, performing all hardware ACK and EOI operations in the highvisor. The hardware only provides a single virtual timer per physical CPU, and multiple virtual CPUs may be multiplexed across this single hardware instance. To support virtual timers in this scenario, KVM/ARM detects unexpired timers when a VM traps to the hypervisor and leverages existing OS functionality to program a software timer at the time when the virtual timer would have otherwise fired, had the VM been left running. When such a software timer fires, a callback function is executed, which raises a virtual timer interrupt to the VM using the virtual distributor described above.

4. Implementation and Adoption

We have successfully integrated our work into the Linux kernel and KVM/ARM is now the standard ARM hypervisor on Linux platforms, as it is included in every kernel beginning with version 3.9. Its relative simplicity and rapid completion were facilitated by specific design choices that allow it to leverage substantial existing infrastructure despite differences in the underlying hardware. We share some lessons we learned from our experiences in hopes that they may be helpful to others in getting research ideas widely adopted by the open source community.

Code maintainability is key. It is a common misconception that a research software implementation providing potential improvements or interesting new features can simply be open sourced and thereby quickly integrated by the open source community. An important point that is often not taken into account is that any implementation must be maintained. If an implementation requires many people and much effort to be maintained, it is much less likely to be integrated into existing open source code bases. Because maintainability is so crucial, reusing code and interfaces is important. For example, KVM/ARM builds on existing infrastructure such as KVM and QEMU, and from the very start we prioritized addressing code review comments to make our code suitable for integration into existing systems. An unexpected but important benefit of this decision was that we could leverage the community for help to solve hard bugs or understand intricate parts of the ARM architecture.

Be a known contributor. Convincing maintainers to integrate code is not just about the code itself, but also about who submits it. It is not unusual for researchers to complain about kernel maintainers not accepting their code into Linux only to have some known kernel developer submit the sa
meideaandhaveitaccepted.Thereasonisanissueoftrust.Establishingtrustisacatch-22:onemustbewell-knowntosubmitcode,yetonecannotbecomeknownwithoutsubmittingcode.Onewaytodothisistostartsmall.Aspartofourwork,wealsomadevarioussmallchangestoKVMtopreparesupportforARM,whichincludedcleaningupexistingcodetobemoregenericandimprovingcrossplatformsupport.TheKVMmaintainersweregladtoacceptthesesmallimprovements,whichgeneratedgoodwillandhelpedusbecomeknowntotheKVMcommunity.Makefriendsandinvolvethecommunity.Opensourcede-velopmentturnsouttobequiteasocialenterprise.Networkingwiththecommunityhelpstremendously,notjustonline,butinpersonatconferencesandothervenues.Forexample,atanearlystageinthedevelopmentofKVM/ARM,wetraveledtoARMheadquartersinCambridge,UKtoestablishcontactwithbothARMmanagementandtheARMkernelengineeringteam,whobothcontributedtoourefforts.Asanotherexample,animportantissueinintegratingKVM/ARMintothekernelwasagreeingonvariousinterfacesforARMvirtualization,suchasreadingandwritingcontrolregisters.Sinceitisanestablishedpolicytoneverbreakreleasedinterfacesandcompatibilitywithuserspaceapplications,exist-inginterfacescannotbechanged,andthecommunityputsgreateffortintodesigningextensibleandreusableinterfaces.Decidingontheappropriatenessofaninterfaceisajudgmentcallandnotanexactscience.Wewerefortunateenoughtoreceivehelpfromwell-knownkerneldeveloperssuchasRustyRussell,whohelpedusdriveboththeimplementationandcommunicationaboutourinterfaces,specificallyforuserspacesaveandrestoreofregisters,afeatureusefulforbothdebuggingandVMmigration.WorkingwithanestablisheddeveloperlikeRustywasatremendoushelpbecausewebenefitedfrombothhisexperienceandstrongvoiceinthekernelcommunity.Involvethecommunityearly.Animportantissueindevelop-ingKVM/ARMwashowtogetaccesstoHypmodeacrosstheplethoraofavailableARMSoCplatformssupportedbyLinux.OneapproachwouldbetoinitializeandconfigureHypmodewhenKVMisinitialized,whichwouldisolatethecodechangestotheKVMsubsystem.However,becausegettingintoHypmodefromthekernelinvolvesatrap,earlys
tagebootloadermusthavealreadyinstalledcodeinHypmodetohandlethetrapandallowKVMtorun.Ifnosuchtraphandlerwasinstalled,trappingtoHypmodecouldendupcrashingthekernel.WeworkedwiththekernelcommunitytodefinetherightABIbe-tweenKVMandthebootloader,butsoonlearnedthatagreeingonABIswithSoCvendorshadhistoricallybeendifficult.IncollaborationwithARMandtheopensourcecommunity,wereachedtheconclusionthatifwesimplyrequiredthekerneltobebootedinHypmode,wewouldnothavetorelyonfragileABIs.ThekernelthensimplytestswhenitstartsupwhetheritisinHypmode,inwhichcaseitinstallsatraphandlertoprovideahooktore-enterHypmodeatalaterstage.Asmallamountofcodemustbeaddedtothekernelbootprocedure,buttheresultisamuchcleanerandrobustmechanism.IfthebootloaderisHypmodeunawareandthekerneldoesnotbootupinHypmode,KVM/ARMwilldetectthisandwillsimplyremaindisabled.ThissolutionavoidstheneedtodesignanewABIanditturnedoutthatlegacykernelswouldstillwork,becausetheyalwaysmakeanexplicitswitchintokernelmodeastheirfirstinstruction.ThesechangesweremergedintothemainlineLinux3.6kernel,andofficialARMkernelbootrecommendationsweremodifiedtorecommendthatallbootloadersbootthekernelinHypmodetotakeadvantageofthenewarchitecturefeatures.Knowthechainofcommand.ThereweremultiplepossibleupstreampathsforKVM/ARM.Historically,otherarchitecturessupportedbyKVMsuchasx86andPowerPCweremergedthroughtheKVMtreedirectlyintoLinusTorvalds'treewiththeappropriateapprovaloftherespectivearchitecturemaintain-ers.KVM/ARM,however,requiredafewminorchangesto 340 
ARM-specific header files and the idmap subsystem, and it was therefore not clear whether the code would be integrated via the KVM tree with approval from the ARM kernel maintainer or via the ARM kernel tree. Russell King is the ARM kernel maintainer, and Linus pulls directly from his ARM kernel tree for ARM-related code. The situation was particularly interesting, because Russell King did not want to merge virtualization support in the mainline kernel [24] and he did not review our code. At the same time, the KVM community was quite interested in integrating our code, but could not do so without approval from the ARM maintainers, and Russell King refused to engage in a discussion about this procedure.

Be persistent. While we were trying to merge our code into Linux, a lot of changes were happening around Linux ARM support in general. The amount of churn in SoC support code was becoming an increasingly big problem for maintainers, and much work was underway to reduce board specific code and support a single ARM kernel binary bootable across multiple SoCs. In light of these ongoing changes, getting enough time from ARM kernel maintainers to review the code was challenging, and there was extra pressure on the maintainers to be highly critical of any new code merged into the ARM tree. We had no choice but to keep maintaining and improving the code, and regularly send out updated patch series that followed upstream kernel changes. Eventually Will Deacon, one of the ARM maintainers, made time for several comprehensive and helpful reviews, and after addressing his concerns, he gave us his approval of the code.

After all this, when we thought we were done, we finally received some feedback from Russell King. When MMIO operations trap to the hypervisor, the virtualization extensions populate a register which contains information useful to emulate the instruction (whether it was a load or a store, source/target registers, and the length of MMIO accesses). A certain class of instructions used by older Linux kernels do not populate such a register. KVM/ARM therefore loads the instruction from memory and decodes it in software. Even though the decoding implementation was well tested and reviewed by a large group of people, Russell King objected to including this feature. He had already implemented multiple forms of instruction decoding in other subsystems and demanded that we either rewrite significant parts of the ARM kernel to unify all instruction decoding to improve code reuse, or drop the MMIO instruction decoding support from our implementation. Rather than pursue a rewriting effort that could drag on for months, we abandoned the otherwise well-liked and useful code base. We can only speculate about the true motives behind this decision, as the ARM maintainer would not engage in a discussion about the subject.

After 15 main patch revisions and more than 18 months, the KVM/ARM code was successfully merged into Linus's tree via Russell King's ARM tree in February 2013. In getting all these things to come together in the end before the 3.9 merge window, the key was having a good relationship with many of the kernel developers to get their help, and being persistent in continuing to push to have the code merged in the face of various challenges.

5. Experimental Results

We present some experimental results that quantify the performance of KVM/ARM on multicore ARM hardware. We evaluate the virtualization overhead of KVM/ARM compared to native execution by running both microbenchmarks and real application workloads within VMs and directly on the hardware. We measure and compare the performance, energy, and implementation costs of KVM/ARM versus KVM x86 to demonstrate the effectiveness of KVM/ARM against a more mature hardware virtualization platform. These results provide the first real hardware measurements of the performance of ARM hardware virtualization support as well as the first comparison between ARM and x86.

5.1 Methodology

ARM measurements were obtained using an Insignal Arndale board [19] with a dual core 1.7GHz Cortex A-15 CPU on a Samsung Exynos 5250 SoC. This is the first and most widely used commercially available development board based on the Cortex A-15, the first ARM CPU with hardware virtualization support. Onboard 100Mb Ethernet is provided via the USB bus and an external 120GB Samsung 840 series SSD drive was connected to the Arndale board via eSATA. x86 measurements were obtained using both a low-power mobile laptop platform and an industry standard server platform. The laptop platform was a 2011 MacBook Air with a dual core 1.8GHz Core i7-2677M CPU, an internal Samsung SM256C 256GB SSD drive, and an Apple 100Mb USB Ethernet adapter. The server platform was a dedicated OVH SP3 server with a dual core 3.4GHz Intel Xeon E3 1245v2 CPU, two physical SSD drives of which only one was used, and 1GB Ethernet connected to a 100Mb network infrastructure. x86 hardware with virtual APIC support was not yet available at the time of our experiments.

Given the differences in hardware platforms, our focus was not on measuring absolute performance, but rather the relative performance differences between virtualized and native execution on each platform. Since our goal is to evaluate hypervisors, not raw hardware performance, this relative measure provides a useful cross-platform basis for comparing the virtualization performance and power costs of KVM/ARM versus KVM x86.

To provide comparable measurements, we kept the software environments across all hardware platforms the same as much as possible. Both the host and guest VMs on all platforms were Ubuntu version 12.10. We used the mainline Linux 3.10 kernel for our experiments, with patches for huge page support applied on top of the source tree. Since the experiments were performed on a number of different platforms, the kernel configurations had to be slightly different, but all common features were configured similarly across all platforms. In particular, Virtio drivers were used in the guest VMs on both ARM and x86. We used QEMU version v1.5.0 for our measurements. All systems were configured with a maximum of 1.5GB of RAM available to the respective guest VM or host being tested. Furthermore, all multicore measurements were done using two physical cores and guest VMs with two virtual CPUs, and single-core measurements
were configured with SMP disabled in the kernel configuration of both the guest and host system; hyperthreading was disabled on the x86 platforms. CPU frequency scaling was disabled to ensure that native and virtualized performance was measured at the same clock rate on each platform.

For measurements involving the network and another server, 100Mb Ethernet was used on all systems. The ARM and x86 laptop platforms were connected using a Netgear GS608v3 switch, and a 2010 iMac with a 3.2GHz Core i3 CPU with 12GB of RAM running Mac OS X Mountain Lion was used as a server. The x86 server platform was connected to a 100Mb port in the OVH network infrastructure, and another identical server in the same data center was used as the server. While there are some differences in the network infrastructure used for the x86 server platform because it is controlled by someone else, we do not expect these differences to have any significant impact on the relative performance between virtualized and native execution.

We present results for four sets of experiments. First, we measured the cost of various microarchitectural characteristics of the hypervisors on multicore hardware using custom small guest OSes [11, 21] with some bugfix patches applied. We further instrumented the code on both KVM/ARM and KVM x86 to read the cycle counter at specific points along critical paths to more accurately determine where overhead time was spent.

Second, we measured the cost of a number of common low-level OS operations using lmbench [25] v3.0 on both single-core and multicore hardware. When running lmbench on multicore configurations, we pinned each benchmark process to a separate CPU to measure the true overhead of interprocessor communication in VMs on multicore systems.

Third, we measured real application performance using a variety of workloads on both single-core and multicore hardware. Table 2 describes the eight application workloads we used.

Fourth, we measured energy efficiency using the same eight application workloads used for measuring application performance. ARM power measurements were performed using an ARM Energy Probe [3] which measures power consumption over a shunt attached to the power supply of the Arndale board. Power to the external SSD was delivered by attaching a USB power cable to the USB ports on the Arndale board, thereby factoring storage power into the total SoC power measured at the power supply. x86 power measurements were performed using the powerstat tool, which reads ACPI information. powerstat measures total system power draw from the battery, so power measurements on the x86 system were run from battery power and could only be run on the x86 laptop platform. Although we did not measure the power efficiency of the x86 server platform, it is expected to be much less efficient than the x86 laptop platform, so using the x86 laptop platform provides a conservative comparison of energy efficiency against ARM. The display and wireless features of the x86 laptop platform were turned off to ensure a fair comparison. Both tools reported instantaneous power draw in watts with a 10Hz interval. These measurements were averaged and multiplied by the duration of the test to obtain an energy measure.

apache          Apache v2.2.22 Web server running ApacheBench v2.3 on the local server, which measures number of handled requests per second serving the index file of the GCC 4.4 manual using 100 concurrent requests
mysql           MySQL v14.14 (distrib 5.5.27) running the SysBench OLTP benchmark using the default configuration
memcached       memcached v1.4.14 using the memslap benchmark with a concurrency parameter of 100
kernel compile  kernel compilation by compiling the Linux 3.6.0 kernel using the vexpress_defconfig for ARM using GCC 4.7.2 on ARM and the GCC 4.7.2 arm-linux-gnueabi- cross compilation toolchain on x86
untar           untar extracting the 3.6.0 Linux kernel image compressed with bz2 compression using the standard tar utility
curl 1K         curl v7.27.0 downloading a 1KB randomly generated file 1,000 times from the respective iMac or OVH server and saving the result to /dev/null with output disabled, which provides a measure of network latency
curl 1G         curl v7.27.0 downloading a 1GB randomly generated file from the respective iMac or OVH server and saving the result to /dev/null with output disabled, which provides a measure of network throughput
hackbench       hackbench [26] using unix domain sockets and 100 process groups running with 500 loops

Table 2: Benchmark Applications

5.2 Performance and Power Measurements

Table 3 presents various micro-architectural costs of virtualization using KVM/ARM on ARM and KVM x86 on x86. Measurements are shown in cycles instead of time to provide a useful comparison across platforms with different CPU frequencies. We show two numbers for the ARM platform where possible, with and without VGIC and virtual timers support.

Micro Test    ARM       ARM no VGIC/vtimers    x86 laptop    x86 server
Hypercall     5,326     2,270                  1,336         1,638
Trap          27        27                     632           821
I/O Kernel    5,990     2,850                  3,190         3,291
I/O User      10,119    6,704                  10,985        12,218
IPI           14,366    32,951                 17,138        21,177
EOI+ACK       427       13,726                 2,043         2,305

Table 3: Micro-Architectural Cycle Counts

Hypercall is the cost of two world switches, going from the VM to the host and immediately back again without doing any work in the host. KVM/ARM takes three to four times as many cycles for this operation versus KVM x86 due to two main factors. First, saving and restoring VGIC state to use virtual interrupts is quite expensive on ARM; available x86 hardware does not yet provide such a mechanism. The ARM no VGIC/vtimers measurement does not include the cost of saving and restoring VGIC state, showing that this accounts for over half of the cost of a world switch on ARM. Second, x86 provides hardware support to save and restore state on the world switch, which is much faster. ARM requires software to explicitly save and restore state, which provides greater flexibility, but higher costs. Nevertheless, without the VGIC state, the hypercall costs are only about 600 cycles more than the hardware accelerated hypercall cost on the x86 server platform. The ARM world switch costs have not been optimized and can be reduced further. For example, a small patch eliminating unnecessary atomic operations reduces the hypercall cost by roughly 300 cycles, but did not make it into the mainline kernel until after v3.10 was released. As another example, if parts of the VGIC state were lazily context switched instead of being saved and restored on each world switch, this may also reduce the world switch costs.

Trap is the cost of switching the hardware mode from the VM into the respective CPU mode for running the hypervisor, Hyp mode on ARM and root mode on x86. ARM is much faster than x86 because it only needs to manipulate two registers to perform this trap, whereas the cost of a trap on x86 is roughly the same as the cost of a world switch because the same amount of state is saved by the hardware in both cases. The trap cost on ARM is a very small part of the world switch costs, indicating that the double trap incurred by split-mode virtualization on ARM does not add much overhead.

I/O Kernel is the cost of an I/O operation from the VM to a device, which is emulated inside the kernel. I/O User shows the cost of issuing an I/O operation to a device emulated in user space, adding to I/O Kernel the cost of transitioning from the kernel to a user space process and doing a small amount of work in user space on the host for I/O. This is representative of the cost of using QEMU. Since these operations involve world switches, saving and restoring VGIC state is again a significant cost on ARM. KVM x86 is faster than KVM/ARM on I/O Kernel, but slightly slower on I/O User. This is because the hardware optimized world switch on x86 constitutes the majority of the cost of performing I/O in the kernel, but transitioning from kernel to a user space process on the host side is more expensive on x86 because x86 KVM saves and restores additional state lazily when going to user space. Note that the added cost of going to user space includes saving additional state, doing some work in user space, and returning to the kernel and processing the KVM_RUN ioctl call for KVM.

IPI is the cost of issuing an IPI to another virtual CPU core when both virtual cores are running on separate physical cores and both are actively running inside the VM. IPI measures time starting from sending an IPI until the other virtual core responds and completes the IPI. It involves multiple world switches and sending and receiving a hardware IPI. Despite its higher world switch cost, ARM is faster than x86 because the underlying hardware IPI on x86 is expensive, x86 APIC MMIO operations require KVM x86 to perform instruction decoding not needed on ARM, and completing an interrupt on x86 is more expensive. ARM without VGIC/vtimers is significantly slower than with VGIC/vtimers even though it has lower world switch costs because sending, EOIing and ACKing interrupts trap to the hypervisor and are handled by QEMU in user space.

EOI+ACK is the cost of completing a virtual interrupt on both platforms. It includes both interrupt acknowledgment and completion on ARM, but only completion on the x86 platform. ARM requires an additional operation, the acknowledgment, to the interrupt controller to determine the source of the interrupt. x86 does not because the source is directly indicated by the interrupt descriptor table entry at the time when the interrupt is raised. However, the operation is roughly 5 times faster on ARM than x86 because there is no need to trap to the hypervisor on ARM because of VGIC support for both operations. On x86, the EOI operation must be emulated and therefore causes a trap to the hypervisor. This operation is required for every virtual interrupt including both virtual IPIs and interrupts from virtual devices.

Figures 3 to 7 show virtualized execution measurements normalized relative to their respective native execution measurements, with lower being less overhead. Figures 3 and 4 show normalized performance for running lmbench in a VM versus running directly on the host. Figure 3 shows that KVM/ARM and KVM x86 have similar virtualization overhead in a single core configuration. For comparison, we also show KVM/ARM performance without VGIC/vtimers. Overall, using VGIC/vtimers provides slightly better performance except for the pipe and ctxsw workloads where the difference is substantial. The high overhead in this case is caused by updating the runqueue clock in the Linux scheduler every time a process blocks, since reading a counter traps to user space without vtimers on the ARM platform. We verified this by running the workload with VGIC support, but without vtimers, and we counted the number of timer read exits when running without vtimers support.

Figure 4 shows more substantial differences in virtualization overhead between KVM/ARM and KVM x86 in a multicore configuration. KVM/ARM has less overhead than KVM x86 for fork and exec, but more for protection faults. Both systems have the worst overhead for the pipe and ctxsw workloads, though KVM x86 is more than two times worse for pipe. This is due to the cost of repeatedly sending an IPI from the sender of the data in the pipe to the receiver for each message and the cost of sending an IPI when scheduling a new process. x86 not only has higher IPI overhead than ARM, but it must also EOI each IPI, which is much more expensive on x86 than on ARM because this requires trapping to the hypervisor on x86 but not on ARM. Without using VGIC/vtimers, KVM/ARM also incurs high overhead comparable to KVM x86 because it then also traps to the hypervisor to ACK and EOI the IPIs.

Figures 5 and 6 show normalized performance for running application workloads in a VM versus running directly on the host. Figure 5 shows that KVM/ARM and KVM x86 have similar virtualization overhead across all workloads in a single core configuration except for the MySQL workloads, but Figure 6 shows that there are more substantial differences in performance on multicore. On multicore, KVM/ARM has significantly less virtualization overhead than KVM x86 on Apache and MySQL. Overall on multicore, KVM/ARM performs within 10% of running directly on the hardware for all application workloads, while the more mature KVM x86 system has significantly higher virtualization overheads for Apache and MySQL. KVM/ARM's split-mode virtualization design allows it to leverage ARM hardware support with comparable performance to a traditional hypervisor using x86 hardware support. The measurements also show that KVM/ARM performs better overall with ARM VGIC/vtimers support than without.

Figure 3: UP VM Normalized lmbench Performance
Figure 4: SMP VM Normalized lmbench Performance
Figure 5: UP VM Normalized Application Performance
Figure 6: SMP VM Normalized Application Performance

Figure 7 shows normalized power consumption of using virtualization versus direct execution for various application workloads on multicore. We only compared KVM/ARM on ARM against KVM x86 on the x86 laptop. The Intel Core i7 CPU used in these experiments is one of Intel's more power optimized processors, and we expect that server power consumption would be even higher. The measurements show that KVM/ARM using VGIC/vtimers is more power efficient than KVM x86 virtualization in all cases except memcached and untar. Neither workload is CPU bound on either platform and the power consumption is not significantly affected by the virtualization layer. However, due to ARM's slightly higher virtualization overhead for these workloads, the energy virtualization overhead is slightly higher on ARM for the two workloads. While a more detailed study of energy aspects of virtualization is beyond the scope of this paper, these measurements nevertheless provide useful data comparing ARM and x86 virtualization energy costs.
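The energy measure described in Section 5.1 (average the instantaneous power samples, then multiply by the test duration) and the normalization used in the figures can be sketched as follows. This is only an illustration under stated assumptions, not the scripts used for the paper: the function names are hypothetical, the test duration is assumed to be recoverable from the number of samples at the 10Hz reporting rate, and the sample traces are made up rather than measured.

```python
from statistics import mean

SAMPLE_HZ = 10  # both tools reported power draw at a 10Hz interval


def energy_joules(power_samples_w, sample_hz=SAMPLE_HZ):
    """Average the power samples (watts) and multiply by the test duration.

    Assumption: the duration is derived from the sample count and the
    sampling rate; a separately timed duration could be used instead.
    """
    if not power_samples_w:
        raise ValueError("need at least one power sample")
    duration_s = len(power_samples_w) / sample_hz
    return mean(power_samples_w) * duration_s


def normalized_energy(virt_samples_w, native_samples_w):
    """Energy of the virtualized run relative to the native run.

    1.0 means no overhead; lower is better, as in the normalized figures.
    """
    return energy_joules(virt_samples_w) / energy_joules(native_samples_w)


# Hypothetical traces, not data from the paper: same power draw,
# but the virtualized run takes 11 s instead of 10 s.
native = [4.0] * 100
virtualized = [4.0] * 110
print(normalized_energy(virtualized, native))
```

With these made-up traces the virtualized run draws the same power for 10% longer, so the normalized energy comes out at about 1.1. This shows why a workload can have unchanged power draw and still show an energy overhead when virtualization lengthens its run time.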
Figure 7: SMP VM Normalized Energy Consumption

5.3 Implementation Complexity

We compare the code complexity of KVM/ARM to its KVM x86 counterpart in Linux 3.10. KVM/ARM is 5,812 lines of code (LOC), counting just the architecture-specific code added to Linux to implement it, of which the lowvisor is a mere 718 LOC. As a conservative comparison, KVM x86 is 25,367 LOC, excluding guest performance monitoring support, not yet supported by KVM/ARM, and 3,311 LOC required for AMD support. These numbers do not include KVM's architecture-generic code, 7,071 LOC, which is shared by all systems. Table 4 shows a breakdown of the total hypervisor architecture-specific code into its major components.

Component                KVM/ARM    KVM x86 (Intel)
Core CPU                 2,493      16,177
Page Fault Handling      738        3,410
Interrupts               1,057      1,978
Timers                   180        573
Other                    1,344      1,288
Architecture-specific    5,812      25,367

Table 4: Code Complexity in Lines of Code (LOC)

By inspecting the code we notice that the striking additional complexity in the x86 implementation is mainly due to the following five reasons: (1) Since EPT was not supported in earlier hardware versions, KVM x86 must support shadow page tables. (2) The hardware virtualization support has evolved over time, requiring software to conditionally check for support for a large number of features such as EPT. (3) A number of operations require software decoding of instructions on the x86 platform. KVM/ARM's out-of-tree MMIO instruction decode implementation was much simpler, only 462 LOC. (4) The various paging modes on x86 require more software logic to handle page faults. (5) x86 requires more software logic to support interrupts and timers than ARM, which provides VGIC/vtimers hardware support that reduces software complexity.

KVM/ARM's LOC is less than partially complete bare-metal microvisors written for Hyp mode [31], with the lowvisor LOC almost an order of magnitude smaller. Unlike standalone hypervisors, KVM/ARM's code complexity is so small because lots of functionality simply does not have to be implemented because it is already provided by Linux. Table 4 does not include other non-hypervisor architecture-specific Linux code, such as basic bootstrapping, which is significantly more code. Porting a standalone hypervisor such as Xen from x86 to ARM is much more complicated because all of that ARM code for basic system functionality needs to be written from scratch. In contrast, since Linux is dominant on ARM, KVM/ARM just leverages existing Linux ARM support to run on every platform supported by Linux.

6. Recommendations

From our experiences building KVM/ARM, we offer a few recommendations for hardware designers to simplify and optimize future hypervisor implementations.

Share kernel mode memory model. The hardware mode to run a hypervisor should use the same memory model as the hardware mode to run OS kernels. Software designers then have greater flexibility in deciding how tightly to integrate a hypervisor with existing OS kernels. ARM Hyp mode unfortunately did not do this, preventing KVM/ARM from simply reusing the kernel's page tables in Hyp mode. This reuse would have simplified the implementation and allowed for performance critical emulation code to run in Hyp mode, avoiding a complete world switch in some cases. Some might argue that this recommendation makes for more complicated standalone hypervisor implementations, but this is not really true. For example, ARM kernel mode already has a simple option to use one or two page table base registers to unify or split the address space. Our recommendation is different from the x86 virtualization approach, which does not have a separate and more privileged hypervisor CPU mode. Having a separate CPU mode potentially improves stand-alone hypervisor performance and implementation, but not sharing the kernel memory model complicates the design of hypervisors integrated with host kernels.

Make VGIC state access fast, or at least infrequent. While VGIC support can improve performance especially on multicore systems, our measurements also show that access to VGIC state adds substantial overhead to world switches. This is caused by slow MMIO access to the VGIC control interface in the critical path. Improving the MMIO access time is likely to improve VM performance, but if this is not possible or cost-effective, MMIO accesses to the VGIC could at least be made less frequent. For example, a summary register could be introduced describing the state of each virtual interrupt. This could be read when performing a world switch from the VM to the hypervisor to get information which can currently only be obtained by reading all the list registers (see Section 3.5) on each world switch.

Completely avoid IPI traps. Hardware support to send virtual IPIs directly from VMs without the need to trap to the hypervisor would improve performance. Hardware designers may underestimate how frequent IPIs are on modern multicore OSes, and our measurements reveal that sending IPIs adds significant overhead for some workloads. The current VGIC design requires a trap to the hypervisor to emulate access to the IPI register in the distributor, and this emulated access must be synchronized between virtual cores using a software locking mechanism, which adds significant overhead for IPIs. Current ARM hardware supports receiving the virtual IPIs, which can be ACKed and EOIed without traps, but unfortunately does not address the also important issue of sending virtual IPIs.

7. Related Work

Virtualization has a long history [27], but has enjoyed a resurgence starting in the late 1990s. Most efforts have almost exclusively focused on virtualizing the x86 architecture. While systems such as VMware [10] and Xen [8] were originally based on software-only approaches before the introduction of x86 hardware virtualization support, all x86 virtualization platforms, VMware [], Xen, and KVM [22], now leverage x86 hardware virtualization support. Because x86 hardware virtualization support differs substantially from ARM in the ability to completely run the hypervisor in the same mode as the kernel, x86 virtualization approaches do not lend themselves directly to take advantage of ARM hardware virtualization support.

Some x86 approaches also leverage the host kernel to provide functionality for the hypervisor. VMware Workstation's hypervisor creates a VMM separate from the host kernel, but this
approach is different from KVM/ARM in a number of important ways. First, the VMware VMM is not integrated into the host kernel source code and therefore cannot reuse existing host kernel code, for example, for populating page tables relating to the VM. Second, since the VMware VMM is specific to x86 it does not run across different privileged CPU modes, and therefore does not use a design similar to KVM/ARM. Third, most of the emulation and fault-handling code required to run a VM executes at the most privileged level inside the VMM. KVM/ARM executes this code in the less privileged kernel mode, and only executes a minimal amount of code in the most privileged mode. In contrast, KVM benefits from being integrated with the Linux kernel like KVM/ARM, but the x86 design relies on being able to run the kernel and the hypervisor together in the same hardware hypervisor mode, which is problematic on ARM.

Full-system virtualization of the ARM architecture is a relatively unexplored research area. Most approaches are software only. A number of standalone bare metal hypervisors have been developed [16, 17, 28], but these are not widespread, are developed specifically for the embedded market, and must be modified and ported to every single host hardware platform, limiting their adoption. An abandoned port of Xen for ARM [18] requires comprehensive modifications to the guest kernel, and was never fully developed. An earlier prototype for KVM on ARM [12, 15] used an automated lightweight paravirtualization approach to automatically patch kernel source code to run as a guest kernel, but had poor performance. VMware Horizon Mobile [9] uses hosted virtualization to leverage Linux's support for a wide range of hardware platforms, but requires modifications to guest OSes and its performance is unproven. None of these paravirtualization approaches could run unmodified guest OSes.

An earlier study attempted to estimate the performance of ARM hardware virtualization support using a software simulator and a simple hypervisor lacking important features like SMP support and use of storage and network devices by multiple VMs [31]. Because of the lack of hardware or a cycle-accurate simulator, no real performance evaluation was possible. In contrast, we present the first evaluation of ARM virtualization extensions using real hardware, provide a direct comparison with x86, and present the design and implementation of a complete hypervisor using ARM virtualization extensions, including SMP support.

A newer version of Xen exclusively targeting servers [32] is being developed using ARM hardware virtualization support. Because Xen is a bare metal hypervisor that does not leverage kernel functionality, it can be architected to run entirely in Hyp mode rather than using split-mode virtualization. At the same time, this requires a substantial commercial engineering effort. Since Xen is a standalone hypervisor, porting Xen from x86 to ARM is difficult in part because all ARM-related code must be written from scratch. Even after getting Xen to work on one ARM platform, it must be manually ported to each different ARM device that Xen wants to support. Because of Xen's custom I/O model using hypercalls from VMs for device emulation on ARM, Xen unfortunately cannot run guest OSes unless they have been configured to include Xen's hypercall layer and include support for XenBus paravirtualized drivers. In contrast, KVM/ARM uses standard Linux components to enable faster development, full SMP support, and the ability to run unmodified OSes. KVM/ARM is easily supported on new devices with Linux support, and we spent almost no effort to support KVM/ARM on ARM's Versatile Express boards, the Arndale board, and hardware emulators. While Xen can potentially reduce world switch times for operations that can be handled inside the Xen hypervisor, switching to Dom0 for I/O support or switching to other VMs would involve context switching the same state as KVM/ARM.

Microkernel approaches for hypervisors [16, 30] have been used to reduce the hypervisor TCB and run other hypervisor services in user mode. These approaches differ both in design and rationale from split-mode virtualization, which splits hypervisor functionality across privileged modes to leverage virtualization hardware support. Split-mode virtualization also provides a different split of hypervisor functionality. KVM/ARM's lowvisor is a much smaller code base that implements only the lowest level hypervisor mechanisms. It does not include higher-level functionality present in the hypervisor TCB used in these other approaches.

8. Conclusions

KVM/ARM is the mainline Linux ARM hypervisor and the first system that can run
unmodified guest operating systems on ARM multicore hardware. KVM/ARM's split-mode virtualization makes it possible to use ARM hardware virtualization extensions while leveraging Linux kernel mechanisms and hardware support. Our experimental results show that KVM/ARM (1) incurs minimal performance impact from the extra traps incurred by split-mode virtualization, (2) has modest virtualization overhead and power costs, within 10% of direct native execution on multicore hardware for real application workloads, and (3) achieves comparable or lower virtualization overhead and power costs on multicore hardware compared to widely-used KVM x86 virtualization. Based on our experiences integrating KVM/ARM into the mainline Linux kernel, we provide some hints on getting research ideas and code adopted by the open source community, and recommendations for hardware designers to improve future hypervisor implementations.

9. Acknowledgments

Marc Zyngier helped with development, implemented VGIC and vtimers support, and assisted us with hardware bring up. Rusty Russell worked on the coprocessor user space interface and assisted with upstreaming. Will Deacon and Avi Kivity provided numerous helpful code reviews. Peter Maydell helped with QEMU support and debugging. Gleb Natapov helped us better understand KVM x86 performance. Marcelo Tosatti and Nicolas Viennot helped with resolving what became known as the voodoo bug. Keith Adams, Hani Jamjoom, and Emmett Witchel provided helpful comments on earlier drafts of this paper. This work was supported in part by ARM and NSF grants CNS-1162447, CCF-1162021, and CNS-1018355.

References

[1] K. Adams and O. Agesen. A Comparison of Software and Hardware Techniques for x86 Virtualization. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 2–13, Oct. 2006.
[2] O. Agesen, J. Mattson, R. Rugina, and J. Sheldon. Software Techniques for Avoiding Hardware Virtualization Exits. In Proceedings of the 2012 USENIX Annual Technical Conference, pages 373–385, June 2012.
[3] ARM Ltd. ARM Energy Probe. http://www.arm.com/products/tools/arm-energy-probe.php
[4] ARM Ltd. ARM Cortex-A15 Technical Reference Manual ARM DDI 0438C, Sept. 2011.
[5] ARM Ltd. ARM Generic Interrupt Controller Architecture version 2.0 ARM IHI 0048B, June 2011.
[6] ARM Ltd. ARM Architecture Reference Manual ARMv7-A DDI 0406C.b, July 2012.
[7] ARM Ltd. ARM Architecture Reference Manual ARMv8-A DDI 0487A.a, Sept. 2013.
[8] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen and the Art of Virtualization. In Proceedings of the 19th ACM Symposium on Operating Systems Principles, pages 164–177, Oct. 2003.
[9] K. Barr, P. Bungale, S. Deasy, V. Gyuris, P. Hung, C. Newell, H. Tuch, and B. Zoppis. The VMware Mobile Virtualization Platform: is that a hypervisor in your pocket? SIGOPS Operating Systems Review, 44(4):124–135, Dec. 2010.
[10] E. Bugnion, S. Devine, M. Rosenblum, J. Sugerman, and E. Y. Wang. Bringing Virtualization to the x86 Architecture with the Original VMware Workstation. ACM Transactions on Computer Systems, 30(4):12:1–12:51, Nov. 2012.
[11] C. Dall and A. Jones. KVM/ARM Unit Tests. https://github.com/columbia/kvm-unit-tests
[12] C. Dall and J. Nieh. KVM for ARM. In Proceedings of the Ottawa Linux Symposium, pages 45–56, July 2010.
[13] C. Dall and J. Nieh. Supporting KVM on the ARM architecture. LWN.net, July 2013. http://lwn.net/Articles/557132/
[14] David Brash, Architecture Program Manager, ARM Ltd. Personal communication, Nov. 2012.
[15] J.-H. Ding, C.-J. Lin, P.-H. Chang, C.-H. Tsang, W.-C. Hsu, and Y.-C. Chung. ARMvisor: System Virtualization for ARM. In Proceedings of the Ottawa Linux Symposium (OLS), pages 93–107, July 2012.
[16] General Dynamics. OKL4 Microvisor. http://www.ok-labs.com/products/okl4-microvisor
[17] Green Hills Software. INTEGRITY Secure Virtualization. http://www.ghs.com/products/rtos/integrity_virtualization.html
[18] J. Hwang, S. Suh, S. Heo, C. Park, J. Ryu, S. Park, and C. Kim. Xen on ARM: System Virtualization using Xen Hypervisor for ARM-based Secure Mobile Phones. In Proceedings of the 5th Consumer Communications and Network Conference, Jan. 2008.
[19] InSignal Co. ArndaleBoard.org. http://arndaleboard.org
[20] Intel Corporation. Intel 64 and IA-32 Architectures Software Developer's Manual, 325462-044US, Aug. 2012.
[21] A. Kivity. KVM Unit Tests. https://git.kernel.org/cgit/virt/kvm/kvm-unit-tests.git
[22] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori. kvm: The Linux Virtual Machine Monitor. In Proceedings of the Ottawa Linux Symposium (OLS), volume 1, pages 225–230, June 2007.
[23] KVM/ARM Mailing List. https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
[24] Linux ARM Kernel Mailing List. A15 H/W Virtualization Support, Apr. 2011. http://archive.arm.linux.org.uk/lurker/message/20110412.204714.a36702d9.en.html
[25] L. McVoy and C. Staelin. lmbench: Portable Tools for Performance Analysis. In Proceedings of the 1996 USENIX Annual Technical Conference, pages 279–294, Jan. 1996.
[26] I. Molnar. Hackbench. http://people.redhat.com/mingo/cfs-scheduler/tools/hackbench.c
[27] G. J. Popek and R. P. Goldberg. Formal Requirements for Virtualizable Third Generation Architectures. Communications of the ACM, 17(7):412–421, July 1974.
[28] Red Bend Software. vLogix Mobile. http://www.redbend.com/en/mobile-virtualization
[29] R. Russell. virtio: Towards a De-Facto Standard for Virtual I/O Devices. SIGOPS Operating Systems Review, 42(5):95–103, July 2008.
[30] U. Steinberg and B. Kauer. Nova: A Microhypervisor-Based Secure Virtualization Architecture. In Proceedings of the 5th European Conference on Computer Systems, pages 209–222, Apr. 2010.
[31] P. Varanasi and G. Heiser. Hardware-Supported Virtualization on ARM. In Proceedings of the Second Asia-Pacific Workshop on Systems, pages 11:1–11:5, July 2011.
[32] Xen.org. Xen ARM. http://xen.org/products/xen_arm.html