How to determine a good multi-programming level for external scheduling

Bianca Schroeder§, Mor Harchol-Balter§
§Carnegie Mellon University, Department of Computer Science, Pittsburgh, PA, USA
{bianca, harchol, acw}@cs.cmu.edu

Arun Iyengar†, Erich Nahum†, Adam Wierman§
†IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
{aruni, nahum}@us.ibm.com

Supported by NSF grants CCR-0133077, CCR-0311383, 0313148, and a 2005 Pittsburgh Digital Greenhouse Grant.

Abstract

Scheduling/prioritization of DBMS transactions is important for many applications that rely on database backends. A convenient way to achieve scheduling is to limit the number of transactions within the database, maintaining most of the transactions in an external queue, which can be ordered as desired by the application. While external scheduling has many advantages in that it doesn't require changes to internal resources, it is also difficult to get right in that its performance depends critically on the particular multiprogramming limit used (the MPL), i.e. the number of transactions allowed into the database. If the MPL is too low, throughput will suffer, since not all DBMS resources will be utilized. On the other hand, if the MPL is too high, there is insufficient control on scheduling. The question of how to adjust the MPL to achieve both goals simultaneously is an open problem, not just for databases but in system design in general. Herein we study this problem in the context of transactional workloads, both via extensive experimentation and queueing theoretic analysis. We find that the two most critical factors in adjusting the MPL are the number of resources that the workload utilizes and the variability of the transactions' service demands. We develop a feedback-based controller, augmented by queueing theoretic models, for automatically adjusting the MPL. Finally, we apply our methods to the specific problem of external prioritization of transactions. We find that external prioritization can be nearly as effective as internal prioritization, without any negative consequences, when the MPL is set appropriately.

1. Introduction

Many of today's web applications are largely dependent on a backend database, where the majority of the request processing time is spent. For such applications it is often desirable to control the order in which transactions are executed at the DBMS. An e-commerce application, for example, might want to give faster service to those transactions carrying a lot of revenue.

Recently, systems researchers have started to investigate the idea of external scheduling as a method of controlling the order in which transactions are executed. The basic mechanism in external scheduling is demonstrated in Figure 1, and simply involves limiting the number of transactions concurrently executing within the DBMS. This limit is referred to as the MPL (multi-programming limit). If the MPL is already met, all remaining transactions are queued up in an external queue. The application can then control the order in which transactions are executed by scheduling the external queue.

[Figure 1. Simplified view of the mechanism used in external scheduling. A fixed limited number of transactions (MPL=4) are allowed into the DBMS simultaneously. The remaining transactions are held back in an external queue. Response time is the time from when a transaction arrives until it completes, including time spent queueing externally to the DBMS.]

Examples of recent work on external scheduling come from many areas including storage servers, web servers, and database servers. For example, Jin et al. [9] develop an external scheduling front-end to provide proportional sharing among the requests at a storage service utility. Blanquer et al. [4] study external scheduling for quality of service provisioning at an Internet services cluster. In our own recent work [22] we propose external scheduling for providing class-based quality of service guarantees for transactional, database-driven workloads. Finally, for many commercial DBMS there exist tools that provide mechanisms for external scheduling, such as the IBM DB2 Query Patroller [2].

The advantage of the external approach is that it is portable and easy to implement since it does not require changes to complex DBMS internals. Moreover it is effective across different types of workloads, since (unlike the internal approach, which directly schedules the resources inside the backend DBMS) external scheduling works independently of the system's bottleneck resource. It is also very flexible in that it allows applications to implement their own custom-tailored scheduling policy, rather than being limited to the policies supported by the backend DBMS.

While the basic idea behind external scheduling is simple, its efficacy in practice hinges on the right choice of the MPL. For scheduling to be most effective a low MPL is desirable, since then at any time only a small number of transactions will be executing inside the DBMS, while a large number are queued under the control of the external scheduler. On the other hand, too low an MPL can hurt the overall performance of the DBMS, e.g., by underutilizing the DBMS resources, resulting in a drop in system throughput.

While many have cited the problem of choosing the MPL in external scheduling as critical, previous research in all areas of system design leaves it as an open problem. Existing tools for external scheduling leave the choice of MPL to the system administrator.

The question of this paper is: How low can we choose the MPL to facilitate effective scheduling, without hurting overall system performance? There are three important considerations when choosing an MPL:

(1) As already mentioned, by holding back transactions outside the DBMS, the concurrency inside the DBMS is lowered, which can lead to a drop in throughput. We seek algorithms that determine, for any input scenario, the lowest possible MPL value necessary to ensure near-optimal throughput levels (when compared to the system without MPL).

(2) Holding back transactions, and sequencing them (rather than letting them all share the database resources concurrently), creates the potential for head-of-line (HOL) blocking, where some long-running transactions prevent other, shorter transactions from entering the DBMS and receiving service. This can result in an actual increase in overall mean response time. We seek algorithms that determine, for any input scenario, the lowest possible MPL value necessary to prevent an increase in overall mean response time.

(3) Lastly, it is not at all obvious that external scheduling, even with a sufficiently low MPL, will be as effective as internal scheduling, since an external scheduler does not have any control over the transactions once they're dispatched to the DBMS.

Section 2 describes the wide range of hardware configurations, workloads and different DBMS we use in our experiments. Section 3 evaluates experimentally how low we can set the MPL without hurting throughput and overall mean response time. We find that the answer to this question is complex, and we identify the dominant factors that determine it. Next, in Section 4 we create queueing theoretic models, based on the findings in Section 3, that capture the relationship between the MPL and throughput and overall mean response time. We then show how a feedback-based controller can be used, in conjunction with the queueing models, to automatically adapt the MPL. Finally, in Section 5 we evaluate the effectiveness of external scheduling in one particular application involving prioritization of transactions. We study whether external scheduling with the appropriately chosen MPL can be as effective as internal scheduling with respect to providing differentiation between high and low priority transactions.

It is important to note that throughout this paper the question is how low an MPL one can choose without hurting system performance. While this question has not been addressed in any previous work, a complementary question involving high MPLs has been looked at in the context of admission control, see for example [5, 8, 10, 12, 18]. The point of these studies is that throughput suffers when too many transactions are allowed into the DBMS at once, due to excessive lock contention (lock thrashing) or due to overload of some system resource. Hence it is beneficial to have some high MPL upper bound on the number of transactions allowed within the DBMS, with the understanding that if
this MPL is set too high, then throughput will start to drop. Admission control studies examine how to limit the number of concurrent transactions within the DBMS by dropping transactions when this limit is reached. Our work looks at the other end of this problem (that of very low MPLs needed to provide prioritization differentiation or some other type of scheduling) and does not involve dropping requests.

2. Experimental setup

To answer the questions of feasibility and effectiveness of external prioritization, it is important to evaluate the effect of different workloads and hardware configurations on these questions. The importance of looking at different workloads is that an I/O bound workload may, for example, require a higher MPL, as disks need more simultaneous requests to perform efficiently. The importance of considering different hardware configurations is that a higher MPL may be required to achieve good throughput in a system with a large number of hardware resources, since more requests are needed to keep the many resources busy. We will therefore experiment with a wide range of hardware configurations and workloads, and two different DBMS.

| Workload | Benchmark | Configuration | Database | Main memory | Buffer pool | CPU load | I/O load |
|---|---|---|---|---|---|---|---|
| W_CPU-inventory | TPC-C | 10 warehouses | 1GB | 3GB | 1GB | high | low |
| W_CPU-browsing | TPC-W Browsing | 100 EBs, 10K items, 140K customers | 300MB | 3GB | 500MB | high | low |
| W_I/O-browsing | TPC-W Browsing | 500 EBs, 10K items, 288K customers | 2GB | 512MB | 100MB | low | high |
| W_I/O-inventory | TPC-C | 60 warehouses | 6GB | 512MB | 100MB | low | high |
| W_CPU+I/O-inventory | TPC-C | 10 warehouses | 1GB | 1GB | 1GB | high | high |
| W_CPU-ordering | TPC-W Ordering | 100 EBs, 10K items, 140K customers | 300MB | 3GB | 500MB | high | low |

Table 1. Description of the workloads used in the experiments.

2.1. Experimental architectures

The DBMS we experiment with are IBM DB2 [1] version 8.1, and Shore [20]. Shore is a prototype storage manager with state-of-the-art transaction management, 2PL, and Aries-style recovery; we use it because we have the source code, enabling us to implement internal priorities. All of our external scheduling results are also corroborated using PostgreSQL [21] version 7.3, although we do not show these results here for lack of space.

The DBMS is running on a 2.4-GHz Pentium 4 running Linux 2.4.23. The buffer pool size and main memory size will depend on the workload (see Table 1). The machine is equipped with six 120GB IDE drives, one of which we use for the database log. The number of remaining IDE drives that we use for the data will depend on the particular experiment. The workload generator is run on a separate machine with the same specifications as the database server.

2.2. Experimental workloads and setups

When discussing the effect of the MPL it is important to consider a wide range of workloads. Unfortunately there are only a limited number of standard OLTP benchmarks which are both well-accepted and publicly available, in particular TPC-C [6] and TPC-W [7]. Fortunately, however, these two benchmarks can be used to create a much wider range of workloads by varying a large number of (i) hardware and (ii) benchmark configuration parameters. Table 1 describes the different workloads we create based on different configurations of the two benchmarks. The benchmark configuration parameters that we vary include: (a) the number of warehouses in TPC-C, (b) the size of the database in TPC-W (this includes both the number of items included in the database store and the number of "emulated browsers" (EBs), which affects the number of customers), and (c) the type of transaction mix used in TPC-W, particularly whether these are primarily "browsing" transactions or primarily "ordering" transactions.

We run the workloads from Table 1 under different hardware configurations, creating 17 different "Setups" as summarized in Table 2. The hardware parameters that we vary include: (a) the number of disks (1–6), (b) the number of CPUs (1 or 2), and (c) the main memory (ranging between 512MB and 3GB). We also vary the isolation level to create different levels of lock contention, starting with the default isolation level of 3 (corresponding to RR in DB2, Repeatable Read), but also experimenting with lower isolation levels (UR, Uncommitted Read), leading to less lock contention. In all workloads, we hold the number of clients constant at 100.

| Setup | Workload | Number of CPUs | Number of disks | Isolation level |
|---|---|---|---|---|
| 1 | W_CPU-inventory | 1 | 1 | RR |
| 2 | W_CPU-inventory | 2 | 1 | RR |
| 3 | W_CPU-browsing | 1 | 1 | RR |
| 4 | W_CPU-browsing | 2 | 1 | RR |
| 5 | W_I/O-inventory | 1 | 1 | RR |
| 6 | W_I/O-inventory | 1 | 2 | RR |
| 7 | W_I/O-inventory | 1 | 3 | RR |
| 8 | W_I/O-inventory | 1 | 4 | RR |
| 9 | W_I/O-browsing | 1 | 1 | RR |
| 10 | W_I/O-browsing | 1 | 4 | RR |
| 11 | W_CPU+I/O-inventory | 1 | 1 | RR |
| 12 | W_CPU+I/O-inventory | 2 | 4 | RR |
| 13 | W_CPU-ordering | 1 | 1 | RR |
| 14 | W_CPU-ordering | 1 | 1 | UR |
| 15 | W_CPU-ordering | 2 | 1 | RR |
| 16 | W_CPU-ordering | 2 | 1 | UR |
| 17 | W_CPU-inventory | 1 | 1 | UR |

Table 2. Definition of setups based on the workloads in Table 1.

3. Feasibility of low MPL: Experimental study

In this section we ask how low we can make the MPL without causing deterioration in throughput and/or overall mean response time. The aim is to look at low values of the MPL and study their effect on throughput and then on mean response time using the experimental setups described in the previous section. (We will not be considering high values of the MPL, which are commonly looked at in studies dealing with overload and admission control.) We will be interested in identifying the workload factors that affect the answer to the question of "how low can one make the MPL." These results are summarized in Section 3.3.
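
For reference, the external scheduling mechanism whose MPL is varied in these experiments (Figure 1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the class name `ExternalScheduler`, its methods, and the convention that a smaller `priority` value is dispatched first are all hypothetical, and the sketch models only admission and dispatch, not execution inside the DBMS.

```python
import heapq
import itertools


class ExternalScheduler:
    """Admit at most `mpl` transactions to the DBMS; hold the rest in an external queue.

    Hypothetical sketch: names and the priority convention are assumptions.
    """

    def __init__(self, mpl):
        if mpl < 1:
            raise ValueError("MPL must be at least 1")
        self.mpl = mpl
        self.running = set()            # transactions currently inside the DBMS
        self._queue = []                # external queue of (priority, seq, txn)
        self._seq = itertools.count()   # tie-breaker: FIFO order within a priority

    def submit(self, txn, priority=0):
        """A transaction arrives; returns the transactions dispatched as a result."""
        heapq.heappush(self._queue, (priority, next(self._seq), txn))
        return self._dispatch()

    def complete(self, txn):
        """A transaction finishes inside the DBMS, freeing one MPL slot."""
        self.running.remove(txn)
        return self._dispatch()

    def _dispatch(self):
        """Move queued transactions into the DBMS while fewer than `mpl` are running."""
        started = []
        while self._queue and len(self.running) < self.mpl:
            _, _, txn = heapq.heappop(self._queue)
            self.running.add(txn)
            started.append(txn)
        return started
```

With `ExternalScheduler(mpl=2)`, the first two submitted transactions are dispatched immediately; later arrivals wait in the external queue, and each call to `complete` admits the queued transaction with the smallest priority value. The lower the MPL, the more transactions wait in the queue, and the more the ordering of that queue determines the order of execution.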
[Figure 2. Effect of MPL on throughput in CPU bound workloads: (a) W_CPU-inventory (setups 1 and 2 of Table 2) and (b) W_CPU-browsing (setups 3 and 4 of Table 2). Axes: MPL vs. throughput (xact/sec).]

3.1. Effect on throughput

For CPU bound workloads

Figure 2 shows the effect of the MPL on the throughput under two CPU-bound workloads: W_CPU-inventory and W_CPU-browsing. The two lines shown consider the case of 1 CPU versus 2 CPUs. In the single CPU case, under both workloads, the throughput reaches its maximum at an MPL of about 5. In the case of 2 CPUs, the maximum throughput is reached at around MPL = 10 in the case of workload W_CPU-inventory and at around MPL = 7 in the case of workload W_CPU-browsing. Observe that a higher MPL is needed to reach maximum throughput in the case of 2 CPUs as compared with 1 CPU because more transactions are needed to saturate 2 CPUs. The fact that W_CPU-inventory requires a slightly higher MPL is likely due to the fact that the W_CPU-inventory workload has some I/O components due to updates. The additional I/O component means that more transactions are needed to fully utilize the CPU, since some transactions are blocked on I/O to the database log. All these maximum throughput points are achieved at surprisingly low MPL values, considering the fact that both these workloads are intended to run with 100 clients according to the TPC specifications.

For I/O bound workloads

Figure 3 shows the effect of the MPL on the throughput under two I/O-bound workloads: W_I/O-inventory and W_I/O-browsing. The lines shown consider different numbers of disks. The W_I/O-inventory workload is a pure I/O-only workload, because of the larger database size. For this workload, the MPL point at which maximum throughput is reached is MPL = 2 for the case of 1 disk, MPL = 5 for the case of 2 disks, MPL = 7 for the case of 3 disks, and MPL = 10 for the case of 4 disks. Observe that the MPL needed to maximize throughput grows for systems with more disks, since more transactions are required to saturate more resources. Again, these numbers are extremely low considering the fact that the TPC specifications for this workload assume 600 clients (we use 100 clients experimentally).

[Figure 3. Effect of MPL on throughput in I/O bound workloads: (a) W_I/O-inventory (setups 5–8 of Table 2) and (b) W_I/O-browsing (setups 9 and 10 of Table 2). Axes: MPL vs. throughput (xact/sec).]

It is interesting to note that the increase in MPL necessary to ensure x% of the maximum throughput is a somewhat linear function of the number of disks. We will give analytical validation for this observation in Section 4. Although it may appear problematic that the necessary MPL grows linearly with more disks, it is important to notice that systems with many disks also have a proportionately larger population of clients; hence an MPL that seems large may still be small in proportion to the client population.

For W_I/O-browsing, the MPL at which maximum throughput is reached is higher than for W_I/O-inventory (about MPL = 13 for one disk and about MPL = 20 for four disks). The reason is that the size of this database is smaller than for the W_I/O-inventory workload, thus resulting in a larger CPU component than in the purely I/O-based W_I/O-inventory. As explained in Section 3.1 the additional CPU component will add to the MPL needed. Still, it is surprising that an MPL of 20 suffices given that the TPC specifications for this workload assume 500 clients (recall we use 100 clients experimentally).

For "balanced" CPU+I/O workloads

Figure 4 considers workload W_CPU+I/O-inventory, which is balanced (equal) in its requirements of CPU and I/O (both resources are equally utilized). In the case of just 1 disk and 1 CPU, an MPL of 5 suffices to reach maximum throughput. Adding only disks to the hardware configuration changes this value only slightly, since the CPU bottleneck remains. Similarly, adding only CPUs changes the required MPL value only slightly, since now the workload becomes solely I/O bound. However if we add 4 disks and 2 CPUs (maintaining the initial balanced proportions of CPU and I/O), we find that the MPL needed to reach maximum throughput increases to around 20. This number is still low in light of the fact that the TPC specified number of clients for this workload is 100.

[Figure 4. Effect of MPL on throughput in workload exhibiting both high I/O and CPU: W_CPU+I/O-inventory (setups 11 and 12 of Table 2). Axes: MPL vs. throughput (xact/sec).]

In summary, the MPL required is largely proportional to the number of resources that are utilized in a system without an MPL. In a balanced workload the number of resources that are utilized will be high; hence the MPL is higher.

For lock-bound workloads

Figure 5 illustrates the effect of increasing the locking needed by transactions (increasing the isolation level from UR to RR) on the MPL for workloads W_CPU-inventory and W_CPU-ordering. While the MPL needed overall is always under 20, the basic trend is that increasing the amount of locking lowers the MPL. The reason is that when the amount of locking is high, throwing more transactions into the system doesn't increase the rate at which transactions complete, since they are all queueing. Beyond some point, increasing the number of transactions actually lowers the throughput, as seen in [5, 8, 12, 18].

[Figure 5. Effect of MPL on throughput in workloads with heavy locking: (a) W_CPU-inventory (setups 1 and 17 of Table 2) and (b) W_CPU-ordering (setups 15 and 16 of Table 2). Axes: MPL vs. throughput (xact/sec).]

3.2. Effect on response time

Section 3.1 showed that external scheduling with low MPL is feasible in that it doesn't cause a significant loss in throughput provided the MPL is not too low. Because we are working in a closed system, an immediate consequence of this fact is that the overall mean response time also does not suffer (see Little's Law [15]). However, this point is far less obvious for an open system, where response time is not inversely related to throughput. In this section we will investigate the effect of the MPL value on mean response time in great detail, starting with experimental work and then moving to queueing theoretic analysis.

Experimentally, we modify our experimental setup to an open system with Poisson arrivals. For the open system we find that for workloads based on TPC-C the response time is insensitive to the MPL value, provided it is at least 4. In the case of TPC-W based workloads, the MPL value needs to be at least 8 for a system utilization of 70%, and at least 15 if the system utilization increases to 90%, in order to obtain close-to-optimal mean response times (when compared to the system without MPL).

The most important observation is that the degree to which the MPL affects the mean response time is dominated by the variability of the workload, rather than other factors such as the resource utilization. For example the workloads based on the TPC-W benchmark consistently require a higher MPL than the TPC-C based benchmarks, independent of whether they are CPU bound (e.g. W_CPU-browsing) or I/O bound (e.g. W_I/O-browsing). The reason is that the service demands of the transactions in the TPC-W benchmark are more variable than those in the TPC-C benchmark.

The above observation can be explained both intuitively as well as through queueing theory. Intuitively, a low MPL increases overall mean response time when short transactions (which in a standard, non-MPL system would have short response times) get stuck waiting behind very long transactions in the external queue (independently of whether the long transaction is I/O-bound or CPU-bound). For this to happen the workload needs to exhibit high variability of the service requirements, i.e. the transaction mix must contain some transactions that are much longer than the average. From a theoretical perspective our external scheduling mechanism with MPL parameter can be viewed as a single unbounded First-in-first-out (FIFO) queue feeding into a Processor-Sharing (PS) server where only MPL jobs may share the PS server. A high MPL makes the system behave more like a PS server, while a low MPL makes it more similar to a FIFO server. In queueing theory it is well known that the mean response time at a FIFO server is directly affected by job size variability [13], while that of a PS server is insensitive to job size variability.

To get an idea of whether the levels of variability exhibited by the TPC-C and TPC-W benchmarks are representative, we obtain traces from one of the top-10 online retailers and from one of the top-10 auctioning sites in the
US for comparison. We compute the squared coefficient of variation (C^2), a standard statistical measure for variability, for both the traces and the benchmarks. We find that the C^2 values of the traces are in agreement with the TPC-C benchmark: In the TPC-C benchmark the C^2 value varies between 1.0 and 1.5 (depending on the setup), while the traces exhibit values for C^2 of around 2. The variability in the TPC-W benchmark is higher, exhibiting C^2 values of 15.

3.3. Results: Factors influencing choice of MPL

Our aim in this section has been to determine how low we can feasibly make the MPL without noticeably hurting throughput and mean response time. We have seen, via a wide range of experimental workloads, that the answer to this question is strongly dominated by just a few key factors of the workload.

For throughput, what's important is the number of resources that the workload would utilize if run without an MPL. For example, if an I/O-bound workload is run on a system with 4 disks, then a higher MPL is required than if the same workload is run on a system with only 1 disk.

With respect to not hurting overall mean response time, the dominant factor in lower-bounding the MPL is the variability in service demands of transactions. Workloads with more variable service demands require a higher MPL.

Importantly, we find that the question of how low one can feasibly make the MPL, both with respect to throughput and mean response time, is hardly affected by whether the workload is I/O bound, CPU bound, or lock bound. This is a surprising finding, and shows that the number of resources that must be utilized to keep throughput high is more important than the type of resources.

We note that the graphs shown in this section all assume a high offered load in terms of the transaction arrival rate, and as we have seen, it is quite feasible to make the MPL low with only small deterioration in throughput. When the offered load is low, the deterioration in throughput is even smaller, since the external queue is typically empty.

4. Finding the right MPL

The previous section demonstrates the general feasibility of external scheduling across a wide range of workloads. In all experiments an MPL of less than 20 suffices to achieve near optimal throughput and mean response time, while the number of clients is comparatively far higher than 20 (typically a hundred or several hundred). However, the performance study in the previous section merely indicates the general existence of a good MPL value. The purpose of this section is to develop techniques for automatically tuning the MPL value to make the external scheduling approach viable in practice. We seek a method for identifying the lowest MPL value that limits throughput and response time penalties to some threshold specified by the DBA (e.g. "throughput should not drop by more than 5%").

Database workloads are complex, and exactly predicting throughput and response time numbers is generally not feasible. The key observation is that for us it suffices to predict how a given MPL changes throughput and mean response time relative to the optimal performance. The change in performance caused by an MPL value is strongly dominated by only a few parameters (as summarized in Section 3.3): the change in throughput is mostly affected by the number of parallel resources utilized inside the DBMS; the change in mean response time is mainly affected by the variability in the workload. In both cases queueing-related effects dominate, rather than other performance factors.

The above observations lead us to the idea of tuning the MPL through a feedback control loop augmented with queueing theoretic guidance. We start by developing queueing theoretic models and analysis to capture basic properties of the relationship between system throughput and response time and the MPL. We then use these models to predict a lower bound on the MPL that limits performance penalties to some specified threshold. While the analytically obtained MPL value might not be optimal, it provides the control loop with a good starting value. The control loop then optimizes this starting value in alternating observation and reaction phases. The observation phase collects data on the relevant performance metrics (throughput and mean response time) and the reaction phase updates the MPL accordingly, i.e. if the throughput is too low the MPL is increased and if it is too high the MPL is decreased.

In the remainder of this section we detail the above approach. We first explain the queueing theoretic methods for predicting the relationship between MPL and throughput (Section 4.1) and mean response time (Section 4.2). In Section 4.3, we show how this knowledge can be used in a feedback control loop to fine-tune the MPL parameter.

4.1. Queueing analysis of throughput vs. MPL

We start by creating a very simplistic model of the database internal resources as shown in Figure 6 (see footnote 1). We model the MPL by using a "closed" system with a fixed (MPL) number of clients as represented in Figure 6. We assume that the service times of all devices are exponentially distributed with service rate proportional to their utilization in the unlimited system (with unbounded MPL).

[Figure 6. Theoretical model representing the DBMS internals (CPUs and disks). This model provides us with a theoretical upper bound on the MPL needed to provide maximum throughput.]

Footnote 1: Our current model includes only CPU and disk resources. We don't model memory (or buffer pool) as a separate resource since the time a transaction spends accessing memory is time it either occupies the CPU (memory hit) or utilizes a disk (memory miss) and is therefore accounted for.

The reason why such a simple model is sufficient is that we are only interested in achieved throughput relative to the optimal throughput. It is therefore not necessary to know the exact service demands at a device, just the relative proportions, since these will equally affect the throughput with and without MPL (e.g. a 5-times higher service demand will reduce throughput in both cases by a factor of 5). Moreover, in this type of queueing model the distribution of the service demand at the individual servers will not impact the throughput.

We analyze this "closed" system for different MPL values and determine the achieved throughput. We compare the results to the maximum throughput for the system, until we find the lowest MPL value that leads to the desired throughput level (e.g. not more than 5% lower than the maximum throughput). Simple binary search can be used to make this process more efficient.

The MPL yielded by this analysis is in fact an upper bound on the actual MPL that we would get in experiments, for two reasons. First, we purposely create the "worst case" in our analytical model by assuming that all resources are equally utilized. This is realistic for the experimental setups that we consider, since we assume that the data is evenly striped over the disks and the CPU scheduler will ensure that on average all CPUs are equally utilized. For unbalanced workloads a smaller MPL might actually be feasible, and this could easily be integrated into the model. Second, we do not allow for the fact that a client may be able to utilize two resources (e.g., two disks) at once.

To evaluate the usefulness of the model in predicting good MPL ranges we parameterize and evaluate the model based on the W_I/O-inventory workload. For this workload there is almost no CPU usage; however, the number of disks plays an important role. In our experiments, we were able to experiment with up to 4 disks, as shown in Figure 3. However in analysis we can go much further. Figure 7 shows the results of the analysis with up to 16 disks. The first observation is that the results of the analysis for 1 to 4 disks look very similar to the actual experimental results from Figure 3. Next, we observe that the MPL required to reach near maximum throughput grows linearly with the number of disks: The minimum MPL that is sufficient to achieve 80% of the maximum throughput is marked with circles, and the minimum MPL that is sufficient to achieve 95% of the maximum throughput is marked with squares. Both the circles and the squares form straight lines. This matches the linear trend we also observed in experiments.

[Figure 7. Results of theoretical analysis showing the effect of the MPL on throughput as a function of the number of resources (up to 16 disks). The squares (circles) denote the minimum MPL that limits throughput loss to 5% (20%). Note that the set of circles forms a perfectly straight line, as do the squares.]

The take-away point is that simple queueing analysis, as we have done, captures the main trends of the throughput vs. MPL function well, and is a useful tool in obtaining an initial estimate of the MPL required to achieve the desired throughput. While we find that the current analysis is a very good predictor of our experimental results for the 4-disk system, it is certainly possible to refine the analytic queueing model further, or to integrate it with existing simulation tools for more realistic modeling of the hardware resources involved. However, such improvements are not crucial since the main purpose of the above model is merely to provide the controller with a good starting value, rather than a perfect prediction.

4.2. Queueing analysis of response time vs. MPL

Section 3 indicates that the effect of the MPL on the mean response time is dominated by the variability in the workload and hardly affected by other workload parameters such as the bottleneck resource or the level of lock contention. For workloads with little variability (C^2 of about 1) MPL values around 4 are sufficient to achieve optimal mean response time, while more variable workloads (C^2 of about 15) require an MPL of 8 to 15 (depending on system load). However, these particular results for the right choice of the MPL are hard to generalize, since they are based on only two benchmarks with two different levels of variability (C^2 of about 1 and C^2 of about 15). We therefore resort to analysis to obtain more general results.

From a theoretical perspective our external scheduling mechanism with MPL parameter can be viewed as a single unbounded First-in-first-out (FIFO) queue feeding into a Processor-Sharing (PS) server where only MPL jobs may share the PS server, as illustrated in Figure 8. Note that this is a reasonable approximation of our system in that, as we see in [22], Figure 8, the DBMS in many ways behaves like a PS system.

[Figure 8. Queueing network model of external scheduling mechanism with MPL=2: Poisson arrivals feed a FIFO queue, which feeds a PS server.]

To the best of our knowledge, there is no existing simple solution to our queueing network in Figure 8. Therefore, we derive the following solution approach: We start by modeling the job sizes (service requirements) by a 2-phase hyperexponential (H2) distribution, with probability parameter p and rates μ1 and μ2, allowing us to arbitrarily vary the C^2 parameter. We can then represent the network in Figure 8 by an equivalent special "flexible multiserver queue" where the number of servers fluctuates between 1 and MPL as needed, and where the sum of the service rates at the multiple servers is always maintained constant and equal to that at the single PS server. The continuous-time Markov chain corresponding to the flexible multiserver queue is shown in Figure 9 for the case of an H2 service time distribution (with parameters p, μ1, and μ2), arrival rate λ, and MPL = 2. Note that we define the shorthand q = 1 − p. This Markov chain lends itself to Matrix-analytic analysis [14, 19], because of its repeating structure.

[Figure 9. Continuous-time Markov chain (CTMC) corresponding to the flexible multiserver queue representation of the queueing network in Figure 8. The two jobs in service may both have service rate μ1 (top row), or may have rates μ1 and μ2 (middle row), or may both have service rate μ2 (bottom row).]

Figure 10 shows the results of evaluating the Markov chain in Figure 9. We find that for low C^2 values of 1 or 2, the mean response time is largely independent of the MPL value and equal to that for the pure PS system (with infinite MPL), assuming the MPL is at least 5. For higher C^2 values of 5–15, we find that the MPL depends on the load and needs to be at least 10 (for load of 0.7) or 30 (for load of 0.9) to ensure low mean response time (similar to PS).

4.3. A simple controller to find lowest feasible MPL

Next we explain how we use feedback control combined with queueing theory for tuning the MPL parameter.

When using feedback control for tuning parameters, the difficult part is choosing the right amount by which to adjust the parameter in each iteration: too small, conservative adjustments will lead to long convergence times, while too large adjustments can cause overshooting and oscillations. We circumvent the problem by using the queueing theoretic models from the previous subsections to "jump-start" the control loop with a good, close-to-optimal starting value for the MPL. Initializing the control loop with a close-to-optimal starting value provides fast convergence times, even given only small conservative constant adjustments.

A second critical factor in implementing the feedback-based controller is the choice of the observation period. It needs to contain enough samples to provide a reliable estimate of mean response time and throughput. We determine the appropriate number of samples through the use of confidence intervals. For our workloads an observation period needs to span around 100 transactions to provide stable estimates. It is also important that the observation period being studied does not have unusually low load, as this would cause low throughput independent of the current MPL used. Our controller takes the above two points into account by updating the MPL only after observation periods that contain a sufficient number of executed transactions and exhibit representative system loads.

We find in experiments that our queueing theoretically enhanced controller converges for all our experimental setups in less than 10 iterations to the desired MPL. While we find that using our simplistic control loop is effective in determining the desired MPL, our
approach could easily be extended to incorporate more complex control methods, e.g. following guidelines provided in [11]. This will be particularly useful for situations where queueing theoretical models are not precise enough in predicting good, close-to-optimal starting values for the controller.

Figure 10. Evaluation of CTMC for different $C^2$. The system load is 0.7 (top) and 0.9 (bottom). [Both panels plot response time (msec) against the multiprogramming limit, with one curve each for $C^2$ = 15, 10, 5, 2 and for PS.]

5. External scheduling for Prioritization

Thus far we have presented an algorithm for finding a low MPL that doesn't hurt throughput or overall mean response time. The goal in keeping the MPL low is that a low MPL gives us control on the order in which transactions are scheduled, since we can pick the order in which transactions are dispatched from the external queue. Thus we are enabling certain transactions to run in isolation from others.

In this section, we apply our technique to the problem of differentiating between “high” and “low” priority transactions. Such a problem arises for example in the case of a database backend for a three-tiered e-commerce website. A small fraction of the shoppers at the website spend a large amount of money, whereas the remaining shoppers spend a small amount of money. It makes sense from an economic perspective to prioritize service to the “big spenders,” providing them with lower mean response time.

We would like to offer high priority transactions low response times and low priority transactions higher response times. The lower the MPL that we use, the greater the differentiation we can create between high and low priority response times. At the same time we would like to keep the MPL high enough that throughput and overall mean response time are not hurt beyond a specified threshold. The technique presented in Section 4 allows us to achieve both of the above goals by specifying an exact MPL which will achieve the required throughput and overall mean response time, while being as low as possible, and hence providing maximal differentiation between high and low priority transactions.

In Section 5.1 we present results achieved via external prioritization, where, for each workload, the MPL is adjusted using the methods from Section 4. In Section 5.2 we discuss how one could alternatively implement prioritization internally to the DBMS by scheduling internal resources. Finally in Section 5.3, we compare the effectiveness of our external and internal approaches, and show that external scheduling, with the proper MPL, can be as effective as internal scheduling for our workloads.

5.1. Effectiveness of external prioritization

We start by implementing and studying the effectiveness of external prioritization. The algorithm that we use for prioritization is relatively simple. For any given MPL, we allow as many transactions into the system as allowed by the MPL, where the high-priority transactions are given first priority, and low-priority transactions are only chosen if there are no more high-priority transactions (see Figure 1). The MPL is held fixed during the entire experiment.

Note that this paper does not deal with how the transactions obtain their priority class. As stated earlier, we assume that the e-commerce vendor has reasons for choosing some transactions/clients to be higher or lower-priority. Experimentally, we handle this by simply assigning, at random, 10% of the transactions “high” priority and the remainder “low” priority.

We first consider the case where the MPL is adjusted to limit throughput loss to 5% (compared to the case where no external scheduling is used), see Figure 11 (top), and then the case where the MPL is chosen to limit throughput loss to 20%, see Figure 11 (bottom). For each of these two cases, we experiment with all 17 setups shown in Table 2. In each experiment we apply the external scheduling algorithm described above and measure the mean response times for high and low priority transactions, in addition to the overall mean response time when no priorities are used.

We find that using external prioritization, in the case of 5% throughput loss (Figure 11 (top)), high priority transactions perform 4.2 to 21.6 times better than low priority transactions with respect to mean response time. The average improvement of high priority transactions over low priority transactions is a factor of 12.1. The low priority transactions suffer only a little as compared to the case of no prioritization, by a factor ranging from 1.15 to 1.17, with an average suffering of 16%. The above numbers are visible from the figure (or caption). Not visible from the figure is whether prioritization causes the overall mean response time to rise. It turns out that the overall mean response time is never hurt by more than 6% compared to the original system without external scheduling.

Figure 11. Results of external scheduling algorithm. This figure shows the mean response times for high and low priority requests, as well as the case of no prioritization, for all 17 setups described in Table 2. In the top graph, the MPLs have been set to sacrifice a maximum of 5% throughput for each experiment. In the bottom graph, the MPLs are set to sacrifice a maximum of 20% throughput. Observe that workloads 5, 9, and 10 have been cut off. The values for these workloads in (top) are (7.6 sec, 76.864 sec), (26.2 sec, 111 sec), and (9.4 sec, 50.9 sec), respectively, and in (bottom) are (4.1 sec, 79.3 sec), (15 sec, 112 sec), and (4.2 sec, 51.9 sec), respectively. [Both panels plot response time (sec) for each setup, 1 through 17.]

We find that using external prioritization, in the case of 20% throughput loss (Figure 11 (bottom)), high priority transactions perform 7 to 24 times better than low priority transactions with respect to mean response time. The average improvement of high priority transactions over low priority transactions is a factor of 18. The low priority transactions suffer by a factor ranging from 1.35 to 1.39, as compared to the case of no prioritization, with an average suffering of 37%. The above numbers are visible from the figure (or caption). Not visible from the figure is whether prioritization causes the overall mean response time to rise. It turns out that the overall mean response time is never hurt by more than 25% compared to the original system without external scheduling.

Observe that in the case of 20% throughput loss, the differentiation between high and low priority requests is more pronounced, since the MPL values are lower, but this comes at the cost of lower throughput and higher overall response times.

5.2. Implementation of internal scheduling

Scheduling the internals of the DBMS is obviously more involved than external scheduling. It is not even clear which resource should be prioritized: the CPU, the disk, the lock queues, etc. Once one resolves that first question, there is the follow-up question of which algorithm we should use to give priority to high-priority transactions, without extensively penalizing low priority transactions. Neither question has an obvious answer.

In a recent publication [16], we address the first question of which resource should be prioritized via a detailed resource breakdown. We find that in OLTP workloads run on 2PL (2-phase locking) DBMS, transaction execution times are often dominated by lock waiting times, and hence prioritization of transactions is most effective when applied at the lock queue. We find that other workloads or DBMS lead to transaction execution times being dominated by CPU usage or I/O, and hence prioritization of transactions is most effective when applied at those other resources.

Having seen that it is not obvious which internal resource needs to be scheduled, we now turn to the particular 17 setups shown in Table 2. Some of these (e.g., setups 3 and 4) are CPU bound, while others (e.g., 1 and 2) are lock-bound, and still others are I/O bound (e.g. setups 5-10).

In our experiments with internal scheduling we consider two particular setups: Setup 1 (Lock-bound) and Setup 3 (CPU-bound).

Figure 12. Comparison of internal vs external prioritization for setup 1. [Bars for internal, ext95, ext80 and ext100; the vertical axis is response time (sec).]

For setup 1, we implement the Preempt-on-Wait (POW) lock prioritization policy [17] in Shore [20]. In POW, high
priority transactions move ahead of low-priority transactions in the lock queue, and are allowed to even preempt a low-priority lock holder if that low-priority lock holder is waiting at another lock queue.

Figure 13. Comparison of internal vs external prioritization for setup 3. [Bars for internal, ext 95, ext 80 and ext 99; the vertical axis is response time (sec).]

For setup 3, CPU prioritization is available in IBM DB2 through the DB2gov tool [3]. However, we find that we achieve better priority differentiation by “manually” setting the CPU scheduling priorities used by the Linux operating system. We use the renice command in Linux to set the CPU priority of a DB2 process executing a high priority transaction to -20 (the highest available CPU priority) and the CPU priority of a DB2 process executing a low priority transaction to 20 (the lowest available CPU priority).

In the next section we show the results for internal scheduling for these setups.

5.3. Internal prioritization results and comparison with external results

In this section we consider setups 1 and 3 from Table 2. For each setup, we compare the performance obtained via internal prioritization with that obtained via external prioritization. We consider 3 versions of external prioritization, the first involving 5% throughput loss, the second involving 20% throughput loss, and the third involving 0% throughput loss. Figure 12 shows the results for setup 1, and Figure 13 shows the results for setup 3.

For both setups, we find that with respect to differentiating between high and low priority transactions, external scheduling is nearly as effective as the internal scheduling algorithms that we looked at herein (for the case of zero throughput loss), and can even be more effective when the MPL is low (at the cost of a sacrifice in throughput). Looking at the suffering of the low priority transactions as compared to the overall mean response time, we find that external scheduling results in only negligibly more suffering for the low priority transactions, when compared with the internal scheduling algorithms herein. The penalty to the low priority transactions is minimized when the MPL is chosen so that no throughput is lost.

Because of the inherent difficulty in implementing internal scheduling, we were only able to provide numbers for setups 1 and 3 out of the 17 setups in Table 2. However it is clear that for these two setups, external scheduling is a viable approach when compared with internal scheduling, and we hypothesize that external scheduling will compare favorably on the remaining setups as well, given the strong results shown for external scheduling in Figure 2.

We are not trying to say that external scheduling is always as effective as internal scheduling. Although the internal scheduling algorithms that we considered are quite advanced, there may be other internal scheduling algorithms which are superior to our external approach for certain workloads. Similarly, we are not trying to say that our proposed method for external scheduling is optimal. There may be many ways of further enhancing our external scheduler, for example by leveraging DBMS internal information on resource utilization, or information on resource demands of transactions. The point that we make in this paper is that external scheduling is a promising approach, when the MPL is adjusted appropriately.

6. Conclusion

This paper lays the experimental and theoretical groundwork for an exploration of the effectiveness of external scheduling of transactional workloads.

At the heart of our exploration is the question of how exactly one should limit the concurrent number of transactions allowed into the DBMS, i.e., the MPL (multi-programming limit). The obvious tradeoff is that one wants the MPL to be both low enough to create good prioritization differentiation and at the same time high enough so as not to limit throughput or create other undesirable effects like increasing overall mean response time.

Our work begins with an experimental study of how the MPL setting affects throughput and mean response time. Our experiments include a vast array of 17 experimental setups (see Table 2), spanning a wide variety of hardware configurations and workloads, and two different DBMS (Shore, IBM DB2). We find that the choice of a good MPL is dominated by a few key factors. The dominant factor in lower-bounding the MPL with respect to minimizing throughput loss is the number of resources that the workload utilizes. The key factor in choosing an MPL so as not to hurt overall mean response time is the variability in service demands of transactions. Whether a workload is I/O bound, CPU bound, or lock bound is much less important in choosing a good MPL. Throughout we find that the values of MPL that are needed to ensure high throughput and low overall mean response time are in the lower range, in particular when compared with the typical number of users associated with the above experimental setup workloads.

The above experimental study encourages us to develop a tool for dynamically determining the MPL as a function of the workload and system configuration. The tool takes as input from the DBA the maximum acceptable loss in system throughput and increase in mean response time, and determines the lowest possible MPL that meets these conditions. The tool uses a combination of queueing theoretic models and a feedback based controller, based on our discovery of the dominant factors affecting throughput and overall mean response time.

Finally, we apply our tool for adjusting the MPL to the problem of providing priority differentiation. Given high and low priority transactions, we schedule the external queue based on these priorities (high priority transactions are allowed to move ahead of low priority transactions) and the current MPL. We experiment with different MPL values by configuring our tool with different thresholds for the maximum acceptable loss in system throughput and increase in mean response time. We find that the external scheduling mechanism is highly effective in providing prioritization differentiation. Specifically, we achieve a factor of 12 differentiation in mean response time between high and low priority transactions across our 17 experimental setups, if the MPL is adjusted to limit deterioration in throughput and mean response time to 5%. If we allow up to 20% deterioration in throughput and overall mean response time, we obtain a factor of 16 differentiation between high and low priority response times.

Lastly, to gauge the effectiveness of our external approach, we implement several internal prioritization mechanisms that schedule the lock resources and the CPU resources. We find that our external mechanism and internal mechanisms are comparable with respect to their effectiveness in providing priority differentiation for the workloads studied.

Our methods for dynamically adapting the MPL are very general. While we have applied them only to OLTP workloads in this paper, they are likely to apply to other workloads as well, and also to more general scheduling policies.

References

[1] DB2 product family. http://www-3.ibm.com/software/data/db2/.
[2] IBM DB2 query patroller administration guide. ftp://ftp.software.ibm.com/ps/products/db2/info/vr7/pdf/letter/db2dwe70.pdf.
[3] DB2 Technical Support Knowledge Base. Chapter 28: Using the governor. http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/document.d2w/report?fn=db2v7d0frm3toc.htm.
[4] J. M. Blanquer, A. Batchelli, K. Schauser, and R. Wolski. Quorum: Flexible quality of service for internet services. In Proceedings of NSDI '05, 2005.
[5] M. J. Carey, S. Krishnamurthy, and M. Livny. Load control for locking: The 'half-and-half' approach. In ACM Symposium on Principles of Database Systems, 1990.
[6] Transaction Processing Performance Council. TPC benchmark C. Number Revision 5.1.0, December 2002.
[7] Transaction Processing Performance Council. TPC benchmark W (web commerce). Number Revision 1.8, February 2002.
[8] H. U. Heiss and R. Wagner. Adaptive load control in transaction processing systems. In Proceedings of Very Large Database Conference, pages 47–54, 1991.
[9] Wei Jin, Jeffrey S. Chase, and Jasleen Kaur. Interposed proportional sharing for a storage service utility. In Proceedings of ACM SIGMETRICS '04, pages 37–48, 2004.
[10] A. Kamra, V. Misra, and E. Nahum. Yaksha: A controller for managing the performance of 3-tiered websites. In Proceedings of IEEE International Workshop on Quality of Service (IWQoS 2004), 2004.
[11] C. Karamanolis, M. Karlsson, and X. Zhu. Designing controllable computer systems. In Hot Topics in Operating Systems (HotOS '05), 2005.
[12] N. Katoh, T. Ibaraki, and T. Kameda. Cautious transaction schedulers with admission control. ACM Trans. Database Syst., 10(2):205–229, 1985.
[13] Leonard Kleinrock. Queueing Systems, volume I. Theory. John Wiley & Sons, 1975.
[14] G. Latouche and V. Ramaswami. Introduction to Matrix Analytic Methods in Stochastic Modeling. ASA-SIAM, 1999.
[15] J. Little. A proof of the theorem L = λW. Operations Research, 9:383–387, 1961.
[16] D. T. McWherter, B. Schroeder, A. Ailamaki, and M. Harchol-Balter. Priority mechanisms for OLTP and transactional web applications. In 20th Int. Conference on Data Engineering (ICDE'2004), 2004.
[17] D. T. McWherter, B. Schroeder, A. Ailamaki, and M. Harchol-Balter. Improving preemptive prioritization via statistical characterization of OLTP locking. In 21st International Conference on Data Engineering (ICDE'2005), 2005.
[18] A. Moenkeberg and G. Weikum. Performance evaluation of an adaptive and robust load control method for the avoidance of data-contention thrashing. In Proceedings of Very Large Database Conference, pages 432–443, 1992.
[19] M. F. Neuts. Matrix-Geometric Solutions in Stochastic Models. Johns Hopkins University Press, 1981.
[20] University of Wisconsin. Shore - a high-performance, scalable, persistent object repository. http://www.cs.wisc.edu/shore/.
[21] PostgreSQL. http://www.postgresql.org.
[22] B. Schroeder, M. Harchol-Balter, A. Iyengar, and E. Nahum. Achieving class-based QoS for transactional workloads. In 22nd International Conference on Data Engineering (ICDE'2006), 2006.
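
As a rough illustration of the external prioritization rule described in Section 5.1, the dispatch logic can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in our experiments: the class name `ExternalScheduler`, its method names, and the choice of FIFO order within each priority class are hypothetical and introduced here only for illustration. The sketch keeps at most MPL transactions inside the DBMS and admits a low-priority transaction only when no high-priority transaction is waiting.

```python
from collections import deque


class ExternalScheduler:
    """Hypothetical two-class external queue in front of a DBMS with a fixed MPL."""

    def __init__(self, mpl):
        if mpl < 1:
            raise ValueError("MPL must be at least 1")
        self.mpl = mpl        # maximum number of transactions inside the DBMS
        self.in_dbms = 0      # transactions currently inside the DBMS
        self.high = deque()   # waiting high-priority transactions
        self.low = deque()    # waiting low-priority transactions
        # Assumption: FIFO order within each class (not specified in the text).

    def submit(self, txn, high_priority=False):
        """Place a transaction in the external queue, then admit what fits."""
        (self.high if high_priority else self.low).append(txn)
        return self._dispatch()

    def complete(self):
        """Record that one transaction left the DBMS and refill the free slot."""
        if self.in_dbms == 0:
            raise RuntimeError("no transaction is currently in the DBMS")
        self.in_dbms -= 1
        return self._dispatch()

    def _dispatch(self):
        """Admit waiting transactions up to the MPL, high priority first."""
        admitted = []
        while self.in_dbms < self.mpl and (self.high or self.low):
            # Low-priority work is chosen only if no high-priority work waits.
            queue = self.high if self.high else self.low
            admitted.append(queue.popleft())
            self.in_dbms += 1
        return admitted
```

With `mpl=2`, two low-priority submissions are admitted immediately, while a third low-priority and a later high-priority submission both wait; the next call to `complete()` admits the high-priority transaction ahead of the earlier low-priority one. The lower the MPL, the more transactions wait in the external queue, which is where this reordering can take effect.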