Consistency Without Borders
Peter Alvaro... Hellerstein

Abstract

Distributed consistency is a perennial research topic; in recent years it has become an urgent practical matter as well. The research literature has focused on enforcing various flavors of consistency at the I/O layer, such as li
and language levels with various tradeoffs between efficiency, generality, and engineering complexity. In this paper, we motivate the further (and, in some cases, renewed) study of these alternative approaches to distributed consistency. We offer an informal taxonomy of strategies and associated insights both from our own work as well as recent developments from other researchers. We identify opportunities for further exploration and highlight several areas in which non-I/O-level mechanisms have already begun to succeed in the wild. As a community, we have an opportunity to demonstrate that correctness at scale is not in conflict with availability, performance, and programmer productivity.

2 Case Study

To illustrate how consistent outcomes can be achieved at several different places in the software stack, consider a scenario in which several programs manipulate a directed graph. This system can be divided into (at least) two tiers: the storage tier (e.g., a DBMS or key-value store) manages the persistence of the graph data structure, while the application tier accesses the graph by submitting read and write operations against the graph store. To improve fault tolerance and scalability, we assume the graph is replicated and partitioned. We consider two applications that use this graph store:

1. The deadlock detector queries a waits-for graph that records dependencies between processes. The task is to check whether the graph contains a cycle, which indicates a deadlock [21].
2. The garbage collector uses a refers-to graph to record references between a collection of distributed objects. The objective is to detect strongly connected components that are not reachable from a distinguished root object; such components can safely be reclaimed [1].

Both programs have similar correctness requirements. For the deadlock detector, all deadlocks should eventually be reported, with no false positives. Similarly, the garbage collector should ensure that all unreachable components are eventually discovered, and no live objects are returned as suitable for garbage collection.

How should we map these application-level semantics down to the low-level storage abstraction? In the remainder of the paper, we will consider these semantics at the traditional extremes (Section 3), and via convergent objects (Section 4), distributed dataflows (Section 5), and whole program analysis (Section 6).

Before we do so, we note that neither example application requires a strong consistency guarantee such as linearizability or serializability to maintain correctness. Both deadlock and unreferenced memory are stable properties [20]: once such a property holds, it will persist (until a corrective action is taken, such as aborting one of the participants in a deadlock). However, deadlock is also a strong stable property [52]: it can be detected given a subset of the global graph. This implies that the deadlock detector only requires very weak semantic guarantees from the graph store: as long as all waits-for edges are eventually observed (regardless of order), all deadlocked transactions will be detected. In contrast, the garbage collector requires global knowledge: just because one partition of the graph store contains no references to an object does not imply there are no references globally.

Hence, garbage collection requires stronger consistency guarantees than deadlock detection. However, neither requires strong consistency (and its concomitant costs in decreased availability and increased latency) to achieve correct behavior.

3 Consistency at the Extremes

To guarantee that application-level invariants are never violated, programmers are often forced to choose between one of two extreme strategies: generic I/O-level interfaces that control the order of events such as messages or reads and writes, and custom, typically ad hoc solutions that force application logic to assume all responsibility for ensuring that correctness invariants are preserved. Both approaches have significant limitations.

3.1 I/O-Level Consistency

Database systems have long provided guarantees about the interleaving of conflicting operations on shared state [13]. These guarantees are defined in terms of storage operations like reads and writes: for example, conflict serializability defines a conflict as two operations on the same data item submitted by different transactions, in which at least one of the operations is a write [50]. Although originally defined for centralized systems, these consistency models have subsequently been applied to distributed data management [12]. A wide variety of consistency models have been proposed that make different tradeoffs between latency, availability, and the space of permissible operation interleavings (e.g., [7, 25, 44, 45, 57]).

Similarly, distributed systems often rely on ordering guarantees on messages that reference shared state. Techniques such as state machine replication [53] ensure consistency among replicas of a distributed service by guaranteeing that messages are delivered in the same order to all replicas. Group communication systems [15] provide a variety of ordering guarantees for broadcast messages.

sition of a confluent replicated graph store and a confluent deadlock detector yields a confluent composite dataflow, and allows the system to execute without synchronization. The garbage collection component would be annotated as non-confluent but (as is common practice [41]) partitioned into generations or epochs. If the dataflow is enhanced to produce sealing punctuations that indicate when individual allocators will produce no more edges within a given epoch, Blazes can synthesize a simple, barrier-based coordination strategy that prevents the garbage collector from executing until the graph partition is sealed (that is, the mark phase has ended for a given epoch). This strategy is much less expensive than a general coordination protocol: rather than waiting for coordination on every message, only a single coordination round is required per epoch.

The principal drawback of the dataflow approach is the need for manual component annotations: annotating modules can be burdensome and error-prone, especially for complex components. Incorrect annotations corrupt the analysis and can result in unsafe optimizations. For reusable modules (like the CRDTs discussed in Section 4), it may be possible to have an expert supply annotations. This amortizes the cost of annotation and reduces the risk of errors, but is only applicable for commonly used components.

This drawback aside, flow-level approaches to consistency occupy an interesting middle ground: they are more broadly applicable than language- or application-level approaches, and more powerful than object-level approaches, which cannot capture composition across services.

6 Language-Level Consistency

Flow-level consistency only requires an abstract dataflow graph depicting the system architecture, and hence can be used with existing programs and off-the-shelf stream processors such as Storm [43]. However, it also requires that users manually add semantic annotations, which is burdensome and error-prone. These concerns are exacerbated as the complexity of the system increases. In this section, we consider a more radical approach: if the entire system is written in a high-level language that directly encodes both dependencies and appropriate semantic properties, the compiler can automatically analyze the consistency properties of entire applications.

6.1 Dependency Analysis

Database systems are a prominent example of the power of automatic dependency analysis. Because all data has a uniform representation (relations) and declarative rules are used to compute derived data (e.g., views), the system can easily observe how base data is used to compute derived data. This allows powerful capabilities like automatic materialized view maintenance [35], constraint inference [17, 46], and provenance analysis [22].

To enable similarly powerful lineage analysis for large-scale distributed systems, several technical challenges must be addressed. First, we need a uniform representation for all system state, including process-local knowledge, system events like timers and interrupts, and network messages. Second, we need a notion of dependencies that accounts for both synchronous, process-local dependencies (local computation) and asynchronous, cross-process dependencies (communication). We call the combination of these ideas data-centric programming [2]: all system state is represented in a uniform manner (as relations), which enables the system logic to be written as declarative queries over that state. An extended language that admits asynchronous queries can capture communication within the same declarative framework [5]. The most recent data-centric language designed by our group is called Bloom [4, 16].

6.2 Semantics

Dependency analysis reveals how inputs, outputs, and intermediate state are related; in addition, we need knowledge of semantics: that is, how these data values change over time and which invariants are preserved. Semantic properties and coordination requirements are closely related: if a program's semantics allow a situation in which a correctness invariant might be violated, then we might use a coordination protocol to prevent such a scenario from arising.

An important semantic property is monotonicity: intuitively, a monotonic operator is one that processes its inputs in an order-insensitive manner and never retracts a previous output in the face of new information. Typical examples of monotonic operators include set union, join, projection, and selection [4], as well as CRDT-like lattices with algebraic composition [24]. The CALM Theorem states that, if a program can be expressed entirely using monotonic logic, it is guaranteed to be confluent: that is, deterministic despite the effects of network non-determinism [6, 38]. Hence, monotonic operations form a safe vocabulary for distributed programming: because the program's output is a deterministic function of its input, it is much easier to check that correctness invariants are preserved.

For data-centric languages such as Bloom, there is a simple conservative test to determine the monotonicity of individual rules or entire programs; essentially, monotonicity is part of the language's type system [4]. Because monotonicity implies confluence, this test can identify a program's consistency requirements. For example, con-

ing whether a given application can produce serializable outcomes when run at a lower isolation level has been studied in the database literature [31].

Beyond determinism. Work on eventual consistency often tries to guarantee deterministic behavior. For example, confluence analysis identifies program fragments that produce deterministic outcomes despite non-deterministic network behavior. Similarly, CRDTs ensure that all replicas of an object converge to the same state, regardless of duplicated or reordered messages. However, determinism is too strong for some common application-level invariants. Consider the simple invariant: A purchase request returns a confirmation if inventory is non-zero; otherwise it returns failure. This is non-deterministic: the set of successful purchases depends on the order in which messages are delivered and processed.

What is the best way to reason about non-deterministic but well-defined correctness criteria? One strategy is to simply encode the space of acceptable outcomes as a disjunction (e.g., Purchase X succeeds and Y fails OR purchase X fails and Y succeeds). A confluent system that satisfies this disjunction ensures that an acceptable outcome is always produced. However, enumerating the space of acceptable outcomes scales poorly as application complexity grows. Is there a more natural model than this enumerated choice of outcomes, and, if so, can we build program analysis tools to support it? More fundamentally, beyond monotonicity, are there design patterns that assist in achieving such controlled non-determinism, and can such patterns be codified into theorems, analysis techniques, and language constructs?

8 Conclusion

The development of reliable distributed applications depends upon programmers' ability to reason about consistency. By narrowly focusing on I/O-level consistency, traditional research in this area risks increasing irrelevance: as the latency and availability costs of traditional consistency protocols have become prohibitive at scale, developers have begun to avoid consistency mechanisms entirely, instead relying on ad hoc, application-specific rules for conflict resolution and reconciliation. We believe that the solution is to meet application developers on their home turf: to explore a variety of consistency mechanisms, analysis tools, and programming constructs that operate at different layers of the software stack. The goal should be to help programmers judiciously employ consistency of the appropriate strength and to reason about consistency wherever it is most natural. The core tension lies in balancing expressivity and efficiency with generality and modularity. We have sketched examples and insights from our experience straddling these boundaries, but we suspect that further progress will require the research community to reconsider long-held assumptions about software architecture and the division between storage and application logic.

Acknowledgments

We would like to thank Emily Andrews, Alex Rasmussen, and the anonymous reviewers for their helpful feedback on this paper, and particularly our shepherd, Phil Bernstein. This work was supported by the Air Force Office of Scientific Research (grant FA95500810352), DARPA XData Award FA8750-12-2-0331, the Natural Sciences and Engineering Research Council of Canada, the National Science Foundation (grants CNS-0722077, IIS-0713661, and IIS-0803690), NSF CISE Expeditions award CCF-1139158, the National Science Foundation Graduate Research Fellowship (grant DGE-1106400), and gifts from Amazon, Cisco, Clearstory Data, Cloudera, EMC, Ericsson, Facebook, FitWave, General Electric, Google, Hortonworks, Intel, Microsoft, NetApp, NTT, Oracle, SAP, Samsung, Splunk, VMware, and Yahoo!.

References

[1] S. E. Abdullahi and G. A. Ringwood. Garbage collecting the Internet: a survey of distributed garbage collection. ACM Computing Surveys, 30(3):330–373, 1998.
[2] P. Alvaro, T. Condie, N. Conway, K. Elmeleegy, J. M. Hellerstein, and R. Sears. BOOM Analytics: Exploring data-centric, declarative programming for the cloud. In EuroSys, 2010.
[3] P. Alvaro, N. Conway, J. M. Hellerstein, and D. Maier. Blazes: coordination analysis for distributed programs. http://arxiv.org/abs/1309.3324, 2013. In submission.
[4] P. Alvaro, N. Conway, J. M. Hellerstein, and W. R. Marczak. Consistency analysis in Bloom: a CALM and collected approach. In CIDR, 2011.
[5] P. Alvaro, W. R. Marczak, N. Conway, J. M. Hellerstein, D. Maier, and R. Sears. Dedalus: Datalog in time and space. In O. de Moor, G. Gottlob, T. Furche, and A. Sellers, editors, Datalog Reloaded, volume 6702 of Lecture Notes in Computer Science, pages 262–281. Springer Berlin / Heidelberg, 2011.
[58] P. A. Tucker, D. Maier, T. Sheard, and L. Fegaras. Exploiting punctuation semantics in continuous data streams. IEEE Transactions on Knowledge and Data Engineering, 15(3):555–568, 2003.
[59] W. Vogels. Eventually consistent. Communications of the ACM, 52(1):40–44, 2009.
[60] W. E. Weihl. Commutativity-based concurrency control for abstract data types. IEEE Transactions on Computers, 37(12):1488–1505, 1988.
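
As an illustrative aside to the discussion of the deadlock detector (Section 2) and monotonicity (Section 6.2), the following Python sketch shows why cycle detection over a grow-only waits-for edge set is insensitive to delivery order and duplication. It is not taken from the paper or from Bloom: the names `has_cycle`, `replay`, `verdict_is_monotone`, and the sample edge list `WAITS_FOR` are hypothetical, and the sketch assumes edges are only ever added (it does not model corrective actions such as aborting a transaction).

```python
from itertools import permutations


def has_cycle(edges):
    """Return True if the directed graph given by (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge: a cycle exists
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


def replay(deliveries):
    """Deliver edges one at a time into a grow-only set.

    Returns the detector's verdict after each delivery.
    """
    seen = set()
    verdicts = []
    for edge in deliveries:
        seen.add(edge)  # edges are added, never retracted (assumption)
        verdicts.append(has_cycle(seen))
    return verdicts


def verdict_is_monotone(verdicts):
    """True if the verdict never flips from True back to False."""
    return all(not earlier or later
               for earlier, later in zip(verdicts, verdicts[1:]))


# Hypothetical waits-for edges: T1 -> T2 -> T3 -> T1 form a deadlock.
WAITS_FOR = [("T1", "T2"), ("T2", "T3"), ("T3", "T1"), ("T4", "T1")]

# Try every delivery order of the same edge set.
all_runs = [replay(order) for order in permutations(WAITS_FOR)]
```

Under these assumptions, every delivery order ends with the same verdict, and no run ever retracts a reported deadlock; replaying duplicated edges changes nothing because the edge set absorbs duplicates. A garbage collector written in the same style would not enjoy this property, since "no references exist" can be invalidated by an edge that has not yet arrived.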