Eventual Consistency: How soon is eventual? An Evaluation of Amazon S3's Consistency Behavior


lastname@kit.edu

ABSTRACT

Over the last few years, Cloud storage systems and so-called NoSQL datastores have found widespread adoption. In contrast to traditional databases, these storage systems typically sacrifice consistency in favor of latency and [...]


[...] models come in slightly different flavors, ranging from traditional strict consistency, which requires all replicas of all data items to be identical as well as all semantic relationships between data items to be observed, to consistency guarantees which can be found in systems like the Google File System [8], where replicas are treated as consistent once every copy includes every single update at least once. On the other hand, there are client-centric consistency models that do not care about the internal state of a storage system. Instead, they focus on the consistency guarantees which can actually be observed by one or more clients, e.g., whether stale data is returned or not.

In consequence, when measuring how soon eventual consistency is (that is, measuring the length of the inconsistency window), there are again two different perspectives. A Cloud storage provider or anyone with access to the source code of a storage system (in the following just provider) would rather focus on a data-centric perspective. To a consumer, in contrast, it really does not matter whether a Cloud storage system internally contains a huge number of stale replicas as long as the provider has implemented mechanisms to deal with those. As long as no stale data is observed, the customer is satisfied.

3. APPROACH AND IMPLEMENTATION

Measuring the length of the inconsistency window is trivial from a provider perspective: By adding detailed logging or notification functionality to the storage system, it is easily possible to have the actual timestamps of each replica update readily available. By calculating the difference between the latest and the first timestamp it is, hence, possible to get the desired result.

A customer, in contrast, might only have black-box access to the storage system (e.g., in the case of Cloud storage systems) or might not have the means or knowledge to change the source code of a storage system. From this perspective, it is more important to know for how long after issuing an update it is still possible to read the old version. For eventually consistent storage systems, this value is typically expected to be greater than zero. It can be experimentally determined by the following steps:

1. Create a timestamp.
2. Write a version number to the storage system.
3. Continuously read until the old version number is no longer returned, then create a new timestamp.
4. Calculate the difference between the write timestamp and the second timestamp (time of the last read of the previous version).
5. Repeat these steps to achieve statistical significance.

Depending on the latency of step 2, an alternative approach might create another timestamp between steps 2 and 3 and use the mean of the two write-side timestamps for the calculation in step 4 (instead of the timestamp from step 1).

Please note that it is necessary to use the last read of the old version and not the first read of the new version: for systems where monotonic read consistency (footnote 5) is violated, the timestamp of the last read of the old version may be long after the timestamp of the first read of the new version. Our system identifies the last read of a particular version using an internal buffer: for every version, each reader remembers the last time it could read that version. Once the buffer is full, the inconsistency window is calculated for the oldest version only, before it is removed from the buffer. For our experiments we have chosen combinations of buffer size and write interval which guarantee that the highest observed inconsistency window easily fits into the buffer, i.e., bufferSize × writeInterval ≫ maxInconsistencyWindow. For example, a peak inconsistency window of about 33 s was combined with a configuration which enables us to capture values as large as 100 s.

Footnote 5: Monotonic read consistency is defined as follows: After having returned version n to a specific client, the system guarantees to return only versions ≥ n [20].

Independent from our work, Wada et al. [21] propose a very similar approach, already with interesting results. In our opinion, their approach has a fundamental flaw, though: only one reader is used in their experimental setup. By using only one reader, especially when running in the same data center as the writer or, even worse, on the same machine, it is improbable to actually discover staleness. This is due to two facts:

1. A distributed storage system usually uses some kind of load balancer. Depending on the intelligence of the load balancer, it is not unlikely that all requests from the same IP range are forwarded to the same replica or that there is even a caching layer in between. Accuracy can be greatly increased by running additional geographically distributed readers.
2. One reader can, depending on the latency L of the storage system, only achieve a resolution of 1/L, i.e., send only 1/L requests per unit of time. Anything that happens in between is unknown. This resolution of the results can be almost linearly improved by adding more reader instances.

For these reasons, we have implemented a system where one writer periodically writes a local timestamp plus a version number to the storage system. In addition, there are a number of geographically distributed readers (the actual number depends on the storage system). These reader instances continuously poll the storage system and remember for each version the latest point in time at which they could still read that specific version. After collecting this data from all readers, we then consider the difference between the latest read timestamp of version n and the write timestamp of version n+1. This is because the client-observable inconsistency window is the period of time after submitting an update during which it is still possible to read the previous version.

Figure 1 shows an example which shall serve to better explain how we derive our results. The data used is not real data, as we usually have about 1,000 reads in between two writes. We have observed similar logs in real monitoring data, though. In this example, the storage system violates monotonic read consistency. The figure shows a timeline in the left column, the data the writer wrote in the second column, and what the two readers read at different points in time in the other two columns. Based on the highlighted last reads for a given version, it is then possible to calculate the table in the right part of the figure. For example, after 5 units of time (TU) the writer writes version B to the storage system. Reader 1 reads the old [...]

Figure 3: Length of LOW and SAW Periods over Time on S3

[...] counterparts. For our purposes we placed a bucket in the region eu-west (Ireland) since we had, during our MiniStorage tests, observed that we could not start EC2 instances in us-east-1a whereas we could start instances in all availability zones of eu-west. When we repeated our MiniStorage test for S3, starting additional readers at certain intervals, we observed that our results were fairly constant beyond 8 readers. To nevertheless play it safe, we deployed 12 readers, 4 per availability zone. Our writer as well as the collector were deployed in zone a; all instances were again small instances.

We chose an update interval of 10 s to give each update enough time (in our mind) to propagate without interfering with older updates. The poll interval per reader was set to 10 ms. We started the test on August 29, 2011, at 8:30 AM (UTC) and kept it running for a week.

In contrast to the findings of Wada et al. [21], who could not observe any inconsistencies at all, and in contrast to our expectations of seeing a normal distribution of inconsistency window lengths, our results show some strange periodicities. First, there is a long-term periodicity: Roughly every 12 hours the behavior of S3 abruptly changes between what we will call a LOW phase and a SAW phase. Figure 3 shows the length of those periods in comparison.

During the LOW phase we actually find a random distribution (footnote 8) with a mean value of 28 ms and a median of 15 ms. Please note that these values may be exact but could be off by at least a factor of 2 due to the accuracy limitations of NTP [15], which we use for clock synchronization. We believe, though, that median and mean values between 0 and 100 ms are realistic.

Footnote 8: The distribution has three local maxima: the absolute maximum at 7 ms, the next smaller local maximum at 26 ms and another small local maximum at 90 ms.

During our SAW periods we can observe a curve which resembles a sawtooth wave, hence the name. It really does not matter which SAW phase we select an excerpt from; the periodicity always follows the same pattern: First, the inconsistency window's length is close to zero. Then, it increases by about one or two seconds with every test until it peaks at about eleven seconds before dropping straight down to the next minimum. The only difference is that the minimum lies in the interval between zero and five seconds and the maximum between ten and twelve seconds. The wavelength of this pattern fluctuates between eight and twelve tests, i.e., for our test setup the pattern restarts every 80 to 120 s. Figure 4 shows an excerpt from one of the SAW phases.

Figure 4: Observed Inconsistency Window Length during SAW Periods Over Time on S3 (Excerpt)

We have been researching the question of consistency monitoring for quite a while now. Repeated tests on S3 showed the exact same results. Already in July and August 2010, we experimentally analyzed consistency guarantees of S3 via an independent implementation which also used a slightly different algorithm. Even back then (when it was only a by-product of our evaluation of [3]) we observed remarkably similar behavior.

Figure 5 shows the full results of our one-week evaluation of S3. Due to the sheer number of test runs and, hence, the density of the curve, it is not possible to see the sawtooth pattern during the SAW phases, but it is still easily possible to distinguish SAW and LOW phases.

Another finding was that the availability zones seem to differ in terms of accessing the latest version. While our writer was in zone a, the longest inconsistency window length was observed in zone a in 28% of all tests. The same is true for zone c, while zone b had the maximum in 49% of all tests (footnote 9). This indicates that zone b seems to have a slightly poorer connection to the other two zones, e.g., by being located in a different building.

Footnote 9: The total is not 100% as for about 5% of all tests two zones observed the same inconsistency window.

Beyond that, we could not see differences between the zones: They all followed the same sawtooth wave and had their maxima and minima at the exact same time. Only the amplitudes were slightly different, which leads to the results from the last paragraph.

We also tested our results for violations of monotonic read consistency. From a total of 353,357,884 reads, 42,565,840 or about 12% of all requests violated monotonic read consistency [20]. In exchange, we observed an availability of more than eight nines (99.9999997%; only one request returned an error).

6. DISCUSSION

In summary, we observed an unexpected, very interesting consistency behavior of Amazon S3, but have so far not been able to come up with a satisfying explanation of our experimental findings. Possible explanations could be caching effects or measures to counter DDoS attacks [...]

[...] encountered strange periodicities, namely our so-called SAW and LOW phases, which alternate approximately twice a day. Furthermore, we described the sawtooth wave-like behavior of S3 during SAW phases before discussing potential explanations.

Our approach of geographically distributed readers combined with a writer fits into current research regarding benchmarking of distributed datastores as well as systems building on top of them. Our results provide concrete data that serves as criteria for an application developer to determine whether an eventually consistent datastore provides acceptable consistency guarantees.

In future endeavors, we will try to determine dependencies between files on S3, e.g., how periodicities of files within the same bucket or across multiple buckets correlate. Furthermore, we are currently benchmarking Apache Cassandra and the Google App Engine datastore. We plan to publish these results as well as to extend our efforts to additional storage systems in a follow-up paper. Finally, Yu and Vahdat [22] as well as similar models know other consistency dimensions beyond staleness, e.g., order error. We are investigating means to also measure these dimensions.

9. REFERENCES

[1] E. Anderson, X. Li, M. Shah, J. Tucek, and J. Wylie. What consistency does your key-value store actually provide. In Proceedings of the Sixth Workshop on Hot Topics in System Dependability (HotDep), 2010.
[2] R. Baldoni, A. Corsaro, L. Querzoni, S. Scipioni, and S. Tucci-Piergiovanni. An adaptive coupling-based algorithm for internal clock synchronization of large scale dynamic systems. In Proceedings of the 2007 OTM Confederated international conference on On the move to meaningful internet systems - Volume Part I, pages 701–716. Springer-Verlag, 2007.
[3] D. Bermbach, M. Klems, M. Menzel, and S. Tai. Metastorage: A federated cloud storage system to manage consistency-latency tradeoffs. In Proceedings of the 4th International Conference on Cloud Computing (IEEE Cloud 2011). IEEE, 2011.
[4] B. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. PNUTS: Yahoo!'s hosted data serving platform. Proceedings of the VLDB Endowment, 1(2):1277–1288, 2008.
[5] B. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears. Benchmarking cloud serving systems with YCSB. In Proceedings of the 1st ACM symposium on Cloud computing, pages 143–154. ACM, 2010.
[6] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's highly available key-value store. In Proc. SOSP, 2007.
[7] A. Fox and E. Brewer. Harvest, yield, and scalable tolerant systems. In Proceedings of the 7th Workshop on Hot Topics in Operating Systems, 1999, pages 174–178. IEEE, 2002.
[8] S. Ghemawat, H. Gobioff, and S. Leung. The Google file system. ACM SIGOPS Operating Systems Review, 37(5):29–43, 2003.
[9] M. Klems, M. Menzel, and R. Fischer. Consistency benchmarking: Evaluating the consistency behavior of middleware services in the cloud. In Proceedings of the 8th International Conference on Service Oriented Computing (ICSOC). Springer, Dec. 2010.
[10] D. Kossmann, T. Kraska, and S. Loesing. An evaluation of alternative architectures for transaction processing in the cloud. In Proceedings of the 2010 international conference on Management of data, pages 579–590. ACM, 2010.
[11] T. Kraska, M. Hentschel, G. Alonso, and D. Kossmann. Consistency Rationing in the Cloud: Pay only when it matters. Proceedings of the VLDB Endowment, 2(1):253–264, 2009.
[12] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, C. Wells, et al. Oceanstore: An architecture for global-scale persistent storage. ACM SIGARCH Computer Architecture News, 28(5):190–201, 2000.
[13] A. Lakshman and P. Malik. Cassandra: a decentralized structured storage system. ACM SIGOPS Operating Systems Review, 44(2):35–40, 2010.
[14] M. Menzel, M. Schoenherr, and S. Tai. (mc2)2: criteria, requirements and a software prototype for cloud infrastructure decisions. Software: Practice and Experience, 2011.
[15] ntp.org. NTP Algorithm. http://www.ntp.org/ntpfaq/NTP-s-algo.htm (accessed on September 6, 2011).
[16] S. Sakr, L. Zhao, H. Wada, and A. Liu. Clouddb autoadmin: Towards a truly elastic cloud-based data store. In The 9th IEEE International Conference on Web Services (ICWS 2011), Washington DC, USA, July 2011.
[17] M. Satyanarayanan, J. Kistler, P. Kumar, M. Okasaki, E. Siegel, and D. Steere. Coda: A highly available file system for a distributed workstation environment. IEEE Transactions on computers, pages 447–459, 1990.
[18] A. S. Tanenbaum and M. V. Steen. Distributed Systems - Principles and Paradigms. Pearson Education, Upper Saddle River, NJ, 2nd edition, 2007.
[19] D. Terry, M. Theimer, K. Petersen, A. Demers, M. Spreitzer, and C. Hauser. Managing update conflicts in Bayou, a weakly connected replicated storage system. ACM SIGOPS Operating Systems Review, 29(5):172–182, 1995.
[20] W. Vogels. Eventually consistent. Queue, 6:14–19, October 2008.
[21] H. Wada, A. Fekete, L. Zhao, K. Lee, and A. Liu. Data consistency properties and the tradeoffs in commercial cloud storages: the consumers' perspective. In 5th biennial Conference on Innovative Data Systems Research, CIDR, volume 11, 2011.
[22] H. Yu and A. Vahdat. Design and evaluation of a conit-based continuous consistency model for replicated services. ACM Transactions on Computer Systems (TOCS), 20(3):239–282, 2002.
[23] L. Zhao, A. Liu, and J. Keung. Evaluating cloud platform architecture with the care framework. In 2010 Asia Pacific Software Engineering Conference, pages 60–69. IEEE, 2010.
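
The client-side calculation described in Section 3 (latest read of version n across all readers minus the write timestamp of version n+1) can be illustrated with a short Python sketch. This is not the authors' implementation. The function names, the tuple layout of the read log, the use of integer version numbers, the clamping of negative differences to zero, and the sample log are all assumptions made for this illustration. The second function counts reads that violate monotonic read consistency as defined in footnote 5.

```python
def inconsistency_windows(writes, reads):
    """Client-observable inconsistency window per version.

    writes: dict mapping version (int) -> write timestamp.
    reads:  iterable of (reader_id, timestamp, version) observations
            collected from all readers (hypothetical log layout).
    Returns a dict mapping version n -> (latest read of n by any reader)
    minus (write timestamp of n+1). Negative differences are clamped to
    0.0, an assumption meaning "no stale read was observed".
    """
    last_read = {}
    for _reader, ts, version in reads:
        # Remember the latest point in time at which each version was seen.
        if version not in last_read or ts > last_read[version]:
            last_read[version] = ts

    windows = {}
    for version, ts in last_read.items():
        successor = version + 1
        if successor in writes:  # skip versions that were never overwritten
            windows[version] = max(0.0, ts - writes[successor])
    return windows


def monotonic_read_violations(reads):
    """Count reads that return an older version than the same reader saw before."""
    highest_seen = {}
    violations = 0
    # Process each reader's observations in chronological order.
    for reader, _ts, version in sorted(reads, key=lambda r: (r[0], r[1])):
        previous = highest_seen.get(reader, version)
        if version < previous:
            violations += 1
        highest_seen[reader] = max(version, previous)
    return violations


# Made-up sample log (not data from the paper): one writer, two readers.
writes = {1: 0.0, 2: 5.0, 3: 10.0}
reads = [
    ("r1", 1.0, 1), ("r1", 6.0, 1), ("r1", 7.0, 2), ("r1", 11.0, 2), ("r1", 13.0, 3),
    ("r2", 5.5, 2), ("r2", 8.0, 1), ("r2", 9.0, 2), ("r2", 12.5, 2), ("r2", 13.0, 3),
]

windows = inconsistency_windows(writes, reads)
violations = monotonic_read_violations(reads)
print(windows, violations)
```

In the sample log, reader r2 sees version 2 and later version 1 again, so using the first read of the new version would underestimate the window. Taking the last read of the old version across all readers avoids this.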