Abstract. The increasing popularity of cloud storage services has led companies that handle critical data to think about using these services for their storage needs. Medical record databases, power system historic ...
To ensure confidentiality of stored data on the clouds without requiring a key distribution service, we employ a secret sharing scheme [Shamir 1979]. In this scheme, a special party called the dealer distributes a secret to n players, but each player gets only a share of this secret. The main properties of the scheme are that at least f + 1 ≤ n different shares of the secret are needed to recover it and that no information about the secret is disclosed with f or fewer shares. The scheme is integrated into the basic replication protocol in such a way that each cloud receives just a share of the data being written, besides the metadata. This ensures that no individual cloud will have access to the data being stored, but that clients that have authorization to access the data will be granted access to the shares of (at least) f + 1 different clouds and will be able to rebuild the original data.

The use of a secret sharing scheme allows us to integrate confidentiality guarantees into the stored data without using a key distribution mechanism to make writers and readers of a data unit share a secret key. In fact, our mechanism reuses the access control of the cloud provider to control which readers are able to access the data stored on a data unit.

If we simply replicate the data on n clouds, the monetary costs of storing data using DEPSKY would increase by a factor of n. In order to avoid this, we compose the secret sharing scheme used in the protocol with an information-optimal erasure code algorithm, reducing the size of each share by a factor of n/(f+1) of the original data [Rabin 1989]. This composition follows the original proposal of [Krawczyk 1993], where the data is encrypted with a random secret key, the encrypted data is encoded, the key is divided using secret sharing and each server receives a block of the encrypted data and a share of the key.

Common sense says that for critical data it is always good practice not to erase old versions of the data, unless we can be certain that we will not need them anymore [Hamilton 2007]. An additional feature of our protocols is that old versions of the data are kept in the clouds.

3.5 DEPSKY-A - Available DepSky

The first DEPSKY protocol is called DEPSKY-A, and improves the availability and integrity of cloud-stored data by replicating it on several providers using quorum techniques. Algorithm 1 presents this protocol. Due to space constraints we encapsulate some of the protocol steps in the functions of the first two rows of Table 1. We use the '.' operator to denote access to metadata fields, e.g., given a metadata file m, m.ver and m.digest denote the version number and digest(s) stored in m. We use the '+' operator to concatenate two items into a string, e.g., "value"+new_ver produces a string that starts with the string "value" and ends with the value of variable new_ver in string format. Finally, the max function returns the maximum among a set of numbers.

Function                        Description
queryMetadata(du)               obtains the correctly signed file metadata stored in the
                                container du of n - f out of the n clouds used to store
                                the data unit and returns it in an array.
writeQuorum(du, name, value)    for every cloud i ∈ {0, ..., n-1}, writes value[i] on a
                                file named name in the container du in that cloud. Blocks
                                until it receives write confirmations from n - f clouds.
H(value)                        returns the cryptographic hash of value.

Table 1. Functions used in the DEPSKY-A protocols.

Algorithm 1: DEPSKY-A
 1  procedure DepSkyAWrite(du, value)
 2  begin
 3    if max_ver_du = 0 then
 4      m ← queryMetadata(du)
 5      max_ver_du ← max({m[i].ver : 0 ≤ i ≤ n-1})
 6    new_ver ← max_ver_du + 1
 7    v[0..n-1] ← value
 8    writeQuorum(du, "value"+new_ver, v)
 9    new_meta ← ⟨new_ver, H(value)⟩
10    sign(new_meta, K_rw)
11    v[0..n-1] ← new_meta
12    writeQuorum(du, "metadata", v)
13    max_ver_du ← new_ver
14  function DepSkyARead(du)
15  begin
16    m ← queryMetadata(du)
17    max_id ← i : m[i].ver = max({m[i].ver : 0 ≤ i ≤ n-1})
18    v[0..n-1] ← ⊥
19    parallel for 0 ≤ i ≤ n-1 do
20      tmp_i ← cloud_i.get(du, "value"+m[max_id].ver)
21      if H(tmp_i) = m[max_id].digest then v[i] ← tmp_i
22    wait until ∃i : v[i] ≠ ⊥
23    for 0 ≤ i ≤ n-1 do cloud_i.cancel_pending()
24    return v[i]

The key idea of the write algorithm (lines 1-13) is to first write the value in a quorum of clouds (line 8), then write the corresponding metadata (line 12). This order of operations ensures that a reader will only be able to read metadata for a value already stored in the clouds. Additionally, when a writer does its first write in a data unit du (lines 3-5, max_ver_du is initialized as 0), it first contacts the clouds to obtain the metadata with the greatest version number, then updates the max_ver_du variable with the current version of the data unit.

The read algorithm just fetches the metadata files from a quorum of clouds (line 16), chooses the one with the greatest version number (line 17) and reads the value corresponding to this version number and the cryptographic hash found in the chosen metadata (lines 19-22). After receiving the first reply that satisfies this condition the reader cancels the pending RPCs and returns the value (lines 22-24).

The rationale of why this protocol provides the desired properties is the following (proofs in the Appendix). Availability is guaranteed because the data is stored in a quorum of at least n - f clouds and it is assumed that at most f clouds can be faulty. The read operation has to retrieve the value from only one of the clouds (line 22), which is always available because (n - f) - f ≥ 1. Together with the data, signed metadata containing its cryptographic hash is also stored. Therefore, if a cloud is faulty and corrupts the data, this is detected when the metadata is retrieved.

3.6 DEPSKY-CA - Confidential & Available DepSky

The DEPSKY-A protocol has two main limitations. First, a data unit of size S consumes n × S storage capacity of the system and costs on average n times more than if it was stored in a single cloud. Second, it stores the data in cleartext, so it does not give confidentiality guarantees. To cope with these limitations we employ an information-efficient secret sharing scheme [Krawczyk 1993] that combines symmetric encryption with a classical secret sharing scheme and an optimal erasure code to partition the data in a set of blocks in such a way that (i.) f + 1 blocks are necessary to recover the original data and (ii.) f or fewer blocks do not give any information about the stored data (footnote 1). The DEPSKY-CA protocol integrates these techniques with the DEPSKY-A protocol (Algorithm 2). The additional cryptographic and coding functions needed are in Table 2.
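The quorum and storage arithmetic used in Sections 3.5 and 3.6 can be made concrete with a small sketch. The function names below are hypothetical and not part of DEPSKY. The storage figures assume an ideal erasure code in which each block has size S/(f+1), and they ignore the key shares and the metadata.

```python
def quorum_size(n: int, f: int) -> int:
    """Number of clouds whose confirmation a write waits for: n - f."""
    return n - f


def correct_copies_in_quorum(n: int, f: int) -> int:
    """Lower bound on correct clouds inside a write quorum when at most
    f clouds are faulty: (n - f) - f. A read needs this to be >= 1."""
    return quorum_size(n, f) - f


def storage_depsky_a(size: float, n: int) -> float:
    """Full replication: every one of the n clouds stores the whole value."""
    return n * size


def storage_depsky_ca(size: float, n: int, f: int) -> float:
    """Erasure-coded storage (assumption: each of the n blocks has size
    size / (f + 1); key shares and metadata are ignored)."""
    return n * size / (f + 1)


if __name__ == "__main__":
    n, f = 4, 1
    print("write quorum:", quorum_size(n, f))
    print("correct copies guaranteed:", correct_copies_in_quorum(n, f))
    print("DEPSKY-A storage for S = 1:", storage_depsky_a(1.0, n))
    print("DEPSKY-CA storage for S = 1:", storage_depsky_ca(1.0, n, f))
```

For n = 4 and f = 1 the write quorum is 3 clouds, so at least 2 of them are correct. Under these assumptions DEPSKY-A stores 4 times the data size and DEPSKY-CA stores 2 times the data size.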
Function                Description
generateSecretKey()     generates a random secret key
E(v, k) / D(e, k)       encrypts v and decrypts e with key k
encode(d, n, t)         encodes d on n blocks in such a way that t are required
                        to recover it
decode(db, n, t)        decodes array db of n blocks, with at least t valid,
                        to recover d
share(s, n, t)          generates n shares in such a way that at least t of them
                        are required to obtain any information about s
combine(ss, n, t)       combines shares on array ss of size n containing at least
                        t correct shares to obtain the secret s

Table 2. Functions used in the DEPSKY-CA protocols.

The DEPSKY-CA protocol is very similar to DEPSKY-A, with the following differences: (1.) the encryption of the data, the generation of the key shares and the encoding of the encrypted data in DepSkyCAWrite (lines 7-10) and the reverse process in DepSkyCARead (lines 30-31); (2.) the data stored in cloud i is composed of the share of the key s[i] and the encoded block e[i] (lines 12, 30-31); and (3.) f + 1 replies are necessary to read the data unit's current value instead of one as in DEPSKY-A (line 28). Additionally, instead of storing a single digest in the metadata file, the writer generates and stores n digests, one for each cloud. These digests are accessed as different positions of the digest field of a metadata. If a key distribution infrastructure is available, or if readers and writers share a common key k, the secret sharing scheme can be removed (lines 7, 9 and 31 are not necessary).

Footnote 1: Erasure codes alone cannot satisfy this confidentiality guarantee.
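To illustrate what share and combine from Table 2 do, here is a minimal sketch of the classical scheme of [Shamir 1979] over a prime field. It is not the code of the DEPSKY prototype. The field modulus PRIME, the integer encoding of the secret and the simplified signatures (combine takes only the shares it should use, not an array of size n) are assumptions made for this example.

```python
import secrets

# Assumption: a fixed prime field large enough for typical symmetric keys.
# 2**521 - 1 is a Mersenne prime.
PRIME = 2**521 - 1


def share(secret: int, n: int, t: int) -> list[tuple[int, int]]:
    """Split `secret` into n points of a random degree-(t-1) polynomial
    whose constant term is the secret. Any t points recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= t <= n:
        raise ValueError("need 1 <= t <= n")
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def combine(shares: list[tuple[int, int]], t: int) -> int:
    """Recover the secret from the first t shares by Lagrange
    interpolation at x = 0."""
    if len(shares) < t:
        raise ValueError("not enough shares")
    points = shares[:t]
    secret = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = (num * -xm) % PRIME
                den = (den * (xj - xm)) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    f = 1
    key = secrets.randbits(192)            # stand-in for a random secret key
    s = share(key, n=4, t=f + 1)           # one share per cloud
    assert combine([s[3], s[1]], f + 1) == key
```

With n = 4 and t = f + 1 = 2, any two clouds' shares rebuild the key, while a single share is one point on a random line and reveals nothing about its constant term.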
Algorithm 2: DEPSKY-CA
 1  procedure DepSkyCAWrite(du, value)
 2  begin
 3    if max_ver_du = 0 then
 4      m ← queryMetadata(du)
 5      max_ver_du ← max({m[i].ver : 0 ≤ i ≤ n-1})
 6    new_ver ← max_ver_du + 1
 7    k ← generateSecretKey()
 8    e ← E(value, k)
 9    s[0..n-1] ← share(k, n, f+1)
10    v[0..n-1] ← encode(e, n, f+1)
11    for 0 ≤ i ≤ n-1 do
12      d[i] ← ⟨s[i], e[i]⟩
13      h[i] ← H(d[i])
14    writeQuorum(du, "value"+new_ver, d)
15    new_meta ← ⟨new_ver, h⟩
16    sign(new_meta, K_rw)
17    v[0..n-1] ← new_meta
18    writeQuorum(du, "metadata", v)
19    max_ver_du ← new_ver
20  function DepSkyCARead(du)
21  begin
22    m ← queryMetadata(du)
23    max_id ← i : m[i].ver = max({m[i].ver : 0 ≤ i ≤ n-1})
24    d[0..n-1] ← ⊥
25    parallel for 0 ≤ i ≤ n-1 do
26      tmp_i ← cloud_i.get(du, "value"+m[max_id].ver)
27      if H(tmp_i) = m[max_id].digest[i] then d[i] ← tmp_i
28    wait until |{i : d[i] ≠ ⊥}| > f
29    for 0 ≤ i ≤ n-1 do cloud_i.cancel_pending()
30    e ← decode(d.e, n, f+1)
31    k ← combine(d.s, n, f+1)
32    return D(e, k)

The rationale of the correctness of the protocol is similar to the one for DEPSKY-A (proofs also in the Appendix). The main differences are those already pointed out: encryption prevents individual clouds from disclosing the data; secret sharing allows storing the encryption key in the cloud without f faulty clouds being able to reconstruct it; the erasure code scheme reduces the size of the data stored in each cloud.

3.7 Read Optimization

The DEPSKY-A algorithm described in Section 3.5 tries to read the most recent version of the data unit from all clouds and waits for the first valid reply to return it. In the pay-per-use model this is far from ideal: even using just a single value, the application will be paying for n data accesses. A lower-cost solution is to use some criteria to sort the clouds and try to access them sequentially, one at a time, until we obtain the desired value. The sorting criteria can be based on access monetary cost (cost-optimal), the latency of queryMetadata in the protocol (latency-optimal), a mix of the two or any other more complex criteria (e.g., a history of the latency and faults of the clouds).

This optimization can also be used to decrease the monetary cost of the DEPSKY-CA read operation. The main difference is that instead of choosing one of the clouds at a time to read the data, f + 1 of them are chosen.

3.8 Supporting Multiple Writers - Locks

The DEPSKY protocols presented do not support concurrent writes, which is sufficient for many applications where each process writes on its own data units. However, there are applications in which this is not the case. An example is a fault-tolerant storage system that uses DEPSKY as its backend object store. This system could have more than one node with the writer role writing in the same data unit(s) for fault tolerance reasons. If the writers are in the same network, a coordination system like Zookeeper [Hunt 2010] or DepSpace [Bessani 2008] can be used to elect a leader and coordinate the writes. However, if the writers are scattered through the Internet this solution is not practical without trusting the site in which the coordination service is deployed (and even in this case, the coordination service may be unavailable due to network issues).

The solution we advocate is a low contention lock mechanism that uses the cloud-of-clouds itself to maintain lock files on a data unit. These files specify which process is the writer and for how much time it has write access to the data unit. The protocol is the following:

1. A process p that wants to be a writer (and has permission to be) first lists files on the data unit container on all n clouds and tries to find a zero-byte file called lock-ID-T. If such a file is found on a quorum of clouds, ID ≠ p and the local time t on the process is smaller than T + Δ, where Δ is a safety margin concerning the difference between the synchronized clocks of all writers, someone else is the writer and p will wait until T + Δ.

2. If the test fails, p can write a lock file called lock-p-T on all clouds, where T = t + writer_lease_time.

3. In the last step, p lists again all files in the data unit container, searching for other lock files with t < T + Δ besides the one it wrote. If such a file is found, p removes the lock file it wrote from the clouds and sleeps for a small random amount of time before trying to run the protocol again. Otherwise, p becomes the single writer for the data unit until T.

Several remarks can be made about this protocol. First, the last step is necessary to ensure that two processes trying to become writers at the same time never both succeed. Second, locks can be renewed periodically to ensure the existence of a single writer at every moment of the execution. Moreover, unlocking can be easily done through the removal of the lock files. Third, the protocol requires synchronized clocks in order to employ leases and thus tolerate writer crashes. Finally, this lock protocol is only obstruction-free [Herlihy 2003]: if several processes try to become writers at the same time, it is possible that none of them is successful. However, due to the backoff in step 3, this situation should be very rare in the envisioned deployments for the system.

3.9 Additional Protocols

Besides read, write and lock, DEPSKY provides other operations to manage data units. These operations and underlying protocols are briefly described in this section.

Creation and destruction. The creation of a data unit can be easily done through the invocation of the create operation in each individual cloud. In contention-prone applications, the creator should execute the locking protocol of the previous section before executing the first write to ensure it is the single writer of the data unit. The destruction of a data unit is done in a similar way: the writer simply removes all files and the container that stores the data unit by calling remove in each individual cloud.

Garbage collection. As already discussed in Section 3.4, we choose to keep old versions of the value of the data unit on the clouds to improve the dependability of the storage system. However, after many writes the amount of storage used by a data unit can become too costly for the organization and thus some garbage collection is necessary. The protocol for doing that is very simple: a writer just lists all files named value<Version> in the data unit container and removes all those with <Version> smaller than the oldest version it wants to keep in the system.

Cloud reconfiguration. Sometimes one cloud can become too expensive or too unreliable to be used for storing DEPSKY data units. In this case we need a reconfiguration protocol to move the blocks from one cloud to another. The process is the following: (1.) the writer reads the data (probably from the other clouds and not from the one being removed); (2.) it creates the data unit container on the new cloud; (3.) it executes the write protocol on the clouds not removed and the new cloud; (4.) it deletes the data unit from the cloud being removed. After that, the writer needs to inform the readers that the data unit location was changed. This can be done by writing a special file on the data unit container of the remaining clouds describing the new configuration of the system. A process will read this file and accept the reconfiguration if this file is read from at least f + 1 clouds.

3.10 Dealing with Weakly Consistent Clouds

Both the DEPSKY-A and DEPSKY-CA protocols implement single-writer multi-reader regular registers if the clouds being accessed provide regular semantics. However, several clouds do not ensure this guarantee, but instead provide read-after-write or eventual consistency [Vogels 2009] for the data stored (e.g., Amazon S3 [Ama 2010]).

With a slight modification, our protocols can work with these weakly consistent clouds. The modification is very simple: repeat the data file reading from the clouds until the required condition is satisfied (receiving 1 or f + 1 data units, respectively in lines 22 and 28 of Algorithms 1 and 2). This modification ensures that the read of a value described in a read metadata will be repeated until it is available.

This modification makes the DEPSKY protocols consistency-proportional in the following sense: if the underlying clouds provide regular semantics, the protocols provide regular semantics; if the clouds provide read-after-write semantics, the protocols satisfy read-after-write semantics; and finally, if the clouds provide eventual consistency, the protocols are eventually consistent. Notice that if the underlying clouds are heterogeneous in terms of consistency guarantees, DEPSKY ensures the weakest consistency among those provided. This comes from the fact that the reading of a recently written value depends on the reading of the new metadata file, which, after a write is complete, will only be available eventually on weakly consistent clouds.

A problem with not having regular consistent clouds is that the lock protocol may not work correctly. After listing the contents of a container and not seeing a file, a process cannot conclude that it is the only writer. This problem can be minimized if the process waits a while between steps 2 and 3 of the protocol. However, the mutual exclusion guarantee will only be satisfied if the wait time is greater than the time for written data to be seen by every other reader. Unfortunately, no eventually consistent cloud of our knowledge provides this kind of timeliness guarantee, but we can experimentally discover the amount of time needed for a write to propagate on a cloud with the desired coverage and use this value in the aforementioned wait. Moreover, to ensure some safety even when two writes happen in parallel, we can include a unique id of the writer (e.g., the hash of part of its private key) as the decimal part of its timestamps, just as is done in most Byzantine quorum protocols (e.g., [Malkhi 1998a]). This simple measure allows the durability of data written by concurrent writers (the names of the data files will be different), even if the metadata file may point to different versions on different clouds.

4. DEPSKY Implementation

We have implemented a DEPSKY prototype in Java as an application library that supports the read and write operations. The code is divided in three main parts: (1) data unit manager, that stores the definition and information of the data units that can be accessed; (2) system core, that implements the DEPSKY-A and DEPSKY-CA protocols; and (3) cloud providers drivers, which implement the logic for accessing the different clouds. The current implementation has 5 drivers available (the four clouds used in the evaluation - see next section - and one for storing data locally), but new drivers can be easily added. The overall implementation is about 2910 lines of code, of which 1122 lines are for the drivers.

The DEPSKY code follows a model of one thread per cloud per data unit in such a way that the cloud accesses can be executed in parallel (as described in the algorithms). All communications between clients and cloud providers are made over HTTPS (secure and private channels) using the REST APIs supplied by the storage cloud provider.

Our implementation makes use of several building blocks: RSA with 1024 bit keys for signatures, SHA-1 for cryptographic hashes, AES for symmetric cryptography, Schoenmakers' PVSS scheme [Schoenmakers 1999] for secret sharing with 192 bit secrets and the classic Reed-Solomon for erasure codes [Plank 2007]. Most of the implementations used come from the Java 6 API, while Java Secret Sharing [Bessani 2008] and Jerasure [Plank 2007] were used for secret sharing and erasure code, respectively.

5. Evaluation

In this section we present an evaluation of DEPSKY which tries to answer three main questions:

- What is the additional cost in using replication on storage clouds?
- What is the advantage in terms of performance and availability of using replicated clouds to store data?
- What are the relative costs and benefits of the two DEPSKY protocols?

The evaluation focuses on the case of n = 4 and f = 1, which we expect to be the common deployment setup of our system for two reasons: (1.) f is the maximum number of faulty cloud storage providers, which are very resilient and so faults should be rare; (2.) there are currently not many more than four cloud storage providers that are adequate for storing critical data. Our evaluation uses the following cloud storage providers with their default configurations: Amazon S3, Windows Azure, Nirvanix and Rackspace.

5.1 Economical

Storage cloud providers usually charge their users based on the amount of data uploaded, downloaded and stored on them. Table 3 presents the cost in US Dollars of executing 10,000 reads and writes using the DEPSKY data model (with metadata and supporting many versions of a data unit) considering three data unit sizes: 100kb, 1Mb and 10Mb. This table includes only the costs of the operations being executed (invocations, upload and download), not the data storage, which will be discussed later. All estimations presented in this section were calculated based on the values charged by the four clouds on September 25th, 2010.

In the table, the columns DEPSKY-A, DEPSKY-A opt, DEPSKY-CA and DEPSKY-CA opt present the costs of using the DEPSKY protocols with the read optimization respectively disabled and enabled. The other columns present the costs for storing the data unit (DU) in a single cloud.

Operation    DU Size   DEPSKY-A   DEPSKY-A opt   DEPSKY-CA   DEPSKY-CA opt   Amazon S3   Rackspace   Azure   Nirvanix
10K Reads    100kb     0.64       0.14           0.32        0.14            0.14        0.21        0.14    0.14
             1Mb       6.55       1.47           3.26        1.47            1.46        2.15        1.46    1.46
             10Mb      65.5       14.6           32.0        14.6            14.6        21.5        14.6    14.6
10K Writes   100kb     0.60       0.60           0.30        0.30            0.14        0.08        0.09    0.29
             1Mb       6.16       6.16           3.08        3.08            1.46        0.78        0.98    2.93
             10Mb      61.5       61.5           30.8        30.8            14.6        7.81        9.77    29.3

Table 3. Estimated costs per 10000 operations (in US Dollars). DEPSKY-A and DEPSKY-CA costs are computed for the realistic case of 4 clouds (f = 1). The DEPSKY-A opt and DEPSKY-CA opt setups consider the cost-optimal version of the protocols with no failures.

The table shows that the cost of DEPSKY-A with n = 4 is roughly the sum of the costs of using the four clouds, as expected. However, if the read optimization is employed, the cost of the least expensive cloud dominates the cost of executing reads (only one out of four clouds is accessed in fault-free executions).

For DEPSKY-CA, the cost of reading and writing is approximately 50% of DEPSKY-A's due to the use of information-optimal erasure codes that make the data stored on each cloud roughly 50% of the size of the original data. The optimized version of DEPSKY-CA read also reduces this cost to half of the sum of the two least costly clouds due to its access to only f + 1 clouds in the best case. Recall that the costs for the optimized versions of the protocol account only for the best case in terms of monetary costs: reads are executed on the required least expensive clouds. In the worst case, the more expensive clouds will be used instead.

The storage costs of a 1Mb data unit for different numbers of stored versions are presented in Figure 3. We present the curves only for one data unit size because the costs for other sizes are directly proportional.
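The relationships between the columns of Table 3 can be checked with a rough sketch. It uses only the per-cloud costs of 10,000 reads of a 100kb data unit, copied from Table 3. The function names are hypothetical. The model assumes cost scales linearly with the bytes transferred and ignores metadata and per-request fees, which is why the results only approximate the table.

```python
# Per-cloud cost in USD of 10,000 reads of a 100kb data unit (from Table 3).
READ_COST_100KB = {
    "Amazon S3": 0.14,
    "Rackspace": 0.21,
    "Azure": 0.14,
    "Nirvanix": 0.14,
}


def depsky_a_read(costs: dict[str, float]) -> float:
    """Unoptimized DEPSKY-A read: the full value is fetched from every cloud."""
    return sum(costs.values())


def depsky_a_opt_read(costs: dict[str, float]) -> float:
    """Cost-optimal DEPSKY-A read, fault-free case: only the cheapest cloud."""
    return min(costs.values())


def depsky_ca_read(costs: dict[str, float], f: int = 1) -> float:
    """Unoptimized DEPSKY-CA read: every cloud returns a block that is
    assumed to be 1/(f+1) of the data size."""
    return sum(costs.values()) / (f + 1)


def depsky_ca_opt_read(costs: dict[str, float], f: int = 1) -> float:
    """Cost-optimal DEPSKY-CA read, fault-free case: blocks from the
    f + 1 cheapest clouds only."""
    cheapest = sorted(costs.values())[: f + 1]
    return sum(cheapest) / (f + 1)


if __name__ == "__main__":
    for name, fn in [
        ("DEPSKY-A", depsky_a_read),
        ("DEPSKY-A opt", depsky_a_opt_read),
        ("DEPSKY-CA", depsky_ca_read),
        ("DEPSKY-CA opt", depsky_ca_opt_read),
    ]:
        print(f"{name}: {fn(READ_COST_100KB):.3f}")
```

Under these assumptions the estimates for 100kb reads come out close to the corresponding DEPSKY columns of Table 3 (0.64, 0.14, 0.32 and 0.14). The small differences are plausibly due to the metadata accesses that the sketch leaves out.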
Figure 3. Storage costs of a 1Mb data unit for different numbers of stored versions.

The results depicted in the figure show that the cost of DEPSKY-CA storage is roughly half the cost of using DEPSKY-A and twice the cost of using a single cloud. This is no surprise since the storage costs are directly proportional to the amount of data stored on the cloud, and DEPSKY-A stores 4 times the data size, while DEPSKY-CA stores 2 times the data size and an individual cloud just stores a single copy of the data. Notice that the metadata costs are almost irrelevant when compared with the data size since the metadata size is less than 500 bytes.

5.2 PlanetLab deployment

In order to understand the performance of DEPSKY in a real deployment, we used PlanetLab to run several clients accessing a cloud-of-clouds composed of popular storage cloud providers. This section explains our methodology and then presents the obtained results in terms of read and write latency, throughput and availability.

Methodology. The latency measurements were obtained using a logger application that tries to read a data unit from six different clouds: the four storage clouds individually and the two clouds-of-clouds implemented with DEPSKY-A and DEPSKY-CA.

The logger application periodically executes a measurement epoch, which comprises: read the data unit (DU) from each of the clouds individually, one after another; read the DU using DEPSKY-A; read the DU using DEPSKY-CA; sleep until the next epoch. The goal is to read the data through different setups within a time period as small as possible in order to minimize Internet performance variations.

We deployed the logger on eight PlanetLab machines across the Internet, on four continents. In each of these machines three instances of the logger were started for different DU sizes: 100kb (a measurement every 5 minutes), 1Mb (a measurement every 10 minutes) and 10Mb (a measurement every 30 minutes). These experiments took place during two months, but the values reported correspond to measurements done between September 10, 2010 and October 7, 2010.

In the experiments, the local costs that the protocols incur due to the use of cryptography and erasure codes are negligible for DEPSKY-A and account for at most 5% of the read and 10% of the write latencies on DEPSKY-CA.

Reads. Figure 4 presents the 50th and 90th percentiles of all observed latencies of the reads executed (i.e., the values below which 50% and 90% of the observations fell). The number of reads executed on each site is presented in the second column of Table 5.

Based on the results presented in the figure, several points can be highlighted. First, DEPSKY-A presents the best latency in all but one setup. This is explained by the fact that it waits for 3 out of 4 copies of the metadata but only one of the data, and it usually obtains it from the best cloud available during the execution. Second, DEPSKY-CA latency is closely related to the second best cloud storage provider, since it waits for at least 2 out of 4 data blocks. Finally, notice that there is a huge variance between the performance of the cloud providers when accessed from different parts of the world. This means that no provider covers all areas in the same way, and highlights another advantage of the cloud-of-clouds: we can adapt our accesses to use the best cloud for a certain location.

Figure 4. 50th/90th-percentile latency (in seconds) for 100kb, 1Mb and 10Mb DU read operations with PlanetLab clients located in different parts of the globe. Panels: (a) 100kb DU, (b) 1Mb DU, (c) 10Mb DU. The bar names are S3 for Amazon S3, WA for Windows Azure, NX for Nirvanix, RS for Rackspace, A for DEPSKY-A and CA for DEPSKY-CA. DEPSKY-CA and DEPSKY-A are configured with n = 4 and f = 1.

The effect of optimizations. An interesting observation of our DEPSKY-A (resp. DEPSKY-CA) read experiments is that in a significant percentage of the reads the cloud that replied metadata faster (resp. the two faster in replying metadata) is not the first to reply the data (resp. the two first in replying the data). More precisely, in 17% of the 60768 DEPSKY-A reads and 32% of the 60444 DEPSKY-CA reads we observed this behavior. A possible explanation is that some clouds are better at serving small files (DEPSKY metadata is around 500 bytes) and not so good at serving large files (like the 10Mb data unit of some experiments). This means that the read optimizations of Section 3.7 will make the protocol latency worse in these cases. Nonetheless we think this optimization is valuable since the rationale behind it worked for more than 4/5 (DEPSKY-A) and 2/3 (DEPSKY-CA) of the reads in our experiments, and its use can decrease the monetary costs of executing a read to roughly a quarter and a half of the unoptimized cost, respectively (Table 3).

Writes. We modified our logger application to execute writes instead of reads and deployed it on the same machines on which we executed the reads. We ran it for two days in October and collected the logs, with at least 500 measurements for each location and data size. Due to space constraints, we do not present all these results, but illustrate the costs of write operations for different data sizes discussing only the observed results for a UK client. The 50th and 90th percentiles of the latencies observed are presented in Figure 5.

Figure 5. 50th/90th-percentile latency (in seconds) for 100kb, 1Mb and 10Mb DU write operations for a PlanetLab client in the UK. The bar names are the same as in Figure 4. DEPSKY-A and DEPSKY-CA are configured with n = 4 and f = 1.

The latencies in the figure consider the time of writing the data on all four clouds (file sent to 4 clouds, wait for only 3 confirmations) and the time of writing the new metadata. As can be observed in the figure, the latency of a write is of the same order of magnitude as a read of a DU of the same size (this was observed in all locations). It is interesting to observe that, while DEPSKY's read latency is close to the cloud with the best latency, the write latency is close to the worst cloud. This comes from the fact that in a write DEPSKY needs to upload data blocks to all clouds, which consumes more bandwidth at the client side and requires replies from at least three clouds.

Secret sharing overhead. As discussed in Section 3.6, if a key distribution mechanism is available, secret sharing could be removed from DEPSKY-CA. However, the effect of this on read and write latencies would be negligible since share and combine (lines 9 and 31 of Algorithm 2) account for less than 3 and 0.5 ms, respectively. It means that secret sharing is responsible for less than 0.1% of the protocol's latency in the worst case (footnote 2).

Footnote 2: For a more comprehensive discussion about the overhead imposed by Java secret sharing see [Bessani 2008].

Throughput. Table 4 shows the throughput in the experiments for two locations: UK and US-CA. The values are of the throughput observed by a single client, not by multiple clients as done in some throughput experiments. The table shows read and write throughput for both DEPSKY-A and DEPSKY-CA, together with the values observed from Amazon S3, just to give a baseline. The results from other locations and clouds follow the same trends discussed here.

                       UK                                  US-CA
Operation   DU Size    DEPSKY-A   DEPSKY-CA   Amazon S3    DEPSKY-A   DEPSKY-CA   Amazon S3
Read        100kb      189        135         59.3         129        64.9        31.5
            1Mb        808        568         321          544        306         104
            10Mb       1479       756         559          780        320         147
Write       100kb      3.53       4.26        5.43         2.91       3.55        5.06
            1Mb        14.9       26.2        53.1         13.6       19.9        25.5
            10Mb       64.9       107         84.1         96.6       108         34.4

Table 4. Throughput observed in kb/s on all reads and writes executed for the case of 4 clouds (f = 1).

From the table it is possible to observe that the read throughput decreases from DEPSKY-A to DEPSKY-CA and then to S3, at the same time that write throughput increases for this same sequence. The higher read throughput of DEPSKY when compared with S3 is due to the fact that it fetches the data from all clouds at the same time, trying to obtain the data from the fastest cloud available. The price to pay for this benefit is the lower write throughput since data should be written at least on a quorum of clouds in order to complete a write. This tradeoff appears to be a good compromise since reads tend to dominate most workloads of storage systems.

The table also shows that increasing the size of the data unit improves throughput. Increasing the data unit size from 100kb to 1Mb improves the throughput by an average factor of 5 in both reads and writes. On the other hand, increasing the size from 1Mb to 10Mb shows fewer benefits: read throughput is increased only by an average factor of 1.5 while write throughput increases by an average factor of 3.3. These results show that cloud storage services should be used for storing large chunks of data. However, increasing the size of these chunks brings less benefit after a certain size (1Mb).

Notice that the observed throughputs are at least an order of magnitude lower than the throughput of disk access or replicated storage in a LAN [Hendricks 2007], but the elasticity of the cloud allows the throughput to grow indefinitely with the number of clients accessing the system (according to the cloud providers). This is actually the main reason that led us not to try to measure the peak throughput of services built on top of clouds. Another reason is that the Internet bandwidth would probably be the bottleneck of the throughput, not the clouds.

Faults and availability. During our experiments we observed a significant number of read operations on individual clouds that could not be completed due to some error. Table 5 presents the perceived availability of all setups, calculated as (reads completed)/(reads tried) from different locations.

The first thing that can be observed from the table is that the number of measurements taken from each location is not the same. This happens due to the natural unreliability of PlanetLab nodes, which crash and restart with some regularity.

There are two key observations that can be taken from Table 5. First, DEPSKY-A and DEPSKY-CA are the only two setups that presented an availability of 1.0000 in almost all ...

... ference on Architectural Support for Programming Languages and Operating Systems - ASPLOS'98, pages 92-103, 1998.

[Goodson 2004] Garth Goodson, Jay Wylie, Gregory Ganger, and Michael Reiter. Efficient Byzantine-tolerant erasure-coded storage. In Proc. of the Int. Conference on Dependable Systems and Networks - DSN'04, pages 135-144, June 2004.

[Greer 2010] Melvin Greer. Survivability and information assurance in the cloud. In Proc. of the 4th Workshop on Recent Advances in Intrusion-Tolerant Systems - WRAITS'10, 2010.

[Hamilton 2007] James Hamilton. On designing and deploying Internet-scale services. In Proc. of the 21st Large Installation System Administration Conference - LISA'07, pages 231-242, 2007.

[Hendricks 2007] James Hendricks, Gregory Ganger, and Michael Reiter. Low-overhead Byzantine fault-tolerant storage. In Proc. of the 21st ACM Symposium on Operating Systems Principles - SOSP'07, pages 73-86, 2007.

[Henry 2009] Alyssa Henry. Cloud storage FUD (failure, uncertainty, and durability). Keynote Address at the 7th USENIX Conference on File and Storage Technologies, February 2009.

[Herlihy 2003] Maurice Herlihy, Victor Luchangco, and Mark Moir. Obstruction-free synchronization: double-ended queues as an example. In Proc. of the 23rd IEEE Int. Conference on Distributed Computing Systems - ICDCS 2003, pages 522-529, July 2003.

[Hunt 2010] Patrick Hunt, Mahadev Konar, Flavio Junqueira, and Benjamin Reed. Zookeeper: Wait-free coordination for Internet-scale services. In Proc. of the USENIX Annual Technical Conference - ATC 2010, pages 145-158, June 2010.

[Jayanti 1998] Prasad Jayanti, Tushar Deepak Chandra, and Sam Toueg. Fault-tolerant wait-free shared objects. Journal of the ACM, 45(3):451-500, May 1998.

[Krawczyk 1993] Hugo Krawczyk. Secret sharing made short. In Proc. of the 13th Int. Cryptology Conference - CRYPTO'93, pages 136-146, August 1993.

[Lamport 1986] Leslie Lamport. On interprocess communication (part II). Distributed Computing, 1(1):203-213, January 1986.

[Lamport 1982] Leslie Lamport, Robert Shostak, and Marshall Pease. The Byzantine generals problem. ACM Transactions on Programming Languages and Systems, 4(3):382-401, July 1982.

[Liskov 2006] Barbara Liskov and Rodrigo Rodrigues. Tolerating Byzantine faulty clients in a quorum system. In Proc. of the 26th IEEE Int. Conference on Distributed Computing Systems - ICDCS'06, July 2006.

[Mahajan 2010] Prince Mahajan, Srinath Setty, Sangmin Lee, Allen Clement, Lorenzo Alvisi, Mike Dahlin, and Michael Walsh. Depot: Cloud storage with minimal trust. In Proc. of the 9th USENIX Symposium on Operating Systems Design and Implementation - OSDI 2010, pages 307-322, October 2010.

[Malkhi 1998a] Dahlia Malkhi and Michael Reiter. Byzantine quorum systems. Distributed Computing, 11(4):203-213, 1998.

[Malkhi 1998b] Dahlia Malkhi and Michael Reiter. Secure and scalable replication in Phalanx. In Proc. of the 17th IEEE Symposium on Reliable Distributed Systems - SRDS'98, pages 51-60, October 1998.

[Martin 2002] Jean-Philippe Martin, Lorenzo Alvisi, and Mike Dahlin. Minimal Byzantine storage. In Proc. of the 16th Int. Symposium on Distributed Computing - DISC 2002, pages 311-325, 2002.

[McCullough 2010] John C. McCullough, John Dunagan, Alec Wolman, and Alex C. Snoeren. Stout: An adaptive interface to scalable cloud storage. In Proc. of the USENIX Annual Technical Conference - ATC 2010, pages 47-60, June 2010.

[Metz 2009] Cade Metz. DDoS attack rains down on Amazon cloud. The Register, October 2009. http://www.theregister.co.uk/2009/10/05/amazon_bitbucket_outage/.

[Muniswamy-Reddy 2010] Kiran-Kumar Muniswamy-Reddy, Peter Macko, and Margo Seltzer. Provenance for the cloud. In Proc. of the 8th USENIX Conference on File and Storage Technologies - FAST'10, pages 197-210, 2010.

[Naone 2009] Erica Naone. Are we safeguarding social data? Technology Review published by MIT Review, http://www.technologyreview.com/blog/editors/22924/, February 2009.

[Plank 2007] James S. Plank. Jerasure: A library in C/C++ facilitating erasure coding for storage applications. Technical Report CS-07-603, University of Tennessee, September 2007.
[Rabin 1989] Michael Rabin. Efficient dispersal of information for security, load balancing, and fault tolerance. Journal of the ACM, 36(2):335-348, February 1989.

[Sarno 2009] David Sarno. Microsoft says lost sidekick data will be restored to users. Los Angeles Times, Oct. 15th 2009.

[Schoenmakers 1999] Berry Schoenmakers. A simple publicly verifiable secret sharing scheme and its application to electronic voting. In Proc. of the 19th Int. Cryptology Conference - CRYPTO'99, pages 148-164, August 1999.

[Shamir 1979] Adi Shamir. How to share a secret. Communications of ACM, 22(11):612-613, November 1979.

[Shraer 2010] Alexander Shraer, Christian Cachin, Asaf Cidon, Idit Keidar, Yan Michalevsky, and Dani Shaket. Venus: Verification for untrusted cloud storage. In Proc. of the ACM Cloud Computing Security Workshop - CCSW'10, 2010.

[Storer 2007] Mark W. Storer, Kevin M. Greenan, Ethan L. Miller, and Kaladhar Voruganti. Potshards: Secure long-term storage without encryption. In Proc. of the USENIX Annual Technical Conference - ATC 2007, pages 143-156, June 2007.

[Vogels 2009] Werner Vogels. Eventually consistent. Communications of the ACM, 52(1):40-44, 2009.

[Vrable 2009] Michael Vrable, Stefan Savage, and Geoffrey M. Voelker. Cumulus: Filesystem backup to the cloud. ACM Transactions on Storage, 5(4):1-28, 2009.

[Vukolic 2010] Marko Vukolic. The Byzantine empire in the intercloud. ACM SIGACT News, 41(3):105-111, 2010.

[Weil 2006] Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Darrell D. E. Long, and Carlos Maltzahn. Ceph: A scalable, high-performance distributed file system. In Proc. of the 7th USENIX Symposium on Operating Systems Design and Implementation - OSDI 2006, pages 307-320, 2006.