An Analysis of Social Network-Based Sybil Defenses

Bimal Viswanath, MPI-SWS, bviswana@mpi-sws.org
Ansley Post, MPI-SWS, abpost@mpi-sws.org
Krishna P. Gummadi, MPI-SWS, gummadi@mpi-sws.org
Alan Mislove, Northeastern University, amislove@ccs.neu.edu

ABSTRACT

Recently, there has been much excitement in the research community over using social networks to mitigate multiple identity, or Sybil, attacks. A number of schemes have been proposed, but they differ greatly in the algorithms they use and in the networks upon which they are evaluated. As a result, the research community lacks a clear understanding of how these schemes compare against each other, how well they would work on real-world social networks with different structural properties, or whether there exist other (potentially better) ways of Sybil defense.

In this paper, we show that, despite their considerable differences, existing Sybil defense schemes work by detecting local communities (i.e., clusters of nodes more tightly knit than the rest of the graph) around a trusted node. Our finding has important implications for both existing and future designs of Sybil defense schemes. First, we show that there is an opportunity to leverage the substantial amount of prior work on general community detection algorithms in order to defend against Sybils. Second, our analysis reveals the fundamental limits of current social network-based Sybil defenses: We demonstrate that networks with well-defined community structure are inherently more vulnerable to Sybil attacks, and that, in such networks, Sybils can carefully target their links in order to make their attacks more effective.

General Terms
Security, Design, Algorithms, Experimentation

Categories and Subject Descriptors
C.4 [Performance of Systems]: Design studies; C.2.0 [Computer-Communication Networks]: General - Security and protection

Keywords
Sybil attacks, social networks, social network-based Sybil defense, communities

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGCOMM'10, August 30–September 3, 2010, New Delhi, India.
Copyright 2010 ACM 978-1-4503-0201-2/10/08 ...$10.00.

1. INTRODUCTION

Avoiding multiple identity, or Sybil, attacks is known to be a fundamental problem in the design of distributed systems [8]. Malicious attackers can create multiple identities and influence the working of systems that rely upon open membership. Examples of such systems range from communication systems like email and instant messaging to collaborative content rating, recommendation, and delivery systems such as Digg and BitTorrent. Traditional defenses against Sybil attacks rely on trusted identities provided by a certification authority. But requiring users to present trusted identities runs counter to the open membership that underlies the success of these distributed systems in the first place.

Recently, there has been excitement in the research community about applying social networks to mitigate Sybil attacks. A number of schemes have been proposed that attempt to defend against Sybils in a social network by using properties of the social network's structure [7, 29, 32, 33]. Unlike traditional solutions, these schemes require no central trusted identities, and instead rely on the trust that is embodied in existing social relationships between users.

All social network-based Sybil defense schemes make the assumption that, although an attacker can create arbitrary Sybil identities in social networks, he or she cannot establish an arbitrarily large number of social connections to non-Sybil nodes. As a result, Sybil nodes tend to be poorly connected to the rest of the network, compared to the non-Sybil nodes. Sybil defense schemes leverage this observation to identify Sybils. They use various graph analysis techniques to search for topological features resulting from the limited capacity of Sybils to establish social links.

Our focus in this paper is on the graph analysis algorithms behind the schemes. The literature on Sybil defense schemes is still in its early stages; most papers describe new algorithms, but none provide a common insight that explains how all of these schemes are able to detect Sybils. Each algorithm has been shown to work well under its own assumptions about the structure of the social network and the links connecting non-Sybil and Sybil nodes. However, it is unclear how these algorithms would compare against each other, on more general topologies, or under different attack strategies. As a result, it is not known if there exist other (potentially better) ways to mitigate Sybil attacks or if there are fundamental limits to using only the structure of the social network to defend against Sybils.

In this paper, we take a first, but important, step towards answering these questions. We decompose existing Sybil defense schemes and demonstrate that at their core, the various algorithms work by implicitly ranking nodes based on how well the nodes are connected to a trusted node. Nodes that have better connectivity to the trusted node are ranked higher and are deemed to be more trustworthy. We show that, despite their considerable differences, all Sybil defense schemes rank nodes similarly: nodes within local communities (i.e., clusters of nodes more tightly knit than the rest of the network) around the trusted node are ranked higher than nodes in the rest of the network. Thus, Sybil defense schemes work by effectively detecting local communities.

The above insight has important implications for both existing and future designs of social network-based Sybil defense schemes. First, it motivates us to investigate whether a class of algorithms, known as community detection algorithms [10], that attempt to find such clusters of nodes directly, could be used for Sybil defense. We find that it is possible to use off-the-shelf community detection algorithms to find Sybils. Unlike Sybil defense, community detection is a well-studied and mature field, implying that our findings open the door for researchers to exploit a variety of techniques from a rich body of community detection literature.

Second, our insight also hints at the limitations of relying on communities for finding Sybils. For Sybil defense schemes to work well, all non-Sybil nodes need to form a single community that is distinguishable from the group of Sybil nodes.¹ In reality, however, users in many social networks form multiple communities that are interconnected rather sparsely. We show that, in these networks, it is hard for a trusted node to distinguish Sybils from non-Sybils outside its local community. Further, we demonstrate how Sybils can launch extremely effective attacks by establishing just a small number of links to carefully targeted nodes within such networks. As systems are beginning to be built on top of Sybil defense schemes [17, 18, 27], our findings question the wisdom of building these systems without a thorough understanding of the limitations of Sybil defense.

2. UNDERSTANDING SYBIL DEFENSE

As noted before, a variety of Sybil defense schemes have been proposed, but each has been evaluated using different social networks and attack strategies by the Sybils. Therefore, it is not well understood how these different schemes compare against each other, or how a potential user of these schemes, such as a real-world social networking site, would select one scheme over another.

2.1 The core of Sybil defense schemes

Given the problem of comparing competing Sybil defense schemes, one approach would be to view the schemes as complete coherent proposals (i.e., treat them as black boxes, and compare them in real-world settings). Such an approach is straightforward and would provide useful performance comparisons between a fixed configuration of schemes over a given set of social networks and attack strategies by the Sybils. However, it would not yield conclusive information on how a particular scheme would perform if either the given social network or the behavior of the attacker should change. It also does not allow us to derive any fundamental insights
into how these schemes work, which might enable us to build upon and improve them.

¹ Many Sybil defense schemes impose this requirement implicitly by assuming that the non-Sybil region of the network is fast mixing [22], meaning a random walk of length O(log N) reaches a stationary distribution of nodes.

An alternative approach is to find a core insight common to all the schemes that would explain their performance in any setting. Gaining such a fundamental insight, while difficult, not only provides guidance on improving future designs, but also sheds light on the limits of social network-based Sybil defense. However, we cannot gain such an insight by treating each of these schemes as a black box, with each carrying its own set of algorithms, optimizations, and assumptions. Instead, we need to reduce the schemes to their core task before analyzing them.

At a high level, all existing schemes attempt to isolate Sybils embedded within a social network topology. Every scheme declares nodes in the network as either Sybils or non-Sybils from the perspective of a trusted node, effectively partitioning the nodes in the social network into two distinct regions (non-Sybils and Sybils). Hence, each Sybil defense scheme can actually be viewed as a graph partitioning algorithm, where the graph is the social network. However, the quality and performance of the algorithm depend on the inputs, namely, the network topology and the trusted node.

Most Sybil defense schemes include a number of useful and practical optimizations that enhance their performance in specific application scenarios. For example, SybilGuard [33] and SybilLimit [32] have a number of design features that facilitate their use in decentralized systems. Similarly, SumUp [29] has optimizations specific to online content voting systems. However, because our goal is to uncover the core graph partitioning algorithm, we study these schemes independent of the assumptions about their application environments as well as the optimizations that are specific to those environments. Later in the paper, we show that this approach not only offers hints for the designers of future Sybil defense schemes, but also helps us understand the characteristics of real-world social networks that make them vulnerable to Sybil attacks.

2.2 Converting partitions to rankings

Even when viewing the schemes as
graph partitioning algorithms, comparing the different Sybil defense schemes is not entirely straightforward. The output of each scheme depends on the setting of numerous parameters. At a high level, these parameters can be seen as making the partitioning between Sybils and non-Sybils either more restrictive or permissive, thereby trading false positives for false negatives. While the designers of the schemes offer rough guidelines for choosing the parameter values (e.g., set a parameter to O(log N) where N is the number of network nodes), there can be considerable variation in the output from different parameter settings that follow the guidelines. Given the difficulty in selecting the right parameter settings, we would like to compare the schemes independent of the choice of their respective parameters.

Figure 1: Diagram of converting partitionings into a ranking of nodes. Different parameter settings (α, β, γ) cause increasingly large partitions to be marked as Sybils, thereby inducing a ranking.

| Scheme | Assumptions | Algorithm | Ranking | Cutoff | Evaluation |
|---|---|---|---|---|---|
| SybilGuard [33] | Non-Sybil region is fast mixing [22] | Random walk performed by each node | Varying random walk length | Whether or not walk intersection occurs | Kleinberg network [12] |
| SybilLimit [32] | Non-Sybil region is fast mixing | Multiple random walks performed by each node | Varying number of random walks and walk length | Whether or not tails of random walks intersect | Friendster, LiveJournal, DBLP, Kleinberg |
| SybilInfer [7] | Non-Sybil region is fast mixing, modified walks are fast mixing | Bayesian inference on the results of the random walks | Probability of node being non-Sybil from Bayesian inference | Threshold on the probability that a given node is non-Sybil | Power-law network [24], LiveJournal |
| SumUp [29] | Non-Sybil region is fast mixing, no small cut between collector and non-Sybil region | Creation of voting envelope with appropriate link capacities around collector | Varying the size of the voting envelope | Whether or not nodes are within the voting envelope | YouTube, Flickr, Digg |

Table 1: Overview of the properties and evaluation of social network-based Sybil defense schemes.

We studied the impact of changing parameters on the output of the Sybil and non-Sybil partitions. We observed that as the Sybil partition grows or shrinks in response to parameter changes, an ordering can be imposed on the nodes added or removed.² That is, when the Sybil partition grows larger, new nodes are added to the partition without removing nodes previously classified as Sybils. Similarly, when the Sybil partition grows smaller, some nodes are removed from the partition without adding any nodes previously classified as non-Sybils. Figure 1 illustrates how different partitionings obtained by changing parameters can be converted into an ordering or ranking of nodes.

² While we do not formally prove that all parameters of any Sybil defense scheme must induce an ordering, it is the case for all schemes, environments, and parameters we analyzed.

Our observation suggests that one can view the Sybil defense schemes as implicitly ordering or ranking nodes in the network, while the parameter settings determine where the boundary between the partitions, called the cutoff point, lies. Changing the parameters slides the cutoff point along the ranking, but the resulting partitions uphold the observed ranking of nodes. Thus, we can compare the different schemes independently of their parameters by simply comparing their relative rankings of the nodes.

2.3 Reduction of existing schemes

We reduce each Sybil defense scheme into its component processes using the model presented in Figure 2. At its core, each scheme contains an algorithm, which, given a trusted node and a network, produces a ranking of the nodes in the network relative to the trusted node. Then, depending on the setting of various parameter values, the scheme creates a cutoff, which is applied to the ranking and produces a Sybil/non-Sybil partitioning.

Figure 2: Diagram showing the processes involved in a Sybil defense scheme. In brief, the scheme itself can be split into an algorithm, which when given a social network and a trusted node, produces a ranking. The parameters to the scheme are used to create a cutoff, which defines a Sybil/non-Sybil partitioning from the ranking.

The schemes that we examine in this paper are SybilGuard [33], SybilLimit [32], SybilInfer [7], and SumUp [29]. For each of these Sybil defense schemes, Table 1 identifies the partitioning algorithm, how this partitioning induces a ranking of nodes, and how the algorithm parameters determine a cutoff. We also describe the assumptions the schemes make about their input environment (i.e., the structure of non-Sybil and Sybil topologies), and briefly describe the networks that these schemes were evaluated upon. A more detailed description of how these schemes map into our model is included in the Appendices.

Although we only show how our model applies to four well-known schemes, we believe that it could be applied to other schemes as well. For example, a recent work proposes a Sybil-resilient distributed hash table routing protocol [17, 18], by using social connections between users to build routing tables. The protocol relies on random walks much in the same manner as SybilGuard and SybilLimit, so we believe our analysis would apply to it as well. Similarly, Quercia et al. [27] recently proposed a Sybil defense scheme that relies on a graph-theoretic metric called betweenness centrality to calculate the likelihood of a node being a Sybil. To apply our analysis, the centrality measure can be used directly to induce a ranking of the nodes.

2.4 Rest of the paper

In this section, we have shown that existing Sybil defense schemes all work by inducing an implicit ranking of the nodes. We now take a closer look at these rankings, using them to compare the schemes across a wide range of conditions. Our goal in the remaining sections is to better understand the ranking algorithms underlying existing Sybil defense schemes, and through this understanding, to provide a basis for answering the following questions:

• Are the different Sybil defense schemes performing the core task of ranking nodes in the same way, or is each ranking unique? (Section 3)
• Are there other (potentially better) ways to obtain these node rankings? (Section 4)
• What structural properties of the social network determine how well the schemes work? (Section 5.1)
• Are the schemes robust against the different possible Sybil attack strategies? (Section 5.2)

3. RANKINGS AND SYBIL DEFENSE

In this section, we develop a better understanding of the process by which Sybil defense schemes compute node rankings by comparing the rankings of the different schemes.

3.1 Rankings in synthetic networks

We start by examining the node rankings generated by the schemes when run over a synthetic network topology, taken from [3] and shown in Figure 3. In brief, this network is constructed using the Barabasi-Albert preferential attachment model [4], and then rewired³ to have two densely connected communities of 256 nodes each, connected by a small number of edges.

Figure 3: The synthetic network used in Section 3.1 for exploring the rankings. Each of the two communities contains 256 nodes.

3.1.1 Comparing node rankings

We randomly selected a node in one of the communities as the trusted node and calculated the node rankings on this synthetic network for the four Sybil defense schemes previously discussed. We then examined how closely the various rankings matched. To compare the rankings, we use mutual information [28], which measures the similarity of two partitionings of a set. In brief, mutual information ranges between 0 and 1, where 0 represents no correlation between the partitionings, and 1 represents a perfect match.
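The comparison of two rankings at a given cutoff can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the two rankings are made up, the helper names `partition_at` and `normalized_mutual_information` are hypothetical, and scaling mutual information to [0, 1] by the geometric mean of the two entropies is one common convention that may differ from the normalization used in [28].

```python
import math

def partition_at(ranking, cutoff):
    """Label the first `cutoff` nodes of a ranking non-Sybil (True), the rest Sybil (False)."""
    accepted = set(ranking[:cutoff])
    return {node: node in accepted for node in ranking}

def normalized_mutual_information(labels_a, labels_b):
    """Mutual information of two labelings of the same nodes, scaled to [0, 1].

    Assumption: normalization by the geometric mean of the two entropies.
    Degenerate partitions (everything in one part) are given a score of 0.
    """
    n = len(labels_a)
    joint, count_a, count_b = {}, {}, {}
    for node, a in labels_a.items():
        b = labels_b[node]
        joint[(a, b)] = joint.get((a, b), 0) + 1
        count_a[a] = count_a.get(a, 0) + 1
        count_b[b] = count_b.get(b, 0) + 1
    mutual = sum((c / n) * math.log(c * n / (count_a[a] * count_b[b]))
                 for (a, b), c in joint.items())
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())
    if entropy_a == 0 or entropy_b == 0:
        return 0.0
    return mutual / math.sqrt(entropy_a * entropy_b)

# Two hypothetical rankings over 8 nodes. Both place the community {0, 1, 2, 3}
# first, but they order the nodes inside each community differently.
ranking_x = [0, 1, 2, 3, 4, 5, 6, 7]
ranking_y = [3, 2, 1, 0, 7, 6, 5, 4]

# Cutoff at the community boundary: both partitions are identical.
nmi_at_boundary = normalized_mutual_information(
    partition_at(ranking_x, 4), partition_at(ranking_y, 4))

# Cutoff inside the community: the two partitions accept different nodes.
nmi_inside = normalized_mutual_information(
    partition_at(ranking_x, 2), partition_at(ranking_y, 2))

print(nmi_at_boundary, nmi_inside)
```

In this toy case the score reaches 1 (up to floating-point rounding) only when the cutoff coincides with the community boundary, and is much lower for a cutoff inside the community.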
³ In brief, the rewiring works as follows: Nodes are first randomly assigned to two communities. Then, rewiring works by selecting two links A↔B and C↔D where A and C are in the same community and B and D are in the same community. These two links are replaced with the links A↔C and B↔D, thereby increasing the intra-community links without changing the degree distribution or link count.

The results of this experiment are shown in the top graph of Figure 4. For clarity, we only show the mutual information between partitionings of SybilGuard and each of the other three schemes (the other pairs are similar). The x-axis denotes the size of the partition containing non-Sybils. For example, the x-axis value of 10 divides the ranking into two parts, one with the first 10 nodes in the ranking (marked as non-Sybils) and the other with the rest of the nodes (marked as Sybils). Thus, Figure 4 shows the mutual information between pairs of rankings at all possible cutoff points.

Figure 4 shows that the mutual information metric is maximized at a partitioning of size 256. Interestingly, it falls off sharply before and after this cutoff value. To understand this plot better, we investigated the strong correlation between the different node rankings at the partitioning size of 256 and found that the 256 members that each scheme assigned to the non-Sybil partition strongly corresponded to the half of the network in Figure 3 that contained the trusted node. This indicates that all schemes are biased towards ranking nodes in the local community around the trusted node higher than nodes outside of the community. However, there is little correlation between the ordering of nodes within the community, or the nodes outside of it, as the mutual information is low between pairs of rankings before and after this point.

3.1.2 The common factor behind the rankings

One hypothesis that could explain our above observations is that the nodes are being ranked such that nodes well connected to the trusted node are more likely to be higher in the rankings. Since there are several nodes within the local community of the trusted node that are equally well connected, the ranking amongst these nodes is not strictly enforced, i.e., the different schemes rank these nodes differently. Similarly, several nodes outside the local community are equally poorly connected and so their relative ranking is not consistent across the different Sybil schemes. However, there is a sharp distinction between the connectivity of nodes inside and outside the local community, and so the former are ranked before the latter.

Figure 4: Mutual information between pairs of rankings and conductance of each ranking plotted for various partitions for the synthetic network, using schemes SybilGuard (SG), SybilLimit (SL), SumUp (SU), and SybilInfer (SI). A strong correlation is observed at 256 nodes, indicating a high degree of overlap between the partitionings, and a strong community structure in the non-Sybils, at this point.
[Plot: top panel, mutual information (0 to 1) for the pairs SG-SL, SG-SU, SG-SI; bottom panel, conductance (0 to 1) for SG, SL, SU, SI; x-axis, partition at node rank (0 to 500).]

Figure 5: Mutual information between pairs of rankings and conductance of each ranking plotted for various partitions of the four schemes when run on the Facebook network.
[Plot: same panels, curves, and axes as Figure 4.]

To confirm this hypothesis, we used a well known metric called conductance [16] for determining how closely a subset of nodes within a network are connected among themselves relative to the rest of the network. Conductance is a widely used metric for evaluating the quality of communities within large networks. In brief, the conductance of a set of nodes ranges between 0 and 1, with lower numbers indicating stronger communities.

We plot the conductance of the non-Sybil subset in the bottom of Figure 4 and notice that there is a sharp inflection point in the conductance at 256 nodes for all schemes. This corresponds to the boundary between the two communities in our synthetic network topology. Adding nodes from another community sharply increases the conductance, so all schemes assign higher rankings to nodes from within the community around the trusted node than to nodes from outside the community. This helps explain why the partitions obtained from the rankings match very well when the cutoff is set at the inflection point.

3.2 Rankings in real-world networks

In this section, we verify that the results we found for our synthetic network also hold in real-world networks. First, we wish to check that nodes are ranked in a biased manner, such that nodes from the trusted node's local community rank higher than any other nodes. Second, we wish to test if the point at which all Sybil defense schemes agree corresponds to a trough in the conductance value, indicating the boundary of the community around the trusted node. To show this, we repeat the experiment above for two real-world networks: Facebook, consisting of the social network between Rice University graduate students taken from Facebook [21], and Astrophysics, consisting of the co-authorship network between astrophysicists [25]. Details on these datasets are provided in Table 2.

Figure 6: Mutual information between pairs of rankings and conductance of each ranking plotted for various partitions of the four schemes when run on the Astrophysics network.
[Plot: same panels, curves, and axes as Figure 4.]

As we can see in Figures 5 and 6, the mutual information reveals a local cutoff where all rankings have strong correlation, and this cutoff is also characterized by a low conductance value. Taken together, our experiments show that all Sybil defense schemes are identifying a local community that surrounds the trusted node, but that the ranking of nodes they use to reach the local community (and that they use after this point) is not strongly correlated.

3.3 Summary of observations

We now summarize the findings from our comparison of the way in which various algorithms rank nodes:

• The ranking of nodes is biased towards those which decrease conductance. Thus, nodes that are tightly connected around a trusted node (i.e., those that form subsets with lower conductance) are more likely to be ranked higher.
• When there are multiple nodes that are similarly well connected to the trusted node (i.e., they form subsets with similar conductance) they are often ordered differently in different algorithms.
• When the trusted node is located in a densely connected community of nodes, with a clear boundary between this community and the rest of the network, the nodes in the local community around the trusted node are ranked before others.

4. APPLYING COMMUNITY DETECTION

In the previous section, we observed that all Sybil defense schemes work by identifying nodes in the local community around a given trusted node
and ranking them as more trustworthy than those outside. In this section, we examine whether algorithms that are explicitly designed to detect communities, called community detection algorithms [2, 3, 6, 19], can be used for Sybil defense in the same manner as existing schemes. Our goal is to investigate the potential for leveraging existing literature in community detection to defend against Sybils. To this end, we first select an off-the-shelf community detection algorithm and generate a node ranking from the algorithm. We then compare its node ranking with those of existing Sybil defense schemes, to determine if it is able to defend against Sybils with similar accuracy.

4.1 Community detection

Community detection in networks is a well studied and mature field. There are numerous approaches that use different mechanisms in order to detect communities and different metrics to evaluate the quality of communities. Below, we give a brief overview of how community detection schemes work.

In this paper, we focus on local community detection schemes [3], which do not require a global view of the network.⁴ Most of the local approaches work by starting with one (or more [2]) seed nodes and greedily adding neighboring nodes until a sufficiently strong community is found. For example, Mislove's algorithm [21] iteratively adds nodes that improve the normalized conductance (a metric closely related to conductance) at each step, and stops when the conductance metric reaches an inflection point. For a detailed survey of local community detection algorithms, we refer the reader to the recent survey paper by Fortunato [10], which discusses numerous algorithms for community detection.

⁴ Our decision to focus on local community detection algorithms, as opposed to global ones, is due to the fact that they work in a similar manner as existing Sybil defense schemes by not assuming a global view. However, it has been shown that different global community detection algorithms have many of the same properties as local ones [15], indicating that our results would likely hold for global algorithms as well. We leave this to future work.

As there is a large body of work on community detection, we could theoretically utilize any of these algorithms as the ranking algorithm. For the evaluation presented in this section, we selected Mislove's algorithm [21], but with the conductance metric from Section 3.1.2. We chose this algorithm as it is conceptually easy to understand, since it greedily minimizes conductance. However, our decision is not fundamental, and there may be other algorithms that perform better (especially since different community detection algorithms have been shown to perform better on different networks [15]). Rather, our goal here is simply to investigate how well off-the-shelf community detection algorithms are able to find Sybils.

In order to use community detection to find Sybils, we need to generate a node ranking in the same manner as the other schemes. To do so, we run Mislove's community detection algorithm and record the node that it iteratively adds at each step to minimize conductance. Note that we modify the algorithm to not stop once a local trough is found; instead we allow it to continue running until all of the nodes have been added. This results in a node ranking that we can use to compare against the other schemes.

4.2 Evaluating Sybil detection

We now evaluate the community detection algorithm against our existing Sybil defense schemes. When comparing against each of the Sybil defense schemes, we used experimental settings similar to those described in the paper in which the scheme was proposed. This required us to split our evaluation results in two separate sections: one for SybilGuard, SybilLimit, and SybilInfer and another for SumUp. The split is necessary because SumUp was originally evaluated for its ability to limit the number of votes Sybil identities can place, and not for its ability to accurately detect Sybil nodes. Thus, the experimental settings for evaluating SumUp are quite different from those of the other schemes, necessitating a separate evaluation.

A summary of the datasets that we use in the evaluation is shown in Table 2. In addition to the datasets from the previous section, we examine YouTube, consisting of the social network of users in YouTube [20], and Advogato, consisting of the trust network between free software developers [1].

| Network | Nodes | Links | Avg. degree |
|---|---|---|---|
| YouTube [20] | 446,181 | 1,728,938 | 7.7 |
| Astrophysicists [25] | 14,845 | 119,652 | 16 |
| Advogato [1] | 5,264 | 43,027 | 16 |
| Facebook [21] | 514 | 3,313 | 13 |

Table 2: Statistics of datasets used in our evaluation.

4.2.1 Measuring Sybil detection accuracy

In order to measure the accuracy of the various schemes at identifying Sybils, we need a way to compute how often a scheme ranks Sybil nodes towards the bottom of the ranking. To do so, we use the metric Area under the Receiver Operating Characteristic (ROC) curve, or A′. In brief, this metric represents the probability that a Sybil defense scheme ranks a randomly selected Sybil node lower than a randomly selected non-Sybil node [9]. Therefore, the A′ metric takes on values between 0 and 1: A value of 0.5 represents a random ranking, with higher values indicating a better ranking and 1 representing a perfect non-Sybil/Sybil ranking. Values below 0.5 indicate an inverse ranking, or one where Sybils tend to be ranked higher than non-Sybils. A very useful property of this metric is that it is defined independent of the number of Sybil and non-Sybil nodes, as well as the cutoff value, so it is comparable across different experimental setups and schemes.

4.2.2 SybilGuard, SybilLimit, and SybilInfer

For comparing SybilGuard, SybilLimit, and SybilInfer to the community detection algorithm, we use the same experimental methodology as the most recent proposal, SybilInfer. Specifically, we use a 1,000 node scale-free topology [4] for the non-Sybil part of the network. Among this set of non-Sybil nodes, a small fraction (10%) of the nodes are compromised by an adversary and become Sybil nodes. These 100 malicious nodes are chosen uniformly at random. These nodes then introduce additional Sybil identities into the network, which form a scale-free topology among themselves using the same parameters as the non-Sybil region. We vary the number of introduced nodes from 30 to 1,000, and average the results over 100 experimental runs.

We present the results of this experiment in Figure 7. We make two important observations: First, SybilInfer and community detection perform well, with improving accuracy as more Sybils are added. The reason for this increase is that the Sybil region becomes larger and, therefore, easier to distinguish from the non-Sybil region. Second, both SybilGuard and SybilLimit perform less well than the other two schemes. This effect arises because the number of Sybil nodes added is lower than the bound enforced by these two schemes, as was observed in the evaluation on SybilInfer [7]. In more detail, the Sybil region is connected to the non-Sybil region by 789 attack edges on average; SybilGuard and SybilLimit ensure that no more than O(log N) nodes will be accepted per attack edge, where N is the number of nodes in the network. Since we only add a maximum of 1,000 Sybil nodes, neither of these schemes marks many nodes as Sybils.

Figure 7: Accuracy for Sybil defense schemes, as well as community detection (CD), on the synthetic topology as we vary the number of additional Sybil identities introduced by colluding entities.
[Plot: x-axis, number of additional Sybil nodes (0 to 1000); y-axis, area under ROC curve (A′) (0 to 1); curves Random, SG, SL, SI, CD.]

We now evaluate these schemes on a real-world social network. Specifically, we repeat this experiment on the Facebook graduate student network from before. This network has a density similar to that of the synthetic network, but is only half the size. The results of this experiment are presented in Figure 8. As we can see, the community detection algorithm performs favorably compared to the explicit Sybil defense schemes, and all become more accurate as more Sybils are added. A careful reader may note that the absolute accuracy of all schemes (community detection included) is significantly lower than that observed above in Figure 7. The underlying reason for this lower performance is a structural characteristic of the Facebook network that makes it inherently harder to distinguish Sybils from non-Sybils. We explore this limitation in greater detail in
Section5.4.2.3SumUpRecallthatSumUpprovidesaSybil-resilientvotingservice.Todoso,SumUpdenesavotingenvelopewhereinthelinksareassignedacapacitysothatallvotesfromwithintheenve-lopecanbecollected.Outsidethisenvelope,votesareonlycollectedifthevotercanndanpathwithcapacitytothe 0 0.2 0.4 0.6 0.8 1 0 50 100 150 200 250 300 350 400 Area under ROC curve (A')Number of additional Sybil nodesRandom SG SL SI CD Figure8:AccuracyintheFacebooknetworkaswevarythenumberofadditionalSybilidentitiesintro-ducedbycolludingentities.votecollector(i.e.,thetrustednode).Inordertoapplycom-munitydetection,wereplacetheprocessthatdeterminesthevotingenvelopewithacommunitydetectionalgorithm,pickthecommunitywiththelowestconductancevaluetobetheenvelope,andunconditionallyacceptallvotesfromnodeswithinthisenvelope.Fornodesoutsidetheenvelope,weassignallotherlinkstohavecapacityone,andwecollecttheirvotesiftheycanndapathwithweighttoanynodewithintheenvelope.Thisdierenceisnecessarysincewedon'tassignweightstolinkswithintheenvelope,asSumUpdoes.WeevaluateandcomparethecommunitydetectionschemeagainstSumUponthreedierentdatasets:Ad-vogato,Astrophysics,andYouTube.WefollowthesamemethodologyusedintheoriginalSumUpevaluation[29]:foreachnetwork,weinject100attackedgesbyinserting10Sybilnodeswithlinksto10otheruniformlyrandomlycho-sennon-Sybilnodes.Inordertocastbogusvotes,eachSybilnodeisfurtherattachedtoalargenumberofSybilidentitiesbyasinglelinkeach.Asintheoriginalevaluation,weran-domlyselectavotecollectorandrandomlychooseasubsetofnon-Sybilsasvoters.WeplottheaveragestatisticsoverveexperimentalrunsforbothSumUpandthecommunitydetectionalgorithm.Toevaluatetheaccuracyoftheseschemes,wemustdeneanewmetric.ThisisbecauseSumUpdoesnotclassifyallnodesasSybilornon-Sybil(neededforA0),butrather,onlythosenodeswhichissuevotes.Sincesubsetsofboththenon-SybilandSybilnodesareissuingvotes,ideally,theschemewouldonlycountthenon-Sybilvotes.Thus,ourmetricshouldpenalizetheundercountingofnon-Sybilvotes,aswellasthecountingofanySybilvotes.Themetricwedene,voteac
curacy, is expressed as the number of non-Sybil votes counted divided by the sum of the number of non-Sybil votes issued and the number of Sybil votes counted. Vote accuracy ranges between 0 and 1, where higher values represent better performance. Figure 9 presents the results of this experiment, as we vary the number of non-Sybil voters (Sybils try to vote as often as they can). The most salient result is that the accuracy for SumUp varies widely across the three networks; this is a direct result of using the envelope technique. In certain networks, one or more of the Sybil nodes is accepted into the envelope, and a large number of malicious votes are cast. The results for the community detection algorithm are significantly more stable, producing useful results once the number of non-Sybil voters rises above 1%.

[Figure 9 plot: vote accuracy (0 to 1) versus number of non-Sybil voters / total non-Sybil nodes (log scale, 0.001 to 1); panels: Astrophysics, Advogato, YouTube; series: SU, CD]
Figure 9: Vote accuracy of SumUp and community detection on three networks.

4.3 Implications

We began this section by observing that, since all Sybil defense schemes appeared to be identifying local communities, explicit community detection algorithms may be able to defend against Sybils as well. It is interesting to note that, even without changing the experimental setup under which existing schemes were evaluated, our simple community detection algorithm gives results comparable to those of existing schemes.

Our results have both positive and negative implications for future designers of Sybil defense schemes. On the positive side, our results demonstrate that there is an opportunity to leverage the large body of existing work on community detection algorithms for Sybil defense [10]. Prior work on community detection provides a readily available source of sophisticated graph analysis algorithms around which researchers could improve existing schemes and design new approaches. On the negative side, relying on community detection for performing Sybil defense fundamentally limits the ability of these schemes to find Sybils in many real-world graphs. We explore these limitations in the next section.

5. LIMITATIONS OF SYBIL DEFENSE

In the previous sections, we showed that Sybil defense schemes work by effectively identifying nodes within tightly-knit communities around a given trusted node as more trustworthy than those farther away. In this section, we investigate the limitations of relying on the community structure of the social network to find Sybils. More specifically, we explore how the structure of the social network impacts the performance of Sybil defense schemes and how attackers with knowledge of the structure of the social network can leverage it to launch more efficient Sybil attacks.

Since social network-based Sybil defense schemes use the structure of social networks to distinguish the Sybil nodes from the non-Sybil nodes, we begin by asking the following question: Are there networks where it is hard to tell these two types of nodes apart? In other words, could there be networks where the non-Sybil nodes look like Sybils or where it would be easy for Sybil nodes to masquerade as non-Sybils?

Intuitively, one would expect networks where the non-Sybil region consists of multiple, small, tightly-knit communities that are interconnected sparsely to be more vulnerable to Sybil attacks. In such networks, nodes within one community might mistake non-Sybil nodes in another community for Sybils, due to limited connectivity between the communities. Furthermore, an attacker can easily disguise Sybil nodes as just another community in the network by establishing a small number of carefully targeted links to the community containing the trusted node. Next, we verify this intuition using experiments over synthetic and real-world social networks where the non-Sybil nodes have different community structures and the Sybil nodes use different attack strategies.
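The intuition above can be checked on a toy graph. The sketch below is an illustration only, not the experiment reported in this section: it builds two tightly-knit non-Sybil cliques joined by a few links, attaches a Sybil clique to the trusted node's community with the same number of links, and counts the links leaving each group. All names (`clique`, `build_toy_graph`, `cut_size`) and all sizes are assumptions chosen for the example.

```python
from itertools import combinations


def clique(nodes):
    """Return the set of undirected edges of a complete graph on `nodes`."""
    return {frozenset(pair) for pair in combinations(nodes, 2)}


def build_toy_graph(size=8, bridge_links=2):
    """Build two non-Sybil cliques (A, B) and one Sybil clique (S).

    A is assumed to hold the trusted node. B and S are each attached to A
    by `bridge_links` links, so S mimics an ordinary community.
    """
    a = [f"a{i}" for i in range(size)]
    b = [f"b{i}" for i in range(size)]
    s = [f"s{i}" for i in range(size)]
    edges = clique(a) | clique(b) | clique(s)
    for i in range(bridge_links):
        edges.add(frozenset((a[i], b[i])))  # sparse non-Sybil interconnection
        edges.add(frozenset((a[i], s[i])))  # attack edges into community A
    return edges, {"A": set(a), "B": set(b), "S": set(s)}


def cut_size(edges, group):
    """Count links with exactly one endpoint inside `group`."""
    return sum(1 for edge in edges if len(edge & group) == 1)


edges, groups = build_toy_graph()
cuts = {name: cut_size(edges, members) for name, members in groups.items()}
print(cuts)
```

Seen from community A, the honest community B and the Sybil community S sit behind cuts of the same size (two links each in this toy setting), so connectivity alone gives no basis for telling them apart.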
Figure 10: Illustrations of the synthetic networks used in Section 5.1 (the actual networks are much larger). Non-Sybils are dark green and Sybils are light orange. While the non-Sybil regions of (a), (b), and (c) show increasing amounts of community structure, all non-Sybil regions have the same number of nodes and links, and the same degree distribution.

5.1 Impact of social network structure

We first examine the sensitivity of Sybil defense schemes to the structure of the non-Sybil region. As in Sections 3 and 4, we analyze synthetic networks and then show that the results from these simple cases apply to real-world networks as well.

We first generate a Barabasi-Albert random synthetic network [4] with 512 nodes and initial degree m = 8. This results in a random power-law network with approximately 3,900 links, and without any community structure. We then iteratively generate a series of networks by rewiring [3] five links in the same manner as in Section 3 (resulting in a network), then rewiring five more links (resulting in another network), and so on, until only five links remain between the two communities of 256 nodes each (resulting in a final network). The output is a series of networks that all have the same number of nodes, number of links, and degree distribution, but are increasing in the level of community structure that they exhibit. Figure 10 gives an illustration of the initial, intermediate, and final networks.

We use this series of networks to evaluate how well Sybil defense schemes perform on networks with increasing amounts of community structure. To do so, we treat each of these networks as the non-Sybil region, and we randomly attach a Sybil region of 256 nodes using 40 links. We then evaluate how well the existing schemes are able to detect Sybils by using the A′ metric. The results of this experiment for the final 16 networks are shown in Figure 11. It can clearly be seen that the Sybil defense schemes perform much better in the networks with less community structure than in those with more community structure. In fact, when there is a high level of community structure, the Sybil defense schemes perform close to what would be expected with a random ranking (indicated by an A′ value of 0.5). Thus, the effectiveness of these schemes is very sensitive to the level of community structure present in the non-Sybil region of the network.

[Figure 11 plot: area under ROC curve (A′, 0 to 1) versus network number (increasing rewiring, 1 to 16); series: Random, SG, SL, SU, SI]
Figure 11: Accuracy of Sybil defense schemes on synthetic networks with increasing community structure induced by rewiring. With high levels of community structure, the accuracy of all schemes eventually falls to close to random.

Network                     Nodes     Links      Modularity
Facebook undergrad [21]     1,208     43,043     0.278
Advogato [1]                5,264     43,027     0.318
Wikipedia votes [13]        7,066     100,736    0.350
URV email [11]              1,133     5,451      0.504
Astrophysicists [25]        14,845    119,652    0.621
Facebook grad [21]          514       3,313      0.644
High-energy physics [14]    8,638     24,806     0.690
Relativity [14]             4,158     13,422     0.790

Table 3: Size and modularity of the real-world datasets used in our evaluation. We assume all the graphs to be undirected and use the largest connected component.

Next, we examine whether this observation holds in real-world networks. To do so, we collected a set of real-world networks that have varying levels of community structure, shown in Table 3. In order to measure the level of community structure present in the networks, we use the well-known metric modularity [26]. In brief, modularity ranges between -1 and 1, with 0 representing no more community structure than a random graph. Strongly positive values indicate significant community structure and strongly negative values indicate less community structure than a random graph. As can be observed in the table, these eight networks have modularity values ranging from 0.28 to 0.79, indicating moderate to strong levels of community structure.

[Figure 12 plot: area under ROC curve (A′, 0 to 1) versus modularity (0 to 1); series: Random, SG, SL, SI, SU]

We conducted a similar experiment to the one above, treating these networks as the non-Sybil region, attaching a Sybil region, and evaluating the accuracy of Sybil defense. However, since these networks are of very different scales, we created a power-law Sybil region for each network with one-quarter as many Sybils as there are non-Sybils, and attached these Sybil regions to the non-Sybils randomly with
a number of links equal to 5% of the links between non-Sybil nodes. The results of this experiment are shown in Figure 12. We observe a clear trend: As the level of community structure increases, evidenced by increasing modularity, the performance of the Sybil defense schemes falls close to random. In fact, a correlation coefficient of -0.81 is observed between the modularity value and the A′ metric, demonstrating that increasing levels of community structure are strongly anti-correlated with the ability to distinguish Sybils. This poor accuracy also corresponds well with recent work [23] that has suggested that many real-world networks may not be as fast-mixing as was previously thought. Thus, as observed above for synthetic networks, Sybil defense schemes are extremely sensitive to the level of community structure present in real-world networks as well.

Figure 12: Accuracy of Sybil defense schemes on real-world networks from Table 3 with various levels of community structure. Significantly worse performance is observed as the level of community structure increases.

Figure 13: Illustrations of the synthetic networks used in Section 5.2 (the actual networks are much larger). Non-Sybils are dark green and Sybils are light orange. With decreasing k, the Sybil nodes place their links closer to the trusted node.

5.2 Resilience to targeted Sybil attacks

We now examine the sensitivity of Sybil defense schemes to Sybil attacks that leverage knowledge of the structure of the social network to establish links to a targeted subset of nodes in the network. Recall that all schemes assume that the Sybil nodes are allowed to create only a bounded number of links to non-Sybils. When evaluating the schemes, the authors of these schemes assume that the attacker establishes these links to random nodes in the network. We now explore how this one aspect of the attack model (random link placement to non-Sybils) can affect the performance of Sybil defense schemes by allowing the Sybils a level of control over where those links are placed. As before, we first examine the behavior using synthetic networks and then examine real-world networks.

To create the synthetic network, we use the methodology from Section 5.1, with rewiring done until only 40 links remain between the two communities of 256 nodes each. We then create a series of scenarios where we increasingly allow the Sybils more control over where their links to non-Sybils are placed. Specifically, instead of requiring the Sybil links to be placed randomly over the entire non-Sybil region, we allow the Sybils to place these links randomly among the k nodes closest to the trusted node, where closeness is defined by the ranking given by the community detection algorithm used in Section 4. In all cases, the number of Sybil-to-non-Sybil links remains the same. Thus, as k is reduced, the Sybils are allowed to target their links closer to the trusted node. We then calculate the accuracy of the Sybil defense schemes. An illustration of these networks is shown in Figure 13.

Figure 14 presents the results of this experiment. We see a decrease in accuracy as the Sybils are allowed to place their links closer to the trusted node. This is a result of the Sybil nodes being placed higher in the Sybil defense scheme's ranking, and therefore being less likely to be detected. From this simple experiment, it is clear that the performance of Sybil defense schemes is highly dependent on the attack model, depending (for example) not just on the number of links the attacker can form, but also on how well those links can be targeted.

We then repeat the same experiment using the Facebook graduate student network. The results of this experiment are shown in Figure 15, and are even more striking than those of the previous experiment. As the attackers are allowed more control over link placement (i.e., as k is reduced), the accuracy first falls to no better than random, before dropping significantly below 0.5. This indicates that the Sybil defense schemes are ranking Sybils significantly higher than non-Sybils, meaning the schemes are admitting Sybils and blocking non-Sybils. The reason for this is the strong community structure present in the Facebook network combined with the stronger attack model: as the Sybils target their links more carefully, they appear as part of the trusted node's local community and are therefore more highly ranked.

5.3 Implications

In this section, we explored how the performance of Sybil defense schemes is affected by the structure of the social network and by the ability of the attacker to exploit the structure of the social network to launch targeted attacks. Based on our understanding of how Sybil defense schemes work, we hypothesized that networks with well-defined community structure would be more vulnerable to Sybil attacks. We verified our hypothesis by demonstrating that, as the non-Sybil region contains more significant community structure, the detection accuracy of all schemes falls significantly and the schemes are vulnerable to targeted Sybil attacks.

Our analysis reveals fundamental limitations of existing Sybil defense schemes that arise out of their reliance on community structure in the network. Our list of limitations is by no means exhaustive; other vulnerabilities of relying on community detection exist. For example, a recent study has shown that identifying communities reliably in a wide range of real-world networks is a notoriously difficult task [15]. We hope that, by pointing out these limitations, we motivate the need for Sybil defense schemes to be evaluated on a wider range of social networks and attack models. Our findings also point to a need to develop Sybil defense schemes that work by leveraging different network features (or additional information beyond the network structure) than existing schemes, allowing Sybil defense to be effective where now it is not.

[Figure 14 plot: area under ROC curve (A′, 0 to 1) versus k (0 to 500); series: Random, SG, SL, SU, SI]
Figure 14: Accuracy of Sybil defense schemes on synthetic networks when Sybils are allowed to target their links among the closest k nodes to the trusted node. As the Sybils place their links closer (lower k), the accuracy of all schemes falls.

[Figure 15 plot: area under ROC curve (A′, 0 to 1) versus k (0 to 500); series: Random, SG, SL, SU, SI]
Figure 15: Accuracy of Sybil defense schemes on the Facebook network when Sybils are allowed to target their links among the closest k nodes to the trusted node. As the Sybils place their links closer, all schemes begin ranking Sybil nodes higher than non-Sybils (as evidenced by the A′ below 0.5).

6. CONCLUDING DISCUSSION

In this paper, we have taken the first steps towards developing a deeper understanding of how the numerous proposed social network-based Sybil defense schemes work. We found that, despite their considerable differences, all Sybil defense schemes rely on identifying communities in the social network. Unfortunately, we also discovered that this reliance on community detection makes the schemes fundamentally vulnerable to Sybil attacks when operating over networks where the non-Sybil nodes form strong communities.

In light of these negative results, we look for alternative approaches to Sybil defense that could be deployed in practice. In this section, we first focus our discussion on additional challenges that arise when deploying social network-based Sybil defense schemes in practice. We then discuss two ways to improve Sybil defenses moving forward. We present our discussion points as questions and answers.

Are links in social networks hard to form? All the Sybil defense schemes discussed in this paper make the assumption that Sybils can only form a certain number of links to non-Sybils. However, it remains an open question whether this is true in any online social network of today; it is clear that, at least in some social networks, the assumption does not hold [5].

Are Sybils necessarily bad? In all of the Sybil defense schemes, it is assumed that the presence of Sybils is evidence of misbehavior, and as such, no non-Sybil should interact with a Sybil. However, there are legitimate reasons why a user might wish to create multiple identities. For example, users may wish to partition their identity into one that is used to interact with co-workers, and another that is used to interact with friends and family (e.g., the multiple email addresses that many people use today). Users posting videos to YouTube may wish to post under pseudonyms in order to avoid revealing their real-world identity while still using a personal account to rate videos and post comments.

Since the mere presence of users with multiple accounts is not necessarily indicative of misbehavior, what we should be concerned with is not necessarily the presence of Sybils, but, rather, the use of Sybils for misbehavior. Detecting Sybils and simply excluding them from a system is only one particularly draconian way of accomplishing this.

Should Sybil defenses move towards Sybil tolerance? Instead of explicitly identifying Sybils, as SybilGuard, SybilLimit, and SybilInfer do, a system could aim simply to prevent Sybils from gaining access to extra privileges. SumUp, for example, attempts to limit the votes coming from Sybil nodes by limiting the effect of votes from potential Sybil regions. Instead of explicitly identifying nodes, the protocol seeks to limit their ability to disproportionately affect the resulting vote count. As a result, the system does not try to prevent users from creating multiple identities, but rather, ensures that by doing so, they are unable to gain any additional privileges. We believe that building Sybil tolerance into applications may require more effort, and is clearly less general than identifying Sybils, but allows application designers to sidestep the arms race of locating Sybils in the social network.

Should Sybil defenses leverage more information? Given the inherent limitations of relying solely on the social network in order to defend against Sybils, an attractive way to improve on these schemes is to give Sybil defense schemes additional information. As a simple example, suppose a Sybil defense scheme were given a list of nodes, one in each of the different communities within the network, that were either known to be Sybils or known to be non-Sybils. In this case, it is clear that this additional information could be used by community detection algorithms to accurately differentiate between communities containing Sybil and non-Sybil nodes. In contrast, current Sybil defense schemes are given only a single trusted node as input and, consequently, they perform poorly.

As another example,
recent work has suggested that activity between users may be a better predictor of the strength of the social link between them [30, 31]. These studies indicate that even in networks where users accept friend requests from arbitrary sources, users engage in shared activity (e.g., exchanging messages) with only a limited subset of their friends. Thus, having additional information about user activity could help weed out weak social connections, including links from Sybil nodes.

Finally, for all of the work that has focused on social network-based Sybil defenses, it is unclear how far we are from having these ideas applied to actual deployed systems. However, as digital identities become more important, it is clear that the potential for fraud, deception, and other misbehavior will increase, thereby necessitating Sybil defenses. Understanding the benefits, limitations, and tradeoffs associated with alternative approaches to Sybil defense is an important step towards making this happen.

Acknowledgements

We thank the anonymous reviewers and our shepherd, Cristina Nita-Rotaru, for their helpful comments. This research was supported in part by an Amazon Web Services in Education Grant.

7. REFERENCES

[1] Advogato trust network. http://www.trustlet.org/wiki/Advogato.
[2] R. Andersen and K. J. Lang. Communities from seed sets. In Proc. WWW'06, Edinburgh, Scotland, May 2006.
[3] J. P. Bagrow. Evaluating local community methods in networks. J. Stat. Mech., 2008(5), 2008.
[4] A.-L. Barabasi and R. Albert. Emergence of Scaling in Random Networks. Science, 286:509-512, 1999.
[5] L. Bilge, T. Strufe, D. Balzarotti, and E. Kirda. All your contacts are belong to us: Automated identity theft attacks on social networks. In Proc. WWW'09, Madrid, Spain, Apr 2009.
[6] A. Clauset. Finding local community structure in networks. Physical Review E, 72(2), 2005.
[7] G. Danezis and P. Mittal. SybilInfer: Detecting Sybil Nodes using Social Networks. In Proc. NDSS'09, San Diego, CA, Feb 2009.
[8] J. Douceur. The Sybil Attack. In Proc. IPTPS'02, Cambridge, MA, Mar 2002.
[9] J. Fogarty, R. S. Baker, and S. E. Hudson. Case studies in the use of ROC curve analysis for sensor-based estimates in human computer interaction. In Proc. GI'05, Victoria, BC, May 2005.
[10] S. Fortunato. Community detection in graphs. Physics Reports, 486:75, 2010.
[11] R. Guimera, L. Danon, A. Diaz-Guilera, F. Giralt, and A. Arenas. Self-similar community structure in a network of human interactions. Physical Review E, 68(6), 2003.
[12] J. Kleinberg. The Small-World Phenomenon: An Algorithmic Perspective. In Proc. STOC'00, Portland, OR, May 2000.
[13] J. Leskovec, D. Huttenlocher, and J. Kleinberg. Signed Networks in Social Media. In Proc. CHI'10, Atlanta, GA, Apr 2010.
[14] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graph Evolution: Densification and Shrinking Diameters. ACM TKDD, 1(1), 2007.
[15] J. Leskovec, K. Lang, and M. Mahoney. Empirical comparison of algorithms for network community detection. In Proc. WWW'10, Raleigh, NC, Apr 2010.
[16] J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Statistical Properties of Community Structure in Large Social and Information Networks. In Proc. WWW'08, Beijing, China, Apr 2008.
[17] C. Lesniewski-Laas. A Sybil-proof one-hop DHT. In Proc. SNS'08, Glasgow, Scotland, Apr 2008.
[18] C. Lesniewski-Laas and M. F. Kaashoek. Whanau: A sybil-proof distributed hash table. In Proc. NSDI'10, San Jose, CA, Apr 2010.
[19] F. Luo, J. Z. Wang, and E. Promislow. Exploring local community structures in large networks. Web Intelligence and Agent Systems, 6(4):387-400, 2008.
[20] A. Mislove, A. Post, K. P. Gummadi, and P. Druschel. Ostra: Leveraging Trust to Thwart Unwanted Communication. In Proc. NSDI'08, San Francisco, CA, Apr 2008.
[21] A. Mislove, B. Viswanath, K. P. Gummadi, and P. Druschel. You are who you know: Inferring user profiles in online social networks. In Proc. WSDM'10, New York, NY, Feb 2010.
[22] M. Mitzenmacher and E. Upfal. Probability and Computing. Cambridge University Press, Cambridge, UK, 2005.
[23] A. Mohaisen, A. Yun, and Y. Kim. Measuring the mixing time of social graphs. Technical report, University of Minnesota, 2010.
[24] S. Nagaraja. Anonymity in the wild: Mixes on
unstructured networks. In Proc. PET'07, Ottawa, ON, Jun 2007.
[25] M. E. J. Newman. The structure of scientific collaboration networks. PNAS, 98(2):404-409, 2001.
[26] M. E. J. Newman. Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 2004.
[27] D. Quercia and S. Hailes. Sybil attacks against mobile users: Friends and foes to the rescue. In Proc. INFOCOM'10, San Diego, CA, Mar 2010.
[28] A. Strehl and J. Ghosh. Cluster Ensembles - A Knowledge Reuse Framework for Combining Partitionings. In Proc. AAAI'02, Palo Alto, CA, Mar 2002.
[29] N. Tran, B. Min, J. Li, and L. Subramanian. Sybil-Resilient Online Content Voting. In Proc. NSDI'09, Boston, MA, Apr 2009.
[30] B. Viswanath, A. Mislove, M. Cha, and K. P. Gummadi. On the Evolution of User Interaction in Facebook. In Proc. WOSN'09, Barcelona, Spain, Aug 2009.
[31] C. Wilson, B. Boe, A. Sala, K. P. N. Puttaswamy, and B. Y. Zhao. User Interactions in Social Networks and their Implications. In Proc. Eurosys'09, Nuremberg, Germany, Apr 2009.
[32] H. Yu, P. B. Gibbons, M. Kaminsky, and F. Xiao. SybilLimit: A Near-Optimal Social Network Defense against Sybil Attacks. In Proc. IEEE S&P, Oakland, CA, May 2008.
[33] H. Yu, M. Kaminsky, P. B. Gibbons, and A. Flaxman. SybilGuard: Defending Against Sybil Attacks via Social Networks. In Proc. SIGCOMM'06, Pisa, Italy, Sep 2006.

APPENDIX

A. ANALYSIS OF SYBILGUARD

Assumed social network topology: SybilGuard [33] assumes that the non-Sybil region is fast mixing [22], meaning that after O(log n) hops (where n is the number of non-Sybils), the probability distribution of the last node on a random walk reaches the stationary distribution. SybilGuard assumes that the entire network (the Sybil region combined with the non-Sybil region) is not fast mixing.

Partitioning algorithm: SybilGuard uses a constrained random walk for marking nodes as non-Sybil or Sybil. It marks a suspect node as non-Sybil if the random walks from the trusted node and the suspect intersect; otherwise, the suspect is marked as a Sybil.

Node ranking by partitioning algorithm: In order to generate a ranking, we conduct random walks from the trusted node. We start with a walk length of 1 and increase it to k, where k is the length of the random walk such that all nodes in the network are marked as non-Sybil. The order in which nodes are marked as non-Sybil in these increasingly long random walks imposes a ranking. In the rare case when not all the nodes in the network are marked as non-Sybil using a single random seed and a long walk length, we conduct a series of random walks with different random seeds to induce a ranking for the remaining nodes.

Determining cutoff: SybilGuard uses O(√n log n) random walks to gather samples from the non-Sybil region of n nodes. For a social network with O(log n) mixing time, based on the birthday paradox, two non-Sybil nodes with √n samples from the non-Sybil region will have an intersection with high probability. SybilGuard relies on an estimation procedure for determining the appropriate length of the random walk, and consequently, the cutoff value.

B. ANALYSIS OF SYBILLIMIT

Assumed social network topology: SybilLimit [32] makes the same assumptions about the network as SybilGuard.

Partitioning algorithm: SybilLimit performs O(√m) independent random walks of length O(log n) from each node. Two conditions must be satisfied for the trusted node to mark a suspect as a non-Sybil. The first condition, called the intersection condition, requires that the last edge of one of the random walks of the trusted node and that of one of the random walks of the suspect intersect. The second condition, called the balance condition, limits the number of non-Sybils per attack edge. Each tail of a random walk is assigned a "load" that is not allowed to exceed a given threshold; the load is incremented each time the trusted node marks another suspect as a non-Sybil.

Node ranking by partitioning algorithm: SybilLimit has two primary parameters for controlling the number of nodes marked as non-Sybil in the network: the number of random walks from each node and the length of these walks. As these parameters are increased, greater numbers of nodes are marked as non-Sybil. Similar to SybilGuard, we infer a ranking based on the order in which nodes are marked as non-Sybil.

Determining cutoff: Similar to SybilGuard, SybilLimit relies on an estimation procedure to find the length of the random walks and the number of random walks required. These two parameters impose a cutoff.

C. ANALYSIS OF SYBILINFER

Assumed social network topology: SybilInfer [7] makes the same assumption as SybilGuard. SybilInfer also makes a further assumption that the modified random walks are fast mixing in real social networks.

Partitioning algorithm: SybilInfer performs multiple random walks from each node to sample nodes from the non-Sybil region. It further uses a Bayesian inference technique to determine the probability of any node in the system being marked as non-Sybil.

Node ranking by partitioning algorithm: Since SybilInfer assigns each node a probability of being a non-Sybil, the nodes can be ranked based on this probability. We conduct 30 runs of SybilInfer with different random seeds, and use the average probability over all the runs to determine the final ranking of the nodes.

Determining cutoff: SybilInfer partitions the nodes based on a threshold value for the probability of a node being non-Sybil.

D. ANALYSIS OF SUMUP

Assumed social network topology: SumUp assumes that the min-cut between the vote collector (i.e., the trusted node) and non-Sybil nodes occurs at the collector, and that the min-cut between Sybils and the non-Sybils occurs at the attack edges.

Partitioning algorithm: SumUp partitions nodes based on whether their vote is accepted or not. Nodes whose votes are accepted are treated as non-Sybils, whereas nodes whose votes are subject to capacity constraints are treated as Sybils.

Node ranking by partitioning algorithm: SumUp decides whether a vote will be collected or not by defining a voting envelope within which all votes are collected and outside of which votes are constrained to one per link out of the envelope. The size of the voting envelope is controlled by the parameter C_max, which is the maximum number of votes that can be collected by the trusted node. In order to rank nodes, we increase C_max from 1 to k, where k is the value for which the voting envelope contains the entire network. The order in which these nodes are added to the voting envelope induces a ranking.

Determining cutoff: C_max determines the size of the voting envelope and serves as the cutoff parameter.
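
Each analysis above reduces a scheme to a ranking of nodes plus a cutoff. As a rough illustration of how such a ranking can be scored independently of any cutoff, the sketch below computes the area under the ROC curve (the A′ metric used in Section 5) as the fraction of (non-Sybil, Sybil) pairs in which the non-Sybil is ranked higher. The function name `area_under_roc`, the list-based inputs, and the toy node labels are assumptions made for illustration; this is not the authors' evaluation code, and it assumes a strict ranking without ties.

```python
from typing import Hashable, Sequence, Set


def area_under_roc(ranking: Sequence[Hashable], sybils: Set[Hashable]) -> float:
    """Return A' for a ranking listed from most to least trusted.

    A' equals the probability that a randomly chosen non-Sybil is ranked
    above a randomly chosen Sybil (0.5 is what a random ranking achieves).
    """
    sybils_seen = 0        # Sybils ranked above the current position
    misordered_pairs = 0   # (non-Sybil, Sybil) pairs with the Sybil ranked higher
    non_sybil_count = 0
    for node in ranking:
        if node in sybils:
            sybils_seen += 1
        else:
            non_sybil_count += 1
            misordered_pairs += sybils_seen
    # After the loop, sybils_seen is the total number of Sybils.
    total_pairs = non_sybil_count * sybils_seen
    if total_pairs == 0:
        raise ValueError("ranking must contain both Sybil and non-Sybil nodes")
    return 1.0 - misordered_pairs / total_pairs


# Hypothetical ranking: one Sybil (s1) is ranked above one non-Sybil (c).
toy_ranking = ["a", "b", "s1", "c", "s2"]
toy_score = area_under_roc(toy_ranking, {"s1", "s2"})
print(round(toy_score, 3))
```

In the toy ranking, one of the six (non-Sybil, Sybil) pairs is misordered, so the score is 5/6. A value below 0.5 would mean that Sybils are, on average, ranked above non-Sybils, which is the situation reported for the targeted attack on the Facebook network.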