An Analysis of Social Network-Based Sybil Defenses

Bimal Viswanath (MPI-SWS, bviswana@mpi-sws.org)
Ansley Post (MPI-SWS, abpost@mpi-sws.org)
Krishna P. Gummadi (MPI-SWS, gummadi@mpi-sws.org)
Alan Mislove (Northeastern University, amislove@ccs.neu.edu)

ABSTRACT

Recently, there has been much excitement in the research community over using social networks to mitigate multiple identity, or Sybil, attacks. A number of schemes have been proposed, but they differ greatly in the algorithms they use and in the networks upon which they are evaluated. As a result, the research community lacks a clear understanding of how these schemes compare against each other, how well they would work on real-world social networks with different structural properties, or whether there exist other (potentially better) ways of Sybil defense.

In this paper, we show that, despite their considerable differences, existing Sybil defense schemes work by detecting local communities (i.e., clusters of nodes more tightly knit than the rest of the graph) around a trusted node. Our finding has important implications for both existing and future designs of Sybil defense schemes. First, we show that there is an opportunity to leverage the substantial amount of prior work on general community detection algorithms in order to defend against Sybils. Second, our analysis reveals the fundamental limits of current social network-based Sybil defenses: We demonstrate that networks with well-defined community structure are inherently more vulnerable to Sybil attacks, and that, in such networks, Sybils can carefully target their links in order to make their attacks more effective.

General Terms
Security, Design, Algorithms, Experimentation

Categories and Subject Descriptors
C.4 [Performance of Systems]: Design studies; C.2.0 [Computer-Communication Networks]: General - Security and protection

Keywords
Sybil attacks, social networks, social network-based Sybil defense, communities

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGCOMM'10, August 30–September 3, 2010, New Delhi, India.
Copyright 2010 ACM 978-1-4503-0201-2/10/08...$10.00.

1. INTRODUCTION

Avoiding multiple identity, or Sybil, attacks is known to be a fundamental problem in the design of distributed systems [8]. Malicious attackers can create multiple identities and influence the working of systems that rely upon open membership. Examples of such systems range from communication systems like email and instant messaging to collaborative content rating, recommendation, and delivery systems such as Digg and BitTorrent. Traditional defenses against Sybil attacks rely on trusted identities provided by a certification authority. But requiring users to present trusted identities runs counter to the open membership that underlies the success of these distributed systems in the
first place.

Recently, there has been excitement in the research community about applying social networks to mitigate Sybil attacks. A number of schemes have been proposed that attempt to defend against Sybils in a social network by using properties of the social network's structure [7, 29, 32, 33]. Unlike traditional solutions, these schemes require no central trusted identities, and instead rely on the trust that is embodied in existing social relationships between users.

All social network-based Sybil defense schemes make the assumption that, although an attacker can create arbitrary Sybil identities in social networks, he or she cannot establish an arbitrarily large number of social connections to non-Sybil nodes. As a result, Sybil nodes tend to be poorly connected to the rest of the network, compared to the non-Sybil nodes. Sybil defense schemes leverage this observation to identify Sybils. They use various graph analysis techniques to search for topological features resulting from the limited capacity of Sybils to establish social links.

Our focus in this paper is on the graph analysis algorithms behind the schemes. The literature on Sybil defense schemes is still in its early stages; most papers describe new algorithms, but none provide a common insight that explains how all of these schemes are able to detect Sybils. Each algorithm has been shown to work well under its own assumptions about the structure of the social network and the links connecting non-Sybil and Sybil nodes. However, it is unclear how these algorithms would compare against each other, on more general topologies, or under different attack strategies. As a result, it is not known if there exist other (potentially better) ways to mitigate Sybil attacks or if there are fundamental limits to using only the structure of the social network to defend against Sybils.

In this paper, we take a first, but important, step towards answering these questions. We decompose existing Sybil defense schemes and demonstrate that at their core, the various algorithms work by implicitly ranking nodes based on how well the nodes are connected to a trusted node. Nodes that have better connectivity to the trusted node are ranked higher and are deemed to be more trustworthy. We show that, despite their considerable differences, all Sybil defense schemes rank nodes similarly: nodes within local communities (i.e., clusters of nodes more tightly knit than the rest of the network) around the trusted node are ranked higher than nodes in the rest of the network. Thus, Sybil defense schemes work by effectively detecting local communities.

The above insight has important implications for both existing and future designs of social network-based Sybil defense schemes. First, it motivates us to investigate whether a class of algorithms, known as community detection algorithms [10], that attempt to find such clusters of nodes directly, could be used for Sybil defense. We find that it is possible to use off-the-shelf community detection algorithms to find Sybils. Unlike Sybil defense, community detection is a well-studied and mature field, implying that our findings open the door for researchers to exploit a variety of techniques from a rich body of community detection literature.

Second, our insight also hints at the limitations of relying on communities for finding Sybils. For Sybil defense schemes to work well, all non-Sybil nodes need to form a single community that is distinguishable from the group of Sybil nodes.¹ In reality, however, users in many social networks form multiple communities that are interconnected rather sparsely. We show that, in these networks, it is hard for a trusted node to distinguish Sybils from non-Sybils outside its local community. Further, we demonstrate how Sybils can launch extremely effective attacks by establishing just a small number of links to carefully targeted nodes within such networks. As systems are beginning to be built on top of Sybil defense schemes [17, 18, 27], our findings question the wisdom of building these systems without a thorough understanding of the limitations of Sybil defense.

¹ Many Sybil defense schemes impose this requirement implicitly by assuming that the non-Sybil region of the network is fast mixing [22], meaning a random walk of length O(log N) reaches a stationary distribution of nodes.

2. UNDERSTANDING SYBIL DEFENSE

As noted before, a variety of Sybil defense schemes have been proposed, but each has been evaluated using different social networks and attack strategies by the Sybils. Therefore, it is not well understood how these different schemes compare against each other, or how a potential user of these schemes, such as a real-world social networking site, would select one scheme over another.

2.1 The core of Sybil defense schemes

Given the problem of comparing competing Sybil defense schemes, one approach would be to view the schemes as complete coherent proposals (i.e., treat them as black boxes, and compare them in real-world settings). Such an approach is straightforward and would provide useful performance comparisons between a fixed configuration of schemes over a given set of social networks and attack strategies by the Sybils. However, it would not yield conclusive information on how a particular scheme would perform if either the given social network or the behavior of the attacker should change. It also does not allow us to derive any fundamental insights into how these schemes work, which might enable us to build upon and improve them.

An alternative approach is to find a core insight common to all the schemes that would explain their performance in any setting. Gaining such a fundamental insight, while difficult, not only provides guidance on improving future designs, but also sheds light on the limits of social network-based Sybil defense. However, we cannot gain such an insight by treating each of these schemes as a black box, with each carrying its own set of algorithms, optimizations, and assumptions. Instead, we need to reduce the schemes to their core task before analyzing them.

At a high level, all existing schemes attempt to isolate Sybils embedded within a social network topology. Every scheme declares nodes in the network as either Sybils or non-Sybils from the perspective of a trusted node, effectively partitioning the nodes in the social network into two distinct regions (non-Sybils and Sybils). Hence, each Sybil defense scheme can actually be viewed as a graph partitioning algorithm, where the graph is the social network. However, the quality and performance of the algorithm depend on the inputs, namely, the network topology and the trusted node.

Most Sybil defense schemes include a number of useful and practical optimizations that enhance their performance in specific application scenarios. For example, SybilGuard [33] and SybilLimit [32] have a number of design features that facilitate their use in decentralized systems. Similarly, SumUp [29] has optimizations specific to online content voting systems. However, because our goal is to uncover the core graph partitioning algorithm, we study these schemes independent of the assumptions about their application environments as well as the optimizations that are specific to those environments. Later in the paper, we show that this approach not only offers hints for the designers of future Sybil defense schemes, but also helps us understand the characteristics of real-world social networks that make them vulnerable to Sybil attacks.

Figure 1: Diagram of converting partitionings into a ranking of nodes. Different parameter settings (α, β, γ) cause increasingly large partitions to be marked as Sybils, thereby inducing a ranking.

2.2 Converting partitions to rankings

Even when viewing the schemes as graph partitioning algorithms, comparing the different Sybil defense schemes is not entirely straightforward. The output of each scheme depends on the setting of numerous parameters. At a high level, these parameters can be seen as making the partitioning between Sybils and non-Sybils either more restrictive or permissive, thereby trading false positives for false negatives. While the designers of the schemes offer rough guidelines
for choosing the parameter values (e.g., set a parameter to O(log N) where N is the number of network nodes), there can be considerable variation in the output from different parameter settings that follow the guidelines. Given the difficulty in selecting the right parameter settings, we would like to compare the schemes independent of the choice of their respective parameters.

Table 1: Overview of the properties and evaluation of social network-based Sybil defense schemes.

SybilGuard [33]
  Assumptions: Non-Sybil region is fast mixing [22]
  Algorithm:   Random walk performed by each node
  Ranking:     Varying random walk length
  Cutoff:      Whether or not walk intersection occurs
  Evaluation:  Kleinberg network [12]

SybilLimit [32]
  Assumptions: Non-Sybil region is fast mixing
  Algorithm:   Multiple random walks performed by each node
  Ranking:     Varying number of random walks and walk length
  Cutoff:      Whether or not tails of random walks intersect
  Evaluation:  Friendster, LiveJournal, DBLP, Kleinberg

SybilInfer [7]
  Assumptions: Non-Sybil region is fast mixing, modified walks are fast mixing
  Algorithm:   Bayesian inference on the results of the random walks
  Ranking:     Probability of node being non-Sybil from Bayesian inference
  Cutoff:      Threshold on the probability that a given node is non-Sybil
  Evaluation:  Power-law network [24], LiveJournal

SumUp [29]
  Assumptions: Non-Sybil region is fast mixing, no small cut between collector and non-Sybil region
  Algorithm:   Creation of voting envelope with appropriate link capacities around collector
  Ranking:     Varying the size of the voting envelope
  Cutoff:      Whether or not nodes are within the voting envelope
  Evaluation:  YouTube, Flickr, Digg

We studied the impact of changing parameters on the output of the Sybil and non-Sybil partitions. We observed that as the Sybil partition grows or shrinks in response to parameter changes, an ordering can be imposed on the nodes added or removed.² That is, when the Sybil partition grows larger, new nodes are added to the partition without removing nodes previously classified as Sybils. Similarly, when the Sybil partition grows smaller, some nodes are removed from the partition without adding any nodes previously classified as non-Sybils. Figure 1 illustrates how different partitionings obtained by changing parameters can be converted into an ordering or ranking of nodes.

² While we do not formally prove that all parameters of any Sybil defense scheme must induce an ordering, it is the case for all schemes, environments, and parameters we analyzed.

Our observation suggests that one can view the Sybil defense schemes as implicitly ordering or ranking nodes in the network, while the parameter settings determine where the boundary between the partitions, called the cutoff point, lies. Changing the parameters slides the cutoff point along the ranking, but the resulting partitions uphold the observed ranking of nodes. Thus, we can compare the different schemes independently of their parameters by simply comparing their relative rankings of the nodes.

2.3 Reduction of existing schemes

We reduce each Sybil defense scheme into its component processes using the model presented in Figure 2. At its core, each scheme contains an algorithm, which, given a trusted node and a network, produces a ranking of the nodes in the network relative to the trusted node. Then, depending on the setting of various parameter values, the scheme creates a cutoff, which is applied to the ranking and produces a Sybil/non-Sybil partitioning.

Figure 2: Diagram showing the processes involved in a Sybil defense scheme. In brief, the scheme itself can be split into an algorithm, which when given a social network and a trusted node, produces a ranking. The parameters to the scheme are used to create a cutoff, which defines a Sybil/non-Sybil partitioning from the ranking.

The schemes that we examine in this paper are SybilGuard [33], SybilLimit [32], SybilInfer [7], and SumUp [29]. For each of these Sybil defense schemes, Table 1 identifies the partitioning algorithm, how this partitioning induces a ranking of nodes, and how the algorithm parameters determine a cutoff. We also describe the assumptions the schemes make about their input environment (i.e., the structure of non-Sybil and Sybil topologies), and briefly describe the networks that these schemes were evaluated upon. A more detailed description of how these schemes map into our model is included in the Appendices.

Although we only show how our model applies to four well-known schemes, we believe that it could be applied to other schemes as well. For example, a recent work proposes a Sybil-resilient distributed hash table routing protocol [17, 18], by using social connections between users to build routing tables. The protocol relies on random walks much in the same manner as SybilGuard and SybilLimit, so we believe our analysis would apply to it as well. Similarly, Quercia et al. [27] recently proposed a Sybil defense scheme that relies on a graph-theoretic metric called betweenness centrality to calculate the likelihood of a node being a Sybil. To apply our analysis, the centrality measure can be used directly to induce a ranking of the nodes.

2.4 Rest of the paper

In this section, we have shown that existing Sybil defense schemes all work by inducing an implicit ranking of the nodes. We now take a closer look at these rankings, using them to compare the schemes across a wide range of conditions. Our goal in the remaining sections is to better understand the ranking algorithms underlying existing Sybil defense schemes, and through this understanding, to provide a basis for answering the following questions:

- Are the different Sybil defense schemes performing the core task of ranking nodes in the same way, or is each ranking unique? (Section 3)
- Are there other (potentially better) ways to obtain these node rankings? (Section 4)
- What structural properties of the social network determine how well the schemes work? (Section 5.1)
- Are the schemes robust against the different possible Sybil attack strategies? (Section 5.2)

3. RANKINGS AND SYBIL DEFENSE

In this section, we develop a better understanding of the process by which Sybil defense schemes compute node rankings by comparing the rankings of the different schemes.

3.1 Rankings in synthetic networks

We start by examining the node rankings generated by the schemes when run over a synthetic network topology, taken from [3] and shown in Figure 3. In brief, this network is constructed using the Barabasi-Albert preferential attachment model [4], and then rewired³ to have two densely connected communities of 256 nodes each, connected by a small number of edges.

Figure 3: The synthetic network used in Section 3.1 for exploring the rankings. Each of the two communities contains 256 nodes.

3.1.1 Comparing node rankings

We randomly selected a node in one of the communities as the trusted node and calculated the node rankings on this synthetic network for the four Sybil defense schemes previously discussed. We then examined how closely the various rankings matched. To compare the rankings, we use mutual information [28], which measures the similarity of two partitionings of a set. In brief, mutual information ranges between 0 and 1, where 0 represents no correlation between the partitionings, and 1 represents a perfect match.
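The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which normalization of mutual information is used, so the sketch assumes the common form 2·I(A;B) / (H(A) + H(B)), which lies between 0 and 1. The function names (`partition_labels`, `normalized_mutual_information`, `mi_at_cutoff`) are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import log


def partition_labels(ranking, cutoff):
    """Mark the first `cutoff` nodes of a ranking as non-Sybil (0), the rest as Sybil (1)."""
    return {node: (0 if i < cutoff else 1) for i, node in enumerate(ranking)}


def entropy(counts, n):
    """Shannon entropy (natural log) of a labeling given its class counts."""
    return -sum((c / n) * log(c / n) for c in counts if c > 0)


def normalized_mutual_information(labels_a, labels_b):
    """Assumed normalization: 2 * I(A;B) / (H(A) + H(B)), in [0, 1]."""
    nodes = list(labels_a)
    n = len(nodes)
    joint = Counter((labels_a[v], labels_b[v]) for v in nodes)
    marg_a = Counter(labels_a[v] for v in nodes)
    marg_b = Counter(labels_b[v] for v in nodes)
    mi = sum(
        (c / n) * log((c / n) / ((marg_a[a] / n) * (marg_b[b] / n)))
        for (a, b), c in joint.items()
    )
    h_a = entropy(marg_a.values(), n)
    h_b = entropy(marg_b.values(), n)
    if h_a + h_b == 0:
        return 1.0  # both partitionings put every node in a single block
    return 2 * mi / (h_a + h_b)


def mi_at_cutoff(ranking_a, ranking_b, cutoff):
    """Similarity of the partitionings two rankings induce at the same cutoff."""
    return normalized_mutual_information(
        partition_labels(ranking_a, cutoff),
        partition_labels(ranking_b, cutoff),
    )
```

Two rankings that order nodes differently inside each half but agree on which nodes form the first half score 1 at that cutoff and lower elsewhere, which is the kind of peak the comparison looks for.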
³ In brief, the rewiring works as follows: Nodes are first randomly assigned to two communities. Then, rewiring works by selecting two links A↔B and C↔D where A and C are in the same community and B and D are in the same community. These two links are replaced with the links A↔C and B↔D, thereby increasing the intra-community links without changing the degree distribution or link count.

The results of this experiment are shown in the top graph of Figure 4. For clarity, we only show the mutual information between partitionings of SybilGuard and each of the other three schemes (the other pairs are similar). The x-axis denotes the size of the partition containing non-Sybils. For example, the x-axis value of 10 divides the ranking into two parts, one with the first 10 nodes in the ranking (marked as non-Sybils) and the other with the rest of the nodes (marked as Sybils). Thus, Figure 4 shows the mutual information between pairs of rankings at all possible cutoff points.

Figure 4 shows that the mutual information metric is maximized at a partitioning of size 256. Interestingly, it falls off sharply before and after this cutoff value. To understand this plot better, we investigated the strong correlation between the different node rankings at the partitioning size of 256 and found that the 256 members that each scheme assigned to the non-Sybil partition strongly corresponded to the half of the network in Figure 3 that contained the trusted node. This indicates that all schemes are biased towards ranking nodes in the local community around the trusted node higher than nodes outside of the community. However, there is little correlation between the ordering of nodes within the community, or the nodes outside of it, as the mutual information is low between pairs of rankings before and after this point.

3.1.2 The common factor behind the rankings

One hypothesis that could explain our above observations is that the nodes are being ranked such that nodes well connected to the trusted node are more likely to be higher in the rankings. Since there are several nodes within the local community of the trusted node that are equally well connected, the ranking amongst these nodes is not strictly enforced, i.e., the different schemes rank these nodes differently. Similarly, several nodes outside the local community are equally poorly connected and so their relative ranking is not consistent across the different Sybil schemes. However, there is a sharp distinction between the connectivity of nodes inside and outside the local community, and so the former are ranked before the latter.

Figure 4: Mutual information between pairs of rankings and conductance of each ranking plotted for various partitions for the synthetic network, using schemes SybilGuard (SG), SybilLimit (SL), SumUp (SU), and SybilInfer (SI). A strong correlation is observed at 256 nodes, indicating a high degree of overlap between the partitionings, and a strong community structure in the non-Sybils, at this point. [Top panel: mutual information for SG-SL, SG-SU, SG-SI. Bottom panel: conductance for SG, SL, SU, SI. x-axis: partition at node rank.]

Figure 5: Mutual information between pairs of rankings and conductance of each ranking plotted for various partitions of the four schemes when run on the Facebook network. [Same panels and axes as Figure 4.]

To confirm this hypothesis, we used a well known metric called conductance [16] for determining how closely a subset of nodes within a network are connected among themselves relative to the rest of the network. Conductance is a widely used metric for evaluating the quality of communities within large networks. In brief, the conductance of a set of nodes ranges between 0 and 1, with lower numbers indicating stronger communities.

We plot the conductance of the non-Sybil subset in the bottom of Figure 4 and notice that there is a sharp inflection point in the conductance at 256 nodes for all schemes. This corresponds to the boundary between the two communities in our synthetic network topology. Adding nodes from another community sharply increases the conductance, so all schemes assign higher rankings to nodes from within the community around the trusted node than to nodes from outside the community. This helps explain why the partitions obtained from the rankings match very well when the cutoff
is set at the inflection point.

3.2 Rankings in real-world networks

In this section, we verify that the results we found for our synthetic network also hold in real-world networks. First, we wish to check that nodes are ranked in a biased manner, such that nodes from the trusted node's local community rank higher than any other nodes. Second, we wish to test if the point at which all Sybil defense schemes agree corresponds to a trough in the conductance value, indicating the boundary of the community around the trusted node.

To show this, we repeat the experiment above for two real-world networks: Facebook, consisting of the social network between Rice University graduate students taken from Facebook [21], and Astrophysics, consisting of the co-authorship network between astrophysicists [25]. Details on these datasets are provided in Table 2.

Figure 6: Mutual information between pairs of rankings and conductance of each ranking plotted for various partitions of the four schemes when run on the Astrophysics network. [Same panels and axes as Figure 4.]

As we can see in Figures 5 and 6, the mutual information reveals a local cutoff where all rankings have strong correlation, and this cutoff is also characterized by a low conductance value. Taken together, our experiments show that all Sybil defense schemes are identifying a local community that surrounds the trusted node, but that the ranking of nodes they use to reach the local community (and that they use after this point) is not strongly correlated.

3.3 Summary of observations

We now summarize the findings from our comparison of the way in which various algorithms rank nodes:

- The ranking of nodes is biased towards those which decrease conductance. Thus, nodes that are tightly connected around a trusted node (i.e., those that form subsets with lower conductance) are more likely to be ranked higher.
- When there are multiple nodes that are similarly well connected to the trusted node (i.e., they form subsets with similar conductance) they are often ordered differently in different algorithms.
- When the trusted node is located in a densely connected community of nodes, with a clear boundary between this community and the rest of the network, the nodes in the local community around the trusted node are ranked before others.

4. APPLYING COMMUNITY DETECTION

In the previous section, we observed that all Sybil defense schemes work by identifying nodes in the local community around a given trusted node and ranking them as more trustworthy than those outside. In this section, we examine whether algorithms that are explicitly designed to detect communities, called community detection algorithms [2, 3, 6, 19], can be used for Sybil defense in the same manner as existing schemes. Our goal is to investigate the potential for leveraging existing literature in community detection to defend against Sybils. To this end, we first select an off-the-shelf community detection algorithm and generate a node ranking from the algorithm. We then compare its node ranking with those of existing Sybil defense schemes, to determine if it is able to defend against Sybils with similar accuracy.

4.1 Community detection

Community detection in networks is a well studied and mature field. There are numerous approaches that use different mechanisms in order to detect communities and different metrics to evaluate the quality of communities. Below, we give a brief overview of how community detection schemes work.

In this paper, we focus on local community detection schemes [3], which do not require a global view of the network.⁴ Most of the local approaches work by starting with one (or more [2]) seed nodes and greedily adding neighboring nodes until a sufficiently strong community is found. For example, Mislove's algorithm [21] iteratively adds nodes that improve the normalized conductance (a metric closely related to conductance) at each step, and stops when the conductance metric reaches an inflection point. For a detailed survey of local community detection algorithms, we refer the reader to the recent survey paper by Fortunato [10], which discusses numerous algorithms for community detection.

As there is a large body of work on community detection, we could theoretically utilize any of these algorithms as the ranking algorithm. For the evaluation presented in this section, we selected Mislove's algorithm [21], but with the conductance metric from Section 3.1.2. We chose this algorithm as it is conceptually easy to understand, since it greedily minimizes conductance. However, our decision is not fundamental, and there may be other algorithms that perform better (especially since different community detection algorithms have been shown to perform better on different networks [15]). Rather, our goal here is simply to investigate how well off-the-shelf community detection algorithms are able to find Sybils.

In order to use community detection to find Sybils, we need to generate a node ranking in the same manner as the other schemes. To do so, we run Mislove's community detection algorithm and record the node that it iteratively adds at each step to minimize conductance. Note that we modify the algorithm to not stop once a local trough is found; instead we allow it to continue running until all of the nodes have been added. This results in a node ranking that we can use to compare against the other schemes.

4.2 Evaluating Sybil detection

We now evaluate the community detection algorithm against our existing Sybil defense schemes. When comparing against each of the Sybil defense schemes, we used experimental settings similar to those described in the paper in which the
scheme was proposed. This required us to split our evaluation results into two separate sections: one for SybilGuard, SybilLimit, and SybilInfer and another for SumUp. The split is necessary because SumUp was originally evaluated for its ability to limit the number of votes Sybil identities can place, and not for its ability to accurately detect Sybil nodes. Thus, the experimental settings for evaluating SumUp are quite different from those of the other schemes, necessitating a separate evaluation.

⁴ Our decision to focus on local community detection algorithms, as opposed to global ones, is due to the fact that they work in a similar manner as existing Sybil defense schemes by not assuming a global view. However, it has been shown that different global community detection algorithms have many of the same properties as local ones [15], indicating that our results would likely hold for global algorithms as well. We leave this to future work.

A summary of the datasets that we use in the evaluation is shown in Table 2. In addition to the datasets from the previous section, we examine YouTube, consisting of the social network of users in YouTube [20], and Advogato, consisting of the trust network between free software developers [1].

Table 2: Statistics of datasets used in our evaluation.

Network               Nodes     Links       Avg. degree
YouTube [20]          446,181   1,728,938   7.7
Astrophysicists [25]  14,845    119,652     16
Advogato [1]          5,264     43,027      16
Facebook [21]         514       3,313       13

4.2.1 Measuring Sybil detection accuracy

In order to measure the accuracy of the various schemes at identifying Sybils, we need a way to compute how often a scheme ranks Sybil nodes towards the bottom of the ranking. To do so, we use the metric Area under the Receiver Operating Characteristic (ROC) curve or A′. In brief, this metric represents the probability that a Sybil defense scheme ranks a randomly selected Sybil node lower than a randomly selected non-Sybil node [9]. Therefore, the A′ metric takes on values between 0 and 1: A value of 0.5 represents a random ranking, with higher values indicating a better ranking and 1 representing a perfect non-Sybil/Sybil ranking. Values below 0.5 indicate an inverse ranking, or one where Sybils tend to be ranked higher than non-Sybils. A very useful property of this metric is that it is defined independent of the number of Sybil and non-Sybil nodes, as well as the cutoff value, so it is comparable across different experimental setups and schemes.

4.2.2 SybilGuard, SybilLimit, and SybilInfer

For comparing SybilGuard, SybilLimit, and SybilInfer to the community detection algorithm, we use the same experimental methodology as the most recent proposal, SybilInfer. Specifically, we use a 1,000 node scale-free topology [4] for the non-Sybil part of the network. Among this set of non-Sybil nodes, a small fraction (10%) of the nodes are compromised by an adversary and become Sybil nodes. These 100 malicious nodes are chosen uniformly at random. These nodes then introduce additional Sybil identities into the network, which form a scale-free topology among themselves using the same parameters as the non-Sybil region. We vary the number of introduced nodes from 30 to 1,000, and average the results over 100 experimental runs.

We present the results of this experiment in Figure 7. We make two important observations: First, SybilInfer and community detection perform well, with improving accuracy as more Sybils are added. The reason for this increase is that the Sybil region becomes larger and, therefore, easier to distinguish from the non-Sybil region. Second, both SybilGuard and SybilLimit perform less well than the other two schemes.
0 0.2 0.4 0.6 0.8 1 0 200 400 600 800 1000 Area under ROC curve (A')Number of additional Sybil nodesRandom SG SL SI CD Figure7:AccuracyforSybildefenseschemes,aswellascommunitydetection(CD),onthesynthetictopologyaswevarythenumberofadditionalSybilidentitiesintroducedbycolludingentities.Thise ectisbecausethenumberofSybilnodesaddedislowerthantheboundenforcedbythesetwoschemes,aswasobservedintheevaluationonSybilInfer[7].Inmoredetail,theSybilregionisconnectedtothenon-Sybilregionby789attackedgesontheaverage;SybilGuardandSybilLimiten-surethatnomorethatO(logN)nodeswillbeacceptedperattackedge,whereNisthenumberofnodesinthenetwork.Sinceweonlyaddamaximumof1,000Sybilnodes,neitheroftheseschemesmarksmanynodesasSybils.Wenowevaluatetheseschemesonareal-worldsocialnet-work.Speci cally,werepeatthisexperimentontheFace-bookgraduatestudentnetworkfrombefore.Thisnetworkhassimilardensityasthesyntheticnetwork,butisonlyhalfthesize.TheresultsofthisexperimentarepresentedinFig-ure8.Aswecansee,thecommunitydetectionalgorithmperformsfavorablycomparedtotheexplicitSybildefenseschemes,andallbecomemoreaccurateasmoreSybilsareadded.Acarefulreadermaynotethattheabsoluteaccu-racyofallschemes(communitydetectionincluded)issig-ni cantlylowerthanthatobservedaboveinFigure7.TheunderlyingreasonforthislowerperformanceisastructuralcharacteristicoftheFacebooknetworkthatmakesitinher-entlyhardertodistinguishSybilsfromnon-Sybils.Weex-plorethislimitationingreaterdetailinSection5.4.2.3SumUpRecallthatSumUpprovidesaSybil-resilientvotingservice.Todoso,SumUpde nesavotingenvelopewhereinthelinksareassignedacapacitysothatallvotesfromwithintheenve-lopecanbecollected.Outsidethisenvelope,votesareonlycollectedifthevotercan ndanpathwithcapacitytothe 0 0.2 0.4 0.6 0.8 1 0 50 100 150 200 250 300 350 400 Area under ROC curve (A')Number of additional Sybil nodesRandom SG SL SI CD 
Figure8:AccuracyintheFacebooknetworkaswevarythenumberofadditionalSybilidentitiesintro-ducedbycolludingentities.votecollector(i.e.,thetrustednode).Inordertoapplycom-munitydetection,wereplacetheprocessthatdeterminesthevotingenvelopewithacommunitydetectionalgorithm,pickthecommunitywiththelowestconductancevaluetobetheenvelope,andunconditionallyacceptallvotesfromnodeswithinthisenvelope.Fornodesoutsidetheenvelope,weassignallotherlinkstohavecapacityone,andwecollecttheirvotesiftheycan ndapathwithweighttoanynodewithintheenvelope.Thisdi erenceisnecessarysincewedon'tassignweightstolinkswithintheenvelope,asSumUpdoes.WeevaluateandcomparethecommunitydetectionschemeagainstSumUponthreedi erentdatasets:Ad-vogato,Astrophysics,andYouTube.WefollowthesamemethodologyusedintheoriginalSumUpevaluation[29]:foreachnetwork,weinject100attackedgesbyinserting10Sybilnodeswithlinksto10otheruniformlyrandomlycho-sennon-Sybilnodes.Inordertocastbogusvotes,eachSybilnodeisfurtherattachedtoalargenumberofSybilidentitiesbyasinglelinkeach.Asintheoriginalevaluation,weran-domlyselectavotecollectorandrandomlychooseasubsetofnon-Sybilsasvoters.Weplottheaveragestatisticsover veexperimentalrunsforbothSumUpandthecommunitydetectionalgorithm.Toevaluatetheaccuracyoftheseschemes,wemustde neanewmetric.ThisisbecauseSumUpdoesnotclassifyallnodesasSybilornon-Sybil(neededforA0),butrather,onlythosenodeswhichissuevotes.Sincesubsetsofboththenon-SybilandSybilnodesareissuingvotes,ideally,theschemewouldonlycountthenon-Sybilvotes.Thus,ourmetricshouldpenalizetheundercountingofnon-Sybilvotes,aswellasthecountingofanySybilvotes.Themetricwede ne,voteaccuracy,isexpressedasthenumberofnon-Sybilvotescounteddividedbythesumofthenumberofnon-SybilvotesissuedandthenumberofSybilvotescounted.Voteaccuracyrangesbetween0and1,wherehighervaluesrepresentbetterperformance.Figure9presentstheresultsofthisexperiment,aswevarythenumberofnon-Sybilvoters(Sybilstrytovoteas 0 0.2 0.4 0.6 0.8 1 0.001 0.01 0.1 1 Number of non-Sybil voters / Total non-Sybil 
often as they can). The most salient result is that the accuracy for SumUp varies widely across the three networks; this is a direct result of using the envelope technique. In certain networks, one or more of the Sybil nodes is accepted into the envelope, and a large number of malicious votes are cast. The results for the community detection algorithm are significantly more stable, producing useful results once the number of non-Sybil voters rises above 1%.

Figure 9: Vote accuracy of SumUp and community detection on three networks. (Plot: vote accuracy versus number of non-Sybil voters / total non-Sybil nodes, with panels for Astrophysics, Advogato, and YouTube; series SU and CD.)

4.3 Implications

We began this section by observing that, since all Sybil defense schemes appeared to be identifying local communities, explicit community detection algorithms may be able to defend against Sybils as well. It is interesting to note that, even without changing the experimental setup under which existing schemes were evaluated, our simple community detection algorithm gives comparable results to existing schemes.

Our results have both positive and negative implications for future designers of Sybil defense schemes. On the positive side, our results demonstrate that there is an opportunity to leverage the large body of existing work on community detection algorithms for Sybil defense [10]. Prior work on community detection provides a readily available source of sophisticated graph analysis algorithms around which researchers could improve existing schemes and design new approaches. On the negative side, relying on community detection for performing Sybil defense fundamentally limits the ability of these schemes to find Sybils in many real-world graphs. We explore these limitations in the next section.

5. LIMITATIONS OF SYBIL DEFENSE

In the previous sections, we showed that Sybil defense schemes work by effectively identifying nodes within tightly-knit communities around a given trusted node as more trustworthy than those farther away. In this section, we investigate the limitations of relying on the community structure of the social network to find Sybils. More specifically, we explore how the structure of the social network impacts the performance of Sybil defense schemes and how attackers with knowledge of the structure of the social network can leverage it to launch more efficient Sybil attacks.

Since social network-based Sybil defense schemes use the structure of social networks to distinguish the Sybil nodes from the non-Sybil nodes, we begin by asking the following question: Are there networks where it is hard to tell these two types of nodes apart? In other words, could there be networks where the non-Sybil nodes look like Sybils or where it would be easy for Sybil nodes to masquerade as non-Sybils?

Intuitively, one would expect networks where the non-Sybil region is composed of multiple, small, tightly-knit communities that are interconnected sparsely to be more vulnerable to Sybil attacks. In such networks, nodes within one community might mistake non-Sybil nodes in another community for Sybils, due to limited connectivity between the communities. Furthermore, an attacker can easily disguise Sybil nodes as just another community in the network by establishing a small number of carefully targeted links to the community containing the trusted node. Next, we verify this intuition using experiments over synthetic and real-world social networks where the non-Sybil nodes have different community structures and the Sybil nodes use different attack strategies.
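
The intuition above can be made concrete with conductance, the quantity used in Section 4.2.3 to pick the voting envelope. The following is only an illustrative sketch: it assumes the standard definition of conductance (edges leaving a node set divided by the smaller of the two volumes), and the toy graph and the function names `conductance` and `clique` are hypothetical, not drawn from the experiments in this paper. It suggests why a Sybil cluster attached by a single attack edge can look exactly like a legitimate community joined to the trusted node's community by a single link.

```python
from itertools import combinations


def conductance(edges, node_set):
    """Cut edges divided by min(volume(S), volume(rest)) in an undirected graph."""
    node_set = set(node_set)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in node_set, v in node_set
        # Each edge adds one unit of degree to each of its endpoints.
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
        if u_in != v_in:
            cut += 1
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0


def clique(nodes):
    """All pairwise edges among `nodes` (a tightly-knit community)."""
    return list(combinations(nodes, 2))


# Hypothetical toy graph: three 4-node cliques.
trusted = ["t0", "t1", "t2", "t3"]   # community containing the trusted node
honest = ["h0", "h1", "h2", "h3"]    # a second non-Sybil community
sybil = ["s0", "s1", "s2", "s3"]     # Sybil region

edges = clique(trusted) + clique(honest) + clique(sybil)
edges.append(("t0", "h0"))  # sparse link between the two honest communities
edges.append(("t1", "s0"))  # a single targeted attack edge

print("honest community:", conductance(edges, honest))
print("Sybil region:    ", conductance(edges, sybil))
```

In this toy graph both sets have one cut edge and the same volume, so their conductance values coincide; a scheme that relies only on this kind of structural signal has no basis for telling them apart.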
Figure 10: Illustrations of the synthetic networks used in Section 5.1 (the actual networks are much larger). Non-Sybils are dark green and Sybils light orange. While the non-Sybil regions of (a), (b), and (c) show increasing amounts of community structure, all non-Sybil regions have the same number of nodes and links, and degree distribution.

5.1 Impact of social network structure

We first examine the sensitivity of Sybil defense schemes to the structure of the non-Sybil region. As in Sections 3 and 4, we analyze synthetic networks and then show that the results from these simple cases apply to real-world networks as well.

We first generate a Barabasi-Albert random synthetic network [4] with 512 nodes and initial degree m = 8. This results in a random power-law network with approximately 3,900 links, and without any community structure. We then iteratively generate a series of networks by rewiring [3] five links in the same manner as in Section 3 (resulting in a network), then rewiring five more links (resulting in another network), and so on, until only five links remain between the two communities of 256 nodes each (resulting in a final network). The output is a series of networks that all have the same number of nodes, number of links, and degree distribution, but are increasing in the level of community structure that they exhibit. Figure 10 gives an illustration of the initial, intermediate, and final networks.

We use this series of networks to evaluate how well Sybil defense schemes perform on networks with increasing amounts of community structure. To do so, we treat each of these networks as the non-Sybil region, and we randomly attach a Sybil region of 256 nodes using 40 links. We then evaluate how well the existing schemes are able to detect Sybils by using the A' metric. The results of this experiment for the final 16 networks are shown in Figure 11. It can clearly be seen that the Sybil defense schemes perform much better in the networks with less community structure than in those with more community structure. In fact, when there is a high level of community structure, the Sybil defense schemes perform close to what would be expected with a random ranking (indicated by an A' value of 0.5). Thus, the effectiveness of these schemes is very sensitive to the level of community structure present in the non-Sybil region of the network.

Figure 11: Accuracy of Sybil defense schemes on synthetic networks with increasing community structure induced by rewiring. With high levels of community structure, the accuracy of all schemes eventually falls to close to random. (Plot: area under ROC curve (A') versus network number (increasing rewiring), for Random, SG, SL, SU, and SI.)

Next, we examine whether this observation holds in real-world networks. To do so, we collected a set of real-world networks that have varying levels of community structure, shown in Table 3. In order to measure the level of community structure present in the networks, we use the well-known metric modularity [26]. In brief, modularity ranges between -1 and 1, with 0 representing no more community structure than a random graph. Strongly positive values indicate significant community structure and strongly negative values indicate less community structure than a random graph. As can be observed in the table, these eight networks have modularity values ranging from 0.28 to 0.79, indicating moderate to strong levels of community structure.

Network                     Nodes     Links      Modularity
Facebook undergrad [21]     1,208     43,043     0.278
Advogato [1]                5,264     43,027     0.318
Wikipedia votes [13]        7,066     100,736    0.350
URV email [11]              1,133     5,451      0.504
Astrophysicists [25]        14,845    119,652    0.621
Facebook grad [21]          514       3,313      0.644
High-energy physics [14]    8,638     24,806     0.690
Relativity [14]             4,158     13,422     0.790

Table 3: Size and modularity of the real-world datasets used in our evaluation. We assume all the graphs to be undirected and use the largest connected component.

We conducted a similar experiment to the one above, treating these networks as the non-Sybil region, attaching a Sybil region, and evaluating the accuracy of Sybil defense. However, since these networks are of very different scales, we created a power-law Sybil region for each network with one-quarter as many Sybils as there are non-Sybils, and attached these Sybil regions to the non-Sybils randomly with
a number of links equal to 5% of the links between non-Sybil nodes. The results of this experiment are shown in Figure 12.

Figure 12: Accuracy of Sybil defense schemes on real-world networks from Table 3 with various levels of community structure. Significantly worse performance is observed as the level of community structure increases. (Plot: area under ROC curve (A') versus modularity, for Random, SG, SL, SI, and SU.)

We observe a clear trend: As the level of community structure increases, evidenced by increasing modularity, the performance of the Sybil defense schemes falls close to random. In fact, a correlation coefficient of -0.81 is observed between the modularity value and the A' metric, demonstrating that increasing levels of community structure are strongly anti-correlated with the ability to distinguish Sybils. This poor accuracy also corresponds well with recent work [23] that has suggested that many real-world networks may not be as fast-mixing as was previously thought. Thus, as observed above for synthetic networks, Sybil defense schemes are extremely sensitive to the level of community structure present in real-world networks as well.

5.2 Resilience to targeted Sybil attacks

We now examine the sensitivity of Sybil defense schemes to Sybil attacks that leverage knowledge of the structure of the social network to establish links to a targeted subset of nodes in the network. Recall that all schemes assume that the Sybil nodes are allowed to create only a bounded number of links to non-Sybils. When evaluating the schemes, the authors of these schemes assume that the attacker establishes these links to random nodes in the network. We now explore how this one aspect of the attack model (random link placement to non-Sybils) can affect the performance of Sybil defense schemes by allowing the Sybils a level of control over where those links are placed. As before, we first examine the behavior using synthetic networks and then examine real-world networks.

To create the synthetic network, we use the methodology from Section 5.1, with rewiring done until only 40 links remain between the two communities of 256 nodes each. We then create a series of scenarios where we increasingly allow the Sybils more control over where their links to non-Sybils are placed. Specifically, instead of requiring the Sybil links to be placed randomly over the entire non-Sybil region, we allow the Sybils to place these links randomly among the k nodes closest to the trusted node, where closeness is defined by the ranking given by the community detection algorithm used in Section 4. In all cases, the number of Sybil-to-non-Sybil links remains the same. Thus, as k is reduced, the Sybils are allowed to target their links closer to the trusted node. We then calculate the accuracy of the Sybil defense schemes. An illustration of these networks is shown in Figure 13.

Figure 13: Illustrations of the synthetic networks used in Section 5.2 (the actual networks are much larger). Non-Sybils are dark green and Sybils light orange. With decreasing k, the Sybil nodes place their links closer to the trusted node.

Figure 14 presents the results of this experiment. We see a decrease in accuracy as the Sybils are allowed to place their links closer to the trusted node. This is a result of the Sybil nodes being placed higher in the Sybil defense scheme's ranking, and therefore being less likely to be detected. From this simple experiment, it is clear that the performance of Sybil defense schemes is highly dependent on the attack model, depending (for example) not just on the number of links the attacker can form, but on how well those links can be targeted.

Figure 14: Accuracy of Sybil defense schemes on synthetic networks when Sybils are allowed to target their links among the closest k nodes to the trusted node. As the Sybils place their links closer (lower k), the accuracy of all schemes falls. (Plot: area under ROC curve (A') versus k, for Random, SG, SL, SU, and SI.)

We then repeat the same experiment using the Facebook graduate student network. The results of this experiment are shown in Figure 15, and are even more striking than those of the previous experiment. As the attackers are allowed more control over link placement (i.e., as k is reduced), the accuracy first falls to no better than random, before dropping significantly below 0.5. This indicates that the Sybil defense schemes are ranking Sybils significantly higher than non-Sybils, meaning the schemes are admitting Sybils and blocking non-Sybils. The reason for this is the strong community structure present in the Facebook network combined with the stronger attack model: as the Sybils target their links more carefully, they appear as part of the trusted node's local community and are therefore more highly ranked.

Figure 15: Accuracy of Sybil defense schemes on the Facebook network when Sybils are allowed to target their links among the closest k nodes to the trusted node. As the Sybils place their links closer, all schemes begin ranking Sybil nodes higher than non-Sybils (as evidenced by the A' below 0.5). (Plot: area under ROC curve (A') versus k, for Random, SG, SL, SU, and SI.)

5.3 Implications

In this section, we explored how the performance of Sybil defense schemes is affected by the structure of the social network and by the ability of the attacker to exploit the structure of the social network to launch targeted attacks. Based on our understanding of how Sybil defense schemes work, we hypothesized that networks with well-defined community structure would be more vulnerable to Sybil attacks. We verified our hypothesis by demonstrating that, as the non-Sybil region contains more significant community structure, the detection accuracy of all schemes falls significantly and the schemes are vulnerable to targeted Sybil attacks.

Our analysis reveals fundamental limitations of existing Sybil defense schemes that arise out of their reliance on community structure in the network. Our list of limitations is by no means exhaustive; other vulnerabilities of relying on community detection exist. For example, a recent study has shown that identifying communities reliably in a wide range of real-world networks is a notoriously difficult task [15]. We hope that, by pointing out these limitations, we motivate the need for Sybil defense schemes to be evaluated on a wider range of social networks and attack models. Our findings also point to a need to develop Sybil defense schemes that work by leveraging different network features (or additional information beyond the network structure) than existing schemes, allowing Sybil defense to be effective where now it is not.

6. CONCLUDING DISCUSSION

In this paper, we have taken the first steps towards developing a deeper understanding of how the numerous proposed social network-based Sybil defense schemes work. We found that, despite their considerable differences, all Sybil defense schemes rely on identifying communities in the social network. Unfortunately, we also discovered that this reliance on community detection makes the schemes fundamentally vulnerable to Sybil attacks when operating over networks where the non-Sybil nodes form strong communities.

In light of these negative results, we look for alternative approaches to Sybil defense that could be deployed in practice. In this section, we first focus our discussion on additional challenges that arise when deploying social network-based Sybil defense schemes in practice. We then discuss two ways to improve Sybil defenses moving forward. We present our discussion points as questions and answers.

Are links in social networks hard to form? All the Sybil defense schemes discussed in this paper make the assumption that Sybils can only form a certain number of links to non-Sybils. However, it remains an open question whether this is true in any online social network of today; it is clear that, at least in some social networks, the assumption does not hold [5].

Are Sybils necessarily bad? In all of the Sybil defense schemes, it is assumed that the presence of Sybils is evidence of misbehavior, and as such, no non-Sybil should interact with a Sybil. However, there are legitimate reasons why a user might wish to create multiple identities. For example, users may wish to partition their identity into one that is used to interact with co-workers, and another that is used to interact with friends and family (e.g., the multiple email addresses that many people use today). Users posting videos to YouTube may wish to post under pseudonyms in order to avoid revealing their real-world identity while still using a personal account to rate videos and post comments.
Since the mere presence of users with multiple accounts is not necessarily indicative of misbehavior, what we should be concerned with is not necessarily the presence of Sybils, but, rather, the use of Sybils for misbehavior. Detecting Sybils and simply excluding them from a system is only one particularly draconian way of accomplishing this.

Should Sybil defenses move towards Sybil tolerance? Instead of explicitly identifying Sybils like SybilGuard, SybilLimit, and SybilInfer, a system could aim to instead just prevent Sybils from gaining access to extra privileges. SumUp, for example, attempts to limit the votes coming from Sybil nodes by limiting the effect of votes from potential Sybil regions. Instead of explicitly identifying nodes, the protocol seeks to limit their ability to disproportionally affect the resulting vote count. As a result, the system does not try to prevent users from creating multiple identities, but rather, ensures that by doing so, they are unable to gain any additional privileges. We believe that building Sybil tolerance into applications may require more effort, and is clearly less general than identifying Sybils, but allows application designers to sidestep the arms race of locating Sybils in the social network.

Should Sybil defenses leverage more information? Given the inherent limitations of relying solely on the social network in order to defend against Sybils, an attractive way to improve on these schemes is to give Sybil defense schemes additional information. As a simple example, suppose a Sybil defense scheme were given a list of nodes, one in each of the different communities within the network, who were either known to be Sybils, or known to be non-Sybils. In this case, it is clear that this additional information could be used by community detection algorithms to accurately differentiate between communities containing Sybil and non-Sybil nodes. In contrast, current Sybil defense schemes are given only a single trusted node as input and consequently, they perform poorly.

As another example, recent work has suggested that activity between users may be a better predictor of the strength of the social link between them [30, 31]. These studies indicate that even in networks where users accept friend requests from arbitrary sources, users engage in shared activity (e.g., exchanging messages) with only a limited subset of their friends. Thus, having additional information about user activity could help weed out weak social connections, including links from Sybil nodes.

Finally, for all of the work that has focused on social network-based Sybil defenses, it is unclear how far we are from having these ideas applied to actual deployed systems. However, as digital identities become more important, it is clear that the potential for fraud, deception, and other misbehavior will increase, thereby necessitating Sybil defenses. Understanding the benefits, limitations, and tradeoffs associated with alternative approaches to Sybil defense is an important step towards making this happen.

Acknowledgements

We thank the anonymous reviewers and our shepherd, Cristina Nita-Rotaru, for their helpful comments. This research was supported in part by an Amazon Web Services in Education Grant.

7. REFERENCES

[1] Advogato trust network. http://www.trustlet.org/wiki/Advogato.
[2] R. Andersen and K. J. Lang. Communities from seed sets. In Proc. WWW'06, Edinburgh, Scotland, May 2006.
[3] J. P. Bagrow. Evaluating local community methods in networks. J. Stat. Mech., 2008(5), 2008.
[4] A.-L. Barabasi and R. Albert. Emergence of Scaling in Random Networks. Science, 286:509-512, 1999.
[5] L. Bilge, T. Strufe, D. Balzarotti, and E. Kirda. All your contacts are belong to us: Automated identity theft attacks on social networks. In Proc. WWW'09, Madrid, Spain, Apr 2009.
[6] A. Clauset. Finding local community structure in networks. Physical Review E, 72(2), 2005.
[7] G. Danezis and P. Mittal. SybilInfer: Detecting Sybil Nodes using Social Networks. In Proc. NDSS'09, San Diego, CA, Feb 2009.
[8] J. Douceur. The Sybil Attack. In Proc. IPTPS'02, Cambridge, MA, Mar 2002.
[9] J. Fogarty, R. S. Baker, and S. E. Hudson. Case studies
in the use of ROC curve analysis for sensor-based estimates in human computer interaction. In Proc. GI'05, Victoria, BC, May 2005.
[10] S. Fortunato. Community detection in graphs. Physics Reports, 486:75, 2010.
[11] R. Guimera, L. Danon, A. Diaz-Guilera, F. Giralt, and A. Arenas. Self-similar community structure in a network of human interactions. Physical Review E, 68(6), 2003.
[12] J. Kleinberg. The Small-World Phenomenon: An Algorithmic Perspective. In Proc. STOC'00, Portland, OR, May 2000.
[13] J. Leskovec, D. Huttenlocher, and J. Kleinberg. Signed Networks in Social Media. In Proc. CHI'10, Atlanta, GA, Apr 2010.
[14] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graph Evolution: Densification and Shrinking Diameters. ACM TKDD, 1(1), 2007.
[15] J. Leskovec, K. Lang, and M. Mahoney. Empirical comparison of algorithms for network community detection. In Proc. WWW'10, Raleigh, NC, Apr 2010.
[16] J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Statistical Properties of Community Structure in Large social and information networks. In Proc. WWW'08, Beijing, China, Apr 2008.
[17] C. Lesniewski-Laas. A Sybil-proof one-hop DHT. In Proc. SNS'08, Glasgow, Scotland, Apr 2008.
[18] C. Lesniewski-Laas and M. F. Kaashoek. Whanau: A sybil-proof distributed hash table. In Proc. NSDI'10, San Jose, CA, Apr 2010.
[19] F. Luo, J. Z. Wang, and E. Promislow. Exploring local community structures in large networks. Web Intelligent and Agent Systems, 6(4):387-400, 2008.
[20] A. Mislove, A. Post, K. P. Gummadi, and P. Druschel. Ostra: Leveraging Trust to Thwart Unwanted Communication. In Proc. NSDI'08, San Francisco, CA, Apr 2008.
[21] A. Mislove, B. Viswanath, K. P. Gummadi, and P. Druschel. You are who you know: Inferring user profiles in online social networks. In Proc. WSDM'10, New York, NY, Feb 2010.
[22] M. Mitzenmacher and E. Upfal. Probability and Computing. Cambridge University Press, Cambridge, UK, 2005.
[23] A. Mohaisen, A. Yun, and Y. Kim. Measuring the mixing time of social graphs. Technical report, University of Minnesota, 2010.
[24] S. Nagaraja. Anonymity in the wild: Mixes on unstructured networks. In Proc. PET'07, Ottawa, ON, Jun 2007.
[25] M. E. J. Newman. The structure of scientific collaboration networks. PNAS, 98(2):404-409, 2001.
[26] M. E. J. Newman. Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 2004.
[27] D. Quercia and S. Hailes. Sybil attacks against mobile users: Friends and foes to the rescue. In Proc. INFOCOM'10, San Diego, CA, Mar 2010.
[28] A. Strehl and J. Ghosh. Cluster Ensembles - A Knowledge Reuse Framework for Combining Partitionings. In Proc. AAAI'02, Palo Alto, CA, Mar 2002.
[29] N. Tran, B. Min, J. Li, and L. Subramanian. Sybil-Resilient Online Content Voting. In Proc. NSDI'09, Boston, MA, Apr 2009.
[30] B. Viswanath, A. Mislove, M. Cha, and K. P. Gummadi. On the Evolution of User Interaction in Facebook. In Proc. WOSN'09, Barcelona, Spain, Aug 2009.
[31] C. Wilson, B. Boe, A. Sala, K. P. N. Puttaswamy, and B. Y. Zhao. User Interactions in Social Networks and their Implications. In Proc. Eurosys'09, Nuremberg, Germany, Apr 2009.
[32] H. Yu, P. B. Gibbons, M. Kaminsky, and F. Xiao. SybilLimit: A Near-Optimal Social Network Defense against Sybil Attacks. In Proc. IEEE S&P, Oakland, CA, May 2008.
[33] H. Yu, M. Kaminsky, P. B. Gibbons, and A. Flaxman. SybilGuard: Defending Against Sybil Attacks via Social Networks. In Proc. SIGCOMM'06, Pisa, Italy, Sep 2006.

APPENDIX

A. ANALYSIS OF SYBILGUARD

Assumed social network topology: SybilGuard [33] assumes that the non-Sybil region is fast mixing [22], meaning that after $O(\log n)$ hops (where n is the number of non-Sybils), the probability distribution of the last node on a random walk reaches the stationary distribution. SybilGuard assumes that the entire network (the Sybil region combined with the non-Sybil region) is not fast mixing.

Partitioning algorithm: SybilGuard uses constrained random walks for marking nodes as non-Sybil or Sybil. It marks a suspect node as non-Sybil if the random walks from the trusted node and the suspect intersect; otherwise the suspect is marked as a Sybil.

Node ranking by partitioning algorithm: In order to generate a ranking, we conduct random walks from the trusted node. We start with a walk length of 1 and increase it to k, where k is the length of the random walk such that all nodes in the network are marked as non-Sybil. The order in which nodes are marked as non-Sybil in these increasingly long random walks imposes a ranking. In the rare case when not all the nodes in the network are marked as non-Sybil using a single random seed and a long walk length, we conduct a series of random walks with different random seeds to induce a ranking for the remaining nodes.

Determining cutoff: SybilGuard uses $O(\sqrt{n}\log n)$ random walks to gather samples from the non-Sybil region of n nodes. For a social network with $O(\log n)$ mixing time, based on the birthday paradox, two non-Sybil nodes with $\sqrt{n}$ samples from the non-Sybil region will have an intersection with high probability. SybilGuard relies on an estimation procedure for determining the appropriate length of the random walk, and consequently, the cutoff value.

B. ANALYSIS OF SYBILLIMIT

Assumed social network topology: SybilLimit [32] makes the same assumptions about the network as SybilGuard.

Partitioning algorithm: SybilLimit performs $O(\sqrt{m})$ independent random walks of length $O(\log n)$ from each node. Two conditions must be satisfied for the trusted node to mark a suspect as a non-Sybil. The first condition, called the intersection condition, requires that the last edge of one of the random walks of the trusted node and the suspect must intersect. The second condition, called the balance condition, limits the number of non-Sybils per attack edge. Each tail of a random walk is assigned a "load" that is not allowed to exceed a given threshold; the load is incremented each time the trusted node marks another suspect as a non-Sybil.

Node ranking by partitioning algorithm: SybilLimit has two primary parameters for controlling the number of nodes marked as non-Sybil in the network: the number of random walks from each node and the length of these walks. As these parameters are increased, greater numbers of nodes are marked as non-Sybil. Similar to SybilGuard, we infer a ranking based on the order in which nodes are marked as non-Sybil.

Determining cutoff: Similar to SybilGuard, SybilLimit relies on an estimation procedure to find the length of the random walks and the number of random walks required. These two parameters impose a cutoff.

C. ANALYSIS OF SYBILINFER

Assumed social network topology: SybilInfer [7] makes the same assumption as SybilGuard. SybilInfer also makes a further assumption that the modified random walks are fast mixing in real social networks.

Partitioning algorithm: SybilInfer performs multiple random walks from each node to sample nodes from the non-Sybil region. It further uses a Bayesian inference technique to determine the probability of any node in the system being marked as non-Sybil.

Node ranking by partitioning algorithm: Since SybilInfer assigns each node a probability of being a non-Sybil, the nodes can be ranked based on this probability. We conduct 30 runs of SybilInfer with different random seeds, and use the average probability over all the runs to determine the final ranking of the nodes.

Determining cutoff: SybilInfer partitions the nodes based on a threshold value for the probability of a node being non-Sybil.

D. ANALYSIS OF SUMUP

Assumed social network topology: SumUp assumes that the min-cut between the vote collector (i.e., the trusted node) and non-Sybil nodes occurs at the collector, and that the min-cut between Sybils and the non-Sybils occurs at the attack edges.

Partitioning algorithm: SumUp partitions nodes based on whether their vote is accepted or not. Nodes whose votes are accepted are treated as non-Sybils, whereas nodes whose votes are subject to capacity constraints are treated as Sybils.

Node ranking by partitioning algorithm: SumUp decides whether a vote will be collected or not by defining a voting envelope within which all votes are collected and outside of which votes are constrained to one per link out of the envelope. The size of the voting envelope is controlled by the parameter $C_{max}$, which is the maximum number of votes that can be collected by the trusted node. In order to rank nodes, we increase $C_{max}$ from 1 to k, where k is the value for which the voting envelope contains the entire network. The order in which these nodes are added to the voting envelope induces a ranking.

Determining cutoff: $C_{max}$ determines the size of the voting envelope and serves as the cutoff parameter.
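
Each of the rankings described in Appendices A through D can be scored with the A' metric. As a hedged sketch, the code below assumes the usual pairwise reading of the area under the ROC curve: the fraction of (non-Sybil, Sybil) pairs in which the non-Sybil is ranked higher. The function name `a_prime` and the example rankings are hypothetical and serve only as illustration.

```python
def a_prime(ranking, sybils):
    """Fraction of (non-Sybil, Sybil) pairs where the non-Sybil is ranked higher.

    `ranking` lists nodes from most trusted to least trusted; `sybils` is the
    set of nodes that are Sybils according to the experiment's ground truth.
    """
    sybils = set(sybils)
    n_sybil = sum(1 for node in ranking if node in sybils)
    n_honest = len(ranking) - n_sybil
    if n_sybil == 0 or n_honest == 0:
        raise ValueError("need at least one Sybil and one non-Sybil")

    correct_pairs = 0
    sybils_below = n_sybil  # Sybils not yet seen, i.e. ranked lower
    for node in ranking:
        if node in sybils:
            sybils_below -= 1
        else:
            # This non-Sybil outranks every Sybil still below it.
            correct_pairs += sybils_below
    return correct_pairs / (n_honest * n_sybil)


# Hypothetical rankings over two non-Sybils (a, b) and two Sybils (s1, s2).
truth = {"s1", "s2"}
print(a_prime(["a", "b", "s1", "s2"], truth))  # all non-Sybils ranked first
print(a_prime(["a", "s1", "b", "s2"], truth))  # one Sybil outranks one non-Sybil
print(a_prime(["s1", "s2", "a", "b"], truth))  # all Sybils ranked first
```

Under this reading, a value of 0.5 corresponds to a random ranking, and a value below 0.5 means that Sybils tend to be ranked above non-Sybils, as in Figure 15.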