Support Vector Machines Under Adversarial Label Noise

JMLR: Workshop and Conference Proceedings (Asian Conference on Machine Learning)
Battista Biggio (biggio@diee.unica.it)
Dept. of Electrical and Electronic Engineering, University of Cagliari, Piazza d'Armi, 09123 Cagliari, Italy

Blaine Nelson (blaine.nelson@wsii.uni-tuebingen.de)
Dept. of Mathematics and Natural Sciences, Eberhard-Karls-Universität Tübingen, Sand

Pavel Laskov


[...] effect of outliers in training data. Similarly to the previous case, this method is also based on a different definition of the loss function, which yields a non-convex optimization problem, approximately solved through a convex relaxation.

From a theoretical standpoint, robustness was studied in the context of both classical statistics and machine learning. The robust statistics approach (Huber, 1981; Hampel et al., 1986; Maronna et al., 2006) has studied general properties of statistical estimators under changes of the underlying distributions. A well-known instrument of such analysis is the so-called influence function. The robustness of margin-based learning methods has been studied by Christmann and Steinwart (2004). In particular, they studied the behavior of SVM-like algorithms under small perturbations of the training data and proved that, under some conditions, the influence function of SVMs can be bounded.

3. Label Noise Robust SVMs

In this section we introduce our approach, Label Noise robust SVMs (LN-robust SVMs), to improve SVMs' robustness to label noise in training data. We point out that, with respect to previous works, this approach does not affect the computational complexity of the standard SVM learning algorithm, as it only yields a simple kernel matrix correction.

Label noise can be explicitly modelled by assuming that the labels in the training set $\{x_i, y_i\}_{i=1}^n \in \mathcal{X} \times \{-1, +1\}$ can be flipped. To this end, we first introduce a set of random variables $\varepsilon_i \in \{0, 1\}$, $i = 1, \ldots, n$, which represent whether the corresponding label $y_i$ is flipped (1) or not (0). Accordingly, we then replace $y_i$ with $y'_i = y_i(1 - 2\varepsilon_i)$, such that if $\varepsilon_i = 1$, $y'_i = -y_i$ (label flip), while $y'_i = y_i$ otherwise.

In the dual SVM problem (Problem 3) the class labels solely affect the matrix $Q = K \circ yy^\top$. In particular, taking label noise into account, we can write its elements as

$$Q_{ij} = y_i y_j K(x_i, x_j)(1 - 2\varepsilon_i)(1 - 2\varepsilon_j). \quad (4)$$

Note that, in the absence of noise, $\varepsilon_i = 0$, $i = 1, \ldots, n$, and thus the elements of $Q$ are simply $Q_{ij} = y_i y_j K(x_i, x_j)$, as in the standard SVM formulation. If we assume that every label is independently flipped with the same probability $\mu$, then $\varepsilon_i$, $i = 1, \ldots, n$, are $n$ i.i.d. Boolean random variables, whose mean is simply the probability $\mu$ of $\varepsilon_i = 1$, and whose variance is $\sigma^2 = \mu(1 - \mu)$. Under this assumption, we can compute the expected value of $Q$ from Eq. 4, which is given by

$$\mathbb{E}_\varepsilon[Q_{ij}] = \begin{cases} y_i y_j K(x_i, x_j)(1 - 4\sigma^2), & \text{if } i \neq j, \\ y_i y_j K(x_i, x_j), & \text{otherwise.} \end{cases} \quad (5)$$

Now, we can use the expected value of $Q$ (which is still a positive semi-definite kernel matrix) to solve the SVM problem. This should reasonably improve the robustness of the learned SVM to label flip noise. The proposed method only yields a kernel matrix correction (Eq. 5), and does not modify the standard SVM problem. However, it is a heuristic method, and it is thus not guaranteed to fulfill any optimality criterion (e.g., being optimal under the considered noise model).

The solution is symmetric with respect to $\mu = 0.5$; i.e., the $\alpha$ values obtained for a given $\mu$ and for $1 - \mu$ are the same, and they coincide exactly with the standard SVM solution when $\mu$ is either 0 or 1 (as the corresponding kernel correction is zero). Moreover, the expressions of $w$ and $b$ obtained by solving the standard SVM problem have to be multiplied by $1 - 2\mu$ (as we take their expectations over label noise). Thus, when $\mu > 0.5$, $w$ and $b$ are multiplied by a negative factor. This reflects the fact that more than half of the training points are assumed to have a wrong label, and thus the decision regions are inverted. For instance, when $\mu = 1$, the solution is exactly given by $-w$ and $-b$ (with $w$ and $b$ being the standard SVM solution): we are in fact assuming that all samples in the training set are wrongly labelled, and, consequently, the hyperplane obtained by the standard SVM is rotated by 180°. Note lastly that, if $\mu = 0.5$, then $w = 0$. This is a degenerate case in which the labels in the training data are assumed to be completely random, so the SVM is not able to determine, on average, any meaningful decision hyperplane.
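Since the factor $y_i y_j$ is common to both cases in Eq. 5, the correction can be folded directly into the kernel matrix itself. The following is a minimal sketch of this idea in Python, assuming numpy and scikit-learn's SVC with a precomputed kernel; the helper name ln_robust_gram is ours, not from the paper.

```python
import numpy as np
from sklearn.svm import SVC

def ln_robust_gram(K, mu):
    """Kernel correction of Eq. 5 (hypothetical helper name).

    Off-diagonal entries of Q = K o yy^T are scaled by 1 - 4*sigma^2, with
    sigma^2 = mu*(1 - mu); since y_i*y_j factors out of the scaling, we can
    equivalently scale the off-diagonal entries of K itself.
    """
    S = 4.0 * mu * (1.0 - mu)             # S = 4 sigma^2
    K_corr = (1.0 - S) * K                # scale every entry by 1 - S ...
    np.fill_diagonal(K_corr, np.diag(K))  # ... then restore the diagonal
    return K_corr

# Usage sketch: LN-robust SVM with a linear kernel on toy data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] > 0, 1, -1)
K = X @ X.T                               # linear kernel Gram matrix
clf = SVC(C=1.0, kernel="precomputed").fit(ln_robust_gram(K, mu=0.1), y)
# Prediction uses the uncorrected kernel between test and training points,
# e.g. clf.decision_function(X_test @ X.T) for some test matrix X_test.
```

Note that, for $\mu \in [0, 0.5]$, the corrected matrix is a convex combination of $K$ and its non-negative diagonal, consistent with the observation above that the expected $Q$ remains positive semi-definite.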
3.1. Dual problem and α equalization

We are now in a position to better analyze the change induced by the kernel correction of Eq. 5 in the SVM dual problem. Indeed, the dual problem can be rewritten as

$$\min_\alpha \; \tfrac{1}{2}\, \alpha^\top (Q \circ M)\, \alpha - \mathbf{1}^\top \alpha \quad \text{s.t.} \;\; 0 \le \alpha_i \le C, \; i = 1, \ldots, n; \;\; \sum_{i=1}^n \alpha_i y_i = 0, \quad (6)$$

where the elements $M_{ij}$ of $M$ are given by

$$M_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 1 - S, & \text{otherwise,} \end{cases} \quad (7)$$

where we use $S = 4\sigma^2$ to simplify notation. The matrix $M$ can be further decomposed as $M = (1 - S)\mathbf{1}_{n \times n} + S I_{n \times n}$, where $\mathbf{1}_{n \times n}$ is an $n \times n$ matrix whose elements are all ones, and $I_{n \times n}$ is the $n \times n$ identity matrix. Substituting this decomposition of $M$ into Problem 6, adding and subtracting $S \sum_{i=1}^n \alpha_i$ to the Lagrangian, and dividing it by $1 - S$, yields the following (equivalent) dual problem:

$$\min_\alpha \; \tfrac{1}{2}\, \alpha^\top Q\, \alpha - \mathbf{1}^\top \alpha + \frac{S}{1 - S}\left( \tfrac{1}{2}\, \alpha^\top (Q \circ I_{n \times n})\, \alpha - \mathbf{1}^\top \alpha \right) \quad \text{s.t.} \;\; 0 \le \alpha_i \le C, \; i = 1, \ldots, n; \;\; \sum_{i=1}^n \alpha_i y_i = 0, \quad (8)$$

where the only difference with the standard SVM formulation is an additional term weighted by $\frac{S}{1-S}$. This reveals some interesting insights about the effect of the proposed kernel correction. First, note that as $\mu$ increases from 0 to 0.5, $\frac{S}{1-S}$ approaches infinity; namely, the $\alpha$ values are then determined only by minimizing the latter term in Problem 8. Second, this term does not depend on the class labels, as it only involves $\alpha$ and the diagonal of $Q$ (which is indeed equal to the diagonal of $K$). The above observations highlight that, as

[...]

and thus we resort to a heuristic approach which has proven quite effective in our experiments (see Sect. 6). The idea behind the adversarial label flip attack is first to flip the labels of samples with non-uniform probabilities, depending on how well they are classified by the SVM learned on the untainted training set; and, second, to repeat this process a number of times, eventually retaining the label flips which maximally decreased performance. In particular, we increase the probability of flipping the labels of samples which are classified with very high confidence (i.e., non-support vectors), and decrease the probability of flipping the labels of support vectors and error vectors (inversely proportionally to their $\alpha$ value). The reason is that the former (mainly, the non-support vectors) are more likely to become support vectors or error vectors when the SVM is learned on the tainted training set, and, consequently, the decision hyperplane will be closer to them. This will produce a considerable change in the SVM solution and, potentially, in its classification accuracy. Furthermore, the labels of samples in different classes can be flipped in a correlated way, to force the hyperplane to rotate as much as possible. To this end, one can draw a random hyperplane $(w_{\mathrm{rnd}}, b_{\mathrm{rnd}})$ in feature space, and further increase the probability of flipping the label of a positive sample $x^+$ (respectively, a negative one $x^-$) if $w_{\mathrm{rnd}}^\top x^+ + b_{\mathrm{rnd}} \ge 0$ (respectively, $w_{\mathrm{rnd}}^\top x^- + b_{\mathrm{rnd}} \le 0$). We implemented the above attack as Algorithm 1, using two weighting parameters $\beta_1$ and $\beta_2$, both set to 0.1 (based on some preliminary experimental observations, we found that these values achieved good results). A simple example of application of this attack strategy is reported in the next section.

Algorithm 1: Adversarial label flip attack.
Input: the untainted training data $D = \{x_i, y_i\}_{i=1}^n$, the regularization parameter $C$ (and the kernel's parameters, if any), the number of label flips $L$, the number of repetitions $R$, and the weighting parameters $\beta_1$ and $\beta_2$.
Output: the tainted labels $y'_1, \ldots, y'_n$.
1: $(\alpha, b) \leftarrow$ train an SVM on $D$
2: for $i = 1, \ldots, n$, do $s_i \leftarrow y_i \big[ \sum_{j=1}^n y_j \alpha_j K(x_i, x_j) + b \big]$, end for
3: normalize the scores $(s_1, \ldots, s_n)$ in $[0, 1]$, dividing by $\max(s_1, \ldots, s_n)$
4: $(\alpha_{\mathrm{rnd}}, b_{\mathrm{rnd}}) \leftarrow$ generate a random SVM (draw $n + 1$ numbers from a uniform distribution)
5: for $i = 1, \ldots, n$, do $q_i \leftarrow y_i \big[ \sum_{j=1}^n y_j \alpha_{\mathrm{rnd},j} K(x_i, x_j) + b_{\mathrm{rnd}} \big]$, end for
6: normalize the scores $(q_1, \ldots, q_n)$ in $[0, 1]$, dividing by $\max(q_1, \ldots, q_n)$
7: for $i = 1, \ldots, n$, do $v_i \leftarrow \alpha_i / C - \beta_1 s_i - \beta_2 q_i$, end for
8: $(k_1, \ldots, k_n) \leftarrow$ sort $(v_1, \ldots, v_n)$ in ascending order, and return the corresponding indexes
9: $(y'_1, \ldots, y'_n) \leftarrow (y_1, \ldots, y_n)$
10: for $i = 1, \ldots, L$, do $y'_{k_i} \leftarrow -y_{k_i}$, end for
11: train an SVM on $\{x_i, y'_i\}_{i=1}^n$
12: estimate its training error on $D$
13: repeat $R$ times from step 4, and retain the set of labels $y'_1, \ldots, y'_n$ which yielded the maximum training error
14: return $y'_1, \ldots, y'_n$
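For concreteness, the following is a minimal Python sketch of Algorithm 1, again assuming numpy and scikit-learn's SVC with a precomputed kernel; the function name is ours, and the $[0, 1)$ range of the uniform draws in step 4 is our assumption, as the listing does not specify it.

```python
import numpy as np
from sklearn.svm import SVC

def adversarial_label_flips(K, y, C, L, R, beta1=0.1, beta2=0.1, seed=0):
    """Sketch of Algorithm 1 on a precomputed kernel matrix K (n x n)."""
    rng = np.random.default_rng(seed)
    n = len(y)

    # Steps 1-3: train on untainted data; normalized margins s_i = y_i*g(x_i).
    clf = SVC(C=C, kernel="precomputed").fit(K, y)
    alpha = np.zeros(n)
    alpha[clf.support_] = np.abs(clf.dual_coef_.ravel())  # alpha_i of each SV
    s = y * clf.decision_function(K)
    s = s / s.max()

    best_err, best_labels = -np.inf, y.copy()
    for _ in range(R):
        # Steps 4-6: random SVM; uniform draws on [0, 1) are our assumption.
        alpha_rnd, b_rnd = rng.uniform(size=n), rng.uniform()
        q = y * (K @ (alpha_rnd * y) + b_rnd)
        q = q / q.max()

        # Steps 7-10: flip the L samples with the smallest v_i.
        v = alpha / C - beta1 * s - beta2 * q
        flip_idx = np.argsort(v)[:L]
        y_flipped = y.copy()
        y_flipped[flip_idx] = -y_flipped[flip_idx]

        # Steps 11-13: retain the flips maximizing the training error on D.
        err = 1.0 - SVC(C=C, kernel="precomputed").fit(K, y_flipped).score(K, y)
        if err > best_err:
            best_err, best_labels = err, y_flipped
    return best_labels
```

As in step 12, the training error is measured against the original labels in $D$, so the attack retains the flips that most degrade the fit to the untainted data.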
5. Toy example

We present here a simple toy example to demonstrate the adversarial label flip attack, and how the kernel correction proposed in Sect. 3 can effectively counteract both random and adversarial label flips. We generate a two-dimensional dataset of 100 samples, where the samples of class $z \in \{-1, +1\}$ are drawn from a Normal distribution with mean $[z, 0]^\top$ and (diagonal) covariance matrix equal to $\frac{1}{2}I$. An SVM with linear kernel is learned on this (untainted) training set, as depicted in Fig. 1 (first plot from the left in the top row). Then, we flip the labels of 10 samples using the adversarial label flip attack described in the previous section. Note from Fig. 1 (second to fourth plot from the left in the top row; label flips are highlighted with green circles) that: (1) the adversarial label flips mainly affect samples which are farther away from the untainted SVM decision boundary, and (2) the correlation imposed between the label flips of samples of different classes induces a substantial change in the tainted SVM decision boundary (second plot from the left in the top row). Besides this, note how the SVMs learned using $\mu = 0.1$ and $\mu = 0.5$ (third and fourth plot in the top row) are able to compensate for the adversarial label flips, although not completely. (From now on, with $\mu = 0.5$ we will implicitly assume $\mu = 0.5 - \delta$, with $\delta > 0$ but small enough, e.g., 0.001, so that $w$ does not degenerate to 0.)

To better understand this behavior and confirm the correctness of the observations in Sect. 3.1, we also plot the $\alpha$ values of the standard SVMs and of the proposed LN-robust SVMs against the scores (i.e., the distances from the hyperplane) assigned by each SVM to each training sample (see Fig. 1, bottom row). The mean and variance of the $\alpha$ values for each SVM are also reported. As expected, the variance of the $\alpha$ values of the LN-robust SVMs decreases with respect to the standard SVMs, and decreases further as $\mu$ approaches 0.5. This confirms that the solution of the LN-robust SVM is expected to be less sparse than that of the standard SVM, and thus less sensitive to outliers in the training data.

Figure 1: (top row) Standard SVM trained on untainted and tainted data (first and second plot, respectively), and LN-robust SVM with $\mu = 0.1$ and $\mu = 0.5$ trained on the tainted data (third and fourth plot, respectively); (bottom row) $\alpha$ values of each training sample versus its distance to the hyperplane $g(x)$, corresponding to the SVM and data shown in the plots above. The mean and variance of the $\alpha$ values are also reported. The data is tainted by performing 10 adversarial label flips, highlighted with green circles. The support vectors of each SVM are circled in black.

Before concluding this section, we show that the proposed LN-robust SVM can be effective against random label flips as well. To this end, we conduct a simple artificial experiment similar to the previous case. We consider SVMs with linear kernel, and each class to be normally distributed with mean $[z, 0, \ldots, 0]^\top$ and (diagonal) covariance matrix equal to $\frac{1}{2}I$. However, this time we consider 300 features and 400 training samples, since we need a higher features-to-samples ratio for the random label flip attack to be effective. Note that the optimal (Bayes) classifier in this case is simply given by $w = [1, 0, \ldots, 0]^\top$ and $b = 0$. We vary the percentage of random label flips in the training set up to 40%, and plot the corresponding test accuracy (evaluated on a separate untainted test set of 1,000 samples). The results are averaged over 5 repetitions and reported in Fig. 2, for four different values of the regularization parameter $C$: 0.1, 1, 10, 100. Note how, in this case (and for all values of $C$), the LN-robust SVM is able to significantly outperform the standard SVM (in particular when $\mu = 0.5$), and that, surprisingly, this does not cause a decrease of the classification accuracy attained on the untainted dataset (i.e., when the percentage of flipped labels is zero). Notably, this highlights that there need not necessarily be a trade-off between accuracy on untainted data and robustness to attacks.
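A minimal sketch of this artificial experiment is given below, reusing the hypothetical ln_robust_gram helper from the sketch in Sect. 3 (redefined here for self-containment); the sampling helper is also ours. Setting mu = 0.5 - 0.001 follows the convention noted above ($\mu = 0.5 - \delta$ with $\delta = 0.001$).

```python
import numpy as np
from sklearn.svm import SVC

def ln_robust_gram(K, mu):
    """Kernel correction of Eq. 5, as in the sketch of Sect. 3."""
    S = 4.0 * mu * (1.0 - mu)
    K_corr = (1.0 - S) * K
    np.fill_diagonal(K_corr, np.diag(K))
    return K_corr

def sample_classes(n, d, rng):
    """n samples with labels z in {-1, +1}, mean [z, 0, ..., 0], covariance I/2."""
    y = rng.choice([-1, 1], size=n)
    X = rng.normal(scale=np.sqrt(0.5), size=(n, d))
    X[:, 0] += y
    return X, y

rng = np.random.default_rng(0)
X_tr, y_tr = sample_classes(400, 300, rng)   # 400 training samples, 300 features
X_ts, y_ts = sample_classes(1000, 300, rng)  # separate untainted test set
K_tr, K_ts = X_tr @ X_tr.T, X_ts @ X_tr.T    # linear kernels

for flip_rate in (0.0, 0.1, 0.2, 0.3, 0.4):
    y_noisy = y_tr.copy()
    idx = rng.choice(len(y_tr), size=int(flip_rate * len(y_tr)), replace=False)
    y_noisy[idx] = -y_noisy[idx]             # random label flips
    std = SVC(C=10, kernel="precomputed").fit(K_tr, y_noisy)
    rob = SVC(C=10, kernel="precomputed").fit(
        ln_robust_gram(K_tr, mu=0.5 - 0.001), y_noisy)
    print(f"{flip_rate:.0%} flips: standard={std.score(K_ts, y_ts):.3f}, "
          f"LN-robust={rob.score(K_ts, y_ts):.3f}")
```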
Figure 2: Classification accuracy on the (untainted) artificial normal data for the SVM and the LN-robust SVMs with linear kernel. The LN-robust SVMs were trained with $\mu = 0.05$, $\mu = 0.1$ and $\mu = 0.5$. Results are shown for different percentages of random label flips in the training data, and different values of the regularization parameter $C$.

6. Experiments

We report experimental results to empirically validate the soundness of the proposed approach. We consider a number of real datasets, and compare the LN-robust SVM to the standard SVM learning algorithm, with either linear or radial basis function (RBF) kernels, under random and adversarial label flips. We report the classification accuracy attained by each classifier on untainted test data, as a function of the percentage of label flips in the training data; the more gracefully the performance decreases, the more robust the classifier is.

Datasets. We downloaded 7 two-class datasets from the LibSVM and UCI repositories (Chang and Lin, 2001; Asuncion and Newman, 2007), with feature values already scaled in $[-1, +1]$ (available at http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html). Their characteristics are summarized in Table 1. Within these experiments, every dataset was randomly split 5 times into different training (TR) and testing (TS) set pairs, with 60% and 40% of the samples, respectively. The results were then averaged over these 5 trials.

Table 1: Main characteristics of the datasets

Name          | #samples | #features
------------- | -------- | ---------
Breast-cancer | 683      | 10
Australian    | 690      | 14
Diabetes      | 768      | 8
Fourclass     | 862      | 2
Heart         | 270      | 13
Ionosphere    | 351      | 34
Sonar         | 208      | 60

Setup. We considered four different values of the regularization parameter $C$, namely $C = 0.1, 1, 10, 100$, for both the linear and RBF kernels. Moreover, when the RBF kernel was used, for any fixed $C$ value the parameter $\gamma$ was selected among the values $\{0.01, 0.1, 1, 10, 100\}$ by performing a 5-fold cross-validation on the training data. In each plot, we report the performance of the standard SVM and of three LN-robust SVMs, respectively trained with $\mu = 0.05$, 0.1, 0.5. When the adversarial label flip attack is considered, the labels to be flipped are determined using the standard SVM solution. However, we also noted that there was no relevant difference in the results when the same attack was computed using the solutions of the LN-robust SVMs.

Results. Results for SVMs with the linear kernel against adversarial label flips and random label flips are reported in Fig. 3 and Fig. 4, respectively. Due to lack of space, we omit results for the RBF kernel, which exhibit similar behavior and lead to similar conclusions. First, note that, as expected, adversarial label flips generally decrease the performance with fewer flips than random flipping. As the standard SVM is naturally somewhat robust to random label noise (see Fig. 4), the resilience of the LN-robust SVM is most pronounced for adversarial label flips, although it also generally outperforms the standard SVM under random flips. Second, for low values of $C$ (i.e., 0.1 and 1), the LN-robust SVM does not generally improve the performance over the standard SVM; indeed, sometimes it is even less robust to label flips (see, e.g., the "diabetes" dataset under adversarial label flips, Fig. 3). On the other hand, the LN-robust SVM can significantly improve the robustness when $C$ is relatively high (see, e.g., "australian", "breast-cancer", "fourclass", "heart" and "ionosphere"). The reason is that when the regularization parameter $C$ is high, the SVM tends to find a hard-margin solution, which is clearly more sensitive to label noise, and, in this case,

[...]

Figure 4: Random label flip attack against the SVM and the LN-robust SVMs (with $\mu = 0.05, 0.1, 0.5$) with linear kernel, for different values of $C$ and percentages of noise.
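As a point of reference, the split-and-selection protocol from the Setup paragraph above might be sketched as follows. This is a hypothetical reconstruction using scikit-learn; the paper does not state whether the cross-validation for $\gamma$ runs before or after the training labels are tainted, so the sketch simply selects $\gamma$ on whatever training labels it is given.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

def rbf_protocol(X, y, C, n_trials=5):
    """5 random 60/40 train/test splits; for each fixed C, the RBF gamma is
    selected from {0.01, 0.1, 1, 10, 100} by 5-fold CV on the training data."""
    accs = []
    for t in range(n_trials):
        X_tr, X_ts, y_tr, y_ts = train_test_split(X, y, train_size=0.6,
                                                   random_state=t)
        # Random or adversarial label flips would be injected into y_tr here.
        grid = GridSearchCV(SVC(C=C, kernel="rbf"),
                            {"gamma": [0.01, 0.1, 1, 10, 100]}, cv=5)
        grid.fit(X_tr, y_tr)        # refits the best gamma on all of X_tr
        accs.append(grid.score(X_ts, y_ts))
    return float(np.mean(accs))
```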
8. Conclusions and future works

Throughout this paper, we have investigated the robustness of SVMs under adversarial label noise and proposed a method to improve it based on a simple kernel matrix correction. We showed the effectiveness of the proposed approach on several artificial and real datasets. We empirically observed that our method leads to an equalization of the $\alpha$ values in SVMs, which intuitively hedges the influence of individual points and leads to a more robust estimator. Our experimental results support the common observation that robustness exhibits a trade-off with classification accuracy.

A current limitation of our method is the need to agree a priori on a potential degree of label contamination. While some ad hoc heuristics are conceivable for setting the corresponding parameter in practice, the investigation of theoretically sound methods for selecting an optimal "robustness level" would be an interesting issue for future work, as would applying our method to real adversarial problems like spam filtering and intrusion detection, and comparing it with other SVM implementations which are meant to be robust against label noise (e.g., Stempfel and Ralaivola, 2009).

Acknowledgments

The authors wish to acknowledge the Alexander von Humboldt Foundation and the Heisenberg Fellowship of the Deutsche Forschungsgemeinschaft (DFG) for providing financial support to carry out this research. This work was also partly supported by a grant awarded to B. Biggio by Regione Autonoma della Sardegna, PO Sardegna FSE 2007-2013, L.R. 7/2007 "Promotion of the scientific research and technological innovation in Sardinia". The opinions expressed in this paper are solely those of the authors and do not necessarily reflect the opinions of any sponsor.

References

A. Asuncion and D. J. Newman. UCI machine learning repository, 2007. http://www.ics.uci.edu/~mlearn/MLRepository.html.

M. Barreno, B. Nelson, A. Joseph, and J. Tygar. The security of machine learning. Machine Learning, 81:121-148, 2010.

J. Bi and T. Zhang. Support vector classification with input data uncertainty. In Advances in Neural Information Processing Systems 17, 2004.

C. M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer, 1st ed., 2007.

C. J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2:121-167, 1998.

C.-C. Chang and C.-J. Lin. LibSVM: a library for support vector machines, 2001. http://www.csie.ntu.edu.tw/~cjlin/libsvm/.

[...]