Detecting Software Theft via Whole Program Path Birthmarks

Ginger Myles and Christian Collberg


Department of Computer Science, University of Arizona, Tucson, AZ, 85721, USA

K. Zhang and Y. Zheng (Eds.): ISC 2004, LNCS 3225, pp. 404-415, 2004. Springer-Verlag Berlin Heidelberg 2004

Abstract. A software birthmark is a unique characteristic of a program that can be used as a software theft detection technique. In this paper we present and empirically evaluate a novel birthmarking technique, Whole Program Path Birthmarking, which uniquely identifies a program based on a complete control flow trace of its execution. To evaluate the strength of the proposed technique we examine two important properties: credibility and tolerance against program transformations such as optimization and obfuscation. Our evaluation demonstrates that, for the detection of theft of an entire program, Whole Program Path birthmarks are more resilient to attack than previously proposed techniques. In addition, we illustrate several instances where a birthmark can be used to identify program theft even when an embedded watermark was destroyed by program transformation.

Keywords: software piracy, copyright protection, software birthmark.

1 Introduction

Suppose Alice creates a program p which she sells to Bob. Subsequently, Alice discovers Bob is selling a program q which is remarkably similar to p. Alice suspects Bob copied p and is reselling it under the new name q. In order to take legal action, Alice needs to be able to prove that q is indeed a copy of p. In this paper we will describe a technique known as software birthmarking which can be used to provide such proof.

A software birthmark is a unique characteristic, or set of characteristics, that a program possesses and which can be used to identify the program. The general idea is that if two programs both have the same birthmark then it is highly likely that one is a copy of the other. There are two important properties of a birthmarking technique that must be considered: the detector should not produce false positives (i.e. it should not say that p and q originate from the same source if, in fact, they do not), and it should be resilient to semantics-preserving transformations (such as optimization and obfuscation) that an attacker may launch in order to defeat the detector.

In this paper we propose and evaluate a new software birthmarking technique we call Whole Program Path Birthmarks (WPPB). WPPB is a dynamic technique, relying on the execution pattern of the program to detect the birthmark. This is in contrast to previously proposed techniques which are static, i.e. they compute the birthmark based on the characteristics of the program source or binary code. We will show that the WPPB technique is more resilient to attacks by semantics-preserving transformations than published static techniques.

This paper makes the following contributions:

1. We introduce a new category of software birthmarks which we call dynamic birthmarks.
2. We propose and evaluate a new dynamic birthmarking technique based on Whole Program Paths.
3. We evaluate the four static birthmarking techniques proposed by Tamada, et al. [23,24] and show that they are easily defeated by current code obfuscation tools.
4. We provide an empirical evaluation between our WPPB technique and Tamada's birthmarks, and demonstrate that WPPBs are less vulnerable to attacks by semantics-preserving transformations.
5. Finally, we show that birthmarks can be used to identify program theft even when an embedded watermark has been destroyed by a program transformation.

2 Related Work

There are three major threats recognized against the intellectual property contained in software. Software piracy is the illegal reselling of legally obtained copies of a program. Software tampering is the illegal modification of a program to circumvent license checks, to obtain access to digital media protected by the software, etc. Malicious reverse engineering is the extracting of a piece of a program in order to reuse it in one's own.

A variety of techniques have been proposed to address these attacks. Each technique targets a different attack, and the techniques can often be combined to produce a stronger defense. Code obfuscation [12] is a technique developed to aid in the prevention of reverse engineering. An obfuscation is a semantics-preserving transformation which makes the program more difficult to understand and reverse engineer.

Probably the most well-known technique for detecting software piracy is software watermarking [7,9,11,14,17,20,22,25]. The basic idea is to embed a unique identifier in the program. Piracy is confirmed by proving the program contains the watermark. A lesser known technique for the detection of
theft is software birthmarks. Software birthmarks differ from software watermarks in two important ways. First, it is often necessary to add code to the application in order to embed a watermark. In the case of a birthmark additional code is never needed. Instead a birthmark relies on an inherent characteristic of the application to show that one program is a copy of another. Secondly, a birthmark cannot prove authorship or be used to identify the source of an illegal redistribution. Rather, a birthmark can only confirm that one program is a copy of another. A strong birthmark will be able to provide such confirmation even when code transformations have been applied to the code by the adversary in order to hide the theft.

One of the first occurrences of the use of the term birthmark was by Grover [15], where the term was used to mean characteristics occurring in the program by chance which could be used to aid in program identification. This term was distinguished from a fingerprint in that the characteristics used to embed the fingerprint are intentionally placed in the code. The general idea of a software birthmark is similar to that of a computer virus signature. An early example of the use of birthmarks was in an IBM court case [6]. In this case IBM used the order in which the registers were pushed and popped to prove that their PC-AT ROM had been illegally copied.

Tamada, et al. [23,24] have proposed four birthmarks that are specific to Java class files: constant values in field variables (CVFV), sequence of method calls (SMC), inheritance structure (IS), and used classes (UC). The CVFV birthmark extracts information about the variables declared in the class. For each variable the type is extracted along with the initial value. The birthmark is then the sequence of (type, initial value) pairs. SMC examines the sequence of method calls as they appear in the class, but not necessarily in execution order. Because it is easy to change the names of the methods within the application, only those method calls which are in a set of well-known classes are considered in the sequence. IS extracts the inheritance structure of the class. The birthmark is constructed by traversing the superclasses of the class back through the hierarchy. All classes which are in the set of well-known classes are included in the sequence. The UC birthmark examines all classes which are used by a given class, i.e. they appear as a superclass of the given class, the return or argument types of a method, the types of fields, etc. All classes in the set of well-known classes are included in the sequence, which is then arranged in alphabetical order. As we will see in Sect. 5, Tamada's birthmarks are easily defeated by applying simple code obfuscating transformations to the program.

Plagiarism detection is another area which is very similar to software birthmarking. A variety of plagiarism detection techniques have been proposed (e.g. Moss [5,21], Plague [26], and YAP [27]) which have been quite successful at detecting plagiarism within student programs. Unfortunately, these systems compute similarity at the source code level. In many instances source code is unavailable. In addition, these systems do not consider semantics-preserving transformations and the effects of decompilation on the formatting of the source code. For example, it was shown by Collberg, et al. [10] that given the source code of a Java application, simply compiling then decompiling will cause Moss to indicate 0% similarity between the original and the decompiled source code.

3 Software Birthmarks

Before we can precisely define the idea of a birthmark we must define what it means for a program q to be a copy of another program p. The most obvious definition is where q is an exact duplicate of p. However, in order to hide the
fact that copying has taken place an attacker might apply semantics-preserving transformations to p. For example, all of the identifiers in p might have been renamed, or an optimizing register allocator might have been applied to p so that p and q now have different register assignments. In this case we would still like to be able to say that q is a copy of p. In addition, it is important that our definition reflects that if q is a copy of p, then p and q should exhibit the same external behavior. (Note that the reverse of this property does not necessarily hold. It is possible to find two programs which exhibit the same external behavior but are not copies. An example is iterative and recursive versions of the same function.)

The following definition of a software birthmark is a restatement of the definition given by Tamada, et al. [23,24].

Definition 1 (Birthmark). Let p, q be programs. Let f be a method for extracting a set of characteristics from a program. Then f(p) is a birthmark of p if:
1. f(p) is obtained only from p itself (without any extra information), and
2. q is a copy of p implies f(p) = f(q).

As with software watermarking we can characterize a birthmark as either static or dynamic. A static birthmark extracts the set of characteristics from the statically available information in a program, such as information about the types or initial values of the fields. A dynamic birthmark relies on information gathered from the execution of the application. A dynamic algorithm typically works at the program level whereas a static algorithm targets an entire program or individual modules within the program. The same distinction is true with static and dynamic watermarking algorithms. A dynamic algorithm can provide evidence if an entire program is stolen and a static algorithm may be able to detect the theft of a single module. The four birthmark techniques proposed by Tamada, et al. are characterized as static and target class-level theft. Definition 1 above defines a static birthmark.

Definition 2 (Dynamic Birthmark). Let p, q be programs and I an input to these programs. Let f be a method for extracting a set of characteristics from a program. Then f(p, I) is a dynamic birthmark of p if:
1. f(p, I) is obtained only from p itself by executing p with the given input I, and
2. q is a copy of p implies f(p, I) = f(q, I).

The Whole Program Path Birthmark proposed in this paper computes the birthmark from the execution trace of the program. It is, therefore, a dynamic birthmark designed to detect program-level theft.

3.1 Evaluating Software Birthmarks

We would like a birthmark technique to satisfy the following two properties.

Property 1 (Credibility). Let p and q be independently written programs which accomplish the same task. Then we say f is a credible measure if f(p) ≠ f(q).

Property 2 (Resistance to Transformation). Let p' be a program obtained from p by applying semantics-preserving transformations T. Then we say f is resilient to T if f(p) = f(p').

Property 1 is concerned with the possibility of the birthmark falsely indicating that q is a copy of p. This could occur with independently implemented programs which perform the same task. It is highly unlikely that two independently implemented algorithms will contain all of the same details, so the birthmark should be designed to extract those details which are likely to differ.

Property 2 addresses the issue of identifying a copy in the presence of a transformation. With the proliferation of tools for code optimization and obfuscation, for example [1,2,3,4], it is highly probable that an attacker will apply at least one transformation prior to distributing an illegally copied program. It is desirable that a birthmark be able to detect a copy even if a transformation has been applied to that program.

4 Whole Program Path Based Birthmarks

In this section we present the first known dynamic birthmark technique. Experiments we have performed on the four techniques proposed by Tamada, et al. lead us to believe they are susceptible to a variety of simple program transformations. We therefore look for other characteristics of a program which could be used to construct a stronger birthmark technique.

4.1 Whole Program Paths

Whole Program Paths (WPP) is a technique presented in [16] to represent a program's dynamic control flow. The WPP is constructed by collecting a trace of the path executed by the program. The trace is then transformed into a more compact form by identifying its regularity, i.e. repeated code.

To collect the trace, the edges of the program's control flow graph are instrumented by uniquely labeling each edge. As the program executes the edges are recorded, producing a trace. The trace is then run through the SEQUITUR algorithm, which compresses it and reveals its inherent regularity [18,19]. The output of the SEQUITUR algorithm is a context-free grammar from which a directed acyclic graph (DAG) is produced. Each rule of the grammar is composed of a non-terminal and a sequence of symbols which the non-terminal represents. To construct the DAG representation of the grammar a node is added for each unique symbol. For each rule an edge is added from the non-terminal to each of the symbols it represents. The final DAG is the WPP.

The construction of the WPP is illustrated in Fig. 1. First, a control flow graph with 6 basic blocks and 8 edges is constructed from the input program. The control flow graph is instrumented so that each edge is labeled. Next, the program is executed, producing an edge trace. The trace is run through the SEQUITUR algorithm to produce the given context-free grammar. This grammar contains 3 unique non-terminals and 8 unique terminals. Finally, a DAG with 3 internal nodes, 8 leaf nodes, and 14 directed edges is constructed which represents the grammar.

[Figure 1 not reproduced.]

Fig. 1. An illustration of the stages involved in constructing a Whole Program Path (WPP). The construction begins with a program. A program control flow graph is constructed and instrumented. By executing the program on a given input an edge trace is constructed. This trace is run through the SEQUITUR algorithm to produce a context-free grammar. The grammar is then used to construct a directed acyclic graph which represents the WPP. All terminal nodes and corresponding edges are removed from the WPP to construct the WPP birthmark.

Our WPP birthmark is constructed in an identical manner as the WPP with the exception of the DAG in the final stage. An essential property of a birthmark is that it captures an inherent characteristic about the program which is difficult to modify through semantics-preserving transformations. The WPP birthmark captures the inherent regularity in the dynamic behavior of a program. Since we are only interested in the regularity we eliminate all terminal nodes in the DAG. It is the internal nodes which will be more difficult to modify through program transformations. Thus, the DAG in Fig. 1 is transformed into the birthmark of the example program by removing its terminal nodes and their edges.
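The pipeline of Sect. 4.1 (edge trace, grammar, DAG, birthmark) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation evaluated in this paper: instead of SEQUITUR it uses a much simpler Re-Pair style loop that repeatedly replaces the most frequent adjacent pair of symbols with a fresh non-terminal, and the edge trace, the start symbol "S", and the function names are all hypothetical.

```python
from collections import Counter


def compress(trace):
    """Grammar-compress an edge trace (Re-Pair style stand-in for SEQUITUR).

    Returns (seq, rules): seq is the compressed start sequence and rules maps
    each new non-terminal ("R1", "R2", ...) to the pair of symbols it replaces.
    Terminals are assumed to be integers so they cannot clash with rule names.
    """
    seq, rules = list(trace), {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # no adjacent pair repeats: nothing left to compress
            break
        name = f"R{len(rules) + 1}"
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):  # left-to-right, non-overlapping replacement
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Expand a symbol back into the terminals it stands for."""
    if symbol not in rules:
        return [symbol]
    return [t for s in rules[symbol] for t in expand(s, rules)]


def wpp_dag(seq, rules):
    """DAG edges: one edge from each non-terminal to each symbol it represents."""
    edges = {("S", sym) for sym in seq}
    for name, body in rules.items():
        edges |= {(name, sym) for sym in body}
    return edges


def wpp_birthmark(edges, rules):
    """Remove terminal nodes and their edges, keeping only internal nodes."""
    internal = set(rules) | {"S"}
    return {(u, v) for (u, v) in edges if v in internal}


# Hypothetical edge trace of a loop whose body takes edges 2 and 3 three times.
trace = [1, 2, 3, 2, 3, 2, 3, 4]
seq, rules = compress(trace)
edges = wpp_dag(seq, rules)
birthmark = wpp_birthmark(edges, rules)
print(sorted(birthmark))
```

For this trace the grammar has two rules, R1 for the pair (2, 3) and R2 for (R1, R1), and the birthmark keeps the three edges S to R1, S to R2, and R2 to R1. The terminal labels 1 to 4 vanish from the birthmark, which is why relabeling or renaming alone leaves it unchanged.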
4.2 Similarity of WPP Birthmarks

The WPP birthmark is in the form of a DAG. Suppose we have the birthmarks for programs p and q. The birthmarks f(p) and f(q) are the same if they are isomorphic. Since it is unlikely that q is an identical copy of p, we would like to be able to say something about the similarity between f(p) and f(q). In other words, we would like to be able to conclude that q is a copy of p even in the presence of semantics-preserving transformations.

To compute similarity we use a slightly modified version of the graph distance metric in [8]. The similarity is based on finding a maximal common subgraph, mcs, between f(p) and f(q). The percentage of f(p) that we are able to identify in f(q) by finding the maximal common subgraph indicates the similarity between the two programs. The reason we compare the size of the mcs to the size of f(p), instead of to the maximum of the two sizes, is that we are trying to identify a copy of p. We therefore want to know how much of f(p) is contained in f(q).

Definition 3 (Graph Distance). The distance of two non-empty graphs G_1 and G_2 is defined as

\[ d(G_1, G_2) = 1 - \frac{|mcs(G_1, G_2)|}{\max(|G_1|, |G_2|)} \]

where mcs(G_1, G_2) is the maximum common subgraph of G_1 and G_2.

Definition 4 (Similarity). Let f(p) and f(q) be birthmarks extracted from programs p and q. The similarity between f(p) and f(q) is defined by

\[ similarity(f(p), f(q)) = \frac{|mcs(f(p), f(q))|}{|f(p)|} \times 100 \]

5 Evaluation

To evaluate the effectiveness of the WPP birthmarking technique we examined its ability to satisfy the two properties from Sect. 3. We look at whether WPP birthmarks will produce false positives given two independently written applications which accomplish the same task, and at the tolerance of the birthmark against program transformations. As an additional evaluation we demonstrate how birthmarks can be used in conjunction with watermarking.

5.1 Credibility

To evaluate the credibility of WPP birthmarks we examined the ability to distinguish between two independently written applications which performed the same task. We looked at two problems: calculating a factorial and generating Fibonacci numbers. Each of these problems can be solved recursively and iteratively. The WPP birthmark found the factorial programs to be 50% similar and the Fibonacci programs 7% similar. From these results we are able to conclude
that the recursive and iterative forms of the programs were probably written independently.

Tamada, et al. [23,24] state that their birthmark techniques are unable to distinguish between independently written applications which are small. This is true given the Factorial and Fibonacci programs. Using the four birthmarks proposed by Tamada, et al. the recursive and iterative versions are found to be 100% similar. The only exception was SMC on factorial, which had a similarity of 16%. Thus, with respect to small applications and credibility, the WPP birthmarks provide stronger results.

5.2 Resistance to Transformation

To evaluate the WPP birthmark's resistance to transformation we applied various obfuscations and optimizations to automatically transform our test program into an equivalent, but not identical, program. To perform the transformations we used Zelix Klassmaster (ZKM) [4], Smokescreen [3], Codeshield [1], and SandMark [2]. ZKM, Smokescreen, and Codeshield all include name obfuscation, the elimination of debugging information, and some type of control flow obfuscation. Additionally, Smokescreen supports dead code elimination and ZKM includes string encryption.

Our test program was a Java program that works like the UNIX wc program. For each of the tools except SandMark we applied the tool with the strongest level of obfuscation. The SandMark tool permitted us to pick and choose which obfuscations were applied to the program. SandMark includes 31 obfuscation algorithms, which we applied individually to wc.jar, obtaining 31 obfuscated programs. In addition, we applied multiple obfuscations in succession to wc.jar.

We computed the WPP birthmark for each of the transformed applications, the 31 from SandMark plus the three additional, as well as the original. In every case the similarity between the original and the obfuscated applications was found to be 100%.

We performed the same evaluation of the four techniques proposed by Tamada, et al. Table 1 shows a comparison of the results with our WPP birthmark using ZKM, Smokescreen, and Codeshield. The table shows that only WPP and IS compute 100% for each of the three obfuscated programs. Even though IS computes 100% similarity we believe the technique is not strong enough to be used on its own. The reason for this is that the technique could produce many false positives for independently implemented programs, whether or not they perform the same task.

We also tested the four static birthmarks against each of the 31 obfuscations included in the SandMark tool. For CVFV, SMC, and UC we were able to find obfuscations which cast doubt on the similarity between the original and obfuscated version. Using the CVFV birthmark a less than 100% similarity was detected for the obfuscations Bogus Fields (75%), Node Splitter (0%), Objectify (66%), Opaque Branch Insertion (75%), and Transparent Branch Insertion (75%). When all five of these obfuscations were applied in conjunction to wc.jar, CVFV detected a 0% similarity. The SMC birthmark detected a less than 100% similarity on four obfuscations: Buggy Code (69%), Primitive Promoter (5%), Static Method Bodies (82%), and Transparent Branch Insertion (83%). A similarity of 1% was detected when all four obfuscations were applied. Four obfuscations also caused the UC birthmark to detect a less than 100% similarity: Objectify (92%), Opaque Branch Insertion (92%), Primitive Promoter (56%), and Transparent Branch Insertion (92%). The combination of the obfuscations yielded a 52% similarity. These initial results indicate that the WPP birthmark is stronger than the four techniques proposed in [23,24] when theft of an entire application is in question.

Table 1. Similarity percentage found using each birthmark technique on an original and obfuscated version of wc.jar

        ZKM      Smokescreen   Codeshield
WPP     100%     100%          100%
CVFV    66.7%    83.3%         83.3%
SMC     25.0%    15.9%         100%
IS      100%     100%          100%
UC      100%     100%          45.0%
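The WPP percentages reported in this section are instances of the similarity measure of Definition 4. For very small birthmarks that measure can be sketched by brute force, as below. The sketch rests on several assumptions that the text does not fix: graph size is taken to be the number of nodes, the common subgraph is required to be induced, and the search simply tries every injective mapping, which is exponential and only usable on toy graphs. The function names and the two example graphs are hypothetical.

```python
from itertools import combinations, permutations


def mcs_size(nodes_a, edges_a, nodes_b, edges_b):
    """Node count of a maximum common induced subgraph of two directed graphs.

    Brute force: for each candidate size k, largest first, try every k-subset
    of A and every ordered choice of k nodes of B, and accept the first
    mapping under which the two induced subgraphs have identical edges.
    """
    nodes_a, nodes_b = list(nodes_a), list(nodes_b)
    for k in range(min(len(nodes_a), len(nodes_b)), 0, -1):
        for subset in combinations(nodes_a, k):
            for image in permutations(nodes_b, k):
                m = dict(zip(subset, image))
                if all(((u, v) in edges_a) == ((m[u], m[v]) in edges_b)
                       for u in subset for v in subset):
                    return k
    return 0


def similarity(nodes_p, edges_p, nodes_q, edges_q):
    """Percentage of p's birthmark that is found inside q's birthmark."""
    common = mcs_size(nodes_p, edges_p, nodes_q, edges_q)
    return 100.0 * common / len(nodes_p)


# Hypothetical birthmark of p: three internal nodes.
p_nodes = ["S", "R1", "R2"]
p_edges = {("S", "R1"), ("S", "R2"), ("R2", "R1")}

# Same shape with renamed nodes, e.g. after a transformation that preserves
# the regularity of the trace.
q_nodes = ["A", "B", "C"]
q_edges = {("A", "B"), ("A", "C"), ("C", "B")}

# An unrelated birthmark with only two internal nodes.
r_nodes = ["X", "Y"]
r_edges = {("X", "Y")}

print(similarity(p_nodes, p_edges, q_nodes, q_edges))
print(similarity(p_nodes, p_edges, r_nodes, r_edges))
```

Because the measure divides by the size of p's birthmark rather than by the larger of the two, it is asymmetric: it asks how much of the original survives in the suspect program, not how alike the two programs are overall.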
We do know of two attacks that the WPP birthmark is currently vulnerable to. The first is any loop transformation that alters the loop in ways similar to loop unrolling or loop splitting. Executing the loop backwards, however, will not affect the WPP birthmark. WPP birthmarks are also vulnerable to method inlining in certain instances. If the method call occurs inside of a loop then inlining will not alter the birthmark. On the other hand, if the method is a helper method which is called from various locations throughout the program, inlining the method call will have an effect on the birthmark similarity.

5.3 Birthmarks and Watermarks

One limitation of software birthmarks is that they provide weaker evidence than software watermarks. They are only able to say that one program is likely to be a copy of another, not who the original author is or who is guilty of piracy. However, birthmarks can be used in instances where watermarking is not feasible, such as applications where code size is a concern and the watermark would insert additional code.

Birthmarks can also be used in conjunction with watermarking to provide stronger evidence of theft. One such example is the watermarking algorithm proposed by Stern, et al. [22], which provides a probability that a specific watermark is contained in the program. If the watermarking algorithm does not guarantee with 100% certainty that the watermark is contained in the program, then a birthmark could be used as additional evidence of theft. There are also instances where watermarks fail, e.g. an attacker is able to apply an obfuscation which destroys the watermark. In these instances a birthmark may still be able to
provide proof of program theft since the birthmark may be more resilient to the transformation.

We were able to very easily construct three instances using the wc program where a watermark is destroyed by an obfuscation, but WPP birthmarks still detect 100% similarity between the programs. In the first instance we used a very simple static watermarking algorithm which embeds the watermark by splitting it in half and using the first half to name a new field and the second half in the name of a new method. We then applied an obfuscation which adds additional fields to the program. In the second instance the same watermarking algorithm is used, but this time the obfuscation renames all of the identifiers in the program. In the third instance we watermarked the program using the algorithm proposed by Arboit [7], which encodes the watermark in opaque predicates [13] that are appended to various branches throughout the program. We then applied an obfuscation which adds opaque predicates to every boolean expression throughout the application. In each of these instances the watermark is destroyed, which would have prevented piracy detection, but the WPP birthmark was able to detect 100% similarity.

6 Future Work

The most pressing future work is to conduct a more extensive evaluation of the WPP birthmark technique. The evaluation conducted in this paper was only preliminary, and thus we would like to study the effectiveness on a larger set of test applications as well as more combinations of obfuscations.

As was discussed in Sect. 5.2, WPP birthmarks are susceptible to various loop transformations. To address this problem we want to evaluate the effectiveness of incorporating transformations, such as loop rerolling, in a preprocessing stage that would reverse the transformation.

In addition, we would like to add functionality to the technique which would make it possible to target module-level as well as program-level theft. Once this functionality has been added we would like to evaluate the effectiveness of WPP birthmarks in the detection of plagiarism within student programs.

Another interesting area of software birthmarks that should be explored is the combination of static and dynamic birthmarks. Unlike watermarks, where it is possible to destroy one watermark with another, two or more birthmarks can always be used in conjunction
to provide stronger evidence of theft.

7 Summary

In this paper we expanded on the idea of software birthmarking by introducing dynamic birthmarks and in particular a specific dynamic birthmark called Whole Program Paths. We evaluated the technique with respect to two properties: credibility and resistance to transformation. In both evaluations the technique demonstrated promising results. WPP birthmarks did not falsely identify two independently written programs as being copies even though they perform the same task. Based on the test program, wc, and the available obfuscations, WPP birthmarks calculated a similarity of 100% between the original and the transformed program. We also demonstrated how birthmarks can be used in conjunction with watermarks and in some instances are able to detect piracy even when the watermark has been destroyed.

References

1. Codeshield java byte code obfuscator. http://www.codingart.com/codeshield.html.
2. Sandmark. http://www.cs.arizona.edu/sandmark/.
3. Smokescreen java obfuscator. http://leesw.com.
4. Zelix klassmaster. http://www.zelix.com/klassmaster/index.html.
5. Alex Aiken. Moss: a system for detecting software plagiarism. http://www.cs.berkeley.edu/aiken/moss.html.
6. Ross J. Anderson and Fabien A. P. Petitcolas. On the limits of steganography. IEEE Journal of Selected Areas in Communications, 16(4):474-481, May 1998. Special issue on copyright & privacy protection.
7. Geneviève Arboit. A method for watermarking java programs via opaque predicates. In The Fifth International Conference on Electronic Commerce Research, 2002.
8. H. Bunke and K. Shearer. A graph distance metric based on the maximal common subgraph, 1998.
9. C. Collberg, E. Carter, S. Debray, A. Huntwork, C. Linn, and M. Stepp. Dynamic path-based software watermarking. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 04), 2004.
10. Christian Collberg, Ginger Myles, and Mike Stepp. Cheating cheating detectors. Technical Report TR04-05, University of Arizona, 2004.
11. Christian Collberg and Clark Thomborson. Software watermarking: Models and dynamic embeddings. In Conference Record of POPL 99: The 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Jan. 1999.
12. Christian Collberg, Clark Thomborson, and Douglas Low. A taxonomy of obfuscating transformations. Technical Report 148, Department of Computer Science, University of Auckland, July 1997.
13. Christian Collberg, Clark Thomborson, and Douglas Low. Manufacturing cheap, resilient, and stealthy opaque constructs. In Principles of Programming Languages 1998, POPL 98, San Diego, CA, January 1998.
14. R. L. Davidson and N. Myhrvold. Method and system for generating and auditing a signature for a computer program. US Patent 5,559,884, Assignee: Microsoft Corporation, 1996.
15. Derrick Grover. Program identification. In Derrick Grover, editor, The Protection of Computer Software: Its Technology and Applications, pages 122-154. Cambridge University Press, 1989.
16. James R. Larus. Whole program paths. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 99), 1999.
17. A. Monden, H. Iida, K. Matsumoto, Katsuro Inoue, and Koji Torii. A practical method for watermarking java programs. In compsac2000, 24th Computer Software and Applications Conference, 2000.
18. C. G. Nevill-Manning and I. H. Witten. Compression and explanation using hierarchical grammars. The Computer Journal, 40(2/3), 1997.
19. C. G. Nevill-Manning and I. H. Witten. Linear-time, incremental hierarchy inference for compression. In Proceedings of the Data Compression Conference (DCC 97).
20. Gang Qu and Miodrag Potkonjak. Hiding signatures in graph coloring solutions. In Information Hiding, pages 348-367, 1999.
21. Saul Schleimer, Daniel Wilkerson, and Alex Aiken. Winnowing: Local algorithms for document fingerprinting. In Proceedings of the 2003 SIGMOD Conference.
22. Julien P. Stern, Gael Hachez, Francois Koeune, and Jean-Jacques Quisquater. Robust object watermarking: Application to code. In Information Hiding, pages 368-378, 1999.
23. Haruaki Tamada, Masahide Nakamura, Akito Monden, and Kenichi Matsumoto. Detecting the theft of programs using birthmarks. Information Science Technical Report NAIST-IS-TR2003014, ISSN 0919-9527, Graduate School of Information Science, Nara Institute of Science and Technology, Nov 2003.
24. Haruaki Tamada, Masahide Nakamura, Akito Monden, and Kenichi Matsumoto. Design and evaluation of birthmarks for detecting theft of java programs. In Proc. IASTED International Conference on Software Engineering (IASTED SE 2004), pages 569-575, Feb 2004.
25. Ramarathnam Venkatesan, Vijay Vazirani, and Saurabh Sinha. A graph theoretic approach to software watermarking. In 4th International Information Hiding, Pittsburgh, PA, April 2001.
26. Geoff Whale. Identification of program similarity in large populations. 33:140-146, 1990.
27. Michael J. Wise. Detection of similarities in student programs: Yaping may be preferable to plagueing. In 23rd SIGCSE Technical Symposium, pages 268-271.