Joza: Hybrid Taint Inference for Defeating Web Application SQL Injection Attacks

Abbas Naderi-Afooshteh, Anh Nguyen-Tuong†, Mandana Bagheri-Marzijarani‡, Jason D. Hiser§, Jack W. Davidson¶
Department of Computer Science, University of Virginia, Charlottesville, US
e-mail: abiusx@virginia.edu, †an7s@virginia.edu, ‡mb3wz@virginia.edu, §dh8d@virginia.edu, ¶jwd@virginia.edu

Abstract

Despite years of research on taint-tracking techniques to detect SQL injection attacks, taint tracking is rarely used in practice because it suffers from high performance overhead, intrusive instrumentation, and other deployment issues. Taint inference techniques address these shortcomings by obviating the need to track the flow of data during program execution by inferring markings based on either the program's input (negative taint inference), or the program itself (positive taint inference). We show that existing taint inference techniques are insecure by developing new attacks that exploit inherent weaknesses of the inferencing process. To address these exposed weaknesses, we developed Joza, a novel hybrid taint inference approach that exploits the complementary nature of negative and positive taint inference to mitigate their respective weaknesses. Our evaluation shows that Joza prevents real-world SQL injection attacks, exhibits no false positives, incurs low performance overhead (4%), and is easy to deploy.

I. INTRODUCTION

Despite increasing awareness of security issues in recent years [34], widely-used Web applications remain vulnerable to SQL injections and other common attacks [35], [38]. The impact of such attacks is severe and can lead to full server takeovers [38]. SQL injections have consistently ranked on top of various lists, e.g., #1 on MITRE's 2011 CWE/SANS list of Top 25 Most Dangerous Software Errors [17], and #1 or #2 on OWASP Top 10 Web Application Vulnerabilities for 2007 [34], 2010 [39] and 2013 [38].

Proposed solutions primarily rely on developer awareness of secure coding practices (such as prepared statements and sanitizing inputs), but these practices are routinely ignored or exercised incorrectly. Furthermore, these best practices are rarely retrofitted to the ever-growing base of existing legacy code. Compounding this problem, popular Web frameworks such as WordPress actively encourage their developer community to extend the base framework with new functionality via a plugin architecture. While the core frameworks are heavily scrutinized and employ best coding practices, the quality of plugins varies widely. Attesting to the low quality of plugins, we collected 50 vulnerable Wordpress plugins. Using a range of SQL injection attacks, we then harvested and adapted a working exploit for each plugin [2].

A well-explored technique for detecting SQL injection attacks is negative run-time taint-tracking, where untrusted data is annotated with taint markings and these markings are maintained as data flows through an application [21], [26], [9], [7], [23], [14]. Security-critical commands in the application can then be checked for the presence of tainted commands, which, if present, indicates a potential attack. Another form of taint-tracking is that of positive taint-tracking, in which taint markings are associated with data that originate from within a program, and are therefore trusted [12], [13]. In this case, a security-critical command that is not marked as positively tainted indicates an attack is being attempted. Figure 1 illustrates the complementary nature of using negative and positive taint to detect attacks. In the figure, - indicates negative taint markings (untrusted), + indicates positive taint markings (trusted), and c indicates critical SQL tokens obtained by parsing the command.

Fig. 1: Negative and positive taint markings for a SQL query. An attack is detected when a critical part of the query structure is negatively tainted, or when it is not positively tainted. Note that attack detection is orthogonal to whether the taint markings are obtained via traditional taint-tracking techniques or via taint inference.

Despite the security effectiveness of taint-tracking techniques, they are rarely deployed. For PHP, our target language, solutions that provide good performance (in the 10% range) require administrator privileges to install custom interpreters or extensions [21], [26], [29], [14]. Further, PHP continues to evolve at a rapid pace, which makes adopting taint-tracking extensions a risky business proposition as these extensions will invariably fall behind releases of the main distribution. Solutions that manage the propagation of taint information at the source code level, e.g., directly in PHP, incur high overhead (in the 200% range) [23].

A low-overhead, emergent alternative approach to taint tracking is taint inference. Taint inference techniques seek to infer taint markings, obviating the need for the complex machinery and modeling required to propagate and maintain taint information [28], [22]. Analogously to taint-tracking techniques, taint inference techniques are categorized as negative [28] or positive [22] depending on whether they seek to infer taint markings for untrusted data (negative taint) or trusted data (positive taint) (Figure 1). The potential disadvantage of taint inference is that it is susceptible to false negatives, i.e., missed attack detection, due to the inherent imprecision in the inference process (Section III).

The key insight underlying our approach is that a hybrid taint inference model that exploits the complementary nature of negative and positive taint techniques results in a much more secure system than using either inference technique in isolation while simultaneously mitigating their respective weaknesses.

The primary contributions of this paper are:

• A convincing demonstration that neither negative taint inference nor positive taint inference is adequately secure. Using novel but straightforward techniques we successfully mutated 51 out of 53 real-world exploits to bypass negative taint inference. For positive taint inference we developed an automated evasion tool to adapt 14 out of 53 real-world exploits to bypass positive taint inference.
• The development of a novel hybrid taint inference model that synergistically combines negative and positive taint inference, resulting in a more secure system than either. Attacks that evade negative taint inference are detected by positive taint inference, and vice-versa.
• A comprehensive evaluation of a hybrid taint inference prototype called Joza.¹ We show that Joza incurs less than 5% overhead, with no false positives, is easy to deploy, and thwarts a wide range of SQL injection attack types, including 53 instances of novel attacks designed to bypass positive or negative taint inferencing.
• The development of WP-SQLI-LAB [2], an open-source fully-automated SQL injection test suite.

¹ Joza is the Persian/Arabic equivalent of the Gemini zodiac constellation, which is Latin for twins.

The rest of this paper is organized as follows. The threat model is presented in Section II. Section III describes the hybrid taint inferencing model, discusses positive and negative taint inference techniques, including their complementary strengths and weaknesses in detail. Section IV presents a high-level architecture of Joza and its deployment model. We present the security evaluation in Section V, followed by performance evaluation in Section VI. Section VII discusses related work, while Section VIII provides concluding remarks.

II. THREAT MODEL

Our threat model assumes software is intended to be benign, but likely contains flaws. The program, when run, accepts untrusted input, possibly from many sources such as files, environment variables, HTTP request bodies, HTTP request headers, databases and others. The input is then used to create SQL queries that are issued to the database. Most inputs to the program are benign and cause the queries to behave as intended, but malicious inputs may exploit the program to violate the security policy intended for the SQL queries. An SQL injection occurs when attacker-controlled inputs are interpreted as SQL keywords, built-in functions, or delimiters, or when they change the programmer-intended syntactic structure of a command [36], [28].

We considered using a strict definition of SQL injection attacks such as the one defined by Ray and Ligatti [27], [29]. Unfortunately, many programs, such as those that incorporate advanced search functionality, would break as they allow field and table names to be specified through user inputs [6], [30], [31], [3]. We assume a more pragmatic stance, which permits these common programming practices, but the techniques presented can be easily adjusted to enforce a user's desired policy.

III. TAINT INFERENCE MODELS

To motivate the key insights underlying the Joza hybrid taint inference model, we first present the strengths and weaknesses of current negative and positive taint inference models.

A. Negative Taint Inference (NTI)

Negative taint inference (NTI) infers taint markings by correlating application inputs with query strings [28]. The pseudo-code for the NTI inference algorithm is as follows:

    query q = intercept_query()
    for each input source, S
        for each input p, in S
            diff_ratio = substring_distance(q, p)
            if diff_ratio < threshold
                mark_negative_taint(q, p)

NTI employs an approximate string matching algorithm to make allowance for common and small string transformations performed by an application, such as stripping whitespace and performing case-conversions. Function substring_distance computes a difference ratio which is the string distance between an input and a query divided by the length of the matched query substring. A difference ratio of zero means that the input string appears unchanged inside the query. If the diff_ratio is below a threshold the algorithm infers that a match has occurred. As will be discussed shortly, selecting a proper threshold is not straightforward.

Finding the minimum substring distance is a computationally expensive algorithm. In its simplest form, every substring of the query is compared to the input using the Levenshtein edit-distance algorithm [15]. This simple form has a computational cost of $O(n^2 m^2)$ where $n$ is the length of the input parameter and $m$ is the length of the query. The running time of the algorithm is $O(l n^2 m^2)$ where $l$ is the number of input parameters. This algorithm is impractical for long queries composed of large user inputs, such as when a user posts a multi-page blog entry or uploads a file, or when a visitor posts a sizable comment. Numerous optimizations exist for this algorithm, such as computing distances using dynamic programming and using heuristics to skip implausible comparisons [28]. The optimizations used in Joza's NTI component are explained in the performance evaluation (Section VI).

Fig. 2: NTI Markings. Part A: benign input, Part B: malicious input (attack detected), Part C: evasive input (attack undetected).

Figure 2 shows the taint markings inferred for various inputs sent to a vulnerable application. In part A of Figure 2, the query is deemed safe as no critical token has been marked as negatively tainted. Part B of Figure 2 illustrates how NTI detects an attack. NTI infers that -1 OR 1=1 is negatively tainted as it precisely matches the value of the input parameter id. Because the critical tokens OR and = are tainted, NTI detects a potential attack.

1) Strengths: Low Overhead and Low Implementation Complexity. NTI performs well when there is a strong correspondence between application inputs and queries. NTI has negligible memory and processor footprint for small inputs and queries, and only needs to be computed when input is provided to the application [28].

2) Weaknesses: Sensitivity to Threshold Value. As previously noted, NTI uses an approximate string matching algorithm to allow for transformations of the input. The sensitivity of the string matching algorithm is tuned by specifying a threshold value that is proportionally related to the edit distance between an input value and its match in the query string. Setting the threshold value too high yields the inference of too many taint markings, which causes false positives. On the other hand, setting the threshold value too low yields too few taint markings, which causes false negatives. Selecting an optimum threshold value for an application or across a set of applications is not straightforward.

Evasion via Application-level Transformations. Any input transformation applied inside an application can potentially result in the bypass of NTI as it breaks the correspondence between inputs and query strings. For example, a common data transformation is to use a Base64 encoding where binary data is converted to human-readable characters for transfer over ASCII-based protocols.

Most web applications apply some form of input manipulation for the purpose of validation, sanitization or normalization. For example, Wordpress enforces Magic Quotes, a deprecated PHP facility that escapes quotes, backslashes and double quotes with additional backslashes. Wordpress also trims whitespace from input provided by authenticated users. For applications that perform similar transformations to Wordpress, an attacker can craft an injection payload that includes a comment block, inside of which an arbitrary number of special characters (e.g., quotes in the case of Wordpress) can be added. The web application will then transform and include these escaped quotes in a comment block inside the SQL query, resulting in a higher string edit distance than the specified threshold, effectively bypassing negative taint inference.

An attacker can also leverage whitespace trimming (a common operation) by appending an arbitrary number of whitespaces, and rely on the fact that these whitespaces will be removed by the web application. Again, the net effect is a higher string edit distance than the specified threshold. Note that evasions can be done via any transformation of input inside the application code, and are not limited to the examples discussed here.

Part C of Figure 2 illustrates NTI evasion. The edit distance between the input and the matched portion of the output is five (the number of backslashes added by magic quotes). Dividing by the length of the entire matched portion (22) yields a 22.7% difference ratio, which is not small enough to cause a match for a threshold of 20%. An adversary evades NTI by adding enough quotes to drive the difference ratio higher than the threshold value.

Fig. 3: PTI Markings. Part A: benign input, Part B: malicious input (attack detected), Part C: malicious input (attack undetected).

Payload Construction. Concatenation of two or more inputs by an application enables attackers to construct an attack payload that potentially evades NTI. The following PHP code and sample input demonstrate this attack:

    $input = $_GET['q1'] . $_GET['q2'] . $_GET['q3'];
    $query = "SELECT * FROM data WHERE ID=" . $input;

    Input: q1=1 O    q2=R TR    q3=UE
    Query: SELECT * FROM data WHERE ID=1 OR TRUE
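To make the NTI matching loop and the concatenation evasion concrete, the following is a minimal Python sketch, not Joza's implementation. It uses the brute-force form described earlier (every query substring compared with the input by Levenshtein distance). The critical-token list `CRITICAL`, the function names, and the rule that a critical token must lie wholly inside one input's matched span are simplifying assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny set of critical SQL tokens for this sketch.
CRITICAL = re.compile(r"\b(?:SELECT|UNION|OR|AND|TRUE)\b|=", re.IGNORECASE)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def substring_distance(query: str, inp: str):
    """Brute force: find the query substring closest to inp.

    Returns (diff_ratio, start, end), where diff_ratio is the edit
    distance divided by the length of the matched substring.
    """
    best = None
    for start in range(len(query)):
        for end in range(start + 1, len(query) + 1):
            d = levenshtein(inp, query[start:end])
            if best is None or d < best[0]:
                best = (d, start, end)
    if best is None:  # empty query: nothing can match
        return 1.0, 0, 0
    d, start, end = best
    return d / (end - start), start, end


def nti_detects_attack(query: str, inputs, threshold: float = 0.2) -> bool:
    """Flag the query if a critical token lies inside one input's match."""
    tainted = []
    for inp in inputs:
        if not inp:
            continue
        ratio, start, end = substring_distance(query, inp)
        if ratio < threshold:
            tainted.append((start, end))
    return any(s <= tok.start() and tok.end() <= e
               for tok in CRITICAL.finditer(query)
               for s, e in tainted)


# A single input carrying the whole payload is caught ...
print(nti_detects_attack("SELECT * FROM data WHERE ID=-1 OR 1=1", ["-1 OR 1=1"]))
# ... but the three-part payload from the example above is not.
print(nti_detects_attack("SELECT * FROM data WHERE ID=1 OR TRUE", ["1 O", "R TR", "UE"]))
```

In the second call every input matches the query exactly, yet no single match covers a whole critical token (`OR` straddles `q1` and `q2`, `TRUE` straddles `q2` and `q3`), which is why the sketch reports no attack.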
Note that taint markings inferred from different inputs cannot be combined to detect an attack as it would introduce too many false positives. For example, by combining common one letter inputs such as O and R, all queries containing the word OR would be incorrectly inferred as negatively tainted. Also to alleviate false positives that would result from matching very short inputs (such as single letters), NTI detects an attack only if an input matches at least one whole SQL token.

B. Positive Taint Inference (PTI)

In contrast to negative taint inference, positive taint inference (PTI) infers the parts of a SQL query string that should be trusted. The PTI technique works by reconstructing security-critical commands using string fragments extracted from the program. PTI was successfully used previously to thwart OS command injection attacks [22]. We generalize this work and adapt PTI to cover SQL injections for web applications. The PTI inference process is conceptually simple and is shown with the following pseudo-code:

    Let F be the set of string fragments extracted from program P
    query q = intercept_query()
    for each string fragment f in F
        for each position, i, in q
            if f == q[i..i+len(f)]
                mark_positive_taint(q[i..i+len(f)]);

The set of string fragments, F, is extracted by processing the application and all plugins to identify string literals contained in the application. As shown, this algorithm is computationally expensive, running in $O(n m^2)$ where $n$ is the number of fragments and $m$ is the length of the query. Section VI-A describes optimizations to speed up the inference process.

Consider the following vulnerable PHP program:

    $postid = $_GET['id'];
    $query = "SELECT * FROM records WHERE ID=" . $postid;
    $query = $query . " LIMIT 5";
    $result = mysql_query($query);

For this example, the string fragment extraction process yields the following fragments:

    id
    SELECT * FROM records WHERE ID=
     LIMIT 5
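The PTI pseudo-code can be exercised on the three fragments just listed. The Python sketch below is illustrative only: the regular expression `CRITICAL` is a hypothetical stand-in for a real SQL parser, and the function names are not taken from Joza. It treats a query as safe only when every critical token falls wholly inside a single matched fragment.

```python
import re

# Hypothetical stand-in for a SQL parser: comments, a few keywords,
# function-call openers such as "username(", and "=" count as critical.
CRITICAL = re.compile(
    r"/\*.*?\*/|\b(?:SELECT|FROM|WHERE|UNION|LIMIT|OR|AND)\b|\w+\(|=",
    re.IGNORECASE | re.DOTALL,
)


def positive_spans(query: str, fragments):
    """Return (start, end) for every occurrence of every fragment in query."""
    spans = []
    for frag in fragments:
        if not frag:
            continue
        pos = query.find(frag)
        while pos != -1:
            spans.append((pos, pos + len(frag)))
            pos = query.find(frag, pos + 1)  # allow overlapping occurrences
    return spans


def pti_detects_attack(query: str, fragments) -> bool:
    """Flag the query if some critical token is not inside one fragment."""
    spans = positive_spans(query, fragments)
    for tok in CRITICAL.finditer(query):
        if not any(s <= tok.start() and tok.end() <= e for s, e in spans):
            return True
    return False


# Fragments extracted from the example program (note the leading space).
FRAGMENTS = ["id", "SELECT * FROM records WHERE ID=", " LIMIT 5"]

benign = "SELECT * FROM records WHERE ID=" + "7" + " LIMIT 5"
attack = "SELECT * FROM records WHERE ID=" + "-1 UNION SELECT username()" + " LIMIT 5"

print(pti_detects_attack(benign, FRAGMENTS))
print(pti_detects_attack(attack, FRAGMENTS))
```

The benign query passes because `SELECT`, `FROM`, `WHERE`, `=` and `LIMIT` all sit inside program fragments, while the injected `UNION`, the second `SELECT` and `username(` are covered by none of them.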
Note that the space before LIMIT 5 is part of the fragment extracted from the program and can be important in the matching process.

Figure 3 illustrates positive taint markings (denoted with +). In part A of Figure 3, the query is deemed safe as all critical tokens are positively tainted. Part B of Figure 3 illustrates the case when an attack payload such as -1 UNION SELECT username() is the application input. This payload extracts the database username, but is detected by PTI because three critical tokens (UNION, SELECT and username()) are not marked as positively tainted.

To prevent attackers from combining fragments to form a critical token, PTI requires that critical tokens be fully contained within a single fragment. For example, PTI does not allow the critical token OR to be created by combining the single-letter fragments O and R. Additionally, PTI treats SQL comments as one critical token and requires that comments be fully contained in one fragment.

1) PTI Strengths: Input-Independence. A distinguishing feature of positive tainting techniques in general is that the process of obtaining the taint markings is intrinsic to a program. This process is not affected by external input and therefore is not subject to control by an adversary [12], [13], [22]. To reinforce this key point, note that the algorithm used by PTI to infer taint markings for a query depends only on string fragments extracted from the program.

Independence from external inputs means that PTI is immune to issues that plague negative taint-tracking techniques, e.g., correctly identifying all sources of untrusted data, correctly propagating taint markings throughout execution of a program, and precisely modeling complex string functions such as regular expressions replacement functions.² PTI is resistant to second order attacks, such as when the injection payload is cached into a file, and then retrieved by the application and fed into a query. PTI is also resistant to mixed input-source attacks, such as when an injection payload is constructed inside the application by concatenating harmless inputs from different sources. Furthermore, input-independence enables extensive use of caching for performance optimization, since a query can be analyzed once and the analysis result cached indefinitely.

² For example, Diglossia [29], PHPrevent [21] and the PHP taint-tracking extension [14] do not model functions such as preg_replace precisely.

Fig. 4: Part A: Attack that is undetected by PTI but is detected by NTI, Part B: Attack that is undetected by NTI, but is detected by PTI.

Encoding-Resistance. Encodings performed by the database engine and application logic are frequent in web applications. For example many web applications store encoded or encrypted data in cookies, sessions and databases for subsequent use. Firewalls and intrusion detection systems typically operate on user-input at the network level and have no visibility into the actual value of these inputs. PTI can access the original data, because the data is eventually decoded and used in a SQL query.

2) PTI Weaknesses: Application-dependent Attack Surface. The set of extracted string fragments forms the vocabulary with which an attacker can craft an exploit. For example, in part C of Figure 3, the attacker-supplied input, 1 OR 1=1, would erroneously be deemed safe if the program contained both the string fragments OR and =. In general, longer attack payloads that require multiple critical tokens have a higher probability of detection than shorter attacks.

C. Hybrid Taint Inference Model
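In brief, the hybrid model developed in this subsection reports an attack whenever either detector flags the query. The Python sketch below illustrates only that combination rule. The two detectors are toy stand-ins written for this example (they are not the NTI and PTI algorithms themselves), and all names are hypothetical.

```python
from typing import Callable, Sequence

# A detector takes the intercepted query and the saved inputs and
# returns True when it believes the query is an attack.
Detector = Callable[[str, Sequence[str]], bool]


def hybrid_detects_attack(query: str, inputs: Sequence[str],
                          pti: Detector, nti: Detector) -> bool:
    """Report an attack if either technique detects one."""
    return pti(query, inputs) or nti(query, inputs)


def toy_pti(query: str, inputs: Sequence[str]) -> bool:
    # Toy stand-in: pretend the program's fragments contain OR and =
    # but not UNION, so only UNION-based payloads are flagged.
    return "UNION" in query.upper()


def toy_nti(query: str, inputs: Sequence[str]) -> bool:
    # Toy stand-in with exact matching: flag inputs that reach the
    # query unchanged and carry a critical keyword.
    return any(p in query and (" OR " in p.upper() or "UNION" in p.upper())
               for p in inputs if p)


# Short tautology: slips past the toy PTI, caught by the toy NTI.
short_in = "1 OR 1=1"
short_q = "SELECT * FROM t WHERE id=" + short_in

# Quote-padded payload: the application escapes the quotes, so the toy
# NTI no longer finds the input in the query; the toy PTI flags UNION.
long_in = "-1 UNION SELECT 1/*'''*/"
long_q = "SELECT * FROM t WHERE id=" + long_in.replace("'", "\\'")

for q, i in [(short_q, [short_in]), (long_q, [long_in])]:
    print(toy_pti(q, i), toy_nti(q, i), hybrid_detects_attack(q, i, toy_pti, toy_nti))
```

Each payload evades exactly one of the toy detectors, and the disjunction still flags both, which is the behavior the hybrid model relies on.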
The complementary nature of negative and positive taint inference techniques is concretely illustrated in the examples of Figure 4. Part A of Figure 4 shows an attack payload that is undetected by PTI but detected by NTI. Conversely, part B of Figure 4 shows an attack payload that is undetected by NTI but detected by PTI.

PTI is susceptible to short attack payloads built with only a few critical tokens. These payloads are likely intercepted by NTI, since they are of short length and appear mostly unchanged in the output. NTI is susceptible to long payloads constructed by leveraging application-specific transformations. These payloads are typically intercepted by PTI since they are composed of a large number of critical tokens or use large blocks filled with transformable data (such as whitespaces or comments).

To exploit the complementary nature of PTI and NTI, we combine them in one system so that even attacks explicitly designed to bypass one will be detected by the other. If either algorithm detects an attack, an attack is reported. If neither technique detects an attack, no attack is reported. Thus, the combination mitigates the security weakness of each individual technique.

Combining NTI and PTI in a hybrid model also means composing false positive rates and overhead rates. Previous studies of NTI and PTI have shown performance overhead rates to be less than 5% with no false positives [28], [22]. Sections V and VI experimentally demonstrate that Joza, our system that implements a hybrid NTI and PTI model, retains favorable performance characteristics without incurring false positives.

IV. JOZA SYSTEM

Figure 5 provides an architectural overview of the Joza system. Joza consists of two major analysis components, PTI Analysis and NTI Analysis. The PTI Analysis component implements the positive taint inference algorithm, whereas the NTI component implements the negative taint inference algorithm. All commands intended for the backend database management system (DBMS) are intercepted and first sent to the PTI Analysis component, and then to the NTI Analysis component before being allowed to proceed to the DBMS.

Fig. 5: Joza Architecture

A. Installation

Joza is initially installed by adding the preprocessing component to the entry point of a web application. In the case of Wordpress, this step can be done by placing Joza in the plugins directory of Wordpress and configuring the Wordpress plugin manager to run Joza automatically on every request.

A web application in PHP is typically a collection of PHP source code files residing in one top-level directory and several subdirectories. Joza recursively parses all source code files reachable from the top directory and extracts string literals from each file to form the final set of string fragments. These fragments will subsequently be used by the PTI Analysis component. In the case of format strings or other strings with placeholders, Joza breaks them down into multiple fragments. For example, the string SELECT * from users where id=$id and password=$password would be broken down into two fragments:

    SELECT * from users where id=
     and password=

Note that only fragments that contain at least one valid SQL token need to be retained.

To intercept queries, the installation process wraps all standard PHP functions and classes that interact with backend databases, e.g., mysql* and PDO*. These wrappers are implemented using a source-level transformation to replace all calls to database functions with calls to equivalent Joza wrappers.

B. Preprocessing

The preprocessing component defines Joza wrappers and stores a copy of all inputs to the web application to preserve them for NTI analysis. This step is required as many web applications modify user-input before it reaches NTI analysis. The preprocessing component also invokes the installer whenever new or modified files are found in the application (e.g., when the application is updated or a new plugin is installed), to keep the set of string fragments complete and enable Joza to intercept all queries sent to the database by the application.

C. PTI Analysis Component

The PTI Analysis component sends intercepted queries to a PTI daemon. The daemon performs two primary functions. The first is to parse intercepted queries to extract critical tokens and keywords. The second is to infer which parts of the intercepted query should be trusted using the PTI algorithm described in Section III-B, and return whether the query is deemed safe or not. As an optimization, the PTI Analysis component maintains a query cache to store safe queries. For applications such as Wordpress with a workload heavily-skewed towards reads, this caching mechanism dramatically boosts performance (Section VI).

1) PTI Daemon: The PTI Daemon is a native binary application that loads the PTI dynamic library as well as the string fragments into memory, connects to the web application and waits for incoming queries. Once a query arrives, its structure and the result of its taint analysis are communicated back to the web application. Multiple daemon processes can coexist. The lifetime of a single daemon instance can range from a single web application instance (comprising multiple database queries) to hours. The daemon is launched on demand (as a binary process) by the PHP application and communicates with the PHP application using named or anonymous pipes. In its shortest lifespan, the daemon lives for the duration of one web request, communicating via anonymous pipes and terminating alongside the application. To allow longer lifetimes, the daemon is launched independently of the launching web application (e.g., using nohup) and communicates to the web application instances using named pipes. To improve performance, the daemon also includes a query structure cache which caches abstract syntax trees of parsed queries without storing contents of data nodes. This optimization is discussed in more detail in Section VI.

2) PTI Query Cache: The PTI query cache uses an in-memory hash table in the backend database to cache the PTI analysis result of a query (i.e., whether the query is safe or not). Because many queries of a web application are constant and do not rely on any user-input, caching improves performance significantly without noticeably increasing the memory footprint of the daemon.

D. NTI Analysis Component

To implement the NTI algorithm described in Section III-A, Joza must first make a copy of all inputs including cookies contained in HTTP headers, as well as HTTP GET and POST values. While computing the necessary substring distance between inputs and the intercepted query can be expensive, PHP directly supports this computation using a built-in Levenshtein edit-distance algorithm [15]. Once the negative taint markings have been inferred, the NTI Analysis Component reuses the critical tokens and keywords previously obtained by the PTI Daemon, and can then determine whether a query is safe.

E. Attack recovery

A query is safe if and only if both PTI and NTI components deem the query safe. When an attack is detected, Joza supports two recovery policies: error virtualization and termination. The error virtualization policy returns an error code as if the query had failed and relies on the application logic to handle this error gracefully. The termination policy forces the application to exit. The default Joza policy is to assume a conservative security posture; Joza uses termination, which typically results in a blank HTML page returned to the end user.

F. Architecture Rationale

The twin requirements for Joza to exhibit low overhead and be easy-to-deploy, i.e., without requiring administrator privileges, motivate our decision to implement the PTI algorithm as a user daemon. Two alternative designs for the PTI algorithm are PHP extensions and a pure PHP implementation. A PHP extension is a native library linked against a specific version of PHP headers and is not compatible with other PHP versions, and would therefore require the PTI daemon to be updated as frequently as the PHP interpreter. Loading or installing PHP extensions requires administrative privileges, which is impractical in many deployment scenarios, e.g., shared hosting environments. A pure PHP implementation of a SQL parser and the PTI algorithm was also tested, but rejected, as the resulting overhead ranged from 20% to 200%.

As for NTI, moving the analysis to the daemon would not benefit performance, because NTI requires all inputs of the application and communicating them to the daemon would incur more overhead than the performance gain, especially when processing sizable inputs (such as file uploads).

V. SECURITY EVALUATION

To evaluate Joza's security, we created WP-SQLI-LAB, an open-source security testbed consisting of a recent Wordpress version (v3.8) packaged with 50 plugins publicly reported to be vulnerable to SQL injection attacks [2]. The plugins represent a diverse set of applications, including social media, e-commerce, image galleries and forums. Exploits were obtained from various public sources, including CVE reports, security research blogs and other security-related websites.
TABLE I: Classification of WP-SQLI-LAB attack types

    Attack Type       No. of Plugins
    Union Based       15
    Standard Blind    17
    Double Blind      14
    Tautology         4

Table I lists the type of exploits collected and their frequency in the testbed. A union-based exploit allows attackers to replace the expected result of a query with a data record obtained by a query of their choosing. This type of exploit allows easy extraction of any information from the database. A standard-blind exploit returns errors if the query returns no results, and valid results otherwise. This type of exploit allows an attacker to extract desired data by binary searching each character using conditional payloads generated by automated tools (such as SQLMap) or manually. Double-blind exploits seek to determine the validity of an injected payload by observing the application's response time. With a judicious choice of payload, a double-blind exploit can leak vital information such as passwords. Again, typical attacks using this exploit are carried out using a binary search to leak data one valid character at a time. Tautologies such as 1 OR 1=1 can result in the leakage of information or bypassing of authentication code.

A. NTI and PTI Evaluation

The goal of our first experiment was to evaluate the effectiveness of NTI and PTI individually using our testbed. To the best of our abilities, we developed exploits for the testbed without consideration for either NTI or PTI. As shown in Table II, the NTI component detected 49 out of the 50 original exploits. (NTI failed to detect an attack in a plugin that used a Base64 encoding of its inputs.) The PTI component detected all 50 original exploits. These results corroborate the effectiveness of taint inference techniques previously reported [28], [22].

To further evaluate the effectiveness of NTI and PTI, we used a powerful penetration tool (SQLMap [10]) on four of the 50 plugins. The four plugins were selected such that each of the exploit types in Table I was present. On average, SQLMap generated 40 valid attack payloads for each plugin. Both NTI and PTI detected all attack variants.

TABLE II: Baseline effectiveness of NTI and PTI

    Exploits               NTI        PTI
    Testbed                49/50      50/50
    Generated by SQLMap    160/160    160/160

TABLE III: Sample fragments in Wordpress

    Fragment
    UNION
    AND
    OR
    SELECT
    CHAR
    ,
    #
    --
    ;
    /**/
    )
    (
    GROUP BY
    ORDER BY
    CAST
    WHERE 1
    INSERT
    INSERT INTO
    =
    users WHERE ID=
    :-)
    )))
    ?
    To:
    *
    iframe
    tail -c
    td rowspan=

The results in Table II were encouraging as both NTI and PTI defeated almost 100% of the attacks. However, a sophisticated attacker would actively seek to take advantage of the weaknesses identified in Section III. In the next set of experiments, we explored the design space of attacks targeted explicitly to evade either NTI or PTI.

NTI Evasion. Since NTI is susceptible to application-induced transformations, we leveraged the Wordpress implementation of magic quotes (magic quote adds an extra backslash for every quote). We mutated the original attacks by incorporating comment blocks that included quotes. Regardless of the threshold used by NTI for determining a match, an attacker can evade NTI by simply adding enough quotes to ensure that the attack input is above the threshold. Thus, changing the sensitivity threshold used by NTI would not be an effective remedy. Figure 6C shows such an attack. This novel evasion approach resulted in the complete bypass of NTI.
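The arithmetic behind this evasion can be checked with a short Python sketch. It rests on stated assumptions rather than on the paper's code: `magic_quotes` mimics the escaping described in Section III-A, the payload `1 OR 1=1/*...*/` is a hypothetical example (not the exploit in Figure 6), the edit distance is taken to be the number of inserted backslashes, and the matched substring is taken to be the whole escaped payload.

```python
def magic_quotes(s: str) -> str:
    """Prefix quotes, double quotes and backslashes with a backslash."""
    return "".join("\\" + ch if ch in "'\"\\" else ch for ch in s)


def difference_ratio(inp: str) -> float:
    """Inserted backslashes divided by the length of the escaped text.

    Assumes the edit distance equals the number of inserted
    backslashes and that the whole escaped payload is the match.
    """
    escaped = magic_quotes(inp)
    if not escaped:
        return 0.0
    return (len(escaped) - len(inp)) / len(escaped)


def padded_payload(k: int) -> str:
    """Hypothetical tautology payload with k quotes hidden in a comment."""
    return "1 OR 1=1/*" + "'" * k + "*/"


def quotes_to_evade(threshold: float, limit: int = 1000):
    """Smallest k whose ratio exceeds the threshold, or None.

    In this model the ratio is k / (12 + 2k), which stays below 0.5,
    so the search is only meaningful for thresholds under 0.5.
    """
    for k in range(limit + 1):
        if difference_ratio(padded_payload(k)) > threshold:
            return k
    return None


k = quotes_to_evade(0.20)
payload = padded_payload(k)
print(k, len(payload), len(magic_quotes(payload)),
      round(difference_ratio(payload), 3))
```

With a 20% threshold the sketch needs five quotes: the 17-character input becomes 22 characters after escaping, and 5/22 is about 22.7%, the same figures quoted for Part C of Figure 2.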
Fig.6:Real-worldexploitforoneofWP-SQLI-LABvulnerabilities.PartAshowstheoriginalexploit,partBshowstheexploitmutatedusingTaintlesstobypassPTI,partCshowstheexploitadaptedforNTIevasionandpartDdepictsthemixtureofPTIandNTIevasionsintheexploit.PTIEvasion.Toexploittheapplication-dependentattacksurfaceofPTI,wecreatedTaintless[1],anautomatedevasiontoolthatreconstructsattackpayloadsusingstringfragmentsavailableinanapplication.TaintlessreplacescertainSQLtokenswiththeirequivalents(e.g.UNIONwithUNIONALL,CHARwithstringliterals),matchesthelettercaseofattacktokenswiththoseavailableintheapplication,removesthosetokensnotfoundinsidetheapplicationthatcanbesafelyremovedfromtheattackpayload,andalsomatchesthetypeandnumberofwhitespaceswiththoseavailableintheapplication.UsingTaintless,wesuccesfullyadapted13outof50exploitsinthetestbedtoevadePTIdetection.Figure6Bshowsoneoftheseadaptedexploits.TableIIIlistsexamplefragmentsextractedfromWordpressandthe50plugins.SincethesefragmentsincludeORand=(amongmanySQLtokens),PTIdoesnotdetectanattackwithapayloadofOR1=1.Tounderstandhowcommonsimpleinjectionpayloadsareinreal-worldapplications,weanalyzed100recentlyreportedSQLinjectionvulnerabilities(containingexploitcodes)listedbyMITRECVEfrom2012to2014.Oftheseonly4weretautologies(vulnerabletosimplepayloads)[20].B.HybridModelEvaluationTheprevioussectionevaluatedthesecurityofNTIandPTIindividually.WenowevaluateJoza,asystemwherebothNTIandPTIarecombined.Jozadetectsallattacksinthetestbed,evenattackssuccessfullyadaptedtoevadeNTIandPTI(TableIV).OnesuchattackisshowninFigure6.PartAofthegureshowstheoriginalexploitintheresultingquery,whilepartsBandCdisplayadaptationstobypassPTIandNTIrespectively.PartDshowsanunsuccessfulattemptatevadingbothtaintinferencetechniquesinasingleexploitaseachtechniquedetectstheadaptationusedtobypasstheother.Ingeneral,theJozaPTIcomponentstopsthepracticeofusingNTIevasioninanattackpayload,asPTIrequiresthattheentireevasionblockoriginatefromasinglefragment.Ontheotherhand,thesusceptibilityofPTItomalicious
payloadsthatcontainasmallnumberofcriticaltokensavailableintheapplicationiscompensatedbyNTI.Joza'shybridtaintinferencealgorithmdramaticallyraisesthebarformountingasuccessfulSQLinjectionattack.ToevadeJozaonemustconstructanattackthatevadesbothNTIandPTI.AnexamplewouldbeanattackagainstapluginwhereNTIfailstodetecttheattackbecausethestringdistanceistoohigh,andPTIalsofailstodetecttheattackbecausetheattackusesfragmentspreviouslyextractedfromWordpressandtheplugins.Despiteourbestefforts,wehavenotbeenabletocreatesuchanattackagainstthe50pluginsinourtestbed.Tofurtherdemonstratetheeffectivenessofourapproach,weusedJozatoprotectDrupal,JoomlaandosCommerce,popularapplicationswithwell-known,recentlyreportedvulnerabilities.TheDrupalvulnerability[19]isbasedonencodeduser-inputusedtoconstructpreparedstatementsinthewebapplication.PreparedstatementsareusedtopreventSQLinjectionattack
by sending the query to be prepared by the database engine first, and then separately sending user data to the database engine to be used in named or anonymous placeholders defined in the prepared query. Use of prepared statements would remove the attackers' ability to modify a query, and any input provided by an attacker would be treated as data by the backend database. Unfortunately, prepared statements are not a panacea. In this case, user input was used to construct the placeholder names in the query sent to the database to be prepared, allowing an attacker to provide carefully crafted input to modify the original command to the database, regardless of the data parameters.

Joomla was vulnerable to a very complicated double blind SQL injection attack which used encoded input to instantiate an object of a particular class inside the application [18]. This object would construct an SQL query based on its member variables (which could be overridden by the attacker), and execute the query on destruction. osCommerce was susceptible to a tautology attack that extracted sensitive information from the database [8]. Neither PTI nor NTI alone was sufficient to detect all three of these attacks on popular, highly scrutinized web applications, but Joza successfully detected and prevented them.

False Positives. To evaluate false positives, we developed a script to perform a full crawl of the Wordpress application testbed, including posting random comments and performing random searches. We also manually clicked through various parts of Wordpress and did not uncover any false positives. We ran SQLMap on Wordpress configured with the plugins and verified that all attacks detected by Joza were true positives, i.e., valid attacks.

Plugin/Application               Version  CVE/OSVDB  SQL Vulnerability  NTI Original  NTI Mutated  PTI Original  PTI Mutated  Joza
A to Z Category Listing          1.3      86069      Tautology          Yes           No           Yes           No           Yes
AdRotate                         3.6.6    2011-4671  Tautology          No            No           Yes           No           Yes
Advertizer                       1.0      -          Double Blind       Yes           No           Yes           Yes          Yes
Ajax Gallery                     3.0      -          Double Blind       Yes           No           Yes           Yes          Yes
Allow PHP in posts and pages     2.0.0    -          Double Blind       Yes           No           Yes           Yes          Yes
Community Events                 1.2.1    75252      Union Based        Yes           No           Yes           No           Yes
Contus HD FLV Player             1.3      74573      Tautology          Yes           No           Yes           No           Yes
Count per Day                    2.17     75598      Union Based        Yes           No           Yes           Yes          Yes
Couponer                         1.2      -          Union Based        Yes           No           Yes           Yes          Yes
Crawl Rate Tracker               2.02     -          Blind              Yes           No           Yes           Yes          Yes
Easy Contact Form Lite           1.0.7    -          Tautology          Yes           No           Yes           No           Yes
Event Registration plugin        5.43     -          Blind              Yes           No           Yes           Yes          Yes
Eventify                         1.7.f    86245      Union Based        Yes           No           Yes           No           Yes
Facebook Promotions              1.3.3    -          Double Blind       Yes           No           Yes           Yes          Yes
File Groups                      1.1.2    74572      Blind              Yes           No           Yes           Yes          Yes
FireStorm Real Estate Plugin     -        -          Union Based        Yes           No           Yes           No           Yes
GD Star Rating                   1.9.10   83466      Blind              Yes           No           Yes           Yes          Yes
Global Content Blocks            1.2      74577      Double Blind       Yes           No           Yes           Yes          Yes
iCopyright                       1.1.4    -          Blind              Yes           No           Yes           Yes          Yes
IP-Logger                        3.0      -          Union Based        Yes           No           Yes           Yes          Yes
Js-appointment                   1.5      74804      Double Blind       Yes           No           Yes           Yes          Yes
KNR Author List Widget           2.0.0    -          Blind              Yes           No           Yes           Yes          Yes
Link Library                     5.2.1    84579      Blind              Yes           No           Yes           Yes          Yes
Media Library Categories         1.0.6    -          Union Based        Yes           No           Yes           Yes          Yes
Mingle Forum                     1.0.31   75791      Double Blind       Yes           No           Yes           Yes          Yes
MM Duplicate                     1.2      -          Blind              Yes           No           Yes           Yes          Yes
MyStat                           2.6      -          Double Blind       Yes           No           Yes           Yes          Yes
OdiHost Newsletter               1.0      74575      Blind              Yes           No           Yes           Yes          Yes
Paid Downloads                   2.01     86247      Blind              Yes           No           Yes           Yes          Yes
post highlights                  2.2      -          Union Based        Yes           No           Yes           No           Yes
Profiles                         2.0.RC1  -          Blind              Yes           No           Yes           Yes          Yes
ProPlayer                        4.7.7    -          Union Based        Yes           No           Yes           No           Yes
Pure HTML                        1.0.0    -          Double Blind       Yes           No           Yes           Yes          Yes
SCORM Cloud                      1.0.6.6  -          Double Blind       Yes           No           Yes           Yes          Yes
Search Autocomplete              1.0.8    -          Union Based        Yes           No           Yes           No           Yes
SH Slideshow                     3.1.4    74813      Blind              Yes           No           Yes           Yes          Yes
Social Slider                    5.6.5    74421      Blind              Yes           No           Yes           Yes          Yes
UPM Polls                        1.0.3    -          Union Based        Yes           No           Yes           No           Yes
VideoWhisper Video Presentation  1.1      -          Double Blind       Yes           No           Yes           Yes          Yes
Facebook Opengraph Meta          -        -          Union Based        Yes           No           Yes           Yes          Yes
Paypal Donation Plugin           -        74838      Blind              Yes           No           Yes           Yes          Yes
WP Audio Gallery Playlist        0.12     -          Union Based        Yes           No           Yes           No           Yes
WP Bannerize                     2.8.7    76658      Blind              Yes           No           Yes           Yes          Yes
WP DS FAQ                        1.3.2    74574      Double Blind       Yes           No           Yes           Yes          Yes
WP eCommerce                     3.8.6    75590      Tautology          Yes           No           Yes           Yes          Yes
WP FileBase                      0.2.9    75308      Blind              Yes           No           Yes           Yes          Yes
WP Forum Server                  1.7.8    2012-6625  Union Based        Yes           No           Yes           No           Yes
WP Menu Creator                  1.1.7    74578      Double Blind       Yes           No           Yes           Yes          Yes
yolink Search for WordPress      1.1.4    74832      Union Based        Yes           No           Yes           Yes          Yes
Zotpress                         4.4      -          Double Blind       Yes           No           Yes           Yes          Yes
Joomla                           3.0.1    2013-1453  Double Blind       No            No           Yes           Yes          Yes
Drupal                           7.31     2014-3704  Union Based        Yes           No           Yes           Yes          Yes
osCommerce                       2.3.3.4  103365     Tautology          Yes           No           No            No           Yes

TABLE IV: Joza security effectiveness evaluated using original and mutated real-world exploits on the Wordpress testbed. The NTI and PTI columns refer to the original and mutated exploits. Joomla, Drupal and osCommerce were evaluated using only the original exploits.

VI. PERFORMANCE EVALUATION

The performance evaluation of Joza was carried out using Wordpress, a popular content management system that powers 22% of the top 10 million websites [32]. All evaluations were performed on a 4-core iMac running at 2.9 GHz with 24 GB of RAM and Mac OS X 10.10.

A. PTI Optimization

To measure the performance of the Joza PTI component, we set up a fully functional Wordpress site populated with 1001 unique URLs. Crawling the entire website resulted in approximately 20,000 SQL queries, as Wordpress requires multiple database queries to render a page.

Our initial implementation of PTI initiated a new process to detect SQL injections. To make PTI fit for practical use, we dramatically increased its performance by running PTI as a daemon process and by performing two primary optimizations. The first optimization was to use a most-recently-used caching policy for fragments that match a query, to take advantage of the SQL query working set of a Web application [22]. The second optimization was to first parse the query to determine the critical set of tokens before attempting to match these tokens. When coupled with the first optimization, benign queries are therefore quickly matched, while malicious queries may require scanning the entire set of fragments.

Fig. 7: Performance breakdown of optimized PTI daemon compared to binary PTI on top of Wordpress core

Figure 7 illustrates Joza's PTI performance breakdown for a Wordpress request. The unoptimized version is clearly dominated by PTI processing. The optimized daemon reduces this processing time by 66%.
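The two PTI optimizations described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Joza's PHP implementation: the regular expression standing in for a SQL parser, the `FragmentStore` class, and the sample fragment in the usage note are all hypothetical. The sketch first extracts the critical tokens of a query and only then tries to cover each token with a single trusted fragment, keeping recently matched fragments at the front of the search order.

```python
import re
from collections import OrderedDict

# Toy stand-in for a SQL parser (an assumption for this sketch): a few
# keywords and operators are treated as "critical" tokens.
CRITICAL = re.compile(r"\b(?:SELECT|UNION|ALL|FROM|WHERE|OR|AND)\b|--|[=;']",
                      re.IGNORECASE)


def critical_spans(query):
    """Return (start, end) offsets of every critical token in the query."""
    return [m.span() for m in CRITICAL.finditer(query)]


class FragmentStore:
    """Trusted string fragments, searched in most-recently-matched order."""

    def __init__(self, fragments):
        self.fragments = OrderedDict((f, None) for f in fragments)

    def covers(self, query, span):
        """True if a single fragment occurrence in the query contains the span."""
        start, end = span
        for frag in list(self.fragments):
            pos = query.find(frag)
            while pos != -1:
                if pos <= start and end <= pos + len(frag):
                    # Move the hit to the front so later lookups try it first.
                    self.fragments.move_to_end(frag, last=False)
                    return True
                pos = query.find(frag, pos + 1)
        return False


def is_attack(query, store):
    """Flag the query if any critical token is not positively tainted."""
    return any(not store.covers(query, s) for s in critical_spans(query))
```

With only the hypothetical fragment `SELECT * FROM posts WHERE id=` in the store, a benign query such as `SELECT * FROM posts WHERE id=5` is covered by the first fragment tried, whereas a query with an appended `UNION SELECT` clause forces a scan of every fragment before it is flagged. Adding `OR` and `=` as fragments makes the payload `OR 1=1` pass, which is the PTI weakness discussed in the evasion section.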
        Original  With PTI  Exact Cache  Structure Cache
Read    0.2170    0.4440    0.2378       0.2255
Write   0.3319    0.8538    0.4441       0.3725

TABLE V: Average Read/Write time for a Wordpress request with PTI (seconds).

Table V characterizes performance overhead based on whether a Wordpress request is a read or a write request. A typical read request is to read a Wordpress post, whereas a write request might be to post a comment. Note that both types of request may result in multiple database queries.

For read requests, the use of a query cache to store previous PTI decisions, i.e., whether a given query has previously been deemed safe, reduces overhead to less than 4%. For write requests, the query cache also improves performance over the non-cached version, but incurs 34% overhead. The reason a Wordpress write request still benefits from caching is that posting a comment results in multiple database queries, some of which are database reads and so may have been cached.

Another caching mechanism was introduced to increase performance of write and other dynamic queries. The query structure cache caches the structure of the SQL query abstract-syntax-tree without the content of data nodes. This caching mechanism caches the safety result of all queries except those dynamically generated inside the application (such as advanced search). With this caching in place, write requests incur only a 12% overhead.

To support Joza's goal of ease-of-deployment, we deliberately chose not to implement PTI as a direct PHP extension, as it would have required administrator privileges to install or load. Our results estimate that implementing PTI as a PHP extension would incur only 0.2% overhead for read requests and 3.2% for write requests (as described in Section C).

B. NTI Optimization

A naive implementation of NTI's string matching algorithm would be too slow for practical use. Fortunately, previous work provides several optimizations [33], [28]. Joza uses PHP's internal Levenshtein distance function for short inputs and queries. As an internal PHP function, its implementation runs at native speed instead of being emulated by the PHP interpreter. When input or query length is larger than that supported by PHP's Levenshtein function, Joza uses an optimized Levenshtein function written in PHP that requires linear memory and time.

C. Joza Overall Evaluation

Figure 8 displays the time spent on PTI and NTI for a full site crawl (read), random comment posting (write) and random searching. NTI and PTI overheads and the total overhead of Joza can be observed in the figure for different types of requests.

Fig. 8: Comparison of read/write/search times with and without Joza's protection in Wordpress

The performance of Joza depends on the relative frequency of read vs. write requests. Table VI shows overhead for a variety of workloads. A workload consisting of 10% writes and 90% reads results in an overall overhead of 5%, whereas a workload of 99% reads and 1% writes results in an overall overhead of 4%.

We also estimate the cost of our design decision to implement Joza completely at the user-level. This estimate excludes daemon spawn and communication times from the calculations. A Joza system implemented as a direct PHP extension would incur only 1.7% overhead even with a workload consisting of 50% write requests, which would make Joza well-suited for performance-critical deployment scenarios with full administrative privileges.

Table VII lists the average number of new blog posts, pages, comments and RPC posts (posts written or read via third party applications) over the last five years, as well as the average number of annual page views on all blogs hosted on Wordpress.com [41], [40]. From these statistics, we compute the typical read/write workload for Wordpress.com, the official website for hosting Wordpress sites. On average, less than one percent of all requests involve writes, which would result in less than 4% overhead on average when protected by Joza.
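The workload overheads quoted above can be reproduced, to within rounding, as a weighted average of the per-request times in Table V. The sketch below is an illustration rather than the authors' measurement script: the function names are hypothetical, and pairing the Original column with the Structure Cache column is an assumption that happens to agree with the reported figures.

```python
# Average request times in seconds, copied from Table V.
PLAIN = {"read": 0.2170, "write": 0.3319}      # "Original" column
PROTECTED = {"read": 0.2255, "write": 0.3725}  # "Structure Cache" column (assumed)


def mean_time(times, write_fraction):
    """Expected time per request for a workload with the given write share."""
    return write_fraction * times["write"] + (1 - write_fraction) * times["read"]


def overhead(write_fraction):
    """Relative slowdown of the protected site versus the unprotected one."""
    plain = mean_time(PLAIN, write_fraction)
    protected = mean_time(PROTECTED, write_fraction)
    return protected / plain - 1


if __name__ == "__main__":
    for w in (0.50, 0.10, 0.05, 0.01):
        print(f"{w:.0%} writes: {overhead(w):.2%} overhead")
```

Under these assumptions a workload with 10% writes comes out at roughly 5% overhead and one with 1% writes at roughly 4%, in line with the figures in the text. Because reads dominate the weighted average, the read-path overhead largely determines the result for realistic workloads.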
Writes  Reads  Plain Time  Protected Time  Overhead
50%     50%    0.2744      0.2990          8.96%
10%     90%    0.2284      0.2402          5.16%
5%      95%    0.2227      0.2328          4.53%
1%      99%    0.2181      0.2269          4.03%

TABLE VI: Overhead of Joza on different workloads.

In practice, Joza's overall performance would be further improved by using content-caching engines. Heavily-trafficked Wordpress sites often make use of such caches. With content-caching enabled, only the first request to a URL results in the execution of database queries to serve up the requested page. Subsequent requests would be mostly served by retrieving a static cached copy, thereby reducing the demand on Joza's processing time.

VII. RELATED WORK

The appeal of taint inference techniques is that they obviate the need for propagating taint information during program execution. Previous PTI work focused on defeating OS command injection attacks for x86 binaries [22]. The work reported here widens the attack classes covered by PTI to include SQL injection attacks targeted towards web applications.

The vast majority of work in taint tracking uses a form of negative taint tracking, i.e., it tracks external (untrusted) data as it flows through a program and checks whether such data is used in a security-sensitive operation [11], [21], [26], [9], [42], [12], [13], [7]. Livshits provides an extensive review of dynamic taint tracking projects [16] and their potential pitfalls, including the difficulty of propagating taint markings across functions correctly. For example, neither PHPrevent [21] nor the PHP taint-tracking extension [14] models taint accurately across string functions that support complex regular expression patterns, e.g., preg_replace. Failure to model such functions accurately can result in increased false negative or false positive rates. Joza sidesteps this issue completely as it does not propagate taint markings across functions.

While most taint-tracking approaches keep track of external data, Halfond et al. use positive taint tracking to track internal (trusted) data [12], [13]. The primary tradeoff is that positive taint tracking potentially results in higher false positive rates (breaking application functionality), whereas negative taint tracking tilts towards higher false negatives (missing attacks). Halfond advocates the use of positive taint tracking as it provides a more conservative security posture.

CANDID and Diglossia detect command injections using shadow computations instead of tracking taint information directly. CANDID builds shadow query strings in which user input is replaced with known non-attack strings such as a sequence of 'a' characters [4]. Any structural difference in the parse tree of the real and shadow queries reveals an attack. Diglossia uses a complementary approach to generate shadow queries. Instead of transforming strings derived from external inputs, Diglossia remaps strings that originate from within the application into an alternative character set [29]. To detect an attack, Diglossia checks that the parse trees are syntactically isomorphic, and that all SQL code in the shadow parse tree is encoded with the alternative character set. Since CANDID and Diglossia seek to delineate data from code, they are also subject to the complexity of modeling complex string functions. For example, Diglossia does not model preg_replace().

Despite the large body of research with ample evidence of the effectiveness of taint-tracking techniques in defending web applications, taint-tracking is not widely deployed or used. To the best of our knowledge, Perl and Ruby are the only two major programming languages that provide built-in support for dynamic taint tracking [24], [25]. One reason for the lack of deployment is that propagating taint information often requires changes to the underlying run-time system [21], [26], [9], which hinders deployment as such changes typically require administrator privileges. Another potential reason is the perceived high cost of taint-tracking. While this perception is true for some projects (e.g., 2.2X for the ASPIS project on Wordpress [23]), others have reported average performance overhead in the 10-15% range when measured against various web application workloads [12], [13], [21], [26], [7].

SQLRand [5] uses randomization of critical SQL tokens to implement an alternate and secret SQL instruction set. This randomization is then reversed at run-time so that the database processes the original query. Without knowledge of the key used for randomization, an attacker cannot inject valid SQL tokens. Weatherwax relaxes the SQLRand requirement that the randomization key be kept secret by using redundant parallel execution in such a manner that a SQL token valid in one variant is guaranteed invalid in the other [37]. However, this approach is subject to the same limitations as SQLRand in that it requires the complete and accurate identification of all SQL tokens, a process which is very difficult to automate, or error-prone if done manually.

Year  Posts    Pages   Comments  RPC     Views        Total Dynamic  Writes %  Reads %
2014  40,537   6,789   50,341    5,767   14,426,095   103,434        0.71      99.29
2013  487,281  84,409  668,469   74,257  144,777,605  1,314,417      0.90      99.10
2012  351,612  79,046  468,318   40,725  112,329,943  939,701        0.83      99.17
2011  176,507  56,033  147,738   32,928  79,614,461   413,206        0.52      99.48
2010  145,696  39,519  126,097   21,155  49,021,659   332,467        0.67      99.33

TABLE VII: Wordpress.com statistics, raw numbers are divided by 10^3

VIII. CONCLUSIONS

This paper has shown that taint inference techniques offer many practical advantages including speed and ease of deployment, but the individual security of these approaches is weak. To address this weakness, we have developed a novel hybrid taint inferencing approach that synergistically combines the strengths of negative taint inference and positive taint inference. To illustrate the power of the hybrid approach, a prototype system, called Joza, was developed to automatically protect PHP-based applications. This paper discusses the architecture and implementation of Joza, which seamlessly and synergistically incorporates both negative and positive taint inference methods. Using Joza and Wordpress as a testbed, the paper shows that the hybrid approach is extremely effective at thwarting SQL injection attacks on Web applications without requiring developer effort and does so with negligible performance overhead.

ACKNOWLEDGMENT

This research was supported by the Air Force Research Laboratory (AFRL) contracts FA8650-10-C-7025 and FA8750-13-2-0096, the U.S. Department of Commerce (DOC) grant 01-79-14214, and the Commonwealth Research Commercialization Fund (CRCF) grant MF13-071-CS. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of AFRL, DOC, CRCF, or the U.S. Government.

REFERENCES

[1] Anonymized-for-review. Taintless: Taint tracking and inference analysis and breaking tool. In Black Hat USA, 2014.
[2] Anonymized-for-review. WP-SQLI-LAB: Wordpress SQL injection lab, February 2014.
[3] A. Axelsen. Wordpress advanced search widget.
[4] S. Bandhakavi, P. Bisht, P. Madhusudan, and V. Venkatakrishnan. Candid: preventing SQL injection attacks using dynamic candidate evaluations. In Proceedings of the 14th ACM conference on Computer and communications security, pages 12-24. ACM, 2007.
[5] S. W. Boyd and A. D. Keromytis. SQLrand: Preventing SQL injection attacks. In Applied Cryptography and Network Security, pages 292-302. Springer, 2004.
[6] M. Chartier. Wordpress wp-advanced-search plugin.
[7] E. Chin and D. Wagner. Efficient character-level taint tracking for java. In Proceedings of the 2009 ACM Workshop on Secure Web Services, SWS '09, pages 3-12, New York, NY, USA, 2009. ACM.
[8] E. DB. osCommerce 2.3.3.4 (geo_zones.php, zid param) SQL injection vulnerability.
[9] A. Futoransky, E. Gutesman, and A. Waissbein. A dynamic technique for enhancing the security and privacy of web applications. Proc. Black Hat USA, 2007.
[10] B. D. A. Guimaraes and M. Stampar. SQLmap, February 2014.
[11] V. Haldar, D. Chandra, and M. Franz. Dynamic taint propagation for java. In Proceedings of the 21st Annual Computer Security Applications Conference, pages 303-311, 2005.
[12] W. G. Halfond, A. Orso, and P. Manolios. Using positive tainting and syntax-aware evaluation to counter SQL injection attacks. In Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering, pages 175-185. ACM, 2006.
[13] W. G. Halfond, A. Orso, and P. Manolios. Wasp: Protecting web applications using positive tainting and syntax-aware evaluation. Software Engineering, IEEE Transactions on, 34(1):65-81, 2008.
[14] X. Hui. PECL PHP taint tracker.
[15] V. I. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. In Soviet physics doklady, volume 10, page 707, 1966.
[16] B. Livshits. Dynamic taint tracking in managed runtimes. Microsoft Research Technical Report, 2012.
[17] B. Martin, M. Brown, A. Paller, D. Kirby, and S. Christey. 2011 CWE/SANS top 25 most dangerous software errors. Common Weakness Enumeration, 7515, 2011.
[18] MITRE. CVE-2013-1453.
[19] MITRE. CVE-2014-3704.
[20] MITRE. CVE details (SQL injection).
[21] A. Nguyen-Tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans. Automatically hardening web applications using precise tainting. Springer, 2005.
[22] A. Nguyen-Tuong, J. D. Hiser, M. Co, J. W. Davidson, and J. C. Knight. To B or not to B: Blessing OS commands with software DNA shotgun sequencing. In 10th European Dependable Computing Conference, 2014.
[23] I. Papagiannis, M. Migliavacca, and P. Pietzuch. PHP Aspis: using partial taint tracking to protect against injection attacks. In 2nd USENIX Conference on Web Application Development, page 13, 2011.
[24] Perl taint mode.
[25] Ruby taint feature.
[26] T. Pietraszek and C. V. Berghe. Defending against injection attacks through context-sensitive string evaluation. In Recent Advances in Intrusion Detection, pages 124-145. Springer, 2006.
[27] D. Ray and J. Ligatti. Defining code-injection attacks. In ACM SIGPLAN Notices, volume 47, pages 179-190. ACM, 2012.
[28] R. Sekar. An efficient black-box technique for defeating web application attacks. In NDSS, 2009.
[29] S. Son, K. S. McKinley, and V. Shmatikov. Diglossia: detecting code injection attacks with precision and efficiency. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, pages 1181-1192. ACM, 2013.
[30] TC.K. Wordpress advance wp query search filter plugin.
[31] TC.K. Wordpress ultimate wp query search filter plugin.
[32] W. Techs. Historical trends in the usage of content management systems for websites, February 2014.
[33] E. Ukkonen. Algorithms for approximate string matching. Information and control, 64(1):100-118, 1985.
[34] A. van der Stock, J. Williams, and D. Wichers. Top 10 2007. Technical report, OWASP, 2007.
[35] Verizon. The 2013 data breach investigations report. Technical report, Verizon, 2013.
[36] G. Wassermann and Z. Su. Sound and precise analysis of web applications for injection vulnerabilities. In ACM Sigplan Notices, volume 42, pages 32-41. ACM, 2007.
[37] E. Weatherwax. Modeling Secretless Security in N-variant Systems. PhD thesis, University of Virginia, 2009.
[38] D. Wichers. Top 10 2013. Technical report, OWASP, 2013.
[39] J. Williams and D. Wichers. Top 10 2010. Technical report, OWASP, 2010.
[40] Wordpress.com. Wordpress.com posts statistics, 2014.
[41] Wordpress.com. Wordpress.com traffic statistics, February 2014.
[42] W. Xu, S. Bhatkar, and R. Sekar. Taint-enhanced policy enforcement: A practical approach to defeat a wide range of attacks. In Proceedings of the 15th USENIX Security Symposium, pages 121