Taint-Enhanced Policy Enforcement: A Practical Approach to Defeat a Wide Range of Attacks

Wei Xu, Sandeep Bhatkar, R. Sekar
Department of Computer Science
Stony Brook University, Stony Brook, NY 11794-4400
{weixu,sbhatkar,sekar}@cs.sunysb.edu

Abstract

Policy-based confinement, employed in SELinux and specification-based intrusion detection systems, is a popular approach for defending against exploitation of vulnerabilities in benign software. Conventional access control policies employed in these approaches are effective in detecting privilege escalation attacks. However, they are unable to detect attacks that hijack legitimate access privileges granted to a program, e.g., an attack that subverts an FTP server to download the password file. (Note that an FTP server would normally need to access the password file for performing user authentication.) Some of the common attack types reported today, such as SQL injection and cross-site scripting, involve such subversion of legitimate access privileges. In this paper, we present a new approach to strengthen policy enforcement by augmenting security policies with information about the trustworthiness of data used in security-sensitive operations. We evaluated this technique using 9 available exploits involving several popular software packages containing the above types of vulnerabilities. Our technique successfully defeated these exploits.

1 Introduction

Information flow analysis (a.k.a. taint analysis) has played a central role in computer security for over three decades [1, 10, 8, 30, 25]. The recent works of [22, 28, 5] demonstrated a new application of this technique, namely, detection of exploits on contemporary software. Specifically, these techniques track the source of each byte of data that is manipulated by a program during its execution, and detect attacks that overwrite pointers with untrusted (i.e., attacker-provided) data. Since this is an essential step in most buffer overflow and related attacks, and since benign uses of programs should never involve outsiders supplying pointer values, such attacks can be detected accurately by these new techniques.

In this paper, we build on the basic idea of using fine-grained taint analysis for attack detection, but expand its scope by showing that the technique can be applied to detect a much wider range of attacks prevalent today. Specifically, we first develop a source-to-source transformation of C programs that can efficiently track information flows at runtime. We combine this information with security policies that can reason about the source of data used in security-critical operations. This combination turns out to be powerful for attack detection, and offers the following advantages over previous techniques:

Practicality. The techniques of [28] and [5] rely on hardware-level support for taint-tracking, and hence cannot be applied to today's systems. TaintCheck [22] addresses this drawback, and is applicable to arbitrary COTS binaries. However, due to difficulties associated with static analysis (or transformation) of binaries, their implementation uses techniques based on a form of runtime instruction emulation [21], which causes a significant slowdown, e.g., Apache server response time increases by a factor of 10 while fetching 10KB pages. In contrast, our technique is much faster, increasing the response time by a factor of 1.1.

Broad applicability. Our technique is directly applicable to programs written in C, and several other scripting languages (e.g., PHP, Bash) whose interpreters are implemented in C. Security-critical servers are most frequently implemented in C. In addition, PHP and similar scripting languages are common choices for implementing web applications, and more generally, server-side scripts.

Ability to detect a wide range of common attacks. By combining expressive security policies with fine-grained taint information, our technique can address a broader range of attacks than previous techniques. Figure 1 shows the distribution of the 139 COTS software vulnerabilities reported in 2003 and 2004 in the most recent official CVE dataset (Ver. 20040901). Our technique is applicable for detecting exploitations of about 2/3rds of these vulnerabilities, including buffer overflows, format-string attacks, SQL injection, cross-site scripting, command and shell-code injection, and directory traversal. In contrast, previous approaches typically handled smaller attack classes, e.g., [7, 9, 2, 22, 28, 5] handle buffer overflows, [6] handles format string attacks, and [24, 23] handle injection attacks involving strings.

    Memory errors            27%
    Other logic errors       22%
    Command injection        15%
    Directory traversal      10%
    Input validation/DoS      9%
    Format string             4%
    Cross-site scripting      4%
    Tempfile                  4%
    Config errors             3%
    SQL injection             2%

Figure 1: Breakdown of CVE software security vulnerabilities (2003 and 2004)

The focus of this paper is on the development of practical fine-grained dynamic taint-tracking techniques, and on illustrating how this information can be used to significantly strengthen conventional access control policies. For this purpose, we use simple taint-enhanced security policies. Our experimental evaluation, involving readily available exploits that target vulnerabilities in several popular applications, shows that the technique is effective against these exploits. Nevertheless, many of these policies need further refinement before they can be expected to stand up to skilled attackers. Section 7.2 discusses some of the issues in policy refinement, but the actual development of such refined policies is not a focus area of this paper.

We have successfully applied our technique to several medium to large programs, such as the PHP interpreter (300KLOC+) and glibc, the GNU standard C library (about 1MLOC). By leveraging the low-level nature of the C language, our implementation works correctly even in the face of memory errors, type casts, aliasing, and so on. At the same time, by exploiting the high-level nature of C (as compared to binary code), we have developed optimizations that significantly reduce the runtime overheads of fine-grained dynamic taint-tracking.

Approach Overview. Our approach consists of the following steps:

Fine-grained taint analysis: The first step in our approach is a source-to-source transformation of C programs to perform runtime taint-tracking. Taint originates at input functions, e.g., a read or recv function used by a server to read network input. Input operations that return untrusted input are specified using marking specifications described in Section 4. In the transformed program, each byte of memory is associated with one bit (or more) of taint information. Logically, we can view the taint information as a bit-array tagmap, with tagmap[a] representing the taint information associated with the data at memory location a. As data propagates through memory, the associated taint information is propagated as well. Since taint information is associated with memory locations (rather than variables), our technique can ensure correct propagation of taint in the presence of memory errors, aliasing, type casts, and so on.

Policy enforcement: This step is driven by security policies that are associated with security-critical functions. There are typically a small number of such functions, e.g., system calls such as open and execve, library functions such as vfprintf, functions to access external modules such as an SQL database, and so on. The security policy associated with each such function checks its arguments for unsafe content.

Organization of the Paper. We begin with motivating attack examples in Section 2. Section 3 describes our source-code transformation for fine-grained taint-tracking. Our policy language and sample policies are described in Section 4. The implementation of our approach is described in Section 5, followed by the experimental evaluation in Section 6. Section 7 discusses implicit information flows and security policy refinement. Section 8 presents related work. Finally, concluding remarks appear in Section 9.

2 Motivation for Taint-Enhanced Policies

In this section, we present several motivating attack examples. We conclude by pointing out the integral role played by taint analysis as well as security policies in detecting these attacks.

2.1 SQL and Command Injection. SQL injection is a common vulnerability in web applications. These server-side applications communicate with a web browser client to collect data, which is subsequently used to construct an SQL query that is sent to a back-end database. Consider the statement (written in PHP) for constructing an SQL query used to look up the price of an item specified by the variable name.

    $cmd = "SELECT price FROM products WHERE name='" . $name . "'"

If the value of name is assigned from an HTML form field that is provided by an untrusted user, then an SQL injection is possible. In particular, an attacker can provide the following value for name:

    xyz'; UPDATE products SET price=0 WHERE name='OneCaratDiamondRing

With this value for name, cmd will take the value:

    SELECT ... WHERE name='xyz'; UPDATE products SET price=0 WHERE
    name='OneCaratDiamondRing'

Note that semicolons are used to separate SQL commands. Thus, the query constructed by the program will first retrieve the price of some item called xyz, and then set the price of another item called OneCaratDiamondRing to zero. This attack enables the attacker to purchase this item later for no cost.

Fine-grained taint analysis will mark as tainted every character in the query that originates from the value of name (the portion shown inside a box in the original layout). Now, a policy that precludes tainted control-characters (such as semicolons and quotes) or commands (such as UPDATE) in the SQL query will defeat the above attack. A more refined policy is described in Section 7.2.

Command injection attacks are similar to SQL injection: they involve untrusted inputs being used to construct commands executed by command interpreters (e.g., bash) or the argument to the execve system call.

2.2 Cross-Site Scripting (XSS). Consider an example of a bank that provides an ATM locator web page that customers can use to find the nearest ATM machine, based on their ZIP code. Typically, the web page contains a form that submits a query to the web site, which looks as follows:

    http://www.xyzbank.com/findATM?zip=90100

If the ZIP code is invalid, the web site typically returns an error message such as:

    <HTML> ZIP code not found: 90100 </HTML>

Note that in the above output from the web server, the user-supplied string 90100 is reproduced. This can be used by an attacker to construct an XSS attack as follows. To do this, the attacker may send an HTML email to an unsuspecting user, which contains text such as:

    To claim your reward, please click <a href="http://www.xyzbank.com/findATM?zip=<script src='http://www.attacker.com/malicious_script.js'></script>">here</a>

When the user clicks on this link, the request goes to the bank, which returns the following page:

    <HTML> ZIP code not found: <script src='http://www.attacker.com/malicious_script.js'></script> </HTML>

The victim's browser, on receiving this page, will download and run Javascript code from the attacker's web site. Since the above page was sent from http://www.xyzbank.com, this script will have access to sensitive information stored on the victim computer that pertains to the bank, such as cookies. Thus, the above attack will allow cookie information to be stolen. Since cookies are often used to store authentication data, stealing them can allow attackers to perform financial transactions using the victim's identity.

Fine-grained taint analysis will mark every character in the zip code value as tainted. Now the above cross-site scripting attack can be prevented by disallowing tainted script tags in the web application output.

2.3 Memory Error Exploits. There are many different types of memory error exploits, such as stack-smashing, heap-overflows and integer overflows. All of them share the same basic characteristics: they exploit bounds-checking errors to overwrite security-critical data, almost always a code pointer or a data pointer, with attacker-provided data. When fine-grained taint analysis is used, it will mark the overwritten pointer as tainted. Now, this attack can be stopped by a policy that prohibits dereferencing of tainted pointers.

2.4 Format String Vulnerabilities. The printf family of functions (which provide formatted printing in C) take a format string as a parameter, followed by zero or more parameters. A common misuse of these functions occurs when untrusted data is provided as the format string, as in the statement printf(s). If s contains an alphanumeric string, then this will not cause a problem, but if an attacker inserts format directives in s, then she can control the behavior of printf. In the worst case, an attacker can use the %n format directive, which can be used to overwrite a return address with attacker-provided data, and execute injected binary code. When fine-grained taint analysis is used, the format directives (such as %n) will be marked as tainted. The above attack can then be prevented by a taint-enhanced policy that disallows tainted format directives in the format string argument to the printf family of functions.

2.5 Attacks that Hijack Access Privileges. In this section, we consider attacks that attempt to evade detection by staying within the bounds of normal accesses made by an application. These attacks are also referred to as the confused deputy attacks [13].

Consider a web browser vulnerability that allows an attack (embedded within a web page) to upload an arbitrary file f owned by the browser user without the user's consent. Since the browser itself needs to access many of the user's files (e.g., cookies), a policy that prohibits access to f may prevent normal browser operations. Instead,
we need a policy that can infer whether the access is being made during the normal course of an operation of the browser, or due to an attack. One way to do this is to take the taint information associated with the file name. If this file is accessed during normal browser operation, the file name would have originated within its program text or from the user. However, if the file name originated from a remote web site (i.e., an untrusted source), then it is likely to be an attack. Similar examples include attacks on (a) P2P applications to upload (i.e., steal) user files, and (b) FTP servers to download sensitive files such as the password file that are normally accessed by the server.

A variant of the above scenario occurs in the context of directory traversal attacks, where an attacker attempts to access files outside of an authorized directory, e.g., the document root in the case of a web server. Typically, this is done by including ".." characters in file names to ascend above the document root. In case the victim application already incorporates checks for ".." characters, attacker may attempt to evade this check by replacing "." with its hexadecimal or Unicode representation, or by using various escape sequences. A taint-enhanced policy can be used to selectively enforce a more restrictive policy on file access when the file name is tainted, e.g., accesses outside of the document root directory may be disallowed. Such a policy would not interfere with the web server's ability to access other files, e.g., its access log or error log.

The key point about all attacks discussed in this section is that conventional access control policies cannot detect them. This is because the attacks do not stray beyond the set of resources that are normally accessed by a victim program. However, taint analysis provides a clue to infer the intended use of an access. By incorporating this inferred intent in granting access requests, taint-enhanced policies can provide better discrimination between attacks and legitimate uses of the privileges granted to a victim application.

2.6 Discussion. The examples discussed above bring out the following important points:

Importance of fine-grained taint information. If we used coarser granularity for taint-tracking, e.g., by marking a program variable as tainted or untainted, we would not be able to detect most of the attacks described above. For instance, in the case of SQL injection example, the variable cmd containing the SQL query will always be marked as tainted, as it derives part of its value from an untrusted variable name. As a result, we cannot distinguish between legitimate uses of the web application, when name contains an alphanumeric string, from an attack, when name contains characters such as the semicolon and SQL commands. A similar analysis can be made in the case of stack-smashing and format-string attacks, cross-site scripting, directory traversal, and so on.

Need for taint-enhanced policies. It is not possible to prevent these attacks by enforcing conventional access control policies. For instance, in the SQL injection example, one cannot use a policy that uniformly prevents the use of semicolons and SQL commands in cmd: such a policy would preclude any use of the database, and cause the web application to fail. Similarly, in the memory error example, one cannot have a working program if all control transfers through pointers are prevented. Finally, the examples in Section 2.5 were specifically chosen to illustrate the need for combining taint information into policies.

Another point to be made in this regard is that attacks are not characterized simply by the presence or absence of tainted information in arguments to security-critical operations. Instead, it is necessary to develop policies that govern the manner in which tainted data is used in these arguments.

3 Transformation for Taint Tracking

There are three main steps in taint-enhanced policy enforcement: (i) marking, i.e., identifying which external inputs to the program are untrusted and should be marked as tainted, (ii) tracking the flow of taint through the program, and (iii) checking inputs to security-sensitive operations using taint-enhanced policies. This section discusses tracking, which is implemented using a source-to-source transformation on C programs. The other two steps are described in Section 4.

3.1 Runtime Representation of Taint

Our technique tracks taint information at the level of bytes in memory. This is necessary to ensure accurate taint-tracking for type-unsafe languages such as C, since the approach can correctly deal with situations such as out-of-bounds array writes that overwrite adjacent data. A one-bit taint-tag is used for each byte of memory, with a
'0' representing the absence of taint, and a '1' representing the presence of taint. A bit-array tagmap stores taint information. The taint bit associated with a byte at address a is given by tagmap[a].

3.2 Basic Transformation

The source-code transformation described in this section is designed to track explicit information flows that take place through assignments and arithmetic and bit-operations. Flows that take place through conditionals are addressed in Section 7.1. It is unusual in C programs to have boolean-valued variables that are assigned the results of relational or logical operations. Hence we have not considered taint propagation through such operators in this paper.

At a high-level, explicit flows are simple to understand:

the result of an arithmetic/bit expression is tainted if any of the variables in the expression is tainted;

a variable x is tainted by an assignment x = e whenever e is tainted.

Specifically, Figure 2 shows how to compute the taint value T(E) for an expression E. Figure 3 defines how a statement S is transformed, and uses the definition of T(E). When describing the transformation rules, we use a simpler form of C (e.g. expressions have no side effects). In our implementation, we use the CIL [19] toolkit as the C front end to provide the simpler C form that we need.

    E           T(E)                  Comment
    c           0                     Constants are untainted
    v           tag(&v, sizeof(v))    tag(a, n) refers to n bits starting at tagmap[a]
    &E          0                     An address is always untainted
    *E          tag(E, sizeof(*E))
    (cast)E     T(E)                  Type casts don't change taint.
    op(E)       T(E)                  for arithmetic/bit op
                0                     otherwise
    E1 op E2    T(E1) || T(E2)        for arithmetic/bit op
                0                     otherwise

Figure 2: Definition of taint for expressions

    S                     Trans(S)
    v = E                 v = E; tag(&v, sizeof(v)) = T(E);
    S1; S2                Trans(S1); Trans(S2)
    if (E) S1 else S2     if (E) Trans(S1) else Trans(S2)
    while (E) S           while (E) Trans(S)
    return E              return (E, T(E))
    f(a) { S }            f(a, ta) { tag(&a, sizeof(a)) = ta; Trans(S) }
    v = f(E)              (v, tag(&v, sizeof(v))) = f(E, T(E))
    v = (*f)(E)           (v, tag(&v, sizeof(v))) = (*f)(E, T(E))

Figure 3: Transformation of statements for taint-tracking

The transformation rules are self-explanatory for most part, so we explain only the function-call related transformations. Consider a statement v = f(E), where f takes a single argument. We introduce an additional argument ta in the definition of f so that the taint tag associated with its (single) parameter could be passed in. ta is explicitly assigned as the taint value of a at the beginning of f's body. (These two steps are necessary since the C language uses call-by-value semantics. If call-by-reference were to be used, then neither step would be needed.) In a similar way, the taint associated with the return value has to be explicitly passed back to the caller. We represent this in the transformation by returning a pair of values as the return value. (In our implementation, we do not actually introduce additional parameters or return values; instead, we use a second stack to communicate the taint values between the caller and the callee.) It is straight-forward to extend the transformation rules to handle multi-argument functions.

We conclude this section with a clarification on our notion of soundness of taint information. Consider any variable x at any point during any execution of a transformed program, and let a denote the location of this variable. If the value stored at a is obtained from any tainted input through assignments and arithmetic/bit operations, then tagmap[a] should be set. Note that by referring to the location of x rather than its name, we require that taint information be accurately tracked in the presence of memory errors. To support this notion of soundness, we needed to protect the tagmap from corruption, as described in Section 3.4.

3.3 Optimizations

The basic transformation described above is effective, but introduces high overheads, sometimes causing a slowdown by a factor of 5 or more. To improve performance, we have developed several interesting runtime and compile-time optimizations that have reduced overheads significantly. More details about the performance can be found in Section 6.4.

3.3.1 Runtime Optimizations

In this section, we describe optimizations to the runtime data structures.

Use of 2-bit taint values. In the implementation, accessing of taint-bits requires several bit-masking, bit-shifting and unmasking operations, which degrade performance significantly. We observed that if 2-bit taint tags are used, the taint value for an integer will be contained within a single byte (assuming 32-bit architecture), thereby eliminating these bit-level operations. Since integer assignments occur very frequently, this optimization is quite effective.

This approach does increase the memory requirement for tagmap by a factor of two, but on the other hand, it opens up the possibility of tracking richer taint information. For instance, it becomes possible to associate different taint tags with different input sources and track them independently. Alternatively, it may be possible to use the two bits to capture degree of taintedness.

Allocation of tagmap. Initially, we used a global variable to implement tagmap. But the initialization of this huge array (1GB) that took place at the program start incurred significant overheads. Note that tag initialization is warranted only for static data that is initialized at program start. Other data (e.g., stack and heap data) should be initialized (using assignments) before use in a correctly implemented program. When these assignments are transformed, the associated taint data will also be initialized, and hence there is no need to initialize such taint data in the first place. So, we allocated tagmap dynamically, and initialized only the locations corresponding to static data. By using mmap for this allocation, and by performing the allocation at a fixed address that is unused in Linux (our implementation platform), we ensured that runtime accesses to tagmap elements will be no more expensive than that of a statically allocated array (whose base address is also determined at compile-time).

The above approach reduced the startup overheads, but
the mere use of address space seemed to tie up OS resources such as page table entries, and significantly increased time for fork operations. For programs such as shells that fork frequently, this overhead becomes unacceptable. So we devised an incremental allocation technique that can be likened to user-level page-fault handling. Initially, tagmap points to 1GB of address space that is unmapped. When any access to tagmap[i] is made, it results in a UNIX signal due to a memory fault. In the transformed program, we introduce code that intercepts this signal. This code queries the operating system to determine the faulting address. If it falls within the range of tagmap, a chunk of memory (say, 16KB) that spans the faulting address is allocated using mmap. If the faulting address is outside the range of tagmap, the signal is forwarded to the default signal handler.

3.3.2 Compile-time Optimizations

Use of local taint tag variables. In most C programs, operations on local variables occur much more frequently than global variables. Modern compilers are good at optimizing local variable operations, but due to possible aliasing, most such optimizations cannot be safely applied to global arrays. Unfortunately, the basic transformation introduces one operation on a global array for each operation on a local variable, and this has the effect of more than doubling the runtime of transformed programs. To address this problem we modified our transformation so that it uses local variables to hold taint information for local variables, so that the code added by the transformer can be optimized as easily as the original code.

Note, however, that the use of local tag variables would be unsound if aliasing of a local variable is possible. For example, consider the following code snippet:

    int x; int *y = &x;
    x = u; *y = v;

If u is untainted and v is tainted, then the value stored in x should be tainted at the end of the above code snippet. However, if we introduced a local variable, say, tag_x, to store the taint value of x, then we cannot make sure that it will get updated by the assignment to *y. To ensure that taint information is tracked accurately, our transformation uses local taint tag variables only in those cases where no aliasing is possible, i.e., the optimization is limited to simple variables (not arrays) whose address is never taken.

However, this alone is not enough, as aliasing may still be possible due to memory errors. For instance, a simple variable x may get updated due to an out-of-bounds access on an adjacent array, say, z. To eliminate this possibility, we split the runtime stack into two stacks. The main stack stores only simple variables whose addresses are never taken. This stack is also used for call-return. All other local variables are stored in the second stack, also called shadow stack.

The last possibility for aliasing arises due to pointer-forging. In programs with possible memory errors, a pointer to a local variable may be created. However, with the above transformation, any access to the main stack using a pointer indicates a memory error. We show how to implement an efficient mechanism to prevent access to some sections of memory in the transformed program. Using this technique, we prevent all accesses to the main stack except using local variable names, thus ensuring that taint information can be accurately tracked for the variables on the main stack using local taint tag variables.

Intra-procedural dependency analysis is performed to determine whether a local variable can ever become tainted, and to remove taint updates if it cannot. Note that a local variable can become tainted only if it is involved in an assignment with a global variable, a procedure parameter, or another local variable that can become tainted. Due to aliasing issues, this optimization is applied only to variables on the main stack.

3.4 Protecting Memory Regions

To ensure accurate taint-tracking, it is necessary to preclude access to certain regions of memory. Specifically, we need to ensure that the tagmap array itself cannot be written by the program. Otherwise, tagmap may be corrupted due to programming errors, or even worse, a carefully crafted attack may be able to evade detection by modifying the tagmap to hide the propagation of tainted data. A second region that needs to be protected is the main stack. Third, it would be desirable to protect memory that should not directly be accessed by a program, e.g., the GOT. (Global Offset Table is used for dynamic linking, but there should not be any reference to the GOT in the C code. If the GOT is protected in this manner, that would rule out attacks based on corrupting a function pointer in the GOT.)

The basic idea is as follows. Consider an assignment to a memory location a. Our transformation ensures that an access to tagmap[a] will be made before a is accessed. Thus, in order to protect a range of memory locations l to h, it is enough if we ensure that tagmap[l] through tagmap[h] will be unmapped. This is easy to do, given our incremental approach to allocation of tagmap. Now, any access to addresses l through h will result in a memory fault when the corresponding tagmap location is accessed.

Note that l and h cannot be arbitrary: they should fall on a 16K boundary, if the page size is 4KB and if 2-bit tainting is used. This is because mmap allocates memory blocks whose sizes are a multiple of a page size. This alignment requirement is not a problem for tagmap, since we can align it on a 16K boundary. For the main stack, a potential issue arises because the bottom of the stack holds environment variables and command-line arguments that are arrays. To deal with this problem, we first introduce a gap in the stack in main so that its top is aligned on a 16K boundary. The region of main stack above this point is protected using the above mechanism. This means that it is safe to use local tag variables in any function except main.

Attack type: Control-flow hijack
Policy:      jmp(addr) | addr matches (any+)^t → term()
Comment:     Tainted values cannot be used as a target of control transfer

Attack type: Format string
Policy:      Format = "%[^%]"
             vfprintf(fmt) | fmt matches any* (Format)^T any* → reject()
Comment:     Format directives (e.g. %n) should not be tainted

Attack type: Directory traversal
Policy:      DirTraversalModifier = ".."
             file_function(path) = open(path, _) || unlink(path) || ...
             file_function(path) | path matches any* (DirTraversalModifier)^T any* && escapeRootDir(path) → reject()
Comment:     If path contains tainted directory traversal strings (e.g. ".."), then the real path of path should not go outside the top level directories that are allowed to be accessed by the program, e.g. DocumentRoot and cgi-bin for httpd

Attack type: Cross-site scripting
Policy:      ScriptTag = "<script" | ...
             html_print_function(str) | str matches (StrIdNum | Delim)* (ScriptTag)^T any* → reject()
Comment:     No tainted script tags (e.g. <script>) should be output to HTML.

Attack type: SQL injection
Policy:      SqlMetachar = "'" | ";" | "/*" | ...
             sql_query_function(query) | query matches (StrIdNum | Delim)* (SqlMetachar)^T any* → reject()
Comment:     SQL query string should not contain tainted SQL meta-chars

Attack type: Shell command injection
Policy:      ShellMetachar = ";" | "&&" | ...
             shell_command_function(cmd) | cmd matches (StrIdNum | Delim)* (ShellMetachar)^T any* → reject()
Comment:     cmd argument of system or popen should not contain tainted shell meta-chars

Figure 4: Illustrative security policies

4 Marking and Policy Specification

4.1 Marking Trusted and Untrusted Inputs

Marking involves associating taint information with all the data coming from external sources. If all code, including libraries, is transformed, then marking needs to be specified for system calls that return inputs, for environment variables and command-line arguments. (If some libraries are not transformed, then marking specifications may be needed for untransformed library functions that perform inputs.) Note that we can treat command-line arguments and environment variables as arguments to main. Thus, marking specifications can, in every case, be associated with a function call.

Marking actions are specified using BMSL (Behavior Monitoring Specification Language) [29, 3], an event-based language that is designed to support specification of security policies and behaviors. BMSL specifications consist of rules of the form event_pattern → action. We use BMSL in a simplified way in this paper; in particular, event_pattern will be of the form event | condition, where event identifies a function. When this function returns, and (the optional) condition holds, action will be executed. The event corresponding to a function will take an additional argument that captures the return value from the function. Both the condition and the action can use external functions (written in C or C++). Moreover, the action can include arithmetic and logical operations, as well as if-then-else.

Consider the following example:

    read(fd, buf, size, rv) | (rv > 0) →
        if (isNetworkEndpoint(fd)) taint_buffer(buf, rv);
        else untaint_buffer(buf, rv);

This rule states that when the read function returns, the buf argument will be tainted, based on whether the read was from a network or not, as determined by the external function isNetworkEndpoint. The actual tainting is done using two support functions taint_buffer and untaint_buffer.

Note that every input action needs to have an associated marking rule. To reduce the burden of writing many rules, we provide default rules for all system calls that untaint the data returned by each system call. Specific rules that override these default rules, such as the rule given above, can then be supplied by a user.

4.2 Specifying Policies

Security policies are also written using BMSL, but these rules are somewhat different from the marking rules. For a policy rule involving a function f, its condition component is examined immediately before any invocation of f. To simplify the policy specification, abstract events can be defined to represent a set of functions that share the same security policy. (Abstract events can be thought of as macros.)

The definition of condition is also extended to support regular-expression based pattern matching, using the
keyword matches. We use taint-annotated regular expressions defined as follows. A tainted regular expression is obtained for a normal regular expression by attaching a superscript t, T or u. A string s will match a taint-annotated regular expression r^t provided that s matches r, and at least one of the characters in s is tainted. Similarly, s will match r^T provided all characters in s are tainted. Finally, s will match r^u provided none of the characters in s are tainted.

The predefined pattern any matches any single character. Parentheses and other standard regular expression operators are used in the usual way. Moreover, taint-annotated regular expressions can be named, and the name can be reused subsequently, e.g., StrIdNum used in many sample policy rules is defined as:

    StrIdNum = String | Id | Num

where String, Id and Num denote named regular expressions that correspond respectively to strings, identifiers and numbers. Also, Delim denotes delimiters.

Figure 4 shows the examples of a few simple policies to detect various attacks. The action component of these policies makes use of two support functions: term() terminates the program execution, while reject() denies the request and returns with an error.

For the control-flow hijack policy, we use a special keyword jmp as a function name, as we need some special way to capture low-level control-flow transfers that are not exposed as a function call in the C language. The policy states that if any of the bytes in the target address are tainted, then the program should be terminated.

For format string attacks, we only define a policy for vfprintf, because vfprintf is the common function used internally to implement all other printf family of functions. All format directives in a format string begin with a "%", and are followed by a character other than "%". (The sequence "%%" will simply print a "%", and hence can be permitted in the format string.)

Example policies to detect four other attacks, namely, directory traversal, cross-site scripting, SQL injection and shell command injection are also shown in Figure 4. The comments associated with the policies provide an intuitive description of the policy. These policies were able to detect all of the attacks considered in our evaluation, but we do not make any claim that the policies are good enough to detect all possible attacks in these categories. A discussion of how skilled attackers may evade some of these policies, and some directions for refining policies to stand up to such attacks, can be found in Section 7.2. The main strength of the approach presented in this paper is that the availability of fine-grained taint information makes it possible for a knowledgeable system administrator to develop such refined policies.

5 Implementation

We have implemented the program transformation technique described in Section 3. The transformer consists of about 3,600 lines of Objective Caml code and uses the CIL [19] toolkit as the front end to manipulate C constructs. Our implementation currently handles glibc (containing around 1 million LOC) and several other medium to large applications. The complexity and size of glibc demonstrated that our implementation can handle real-world code. We summarize some of the key issues involved in our implementation.

5.1 Coping with Untransformed Libraries

Ideally, all the libraries used by an application will be transformed using our technique so as to enable accurate taint tracking. In practice, however, source code may not be available for some libraries, or in rare cases, some functions in a library may be implemented in an assembly language. One option with such libraries is to do nothing at all. Our implementation is designed to work in these cases, but clearly, the ability to track information flow via untransformed functions is lost. To overcome this problem, our implementation offers two features. First, it produces warnings when a certain function could not be transformed. This ensures that inaccuracies will not be introduced into taint tracking without explicit knowledge of the user. When the user sees this warning, she may decide that the function in question performs largely read operations, or will never handle tainted data, and hence the warning can safely be ignored. If not, then our implementation supports summarization functions that specify how taint information is propagated by a function. For instance, we use the following summarization function for memcpy. Summarization functions are also specified in BMSL, and use support functions to copy taint information. A summarization function for f would be invoked in the transformed code when f returns.

    memcpy(dest, src, n) → copy_buffer_tagmap(dest, src, n);

So far, we had to write s
ummarizationfunctionsfortwoglibcfunctionsthatarewritteninassemblyandcopydata,namely,memcpyandmemset.Inaddition,gccreplacescallstosomefunctionssuchasstrcpyandstrdupwithitsowncode,necessitatinganadditional13summarizationfunctions.5.2InjectingMarking,CheckingandSumma-rizationCodeintoTransformedProgramsInourcurrentimplementation,themarkingspecica-tions,securitypolicies,andsummarizationcodeasso-ciatedwithafunctionfareallinjectedintothetrans-formedprogrambysimplyinlining(orexplicitlycall- CVE#ProgramLanguageAttacktypeAttackdescriptionCAN-2003-0201sambaCStacksmashingBufferoverowincalltrans2openfunctionCVE-2000-0573wu-ftpdCFormatstringviaSITEEXECcommandCAN-2005-1365picoserverCDirectorytraversalCommandexecutionviaURLwithmultipleleading/charac-tersand..CAN-2003-0486phpBB2.0.5PHPSQLinjectionviatopicidparameterCAN-2005-0258phpBB2.0.5PHPDirectorytraversalDeletearbitrarylevia..se-quencesinavatarselectparameterCAN-2002-1341SquirrelMail1.2.10PHPCrosssitescriptingInsertscriptviathemailboxpa-rameterinreadbody.phpCAN-2003-0990SquirrelMail1.4.0PHPCommandinjectionviameta-characterinTo:eldCAN-2005-1921PHPXML-RPCPHPCommandinjectionEvalinjectionCVE-1999-0045nph-test-cgiBASHShellmeta-characterexpansionusing'*'in$QUERYSTRINGFigure5:Attacksusedineffectivenessevaluationing)therelevantcodebeforeorafterthecalltof.Inthefuture,weanticipatethesecodetobedecoupledfromthetransformation,andbeabletooperateonbinariesus-ingtechniquessuchaslibraryinterposition.Thiswouldenableasiteadministratortoalter,reneorcustomizehernotionsoftrustworthyinputanddangerousargu-mentswithouthavingaccesstothesourcecode.6ExperimentalEvaluationThemaingoalofourexperimentswastoevaluateat-tackdetection(Section6.1),andruntimeperformance(Section6.4).Falsepositivesandfalsenegativesaredis-cussedinSections6.2and6.3.6.1AttackDetectionTable5showstheattacksusedinourexperiments.Theseattackswerechosentocovertherangeofattackcate-gorieswehavediscussed,andtospanmultipleprogram-minglanguages.Whereverpossible,weselectedattacksonwidely-usedapplica
tions,sinceitislikelythatobvi-oussecurityvulnerabilitiesinsuchapplicationswouldhavebeenxed,andhencewearemorelikelytodetectmorecomplexattacks.Intermsofmarking,allinputsreadfromnetwork(us-ingread,recvandrecvfrom)weremarkedastainted.SincethePHPinterpreterisconguredasamoduleforApache,thesametechniqueworksforPHPapplica-tionsaswell.NetworkdataistaintedwhenitisreadbyApache,andthisinformationpropagatesthroughthePHPinterpreter,andineffect,throughthePHPapplica-tionaswell.ThepoliciesusedinourattackexampleswerealreadydiscussedinSection4.Totestourtechnique,werstdownloadedthesoft-warepackagesshowninFigure5.Wedownloadedtheexploitcodefortheattacks,andveriedthattheyworkedasexpected.ThenweusedtransformedCprogramsandinterpreterswithpolicycheckingenabled,andveriedthateachoneoftheattackswerepreventedbythesepoli-cieswithoutraisingfalsealarms.NetworkServersinC.wu-ftpdversions2.6.0andlowerhaveaformatstringvulnerabilityinSITEEXECcommandthatallowsar-bitrarycodeexecution.Theattackisstoppedbythepolicythattheformatdirective%ninaformatstringshouldnotbetainted.sambaversions2.2.8andlowerhaveastack-smashingvulnerabilityinprocessingatypeofrequestcalledtransaction2open.Nopolicyisrequiredtostopthisattackthestack-smashingstependsupcorruptingsomedataontheshadowstackratherthanthemainstack,sotheattackfails.IfwehadusedanattackthatusesaheapoverowtooverwriteaGOTentry(whichiscommonwithheapoverows),thistoowouldbedetectedwithouttheneedforanypoliciesduetothetechniquedescribedinSection3.4forpreventingtheGOTfrombeingdi-rectlyaccessedbytheCcode.Thereasoningisthatbeforetheinjectedcodegetscontrol,theGOTentryhastobeclobberedbytheexistingcodeinthepro-gram.TheinstrumentationintheclobberingcodewillcauseasegmentationfaultbecauseoftheprotectionoftheGOT,andhencetheattackwillbeprevented.NotethattheGOTisnormallyusedbythePLT(ProcedureLinkageTable)codethatisintheassemblycodeau-tomaticallyaddedbythecompiler,andisnotintheCsourcecode,soanormalGOTaccesswillnotbein-strumentedwithchecksontainttags,andhencewillnotleadtoamemoryfault.If
the attack corrupted some other function pointer, then the jmp policy would detect the use of tainted data in the jump target and stop the attack.

Pico HTTP Server (pServ) versions 3.2 and lower have a directory traversal vulnerability. The web server does include checks for the presence of .. in the file name, but allows them as long as their use does not go outside the cgi-bin directory. To determine this, pServ scans the file name left-to-right, decrementing the count for each occurrence of .., and incrementing it for each occurrence of the / character. If the counter goes to zero, then access is disallowed. Unfortunately, a file name such as /cgi-bin////../../bin/sh satisfies this check, but has the effect of going outside the /cgi-bin directory. This attack is stopped by the directory traversal policy shown in Section 4.

Web Applications in PHP.

phpBB2 SQL injection vulnerability in (version 2.0.5 of) phpBB, a popular electronic bulletin board application, allows an attacker to steal the MD5 password hash of another user. The vulnerable code is:

    $sql = "SELECT p.post_id FROM ... WHERE ... AND p.topic_id = $topic_id AND ..."

Normally, the user-supplied value for the variable topic_id should be a number, and in that case, the above query works as expected. Suppose that the attacker provides the following value:

    -1 UNION SELECT ord(substring(user_password,5,1)) FROM phpbb_users WHERE userid=3/*

This converts the SQL query into a union of two SELECT statements, and comments out (using /*) the remaining part of the original query. The first SELECT returns an empty set since topic_id is set to -1. As a result, the query result equals the value of the SELECT statement injected by the attacker, which returns the 5th byte in the MD5 hash of the bulletin board user with the userid of 3. By repeating this attack with different values for the second parameter of substring, the attacker can obtain the entire MD5 password hash of another user. The SQL injection policy described in Section 4 stops this attack.

SquirrelMail cross-site scripting is present in version 1.2.10 of SquirrelMail, a popular web-based email client, e.g., read_body.php directly outputs values of user-controlled variables such as mailbox while generating HTML pages. The attack is stopped by the cross-site scripting policy in Section 4.

SquirrelMail command injection: SquirrelMail (Version 1.4.0) constructs a command for encrypting email using the following statement in the function gpgencrypt in the GPG plugin 1.1.

    $command .= " -r $send_to_list 2>&1";

The variable send_to_list should contain the recipient name in the "To" field, which is extracted using the parseAddress function of the Rfc822Header object in SquirrelMail. However, due to a bug in this function, some malformed entries in the "To" field are returned without checking for proper email format. In particular, by entering "<recipient>; <cmd>;" into this field, the attacker can execute any arbitrary command <cmd> with the privilege of the web server. By applying a policy that prohibits tainted shell meta-characters in the first argument to the popen function, this attack is stopped by our technique.

phpBB directory traversal: A vulnerability exists in phpBB, which, when the gallery avatar feature is enabled, allows remote attackers to delete arbitrary files using directory traversal. This vulnerability can be exploited by a two-step attack. In the first step, the attacker saves the file name, which contains .. characters, into the SQL database. In the second step, the file name is retrieved from the database and used in a command. To detect this attack, it is necessary to record taint information for data stored in the database, which is quite involved. We took a shortcut, and marked all data retrieved from the database as tainted. (Alternatively, we could have marked only those fields updated by the user as tainted.) This enabled the attack to be detected using the directory traversal policy.

phpxmlrpc/expat command injection: phpxmlrpc is a package written in PHP to support the implementation of PHP clients and servers that communicate using the XML-RPC protocol. It uses the expat XML parser for processing XML. phpxmlrpc versions 1.0 and earlier have a remote command injection vulnerability. Our command injection policy stops exploitations of this vulnerability.

Bash CGI Application.

nph-test-cgi is a CGI script that was included by default with Apache web server versions 1.0.5 and earlier. It prints out the values of the environment variables available to a CGI script. It uses the code echo QUERY_STRING=$QUERY_STRING to print the value of the query string sent to it. If the query string contains a * then bash will apply file name expansion to it, thus enabling an attacker to list any directory on the web server. This attack was stopped by a policy that restricted the use of tainted meta-characters in the argument to shell_glob_filename, which is the function used by bash for file name expansion. In terms of marking, the CGI interface defines the exact set of environment variables through which inputs are provided to a CGI application, and all these are marked as tainted.

6.2 False Positives

The policies described so far have been designed with the goal of avoiding false positives. We experimentally verified that false positives did not occur in our experiments involving the wu-ftpd server, the Apache web server, and the two PHP applications, phpBB and SquirrelMail. For wu-ftpd and Apache, we enabled the control flow hijack policy, format string policy, directory traversal policy, and shell command injection policy. For the PHP applications, we additionally enabled the SQL injection policy and cross-site scripting policy for the PHP interpreter.

To evaluate the false positives for Apache, we used the transformed server as our lab's regular web server that accepted real-world HTTP requests from the Internet for several hours. For the wu-ftpd server, we ran all the supported commands from an ftp client. To test phpBB and SquirrelMail, we went through all the menu items of these two Web applications, and performed normal operations that a regular user might do, such as registering a user, posting a message, searching for a message, managing the address book, moving messages between different mail folders, and so on. No false positives were observed in these experiments.

6.3 False Negatives

False negatives can arise due to (a) overly permissive policies, (b) implicit information flows, and (c) use of untransformed libraries without adequate summarization functions. We will discuss policy refinement and implicit flows in Section 7. As for external libraries, the best approach is to transform them, so that the need for summarization can be eliminated. If this cannot be done, then our transformation will identify all the external functions that are used by an application, so that errors of omission can be avoided. However, if a summarization function is incorrect, then it can lead to false negatives, false positives, or both.

6.4 Performance

Figure 7 shows the performance overheads, when the original and transformed programs were compiled using gcc 3.2.2 with -O2, and run on a 1.7GHz/512MB/Red Hat Linux 9.0 PC.

For server programs, the overhead of our approach is low. This is because they are I/O intensive, whereas our transformation adds overheads only to code that performs a significant amount of data copying within the program, and/or other CPU-intensive operations. For CPU-intensive C programs, the overhead is between 61% and 106%, with an average of 76%.

    Server Programs  Workload                                 Orig. Response Time  Overhead
    Apache-2.0.40    Webstone 30 clients downloading          0.036 sec/page       6%
                     5KB pages over 100Mbps network
    wu-ftpd-2.6.0    Download a 12MB file 10 times.           11.5 sec             3%
    postfix-1.1.12   Send one thousand 3KB emails             0.03 sec/mail        7%

    Figure 6: Performance overheads of servers. For Apache server, performance is measured in terms of latency and throughput degradation. For other programs, it is measured in terms of overhead in client response time.

    Program         Workload                                   Overhead (A)  Overhead (B)  Overhead (C)  Overhead (D)
    bc-1.06         Find factorial of 600.                     212%          68%           61%           61%
    enscript-1.6.4  Convert a 5.5MB text file into a PS file.  660%          529%          63%           58%
    bison-1.35      Parse a Bison file for C++ grammar.        134%          92%           79%           78%
    gzip-1.3.3      Compress a 12 MB file.                     228%          161%          110%          106%

    Figure 7: Performance overheads of CPU-intensive programs. Performance is measured in terms of CPU time. Overheads in different columns correspond to: (A) No optimizations, (B) Use of local tag variable, (C) B + Use of 2-bit taint value, (D) C + Use of dependency analysis.

6.4.1 Effect of Optimizations.

The optimizations discussed in Section 3.3 have been very effective. We comment further in the context of CPU-intensive benchmarks.

Use of local taint variables reduced the overheads by 42% to 144%. This is due to the reasons mentioned earlier: compilers such as gcc are very good at optimizing operations on local variables, but do a poor job on global arrays. Thus, by replacing global tagmap accesses with local tag variable accesses, significant performance improvement can be obtained.

Most programs access local variables much more frequently than global variables. For instance, we found out (by instrumenting the code) that 99% of accesses made by bc are to local variables. A figure of 90% is not at all uncommon. As a result, the introduction of local tag variables leads to dramatic performance improvement for such programs. For programs that access global variables frequently, such as gzip, which has 41% of its accesses going to global variables, the performance improvements are less striking.

tagmap optimizations are particularly effective for programs that operate mainly on integer data. This is because of the use of 2-bit taint tags, which avoids the need for bit-masking and shifts to access taint information. As a result we see significant overhead reduction in the range of 7% to 466%.

Intraprocedural analysis and optimization further reduces the overhead by up to 5%. The gains are modest because gcc optimizations have already eliminated most local tag variables after the previous step.

When combined, these optimizations reduce the overhead by a factor of 2 to 5.

7 Discussion

7.1 Support for Implicit Information Flow

Implicit information flow occurs when the values of certain variables are related by virtue of program logic, even though there are no assignments between them. A classic example is given by the code snippet [25]:

    x = x % 2; y = 0; if (x == 1) y = 1;

Even though there are no assignments involving x and y, their values are always the same. The need for tracking such implicit flows has long been recognized. [11] formalized implicit flows using a notion of noninterference. Several recent research efforts [18, 30, 20] have developed techniques based on this concept.

Noninterference is a very powerful property, and can capture even the least bit of correlation between sensitive data and other data. For instance, in the code:

    if (x > 10000) error = true;
    if (!error) { y = "/bin/ls"; execve(y); }

there is an implicit flow from x to error, and then to y. Hence, a policy that forbids tainted data to be used as an execve argument would be violated by this code. This example illustrates why noninterference may be too conservative (and hence lead to false positives) in our application. In the context of the kinds of attacks we are addressing, attackers usually need more control over the value of y than the minimal relationship that exists in the code above. Thus, it is more appropriate to track explicit flows. Nevertheless, there can be cases where substantial information flow takes place without assignments, e.g., in the following if-then-else, there is a direct flow of information from x to y on both branches, but our formulation of explicit information flow would only detect the flow in the else statement.

    if (x == 0) y = 0; else y = x;

The goal of our approach is to support those implicit flows where the value of one variable determines the value of another variable. By using this criterion, we seek a balance between tracking necessary data value propagation and minimizing false positives. Currently, our implementation supports two forms of implicit flows that appear to be common in C programs.

Translation tables. Decoding is sometimes implemented using a table lookup, e.g.,

    y = translation_tab[x];

where translation_tab is an array and x is a byte of input. In this case, the value of x determines the value of y although there is no direct assignment from x to y. To handle this case, we modify the basic transformation so that the result of an array access is marked as tainted whenever the subscript is tainted. This successfully handles the use of translation tables in the PHP interpreter.

Decoding using if-then-else/switch. Sometimes, decoding is implemented using a statement of the form:

    if (x == '+') y = ' ';

(Such code is often used for URL-decoding.) Clearly, the value of y can be determined by the value of x. More generally, switch statements could be used to translate between multiple characters. Our transformation handles them in the same way as a series of if-then-else statements. Specifically, consider an if-then-else statement of the form:

    if (x == E) { ... y = E'; ... }

If E and E' are constant-valued, then we add a tag update tag(y) = tag(x) immediately before the assignment to y.

While our current technique seems to identify some of the common cases where implicit flows are significant, it is by no means comprehensive. Development of a more systematic approach that can provide some assurances about the kinds of implicit flows captured, while ensuring a low false positive rate, is a topic of future research.

7.2 Policy Refinement

Policy development effort is an important concern with any policy enforcement technique. In particular, there is a trade-off between policy precision and the level of effort required. If one is willing to tolerate false positives, policies that produce very few false negatives can be developed with modest effort. Alternatively, if false negatives can be tolerated, then false positives can be kept to a minimum with little effort. To contain both false positives and false negatives, more effort needs to be spent on policy development, taking into account application-specific or installation-specific characteristics.

The above remarks about policy-based techniques are generally applicable to our approach as well. For the format string attack, we used a policy that tended to err on the side of producing false positives, by disallowing all use of tainted format directives. However, it is conceivable that some applications may be prepared to receive a subset of format directives in untrusted inputs, and handle them correctly. In such cases, this application knowledge can be used by a system administrator to use a less restrictive policy, e.g., allowing the use of format directives other than %n. This should be done with care, or else it is possible to write policies that prevent the use of %n, but allow the use of variants such as %5n that have essentially the same effect. Alternatively, the policy may be relaxed to permit specific exceptions to the general rule that there be no format directives, e.g., the rule:

    vfprintf(fmt) | fmt matches any* (Format)^T any* && (!(fmt matches "[^%]*%s[^%]*")) → reject()

allows the use of a single %s format directive from untrusted sources, in addition to permitting format strings that contain untainted format directives.

The directory traversal policy also tends to err on the side of false positives, since it precludes all accesses outside the authorized top level directories (e.g., DocumentRoot and cgi-bin) of a web server if components of the file name being accessed are coming from untrusted sources. In devising this policy, we relied on application-specific knowledge, namely, the fact that web servers do not allow clients to access files outside the top level directories specified in the server configuration file. Another point to be noted about this policy is that variants of directory traversal attack that do not escape these top level directories, but simply attempt to fool per-directory access controls, are not addressed by our policy.

The control-flow hijack policy is already accurate enough to capture all attacks that use corruption of code pointers as the basis to alter the control-flow of programs, so we proceed to discuss the SQL injection policy. The policy shown in Figure 4 does not address attacks that inject only SQL keywords (e.g., the UNION operation) to alter the meaning of a query. This can be addressed by a policy based on tokenization. The idea
is to perform a lexical analysis on the SQL query to break it up into tokens. SQL injection attacks are characterized by the fact that multiple tokens appear in the place of one, e.g., multiple keywords and meta-characters were provided by the attacker in the place of a simple string value in the attack examples discussed earlier in the paper. Thus, systematic protection from SQL injections can be obtained using a policy that prevents tainted strings from spanning multiple tokens. A similar approach is suggested in [24], although the conditions are not defined as precisely. Su et al [27] provide a formal characterization of SQL injection using a syntax analysis of SQL queries. The essential idea is to construct a parse tree for the SQL query, and to examine its subtrees. For any subtree whose root is tainted, all the nodes below that subtree should be tainted as well. In other words, tainted input cannot straddle different syntactic constructs. This is a further refinement over the characterization we suggest, where tainted input should not straddle different lexical entities.

Command injection attacks are similar to SQL injection attacks in many ways, and hence a tokenization-based policy may be a good choice for them as well. For this reason, we omit a detailed discussion of command injection policies. Nevertheless, it should be mentioned that there are some differences between SQL and command injection, e.g., shell syntax is much more complex than SQL syntax. Moreover, we may want to restrict the command names so that they are not tainted.

Note that tokenization is a lexical analysis task that is (almost invariably) implemented using regular expression based specifications. Thus, the above tokenization-based policy is amenable to expression using our policy language. One could argue that a regular expression to recognize tokens would be complex, and hence a policy may end up using a simpler approximation to tokenization. This discussion shows that the usual trade-off in policy based attack detection between accuracy and policy complexity continues in the case of taint-enhanced policies as well. Nevertheless, it should be noted that for a given policy development effort, taint-enhanced policies seem to be significantly more accurate than policies that do not incorporate any knowledge about taint.

Finally, we discuss the cross-site scripting attack. The policy discussed earlier does not address variations of the basic attack, e.g., attackers can evade this policy by injecting the malicious script code in onmouseover=malicious() or <img src="javascript:malicious()">, which is not a block enclosed by the script tag. To detect these XSS variations, one has to understand the different HTML tag patterns in which a malicious script can be injected into dynamic HTML pages, and develop policies to prevent the use of such tainted patterns in HTML outputs.

In summary, although the example policies shown in Figure 4 were able to stop the attacks in our experiments, many of them need further improvement before they can stand up to skilled attackers who are knowledgeable about the policies being enforced. We outlined ways to improve some of these policies, but a comprehensive solution to the policy development problem is not really the focus or contribution of this paper. Instead, our contribution is to show the feasibility and practicality of using fine-grained taint information in developing policy-based attack protection. The availability of fine-grained taint information makes our policies significantly more precise than traditional access-control policies. Moreover, our approach empowers system administrators and security professionals to update and refine these policies to improve protection, without having to wait for patches for a newly discovered attack avenue.

8 Related Work

Memory Error Exploit Detection. Buffer overflows and related memory errors have received a lot of attention, and several efficient techniques have been developed to address them. Early approaches such as StackGuard [7] and ProPolice [9] focused on just a single class of attacks. Recently, more general techniques based on randomization have been developed, and they promise to defend against most memory error exploits [16, 2]. However, due to the nature of the C language, these methods still cannot detect certain types of attacks, e.g., overflows from an array within a structure to an adjacent variable. Fine-grained taint analysis can capture these attacks whenever the corrupted data is used as an argument in a sensitive operation. (This is usually the case, since the goal of an attacker in corrupting that data was to perform a security-sensitive operation.) Although our overheads are generally higher than those of the techniques mentioned above, we believe that they are more than compensated by the increase in attack coverage.

Fine-Grained Taint Analysis. The key distinctions between our work and previous fine-grained taint analysis techniques of [22, 28, 5] were already discussed in the introduction, so we limit our discussion to the more technical points here. As mentioned earlier, [28, 5] rely on hardware support for taint-tracking. [22] is closer to our technique than these two techniques. It has an advantage over ours in that it can operate on arbitrary COTS binaries, whereas we require access to the C source code. This avoids problems such as hand-written assembly code. Their main drawback is performance: on Apache, the application for which they provide performance numbers, their overheads are more than 100 times higher than ours. This is because (a) they rely on Valgrind, which in itself introduces more than 40 times overheads as compared to our technique, and (b) they are constrained by having to work on binary code, and without the benefit of static analyses and optimizations that have gone into our work. (Here, we are not only referring to our own analyses and optimizations, but also many of the optimizations implemented in the GCC compiler that we used to compile the transformed programs.) There are several other technical differences between our work and that of [22]. For instance, they track 32 bits of taint information for each byte of data, whereas we use 2 bits. Another important difference is our support for implicit flows, which are not handled in [22].

Dynamic Taint Based Techniques for Detecting Attacks on Web Applications. Independently and in parallel to our work, which first appeared in [33], [23] and [24] have proposed the idea of using fine-grained taint analysis to detect injection attacks on web applications. The implementations of [23] and [24] are very similar, using hand-transformation of the PHP interpreter to track taint data. However, [24] provides a more detailed formulation and discussion of the problem, so we focus on this work here. They explain that these injection attacks are the result of ad hoc serialization of complex data such as SQL queries or shell commands, and develop a detection technique called context-sensitive string evaluation (CSSE), which involves checking the use of tainted data in strings. Our work improves over theirs in several ways. First, by working at the level of the C language, we are able to handle many more applications: most server programs that are written in C, as well as programs written in interpreted languages such as PHP, bash and so on. Second, our formulation of the problem as taint-enhanced policy enforcement is more general, and can be applied to stealthy attacks such as those discussed in Section 2 that do not involve serialization problems; and to attacks involving arbitrary types of data rather than being limited to strings. Third, our approach relies on a simple transformation that is shown in Section 3, and implemented using 3.6 KLOC of code, while their approach relies on manual transformation of a large piece of software that has over 300 KLOC. Other technical contributions of our work include (a) the development of a simple policy language for concise specification of taint-enhanced policies, and (b) support for implicit flows that allow us to provide some support for character encodings and translations.

As discussed in Section 7, Su et al [27] describe a technique for detecting SQL injection attacks using syntax analysis. Their main focus is on providing a precise and formal characterization of SQL injection attacks. However, their implementation of taint tracking is not very reliable. In particular, they suggest a technique that avoids runtime operations for taint-tracking by bracketing each input string with two special symbols that surround untrusted input strings. Assuming that these brackets would be propagated together with input strings, checking for the presence of taint would reduce to checking for the presence of these special symbols. However, this assumption does not hold for programs that extract parts of their input and use them, e.g., a web application may remove non-alphanumeric characters from an input string and use the result, and this process would likely discard the bracketing characters. In other cases, a web application may parse a user input into multiple fields, and use each field independently, once again causing the special symbols to be lost.

Manual Approaches for Correcting Input Validation Errors. Taint analysis targets vulnerabilities that arise due to missing or incorrect input validation code. One can manually review the code, and try to add all the necessary input validation checks. However, the notion of validity is determined by the manner in which the input is used. Thus, one has to trace forward in the program to identify all possible uses of an input in security sensitive operations, which is a very time-consuming and error-prone task. If we try to perform the validation check at the point of use, we face the problem that the notion of validity depends on the data source. For instance, it is perfectly reasonable for an SQL query to contain semicolons if these originated within the program text, but not so if they came from external input. Thus, we have to trace back from security-sensitive operations to identify how their arguments were constructed, once again having to manually examine a large number of program paths. This leads to situations where validation checks are left out on some paths, and possibly duplicated on others. Moreover, the validation checks themselves are notoriously hard for programmers to code correctly, and have frequently been the source of vulnerabilities.

Information Flow. Information flow analysis has been researched for a long time [1, 10, 8, 18, 30, 20, 25]. Early research was focused on multi-level security, where fine-grained analysis was not deemed necessary [1]. More recent work has been focused on tracking information flow at variable level, and many interesting research results have been produced. While these techniques are promising for protecting privacy and integrity of sensitive data, as discussed in Section 2, the variable-level granularity is insufficient for detecting most attacks discussed in this paper.

Static Analysis. Static taint analysis techniques have been proposed by many for finding security vulnerabilities, including input validation errors in web applications [17, 14, 32], user/kernel pointer bugs [15], format string bugs [26], and bugs in placement of authorization hooks [34]. The main advantage of static analysis (as compared to runtime techniques) is that all potential vulnerabilities can be found statically, while its drawback is a relative lack of accuracy. In particular, these techniques typically detect dependencies rather than vulnerabilities. For instance, [17] will produce a warning whenever untrusted data is used in any manner in an SQL query. This may not be very useful if such a dependency is an integral part of application logic. To solve this problem, the concept of endorsement can be used to indicate "safe" dependencies. Typically, this is done by first performing appropriate validation checks on a piece of untrusted data, and then endorsing it to indicate that it is safe to use (i.e., no longer tainted). However, programmers are still responsible for determining what is safe; as discussed before, there is no easy way for them to do this.

An important difference between our work and static analysis is one of intended audience. Static analysis based tools are typically intended for use by developers, since they need detailed knowledge about program logic to determine where to introduce endorsements, and what validation checks need to be made before endorsement. In contrast, the audience for our tool is a system administrator or an outside security engineer who lacks detailed knowledge of application code.

Other Techniques. SQLrand [4] defeats SQL injection by randomizing the textual representation of SQL commands. A drawback of this approach, as compared to the technique presented in this paper, is that it requires manual changes to the program so that the program uses the modified representation for SQL commands generated by itself. Our approach was inspired by the effect achieved by SQLrand, namely, that of distinguishing commands generated by the application from those provided by untrusted users.

AMNESIA [12] is another interesting approach for detecting SQL injection attacks. It uses a static analysis of Java programs to compute a finite-state machine model that captures the lexical structure of SQL queries issued by a program. SQL injection attacks cause SQL queries issued by the program to deviate from this model, and are hence detected. A key benefit of this approach is that by using static analysis, it can avoid runtime taint-tracking, and is hence much more efficient than our approach. Although this approach has been demonstrated to work well for SQL injections, the conservative nature of its static analysis and its inability to distinguish different sources of inputs can lead to a higher rate of false positives when applied to other types of attacks.

Perl has a taint mode [31] that tracks taint information at a coarse granularity: that of variables. In Perl, one has to explicitly untaint data before using it in a security sensitive context. This is usually done after performing appropriate validations. In our approach, due to the flexibility provided by our policy language, we have not faced a need for such explicit untainting. Nevertheless, if a user explicitly wants to trust some input, a primitive can be easily added to support this.

9 Conclusion

In this paper, we presented a unified approach that addresses a wide range of commonly reported attacks that exploit software implementation errors. Our approach is based on a fully automatic and efficient taint analysis technique that can track the flow of untrusted data through a program at the granularity of bytes. Through experiments, we showed that our technique can be applied to different types of applications written in multiple programming languages, and that it is effective in detecting attacks without producing false positives.

We believe that a number of software vulnerabilities arise due to the fact that security checks are interspersed throughout the program, and it is often difficult to check if the correct set of checks is being performed on every program path, especially in complex programs where the control flows through many, many functions. By decoupling policies from application logic, our approach can provide a higher degree of assurance on the correctness of policies. Moreover, the flexibility of our approach allows site administrators and third parties to quickly develop policies to prevent new classes of attacks, without having to wait for patches.

Acknowledgments

This research was supported in part by an ONR grant N000140110967, NSF grants CNS-0208877 and CCR-0205376, and by a NYSTAR grant.

References

[1] D. E. Bell and L. J. LaPadula. Secure computer systems: Mathematical foundations. Technical Report MTR-2547, Vol. 1, MITRE Corp., Bedford, MA, 1973.
[2] S. Bhatkar, D. C. DuVarney, and R. Sekar. Address obfuscation: An efficient approach to combat a broad range of memory error exploits. In USENIX Security Symposium, August 2003.
[3] T. Bowen, D. Chee, M. Segal, R. Sekar, T. Shanbhag, and P. Uppuluri. Building survivable systems: An integrated approach based on intrusion detection and damage containment. In DISCEX, 2000.
[4] S. W. Boyd and A. D. Keromytis. SQLrand: Preventing SQL injection attacks. In International Conference on Applied Cryptography and Network Security (ACNS), pages 292-302, 2004.
[5] S. Chen, J. Xu, N. Nakka, Z. Kalbarczyk, and R. K. Iyer. Defeating memory corruption attacks via pointer taintedness detection. In IEEE International Conference on Dependable Systems and Networks (DSN), 2005.
[6] C. Cowan, M. Barringer, S. Beattie, and G. Kroah-Hartman. FormatGuard: Automatic protection from printf format string vulnerabilities. In USENIX Security Symposium, 2001.
[7] C. Cowan, C. Pu, D. Maier, J. Walpole, P. Bakke, S. Beattie, A. Grier, P. Wagle, Q. Zhang, and H. Hinton. Automatic detection and prevention of buffer-overflow attacks. In USENIX Security Symposium, 1998.
[8] D. E. Denning and P. J. Denning. Certification of programs for secure information flow. Communications of the ACM, 20(7):504-513, July 1977.
[9] H. Etoh and K. Yoda. Protecting from stack-smashing attacks. http://www.trl.ibm.com/projects/security/ssp/main.html, June 2000.
[10] J. S. Fenton. Memoryless subsystems. Computing Journal, 17(2):143-147, May 1974.
[11] J. Goguen and J. Meseguer. Security policies and security models. In IEEE Symposium on Security and Privacy, 1982.
[12] W. Halfond and A. Orso. AMNESIA: Analysis and monitoring for neutralizing SQL-in
jection.InIEEE/ACMInternationalCon-ferenceonAutomatedSoftwareEngineering(ASE),2005.[13]N.Hardy.Theconfuseddeputy:(orwhycapabilitiesmighthavebeeninvented).ACMSIGOPSOperatingSystemsReview,22(4):3638,October1988.[14]Y.-W.Huang,F.Yu,C.Hang,C.-H.Tsai,D.Lee,andS.-Y.Kuo.Securingwebapplicationcodebystaticanalysisandruntimepro-tection.InInternationalWorldWideWebConference,2004.[15]R.JohnsonandD.Wagner.Findinguser/kernelpointerbugswithtypeinference.InUSENIXSecuritySymposium,2004.[16]G.S.Kc,A.D.Keromytis,andV.Prevelakis.Counteringcode-injectionattackswithinstruction-setrandomization.InACMConferenceonComputerandCommunicationSecurity(CCS),2003.[17]V.B.LivshitsandM.S.Lam.FindingsecurityvulnerabilitiesinJavaapplicationswithstaticanalysis.InUSENIXSecuritySym-posium,2005.[18]J.McLean.Ageneraltheoryofcompositionfortracesetsclosedunderselectiveinterleavingfunctions.InIEEESymposiumonSecurityandPrivacy,pages7993,May1994.[19]S.McPeak,G.C.Necula,S.P.Rahul,andW.Weimer.CIL:In-termediatelanguageandtoolsforCprogramanalysisandtrans-formation.InConferenceonCompilerConstruction,2002.[20]A.C.Myers.JFlow:Practicalmostly-staticinformationowcontrol.InACMSymposiumonPrinciplesofProgrammingLan-guages(POPL),pages228241,Jan.1999.[21]N.NethercoteandJ.Seward.Valgrind:Aprogramsupervisionframework.InWorkshoponRuntimeVerication(RV),Boulder,Colorado,USA,July2003.[22]J.NewsomeandD.Song.Dynamictaintanalysisforautomaticdetection,analysis,andsignaturegenerationofexploitsoncom-moditysoftware.InNetworkandDistributedSystemSecuritySymposium(NDSS),2005.[23]A.Nguyen-Tuong,S.Guarnieri,D.Greene,J.Shirley,andD.Evans.Automaticallyhardeningwebapplicationsusingpre-cisetainting.In20thIFIPInternationalInformationSecurityConference,2005.[24]T.PietraszekandC.V.Berghe.Defendingagainstinjectionat-tacksthroughcontext-sensitivestringevaluation.InRecentAd-vancesinIntrusionDetection(RAID),2005.[25]A.SabelfeldandA.C.Myers.Language-basedinformation-owsecurity.IEEEJ.SelectedAreasinCommunications,21(1),Jan.2003.[26]U.Shankar,K.Talwar,J.S.F
oster,andD.Wagner.Detectingformatstringvulnerabilitieswithtypequaliers.InUSENIXSe-curitySymposium,2001.[27]Z.SuandG.Wassermann.Theessenceofcommandinjectionattacksinwebapplications.InACMSymposiumonPrinciplesofProgrammingLanguages(POPL),January2006.[28]G.E.Suh,J.W.Lee,D.Zhang,andS.Devadas.Securepro-gramexecutionviadynamicinformationowtracking.InInter-nationalConferenceonArchitecturalSupportforProgrammingLanguagesandOperatingSystems,pages8596,Boston,MA,USA,2004.[29]P.UppuluriandR.Sekar.Experienceswithspecicationbasedintrusiondetection.InproceedingsoftheRecentAdvancesinIntrusionDetectionconference,October2001.[30]D.Volpano,G.Smith,andC.Irvine.Asoundtypesystemforsecureowanalysis.JournalofComputerSecurity,4(3):167187,1996.[31]L.Wall,T.Christiansen,andR.Schwartz.ProgrammingPerl.O'Reilly,1996.[32]Y.XieandA.Aiken.Staticdetectionofsecurityvulnerabilitiesinscriptinglanguages.InUSENIXSecuritySymposium,2006.[33]W.Xu,S.Bhatkar,andR.Sekar.Practicaldynamictaintanalysisforcounteringinputvalidationattacksonwebapplications.Tech-nicalReportSECLAB-05-04,DepartmentofComputerScience,StonyBrookUniversity,May2005.[34]X.Zhang,A.Edwards,andT.Jaeger.UsingCQualforstaticanalysisofauthorizationhookplacement.InUSENIXSecuritySymposium,2002.