CUTE: A Concolic Unit Testing Engine for C

Koushik Sen, Darko Marinov, Gul Agha
Department of Computer Science
University of Illinois at Urbana-Champaign
{ksen,marinov,agha}@cs.uiuc.edu

ABSTRACT

In unit testing, a program is decomposed into units which are collections of functions. A part of a unit can be tested by generating inputs for a single entry function. The entry function may contain pointer arguments, in which case the inputs to the unit are memory graphs. The paper addresses the problem of automating unit testing with memory graphs as inputs. The approach used builds on previous work combining symbolic and concrete execution, and more specifically, using such a combination to generate test inputs to explore all feasible execution paths. The current work develops a method to represent and track constraints that capture the behavior of a symbolic execution of a unit with memory graphs as inputs. Moreover, an efficient constraint solver is proposed to facilitate incremental generation of such test inputs. Finally, CUTE, a tool implementing the method, is described together with the results of applying CUTE to real-world examples of C code.

Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging
General Terms: Reliability, Verification
Keywords: concolic testing, random testing, explicit path model-checking, data structure testing, unit testing, testing C programs.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ESEC-FSE'05, September 5-9, 2005, Lisbon, Portugal.
Copyright 2005 ACM 1-59593-014-0/05/0009 ...$5.00.

1. INTRODUCTION

Unit testing is a method for modular testing of a program's functional behavior. A program is decomposed into units, where each unit is a collection of functions, and the units are independently tested. Such testing requires specification of values for the inputs (or test inputs) to the unit. Manual specification of such values is labor intensive and cannot guarantee that all possible behaviors of the unit will be observed during the testing.

In order to improve the range of behaviors observed (or test coverage), several techniques have been proposed to automatically generate values for the inputs. One such technique is to randomly choose the values over the domain of potential inputs [4, 8, 10, 21]. The problem with such random testing is twofold: first, many sets of values may lead to the same observable behavior and are thus redundant, and second, the probability of selecting particular inputs that cause buggy behavior may be astronomically small [20].

One approach which addresses the problem of redundant executions and increases test coverage is symbolic execution [1, 3, 9, 22, 23, 27, 28, 30]. In symbolic execution, a program is executed using symbolic variables in place of concrete values for inputs. Each conditional expression in the program represents a constraint that determines an execution path. Observe that the feasible executions of a program can be represented as a tree, where the branch points in a program are internal nodes of the tree. The goal is to generate concrete values for inputs which would result in different paths being taken. The classic approach is to use depth-first exploration of the paths by backtracking [14]. Unfortunately, for large or complex units, it is computationally intractable to precisely maintain and solve the constraints required for test generation.

To the best of our knowledge, Larson and Austin were the first to propose combining concrete and symbolic execution [16]. In their approach, the program is executed on some user-provided concrete input values. Symbolic path constraints are generated for the specific execution. These constraints are solved, if feasible, to see whether there are potential input values that would have led to a violation along the same execution path. This improves coverage while avoiding the computational cost associated with full-blown symbolic execution which exercises all possible execution paths.

Godefroid et al. proposed incrementally generating test inputs by combining concrete and symbolic execution [11]. In Godefroid et al.'s approach, during a concrete execution, a conjunction of symbolic constraints along the path of the execution is generated. These constraints are modified and then solved, if feasible, to generate further test inputs which would direct the program along alternative paths. Specifically, they systematically negate the conjuncts in the path constraint to provide a depth-first exploration of all paths in the computation tree. If it is not feasible to solve the modified constraints, Godefroid et al. propose simply substituting random concrete values.

A challenge in applying Godefroid et al.'s approach is to provide methods which extract and solve the constraints generated by a program. This problem is particularly complex for programs which have dynamic data structures using pointer operations. For example, pointers may have aliases. Because alias analysis may only be approximate in the presence of pointer arithmetic, using symbolic values to precisely track such pointers may result in constraints whose satisfaction is undecidable. This makes the generation of test inputs by solving such constraints infeasible. In this paper, we provide a method for representing and solving approximate pointer constraints to generate test inputs. Our method is thus applicable to a broad class of sequential programs.

The key idea of our method is to represent inputs for the unit under test using a logical input map that represents all inputs, including (finite) memory graphs, as a collection of scalar symbolic variables and then to build constraints on these inputs by symbolically executing the code under test.

We first instrument the code being tested by inserting function calls which perform symbolic execution. We then repeatedly run the instrumented code as follows. The logical input map $I$ is used to generate concrete memory input graphs for the program and two symbolic states, one for pointer values and one for primitive values. The code is run concretely on the concrete input graph and symbolically on the symbolic states, collecting constraints (in terms of the symbolic variables in the symbolic state) that characterize the set of inputs that would (likely) take the same execution path as the current execution path. As in [11], one of the collected constraints is negated. The resulting constraint system is solved to obtain a new logical input map $I'$ that is similar to $I$ but (likely) leads the execution through a different path. We then set $I = I'$ and repeat the process. Since the goal of this testing approach is to explore feasible execution paths as much as possible, it can be seen as Explicit Path Model-Checking.

An important contribution of our work is separating pointer constraints from integer constraints and keeping the pointer constraints simple to make our symbolic execution light-weight and our constraint solving procedure not only tractable but also efficient. The pointer constraints are conceptually simplified using the logical input map to replace complex symbolic expressions involving pointers with simple symbolic pointer variables (while maintaining the precise pointer relations in the logical input map). For example, if p is an input pointer to a struct with a field f, then a constraint on p->f will be simplified to a constraint on $f_0$, where $f_0$ is the symbolic variable corresponding to the input value p->f. Although this simplification introduces some approximations that do not precisely capture all executions, it results in simple pointer constraints of the form $x = y$ or $x \neq y$, where $x$ and $y$ are either symbolic pointer variables or the constant NULL. These constraints can be efficiently solved, and the approximations seem to suffice in practice.

We implemented our method in a tool called CUTE (Concolic Unit Testing Engine, where Concolic stands for cooperative Concrete and symbolic execution). CUTE is available at [...]. CUTE implements a solver for both arithmetic and pointer constraints to incrementally generate test inputs. The solver exploits the domain of this particular problem to implement three novel optimizations which help to improve the testing time by several orders of magnitude. Our experimental results confirm that CUTE can efficiently explore paths in C code, achieving high branch coverage and detecting bugs. In particular, it exposed software bugs that result in assertion violations, segmentation faults, or infinite loops.

    typedef struct cell {
      int v;
      struct cell *next;
    } cell;

    int f(int v) {
      return 2*v + 1;
    }

    int testme(cell *p, int x) {
      if (x > 0)
        if (p != NULL)
          if (f(x) == p->v)
            if (p->next == p)
              ERROR;
      return 0;
    }

    Input 1: p = NULL, x = 236
    Input 2: p points to a cell with v = 634 and next = NULL, x = 236
    Input 3: p points to a cell with v = 3 and next = NULL, x = 1
    Input 4: p points to a cell with v = 3 whose next points back to the cell itself, x = 1
Figure 1: Example C code and inputs that CUTE generates for testing the function testme

This paper presents two case studies of testing code using CUTE. The first study involves the C code of the CUTE tool itself. The second case study found two previously unknown errors (a segmentation fault and an infinite loop) in SGLIB [25], a popular C data structure library used in a commercial tool. We reported the SGLIB errors to the SGLIB developers who fixed them in the next release.

2. EXAMPLE

We use a simple example to illustrate how CUTE performs testing. Consider the C function testme shown in Figure 1. This function has an error that can be reached given some specific values of the input. In a narrow sense, the input to testme consists of the values of the arguments p and x. However, p is a pointer, and thus the input includes the memory graph reachable from that pointer. In this example, the graph is a list of cell allocation units.

For the example function testme, CUTE first non-randomly generates NULL for p and randomly generates 236 for x. Figure 1 shows this input to testme. As a result, the first execution of testme takes the then branch of the first if statement and the else branch of the second if. Let $p_0$ and $x_0$ be the symbolic variables representing the values of p and x, respectively, at the beginning of the execution. CUTE collects the constraints from the predicates of the branches executed in this path: $x_0 > 0$ (for the then branch of the first if) and $p_0 = \mathrm{NULL}$ (for the else branch of the second if). The predicate sequence $\langle x_0 > 0,\ p_0 = \mathrm{NULL} \rangle$ is called a path constraint.

CUTE next solves the path constraint $\langle x_0 > 0,\ p_0 \neq \mathrm{NULL} \rangle$, obtained by negating the last predicate, to drive the next execution along an alternative path. The solution that CUTE proposes is $\{p_0 \mapsto \text{non-NULL},\ x_0 \mapsto 236\}$, which requires that CUTE make p point to an allocated cell that introduces two new components, p->v and p->next, to the reachable graph. Accordingly, CUTE randomly generates 634 for p->v and non-randomly generates NULL for p->next for the next execution. In the second execution, testme takes the then branch of the first and the second if and the else branch of the third if. For this execution, CUTE generates the path constraint $\langle x_0 > 0,\ p_0 \neq \mathrm{NULL},\ 2 \cdot x_0 + 1 \neq v_0 \rangle$, where $p_0$, $v_0$, $n_0$, and $x_0$ are the symbolic values of p, p->v, p->next, and x, respectively. Note that CUTE computes the expression
$2 \cdot x_0 + 1$ (corresponding to the execution of f) through an inter-procedural, dynamic tracing of symbolic expressions.

CUTE next solves the path constraint $\langle x_0 > 0,\ p_0 \neq \mathrm{NULL},\ 2 \cdot x_0 + 1 = v_0 \rangle$, obtained by negating the last predicate, and generates Input 3 from Figure 1 for the next execution. Note that the specific value of $x_0$ has changed, but it remains in the same equivalence class with respect to the predicate where it appears, namely $x_0 > 0$. On Input 3, testme takes the then branch of the first three if statements and the else branch of the fourth if. CUTE generates the path constraint $\langle x_0 > 0,\ p_0 \neq \mathrm{NULL},\ 2 \cdot x_0 + 1 = v_0,\ p_0 \neq n_0 \rangle$. This path constraint includes dynamically obtained constraints on pointers. CUTE handles constraints on pointers but requires no static alias analysis. To drive the program along an alternative path in the next execution, CUTE solves the constraints $\langle x_0 > 0,\ p_0 \neq \mathrm{NULL},\ 2 \cdot x_0 + 1 = v_0,\ p_0 = n_0 \rangle$ and generates Input 4 from Figure 1. On this input, the fourth execution of testme reveals the error in the code.

3. CUTE

We first define the logical input map that CUTE uses to represent inputs. We also introduce program units of a simple C-like language (cf. [19]). We present how CUTE instruments programs and performs concolic execution. We then describe how CUTE solves the constraints after every execution. We next present how CUTE handles complex data structures. We finally discuss the approximations that CUTE uses for pointer constraints.

To explore execution paths, CUTE first instruments the code under test. CUTE then builds a logical input map $I$ for the code under test. Such a logical input map can represent a memory graph in a symbolic way. CUTE then repeatedly runs the instrumented code as follows:

1. It uses the logical input map $I$ to generate a concrete input memory graph for the program and two symbolic states, one for pointer values and another for primitive values.
2. It runs the code on the concrete input graph, collecting constraints (in terms of the symbolic values in the symbolic state) that characterize the set of inputs that would take the same execution path as the current execution path.
3. It negates one of the collected constraints and solves the resulting constraint system to obtain a new logical input map $I'$ that is similar to $I$ but (likely) leads the execution through a different path. It then sets $I = I'$ and repeats the process.

Conceptually, CUTE executes the code under test both concretely and symbolically at the same time. The actual CUTE implementation first instruments the source code under test, adding functions that perform the symbolic execution. CUTE then repeatedly executes the instrumented code only concretely.

3.1 Logical Input Map

CUTE keeps track of input memory graphs as a logical input map $I$ that maps logical addresses to values that are either logical addresses or primitive values. This map symbolically represents the input memory graph at the beginning of an execution. The reason that CUTE introduces logical addresses is that actual concrete addresses of dynamically allocated cells may change in different executions. Also, the concrete addresses themselves are not necessary to represent memory graphs; it suffices to know how the cells are connected. Finally, CUTE attempts to make consecutive inputs similar, and this can be done with logical addresses. If CUTE used the actual physical addresses, it would depend on malloc and free (to return the same addresses) and, more importantly, it would need to handle destructive updates of the input by the code under test: after CUTE generates one input, the code changes it, and CUTE would need to know what changed to reconstruct the next input.

Let $N$ be the set of natural numbers and $V$ be the set of all primitive values. Then, $I : N \to N \cup V$. The values in the domain and the range of $I$ belonging to the set $N$ represent the logical addresses. We also assume that each logical address $l \in N$ has a type associated with it. A type can be $T*$ (a pointer of type $T$, where $T$ can be a primitive type or a struct type) or $T_p$ (a primitive type). The function typeOf($l$) returns this type. Let the function sizeOf($T$) return the number of memory cells that an object of type $T$ uses. If typeOf($l$) is $T*$ and $I(l) \neq \mathrm{NULL}$, then the sequence $I(v), \ldots, I(v + n - 1)$ stores the value of the object pointed to by the logical address $l$ (each element in the sequence represents the content of each cell of the object in order), where $v = I(l)$ and $n = \mathrm{sizeOf}(T)$. This representation of a logical input map essentially gives a simple way to serialize a memory graph.

We illustrate logical inputs on an example. Recall the example Input 3 from Figure 1. CUTE represents this input with the following logical input: $\langle 3, 1, 3, 0 \rangle$, where logical addresses range from 1 to 4. The first value 3 corresponds to the value of p: it points to the location with logical address 3. The second value 1 corresponds to x. The third value corresponds to p->v and the fourth to p->next (0 represents NULL). This logical input encodes a set of concrete inputs that have the same underlying graph but reside at different concrete addresses. Similarly, the logical input map for Input 4 from Figure 1 is $\langle 3, 1, 3, 3 \rangle$.

3.2 Units and Program Model

A unit under test can have several functions. CUTE requires the user to select one of them as the entry function for which CUTE generates inputs. This function in turn can call other functions in the unit as well as functions that are not in the unit (e.g., library functions). The entry function takes as input a memory graph, a set of all memory locations reachable from the input pointers. We assume that the unit operates only on this input, i.e., the unit has no external functions (that would, for example, simulate an interactive input from the user or file reading). However, a program can allocate additional memory, and the execution then operates on some locations that were not reachable in the initial state.

Given an entry function, CUTE generates a main function that first initializes all the arguments of the function by calling the primitive function input() (described next) and then calls the entry function with these arguments. The unit along with the main function forms a closed program that CUTE instruments and tests.

We describe how CUTE works for a simple C-like language shown in Figure 2. START represents the first statement of a program under test. Each statement has an optional label. The program can get input using the expression input(). For simplicity of description, we assume that a program gets all the inputs at the beginning of an execution and the number of inputs is fixed.

    P    ::= Stmt*
    Stmt ::= [l:] S
    S    ::= lhs ← e | if p goto l' | START | HALT | ERROR
    lhs  ::= v | *v
    e    ::= v | &v | *v | c | v op v | input()
             where op ∈ {+, −, /, *, %, ...},
             v is a variable, c is a constant
    p    ::= v = v | v ≠ v | v < v | v ≤ v | v ≥ v | v > v

Figure 2: Syntax of a simple C-like language

CUTE uses the CIL framework [19] to convert more complex statements (with no function calls) into this simplified form by introducing temporary variables. For example, CIL converts **v = 3 into t1 = *v; *t1 = 3 and p[i] = q[j] into t1 = q+j; t2 = p+i; *t2 = *t1. Details of handling of function calls using a symbolic stack are discussed in [24].

The C expression &v denotes the address of the variable v, and *v denotes the value of the address stored in v. In concrete state, each address stores a value that either is primitive or represents another memory address (pointer).

3.3 Instrumentation

To test a program P, CUTE tries to explore all execution paths of P. To explore all paths, CUTE first instruments the program under test. Then, it repeatedly runs the instrumented program P as follows:

    // input: P is the instrumented program to test
    //        depth is the depth of bounded DFS
    run_CUTE(P, depth)
      I = [ ]; h = (number of arguments in P) + 1;
      completed = false; branch_hist = [ ];
      while not completed
        execute P

Before starting the execution loop, CUTE initializes the logical input map $I$ to an empty map and the variable h, representing the next available logical address, to the number of arguments to the instrumented program plus one. (CUTE gives a logical address to each argument at the very beginning.) The integer variable depth specifies the depth in the bounded DFS described in Section 3.4.

Figure 3 shows the code that CUTE adds during instrumentation. The expressions enclosed in double quotes ("e") represent syntactic objects. Due to space constraints, we describe the instrumentation for function calls in [24]. In the following section, we describe the various global variables and procedures that CUTE inserts.

    Before Instrumentation     After Instrumentation
    ----------------------     ---------------------
    // program start           global vars A = P = path_c = M = [ ];
    START                      global vars i = inputNumber = 0;
                               START

    // inputs                  inputNumber = inputNumber + 1;
    v ← input();               initInput(&v, inputNumber);

    // inputs                  inputNumber = inputNumber + 1;
    *v ← input();              initInput(v, inputNumber);

    // assignment              execute_symbolic(&v, "e");
    v ← e;                     v ← e;

    // assignment              execute_symbolic(v, "e");
    *v ← e;                    *v ← e;

    // conditional             evaluate_predicate("p", p);
    if (p) goto l              if (p) goto l

    // normal termination      solve_constraint();
    HALT                       HALT;

    // program error           print "Found Error";
    ERROR                      ERROR;

Figure 3: Code that CUTE's instrumentation adds

3.4 Concolic Execution

Recall that a program instrumented by CUTE runs concretely and at the same time performs symbolic computation through the instrumented function calls. The symbolic execution follows the path taken by the concrete execution and replaces with the concrete value any symbolic expression that cannot be handled by our constraint solver.

An instrumented program maintains at runtime two symbolic states $\mathcal{A}$ and $\mathcal{P}$ (written A and P in the pseudo-code figures), where $\mathcal{A}$ maps memory locations to symbolic arithmetic expressions, and $\mathcal{P}$ maps memory locations to symbolic pointer expressions. The symbolic arithmetic expressions in CUTE are linear, i.e. of the form $a_1 x_1 + \ldots + a_n x_n + c$, where $n \geq 1$, each $x_i$ is a symbolic variable, each $a_i$ is an integer constant, and $c$ is an integer constant. Note that $n$ must be greater than 0. Otherwise, the expression is a constant, and CUTE does not keep constant expressions in $\mathcal{A}$, because it keeps $\mathcal{A}$ small: if a symbolic expression is constant, its value can be obtained from the concrete state. The arithmetic constraints are of the form $a_1 x_1 + \ldots + a_n x_n + c \bowtie 0$, where $\bowtie\ \in \{<, >, \leq, \geq, =, \neq\}$. The pointer expressions are simpler: each is of the form $x_p$, where $x_p$ is a symbolic variable, or the constant NULL. The pointer constraints are of the form $x \cong y$ or $x \cong \mathrm{NULL}$, where $\cong\ \in \{=, \neq\}$.

Given any map $M$ (e.g., $\mathcal{A}$ or $\mathcal{P}$), we use $M' = M[m \mapsto v]$ to denote the map that is the same as $M$ except that $M'(m) = v$. We use $M' = M - m$ to denote the map that is the same as $M$ except that $M'(m)$ is undefined. We say $m \in \mathrm{domain}(M)$ if $M(m)$ is defined.

Input Initialization using Logical Input Map

Figure 4 shows the procedure initInput(m, l) that uses the logical input map $I$ to initialize the memory location m, to update the symbolic states $\mathcal{A}$ and $\mathcal{P}$, and to update the input map $I$ with new mappings.

M maps logical addresses to physical addresses of memory cells already allocated in an execution, and malloc(n) allocates n fresh cells for an object of size n and returns the addresses of these cells as a sequence. The global variable h keeps track of the next unused logical address available for a newly allocated object. For a logical address l passed as an argument to initInput, $I(l)$ can be undefined in two cases: (1) in the first execution when $I$ is the empty map, and (2) when l is some logical address that got allocated in the process of initialization.

If $I(l)$ is undefined and typeOf(l) is not a pointer, then the content of the memory is initialized randomly; otherwise, if typeOf(l) is a pointer, then the contents of l and m are both initialized to NULL. Note that CUTE does not attempt to generate random pointer graphs but assigns all new pointers to NULL. If typeOf($I(l)$) is a pointer to T (i.e., T*) and M(l) is defined, then we know that the object pointed to by the logical address l is already allocated and we simply initialize the content of m by M(l). Otherwise, we allocate sufficient physical memory for the object pointed to using malloc and initialize the cells recursively. In the process, we also allocate logical addresses by incrementing h if necessary.

    // input: m is the physical address to initialize
    //        l is the corresponding logical address
    // modifies h, I, A, P
    initInput(m, l)
      if l ∉ domain(I)
        if (typeOf(*m) == pointer to T) *m = NULL;
        else *m = random();
        I = I[l ↦ *m];
      else
        v' = v = I(l);
        if (typeOf(v) == pointer to T)
          if (v ∈ domain(M))
            *m = M(v);
          else
            n = sizeOf(T);
            {m1, ..., mn} = malloc(n);
            if (v == non-NULL)
              v' = h; h = h + n;  // h is the next logical address
            *m = m1;
            I = I[l ↦ v'];
            M = M[v ↦ m1];
            for j = 1 to n
              initInput(mj, v' + j − 1);
        else
          *m = v; I = I[l ↦ v];
      // x_l is a symbolic variable for logical address l
      if (typeOf(*m) == pointer to T) P = P[m ↦ x_l];
      else A = A[m ↦ x_l];

Figure 4: Input initialization

Symbolic Execution

Figure 5 shows the pseudo-code for the symbolic manipulations done by the procedure execute_symbolic, which is inserted by CUTE in the program under test during instrumentation. The procedure execute_symbolic(m, e) evaluates the expression e symbolically and maps it to the memory location m in the appropriate symbolic state.

Recall that CUTE replaces a symbolic expression that CUTE's constraint solver cannot handle with the concrete value from the execution. Assume, for instance, that the solver can solve only linear constraints. In particular, when a symbolic expression becomes non-linear, as in the multiplication of two non-constant sub-expressions, CUTE simplifies the symbolic expression by replacing one of the sub-expressions by its current concrete value (see line L in Figure 5). Similarly, if the statement is for instance v'' ← v / v' (see line D in Figure 5), and both v and v' are symbolic, CUTE removes the memory location &v'' from both $\mathcal{A}$ and $\mathcal{P}$ to reflect the fact that the symbolic value for v'' is undefined.

    // inputs: m is a memory location
    //         e is an expression to evaluate
    // modifies A and P by symbolically executing *m ← e
    execute_symbolic(m, e)
      if (i ≤ depth)
        match e:
          case "v1":
            m1 = &v1;
            if (m1 ∈ domain(P))
              A = A − m; P = P[m ↦ P(m1)];  // remove if A contains m
            else if (m1 ∈ domain(A))
              A = A[m ↦ A(m1)]; P = P − m;
            else
              P = P − m; A = A − m;
          case "v1 ± v2":  // where ± ∈ {+, −}
            m1 = &v1; m2 = &v2;
            if (m1 ∈ domain(A) and m2 ∈ domain(A))
              v = "A(m1) ± A(m2)";  // symbolic addition or subtraction
            else if (m1 ∈ domain(A))
              v = "A(m1) ± v2";     // symbolic addition or subtraction
            else if (m2 ∈ domain(A))
              v = "v1 ± A(m2)";     // symbolic addition or subtraction
            else
              A = A − m; P = P − m; return;
            A = A[m ↦ v]; P = P − m;
          case "v1 * v2":
            m1 = &v1; m2 = &v2;
            if (m1 ∈ domain(A) and m2 ∈ domain(A))
              L: v = "v1 * A(m2)";  // replace one with concrete value
            else if (m1 ∈ domain(A))
              v = "A(m1) * v2";     // symbolic multiplication
            else if (m2 ∈ domain(A))
              v = "v1 * A(m2)";     // symbolic multiplication
            else
              A = A − m; P = P − m; return;
            A = A[m ↦ v]; P = P − m;
          case "*v1":
            m2 = v1;
            if (m2 ∈ domain(P))
              A = A − m; P = P[m ↦ P(m2)];
            else if (m2 ∈ domain(A))
              A = A[m ↦ A(m2)]; P = P − m;
            else
              A = A − m; P = P − m;
          default:
            D: A = A − m; P = P − m;

Figure 5: Symbolic execution

Figure 6 shows the function evaluate_predicate(p, b) that symbolically evaluates p and updates path_c. In case of pointers, CUTE only considers predicates of the form $x = y$, $x \neq y$, $x = \mathrm{NULL}$, and $x \neq \mathrm{NULL}$, where $x$ and $y$ are symbolic pointer variables. We discuss this in Section 3.7. If a symbolic predicate expression is constant, then true or false is returned.

During symbolic evaluation of predicates in the procedure evaluate_predicate, symbolic predicate expressions from branching points are collected in the array path_c. At the end of the execution, path_c[0 ... i−1], where i is the number of conditional statements of P that CUTE executes, contains all predicates whose conjunction holds for the execution path.

Note that in both the procedures execute_symbolic and evaluate_predicate, we skip symbolic execution if the number of predicates executed so far (recorded in the global variable i) becomes greater than the parameter depth, which gives the depth of bounded DFS described next.

    // inputs: p is a predicate to evaluate
    //         b is the concrete value of the predicate in S
    // modifies path_c, i
    evaluate_predicate(p, b)
      if (i ≤ depth)
        match p:
          case "v1 ⋈ v2":  // where ⋈ ∈ {<, ≤, ≥, >}
            m1 = &v1; m2 = &v2;
            if (m1 ∈ domain(A) and m2 ∈ domain(A))
              c = "A(m1) − A(m2) ⋈ 0";
            else if (m1 ∈ domain(A))
              c = "A(m1) − v2 ⋈ 0";
            else if (m2 ∈ domain(A))
              c = "v1 − A(m2) ⋈ 0";
            else c = b;
          case "v1 ≅ v2":  // where ≅ ∈ {=, ≠}
            m1 = &v1; m2 = &v2;
            if (m1 ∈ domain(P) and m2 ∈ domain(P))
              c = "P(m1) ≅ P(m2)";
            else if (m1 ∈ domain(P) and v2 == NULL)
              c = "P(m1) ≅ NULL";
            else if (m2 ∈ domain(P) and v1 == NULL)
              c = "P(m2) ≅ NULL";
            else if (m1 ∈ domain(A) and m2 ∈ domain(A))
              c = "A(m1) − A(m2) ≅ 0";
            else if (m1 ∈ domain(A))
              c = "A(m1) − v2 ≅ 0";
            else if (m2 ∈ domain(A))
              c = "v1 − A(m2) ≅ 0";
            else c = b;
        if (b) path_c[i] = c;
        else path_c[i] = neg(c);
      cmp_n_set_branch_hist(b);
      i = i + 1;

Figure 6: Symbolic evaluation of predicates

Bounded Depth First Search

To explore paths in the execution tree, CUTE implements a (bounded) depth-first strategy (bounded DFS). In the bounded DFS, each run (except the first) is executed with the help of a record of the conditional statements (which is the array branch_hist) executed in the previous run. The procedure cmp_n_set_branch_hist in Figure 7 checks whether the current execution path matches the one predicted at the end of the previous execution and represented in the variable branch_hist. We observed in our experiments that the execution almost always follows a prediction of the outcome of a conditional. However, it could happen that a prediction is not fulfilled because CUTE approximates, when necessary, symbolic expressions with concrete values (as explained in Section 3.4), and the constraint solver could then produce a solution that changes the outcome of some earlier branch. (Note that even when there is an approximation, the solution does not necessarily change the outcome.) If it ever happens that a prediction is not fulfilled, an exception is raised to restart run_CUTE with a fresh random input.

    // modifies branch_hist
    cmp_n_set_branch_hist(branch)
      if (i < |branch_hist|)
        if (branch_hist[i].branch ≠ branch)
          print "Prediction Failed";
          raise an exception;  // restart run_CUTE
        else if (i == |branch_hist| − 1)
          branch_hist[i].done = true;
      else
        branch_hist[i].branch = branch;
        branch_hist[i].done = false;

Figure 7: Prediction checking

Bounded depth-first search proves useful when execution paths are infinite or long enough to prevent exhaustively searching the whole computation tree. In particular, it is important for generating finite-sized data structures when using preconditions such as data structure invariants (see Section 3.6). For example, if we use an invariant to generate sorted binary trees, then a non-bounded depth-first search would end up generating an infinite number of trees in which every node has at most one left child and no right child.

3.5 Constraint Solving

We next present how CUTE solves path constraints. Given a path constraint $C = \mathit{neg\_last}(\mathit{path\_c}[0 \ldots j])$, CUTE checks if $C$ is satisfiable, and if so, finds a satisfying solution $I'$.

We have implemented a constraint solver for CUTE to optimize solving of the path constraints that arise in concolic execution. Our solver is built on top of lp_solve [17], a constraint solver for linear arithmetic constraints. Our solver provides three important optimizations for path constraints:

(OPT 1) Fast unsatisfiability check: The solver checks if the last constraint is syntactically the negation of any preceding constraint; if it is, the solver does not need to invoke the expensive semantic check. (Experimental results show that this optimization reduces the number of semantic checks by 60-95%.)

(OPT 2) Common sub-constraints elimination: The solver identifies and eliminates common arithmetic sub-constraints before passing them to lp_solve. (This simple optimization, along with the next one, is significant in practice as it can reduce the number of sub-constraints by 64% to 90%.)

(OPT 3) Incremental solving: The solver identifies dependency between sub-constraints and exploits it to solve the constraints faster and keep the solutions similar. We explain this optimization in detail.

    // modifies branch_hist, I, completed
    solve_constraint()
      j = i − 1;
      while (j ≥ 0)
        if (branch_hist[j].done == false)
          branch_hist[j].branch = ¬branch_hist[j].branch;
          if (∃ I' that satisfies neg_last(path_c[0 ... j]))
            branch_hist = branch_hist[0 ... j];
            I = I';
            return;
          else j = j − 1;
        else j = j − 1;
      if (j < 0) completed = true;

Figure 8: Constraint solving

Given a predicate $p$ in $C$, we define $\mathrm{vars}(p)$ to be the set of all symbolic variables that appear in $p$. Given two predicates $p$ and $p'$ in $C$, we say that $p$ and $p'$ are dependent if one of the following conditions holds:

1. $\mathrm{vars}(p) \cap \mathrm{vars}(p') \neq \emptyset$, or
2. there exists a predicate $p''$ in $C$ such that $p$ and $p''$ are dependent and $p'$ and $p''$ are dependent.

Two predicates are independent if they are not dependent.

The following is an important observation about the path constraints $C$ and $C'$ from two consecutive concolic executions: $C$ and $C'$ differ in a small number of predicates (more precisely, only in the last predicate when there is no backtracking), and thus their respective solutions $I$ and $I'$ must agree on many mappings. Our solver exploits this observation to provide more efficient, incremental constraint solving. The solver collects all the predicates in $C$ that are dependent on $\neg \mathit{path\_c}[j]$. Let this set of predicates be $D$. Note that all predicates in $D$ are either linear arithmetic predicates or pointer predicates, because no predicate in $C$ contains both arithmetic symbolic variables and pointer symbolic variables. The solver then finds a solution $I''$ for the conjunction of all predicates from $D$. The input for the next run is then $I' = I[I'']$, which is the same as $I$ except that for every $l$ for which $I''(l)$ is defined, $I'(l) = I''(l)$. In practice, we have found that the size of $D$ is almost one-eighth the size of $C$ on average.

If all predicates in $D$ are linear arithmetic predicates, then CUTE uses integer linear programming to compute $I''$. If all predicates in $D$ are pointer predicates, then CUTE uses the following procedure to compute $I''$.

Let us consider only pointer constraints, which are either equalities or disequalities. The solver first builds an equivalence graph based on (dis)equalities (similar to checking satisfiability in the theory of equality [2]) and then, based on this graph, assigns values to pointers. The values assigned to the pointers can be a logical address in the domain of $I$, the constant non-NULL (a special constant), or the constant NULL (represented by 0).

The solver views NULL as a symbolic variable. Thus, all predicates in $D$ are of the form $x = y$ or $x \neq y$, where $x$ and $y$ are symbolic variables. Let $D'$ be the subset of $D$ that does not contain the predicate $\neg \mathit{path\_c}[j]$. The solver first checks if $\neg \mathit{path\_c}[j]$ is consistent with the predicates in $D$. For this, the solver constructs an undirected graph whose nodes are the equivalence classes (with respect to the relation $=$) of all symbolic variables that appear in $D'$. We use $[x]_=$ to denote the equivalence class of the symbolic variable $x$. Given two nodes denoted by the equivalence classes $[x]_=$ and $[y]_=$, the solver adds an edge between $[x]_=$ and $[y]_=$ iff there exist symbolic variables $u$ and $v$ such that $u \neq v$ exists in $D'$ and $u \in [x]_=$ and $v \in [y]_=$. Given the graph, the solver finds that $\neg \mathit{path\_c}[j]$ is satisfiable if $\neg \mathit{path\_c}[j]$ is of the form $x = y$ and there is no edge between $[x]_=$ and $[y]_=$ in the graph; otherwise, if $\neg \mathit{path\_c}[j]$ is of the form $x \neq y$, then $\neg \mathit{path\_c}[j]$ is satisfiable if $[x]_=$ and $[y]_=$ are not the same equivalence class.

If $\neg \mathit{path\_c}[j]$ is satisfiable, the solver computes $I''$ using the procedure solve_pointer($\neg \mathit{path\_c}[j]$, $I$) shown in Figure 9. Note that after solving the pointer constraints, we either add a node (by assigning a pointer to non-NULL) or remove a node (by assigning a pointer NULL) from the current input graph, or alias or non-alias two existing pointers. This keeps the consecutive solutions similar. Keeping consecutive solutions for pointers similar is important because of the logical input map: if inputs were very different, CUTE would need to rebuild parts of the logical input map.

    // inputs: p is a symbolic pointer predicate
    //         I is the previous solution
    // returns: a new solution I''
    solve_pointer(p, I)
      match p:
        case "x ≠ NULL": I'' = {y ↦ non-NULL | y ∈ [x]=};
        case "x = NULL": I'' = {y ↦ NULL | y ∈ [x]=};
        case "x = y":    I'' = {z ↦ v | z ∈ [y]= and I(x) = v};
        case "x ≠ y":    I'' = {z ↦ non-NULL | z ∈ [y]=};
      return I'';

Figure 9: Assigning values to pointers

3.6 Data Structure Testing

We next consider testing of functions that take data structures as inputs. More precisely, a function has some pointer arguments, and the memory graph reachable from the pointers forms a data structure. For instance, consider testing of a function that takes a list and removes an element from it. We cannot simply test such a function in isolation [5, 27, 30] (say, by generating random memory graphs as inputs) because the function requires the input memory graph to satisfy the data structure invariant.(1) If an input is invalid (i.e., violates the invariant), the function provides no guarantees and may even result in an error. For instance, a function that expects an acyclic list may loop infinitely given a cyclic list, whereas a function that expects a cyclic list may dereference NULL given an acyclic list. We want to test such functions with valid inputs only. There are two main approaches to obtaining valid inputs: (1) generating inputs with call sequences [27, 30] and (2) solving data structure invariants [5, 27]. CUTE supports both approaches.
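To make the notion of a valid input concrete, the following is a minimal sketch, not taken from CUTE or SGLIB, of an invariant check for lists built from the cell type of Figure 1, together with a function that is only safe on inputs satisfying that invariant. The names is_acyclic and list_length are hypothetical and chosen for illustration; the check uses Floyd's two-pointer cycle detection, which is an assumption of this sketch rather than anything the paper prescribes.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the cell type in Figure 1. */
typedef struct cell {
    int v;
    struct cell *next;
} cell;

/* Hypothetical invariant check: returns 1 if the list starting at head
 * is NULL-terminated (acyclic) and 0 if it contains a cycle.
 * Floyd's algorithm: a fast pointer moves two steps per iteration and
 * meets the slow pointer if and only if the list loops back on itself. */
int is_acyclic(const cell *head) {
    const cell *slow = head;
    const cell *fast = head;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 0;
    }
    return 1;
}

/* Hypothetical function under test: counts the nodes of a list.
 * It terminates only on acyclic lists, so the invariant is stated
 * as a precondition instead of being handled inside the loop. */
int list_length(const cell *head) {
    assert(is_acyclic(head));
    int n = 0;
    for (const cell *c = head; c != NULL; c = c->next)
        n++;
    return n;
}
```

On Input 4 of Figure 1 (a cell whose next field points to itself), is_acyclic returns 0, so a test generator that respects the invariant would never pass that graph to list_length.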
¹ The functions may have additional preconditions, but we omit them for brevity of discussion; for more details, see [5].

Generating Inputs with Call Sequences: One approach to generating data structures is to use sequences of function calls. Each data structure implements functions for several basic operations such as creating an empty structure, adding an element to the structure, removing an element from the structure, and checking if an element is in the structure. A sequence of these operations can be used to generate an instance of the data structure, e.g., we can create an empty list and add several elements to it. This approach has two requirements [27]: (1) all functions must be available (and thus we cannot test each function in isolation), and (2) all functions must be used in generation: for complex data structures, e.g., red-black trees, there are memory graphs that cannot be constructed through additions only but require removals [27, 30].

Solving Data Structure Invariants: Another approach to generating data structures is to use the functions that check invariants. Good programming practice suggests that data structures provide such functions. For example, SGLIB [25] (see Section 4.2) is a popular C library for generic data structures that provides such functions. We call these functions repOk [5]. (SGLIB uses a different name for them.) As an illustration, SGLIB implements operations on doubly linked lists and provides a function that checks if a memory graph is a valid doubly linked list; each repOk function returns true or false to indicate the validity of the input graph.

The main idea of using repOk functions for testing is to solve repOk functions, i.e., generate only the input memory graphs for which repOk returns true [5, 27]. This approach allows modular testing of functions that implement data structure operations (i.e., it does not require that all operations be available): all we need for a function under test is a corresponding repOk function. Previous techniques for solving repOk functions include a search that uses purely concrete execution [5] and a search that uses symbolic execution for primitive data but concrete values for pointers [27]. CUTE, in contrast, uses symbolic execution for both primitive data and pointers.

The constraints that CUTE builds and solves for pointers allow it to solve repOk functions asymptotically faster than the fastest previous techniques [5, 27]. Consider, for example, the following check from the invariant for a doubly linked list: for each node n, n.next.prev == n. Assume that the solver is building a doubly linked list with N nodes reachable along the next pointers. Assume also that the solver needs to set the values for the prev pointers. Executing the check once, CUTE finds the exact value for each prev pointer and thus takes O(N) steps to find the values for all N pointers. In contrast, the previous techniques [5, 27] take O(N^2) steps, as they search for the value for each pointer, trying first the value NULL, then a pointer to the head of the list, then a pointer to the second element, and so on.

3.7 Approximations for Scalable Symbolic Execution

CUTE uses simple symbolic expressions for pointers and builds only (dis)equality constraints for pointers. We believe that these constraints, which approximate the exact path condition, are a good trade-off. To exactly track the pointer constraints, it would be necessary to use the theory of arrays/memory with updates and selections [18]. However, it would make the symbolic execution more expensive and could result in constraints whose solution is intractable. Therefore, CUTE does not use the theory of arrays but handles arrays by concretely instantiating them and making each element of the array a scalar symbolic variable.

It is important to note that, although CUTE uses simple pointer constraints, it still keeps a precise relationship between pointers: the logical input map (through types) maintains a relationship between pointers to structs and their fields and between pointers to arrays and their elements. For example, from the logical input map for Input 3 from Figure 1, CUTE knows that a field is at the (logical) address 4 because the pointer to its struct has value 3 and the field is at the offset 1 in the struct cell. Indeed, the logical input map allows CUTE to use only simple scalar symbolic variables to represent the memory and still obtain fairly precise constraints.

Finally, we show that CUTE does not keep the exact pointer constraints. Consider, for example, the code snippet *p=0; *q=1; if (*p == 1) ERROR (and assume that p and q are not NULL). CUTE cannot generate the constraint p = q that would enable the program to take the "then" branch. This is because the program contains no conditional that can generate the constraint. Analogously, for the code snippet a[i]=0; a[j]=1; if (a[i]==0) ERROR, CUTE cannot generate i ≠ j.

4. IMPLEMENTATION AND EXPERIMENTAL EVALUATION

We have implemented the main parts of CUTE in C. To instrument code under test, we use CIL [19], a framework for parsing and transforming C programs. To solve arithmetic inequalities, the constraint solver of CUTE uses lp_solve [17], a library for integer linear programming. Further details about the implementation can be found in [24].

We present two case studies that show how CUTE can detect errors. In the second case study, we also present results that show how CUTE achieves branch coverage of the code under test. We performed all experiments on a Linux machine with a dual 1.7 GHz Intel Xeon processor.

4.1 Data Structures of CUTE

We applied CUTE to test its own data structures. CUTE uses a number of non-standard data structures at run time, such as cu_linear to represent linear expressions, a structure to represent pointer expressions,
and a structure to represent dependency graphs for path constraints, etc. Our goal in this case study was to detect memory leaks in addition to standard errors such as segmentation faults, assertion violations, etc. To that end, we used CUTE in conjunction with valgrind [26]. We discovered a few memory leaks and a couple of segmentation faults that did not show up in other uses of CUTE. This case study is interesting in that we applied CUTE to partly unit test itself and discovered bugs.

We briefly describe our experience with testing the cu_linear data structure. We tested the cu_linear module of CUTE in the depth-first search mode of CUTE along with valgrind. In 537 iterations, CUTE found a memory leak. The following is a snippet of the function cu_linear_add relevant for the memory leak:

    cu_linear * cu_linear_add(cu_linear *c1, cu_linear *c2, int add) {
      int i, j, k, flag;
      cu_linear* ret = (cu_linear*)malloc(sizeof(cu_linear));
      . . . // skipped 18 lines of code
      if (ret->count == 0) return NULL;

If the sum of the two linear expressions passed as arguments becomes constant, the function returns without freeing the memory allocated for the local variable ret. CUTE constructed this scenario automatically at the time of testing. Specifically, CUTE constructed the sequence of function calls

    [...]cu_linear_create(0); l1 = cu_linear_create(0); l1 = cu_linear_negate(l1); l1 = cu_linear_[...]

that exposes the memory leak that valgrind detects.

4.2 SGLIB Library

We also applied CUTE to unit test SGLIB [25] version 1.0.1, a popular, open-source C library for generic data structures. The library has been extensively used to implement the commercial tool Xrefactory. SGLIB consists of a single C header file, sglib.h, with about 2000 lines of code consisting only of C macros. This file provides generic implementations of the most common algorithms for arrays, lists, sorted lists, doubly linked lists, hash tables, and red-black trees. Using the SGLIB macros, a user can declare and define various operations on data structures of parametric types.

The library and its sample examples provide verifier functions (which can be used as repOk) for each data structure except for hash tables. We used these verifier functions to test the library using the repOk technique mentioned in Section 3.6. For hash tables, we invoked a sequence of its functions. We used CUTE with the bounded depth-first search strategy with bound 50. Figure 10 shows the results of our experiments.

We chose SGLIB as a case study primarily to measure the efficiency of CUTE. As SGLIB is widely used, we did not expect to find bugs. Much to our surprise, we found two bugs in SGLIB using CUTE.

The first bug is a segmentation fault that occurs in the doubly-linked-list library when a non-zero-length list is concatenated with another zero-length list. CUTE discovered the bug in 140 iterations (about 1 second) in the bounded depth-first search mode. This bug is easy to fix by putting a check on the length of the second list in the concatenation function.

The second bug, which is a more serious one, was found by CUTE in the hash table library in 193 iterations (in 1 second). Specifically, CUTE constructed the following valid sequence of function calls, which gets the library into an infinite loop:

    typedef struct ilist { int i; struct ilist *next; } ilist;
    ilist *htab[10];
    main() {
      struct ilist *e, *e1, *e2, *m;
      sglib_hashed_ilist_init(htab);
      e = (ilist *)malloc(sizeof(ilist));
      e->next = 0; e->i = 0;
      sglib_hashed_ilist_add_if_not_member(htab, e, &m);
      sglib_hashed_ilist_add(htab, e);
      e2 = (ilist *)malloc(sizeof(ilist));
      e2->next = 0; e2->i = 0;
      sglib_hashed_ilist_is_member(htab, e2);
    }

where ilist is a struct representing an element of the hash table. We reported these bugs to the SGLIB developers, who confirmed that these are indeed bugs.
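To make the first SGLIB bug and its suggested fix concrete, here is a minimal sketch of a doubly-linked-list concatenation that guards against an empty second list, together with a repOk-style check of the n.next.prev == n invariant from Section 3.6. It is not SGLIB's macro-generated code: the names dnode, dl_is_consistent, and dl_concat are hypothetical, and the checker assumes the list is acyclic along next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; SGLIB's real lists are generated by macros. */
typedef struct dnode {
    int val;
    struct dnode *prev, *next;
} dnode;

/* repOk-style check: returns 1 if head has no predecessor and every
 * link satisfies n->next->prev == n. Assumes no cycle along next. */
int dl_is_consistent(const dnode *head) {
    if (head == NULL) return 1; /* the empty list is valid */
    if (head->prev != NULL) return 0;
    for (const dnode *n = head; n->next != NULL; n = n->next)
        if (n->next->prev != n) return 0;
    return 1;
}

/* Appends list b to list a and returns the head of the result. */
dnode *dl_concat(dnode *a, dnode *b) {
    if (a == NULL) return b;
    if (b == NULL) return a; /* the guard: without it, b->prev below dereferences NULL */
    assert(a != b);          /* concatenating a list with itself would create a cycle */
    dnode *tail = a;
    while (tail->next != NULL) tail = tail->next;
    tail->next = b;
    b->prev = tail;
    return a;
}
```

Calling dl_concat with a non-empty first list and NULL as the second list is the shape of input described for the first bug; with the guard in place the call simply returns the first list unchanged.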
Name                Run time    # of        # of Branches  % Branch  # of Functions  OPT 1  OPT 2 & 3  # of Bugs
                    in seconds  Iterations  Explored       Coverage  Tested          in %   in %       Found
Array Quick Sort    2           732         43             97.73     2               67.80  49.13      0
Array Heap Sort     4           1764        36             100.00    2               71.10  46.38      0
Linked List         2           570         100            96.15     12              86.93  88.09      0
Sorted List         2           1020        110            96.49     11              88.86  80.85      0
Doubly Linked List  3           1317        224            99.12     17              86.95  79.38      1
Hash Table          1           193         46             85.19     8               97.01  52.94      1
Red Black Tree      2629        1,000,000   242            71.18     17              89.65  64.93      0

Figure 10: Results for testing SGLIB 1.0.1 with bounded depth-first strategy with depth 50

Figure 10 shows the results for testing SGLIB 1.0.1 with the bounded depth-first strategy. For each data structure and array sorting algorithm that SGLIB implements, we tabulate the time that CUTE took to test the data structure, the number of runs that CUTE made, the number of branches it executed, the branch coverage obtained, the number of functions executed, the benefit of optimizations, and the number of bugs found.

The branch coverage in most cases is less than 100%. After investigating the reason for this, we found that the code contains a number of assert statements that were never violated and a number of predicates that are redundant and can be removed from the conditionals.

The OPT 1 and OPT 2 & 3 columns in Figure 10 show the benefit of the three optimizations from Section 3.5. The column OPT 1 gives the average percentage of executions in which the fast unsatisfiability check was successful. It is important to note that the saving in the number of satisfiability checks translates into an even higher relative saving in the satisfiability-checking time, because lp_solve takes much more time (exponential in the number of constraints) to determine that a set of constraints is unsatisfiable than to generate a solution when one exists. For example, for red-black trees and depth-first search, OPT 1 was successful in almost 90% of executions, which means that OPT 1 reduces the number of calls to lp_solve by an order of magnitude. However, OPT 1 reduces the solving time of lp_solve
more than two orders of magnitude in this case; in other words, it would be infeasible to run CUTE without OPT 1. The column OPT 2 & 3 gives the average percentage of constraints that CUTE eliminated in each execution due to the common sub-expression elimination and incremental solving optimizations. Yet again, this reduction in the size of the constraint set translates into a much higher relative reduction in the solving time.

5. RELATED WORK

Automating unit testing is an active area of research. In the last five years, over a dozen techniques and tools have been proposed that automatically increase test coverage or generate test inputs.

The simplest, and yet often very effective, techniques use random generation of (concrete) test inputs [4, 8, 10, 20, 21]. Some recent tools use bounded-exhaustive concrete execution [5, 12, 29] that tries all values from user-provided domains. These tools can achieve high code coverage, especially for testing data structure implementations. However, they require the user to carefully choose the values in the domains to ensure high coverage.

Tools based on symbolic execution use a variety of approaches, including abstraction-based model checking [1, 3], explicit-state model checking [27], symbolic-sequence exploration [22, 30], and static analysis [9], to detect (potential) bugs or generate test inputs. These tools inherit the incompleteness of their underlying reasoning engines such as theorem provers and constraint solvers. For example, tools using precise symbolic execution [27, 30] cannot analyze any code that would build constraints outside of pre-specified theories, e.g., any code with non-linear arithmetic or array indexing with non-constant expressions. As another example, tools based on predicate abstraction [1, 3] do not handle code that depends on complex data structures. In these tools, the symbolic execution proceeds separately from the concrete execution (or constraint solving).

The closest work to ours is Godefroid et al.'s directed automated random testing (DART) [11]. DART consists of three parts: (1) directed generation of test inputs, (2) automated extraction of unit interfaces from source code, and (3) random generation of test inputs. CUTE does not provide automated extraction of interfaces but leaves it up to the user to specify which functions are related and what their preconditions are. Unlike DART, which was applied to testing each function in isolation and without preconditions, CUTE targets related functions with preconditions such as data structure implementations. DART handles constraints only on integer types and cannot handle programs with pointers and data structures; in such situations, the DART tool's testing reduces to simple and ineffective random testing. DART proposed a simple strategy to generate random memory graphs: each pointer is either NULL or points to a new memory cell whose nodes are recursively initialized. This strategy suffers from several deficiencies:

1. The random generation itself may not terminate [7].
2. The random generation produces only trees; there is no sharing and aliasing, so there are no DAGs or cycles.
3. The directed generation does not keep track of any constraints on pointers.
4. The directed generation never changes the underlying memory graph; it can only change the (primitive, integer) values in the nodes in the graph.

DART also does not consider any preconditions for the code under test. For example, in the oSIP case study [11], it is unclear whether some values are actual bugs or false alarms due to violated preconditions. Moreover, CUTE implements a novel constraint solver that significantly speeds up the analysis.

Cadar and Engler proposed Execution Generated Testing (EGT) [6], which takes a similar approach to testing as CUTE: it explores different execution paths using a combined symbolic and concrete execution. However, EGT did not consider inputs that are memory graphs or code that has preconditions. Also, EGT and CUTE differ in how they approximate symbolic expressions with concrete values. EGT follows a more traditional approach to symbolic execution
and proposes an interesting method that lazily solves the path constraints: EGT starts with only symbolic inputs and tries to execute the code fully symbolically, but if it cannot, EGT solves the current constraints to generate a (partial) concrete input with which the execution proceeds.

CUTE is also related to the prior work that uses backtracking to generate a test input that executes one given path (that may be known to contain a bug) [13, 15]. In contrast, CUTE attempts to cover all feasible paths, in a style similar to systematic testing. Moreover, this earlier work did not address inputs that are memory graphs. Visvanathan and Gupta [28] recently proposed a technique that generates memory graphs. They also use a specialized symbolic execution (not the exact execution with symbolic arrays) and develop a solver for their constraints. However, they consider one given path, do not consider unknown code segments (e.g., library functions), and do not use a combined concrete execution to generate new test inputs.

6. DISCUSSION

Our work shows that approximate symbolic execution for testing code with dynamic data structures is feasible and scalable. Moreover, we have shown how to efficiently generate dynamic data structures by incrementally adding and removing a node, or by aliasing two pointers. While we described an implementation for C, we have also developed an implementation for the sequential subset of Java. We are currently investigating how to test programs with concurrency using a similar method. We are also investigating the application of the technique to find algebraic security attacks in cryptographic protocols, and security breaches in unsafe languages.

Acknowledgements

We are indebted to Patrice Godefroid and Nils Klarlund for their comments on a previous version of this paper and for suggestions on clarifying the relationship of the current work with DART. Moreover, the first author benefited greatly from interaction with them during a summer internship. We would like to thank Thomas Ball, Cristian Cadar, Sarfraz Khurshid, Alex Orso, Rupak Majumdar, Sameer Sundresh, and Tao Xie for providing valuable comments. This work is supported in part by the ONR Grant N00014-02-1-0715.

7. REFERENCES

[1] T. Ball. Abstraction-guided test generation: A case study. Technical Report MSR-TR-2003-86, Microsoft Research.
[2] C. W. Barrett and S. Berezin. CVC Lite: A new implementation of the cooperating validity checker. In Proc. 16th International Conference on Computer Aided Verification, pages 515-518, July 2004.
[3] D. Beyer, A. J. Chlipala, T. A. Henzinger, R. Jhala, and R. Majumdar. Generating Test from Counterexamples. In Proc. of the 26th ICSE, pages 326-335, 2004.
[4] D. Bird and C. Munoz. Automatic Generation of Random Self-Checking Test Cases. IBM Systems Journal, 22(3):229-245, 1983.
[5] C. Boyapati, S. Khurshid, and D. Marinov. Korat: Automated testing based on Java predicates. In Proc. of International Symposium on Software Testing and Analysis, pages 123-133, 2002.
[6] C. Cadar and D. Engler. Execution generated test cases: How to make systems code crash itself. In Proc. of SPIN Workshop, 2005.
[7] K. Claessen and J. Hughes. Quickcheck: A lightweight tool for random testing of Haskell programs. In Proc. of 5th ACM SIGPLAN International Conference on Functional Programming (ICFP), pages 268-279, 2000.
[8] C. Csallner and Y. Smaragdakis. JCrasher: an automatic robustness tester for Java. Software: Practice and Experience, 34:1025-1050, 2004.
[9] C. Csallner and Y. Smaragdakis. Check 'n' Crash: Combining static checking and testing. In 27th International Conference on Software Engineering, 2005.
[10] J. E. Forrester and B. P. Miller. An Empirical Study of the Robustness of Windows NT Applications Using Random Testing. In Proceedings of the 4th USENIX Windows System Symposium, 2000.
[11] P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In Proc. of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation (PLDI), 2005.
[12] W. Grieskamp, Y. Gurevich, W. Schulte, and M. Veanes. Generating finite state machines from abstract state machines. In Proc. International Symposium on Software Testing and Analysis, pages 112-122, 2002.
[13] N. Gupta, A. P. Mathur, and M. L. Soffa. Generating test data for branch coverage. In Proc. of the International Conference on Automated Software Engineering, pages 219-227, 2000.
[14] S. Khurshid, C. S. Pasareanu, and W. Visser. Generalized symbolic execution for model checking and testing. In Proc. 9th Int. Conf. on TACAS, pages 553-568, 2003.
[15] B. Korel. A dynamic Approach of Test Data Generation. In IEEE Conference on Software Maintenance, pages 311-317, November 1990.
[16] E. Larson and T. Austin. High coverage detection of input-related security faults. In Proc. of the 12th USENIX Security Symposium (Security '03), Aug. 2003.
[17] lp_solve. http://groups.yahoo.com/group/lp_solve/.
[18] J. McCarthy and J. Painter. Correctness of a compiler for arithmetic expressions. In Proceedings of Symposia in Applied Mathematics. AMS, 1967.
[19] G. C. Necula, S. McPeak, S. P. Rahul, and W. Weimer. CIL: Intermediate Language and Tools for Analysis and transformation of C Programs. In Proceedings of Conference on compiler Construction, pages 213-228, 2002.
[20] J. Offutt and J. Hayes. A Semantic Model of Program Faults. In Proc. of ISSTA '96, pages 195-200, 1996.
[21] C. Pacheco and M. D. Ernst. Eclat: Automatic generation and classification of test inputs. In 19th European Conference Object-Oriented Programming, 2005.
[22] Parasoft. Jtest manuals version 6.0. Online manual, February 2005. http://www.parasoft.com/.
[23] C. S. Pasareanu, M. B. Dwyer, and W. Visser. Finding feasible counter-examples when model checking abstracted Java programs. In Proc. of TACAS '01, pages 284-298, 2001.
[24] K. Sen, D. Marinov, and G. Agha. CUTE: A concolic unit testing engine for C. Technical Report UIUCDCS-R-2005-2597, UIUC, 2005.
[25] SGLIB. http://xref-tech.com/sglib/main.html.
[26] Valgrind. http://valgrind.org/.
[27] W. Visser, C. S. Pasareanu, and S. Khurshid. Test input generation with Java PathFinder. In Proc. 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 97-107, 2004.
[28] S. Visvanathan and N. Gupta. Generating test data for functions with pointer inputs. In 17th IEEE International Conference on Automated Software Engineering, 2002.
[29] T. Xie, D. Marinov, and D. Notkin. Rostra: A framework for detecting redundant object-oriented unit tests. In Proc. 19th IEEE International Conference on Automated Software Engineering, pages 196-205, Sept. 2004.
[30] T. Xie, D. Marinov, W. Schulte, and D. Notkin. Symstra: A framework for generating object-oriented unit tests using symbolic execution. In Proc. of the Tools and Algorithms for the Construction and Analysis of Systems, 2005.