CUTE: A Concolic Unit Testing Engine for C

Koushik Sen, Darko Marinov, Gul Agha
Department of Computer Science
University of Illinois at Urbana-Champaign
{ksen,marinov,agha}@cs.uiuc.edu

ABSTRACT

In unit testing, a program is decomposed into units which are collections of functions. A part of a unit can be tested by generating inputs for a single entry function. The entry function may contain pointer arguments, in which case the inputs to the unit are memory graphs. The paper addresses the problem of automating unit testing with memory graphs as inputs. The approach used builds on previous work combining symbolic and concrete execution, and more specifically, using such a combination to generate test inputs to explore all feasible execution paths. The current work develops a method to represent and track constraints that capture the behavior of a symbolic execution of a unit with memory graphs as inputs. Moreover, an efficient constraint solver is proposed to facilitate incremental generation of such test inputs. Finally, CUTE, a tool implementing the method, is described together with the results of applying CUTE to real-world examples of C code.

Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging
General Terms: Reliability, Verification
Keywords: concolic testing, random testing, explicit path model-checking, data structure testing, unit testing, testing C programs.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ESEC-FSE'05, September 5–9, 2005, Lisbon, Portugal. Copyright 2005 ACM 1-59593-014-0/05/0009 ...$5.00.

1. INTRODUCTION

Unit testing is a method for modular testing of a program's functional behavior. A program is decomposed into units, where each unit is a collection of functions, and the units are independently tested. Such testing requires specification of values for the inputs (or test inputs) to the unit. Manual specification of such values is labor intensive and cannot guarantee that all possible behaviors of the unit will be observed during the testing.

In order to improve the range of behaviors observed (or test coverage), several techniques have been proposed to automatically generate values for the inputs. One such technique is to randomly choose the values over the domain of potential inputs [4,8,10,21]. The problem with such random testing is twofold: first, many sets of values may lead to the same observable behavior and are thus redundant, and second, the probability of selecting particular inputs that cause buggy behavior may be astronomically small [20].

One approach which addresses the problem of redundant executions and increases test coverage is symbolic execution [1,3,9,22,23,27,28,30]. In symbolic execution, a program is executed using symbolic variables in place of concrete values for inputs. Each conditional expression in the program represents a constraint that determines an execution path. Observe that the feasible executions of a program can be represented as a tree, where the branch points in a program are internal nodes of the tree. The goal is to generate concrete values for inputs which would result in different paths being taken. The classic approach is to use depth-first exploration of the paths by backtracking [14]. Unfortunately, for large or complex units, it is computationally intractable to precisely maintain and solve the constraints required for test generation.

To the best of our knowledge, Larson and Austin were the first to propose combining concrete and symbolic execution [16]. In their approach, the program is executed on some user-provided concrete input values. Symbolic path constraints are generated for the specific execution. These constraints are solved, if feasible, to see whether there are potential input values that would have led to a violation along the same execution path. This improves coverage while avoiding the computational cost associated with full-blown symbolic execution which exercises all possible execution paths.

Godefroid et al. proposed incrementally generating test inputs by combining concrete and symbolic execution [11]. In Godefroid et al.'s approach, during a concrete execution, a conjunction of symbolic constraints along the path of the execution is generated. These constraints are modified and then solved, if feasible, to generate further test inputs which would direct the program along alternative paths. Specifically, they systematically negate the conjuncts in the path constraint to provide a depth-first exploration of all paths in the computation tree. If it is not feasible to solve the modified constraints, Godefroid et al. propose simply substituting random concrete values.

A challenge in applying Godefroid et al.'s approach is to provide methods which extract and solve the constraints generated by a program. This problem is particularly complex for programs which have dynamic data structures using pointer operations. For example, pointers may have aliases. Because alias analysis may only be approximate in the presence of pointer arithmetic, using symbolic values to precisely track such pointers may result in constraints whose satisfaction is undecidable. This makes the generation of test inputs by solving such constraints infeasible. In this paper, we provide a method for representing and solving approximate pointer constraints to generate test inputs. Our method is thus applicable to a broad class of sequential programs.

The key idea of our method is to represent inputs for the unit under test using a logical input map that represents all inputs, including (finite) memory graphs, as a collection of scalar symbolic variables and then to build constraints on these inputs by symbolically executing the code under test.

We
first instrument the code being tested by inserting function calls which perform symbolic execution. We then repeatedly run the instrumented code as follows. The logical input map I is used to generate concrete memory input graphs for the program and two symbolic states, one for pointer values and one for primitive values. The code is run concretely on the concrete input graph and symbolically on the symbolic states, collecting constraints (in terms of the symbolic variables in the symbolic state) that characterize the set of inputs that would (likely) take the same execution path as the current execution path. As in [11], one of the collected constraints is negated. The resulting constraint system is solved to obtain a new logical input map I' that is similar to I but (likely) leads the execution through a different path. We then set I = I' and repeat the process. Since the goal of this testing approach is to explore feasible execution paths as much as possible, it can be seen as Explicit Path Model-Checking.

An important contribution of our work is separating pointer constraints from integer constraints and keeping the pointer constraints simple to make our symbolic execution light-weight and our constraint solving procedure not only tractable but also efficient. The pointer constraints are conceptually simplified using the logical input map to replace complex symbolic expressions involving pointers with simple symbolic pointer variables (while maintaining the precise pointer relations in the logical input map). For example, if p is an input pointer to a struct with a field f, then a constraint on p->f will be simplified to a constraint on f0, where f0 is the symbolic variable corresponding to the input value p->f. Although this simplification introduces some approximations that do not precisely capture all executions, it results in simple pointer constraints of the form x = y or x ≠ y, where x and y are either symbolic pointer variables or the constant NULL. These constraints can be efficiently solved, and the approximations seem to suffice in practice.

We implemented our method in a tool called CUTE (Concolic Unit Testing Engine, where Concolic stands for cooperative Concrete and symbolic execution). CUTE is available at [...]. CUTE implements a solver for both arithmetic and pointer constraints to incrementally generate test inputs. The solver exploits the domain of this particular problem to implement three novel optimizations which help to improve the testing time by several orders of magnitude. Our experimental results confirm that CUTE can efficiently explore paths in C code, achieving high branch coverage and detecting bugs. In particular, it exposed software bugs that result in assertion violations, segmentation faults, or infinite loops.

    typedef struct cell {
      int v;
      struct cell *next;
    } cell;

    int f(int v) {
      return 2*v + 1;
    }

    int testme(cell *p, int x) {
      if (x > 0)
        if (p != NULL)
          if (f(x) == p->v)
            if (p->next == p)
              ERROR;
      return 0;
    }

    Input 1: p = NULL, x = 236
    Input 2: p -> cell {v = 634, next = NULL}, x = 236
    Input 3: p -> cell {v = 3, next = NULL}, x = 1
    Input 4: p -> cell {v = 3, next = p}, x = 1

Figure 1: Example C code and inputs that CUTE generates for testing the function testme

This paper presents two case studies of testing code using CUTE. The first study involves the C code of the CUTE tool itself. The second case study found two previously unknown errors (a segmentation fault and an infinite loop) in SGLIB [25], a popular C data structure library used in a commercial tool. We reported the SGLIB errors to the SGLIB developers, who fixed them in the next release.

2. EXAMPLE

We use a simple example to illustrate how CUTE performs testing. Consider the C function testme shown in Figure 1. This function has an error that can be reached given some specific values of the input. In a narrow sense, the input to testme consists of the values of the arguments p and x. However, p is a pointer, and thus the input includes the memory graph reachable from that pointer. In this example, the graph is a list of cell allocation units.

For the example function testme, CUTE
first non-randomly generates NULL for p and randomly generates 236 for x, respectively. Figure 1 shows this input to testme. As a result, the first execution of testme takes the then branch of the first if statement and the else branch of the second if. Let p0 and x0 be the symbolic variables representing the values of p and x, respectively, at the beginning of the execution. CUTE collects the constraints from the predicates of the branches executed in this path: x0 > 0 (for the then branch of the first if) and p0 = NULL (for the else branch of the second if). The predicate sequence <x0 > 0, p0 = NULL> is called a path constraint.

CUTE next solves the path constraint <x0 > 0, p0 ≠ NULL>, obtained by negating the last predicate, to drive the next execution along an alternative path. The solution that CUTE proposes is {p0 ↦ non-NULL, x0 ↦ 236}, which requires that CUTE make p point to an allocated cell that introduces two new components, p->v and p->next, to the reachable graph. Accordingly, CUTE randomly generates 634 for p->v and non-randomly generates NULL for p->next, respectively, for the next execution. In the second execution, testme takes the then branch of the first and the second if and the else branch of the third if. For this execution, CUTE generates the path constraint <x0 > 0, p0 ≠ NULL, 2·x0 + 1 ≠ v0>, where p0, v0, n0, and x0 are the symbolic values of p, p->v, p->next, and x, respectively. Note that CUTE computes the expression 2·x0 + 1 (corresponding to the execution of f) through an inter-procedural, dynamic tracing of symbolic expressions.

CUTE next solves the path constraint <x0 > 0, p0 ≠ NULL, 2·x0 + 1 = v0>, obtained by negating the last predicate, and generates Input 3 from Figure 1 for the next execution. Note that the specific value of x0 has changed, but it remains in the same equivalence class with respect to the predicate where it appears, namely x0 > 0. On Input 3, testme takes the then branch of the first three if statements and the else branch of the fourth if. CUTE generates the path constraint <x0 > 0, p0 ≠ NULL, 2·x0 + 1 = v0, p0 ≠ n0>. This path constraint includes dynamically obtained constraints on pointers. CUTE handles constraints on pointers but requires no static alias analysis. To drive the program along an alternative path in the next execution, CUTE solves the constraints <x0 > 0, p0 ≠ NULL, 2·x0 + 1 = v0, p0 = n0> and generates Input 4 from Figure 1. On this input, the fourth execution of testme reveals the error in the code.

3. CUTE

We first define the logical input map that CUTE uses to represent inputs. We also introduce program units of a simple C-like language (cf. [19]). We present how CUTE instruments programs and performs concolic execution. We then describe how CUTE solves the constraints after every execution. We next present how CUTE handles complex data structures. We finally discuss the approximations that CUTE uses for pointer constraints.

To explore execution paths, CUTE first instruments the code under test. CUTE then builds a logical input map I for the code under test. Such a logical input map can represent a memory graph in a symbolic way. CUTE then repeatedly runs the instrumented code as follows:

1. It uses the logical input map I to generate a concrete input memory graph for the program and two symbolic states, one for pointer values and another for primitive values.
2. It runs the code on the concrete input graph, collecting constraints (in terms of the symbolic values in the symbolic state) that characterize the set of inputs that would take the same execution path as the current execution path.
3. It negates one of the collected constraints and solves the resulting constraint system to obtain a new logical input map I' that is similar to I but (likely) leads the execution through a different path. It then sets I = I' and repeats the process.

Conceptually, CUTE executes the code under test both concretely and symbolically at the same time. The actual CUTE implementation first instruments the source code under test, adding functions that perform the symbolic execution. CUTE then repeatedly executes the instrumented code only concretely.

3.1 Logical Input Map

CUTE keeps track of input memory graphs as a logical input map I that maps logical addresses to values that are either logical addresses or primitive values. This map symbolically represents the input memory graph at the beginning of an execution. The reason that CUTE introduces logical addresses is that actual concrete addresses of dynamically allocated cells may change in different executions. Also, the concrete addresses themselves are not necessary to represent memory graphs; it suffices to know how the cells are connected. Finally, CUTE attempts to make consecutive inputs similar, and this can be done with logical addresses. If CUTE used the actual physical addresses, it would depend on malloc and free (to return the same addresses) and, more importantly, it would need to handle destructive updates of the input by the code under test: after CUTE generates one input, the code changes it, and CUTE would need to know what changed to reconstruct the next input.

Let N be the set of natural numbers and V be the set of all primitive values. Then, I : N → N ∪ V. The values in the domain and the range of I belonging to the set N represent the logical addresses. We also assume that each logical address l ∈ N has a type associated with it. A type can be T* (a pointer of type T, where T can be a primitive type or a struct type) or Tp (a primitive type). The function typeOf(l) returns this type. Let the function sizeOf(T) return the number of memory cells that an object of type T uses. If typeOf(l) is T* and I(l) ≠ NULL, then the sequence I(v), ..., I(v + n − 1) stores the value of the object pointed to by the logical address l (each element in the sequence represents the content of each cell of the object in order), where v = I(l) and n = sizeOf(T). This representation of a logical input map essentially gives a simple way to serialize a memory graph.

We illustrate logical inputs on an example. Recall the example Input 3 from Figure 1. CUTE represents this input with the following logical input: <3, 1, 3, 0>, where logical addresses range from 1 to 4. The
first value 3 corresponds to the value of p: it points to the location with logical address 3. The second value 1 corresponds to x. The third value corresponds to p->v and the fourth to p->next (0 represents NULL). This logical input encodes a set of concrete inputs that have the same underlying graph but reside at different concrete addresses. Similarly, the logical input map for Input 4 from Figure 1 is <3, 1, 3, 3>.

3.2 Units and Program Model

A unit under test can have several functions. CUTE requires the user to select one of them as the entry function for which CUTE generates inputs. This function in turn can call other functions in the unit as well as functions that are not in the unit (e.g., library functions). The entry function takes as input a memory graph, a set of all memory locations reachable from the input pointers. We assume that the unit operates only on this input, i.e., the unit has no external functions (that would, for example, simulate an interactive input from the user or file reading). However, a program can allocate additional memory, and the execution then operates on some locations that were not reachable in the initial state.

Given an entry function, CUTE generates a main function that first initializes all the arguments of the function by calling the primitive function input() (described next) and then calls the entry function with these arguments. The unit along with the main function forms a closed program that CUTE instruments and tests.

We describe how CUTE works for a simple C-like language shown in Figure 2. START represents the first statement of a program under test. Each statement has an optional label. The program can get input using the expression input(). For simplicity of description, we assume that a program gets all the inputs at the beginning of an execution and the number of inputs is fixed. CUTE uses the CIL framework [19] to convert more complex statements (with no function calls) into this simplified form by introducing temporary variables. For example, CIL converts **v = 3 into t1 = *v; *t1 = 3 and p[i] = q[j] into t1 = q+j; t2 = p+i; *t2 = *t1. Details of handling of function calls using a symbolic stack are discussed in [24].

    P    ::= Stmt*
    Stmt ::= [l:] S
    S    ::= lhs ← e | if p goto l' | START | HALT | ERROR
    lhs  ::= v | *v
    e    ::= v | &v | *v | c | v op v | input()
             where op ∈ {+, −, /, *, %, ...}, v is a variable, c is a constant
    p    ::= v = v | v ≠ v | v < v | v ≤ v | v ≥ v | v > v

Figure 2: Syntax of a simple C-like language

The C expression &v denotes the address of the variable v, and *v denotes the value of the address stored in v. In concrete state, each address stores a value that either is primitive or represents another memory address (pointer).

3.3 Instrumentation

To test a program P, CUTE tries to explore all execution paths of P. To explore all paths, CUTE first instruments the program under test. Then, it repeatedly runs the instrumented program P as follows:

    // input: P is the instrumented program to test
    //        depth is the depth of bounded DFS
    run_CUTE(P, depth)
      I = []; h = (number of arguments in P) + 1;
      completed = false; branch_hist = [];
      while not completed
        execute P

Before starting the execution loop, CUTE initializes the logical input map I to an empty map and the variable h representing the next available logical address to the number of arguments to the instrumented program plus one. (CUTE gives a logical address to each argument at the very beginning.) The integer variable depth specifies the depth in the bounded DFS described in Section 3.4.

Figure 3 shows the code that CUTE adds during instrumentation. The expressions enclosed in double quotes ("e") represent syntactic objects. Due to space constraints, we describe the instrumentation for function calls in [24]. In the following section, we describe the various global variables and procedures that CUTE inserts.

    Before Instrumentation        After Instrumentation
    ----------------------        ---------------------
    // program start              global vars 𝒜 = 𝒫 = path_c = M = [];
    START                         global vars i = inputNumber = 0;
                                  START

    // inputs                     inputNumber = inputNumber + 1;
    v ← input();                  initInput(&v, inputNumber);

    // inputs                     inputNumber = inputNumber + 1;
    *v ← input();                 initInput(v, inputNumber);

    // assignment                 execute_symbolic(&v, "e");
    v ← e;                        v ← e;

    // assignment                 execute_symbolic(v, "e");
    *v ← e;                       *v ← e;

    // conditional                evaluate_predicate("p", p);
    if (p) goto l                 if (p) goto l

    // normal termination         solve_constraint();
    HALT                          HALT;

    // program error              print "Found Error"
    ERROR                         ERROR;

Figure 3: Code that CUTE's instrumentation adds

3.4 Concolic Execution

Recall that a program instrumented by CUTE runs concretely and at the same time performs symbolic computation through the instrumented function calls. The symbolic execution follows the path taken by the concrete execution and replaces with the concrete value any symbolic expression that cannot be handled by our constraint solver.

An instrumented program maintains at run time two symbolic states 𝒜 and 𝒫, where 𝒜 maps memory locations to symbolic arithmetic expressions, and 𝒫 maps memory locations to symbolic pointer expressions. The symbolic arithmetic expressions in CUTE are linear, i.e. of the form a1 x1 + ... + an xn + c, where n ≥ 1, each xi is a symbolic variable, each ai is an integer constant, and c is an integer constant. Note that n must be greater than 0. Otherwise, the expression is a constant, and CUTE does not keep constant expressions in 𝒜, because it keeps 𝒜 small: if a symbolic expression is constant, its value can be obtained from the concrete state. The arithmetic constraints are of the form a1 x1 + ... + an xn + c ⋈ 0, where ⋈ ∈ {<, ≤, ≥, >, =, ≠}. The pointer expressions are simpler: each is of the form xp, where xp is a symbolic variable, or the constant NULL. The pointer constraints are of the form x ≅ y or x ≅ NULL, where ≅ ∈ {=, ≠}.

Given any map M (e.g., 𝒜 or 𝒫), we use M' = M[m ↦ v] to denote the map that is the same as M except that M'(m) = v. We use M' = M − m to denote the map that is the same as M except that M'(m) is undefined. We say m ∈ domain(M) if M(m) is defined.

Input Initialization using Logical Input Map

Figure 4 shows the procedure initInput(m, l) that uses the logical input map I to initialize the memory location m, to update the symbolic states 𝒜 and 𝒫, and to update the input map I with new mappings.

M maps logical addresses to physical addresses of memory cells already allocated in an execution, and malloc(n) allocates n fresh cells for an object of size n and returns the addresses of these cells as a sequence. The global variable h keeps track of the next unused logical address available for a newly allocated object.

For a logical address l passed as an argument to initInput, I(l) can be undefined in two cases: (1) in the first execution when I is the empty map, and (2) when l is some logical address that got allocated in the process of initialization. If I(l) is undefined and if typeOf(l) is not a pointer, then the content of the memory is initialized randomly; otherwise, if typeOf(l) is a pointer, then the contents of l and m are both initialized to NULL. Note that CUTE does not attempt to generate random pointer graphs but assigns all new pointers to NULL. If typeOf(I(l)) is a pointer to T (i.e., T*) and M(l) is defined, then we know that the object pointed to by the logical address l is already allocated and we simply initialize the content of m by M(l). Otherwise, we allocate sufficient physical memory for the object pointed to by *m using malloc and initialize the cells recursively. In the process, we also allocate logical addresses by incrementing h if necessary.

    // input: m is the physical address to initialize
    //        l is the corresponding logical address
    // modifies h, I, 𝒜, 𝒫
    initInput(m, l)
      if l ∉ domain(I)
        if (typeOf(m) == pointer to T) *m ← NULL;
        else *m ← random();
        I = I[l ↦ *m];
      else
        v' = v = I(l);
        if (typeOf(v) == pointer to T)
          if (v ∈ domain(M))
            *m ← M(v);
          else
            n = sizeOf(T);
            {m1, ..., mn} = malloc(n);
            if (v == non-NULL)
              v' = h; h = h + n;   // h is the next logical address
            *m ← m1;
            I = I[l ↦ v']; M = M[v ↦ m1];
            for j = 1 to n
              initInput(mj, v' + j − 1);
        else
          *m ← v; I = I[l ↦ v];
      // xl is a symbolic variable for logical address l
      if (typeOf(m) == pointer to T) 𝒫 = 𝒫[m ↦ xl];
      else 𝒜 = 𝒜[m ↦ xl];

Figure 4: Input initialization

Symbolic Execution

Figure 5 shows the pseudo-code for the symbolic manipulations done by the procedure execute_symbolic, which is inserted by CUTE in the program under test during instrumentation. The procedure execute_symbolic(m, e) evaluates the expression e symbolically and maps it to the memory location m in the appropriate symbolic state. Recall that CUTE replaces a symbolic expression that CUTE's constraint solver cannot handle with the concrete value from the execution. Assume, for instance, that the solver can solve only linear constraints. In particular, when a symbolic expression becomes non-linear, as in the multiplication of two non-constant sub-expressions, CUTE simplifies the symbolic expression by replacing one of the sub-expressions by its current concrete value (see line L in Figure 5). Similarly, if the statement is for instance v'' ← v/v' (see line D in Figure 5), and both v and v' are symbolic, CUTE removes the memory location &v'' from both 𝒜 and 𝒫 to reflect the fact that the symbolic value for v'' is undefined.

    // inputs: m is a memory location
    //         e is an expression to evaluate
    // modifies 𝒜 and 𝒫 by symbolically executing *m ← e
    execute_symbolic(m, e)
      if (i ≤ depth)
        match e:
          case "v1":
            m1 = &v1;
            if (m1 ∈ domain(𝒫))
              𝒜 = 𝒜 − m; 𝒫 = 𝒫[m ↦ 𝒫(m1)];   // remove if 𝒜 contains m
            else if (m1 ∈ domain(𝒜))
              𝒜 = 𝒜[m ↦ 𝒜(m1)]; 𝒫 = 𝒫 − m;
            else
              𝒫 = 𝒫 − m; 𝒜 = 𝒜 − m;
          case "v1 ± v2":   // where ± ∈ {+, −}
            m1 = &v1; m2 = &v2;
            if (m1 ∈ domain(𝒜) and m2 ∈ domain(𝒜))
              v = "𝒜(m1) ± 𝒜(m2)";   // symbolic addition or subtraction
            else if (m1 ∈ domain(𝒜))
              v = "𝒜(m1) ± v2";      // symbolic addition or subtraction
            else if (m2 ∈ domain(𝒜))
              v = "v1 ± 𝒜(m2)";      // symbolic addition or subtraction
            else
              𝒜 = 𝒜 − m; 𝒫 = 𝒫 − m; return;
            𝒜 = 𝒜[m ↦ v]; 𝒫 = 𝒫 − m;
          case "v1 * v2":
            m1 = &v1; m2 = &v2;
            if (m1 ∈ domain(𝒜) and m2 ∈ domain(𝒜))
        L:    v = "v1 * 𝒜(m2)";      // replace one with concrete value
            else if (m1 ∈ domain(𝒜))
              v = "𝒜(m1) * v2";      // symbolic multiplication
            else if (m2 ∈ domain(𝒜))
              v = "v1 * 𝒜(m2)";      // symbolic multiplication
            else
              𝒜 = 𝒜 − m; 𝒫 = 𝒫 − m; return;
            𝒜 = 𝒜[m ↦ v]; 𝒫 = 𝒫 − m;
          case "*v1":
            m2 = v1;
            if (m2 ∈ domain(𝒫))
              𝒜 = 𝒜 − m; 𝒫 = 𝒫[m ↦ 𝒫(m2)];
            else if (m2 ∈ domain(𝒜))
              𝒜 = 𝒜[m ↦ 𝒜(m2)]; 𝒫 = 𝒫 − m;
            else
              𝒜 = 𝒜 − m; 𝒫 = 𝒫 − m;
          default:
        D:  𝒜 = 𝒜 − m; 𝒫 = 𝒫 − m;

Figure 5: Symbolic execution

Figure 6 shows the function evaluate_predicate(p, b) that symbolically evaluates p and updates path_c. In case of pointers, CUTE only considers predicates of the form x = y, x ≠ y, x = NULL, and x ≠ NULL, where x and y are symbolic pointer variables. We discuss this in Section 3.7. If a symbolic predicate expression is constant, then true or false is returned.

At the time of symbolic evaluation of predicates in the procedure evaluate_predicate, symbolic predicate expressions from branching points are collected in the array path_c. At the end of the execution, path_c[0 ... i−1], where i is the number of conditional statements of P that CUTE executes, contains all predicates whose conjunction holds for the execution path.

Note that in both the procedures execute_symbolic and evaluate_predicate, we skip symbolic execution if the number of predicates executed so far (recorded in the global variable i) becomes greater than the parameter depth, which gives the depth of bounded DFS described next.

Bounded Depth-First Search

To explore paths in the execution tree, CUTE implements a (bounded) depth-first strategy (bounded DFS). In the bounded DFS, each run (except the first) is executed with the help of a record of the conditional statements (which is the array branch_hist) executed in the previous run. The procedure cmp_n_set_branch_hist in
Figure 7 checks whether the current execution path matches the one predicted at the end of the previous execution and represented in the variable branch_hist. We observed in our experiments that the execution almost always follows a prediction of the outcome of a conditional. However, it could happen that a prediction is not fulfilled because CUTE approximates, when necessary, symbolic expressions with concrete values (as explained in Section 3.4), and the constraint solver could then produce a solution that changes the outcome of some earlier branch. (Note that even when there is an approximation, the solution does not necessarily change the outcome.) If it ever happens that a prediction is not fulfilled, an exception is raised to restart run_CUTE with a fresh random input.

Bounded depth-first search proves useful when the execution paths are infinite or long enough to prevent exhaustively searching the whole computation tree. Particularly, it is important for generating finite-sized data structures when using preconditions such as data structure invariants (see Section 3.6). For example, if we use an invariant to generate sorted binary trees, then a non-bounded depth-first search would end up generating an infinite number of trees whose every node has at most one left child and no right child.

    // inputs: p is a predicate to evaluate
    //         b is the concrete value of the predicate in S
    // modifies path_c, i
    evaluate_predicate(p, b)
      if (i ≤ depth)
        match p:
          case "v1 ⋈ v2":   // where ⋈ ∈ {<, ≤, ≥, >}
            m1 = &v1; m2 = &v2;
            if (m1 ∈ domain(𝒜) and m2 ∈ domain(𝒜))
              c = "𝒜(m1) − 𝒜(m2) ⋈ 0";
            else if (m1 ∈ domain(𝒜))
              c = "𝒜(m1) − v2 ⋈ 0";
            else if (m2 ∈ domain(𝒜))
              c = "v1 − 𝒜(m2) ⋈ 0";
            else c = b;
          case "v1 ≅ v2":   // where ≅ ∈ {=, ≠}
            m1 = &v1; m2 = &v2;
            if (m1 ∈ domain(𝒫) and m2 ∈ domain(𝒫))
              c = "𝒫(m1) ≅ 𝒫(m2)";
            else if (m1 ∈ domain(𝒫) and v2 == NULL)
              c = "𝒫(m1) ≅ NULL";
            else if (m2 ∈ domain(𝒫) and v1 == NULL)
              c = "𝒫(m2) ≅ NULL";
            else if (m1 ∈ domain(𝒜) and m2 ∈ domain(𝒜))
              c = "𝒜(m1) − 𝒜(m2) ≅ 0";
            else if (m1 ∈ domain(𝒜))
              c = "𝒜(m1) − v2 ≅ 0";
            else if (m2 ∈ domain(𝒜))
              c = "v1 − 𝒜(m2) ≅ 0";
            else c = b;
        if (b) path_c[i] = c;
        else path_c[i] = neg(c);
      cmp_n_set_branch_hist(b);
      i = i + 1;

Figure 6: Symbolic evaluation of predicates

    // modifies branch_hist
    cmp_n_set_branch_hist(branch)
      if (i < |branch_hist|)
        if (branch_hist[i].branch ≠ branch)
          print "Prediction Failed";
          raise an exception;   // restart run_CUTE
        else if (i == |branch_hist| − 1)
          branch_hist[i].done = true;
      else
        branch_hist[i].branch = branch;
        branch_hist[i].done = false;

Figure 7: Prediction checking

3.5 Constraint Solving

We next present how CUTE solves path constraints. Given a path constraint C = neg_last(path_c[0 ... j]), CUTE checks if C is satisfiable, and if so, finds a satisfying solution I'.

We have implemented a constraint solver for CUTE to optimize solving of the path constraints that arise in concolic execution. Our solver is built on top of lp_solve [17], a constraint solver for linear arithmetic constraints. Our solver provides three important optimizations for path constraints:

(OPT 1) Fast unsatisfiability check: The solver checks if the last constraint is syntactically the negation of any preceding constraint; if it is, the solver does not need to invoke the expensive semantic check. (Experimental results show that this optimization reduces the number of semantic checks by 60-95%.)

(OPT 2) Common sub-constraints elimination: The solver identifies and eliminates common arithmetic sub-constraints before passing them to lp_solve. (This simple optimization, along with the next one, is significant in practice as it can reduce the number of sub-constraints by 64% to 90%.)

(OPT 3) Incremental solving: The solver identifies dependency between sub-constraints and exploits it to solve the constraints faster and keep the solutions similar. We explain this optimization in detail.

    // modifies branch_hist, I, completed
    solve_constraint() =
      j = i − 1;
      while (j ≥ 0)
        if (branch_hist[j].done == false)
          branch_hist[j].branch = ¬branch_hist[j].branch;
          if (∃ I' that satisfies neg_last(path_c[0 ... j]))
            branch_hist = branch_hist[0 ... j];
            I = I';
            return;
          else j = j − 1;
        else j = j − 1;
      if (j < 0) completed = true;

Figure 8: Constraint solving

Given a predicate p in C, we define vars(p) to be the set of all symbolic variables that appear in p. Given two predicates p and p' in C, we say that p and p' are dependent if one of the following conditions holds:

1. vars(p) ∩ vars(p') ≠ ∅, or
2. there exists a predicate p'' in C such that p and p'' are dependent and p' and p'' are dependent.

Two predicates are independent if they are not dependent.

The following is an important observation about the path constraints C and C' from two consecutive concolic executions: C and C' differ in a small number of predicates (more precisely, only in the last predicate when there is no backtracking), and thus their respective solutions I and I' must agree on many mappings. Our solver exploits this observation to provide more efficient, incremental constraint solving. The solver collects all the predicates in C that are dependent on ¬path_c[j]. Let this set of predicates be D. Note that all predicates in D are either linear arithmetic predicates or pointer predicates, because no predicate in C contains both arithmetic symbolic variables and pointer symbolic variables. The solver then finds a solution I'' for the conjunction of all predicates from D. The input for the next run is then I' = I[I''], which is the same as I except that for every l for which I''(l) is defined, I'(l) = I''(l). In practice, we have found that the size of D is almost one-eighth the size of C on average.

If all predicates in D are linear arithmetic predicates, then CUTE uses integer linear programming to compute I''. If all predicates in D are pointer predicates, then CUTE uses the following procedure to compute I''.

Let us consider only pointer constraints, which are either equalities or disequalities. The solver first builds an equivalence graph based on (dis)equalities (similar to checking satisfiability in the theory of equality [2]) and then, based on this graph, assigns values to pointers. The values assigned to the pointers can be a logical address in the domain of I, the constant non-NULL (a special constant), or the constant NULL (represented by 0). The solver views NULL as a symbolic variable. Thus, all predicates in D are of the form x = y or x ≠ y, where x and y are symbolic variables. Let D' be the subset of D that does not contain the predicate ¬path_c[j]. The solver first checks if ¬path_c[j] is consistent with the predicates in D. For this, the solver constructs an undirected graph whose nodes are the equivalence classes (with respect to the relation =) of all symbolic variables that appear in D'. We use [x]= to denote the equivalence class of the symbolic variable x. Given two nodes denoted by the equivalence classes [x]= and [y]=, the solver adds an edge between [x]= and [y]= iff there exist symbolic variables u and v such that u ≠ v exists in D' and u ∈ [x]= and v ∈ [y]=. Given the graph, the solver finds that ¬path_c[j] is satisfiable if ¬path_c[j] is of the form x = y and there is no edge between [x]= and [y]= in the graph; otherwise, if ¬path_c[j] is of the form x ≠ y, then ¬path_c[j] is satisfiable if [x]= and [y]= are not the same equivalence class. If ¬path_c[j] is satisfiable, the solver computes I'' using the procedure solve_pointer(¬path_c[j], I) shown in Figure 9.

    // inputs: p is a symbolic pointer predicate
    //         I is the previous solution
    // returns: a new solution I''
    solve_pointer(p, I)
      match p:
        case "x ≠ NULL": I'' = {y ↦ non-NULL | y ∈ [x]=};
        case "x = NULL": I'' = {y ↦ NULL | y ∈ [x]=};
        case "x = y":    I'' = {z ↦ v | z ∈ [y]= and I(x) = v};
        case "x ≠ y":    I'' = {z ↦ non-NULL | z ∈ [y]=};
      return I'';

Figure 9: Assigning values to pointers

Note that after solving the pointer constraints, we either add a node (by assigning a pointer to non-NULL) or remove a node (by assigning a pointer NULL) from the current input graph, or alias or non-alias two existing pointers. This keeps the consecutive solutions similar. Keeping consecutive solutions for pointers similar is important because of the logical input map: if inputs were very different, CUTE would need to rebuild parts of the logical input map.

3.6 Data Structure Testing

We next consider testing of functions that take data structures as inputs. More precisely, a function has some pointer arguments, and the memory graph reachable from the pointers forms a data structure. For instance, consider testing of a function that takes a list and removes an element from it. We cannot simply test such a function in isolation [5,27,30], say by generating random memory graphs as inputs, because the function requires the input memory graph to satisfy the data structure invariant.(1) If an input is invalid (i.e., violates the invariant), the function provides no guarantees and may even result in an error. For instance, a function that expects an acyclic list may loop infinitely given a cyclic list, whereas a function that expects a cyclic list may dereference NULL given an acyclic list. We want to test such functions with valid inputs only. There are two main approaches to obtaining valid inputs: (1) generating inputs with call sequences [27,30] and (2) solving data structure invariants [5,27]. CUTE supports both approaches.

(1) The functions may have additional preconditions, but we omit them for brevity of discussion; for more details, see [5].

Generating Inputs with Call Sequences: One approach to generating data structures is to use sequences of function calls. Each data structure implements functions for several basic operations such as creating an empty structure, adding an element to the structure, removing an element from the structure, and checking if an element is in the structure. A sequence of these operations can be used to generate an instance of the data structure, e.g., we can create an empty list and add several elements to it. This approach has two requirements [27]: (1) all functions must be available (and thus we cannot test each function in isolation), and (2) all functions must be used in generation: for complex data structures, e.g., red-black trees, there are memory graphs that cannot be constructed through additions only but require removals [27,30].

Solving Data Structure Invariants: Another approach to generating data structures is to use the functions that check invariants. Good programming practice suggests that data structures provide such functions. For example, SGLIB [25] (see Section 4.2) is a popular C library for generic data structures that provides such functions. We call these functions repOk [5]. (SGLIB calls them
.) As an illustration, SGLIB implements operations on doubly linked lists and provides a function that checks if a memory graph is a valid doubly linked list; each function returns […] or […] to indicate the validity of the input graph. The main idea of using these functions for testing is to solve them, i.e., generate only the input memory graphs for which […] returns […] [5, 27]. This approach allows modular testing of functions that implement data structure operations (i.e., does not require that all operations be available): all we need for a function under test is a corresponding invariant-checking function. Previous techniques for solving such functions include a search that uses purely concrete execution [5] and a search that uses symbolic execution for primitive data but concrete values for pointers [27]. CUTE, in contrast, uses symbolic execution for both primitive data and pointers.

The constraints that CUTE builds and solves for pointers allow it to solve such functions asymptotically faster than the fastest previous techniques [5, 27]. Consider, for example, the following check from the invariant for a doubly linked list: for each node, n.next.prev == n. Assume that the solver is building a doubly linked list with N nodes reachable along the […] pointers. Assume also that the solver needs to set the values for the […] pointers. Executing the check once, CUTE finds the exact value for each pointer and thus takes O(N) steps to find the values for all N pointers. In contrast, the previous techniques [5, 27] take O(N^2) steps as they search for the value for each pointer, trying first the value […], then a pointer to the head of the list, then a pointer to the second element, and so on.

3.7 Approximations for Scalable Symbolic Execution

CUTE uses simple symbolic expressions for pointers and builds only (dis)equality constraints for pointers. We believe that these constraints, which approximate the exact path condition, are a good trade-off. To exactly track the pointer constraints, it would be necessary to use the theory of arrays/memory with updates and selections [18]. However, it would make the symbolic execution more expensive and could result in constraints whose solution is intractable. Therefore, CUTE does not use the theory of arrays but handles arrays by concretely instantiating them and making each element of the array a scalar symbolic variable.

It is important to note that, although CUTE uses simple pointer constraints, it still keeps a precise relationship between pointers: the logical input map (through types) maintains a relationship between pointers to structs and their fields and between pointers to arrays and their elements. For example, from the logical input map ⟨[…]⟩ for Input 3 from Figure 1, CUTE knows that […] is at the (logical) address 4 because […] has value 3, and the field […] is at the offset 1 in the struct cell. Indeed, the logical input map allows CUTE to use only simple scalar symbolic variables to represent the memory and still obtain fairly precise constraints.

Finally, we show that CUTE does not keep the exact pointer constraints. Consider for example the code snippet *p=0; *q=1; if (*p == 1) ERROR (and assume that p and q are not […]). CUTE cannot generate the constraint […] that would enable the program to take the "then" branch, which is reachable only when p and q alias. This is because the program contains no conditional that can generate the constraint. Analogously, for the code snippet a[i]=0; a[j]=1; if (a[i]==0) ERROR, CUTE cannot generate […].

4. IMPLEMENTATION AND EXPERIMENTAL EVALUATION

We have implemented the main parts of CUTE in C. To instrument code under test, we use CIL [19], a framework for parsing and transforming C programs. To solve arithmetic inequalities, the constraint solver of CUTE uses lp_solve [17], a library for integer linear programming. Further details about the implementation can be found in [24]. We illustrate two case studies that show how CUTE can detect errors. In the second case study, we also present results that show how CUTE achieves branch coverage of the code under test. We performed all experiments on a Linux machine with a dual 1.7 GHz Intel Xeon processor.

4.1 Data Structures of CUTE

We applied CUTE to test its own data structures. CUTE uses a number of non-standard data structures at run-time, such as cu_linear to represent linear expressions, […] to represent pointer expressions,
[…] to represent dependency graphs for path constraints, etc. Our goal in this case study was to detect memory leaks in addition to standard errors such as segmentation faults, assertion violations, etc. To that end, we used CUTE in conjunction with valgrind [26]. We discovered a few memory leaks and a couple of segmentation faults that did not show up in other uses of CUTE. This case study is interesting in that we applied CUTE to partly unit test itself and discovered bugs.

We briefly describe our experience with testing the cu_linear data structure. We tested the cu_linear module of CUTE in the depth-first search mode of CUTE along with valgrind. In 537 iterations, CUTE found a memory leak. The following is a snippet of the function cu_linear_add relevant for the memory leak:

    cu_linear *cu_linear_add(cu_linear *c1, cu_linear *c2, int add) {
        int i, j, k, flag;
        cu_linear *ret = (cu_linear *)malloc(sizeof(cu_linear));
        ... // skipped 18 lines of code
        if (ret->count == 0) return NULL;

If the sum of the two linear expressions passed as arguments becomes constant, the function returns without freeing the memory allocated for the local variable ret. CUTE constructed this scenario automatically at the time of testing. Specifically, CUTE constructed the sequence of function calls […]linear_create(0); l1=cu_linear_create(0); l1=cu_linear_negate(l1); l1=cu_linear[…] that exposes the memory leak that valgrind detects.

4.2 SGLIB Library

We also applied CUTE to unit test SGLIB [25] version 1.0.1, a popular, open-source C library for generic data structures. The library has been extensively used to implement the commercial tool Xrefactory. SGLIB consists of a single C header file, […], with about 2000 lines of code consisting only of C macros. This file provides generic implementations of the most common algorithms for arrays, lists, sorted lists, doubly linked lists, hash tables, and red-black trees. Using the SGLIB macros, a user can declare and define various operations on data structures of parametric types.

The library and its sample examples provide verifier functions (which can be used as […]) for each data structure except for hash tables. We used these verifier functions to test the library using the technique of […] mentioned in Section 3.6. For hash tables, we invoked a sequence of its functions. We used CUTE with the bounded depth-first search strategy with bound 50. Figure 10 shows the results of our experiments.

We chose SGLIB as a case study primarily to measure the efficiency of CUTE. As SGLIB is widely used, we did not expect to find bugs. Much to our surprise, we found two bugs in SGLIB using CUTE.

The first bug is a segmentation fault that occurs in the doubly-linked-list library when a non-zero length list is concatenated with another zero-length list. CUTE discovered the bug in 140 iterations (about 1 second) in the bounded depth-first search mode. This bug is easy to fix by putting a check on the length of the second list in the concatenation function.

The second bug, which is a more serious one, was found by CUTE in the hash table library in 193 iterations (in 1 second). Specifically, CUTE constructed the following valid sequence of function calls which gets the library into an infinite loop:

    typedef struct ilist { int i; struct ilist *next; } ilist;
    ilist *htab[10];
    main() {
        struct ilist *e, *e1, *e2, *m;
        sglib_hashed_ilist_init(htab);
        e = (ilist *)malloc(sizeof(ilist)); e->next = 0; e->i = 0;
        sglib_hashed_ilist_add_if_not_member(htab, e, &m);
        sglib_hashed_ilist_add(htab, e);
        e2 = (ilist *)malloc(sizeof(ilist)); e2->next = 0; e2->i = 0;
        sglib_hashed_ilist_is_member(htab, e2);
    }

where ilist is a struct representing an element of the hash table. We reported these bugs to the SGLIB developers, who confirmed that these are indeed bugs.
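As a rough illustration of the kind of verifier function this case study relies on, the following C sketch checks the doubly-linked-list invariant n.next.prev == n discussed in Section 3.6. The node type dnode and the function dlist_is_valid are hypothetical names introduced only for this sketch; they are not SGLIB's macros or functions, and SGLIB's own verifiers may check more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node layout, assumed for illustration only. */
typedef struct dnode {
    int value;
    struct dnode *prev;
    struct dnode *next;
} dnode;

/*
 * Returns true iff the memory graph reachable from head along next
 * is a valid NULL-terminated doubly linked list:
 *   - the head has no predecessor, and
 *   - every node n with a successor satisfies n->next->prev == n.
 * The loop terminates even on cyclic inputs: the first edge that
 * closes a cycle points at a node whose prev is either NULL (the head)
 * or an earlier node, so the prev check fails at that edge.
 */
static bool dlist_is_valid(const dnode *head)
{
    if (head == NULL)
        return true;            /* the empty list is valid */
    if (head->prev != NULL)
        return false;           /* head must not have a predecessor */
    for (const dnode *n = head; n != NULL; n = n->next) {
        if (n->next != NULL && n->next->prev != n)
            return false;       /* back pointer disagrees with forward pointer */
    }
    return true;
}
```

A test driver would call a function under test only on memory graphs for which such a check succeeds, for example by guarding the call with assert(dlist_is_valid(list)). Solving the invariant, in the sense of Section 3.6, means generating exactly those graphs for which the check returns true.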
Name                Runtime     # of        # of Branches  % Branch  # of Functions  OPT1   OPT2 & 3  # of Bugs
                    in seconds  Iterations  Explored       Coverage  Tested          in %   in %      Found
Array Quick Sort    2           732         43             97.73     2               67.80  49.13     0
Array Heap Sort     4           1764        36             100.00    2               71.10  46.38     0
Linked List         2           570         100            96.15     12              86.93  88.09     0
Sorted List         2           1020        110            96.49     11              88.86  80.85     0
Doubly Linked List  3           1317        224            99.12     17              86.95  79.38     1
Hash Table          1           193         46             85.19     8               97.01  52.94     1
Red Black Tree      2629        1,000,000   242            71.18     17              89.65  64.93     0

Figure 10: Results for testing SGLIB 1.0.1 with bounded depth-first strategy with depth 50

Figure 10 shows the results for testing SGLIB 1.0.1 with the bounded depth-first strategy. For each data structure and array sorting algorithm that SGLIB implements, we tabulate the time that CUTE took to test the data structure, the number of runs that CUTE made, the number of branches it executed, the branch coverage obtained, the number of functions executed, the benefit of optimizations, and the number of bugs found.

The branch coverage in most cases is less than 100%. After investigating the reason for this, we found that the code contains a number of assert statements that were never violated and a number of predicates that are redundant and can be removed from the conditionals.

The OPT1 and OPT2 & 3 columns in Figure 10 show the benefit of the three optimizations from Section 3.5. The column OPT1 gives the average percentage of executions in which the fast unsatisfiability check was successful. It is important to note that the saving in the number of satisfiability checks translates into an even higher relative saving in the satisfiability-checking time because lp_solve takes much more time (exponential in the number of constraints) to determine that a set of constraints is unsatisfiable than to generate a solution when one exists. For example, for red-black trees and depth-first search, OPT1 was successful in almost 90% of executions, which means that OPT1 reduces the number of calls to lp_solve by an order of magnitude. However, OPT1 reduces the solving time of lp_solve by
more than two orders of magnitude in this case; in other words, it would be infeasible to run CUTE without OPT1. The column OPT2 & 3 gives the average percentage of constraints that CUTE eliminated in each execution due to common sub-expression elimination and incremental solving optimizations. Yet again, this reduction in the size of the constraint set translates into a much higher relative reduction in the solving time.

5. RELATED WORK

Automating unit testing is an active area of research. In the last five years, over a dozen techniques and tools have been proposed that automatically increase test coverage or generate test inputs.

The simplest, and yet often very effective, techniques use random generation of (concrete) test inputs [4, 8, 10, 20, 21]. Some recent tools use bounded-exhaustive concrete execution [5, 12, 29] that tries all values from user-provided domains. These tools can achieve high code coverage, especially for testing data structure implementations. However, they require the user to carefully choose the values in the domains to ensure high coverage.

Tools based on symbolic execution use a variety of approaches, including abstraction-based model checking [1, 3], explicit-state model checking [27], symbolic-sequence exploration [22, 30], and static analysis [9], to detect (potential) bugs or generate test inputs. These tools inherit the incompleteness of their underlying reasoning engines such as theorem provers and constraint solvers. For example, tools using precise symbolic execution [27, 30] cannot analyze any code that would build constraints outside of pre-specified theories, e.g., any code with non-linear arithmetic or array indexing with non-constant expressions. As another example, tools based on predicate abstraction [1, 3] do not handle code that depends on complex data structures. In these tools, the symbolic execution proceeds separately from the concrete execution (or constraint solving).

The closest work to ours is Godefroid et al.'s directed automated random testing (DART) [11]. DART consists of three parts: (1) directed generation of test inputs, (2) automated extraction of unit interfaces from source code, and (3) random generation of test inputs. CUTE does not provide automated extraction of interfaces but leaves it up to the user to specify which functions are related and what their preconditions are. Unlike DART, which was applied to testing each function in isolation and without preconditions, CUTE targets related functions with preconditions such as data structure implementations. DART handles constraints only on integer types and cannot handle programs with pointers and data structures; in such situations, the DART tool's testing reduces to simple and ineffective random testing. DART proposed a simple strategy to generate random memory graphs: each pointer is either […] or points to a new memory cell whose nodes are recursively initialized. This strategy suffers from several deficiencies:

1. The random generation itself may not terminate [7].
2. The random generation produces only trees; there is no sharing and aliasing, so there are no DAGs or cycles.
3. The directed generation does not keep track of any constraints on pointers.
4. The directed generation never changes the underlying memory graph; it can only change the (primitive, integer) values in the nodes in the graph.

DART also does not consider any preconditions for the code under test. For example, in the oSIP case study [11], it is unclear whether some values are actual bugs or false alarms due to violated preconditions. Moreover, CUTE implements a novel constraint solver that significantly speeds up the analysis.

Cadar and Engler proposed Execution Generated Testing (EGT) [6], which takes a similar approach to testing as CUTE: it explores different execution paths using a combined symbolic and concrete execution. However, EGT did not consider inputs that are memory graphs or code that has preconditions. Also, EGT and CUTE differ in how they approximate symbolic expressions with concrete values. EGT follows a more traditional approach to symbolic execution and proposes an interesting method that lazily solves the path constraints: EGT starts with only symbolic inputs and tries to execute the code fully symbolically, but if it cannot, EGT solves the current constraints to generate a (partial) concrete input with which the execution proceeds.

CUTE is also related to the prior work that uses backtracking to generate a test input that executes one given path (that may be known to contain a bug) [13, 15]. In contrast, CUTE attempts to cover all feasible paths, in a style similar to systematic testing. Moreover, this initial work did not address inputs that are memory graphs. Visvanathan and Gupta [28] recently proposed a technique that generates memory graphs. They also use a specialized symbolic execution (not the exact execution with symbolic arrays) and develop a solver for their constraints. However, they consider one given path, do not consider unknown code segments (e.g., library functions), and do not use a combined concrete execution to generate new test inputs.

6. DISCUSSION

Our work shows that approximate symbolic execution for testing code with dynamic data structures is feasible and scalable. Moreover, we have shown how to efficiently generate dynamic data structures by incrementally adding and removing a node, or by aliasing two pointers. While we described an implementation for C, we have also developed an implementation for the sequential subset of Java. We are currently investigating how to test programs with concurrency using a similar method. We are also investigating the application of the technique to find algebraic security attacks in cryptographic protocols, and security breaches in unsafe languages.

Acknowledgements

We are indebted to Patrice Godefroid and Nils Klarlund for their comments on a previous version of this paper and for suggestions on clarifying the relationship of the current work with DART. Moreover, the first author benefited greatly from interaction with them during a summer internship. We would like to thank Thomas Ball, Cristian Cadar, Sarfraz Khurshid, Alex Orso, Rupak Majumdar, Sameer Sundresh, and Tao Xie for providing valuable comments. This work is supported in part by the ONR Grant N00014-02-1-0715.

7. REFERENCES

[1] T. Ball. Abstraction-guided test generation: A case study. Technical Report MSR-TR-2003-86, Microsoft Research.
[2] C. W. Barrett and S. Berezin. CVC Lite: A new implementation of the cooperating validity checker. In Proc. 16th International Conference on Computer Aided Verification, pages 515–518, July 2004.
[3] D. Beyer, A. J. Chlipala, T. A. Henzinger, R. Jhala, and R. Majumdar. Generating Test from Counterexamples. In Proc. of the 26th ICSE, pages 326–335, 2004.
[4] D. Bird and C. Munoz. Automatic Generation of Random Self-Checking Test Cases. IBM Systems Journal, 22(3):229–245, 1983.
[5] C. Boyapati, S. Khurshid, and D. Marinov. Korat: Automated testing based on Java predicates. In Proc. of International Symposium on Software Testing and Analysis, pages 123–133, 2002.
[6] C. Cadar and D. Engler. Execution generated test cases: How to make systems code crash itself. In Proc. of SPIN Workshop, 2005.
[7] K. Claessen and J. Hughes. Quickcheck: A lightweight tool for random testing of Haskell programs. In Proc. of 5th ACM SIGPLAN International Conference on Functional Programming (ICFP), pages 268–279, 2000.
[8] C. Csallner and Y. Smaragdakis. JCrasher: an automatic robustness tester for Java. Software: Practice and Experience, 34:1025–1050, 2004.
[9] C. Csallner and Y. Smaragdakis. Check 'n' Crash: Combining static checking and testing. In 27th International Conference on Software Engineering, 2005.
[10] J. E. Forrester and B. P. Miller. An Empirical Study of the Robustness of Windows NT Applications Using Random Testing. In Proceedings of the 4th USENIX Windows System Symposium, 2000.
[11] P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In Proc. of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation (PLDI), 2005.
[12] W. Grieskamp, Y. Gurevich, W. Schulte, and M. Veanes. Generating finite state machines from abstract state machines. In Proc. International Symposium on Software Testing and Analysis, pages 112–122, 2002.
[13] N. Gupta, A. P. Mathur, and M. L. Soffa. Generating test data for branch coverage. In Proc. of the International Conference on Automated Software Engineering, pages 219–227, 2000.
[14] S. Khurshid, C. S. Pasareanu, and W. Visser. Generalized symbolic execution for model checking and testing. In Proc. 9th Int. Conf. on TACAS, pages 553–568, 2003.
[15] B. Korel. A dynamic Approach of Test Data Generation. In IEEE Conference on Software Maintenance, pages 311–317, November 1990.
[16] E. Larson and T. Austin. High coverage detection of input-related security faults. In Proc. of the 12th USENIX Security Symposium (Security '03), Aug. 2003.
[17] lp_solve. http://groups.yahoo.com/group/lp_solve/.
[18] J. McCarthy and J. Painter. Correctness of a compiler for arithmetic expressions. In Proceedings of Symposia in Applied Mathematics. AMS, 1967.
[19] G. C. Necula, S. McPeak, S. P. Rahul, and W. Weimer. CIL: Intermediate Language and Tools for Analysis and transformation of C Programs. In Proceedings of Conference on compiler Construction, pages 213–228, 2002.
[20] J. Offutt and J. Hayes. A Semantic Model of Program Faults. In Proc. of ISSTA '96, pages 195–200, 1996.
[21] C. Pacheco and M. D. Ernst. Eclat: Automatic generation and classification of test inputs. In 19th European Conference Object-Oriented Programming, 2005.
[22] Parasoft. Jtest manuals version 6.0. Online manual, February 2005. http://www.parasoft.com/.
[23] C. S. Pasareanu, M. B. Dwyer, and W. Visser. Finding feasible counter-examples when model checking abstracted java programs. In Proc. of TACAS '01, pages 284–298, 2001.
[24] K. Sen, D. Marinov, and G. Agha. CUTE: A concolic unit testing engine for C. Technical Report UIUCDCS-R-2005-2597, UIUC, 2005.
[25] SGLIB. http://xref-tech.com/sglib/main.html.
[26] Valgrind. http://valgrind.org/.
[27] W. Visser, C. S. Pasareanu, and S. Khurshid. Test input generation with Java PathFinder. In Proc. 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 97–107, 2004.
[28] S. Visvanathan and N. Gupta. Generating test data for functions with pointer inputs. In 17th IEEE International Conference on Automated Software Engineering, 2002.
[29] T. Xie, D. Marinov, and D. Notkin. Rostra: A framework for detecting redundant object-oriented unit tests. In Proc. 19th IEEE International Conference on Automated Software Engineering, pages 196–205, Sept. 2004.
[30] T. Xie, D. Marinov, W. Schulte, and D. Notkin. Symstra: A framework for generating object-oriented unit tests using symbolic execution. In Proc. of the Tools and Algorithms for the Construction and Analysis of Systems, 2005.