Objects may be reordered, rounded, or resized; they may have names or attributes removed; or they may even be coerced to a new class if necessary in order to achieve equality. The results of comparisons report not just whether the objects are the same bu...
1. Write R code to create the three vectors and the factor shown below, with names id, age, edu, and class. You should end up with objects that look like this:

    > id
    [1] 1 2 3 4 5 6
    > age
    [1] 30 32 28 39 20 25
    > edu
    [1] 0 0 0 0 0 0
    > class
    [1] poor   poor   poor   middle
    [5] middle middle
    Levels: middle poor

2. Combine the objects from Question 1 together to make a data frame called IndianMothers. You should end up with an object that looks like this:

    > IndianMothers
      id age edu  class
    1  1  30   0   poor
    2  2  32   0   poor
    3  3  28   0   poor
    4  4  39   0 middle
    5  5  20   0 middle
    6  6  25   0 middle

Figure 1: Two simple examples of the exercises that STATS 220 students are asked to perform.

This returns TRUE if the two objects are exactly the same, otherwise it returns FALSE. The problem with this function is that it is very strict indeed and will fail for objects that are, for all practical purposes, the same. The classic example is the comparison of two real (floating-point) values, as demonstrated in the following code, where differences can arise simply due to the limitations of how numbers are represented in computer memory (see R FAQ 7.31, Hornik, 2008).

    > identical(0.3 - 0.2, 0.1)
    [1] FALSE

Using the function to test for equality would clearly be unreasonably harsh when marking any student answer that involves calculating a numeric result. The identical() function, by itself, is not sufficient for comparing student answers with model answers.

Shades of grey

The recommended solution to the problem mentioned above of comparing two floating-point values is to use the all.equal() function. This function allows for "insignificant" differences between numeric values, as shown below.

    > all.equal(0.3 - 0.2, 0.1)
    [1] TRUE
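The contrast between the strict and the lenient test can be tried directly in base R. The following is only a minimal sketch: the helper check_answer() is a hypothetical name introduced here for illustration and is not part of any package. It relies on the fact that all.equal() returns a character vector of messages, rather than FALSE, when the objects differ, so its result is wrapped in isTRUE().

```r
# Base R only; no packages are needed for this sketch.
x <- 0.3 - 0.2
y <- 0.1

strict  <- identical(x, y)          # exact comparison of the two values
lenient <- isTRUE(all.equal(x, y))  # equality within a small numeric tolerance

# When objects differ, all.equal() returns messages instead of FALSE.
diffs <- all.equal(c("a", "b", "c"), c("a", "B"))

# Hypothetical helper (an assumption for illustration): returns a
# TRUE/FALSE verdict together with any difference messages.
check_answer <- function(model, answer) {
  res <- all.equal(model, answer)
  list(same = isTRUE(res),
       messages = if (isTRUE(res)) character(0) else res)
}

print(strict)
print(lenient)
print(diffs)
print(check_answer(c(1, 2, 3), c(1, 2, 3 + 1e-10)))
```

Wrapping all.equal() in isTRUE() is the usual way to obtain a single logical value for use in a condition, while keeping the messages available for feedback.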
This makes all.equal() a much more appropriate function for comparing student answers with model answers.

What is less well-known about the all.equal() function is that it also works for comparing other sorts of R objects, besides numeric vectors, and that it does more than just report equality between two objects. If the objects being compared have differences, then all.equal() does not simply return FALSE. Instead, it returns a character vector containing messages that describe the differences between the objects. The following code gives a simple example, where all.equal() reports that the two character vectors have different lengths, and that, of the two pairs of strings that can be compared, one pair of strings does not match.

    > all.equal(c("a", "b", "c"), c("a", "B"))
    [1] "Lengths (3, 2) differ (string compare on first 2)"
    [2] "1 string mismatch"

This feature is actually very useful for marking student work. Information about whether a student's answer is correct is useful for determining a raw mark, but it is also useful to have information about what the student did wrong. This information can be used as the basis for assigning partial marks for an answer that is close to the correct answer, and for providing feedback to the student about where marks were lost.

The all.equal() function has some useful features that make it a helpful tool for comparing student answers with model answers. However, there is an approach that can perform better than this. The all.equal() function looks for equality between two objects and, if that fails, provides information about the sort of differences that exist. An alternative approach, when two objects are not equal, is to try to transform the objects to make them equal, and report on which transformations were necessary in order to achieve equality.

As an example of the difference between these approaches, consider the two objects below: a character vector and a factor.

    > obj1 <- c("a", "a", "b", "c")
    > obj1
    [1] "a" "a" "b" "c"
    > obj2 <- factor(obj1)
    > obj2
    [1] a a b c
    Levels: a b c

The all.equal() function reports that these objects are different because they differ in terms of their fundamental mode, because one has attributes and the other does not, and because each object is of a different class.

    > all.equal(obj1, obj2)
    [1] "Modes: character, numeric"
    [2] "Attributes: < target is NULL, current is list >"
    [3] "target is character, current is factor"

The alternative approach would be to allow various transformations of the objects to see if they can be transformed to be the same. The following code shows this approach, which reports that the objects are equal, if the second one is coerced from a factor to a character vector. This is more information than was provided by all.equal() and, in the particular case of comparing student answers to model answers, it tells us a lot about how close the student got to the right answer.

    > library(compare)
    > compare(obj1, obj2, allowAll=TRUE)
    TRUE
      coerced from <factor> to <character>

Another limitation of all.equal() is that it does not report on some other possible differences between objects. For example, it is possible for a student to have the correct values for an R object, but have the values in the wrong order. Another common mistake is to get the case wrong in a set of string values (e.g., in a character vector or in the names attribute of an object).

In summary, while all.equal() provides some desirable features for comparing student answers to model answers, we can do better by allowing for a wider range of differences between objects and by taking a different approach that attempts to transform the student answer to be the same as the model answer, if at all possible, while reporting which transformations were necessary.

The remainder of this article describes the compare package, which provides functions for producing these sorts of comparisons.

The compare() function

The main function in the compare package is the compare() function. This function checks whether two objects are the same and, if they are not, carries out various transformations on the objects and checks them again to see if they are the same after they have been transformed.

By default, compare() only succeeds if the two objects are identical (using the identical() function) or the two objects are numeric and they are equal (according to all.equal()). If the objects are not the same, no transformations of the objects are considered. In other words, by default, compare() is simply a convenience wrapper for identical() and all.equal(). As a simple example, the following comparison takes account of the fact that the values being compared are numeric and uses all.equal() rather than identical().

    > compare(0.3 - 0.2, 0.1)
    TRUE

Transformations

The more interesting uses of compare() involve specifying one or more of the arguments that allow transformations of the objects that are being compared. For example, the coerce argument specifies that the second argument may be coerced to the class of the first argument. This allows for more flexible comparisons such as between a factor and a character vector.

    > compare(obj1, obj2, coerce=TRUE)
    TRUE
      coerced from <factor> to <character>

It is important to note that there is a definite order to the objects; the model object is given first and the comparison object is given second. Transformations attempt to make the comparison object like the model object, though in a number of cases (e.g., when ignoring the case of strings) the model object may also be transformed. In the example above, the comparison object has been coerced to be the same class as the model object. The following code demonstrates the effect of reversing the order of the objects in the comparison. Now the character vector is being coerced to a factor.

    > compare(obj2, obj1, coerce=TRUE)
    TRUE
      coerced from <character> to <factor>

Of course, transforming an object is not guaranteed to produce identical objects if the original objects are genuinely different.

    > compare(obj1, obj2[1:3], coerce=TRUE)
    FALSE
      coerced from <factor> to <character>

Notice, however, that even though the comparison failed, the result still reports the transformation that was attempted. This result indicates that the comparison object was converted from a factor (to a character vector), but it still did not end up being the same as the model object.

A number of other transformations are available in addition to coercion. For example, differences in length, like in the last case, can also be ignored.

    > compare(obj1, obj2[1:3],
    +         shorten=TRUE, coerce=TRUE)
    TRUE
      coerced from <factor> to <character>
      shortened model

It is also possible to allow values to be sorted, or rounded, or to convert all character values to upper case (i.e., ignore the case of strings). Table 1 provides a complete list of the transformations that are currently allowed (in version 0.2 of compare) and the arguments that are used to enable them.

A further argument to the compare() function, allowAll, controls the default setting for most of these transformations, so specifying allowAll=TRUE is a quick way of enabling all possible transformations. Specific transformations can still be excluded by explicitly setting the appropriate argument to FALSE.

The equal argument is a bit of a special case because it is TRUE by default, whereas almost all others are FALSE. The equal argument is also especially influential because objects are compared after every transformation and this argument controls what sort of comparison takes place. Objects are always compared using identical() first, which will only succeed if the objects have exactly the same representation in memory. If the test using identical() fails and equal=TRUE, then a more lenient comparison is also performed. By default, this just means that numeric values are compared using all.equal(), but various other arguments can extend this to allow things like differences in case for character values (see the asterisked arguments in Table 1).

The round argument is also special because it always defaults to FALSE, even if allowAll=TRUE. This means that the round argument must be specified explicitly in order to enable rounding. The default is set up this way because the value of the round argument is either FALSE or an integer value specifying the number of decimal places to round to. For this argument, the value TRUE corresponds to rounding to zero decimal places.

Finally, there is an additional argument colsOnly for comparing data frames. This argument controls whether transformations are only applied to columns (and not to rows). For example, by default, a data frame will only allow columns to be dropped, but not rows, if shorten=TRUE. Note, however, that ignoreOrder means ignore the order of rows for data frames and ignoreColOrder must be used to ignore the order of columns in comparisons involving data frames.

The compareName() function

The compareName() function offers a slight variation on the compare() function. For this function, only the name of the comparison object is specified, rather than an explicit object. The advantage of this is that it allows for variations in case in the names of objects. For example, a student might create a variable called indianMothers rather than the desired IndianMothers. This case-insensitivity is enabled via the ignore.case argument.

Another advantage of this function is that it is possible to specify, via the compEnv argument, a particular environment to search within for the comparison object (rather than just the current workspace). This becomes useful when checking the answers from several students because each student's answers may be generated within a separate environment in order to avoid any interactions between code from different students.

The following code shows a simple demonstration of this function, where a comparison object is created within a temporary environment and the name of the comparison object is upper case when it should be lower case.

    > tempEnv <- new.env()
    > with(tempEnv, X <- 1:10)
    > compareName(1:10, "x", compEnv=tempEnv)
    TRUE
      renamed object

Notice that, as with the transformations in compare(), the compareName() function records whether it needed to ignore the case of the name of the comparison object.

A pathological example

This section shows a manufactured example that demonstrates some of the flexibility of the compare() function. We will compare two data frames that have a number of simple differences. The model object is a data frame with three columns: a numeric vector, a character vector, and a factor.

    > model <-
    +     data.frame(x=1:26,
    +                y=letters,
    +                z=factor(letters),
    +                row.names=letters,
    +                stringsAsFactors=FALSE)

The comparison object contains essentially the same information, except that there is an extra column, the column names are upper case rather than lower case, the columns are in a different order, the y variable is a factor rather than a character vector, and the z variable is a character variable rather than a factor. The y variable and the row names are also upper case rather than lower case.

    > comparison <-
    +     data.frame(W=26:1,
    +                Z=letters,
    +                Y=factor(LETTERS),
    +                X=1:26,
    +                row.names=LETTERS,
    +                stringsAsFactors=FALSE)

The compare() function can detect that these two objects are essentially the same as long as we reorder the columns (ignoring the case of the column names), coerce the y and z variables, drop the extra variable, ignore the case of the y variable, and ignore the case of the row names.

    > compare(model, comparison, allowAll=TRUE)
    TRUE
      renamed
      reordered columns
      [Y] coerced from <factor> to <character>
      [Z] coerced from <character> to <factor>
      shortened comparison
      [Y] ignored case
      renamed rows

Notice that we have used allowAll=TRUE to allow compare() to attempt all possible transformations at its disposal.

Comparing files of R code

Returning now to the original motivation for the compare package, the compare() function provides an excellent basis for determining not only whether a student's answers are correct, but also how much incorrect answers differ from the model answer.

As described earlier, submissions by students in the STATS 220 course consist of files of R code. Marking these submissions consists of using source() to run the code, then comparing the resulting objects with model answer objects. With approximately 100 students in the STATS 220 course, with weekly labs, and with multiple questions per lab, each of which may contain more than one R object, there is a reasonable marking burden. Consequently, there is a strong incentive to automate as much of the marking process as possible.

The compareFile() function

The compareFile() function can be used to run R code from a specific file and compare the results with a set of model answers. This function requires three pieces of information: the name of a file containing the "comparison code", which is run within a local environment, using source(), to generate the comparison values; a vector of "model names", which are the names of the objects that will be looked for in the local environment after the comparison code has been run; and the model answers, either as the name of a binary file to load(), or as the name of a file of R code to source(), or as a list object containing the ready-made model answer objects. Any argument to compare() may also be included in the call.

Once the comparison code has been run, compareName() is called for each of the model names and the result is a list of "comparison" objects.

As a simple demonstration, consider the basic questions shown in Figure 1. The model names in this case are the following:

    > modelNames <- c("id", "age",
    +                 "edu", "class",
    +                 "IndianMothers")

One student's submission for this exercise is in a file called student1.R, within a directory called Examples. The model answer is in a file called model.R in the same directory. We can evaluate this student's submission and compare it to the model answer with the following code:

    > compareFile(file.path("Examples",
    +                       "student1.R"),
    +             modelNames,
    +             file.path("Examples",
    +                       "model.R"))
    $id
    TRUE

    $age
    TRUE

    $edu
    TRUE

    $class
    FALSE

    $IndianMothers
    FALSE
      object not found

This provides a strict check and shows that the student got the first three problems correct, but the last two wrong. In fact, the student's code completely failed to generate an object with the name IndianMothers.

We can provide extra arguments to allow transformations of the student's answers, as in the following code:

    > compareFile(file.path("Examples",
    +                       "student1.R"),
    +             modelNames,
    +             file.path("Examples",
    +                       "model.R"),
    +             allowAll=TRUE)
    $id
    TRUE

    $age
    TRUE

    $edu
    TRUE

    $class
    TRUE
      reordered levels

    $IndianMothers
    FALSE
      object not found

This shows that, although the student's answer for the class object was not perfect, it was pretty close; it just had the levels of the factor in the wrong order.

The compareFiles() function

The compareFiles() function builds on compareFile() by allowing a vector of comparison file names. This allows a whole set of student submissions to be tested at once. The result of this function is a list of lists of "comparison" objects and a special print method provides a simplified view of this result.

Continuing the example from above, the Examples directory contains submissions from a further four students. We can compare all of these submissions with the model answers and produce a summary of the results with a single call to compareFiles(). The appropriate code and output are shown in Figure 2.
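The idea of running each submission inside its own environment and then looking up objects by name can be illustrated in base R. The following is only a rough sketch under stated assumptions: check_file() is a hypothetical helper, not a function from the compare package, and it performs no transformations (it uses all.equal() alone). The two "submissions" are made-up files written to a temporary location.

```r
# Hypothetical helper (not part of the compare package): run a file of
# R code in a fresh environment, then check each named model object.
check_file <- function(file, model) {
  env <- new.env()
  source(file, local = env)  # objects created by the code land in env
  sapply(names(model), function(nm) {
    # A missing object counts as a failed comparison.
    if (!exists(nm, envir = env, inherits = FALSE)) return(FALSE)
    isTRUE(all.equal(model[[nm]], get(nm, envir = env)))
  })
}

# Model answers for two of the objects from Figure 1.
model <- list(id = 1:6,
              age = c(30, 32, 28, 39, 20, 25))

# Two made-up submissions, written to temporary files.
good <- tempfile(fileext = ".R")
bad  <- tempfile(fileext = ".R")
writeLines(c("id <- 1:6", "age <- c(30, 32, 28, 39, 20, 25)"), good)
writeLines("id <- 1:6", bad)  # this submission never creates age

# One row per submission, one column per model object.
results <- t(sapply(c(student_a = good, student_b = bad),
                    check_file, model = model))
print(results)
```

Because each file is sourced into a new environment, an object left behind by one submission cannot be mistaken for an answer from another, which is the motivation given above for the compEnv argument.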
    > files <- list.files("Examples",
    +                     pattern="^student[0-9]+[.]R$",
    +                     full.names=TRUE)
    > results <- compareFiles(files,
    +                         modelNames,
    +                         file.path("Examples", "model.R"),
    +                         allowAll=TRUE,
    +                         resultNames=gsub("Examples.|[.]R", "", files))
    > results
             id   age  edu  class                                     IndianMothers
    student1 TRUE TRUE TRUE TRUE reordered levels                     FALSE object not found
    student2 TRUE TRUE TRUE TRUE                                      TRUE
    student3 TRUE TRUE TRUE TRUE coerced from <character> to <factor> FALSE object not found
    student4 TRUE TRUE TRUE TRUE coerced from <character> to <factor> TRUE renamed object
    student5 TRUE TRUE TRUE FALSE object not found                    FALSE object not found

Figure 2: Using the compareFiles() function to run R code from several files and compare the results to model objects. The result of this sort of comparison can easily get quite wide, so it is often useful to print the result with options(width) set to some large value and using a small font, as has been done here.

The results show that most students got the first three problems correct. They had more trouble getting the fourth problem right, with one getting the factor levels in the wrong order and two others producing a character vector rather than a factor. Only one student, student2, got the final problem exactly right and only one other, student4, got essentially the right answer, though this student spelt the name of the object wrong.

Assigning marks and giving feedback

The result returned by compareFiles() is a list of lists of comparison results, where each result is itself a list of information including whether two objects are the same and a record of how the objects were transformed during the comparison. This represents a wealth of information with which to assess the performance of students on a set of R exercises, but it can be a little unwieldy to deal with.

The compare package provides further functions that make it easier to deal with this information for the purpose of determining a final mark and for the purpose of providing comments for each student submission.

In order to determine a final mark, we use the questionMarks() function to specify which object names are involved in a particular question, to provide a maximum mark for the question, and to specify a set of rules that determine how many marks should be deducted for various deviations from the correct answers.

The rule() function is used to define a marking rule. It takes an object name, a number of marks to deduct if the comparison for that object is FALSE, plus any number of transformation rules. The latter are generated using the transformRule() function, which associates a regular expression with a number of marks to deduct. If the regular expression is matched in the record of transformations for a comparison, then the appropriate number of marks is deducted.

A simple example, based on the second question in Figure 1, is shown below. This specifies that the question only involves an object named IndianMothers, that there is a maximum mark of 1 for this question, and that 1 mark is deducted if the comparison is FALSE.

    > q2 <-
    +     questionMarks("IndianMothers",
    +                   maxMark=1,
    +                   rule("IndianMothers", 1))

The first question from Figure 1 provides a more complex example. In this case, there are four different objects involved and the maximum mark is 2. The rules below specify that any FALSE comparison drops a mark and that, for the comparison involving the object named "class", a mark should also be deducted if coercion was necessary to get a TRUE result.

    > q1 <-
    +     questionMarks(
    +         c("id", "age", "edu", "class"),
    +         maxMark=2,
    +         rule("id", 1),
    +         rule("age", 1),
    +         rule("edu", 1),
    +         rule("class", 1,
    +              transformRule("coerced", 1)))

Having set up this marking scheme, marks are generated using the markQuestions() function, as shown by the following code.

    > markQuestions(results, q1, q2)
             id-age-edu-class IndianMothers
    student1                2             0
    student2                2             1
    student3                1             0
    student4                1             1
    student5                1             0

For the first question, the third and fourth students lose a mark because of the coercion, and the fifth student loses a mark because he has not generated the required object.

A similar suite of functions is provided to associate comments, rather than mark deductions, with particular transformations. The following code provides a simple demonstration.

    > q1comments <-
    +     questionComments(
    +         c("id", "age", "edu", "class"),
    +         comments(
    +             "class",
    +             transformComment(
    +                 "coerced",
    +                 "'class' is a factor!")))
    > commentQuestions(results, q1comments)
             id-age-edu-class
    student1 ""
    student2 ""
    student3 "'class' is a factor!"
    student4 "'class' is a factor!"
    student5 ""

In this case, we have just generated feedback for the students who generated a character vector instead of the desired factor in Question 1 of the exercise.

Summary, discussion, and future directions

The compare package is based around the compare() function, which compares two objects for equality and, if they are not equal, attempts to transform the objects to make them equal. It reports whether the comparison succeeded overall and provides a record of the transformations that were attempted during the comparison.

Further functions are provided on top of the compare() function to facilitate marking exercises where students in a class submit R code in a file to create a set of R objects.

This article has given some basic demonstrations of the use of the compare package for comparing objects and marking student submissions. The package could also be useful for the students themselves, both to check whether they have the correct answer and to provide feedback about how their answer differs from the model answer. More generally, the compare() function may have application wherever the identical() and all.equal() functions are currently in use. For example, it may be useful when debugging code and for performing regression tests as part of a quality control process.

Obvious extensions of the compare package include adding new transformations and providing comparison methods for other classes of objects. More details about how the package works and how these extensions might be developed are discussed in the vignette, "Fundamentals of the Compare Package", which is installed as part of the compare package.

Acknowledgements

Many thanks to the editors and anonymous reviewers for their useful comments and suggestions, on both this article and the compare package itself.
References

K. Hornik. The R FAQ, 2008. URL http://CRAN.R-project.org/doc/FAQ/R-FAQ.html. ISBN 3-900051-08-9.

R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2008. URL http://www.R-project.org. ISBN 3-900051-07-0.