A Coding Scheme Development Methodology Using Grounded Theory for Qualitative Analysis of Pair Programming

Stephan Salinger, Laura Plonka, and Lutz Prechelt

Freie Universität Berlin, Institut für Informatik, Takustr. 9, 14195 Berlin, Germany
salinger,plonka,prechelt@inf.fu-berlin.de

Abstract. Since a number of quantitative studies of pair programming (the practice of two programmers working together using just one computer) have produced somewhat conflicting results, a number of researchers have started to study pair programming qualitatively. While most such studies use coding schemes that are fully or partially predefined, we have decided to go the long way and use Grounded Theory (GT) to ground each and every statement we make directly in observations. The first intermediate goal, which we talk about here, was to produce a coding scheme that would allow the objective conceptual description of specific pair programming sessions independent of a particular research goal. The present article explains how our initial attempts at using the method of Grounded Theory failed and which practices we developed to avoid these difficulties: predetermined perspective on the data, concept naming rules, analysis results meta-model, and pair coding. We expect these practices to be helpful in all GT situations, in particular those involving very rich data such as video data. We illustrate the operation and usefulness of these practices by real examples derived from our coding work and also present a few preliminary hypotheses regarding pair programming that we have stumbled across.

1 Introduction

During the last few years, pair programming, as it is known from extreme programming [1], has been the subject of many empirical investigations. This research focussed mainly on the measurement of bottom-line pair programming effects, whereas the underlying process of pair programming has been regarded as a kind of black box, the output of which is analyzed quantitatively with respect to its performance, error rate, programmer satisfaction etc.

Unfortunately, the results of this research are often contradictory. For instance, regarding total effort, Williams found that pair programming results in a 15% increase compared to solo programming [2], Lui and Chan found 21% [3], and Nawrocki et al. found 48% [4]. Most likely these differences are caused by differences in moderator variables such as programmer and pair experience, type of task etc., but we know neither the complete set of relevant moderator variables nor the nature and mechanism of their influence.

Our goal as software engineering researchers is to understand pair programming in such a way that we can advise practitioners how to use it most efficiently. We propose that the only way to obtain such understanding is to understand the mechanisms at work in the actual pair programming process. Obviously, this understanding must first be gained in qualitative form before we can start quantifying, and since we do not know much yet, the investigation has to start in an exploratory fashion.

We have started such an investigation based on the Grounded Theory (GT) methodology [5] and working from rich sets of data (full-length audio, programmer video, and screen video of pair programming sessions). The present paper presents a number of important methodological insights gained during this research and a few initial results. Its contributions are the following:
- a description of stumbling blocks for a GT-based analysis in this area;
- a set of practices that extend the plain GT method and help overcome these obstacles;
- a sketch of a pair programming process coding scheme.

In subsequent research, the coding scheme is supposed to form the basis for more detailed conceptual descriptions of the pair programming process and also to support the proposition of hypotheses and theory construction.

We will first give a short introduction to Grounded Theory (Section 2) and describe the nature and origin of our raw data (Section 3). The heart of the paper describes how and why plain GT does not work well under these constraints (Section 4) and which practices help to make it work better (Section 5). Section 6 presents the application of the modified GT process and a few of its initial results, namely excerpts of a coding scheme for describing the activities occurring during pair programming. We close by outlining related work (Section 7) and offering a summary and outlook (Section 8).

The paper focuses on research method, not on research results. The results mostly serve to illustrate the method.

2 The Grounded Theory methodology

As mentioned above, the initial analysis of pair programming has to be exploratory. In order to be as open as possible with respect to the nature and content of the results, we pick Grounded Theory as our analysis approach.

GT, first described in [6], is a data analysis approach that is largely data-driven (i.e. uses hardly any prior assumptions or pre-defined terminology) and aims at producing a theory that describes interesting relationships between things, situations, events, and activities (together called phenomena) reflected in the data by means of abstract concepts. The term grounded indicates that this theory will contain only statements derived from actual observations in a manner that can be traced back to these data: the theory is grounded in the data.

We use the variant of GT described by Strauss and Corbin [5], who suggest three (partially parallel) activities for a GT-based data analysis:
1. Open coding describes the data by means of conceptual (rather than merely descriptive) codes, which are derived directly from the data.
2. Axial coding identifies relationships between the concepts described by these codes. Strauss and Corbin suggest a concrete set of relationships to check for (in particular: causal conditions lead to phenomena which exist in a context featuring intervening conditions and leading to participants' strategies which create certain consequences). These relationships (plus the slightly fuzzy notion of forming categories) they call paradigmatic model, a term we will use a few times further below.
3. Selective coding extracts a subset of the concepts and relationships thus found and formulates them into a coherent theory. Selective coding is not relevant for the development of a coding scheme and will not be discussed in the present article.

Strauss considered the following three aspects to be the core of the GT method, saying "When you do all of these, then it is Grounded Theory, if you do not, then it is something else" [7]:
- Theoretical coding: Codes are theoretical, not just descriptive; they reflect concepts which have potential explanatory value for the phenomena described.
- Theoretical sampling: The selection of the material to be analyzed is made incrementally in the course of the analysis, based on what is expected to be most relevant for the theory under development.
- Constant comparison: Observed phenomena (and their contexts) are compared many times in order to create codes that are precise and consistent.

Theoretical sampling is of less interest in the present article, but theoretical coding and constant comparison are of vital importance for understanding the discussion.

3 Data used for the analysis of pair programming

In the following, we describe our observation context (programmers and task) and the data capturing method used.

3.1 Observation context: The origin of our data

We observed (in the manner described below) seven pairs of graduate students who all worked on the same task. Six of them had worked together as pairs previously. The average work time (which was not limited) was 3.8 hours. The students were all participants of a highly technical course on enterprise information systems and the Java 2 Enterprise Edition (J2EE) architecture and technologies.

The specific task called for an extension of an existing webshop application. The task required broad passive J2EE knowledge for analyzing and understanding the existing system and specific operational knowledge about JMS, JNDI, and the JBoss application server for programming, configuring, and testing the actual extension. The task was non-trivial, so that only three of the pairs were completely successful. For the analysis described in the present article, we used the session of one of the successful pairs only; it is 2 hours and 58 minutes long.

3.2 Observation method: Data capturing procedure

Since we do not know in advance what will be important and what will not, we need to start from a rather rich data set. We use three different data sources:
- Audio recording captures verbal communication among the participants as well as other noises, vocal or other, that may help with the interpretation of the remaining data.
- Frontal-perspective video of the programmers (shot from above-behind the screen and reaching down to about waist level) captures aspects of facial expression, gestures, posture, direction of attention, and (most relevantly) who is currently operating mouse and keyboard.
- Full-resolution screen recording captures almost all computer activities of the programmers on a fairly fine-grained level.

All three recordings are made at once using Camtasia Studio [8] and unified into a single, fully synchronized video file in which the camera video is superimposed semi-transparently onto a corner of the screen video so that all information is visible at once (multi-dimensional video).

The session was recorded in an otherwise silent office. Combined with the high audio quality of the Logitech 5000 webcam, this provides good acoustical playback conditions.

4 Problems of a plain Grounded Theory data analysis approach

Attempting GT-style exploratory analysis of the rich data set described above (actually a precursor, but very similar in all respects), we quickly recognized that transcription was not practical. Too much relevant information is found in the screen recording for which it is not obvious how to transcribe it at all, not to speak of the effort for doing so: source code fragment input, using features of the development environment (such as browsing across different files or positions within files), pointing with the mouse during discussion with the partner, etc.
This is why we decided to work on the raw video directly and chose the qualitative data analysis software ATLAS.ti [9] for doing so, which is one of the few products that allow creating direct annotations to video.

One of us, Stephan Salinger, started open coding in the manner suggested by Strauss and Corbin. The short-term goal was to characterize the activities occurring during pair programming; the long-term goal was to identify recurring behavioral patterns and classify them as helpful, hampering, ambivalent, or neutral.

This approach generated no fewer than 194 different concepts and almost complete confusion and despair in the course of a few days of analysis due to the following problems:
- No predefined focus: We had no criteria for selecting which (kinds of) observations to code and which to ignore (code verbal interaction? facial expressions? gestures? posture? directions of gaze? sub-verbal vocal noises? nervous tics? computer input? input methods? computer output? and so on) and consequently were overwhelmed by the data.
- No predefined granularity: We had no prior decision on the level of detail that would be worth coding. As a result, we produced codes on different levels of detail (say, coarse ones such as handle problem and finer ones such as test defect fix), which were difficult to delineate against one another subsequently.
- No predefined level of acceptable subjectivity: The nature of the codes chosen in GT can be anywhere on the spectrum ranging from codes that stick closely to observations that any observer would agree with to codes that interpret the observation to a degree that they must be called wishful thinking. GT as such does not provide a criterion for deciding where "grounded in data" ends and wishful thinking begins. As a consequence, we mixed objective-descriptive and subjective-evaluative attitudes for selecting codes. This led to codes of different nature (say, descriptive ones such as uses documentation and assumption-bearing ones such as gains knowledge of detail) existing side-by-side, which made it harder to decide which one to use in a particular case.
- Too many topics: The codes described too many different topics of interest, making it impossible to properly focus on anything. None of the various resulting collections of information ever reached a useful degree of completeness.
- Lack of concept grouping: The diversity of topics also distracted from forming what GT calls categories: a few large groups of heavily interrelated concepts (say, "Human-human interaction", HHI, and "Human-computer interaction", HCI).
- Importance misjudgments: Attending to such a broad set of concepts overtaxed our ability to judge their importance, so that among the large number of concepts we introduced, we completely overlooked a number of important ones.

After we had noticed and gradually understood a number of these problems, we stopped this mode of investigation completely. We started the whole analysis again from scratch (but very slowly and carefully, with a lot of backtracking) and concurrently redesigned the coding procedure. The result of this redesign was a number of heuristic practices, described below, that help in applying the GT analysis process.

5 Practices supporting the analysis of complex video data

The methodological heuristics presented here form the heart of the present article. These intertwined practices serve to reduce or solve the problems described in the previous section. Section 6 will present an application of the practices that also shows how they work together and mutually support one another.

5.1 Practice 1: Perspective on the data

Strauss and Corbin suggest that the start of selective coding (that is, after open coding and axial coding have been going on for quite some time) is the time when you should begin to decide what is important and what is less so. As described above, we found that this is not a good idea when working with rich video data. There are three reasons why a perspective used for the analysis should be defined before starting:
- to avoid drowning in detail;
- to provide constancy in the criteria used for creating and assigning concepts;
- to focus attention on the most relevant aspects.

This perspective can be defined by formulating answers to the following questions. These answers should be reviewed (and perhaps revised) several times in the course of the analysis:
1. In which respects do you expect the data to provide insight?
2. What kinds of phenomena do the researchers allow themselves to identify in the data?
3. What type of result do you want the analysis to bring forth?

Question 1 does not ask what you expect to find, only in what respects you expect to find something. The answer acts as a filter that tells you which phenomena should receive more attention than others. Furthermore, constantly re-checking and adjusting the answer to this question helps in deciding when to stop the analysis, when to modify (or throw overboard) your research question, and when to obtain further or different raw data. In our case, the expectation was that the data could help understand what activities dominate the pair programming process and how they relate.

Answer 2 provides the mechanism for systematically bounding the nature and amount of subjectivity to be found in the conceptualizations of the data. The strongest restriction would be to allow only concepts that express directly observable phenomena, resulting in a behaviorist (stimulus/response) research
perspective. Weaker restrictions might also allow concepts referring to unobservable processes (such as attitudes or thinking processes of actors), concepts that involve predictions (such as "helpful for reaching goal X"), and/or concepts expressing moral judgement (good, bad). We were convinced that in our case only the behaviorist perspective would enable us to trust our own results.

Finally, the result type is the standard used for deciding how much attention to invest in which kinds of phenomena when the analysis resources begin to get scarce (which they will very quickly). It helps to stay on track. Do we want to produce a full conceptual theory? Or just a conceptual structure (system of categories) for the data? Or even just a coding scheme? In our case, the goal was just to produce a coding scheme, because we felt we knew so little about the internals of pair programming that we should not yet decide on an actual engineering research question.

5.2 Practice 2: Concept name syntax rules

Choosing the names of concepts is another area where we found that giving up some of the freedom postulated by plain GT is beneficial, because our freely chosen concept names turned out to be highly variable and hence difficult to understand, remember, and compare.

As a remedy, we developed a structured naming scheme as described below. Within the confines we set ourselves by practice 1, that is, describing directly observable activities of the pair programmers, the scheme does not predetermine anything with respect to the meaning of a concept; it only prescribes the shape of its name. When working with this scheme, we observed the following benefits:
- A concept will be better understood right at introduction time.
- It facilitates handling and keeping an overview of a large set of concepts.
- Some relationships between concepts are implicitly recorded as well, which much simplifies axial coding and the forming of categories.
- A concept name explicitly represents several aspects at once, which simplifies the basic GT practice of "constant comparison".
- It becomes easier to understand where difficulties in delineating one concept against another come from and correspondingly easier to obtain insights as to the weaknesses of the overall current conceptual description.

In our case, the concepts needed to describe individual activities by one or both of the pair members (for other domains of analysis, other code naming structures might be preferable), so a concept name is structured like a complete sentence:

    code        = <actor>.<description>
    actor       = P1 | P2 | P
    description = <verb>_<object>[_<criterion>]

for example "P1.ask_knowledge" and "P2.explain_knowledge". The criterion part can be used for additional specialization where needed. Given such codes, subsequent analysis can very easily abstract, for instance, the verb part (to compare contexts of objects) or the object part (to compare the variants of action types).

Without such complex codes, the same situation would probably be modeled by a tuple of codes with relationships. So while in plain GT finding relationships involves axial coding, in our case recording at least some relationships became a fringe benefit of open coding.

5.3 Practice 3: Analysis results meta-model

When we started practicing GT, we found some of the terminology and concepts confusing. First, where GT talks about phenomena, conceptualization, concepts, properties, categories, and relationships, our analysis software ATLAS.ti talks about quotations, annotation, concepts, concepts, families, and relationships, respectively, and even the two notions of relationships are not quite the same thing. Second, even after the initial learning phase some of the differences were subtle enough that we misapplied them every once in a while and became confused when we tried to reconstruct what we had meant to express. Third, when decisions regarding the introduction or demarcation of codes became difficult (which they often did), we realized we needed guidance for systematically applying the ideas of GT to break out of the situation in an appropriate way. (An example of this will be given in Section 6.) Fourth, we extended the terminological framework by some additional ideas owing to the nature of our data, in particular the notion of track for partitioning data in order to support data visualization for a better overview of nested and parallel activities.

Together, these issues prompted us to formulate an explicit analysis results meta-model, that is, a model of the concepts that describe the structure of an analysis result. We formulated this meta-model as a UML class model [10], which is shown in Figure 1.

[Fig. 1. Meta-model of analysis results]

Here is a very short description of the most important elements of the model: Quotations define fragments of the data (scenes in the video) that the analysis refers to. Annotations connect Quotations with Concepts. Concepts can be grouped into ConceptClasses; a single Concept can be a member of many ConceptClasses. ConceptRelations are used to describe relationships between Concepts, for instance according to the paradigmatic model. In many cases, such a relationship is not valid for all pairs of Annotations that use these Concepts; it can then be expressed individually by using AnnotationRelation. The other elements of the meta-model are not relevant for the present article.

Besides describing the structure of analysis results (to avoid terminological confusion), the meta-model also acts as a repository of ideas for the analysis process. For instance, when unsure whether a certain ConceptRelation will always hold, the meta-model suggests initially annotating the currently known instances only (AnnotationRelation) and deferring the creation of the more general ConceptRelation until sufficient evidence is available.

5.4 Practice 4: Pair coding

The central and most important practice is pair coding. Pair coding means that all coding work is done by two people working together at one computer (much like pair programming, but that is just a coincidence).

The key idea of pair coding is to require a consensus of two people for all important decisions: which phenomena found in the data to single out for coding; where in time such a phenomenon starts and ends; which existing concept to use for coding this phenomenon; when to create a new concept; how to name that concept.

We found a number of benefits of a pair compared to a single researcher, some of them very important for successful GT work:
- Concept definitions become more exact, because they are scrutinized more closely right upon their introduction. This effect is further supported by the structured naming scheme (practice 2).
- The differentiation between similar concepts also becomes more precise, not just due to better definitions but also because a pair is less likely to let a concept slip in that is on a much different level of granularity than the others and that hence much more often has big overlaps with one or more existing concepts.
- Remaining concept differentiation problems will not be ignored but rather discussed. If they can be resolved, this will happen at an earlier time, leading to fewer incorrect concept assignments and/or less rework. If it is inherently impossible to fully resolve them (which is not uncommon at all), the reason for this will be understood much more thoroughly through the discussion, leading to a better understanding of the concepts involved.
- The perspective on the data (practice 1) is maintained more consistently.
- The perspective on the data is refined more regularly and more thoroughly.
- A larger number of relevant phenomena are detected and encoded.

Together, these four practices provided a quantum leap in the usefulness of our analysis results. The next section will illustrate this with a number of examples which will also show how the practices complement one another.

6 Application of the practices and some results

This section will present a few fragments from the analysis process that used the practices described above and that led to our coding scheme for pair programming. We present these examples to make the practices clearer, to explain how they interact, and to make it more credible that they really are of vital help.

We first introduce four concepts from our coding scheme and then present some episodes from the process in which we created them. As an add-on (and slightly off-topic for this article) we state a few hypotheses about pair programming that we have derived based on our coding scheme.

6.1 An extract from the coding scheme

Our current version of the coding scheme (which ignores the subject part of the concept names) contains about 50 different concepts, clustered into about 20 overlapping ConceptClasses, most concepts being members of either two or three of them.

As an illustrative example, we present the four concepts of the think aloud ConceptClass. They are shown in Table 1; the descriptions are heavily abbreviated.

Table 1. The concepts of the think aloud ConceptClass

Concept name             Description
think aloud_activity     Explains a current computer-operating activity
think aloud_finding      States a newly won insight (e.g., that some prior action was a mistake)
think aloud_state        Reflects on the current state of work w.r.t. the current strategy and goal
think aloud_completion   States that a simple work step has been completed
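The way the naming syntax of practice 2 makes ConceptClasses such as the one in Table 1 fall out of the names themselves can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not tooling used in the study: the regular expression and the names `ConceptName`, `parse_code`, and `group_by_verb` are hypothetical, the separator between verb and object is assumed to be an underscore, and the actor prefix is treated as optional because Table 1 lists the names without it.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional

# Assumed reading of the naming syntax:
#   code        = <actor>.<description>      (actor treated as optional here)
#   actor       = P1 | P2 | P
#   description = <verb>_<object>[_<criterion>]
CODE_PATTERN = re.compile(
    r"^(?:(?P<actor>P1|P2|P)\.)?"      # optional actor prefix
    r"(?P<verb>[^_.]+)_(?P<obj>[^_.]+)"  # verb and object, separated by "_"
    r"(?:_(?P<criterion>[^_.]+))?$"    # optional criterion
)


class ConceptName(NamedTuple):
    """The parts of one structured concept name (hypothetical helper type)."""
    actor: Optional[str]
    verb: str
    obj: str
    criterion: Optional[str]


def parse_code(code: str) -> ConceptName:
    """Split a concept name into its parts; reject names that break the syntax."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a well-formed concept name: {code!r}")
    return ConceptName(**match.groupdict())


def group_by_verb(codes: Iterable[str]) -> Dict[str, List[str]]:
    """Group concept names by their verb part, one candidate class per verb."""
    classes: Dict[str, List[str]] = defaultdict(list)
    for code in codes:
        classes[parse_code(code).verb].append(code)
    return dict(classes)
```

Applied to the four names of Table 1 together with propose_step, `group_by_verb` returns one group keyed by "think aloud" and one keyed by "propose". Grouping by the `obj` field instead would compare the variants of action types applied to the same object, which is the other abstraction mentioned for practice 2.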
6.2 Use of the practices: a few examples

We will now explain how we arrived at these four concepts in order to show the practices in action and illustrate their interaction.

Soon during the coding process we recognized that the so-called Driver [11] frequently verbalized what he was doing on the computer. Based on this observation, we made two decisions:

First, we started developing two ConceptClasses (see practice 3) called HCI (human-computer interaction) and HHI (human-human interaction) for separating the computer-operating aspect from the verbalization aspect. These were ConceptClasses rather than individual concepts because the same separation would obviously be relevant in many other cases as well.

Second, we postulated a new concept, think aloud_activity. By virtue of the concept naming syntax structure (practice 2), this one concept immediately generated a whole ConceptClass (so far having only one member) based on the verb "to think aloud". This effect leads to extended differentiation of concepts where needed but incurs only little additional complexity for the coding scheme.

As the second member of this class we introduced think aloud_finding when we found a phenomenon that was obviously thinking aloud, but that also obviously did not explain computer activity. The demarcation appeared to be relatively clear. In the discussion of the pair coders (practice 4) we agreed that think aloud_activity can be used only for the Driver and that it has priority where think aloud_finding might also be applicable.

Soon thereafter we encountered a programmer's explanation of the state of affairs and recognized it could be annotated as think aloud_state, thus creating the third member of this set of concepts. But we soon found think aloud_state to exhibit two problems. First, we had a case where it collided with think aloud_finding, because the finding concerned the state of work. Second, it designated statements on rather different levels of abstraction and granularity.

We solved both problems by using the meta-model (practice 3), specifically by introducing the ConceptRelation "is-precondition-of" from the existing concepts propose_step (suggesting the next step) and propose_strategy (suggesting an approach for choosing many future steps). We postulated that think aloud_state had to refer to a previous propose_strategy and introduced a new concept think aloud_completion that would refer to a previous propose_step. This solved both problems at once: We could now discriminate large and small granularity (strategic and tactical) and gained a criterion for when not to use think aloud_finding, which provided the demarcation to the other two.

This illustrates how open coding naturally leads into axial coding and how the combination of the paradigmatic model with the concept naming syntax (practice 2) can show a way back into open coding, thus keeping the complexity of the resulting annotations down.

We are convinced that this route worked only because of the pair coding constellation (practice 4), as both coders initially suggested encodings based on the existing codes and only the non-acceptance of these suggestions (and their supporting arguments) by the other led to the discovery of the "is-precondition-of" relationship and the fourth code think aloud_completion.

6.3 Some hypotheses based on the coding scheme

Although we have not yet started the analysis of the actual pair programming process as such, a number of phenomena recurred so consistently that we already call them hypotheses:
- We have found no clues that Driver and Observer do indeed work on different levels of abstraction, as claimed in the pair programming literature [11].
- We have observed what we call pair phases, characterized by a high density of communication acts referring to just one narrow issue. They look a lot like what descriptions of pair programming suggest as the normal pair programming process, but we realized they are all short (usually under three minutes).
- We believe that pair programming is not driven by strategic planning and monitoring. Rather, the plan is quite often only one step long: A single step is suggested, possibly discussed, decided (or revised), and immediately executed.
- Besides the unavoidable roles of Driver and Observer, pair programming sessions apparently tend to implicitly produce a Leader role as well. The Leader is the person more skilled for the given task and influences speed and direction of the process much more strongly than the pair partner.

We expect that valuable insight about pair programming can be gained by investigating the reasons, consequences, and typical context conditions of the above trends. For instance, we expect to find that pair phases are episodes of super-high productivity, so that it would be helpful to understand when and why they occur.

7 Related work

7.1 Qualitative analysis of pair programming

We know of no other work analyzing the process of pair programming that uses a GT approach (they all work with at least partially pre-defined coding schemes) and also none that works directly with video data (multi-dimensional or other).

Wake [12] presents a list of typical pair programmer activities, but provides little information on how it was derived.

Bryant [13] studies the difference of interaction type and frequency in novice versus expert pair programmers. In a pilot study, she first refined Wake's list into a table of 11 behavior and interaction types. In the real study, she then recorded the sequence of events in real time according to this schema. Such real-time categorization is obviously a good precondition for analyzing a large number of sessions, which is positive. On the other hand, the simplicity of the categorization that is needed to make it possible also restricts the results to talking in terms of the rather plain concepts already present in the pre-defined list. Neither subtle discriminations nor surprising new insights appear likely from this approach; it is applicable only to narrowly-scoped investigations using predefined hypotheses.

Cao and Xu [14] investigate the activity patterns of pair programming. Pair working sessions were videotaped and then transcribed. The analysis used a coding scheme that started out from a combination of the schemes from [15] and [16]. Then, during the analysis of the data, a new schema was developed in a manner not described. This work shares our behaviorist observation attitude: unlike us, it ignores all information contained in the computer interaction, but otherwise it is still grounded in objectively observable communication acts only.

In contrast, Xu and Rajlich [17] use the dialog-based protocol in order to analyze the cognitive activities in pair programming, which involves a far greater amount of either subjectivity or generalized assumption. (By the way, [17] suggests using screen capture and voice recording only rather than videotaping to avoid influences due to camera-consciousness; we have never observed this to be an issue at all.) The coding scheme involves classification heuristics derived from a theory on self-directed learning [18]. Xu and Rajlich proposed to do the coding assignment by two or more coders. In contrast to our approach, the coders work separately and compare the results afterwards. This approach is sensible only with a fixed coding scheme, because a GT-like generation of concepts would be very inefficient in this manner; immediate discussion as in pair coding (practice 4) is much more efficient.

It is obvious that all three studies use rather more predefined concepts during the analysis than concepts grounded only in the data. We fear that such approaches will be much more likely to fall prey to unwarranted assumptions according to conventional wisdom such as presumed Driver/Observer role differences etc.

7.2 Grounded Theory work using rich video data

Even in the broader GT-related literature, examples of studies using video during the analysis (rather than transcripts of videos only) are rare. We found one such example in medicine that studied medical team leadership behavior [19]. The video was recorded with four cameras from different angles. The analysis involved four analysts and three steps. (1) One analyst identified video segments with interesting verbal or non-verbal team interactions. (2) Two analysts created conceptual descriptions of the segments by consensus. (3) Taxonomies for leadership actions were developed from the conceptual descriptions. This approach resembles our pair coding practice, at least in step 2. If different people performed steps 1, 2, 3 (the article is very unclear in this respect), we consider this a problematic procedure: it is almost antithetical to the GT philosophy, because it partially prohibits constant comparison and fully prohibits the intertwining of open coding (steps 1+2) and axial coding (step 3).

8 Conclusion and further work

We have described why a straightforward application of the standard Grounded Theory method to multi-dimensional video data of pair programming sessions is not likely to be successful and have presented and illustrated a set of four analysis practices that provide a systematic way to hold the analysis problems at bay. We have used these practices to generate a general-purpose coding scheme of pair programming activities, of which we presented a small excerpt.

In the future, we will proceed with the following steps:
- Validation of the coding scheme. We will encode sessions that have very different properties with respect to participants, task, and setting.
- Qualitative and quantitative evaluation of the coding process itself, based on its results, intermediate results, and process monitoring information (in particular timestamps) recorded by ATLAS.ti.
- Refinement of the coding scheme with respect to particular research applications, in particular by adding properties according to the meta-model.
- Application of the coding scheme to produce actual grounded theories of several aspects of the pair programming process. This will require selective coding, which we expect to exercise even those parts of the meta-model not discussed in the present article.

Just as the four practices mutually support one another, these tasks will also exhibit synergy and so will be performed partially in parallel.

References

[1] Beck, K.: Extreme Programming Explained: Embrace Change, Second Edition. Addison-Wesley Professional (2004)
[2] Williams, L.: Integrating pair programming into a software development process. In: CSEET '01: Proceedings of the 14th Conference on Software Engineering Education and Training, Washington, DC, USA, IEEE Computer Society (2001)
[3] Lui, K.M., Chan, K.C.: When does a pair outperform two individuals? In: Extreme Programming and Agile Processes in Software Engineering. Volume 2675 of Lecture Notes in Computer Science, Springer (2003) 225-233
[4] Nawrocki, J.R., Jasiński, M., Olek, L., Lange, B.: Pair programming vs. side-by-side programming. In: EuroSPI. Volume 3792 of Lecture Notes in Computer Science, Springer (2005) 28-38
[5] Strauss, A., Corbin, J.: Basics of Qualitative Research: Grounded Theory Procedures and Techniques. Sage Publications, Inc. (1990)
[6] Glaser, B.G., Strauss, A.L.: The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine de Gruyter, New York (1967)
[7] Legewie, H., Schervier-Legewie, B.: Im Gespräch: Anselm Strauss. Journal für Psychologie 3 (1995) 64-75
[8] TechSmith Corporation: Camtasia Studio 4.0.1. (http://www.techsmith.com)
[9] ATLAS.ti: User's Manual for ATLAS.ti 5.0. (http://www.atlasti.com)
[10] Rumbaugh, J., Jacobson, I., Booch, G.: The Unified Modeling Language Reference Manual, Second Edition. Addison-Wesley Professional (2005)
[11] Williams, L., Kessler, R.R., Cunningham, W., Jeffries, R.: Strengthening the case for pair programming. IEEE Software 17 (2000) 19-25
[12] Wake, W.: Extreme Programming Explored. Addison Wesley, Boston (2002)
[13] Bryant, S.: Double trouble: Mixing qualitative and quantitative methods in the study of extreme programmers. In: VLHCC '04: Proceedings of the 2004 IEEE Symposium on Visual Languages - Human Centric Computing, Washington, DC, USA, IEEE Computer Society (2004) 55-61
[14] Cao, L., Xu, P.: Activity patterns of pair programming. In: HICSS '05: Proceedings of the 38th Annual Hawaii International Conference on System Sciences, Washington, DC, USA, IEEE Computer Society (2005)
[15] Lim, K., Ward, L., Benbasat, I.: An empirical study of computer system learning: Comparison of co-discovery and self-discovery methods. Information Systems Research 8 (1997) 254-272
[16] Okada, T., Simon, H.: Collaborative discovery in a scientific domain. Cognitive Science 21 (1997) 109-146
[17] Xu, S., Rajlich, V.: Dialog-based protocol: an empirical research method for cognitive activities in software engineering. In: International Symposium on Empirical Software Engineering. (2005) 383-392
[18] Xu, S., Rajlich, V., Marcus, A.: An empirical study of programmer learning during incremental software development. In: ICCI 2005: Fourth IEEE Conference on Cognitive Informatics. (2005) 340-349
[19] Xiao, Y., Seagull, F., Mackenzie, C., Klein, K.: Adaptive leadership in trauma resuscitation teams: a grounded theory approach to video analysis. Cognition, Technology & Work 6 (2004) 158-164