
Guidelines for the Design and Statistical Analysis of Experiments Using Laboratory Animals

Michael F. W. Festing and Douglas G. Altman



Abstract

For ethical and economic reasons, it is important to design animal experiments well, to analyze the data correctly, and to use the minimum number of animals necessary to achieve the scientific objectives, but not so few as to miss biologically important effects or require unnecessary repetition of experiments. Investigators are urged to consult a statistician [...]

Michael F. W. Festing, M.Sc., Ph.D., D.Sc., CStat., BIBiol., is a Senior Research Scientist at the MRC Toxicology Unit, University of Leicester, UK. Douglas G. Altman, Ph.D., D.Sc., is Director of the Cancer Research UK/NHS Centre for Statistics in Medicine, Institute of Health Sciences, Headington, Oxford, UK.

Experiments using laboratory animals should be well [...] chance of contributing to human or animal welfare, possibly in the long term.

The following "3Rs" of Russell and Burch (1959) provide a framework for considering the humane use of animals:

- Animals should be replaced by less sentient alternatives such as invertebrates or in vitro methods whenever possible.
- Experimental protocols should be refined to minimize any adverse effects for each individual animal. For example, appropriate anesthesia and analgesia should be used for any surgical intervention. Death is not an acceptable endpoint if it is preceded by some hours of acute distress, and humane endpoints should be used whenever possible (Stokes 2000). Staff should be well trained, and housing should be of a high standard with appropriate environmental enrichment. Animals should be protected from pathogens.
- The number of animals should be reduced to the minimum consistent with achieving the scientific objectives of the study, recognizing that important biological effects may be missed if too few animals are used. Some thought also should be given to the required precision of any outcomes to be measured. For example, chemicals are classified into a number of groups on the basis of their acute toxicity in animals. It may not be necessary to obtain a highly precise estimate of the median lethal dose (LD50 value) to classify them. A number of sequential experimental designs that use fewer animals have been developed for this purpose (Lipnick et al. 1995; Rispin et al. 2002; Schlede et al. 1992).

Ethical review panels should also insist that any scientist who does not have a good background in experimental design and statistics should consult a statistician.

General Principles

All research should be described in such a way that it could be repeated elsewhere. Authors should clearly state the following:

- The objectives of the research and/or the hypotheses to be tested;
- The reason for choosing their particular animal model;
- The species, strain, source, and type of animal used;
- The details of each separate experiment being reported, including the study design and the number of animals used; and
- The statistical methods used for analysis.

Experiments and Surveys

An experiment is a procedure for collecting scientific data on the response to an intervention in a systematic way to maximize the chance of answering a question correctly (confirmatory research) or to provide material for the generation of new hypotheses (exploratory research). It involves some treatment or other manipulation that is under the control of the experimenter, and the aim is to discover whether the treatment is causing a response in the experimental subjects and/or to quantify such response. A survey, in contrast, is an observational study used to find associations between variables that the scientist cannot usually control. Any association may or may not be due to a causal relation. These guidelines are concerned only with experiments.

Experiments should be planned before they are started, and this planning should include the statistical methods used to assess the results. Sometimes a single experiment is replicated in different laboratories or at different times. However, if this replication is planned in advance and the data are analyzed accordingly, it still represents a single experiment.

Confirmatory and Exploratory Experiments

Confirmatory research normally involves formal testing of one or more prespecified hypotheses. By contrast, exploratory research normally involves looking for patterns in the data with less emphasis on formal testing of hypotheses. Commonly, exploratory experiments involve many characters. For example, many microarray experiments in which up or down regulation of many thousands of genes is assayed in each animal could be classified as exploratory experiments because the main purpose is usually to look for patterns of response rather than to test some prespecified hypotheses. There is frequently some overlap between
these two types of experiment. For example, an experiment may be set up to test whether a compound produces a specific effect on the body weight of rats (a confirmatory study). However, data may also be collected on hematology and clinical biochemistry, and exploratory investigations using these data may suggest additional hypotheses to be tested in future confirmatory experiments.

Investigations Involving Several Experiments

Scientific articles often report the results of several independent experiments. When two or more experiments are presented, they should be clearly distinguished and each should be described fully. It is helpful to readers to number the experiments.

Volume 43, Number 4, 2002

Animals as Models of Humans or Other Species

Laboratory animals are nearly always used as models or surrogates of humans or other species. A model is a representation of the thing being modeled (the target). It must have certain characteristics that resemble the target, but it can be very different in other ways, some of which are of little importance whereas others may be of great practical importance. For example, the rabbit was used for many years as a model of diabetic humans for assaying the potency of insulin preparations because it was well established that insulin reduces blood glucose levels in rabbits as well as in humans. The fact that rabbits differ from humans in many thousands of ways was irrelevant for this particular application. This was a well-validated model, but it has now been replaced with chemical methods.

Other models may be less well validated; and in some cases it may be difficult, impossible, or impractical to validate a given model. For example, it is widely assumed that many industrial chemicals that are toxic at a given dose in laboratory animals will also be toxic to humans at approximately the same dose after correcting for scale. However, it is usually not possible to test this assumption. Clearly, the validity of an animal model as a predictor of human response depends on how closely the model resembles humans for the specific characters being investigated. Thus, the validity of any model, including mathematical, in vitro, and lower organism models, must be considered on a case-by-case basis.

Need to Control Variation

After choosing a model, the aim of the experiment will be to determine how it responds to the
experimental treatment(s). Models should be sensitive to the experimental treatments by responding well, with minimal variation among subjects treated alike. Uncontrolled variation, whether caused by infection, genetics, or environmental or age heterogeneity, reduces the power of an experiment to detect treatment effects.

If mice or rats are being used, the use of isogenic strains should be considered because they are usually more uniform phenotypically than commonly used outbred stocks. Experiments using such animals either should be more powerful and able to detect smaller treatment responses or could use fewer animals. When it is necessary to replicate an experiment across a range of possible susceptibility phenotypes, small numbers of animals of several different inbred strains can be used in a factorial experimental design (see below) without any substantial increase in total numbers (Festing 1995, 1997, 1999). The advantage of this design is that the importance of genetic variation in response can be quantified.

Inbred strains have many other useful properties. Because all individuals within a strain are genetically identical (apart possibly from a small number of recent mutations), it is possible to build up a genetic profile of the genes and alleles present in each strain. Such information can be of value in planning and interpreting experiments. Such strains remain genetically constant for many generations, and identification of individual strains is possible using genetic markers. There is a considerable literature on the characteristics of the more common strains, so that strains suitable for each project can be chosen according to their known characteristics (Festing 1997, 1999; [...]).

Animals should be maintained in good environmental conditions because animals under stress are likely to be more variable than those maintained in optimum conditions (Russell and Burch 1959).

When a response is found in the animal, its true relevance to humans is still not known. Thus, clinical trials are still needed to discover the effects of any proposed treatment in humans. However, in testing toxic environmental chemicals, it is normally assumed that humans respond in a similar way to animals, although this assumption can rarely be tested.

The animals should be adequately described in the materials and methods or other relevant section of the paper or report. The Appendix provides a checklist of the sort of information that might be provided, depending on the individual study.

Experimental Design

The experimental design depends on the objectives of the study. It should be planned in detail, including the development of written protocols and consideration of the statistical methods to be used, before starting work. In principle, a well-designed experiment avoids bias and is sufficiently powerful to be able to detect effects likely to be of biological importance. It should not be so complicated that mistakes are made in its execution. Virtually all animal experiments should be done using one of the formal designs described briefly below.

Experimental Unit

Each experiment involves a number of experimental units, which can be assigned at random (see below) to a treatment. The experimental unit should also be the unit of statistical analysis. It must be possible, in principle, to assign any two experimental units to different treatments. For this reason, if the treatment is given in the diet and all animals in the same cage therefore have the same diet, the cage of animals (not the individual animals within the cage) is the experimental unit. This situation can cause some problems. In studying the effects of an infection, for example, it may be necessary to house infected animals in one isolator and control animals in another. Strictly, the isolator is then the experimental unit because it was the entity assigned to the treatment, and an analysis based on a comparison of individual infected versus noninfected animals would be valid only with the additional assumption (which should be explicitly stated) that animals within a single isolator are no more or no less alike than animals in different isolators.

Although individual animals are often the experimental units assigned to the treatments, a crossover experimental design may involve assigning an animal to treatments X, Y, and Z sequentially in random order, in which case the experimental unit is the animal for a period of time. Similarly, if cells from an animal are cultured in a number of dishes that can be assigned to different in vitro treatments, then the dish of cells is the experimental unit.

Split-plot experimental designs have more than one type of experimental unit. For example, cages each containing two mice could be assigned at random to a number of dietary treatments (so the cage is the experimental unit for comparing diets), and the mice within the cage may be given one of two vitamin treatments by injection (so the mice are experimental units for the vitamin effect). In each case, the analysis should reflect the way the randomization was done.

Randomization

Treatments should be assigned so that each experimental unit has a known, often equal, probability of receiving a given treatment. This process, termed randomization, is essential because there are often sources of variation, known or unknown, which could bias the results. Most statistical packages for computers will produce random numbers within a specified range, which can be used in assigning experimental units to treatments. Some textbooks have tables of random numbers designed for this purpose. Alternatively, treatment assignments can be written on pieces of paper and drawn out of a bag or bowl for each experimental unit (e.g., animal or cage). If possible, the randomization method should ensure that there are predefined numbers in each treatment group.

Note that the different treatment groups should be processed identically throughout the whole experiment. For example, measurements should be made at the same times. Furthermore, animals of different treatment groups should not be housed on different shelves or in different rooms because the environments may be different (see Blinding and Block Designs below).

Blinding

To avoid bias, experiments should be performed blind with respect to the treatments when possible and particularly when there is any subjective element in assessing the results. After the randomized allocation of animals (or other experimental unit) to the treatments, animals, samples, and treatments should be coded until the data are analyzed. For example, when an ingredient is administered in the diet, the different diets can be coded with numbers and/or colors and the cages can be similarly coded to ensure that the correct diet is given to each cage. Animals can be numbered in random order so that at the postmortem examination there will be no indication of the treatment group. Pathologists who read slides from toxicity experiments are often not blinded with respect to treatment group, which can cause problems in the interpretation of the results (Fairweather et al. 1998).

Pilot Studies

Pilot studies, sometimes involving only a single animal, can be used to test the logistics of a proposed experiment. Slightly larger ones can provide estimates of the means and standard deviations and possibly also some indication of likely response, which can be used in a power analysis to determine sample sizes of future experiments (see below). However, if the pilot experiment is very small, these estimates will be inaccurate.

Formal Experimental Designs

Several formal experimental designs are described in the literature, and most experiments should use one of these designs. The most common are completely randomized, randomized block (see below), and factorial designs; however, Latin square, crossover, repeated measures, split-plot, incomplete block, and sequential designs are also used. These formal designs have been developed to take account of special features and constraints of the experimental material and the nature of the investigation. It is not possible to describe all of the available experimental designs here. They are described in many statistical textbooks.

Investigators are encouraged to name and describe fully the design they used to enable readers to understand exactly what was done. We also recommend including an explanation of a nonstandard design, if used.

Within each type of design there is considerable flexibility in terms of choice of treatments and experimental conditions; however, standardized methods of statistical analysis are usually available. In particular, when experiments produce numerical data, they can often be analyzed using some form of the analysis of variance (ANOVA).

Completely randomized designs, in which animals (or other experimental units) are assigned to treatments at random, are widely used for animal experiments. The main advantages are simplicity and tolerance of unequal numbers in each group, although balanced numbers are less important now that good statistical software is available for analyzing more complex designs with unequal numbers in each group. However, simple randomization cannot take account of heterogeneity of experimental material or variation (e.g., due to biological rhythms or environment), which cannot be controlled over a period of time.

Randomized complete block designs are used to split an experiment into a number of blocks to increase precision and/or take account of some natural structure of the experimental material. With large experiments, it may not be possible to process all of the animals at the same time or house them in the same environment, so it may be better to divide the experiment into smaller blocks that can be handled separately. Typically, a block will consist of one or more animals (or other experimental units) that have been assigned at random to each of the different treatment groups. Thus, if there are six different treatments, a block will consist of a multiple of six animals that have been assigned at random to each of the treatments. Blocking thus ensures balance of treatments across the variability represented by the blocks. It may sometimes be desirable to perform within-litter experiments when, for example, comparing transgenic animals with wild-type ones, with each litter being a block. Similarly, when the experimental animals differ excessively in age or weight, it may be best to choose several groups of uniform animals and then assign them to the treatments within the groups. Randomized block designs are often more powerful than completely randomized designs, but their benefits depend on correct analysis, using (usually) a two-way ANOVA without interaction. Note that when there are only two treatments, the block size is two and the resulting data can be analyzed using either a paired t-test or the two-way ANOVA noted above, which are equivalent.

Abbreviations used in this article: ANOVA, analysis of variance; DF, degrees of freedom.

Choice of Dependent Variable(s), Characters, Traits, or Outcomes

Confirmatory experiments normally have one or a few outcomes of interest, also known as dependent variables, which are typically mentioned in the experimental hypotheses. For example, the null hypothesis might be that the experimental treatments do not affect body weight in rats. Ideally there should be very few outcomes of primary interest, but some toxicity experiments involve many dependent variables, any of which may be altered by a toxic chemical. Exploratory experiments often involve many outcomes, such as the thousands of dependent variables in microarray experiments. When there is a choice, quantitative (measurement) data are better than qualitative data (e.g., counts) because the required sample sizes are usually smaller. When there are several correlated outcomes (e.g., organ weights), some type of multivariate statistical analysis may be appropriate.

In some studies, scores such as 0, +, ++, and +++ are used. Such data should normally be analyzed by comparing the number in each category among the different treatment groups, preferably taking the ordering into account. Converting scores to numerical values with means and standard deviations is inappropriate.

Choice of Independent Variables or Treatments

Experiments usually involve the deliberate alteration of some treatment factor such as the dose level of a drug. The treatments may include one or more controls. Negative controls may be untreated animals or those treated with a placebo without an active ingredient. The latter is normally more appropriate, although it may be desirable to study both the effect of the active agent and the vehicle, in which case both types of control will be needed. Surgical studies may involve sham-operated controls, which are treated in the same way as the tested animals but without the final surgical [...].

Positive controls are sometimes used to ensure that the experimental protocols were actually capable of detecting an effect. Failure of these controls to respond might imply, for example, that some of the apparatus was not working correctly. Because these animals may suffer adverse effects, and they may not be necessary to the hypothesis being tested, small numbers may be adequate.

Dose levels should not be so high that they cause unnecessary suffering or unwanted loss of animals. When different doses are being compared, three to approximately six dose levels are usually adequate. If a dose-response relation is being investigated, the dose levels (X-variable) should cover a wide range to obtain a good estimate of the response, although the response may not be linear over a wide range. Dose levels are frequently chosen on a log2 or log10 scale. If the aim is to test for linearity, then more than two dose levels must be used. If possible, we recommend using dose levels that are equally spaced on some scale, which may facilitate the statistical analysis. More details of choice of dose levels and dilutions in biological assay are given by Finney (1978).

Toxicologists often use fractions (e.g., half to a quarter or less) of the maximum tolerated dose (the largest dose that results in only minimal toxic effects) in long-term studies. The scientific validity of using such high dose levels has been questioned because the response to high levels of a toxic chemical may be qualitatively different from the response to low levels (Fairweather et al. 1998).

The possibility of exploring the effects of more than one factor (e.g., treatment, time, sex, or strain) using factorial designs (see below) should be considered.

Uncontrolled (Random) Variables

In addition to the treatment variables, there may be a number of random variables that are uncontrollable yet may need to be taken into account in designing an experiment and analyzing the results. For example, circadian rhythms may cause behavior measured in the morning to be different from that measured in the afternoon. Similarly, the experimental material may have some natural structure (e.g., members of a litter of mice may be more similar than animals of different litters). Measurements made by different people or at different times may be slightly different, and reagents may deteriorate over a period of time. If these effects are likely to be large in relation to the outcomes being investigated, it will be necessary to account for them
at the design stage (e.g., using a randomized block, Latin square, or other appropriate design) or at the time of the statistical analysis (e.g., using covariance analysis).

Factorial Experiments

Factorial experiments have more than one type of treatment or independent variable (e.g., a drug treatment and the sex of the animals). The aim could be to learn whether there is a response to a drug and whether it is the same in both sexes (i.e., whether the factors interact with or potentiate each other). These designs are often extremely powerful in that they usually provide more information for a given size of experiment than most single factor designs at the cost of increased complexity in the statistical analysis. They are described in most statistical texts (e.g., Cox 1958; Montgomery 1997).

In some situations, a large number of factors that might influence the results of an experiment can be studied efficiently using more advanced factorial designs. For example, in screening potential drugs, it may be desirable to choose a suitable combination of variables (e.g., presence/absence of the test compound; the sex, strain, age, and diet of the animals; time after treatment; and method of measuring the endpoint). If there were only two levels of each of these variables, then there would be 2^7 = 128 treatment combinations to be explored. Special methods are available for designing such experiments without having to use excessively large numbers of animals (Cox 1958; Cox and Reid 2000; Montgomery 1997). This type of design can also be used to optimize experiments that are used repeatedly with only minor changes in the treatments, such as in drug development, when many different compounds are tested using the same animal model (Shaw et al. 2002).

Experiment Size

Deciding how large an experiment needs to be is of critical importance because of the ethical implications of using animals in research. An experiment that is too small may miss biologically important effects, whereas an experiment that is too large wastes animals. Scientists are often asked to justify the numbers of animals they propose to use as part of the ethical review process.

Power Analysis

A power analysis is the most common way of determining sample size. The appropriate sample size depends on a mathematical relation between the following (described in more detail below): the (1) effect size of interest, (2) standard deviation (for variables with a quantitative effect), (3) chosen significance level, (4) chosen power, (5) alternative hypothesis, and (6) sample size. The investigator generally specifies the first five of these items and these determine the sample size. It is also possible to calculate the power or the effect size if the sample size is fixed (e.g., as a result of restricted resources).

The formulae are complex; however, several statistical packages offer power analysis for estimating sample sizes when estimating a single mean or proportion, comparing two means or proportions, or comparing means in an analysis of variance. There are also dedicated packages (e.g., nQuery Advisor [Statistical Solutions, Cork, UK; Elashoff 1997]), which have a much wider range of analyses (Thomas 1997). A number of web sites also provide free power analysis calculations for the simpler situations, and the following sites are currently available: http://ebook.stat.ucla.edu/cgi-bin/engine.cgi; http://www.math.yorku.ca/SCS/Demos/power/; and [...]. Sample size is considered in more detail by Dell and colleagues in this volume (2002), and Cohen (1988) provides extensive tables and helpful discussion of methods.

Effect Size

Briefly, when only two groups are to be compared, the effect size is the difference in means (for a quantitative character) or proportions (for a qualitative, dead/alive character) that the investigator wants the experiment to be able to detect. For example, the investigator could specify the minimum difference in mean body weight between a control group of rats and a treated group that would be of biological importance and that he/she considers the experiment should be able to detect. It is often convenient to express the effect (D) in units of standard deviations by dividing through by the standard deviation (discussed below). D is a unitless number that can be compared across different experiments and/or with different outcomes. For example, if the standard deviation of litter size in a particular colony of BALB/c strain mice is 0.8 pups (with a mean of 5 pups) and an experiment is to be set up to detect a difference in mean litter size between treated and control groups of, for example, 1.0 pups, then D = 1.0/0.8 = 1.25 standard deviation units. If the standard deviation of the total number of pups weaned per cage in a 6-month breeding cycle is 10 pups (with a mean of 55 pups) and the experiment is set up to detect a difference between a control group and a treated group of 5.0 pups, then D = 5.0/10 = 0.5. This effect size is smaller, so would require a larger experiment than the change in litter size would require. Similarly, if a control group is expected to have, for example, 20% of spontaneous tumors, and the compound is a suspected carcinogen, the increase in the percentage of tumors in the treated group (which would be important to be able to detect) must be specified.

Standard Deviation

The standard deviation among experimental units appropriate to the planned experimental design must be specified (for quantitative characters). For a randomized block or crossover design the appropriate estimate will usually be the square root of the error mean square from an analysis of variance conducted on a previous experiment. When no previous study has been done, a pilot study may be used, although the estimate will not be reliable if the pilot study is very small.

Significance Level

The significance level is the chance of obtaining a false-positive result due to sampling error (known as a Type I error). It is usually set at 5%, although lower levels are sometimes specified.

Power

The power of an experiment is the chance that it will detect the specified effect size for the given significance level and standard deviation and be considered statistically significant. Choice of a power level is somewhat arbitrary and usually ranges from 80 to 95%. However, when testing some vaccines for virulence, a power as high as 99% may be specified because of the serious consequences of failure to detect a virulent batch. Note that (1 - power) is the chance of a false-negative result, also known as a Type II error.

Alternative Hypothesis

The alternative hypothesis is usually that two means or proportions differ, leading to a two-tailed test; but occasionally, the direction of the difference is specified, leading to a one-tailed test. A slightly larger sample size is required for a two-tailed test.

Sample Size

The sample size is usually what needs to be determined, so all of the other quantities listed above should be specified. However, there are occasions when the sample size is fixed and the aim is to determine the power or effect size, given sample size.

Estimated sample sizes for an experiment involving two groups with measurement data that would be analyzed using a two-sample t-test are given in Table 1 as a function of D (see above). For the two examples above, D was 1.25 for the litter size effect, which would require approximately 14 cages in each group; whereas for the total production example, D was 0.5, which would require approximately 86 cages in each group.

When experiments are set up to compare two proportions using a chi-squared test, the effect size is the difference between the proportions in the two groups, and the standard deviation is specified by the two proportions. Table 2 shows the estimated number required in each group to compare two proportions for various proportions ranging from 0.2 to 0.8, assuming a power of 90%, a significance level of 5%, and a two-sided test. Note that, in Table 2, larger sample sizes are required to detect a given difference between two proportions if they are near 0.5 than if they are both high or low (i.e., less than 0.3 or more than 0.7).

Power analysis can also be used to estimate the required sample sizes for estimating parameters such as a mean, a regression coefficient, survival, or a genetic linkage (recombination proportion) with a specified confidence interval, although dedicated power analysis software may be needed for these more advanced calculations. Note that large numbers of animals are needed to estimate genetic linkage between tightly linked genetic markers if a narrow confidence

Table 2. Number required in each group for comparing two proportions (based on a normal approximation of the binomial distribution) with a significance level of 0.05 and a power of 90%. Assumptions may lead to some inaccuracy.

Proportion in
each group     0.2   0.3   0.4   0.5   0.6   0.7
0.2             -
0.3            392    -
0.4            109   477    -
0.5             52   124   519    -
0.6             30    56   130   519    -
0.7             19    31    56   124   477    -
0.8             13    19    30    52   109   392

Table 1. Sample size as a function of D for a two-sample t-test comparison assuming a significance level of 5%, a power of 90%, and a two-sided test. D = (difference in means)/(standard deviation).

D      No. per group
0.2    527
0.3    235
0.4    133
0.5     86
0.6     60
0.7     44
0.8     34
0.9     27
1.0     23
1.2     16
1.4     12
1.6     10
1.8      8
2.0      7
2.5      5
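The entries in Tables 1 and 2 can be approximated with standard normal-approximation formulas. The sketch below is illustrative only: the function names are hypothetical, and the formulas (including the small-sample correction term z²/4 added for the t-test) are common textbook approximations assumed here, not necessarily the method used to compute the published tables. Under these assumptions the results agree with most table entries, although exact calculations can differ by about one unit per group.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z(p):
    """Quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(p)


def n_per_group_means(d, alpha=0.05, power=0.90):
    """Approximate number per group for a two-sided, two-sample t-test.

    d is the standardized effect size D = (difference in means) / SD.
    Assumption: normal approximation plus a z^2/4 small-sample correction.
    """
    z_a = z(1 - alpha / 2)   # two-sided significance level
    z_b = z(power)           # chosen power
    n = 2 * (z_a + z_b) ** 2 / d ** 2 + z_a ** 2 / 4
    return ceil(n)


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.90):
    """Approximate number per group for comparing two proportions.

    Assumption: normal approximation to the binomial, with a pooled
    variance under the null hypothesis and unpooled under the alternative.
    """
    z_a = z(1 - alpha / 2)
    z_b = z(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Effect sizes from the two breeding examples in the text.
    d_litter = 1.0 / 0.8    # 1.25 standard deviation units
    d_weaned = 5.0 / 10.0   # 0.5 standard deviation units
    print("Litter size, D =", d_litter, "->", n_per_group_means(d_litter))
    print("Pups weaned, D =", d_weaned, "->", n_per_group_means(d_weaned))
    print("0.2 vs 0.3 ->", n_per_group_proportions(0.2, 0.3))
```

For D = 0.5 the sketch returns 86 per group, and for proportions of 0.2 versus 0.3 it returns 392, matching the corresponding table entries. Dedicated power analysis software remains preferable when planning a real study.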
interval is wanted. However, several specialized approaches to such studies exist (Silver 1995).

Resource Equation Method for Determining Sample Size

When there is no information about the standard deviation and/or it is difficult to specify an effect size, an alternative method that depends on the law of diminishing returns has been suggested (Mead 1988). This method may also be of value for some exploratory experiments when testing hypotheses is not the main objective.

For quantitative characters that are analyzed using the analysis of variance, it is suggested that the degrees of freedom (DF) for the error term used to test the effect of the variable should be approximately 10 to 20. With less than 10 DF, good returns can be expected from adding more experimental units. However, with more than 20 DF, adding additional units provides little extra information. This rule-of-thumb method seems to work quite well for whole animal experiments, although it tends to assume quite large effect sizes.

Statistical Analysis

The results of most experiments should be assessed by an appropriate statistical analysis even though, in some cases, the results are so clear-cut that it is obvious that any statistical analysis would not alter the interpretation. The analysis should reflect the purpose of the study. Thus, the goal of an exploratory analysis is to identify patterns in the data without much emphasis on hypothesis testing, the goal of a confirmatory experiment is to test one or a few prestated hypotheses, and experiments aimed at estimating a parameter such as a genetic linkage require appropriate estimates and standard errors. The general aim, however, is to extract all of the useful information present in the data in a way that it can be interpreted, taking account of biological variability and measurement error. It is particularly useful in preventing unjustified claims about the effect of a treatment when the results could probably be explained by sampling variation. Note that it is possible for an effect to be statistically significant but of little or no biological importance.

The materials and methods section should describe the statistical methods used in analyzing the results. The aim should be to describe statistical methods with enough detail to enable a knowledgeable reader with access to the original data to verify the
reported results (ICMJE 2001).

Examining the Raw Data

Raw data and data entered into statistical software should be studied for consistency and any obvious transcription errors. Graphical methods, which are available in most statistical packages, are helpful, particularly if individual observations can be seen clearly. Outliers should not be discarded unless there is independent evidence that the observation is incorrect, such as a note taken at the time the observation was recorded expressing doubt about its credibility. Exclusion of any observations should be stated explicitly, with reasons. It is sometimes useful to analyze the data with and without the questionable data to learn whether they alter the conclusions.

A clear distinction must be made between missing data (caused, for instance, by an animal dying prematurely or being killed due to excessive suffering) and data with a value of zero. Thought should be given at the design stage to dealing with unexpected deaths, particularly if they are related to the treatments. Details will depend on the nature of the study and the number of animals that die. For example, the death of only one or two animals in a relatively large study may have relatively little effect on the results. In some long-term studies, it may be possible to replace animals that die early, but this replacement is not usually practical when they die late. In other cases, some useful information (e.g., body weights and DNA) can be obtained from the carcasses, provided they are preserved. Treatment-related deaths may bias some of the results. For example, experimental stress could result in some of the smaller animals in one of the treatment groups dying, thereby causing the average weight of the group to appear heavier than it should be. In all cases, the number and treatment groups of any animals that die should be noted in the published paper or report.

In principle, data should be kept as raw as possible. For example, expressing some numbers as percentages of other numbers should be avoided because it may complicate the statistical analysis and interpretation of the results and/or reduce precision.

Quantitative Data: Parametric and Nonparametric Methods

The method of statistical analysis depends on the purpose of the study, the design of the experiment, and the nature of the resulting data. For example, an analysis involving a test of a hypothesis should not be used if the aim is to estimate the slope of a regression line. Quantitative data are often summarized in terms of the mean, n (the number of subjects), and the standard deviation as a measure of variation. The median, n, and the interquartile ranges (i.e., the 25th and 75th centiles) may be preferable for data that are clearly skewed. Nonparametric methods are discussed separately.

Quantitative data can be analyzed using parametric methods, such as the t-test for one or two groups or the ANOVA for several groups, or using nonparametric methods such as the Mann-Whitney test. Parametric tests are usually more versatile and powerful and so are preferred; however, they depend on the assumptions that the residuals (i.e., deviation of each observation from its group mean) have a normal distribution, that the variances are approximately the same in each group, and that the observations are independent of each other. The first two of these assumptions should be investigated as part of the analysis by studying the residuals using methods available in most statistical software packages. A normal probability plot of the residuals will show whether the normality assumption is fulfilled (Altman 1991). This type of plot should give a straight line with a normal distribution of residuals. A plot of the fits (estimated group means) versus the residuals will show whether the variation is approximately the same in each group. Both plots also tend to highlight any outliers.

Observations must also be independent (i.e., the observations within a treatment group must come from experimental units, which could, in principle, have been assigned to different treatment groups). Thus, if the effect of different diets on mouse body weight is to be compared using several cages with, for example, five mice per cage, the metric to be analyzed should be the mean of all animals in the cage, not the individual mouse weights, because mice within the cage cannot be assigned to different treatment groups, so they are not statistically independent. If the number of mice per cage varies, then this may need to be taken into account in the statistical analysis.

Where several observations can be made on an experimental unit (e.g., weights of individual animals within a cage, as above or randomly chosen microscope fields within histo-l
ogicalsectionsfromananimal),itmaybeimportanttofindoutwhetherprecisioncouldbeincreasedmoreeffec-tivelybyusingmoreexperimentalunitsormoreobserva-tionswithineachunit.Insuchsituations,theobservationsaresaidtobewithintheexperimentalunits,andseverallevelsofnestingarepossible.AnestedANOVAisusuallyusedwiththeaimbeingtoestimatethenentsofvarianceassociatedwitheachlevelofnesting(DixonandMassey1983).Whenthisinformationiscom-binedwiththecostsofexperimentalunitsandobservationswithinaunit,itispossibletoestimatethebestwaytoincreaseprecision.Ingeneral,extrareplicationisnecessaryacrossthelevelofnestingwiththemostvariation.Thus,iftherearelargedifferencesamongscoresofmicroscopicfieldswithinananimal,itwillusuallybebettertosamplemorefieldsthantousemoreanimals,althoughthissamplingdependsontherelativecostsofanimalsandNestingmayalsoinvolveafixedeffect.Forexample,anumberofanimalsmaybeassignedatrandomtosometreatmentgroupsandtheconcentrationofametabolitemaythenbemeasuredinanumberofdifferentorgans.Anestedstatisticalanalysiscanthenbeusedtodeterminewhethertherearedifferencesamongtreatmentmeans,whethertherearedifferencesamongnamedorgans,andwhetherthereisanorganbytreatmentinteraction.Theanalysisissomewhatsimilartothatusedforasplit-plotdesign.Ifthevariancesarenotthesameineachgroupand/ortheresidualsdonothaveanormaldistribution,ascaletransformationmaynormalizethedata.Alogarithmictrans-formationmaybeappropriatefordatasuchastheconcen-trationofasubstance,whichisoftenskewedwithalongtailtotheright.Alogittransformation{log(p/(1-p))}wherepistheproportion,willoftencorrectpercentagesorpropor-tionsinwhichtherearemanyobservationslessthan0.2orgreaterthan0.8(assumingtheproportionscannotber�o28;�1),andasquareroottransformationmaybeusedondatawithaPoissondistributioninvolvingcountswhenthemeanislessthanaboutfive.Furtherdetailsaregiveninmoststatisticstextbooks.Ifnosuitabletransformationcanbefound,anonparametrictestcanoftenbeused(seebelow).MultipleComparisons-testshouldnotbeusedtocomparemorethantwogroupmeans.Itlackspowe
r,andmultipletestingin-creasesthechanceofafalse-positiveresult.Whentherearetwoormoregroups,andparticularlywithrandomizedblockormorecomplexdesigns,theANOVAcanbeusedinitiallytotesttheoverallhypothesisthattherearenodifferencesamongtreatmentmeans.Ifnosignificantdifferencesarefound,thenfurthercomparisonsofmeansshouldnotbedone.WhentheANOVAresultsaresignificant(e.g.,at0.05)andseveralgroupsarebeingcompared,eitherhoccomparisonsorthogonalcontrastscanbeusedtostudydifferencesamongindividualmeans.Arangeofpost-hoccomparisonmethodsareavailablethatdifferslightlyintheirproperties.TheseincludeDun-stestforcomparingeachmeanwiththecontrol,stest,Fishersprotectedleast-significantdifferencetest,Newman-Keulstest,andseveralothersforcomparingallmeans.Largenumbersofpost-hoccomparisonsshouldbeavoidedbecausesomeofthesetestsareandfailtodetecttruedifferences(TypeIIerrors)whereasothersmaybetooliberalandgivefalse-positiveresults(TypeIerrors).Itisbettertospecifythosefewcomparisonsofparticularinterestatthedesignstage.Authorsshouldstatewhichtestshavebeenused.Notethatallofthesetestsusethepooledwithin-groupstandarddeviationobtainedfromtheANOVA.TheANOVAfollowedbyindividual-teststocomparemeans,notusingthepooledstandarddeviation,isnotacceptablebecauseeachtestwilllackpowerduetothelowprecisionoftheestimatesofindividualstandarddeviations. 
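The role of the pooled standard deviation in these comparisons can be sketched in a few lines of Python. This is an illustrative sketch only, not a method taken from the guidelines: the body weights, the group names, and the helper functions `one_way_anova` and `t_with_pooled_sd` are all hypothetical, and a completely randomized design with a single control group is assumed. It computes the one-way ANOVA F ratio, takes the pooled standard deviation as the square root of the error mean square, and uses that pooled estimate (rather than each group's own standard deviation) when comparing each treated group with the control.

```python
from math import sqrt
from statistics import mean

# Hypothetical body weights (g); invented for illustration, not from the paper.
groups = {
    "control": [10.0, 12.0, 11.0, 13.0],
    "low dose": [12.0, 14.0, 13.0, 15.0],
    "high dose": [15.0, 17.0, 16.0, 18.0],
}


def one_way_anova(samples):
    """One-way ANOVA entries for a list of groups (completely randomized design)."""
    n_total = sum(len(g) for g in samples)
    grand_mean = sum(sum(g) for g in samples) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    ss_error = sum((x - mean(g)) ** 2 for g in samples for x in g)
    df_between = len(samples) - 1
    df_error = n_total - len(samples)
    ms_error = ss_error / df_error
    return {
        "F": (ss_between / df_between) / ms_error,
        "df_between": df_between,
        "df_error": df_error,
        "ms_error": ms_error,
    }


def t_with_pooled_sd(treated, control, ms_error):
    """t statistic for treated minus control, based on the ANOVA error mean square."""
    se_diff = sqrt(ms_error * (1 / len(treated) + 1 / len(control)))
    return (mean(treated) - mean(control)) / se_diff


anova = one_way_anova(list(groups.values()))
pooled_sd = sqrt(anova["ms_error"])  # square root of the error mean square

print(f"F({anova['df_between']}, {anova['df_error']}) = {anova['F']:.2f}")
print(f"pooled SD = {pooled_sd:.3f}")
for name in ("low dose", "high dose"):
    t = t_with_pooled_sd(groups[name], groups["control"], anova["ms_error"])
    print(f"{name} vs control: t = {t:.2f} on {anova['df_error']} df")
```

The sketch stops at the test statistics. Judging them against critical values that allow for multiple comparisons (for example, Dunnett's tables) is left to a dedicated statistical package, and in practice the whole analysis would normally be run in such a package.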
Group means can also be compared using so-called orthogonal contrasts. Depending on the types of treatment, this method can compare individual means or groups of means or, if the treatments represent dose levels or time and are equally spaced on some scale, these contrasts can be used to test linearity and nonlinearity of response. Unfortunately, the methods are available only in more advanced statistical packages, although the calculations can be done manually. More details are given by Montgomery (1997).

The best estimate of the pooled standard deviation is obtained as the square root of the error mean square in the ANOVA. Indeed, this is the only estimate of the standard deviation that is available for a randomized block design. Thus, when presenting means either in tables or graphically, this estimate of the standard deviation should be used. It will, of course, be the same for each group.

Several Dependent Variables

When there are several dependent variables (characters), each can be analyzed separately. However, if the variables are correlated, the analyses will not be independent of one another. Thus, if sampling variation resulted in a false-positive or false-negative result for one character, the same thing may happen for another character. A multivariate statistical analysis such as principal components analysis could be considered in such cases (Everitt and Dunn 2001).

Serial Measurements

Data on experimental subjects are sometimes collected serially. For example, growth curves, response to pharmaceutical or toxic agents, behavioral measurements, and output from telemetric monitoring may involve repeated measurement on individual animals. Although a repeated measures ANOVA has sometimes been used to analyze such data, this approach is best avoided because the results are difficult to interpret and the assumptions underlying the analysis are rarely met. Appropriate summary measures such as the mean of the observations, the slope of a regression line fitted to each individual, the time to reach a peak, or the area under the curve, depending on the type of observed response, offer a better alternative that is easier to interpret (Matthews et al. 1990), although other methods such as a multivariate analysis are also available (Everitt 1995).

Nonparametric Tests

When the assumptions necessary for the t-test and the ANOVA (approximately equal variation in each treatment group and approximate normality of the residuals) are not valid, and no scale transformation is available to correct the heterogeneity of variance and/or non-normality, a nonparametric test can usually be used to compare the equality of population means or medians. For comparing two groups, the Wilcoxon rank sum test and the Mann-Whitney test (which are equivalent) constitute a nonparametric equivalent of the two-sample t-test. For comparing several groups, the Kruskal-Wallis test is the nonparametric equivalent of the one-way ANOVA. A nonparametric equivalent of a post-hoc comparison can be used, provided the overall test is significant (Sprent 1993). A version of the Wilcoxon test can also be used as the nonparametric version of the paired t-test for a randomized block design with two treatment groups. The Friedman test is the nonparametric equivalent of the randomized block ANOVA for more than two treatment groups. Several other nonparametric tests are appropriate for particular circumstances, and they are described in most statistics textbooks.

Correlation

The most common correlation coefficient is known more formally as the product-moment correlation, or Pearson correlation, to distinguish it from several other types. It is used for assessing the strength of the relation between two numerical variables A and B. Both A and B are assumed to be subject to sampling variation. It does not assume that variation in A causes variation in B or vice versa. The correlation can be shown graphically using a scatter plot. Normally a best fitting line should not be shown. The investigator who wishes to fit such a line should remember that the line calculated from the regression of A on B will normally be different from that due to the regression of B on A. The usual hypothesis test is that the correlation is zero; however, in some cases, it may be appropriate to test whether the correlation differs from some other defined value. Note that a nonlinear change of scale (e.g., a logarithmic transformation) will alter the correlation coefficient and that a nonlinear relation will result in a low correlation even if the two variables are strongly associated. In such circumstances, use of the correlation of ranks may be more appropriate. There are several other forms of correlation coefficient, depending on whether the variables are measurements or ranks or are dichotomous.

Regression

Regression analysis can be used to quantify the relation between two continuous variables X and Y, where variation in X is presumed to cause variation in Y. Regression is thus asymmetric with respect to X and Y. The X variable is assumed to be measured without error. Linear regression can be used to fit a straight line of the form Y = a + bX, where a and b are constants that are estimated from the data using the least-squares method. In this case, a (the intercept) represents the value of Y when X is zero, and b is the slope of the regression line. A positive value of b implies that the line rises from left to right, and a negative value implies that it declines. Confidence intervals can be obtained for the slope and can be fitted around the regression line to give, for example, a 95% confidence interval for the mean value of Y for a given value of X. Prediction intervals can also be fitted to give, for example, a 95% interval for the variation of individual observations of Y for any given value of X. When possible, it is important to quote R², which is interpreted as the proportion of the variability in the data explained by regression. This may be low if the X variable does not have a reasonably large range. The residual (error) variation from the ANOVA table should also be quoted.

If animals are caged in groups and the X variable (e.g., the dose level of a test compound or a dietary ingredient) is administered to whole cages, then the cage becomes the experimental unit and the Y variable will be the mean of all the animals in the cage. If the number of animals per cage varies, then more weight can be given to cages with more animals. A weighted regression analysis, which takes account of possible variation in the precision with which each point is estimated, is available in many statistical packages.

Quadratic regression can be used to fit a curve to the data points. Many other types of curve can be fitted, and some have useful biological interpretations. The effect of several independent X variables can be evaluated simultaneously using multiple regression. Often such an analysis is exploratory, with the aim of identifying which variables are influential. Logistic regression can be used to explore the relation between one or more predictor variables and a binary (e.g., dead/alive) outcome. Regression and the ANOVA are closely related so that a regression of, for example, response on dose level can sometimes be included as part of the ANOVA using orthogonal contrasts (Altman 1991; Dixon and Massey 1983; Montgomery 1997).

The usual statistical test in regression analysis is of the null hypothesis that there is no linear relation between X and Y. Other common tests are of whether two regression lines have the same slope b and/or the same intercept a. A test to determine whether there is a quadratic relation would be a test of whether a quadratic curve gives a significantly better fit than a straight line.

Categorical Data

Categorical data consist of counts of the number of units with given attributes. These attributes can be described as "nominal" when they have no natural order (e.g., the strain or breed of the animals). They are described as "ordinal" when they have a natural order such as low, medium, and high dose levels or scores, which may also be defined numerically. When there are two categories, the data are called binary. Categorical data are often presented in the form of frequency tables and/or as proportions or percentages. Proportions or percentages should be accompanied by a confidence interval (preferably) or standard error, and n should be clearly indicated.

The usual method of comparing two or more proportions is a contingency table chi-squared analysis, which tests the null hypothesis that rows and columns are independent. The method is inaccurate if the numbers in some cells are very low. Fisher's exact test can be used in such cases. Other methods of analysis are available and are described in some texts.

Presentation of the Results

When individual means are quoted, they should be accompanied by some measure of variation. Excess decimal places, often produced by the computer, should be eliminated. It is usually sufficient to quote means to three significant digits (e.g., 11.4 or 0.128). Percentages can often be rounded to the nearest whole number. If the aim is to describe the variation among individuals that contribute to the mean, then the standard deviation should be given. Avoid using the ± sign. When presenting means it is better to use a designation such as "mean = 9.6 (SD 2.1)" because it avoids any confusion between standard deviation and standard error. When the aim is to show the precision of the mean, a confidence interval (e.g., 9.6 [95% CI 7.2-12.0]) should be used (preferably) or a standard error (e.g., 9.6 [SE 1.2]).

Actual observed p values should be quoted whenever possible, rather than using < or > signs, although if these values are very low, a < sign can be used (e.g., p < 0.001). Lack of statistical significance should not be used to claim that an effect does not exist. Nonsignificance may be due to the experiment being too small or the experimental material being too variable.

When two means are compared, the size of the difference between them should be quoted together with a confidence interval. When nonparametric analyses have been done, it is more sensible to quote medians and, for example, the 25th and 75th centiles indicating the interquartile range. When proportions or percentages are given, a standard error or confidence interval and n should also be given. When proportions are compared, the confidence interval for the difference (or ratio) should be supplied.

We advise showing tabulated means in columns rather than rows because this arrangement makes it easier to compare values. If the means have been compared using a t-test or ANOVA and the standard deviations have been found not to differ materially between groups, use of a pooled standard deviation may be more appropriate than showing the standard deviations separately for each mean. The number of observations should always be indicated.

Graphical Presentation of Data

Graphs are especially valuable to illustrate points that would be difficult to explain in writing or in a table. Presentation of a small number of means can often be done more clearly and using less space using a table than a bar diagram. It is also easier to read numerical values off a table than to read them off a graph. Graphs showing individual points rather than bar charts or graphs with error bars are strongly encouraged because they provide a much clearer impression of the nature of the data. For example, a dose-response curve with individual points (Figure 1) provides a much clearer impression of individual variation than the example in Figure 2, which tends to give a false impression of uniformity at each dose level.

When means have been compared statistically, it may be better to indicate significant differences on the diagram, rather than adding error bars. When error bars are shown on graphs or bar diagrams, there should be a clear indication of whether these are standard deviations, standard errors, or confidence intervals (preferred), and the number of observations should be clearly indicated in the text or figure caption. With more complex graphs, it may be better not to use error bars but instead to summarize the data in an accompanying table. Regression lines should never be shown without the data points; preferably, they should be shown with a confidence interval and/or prediction interval.

Figure 1 Red blood cell counts in mice as a function of the dose of chloramphenicol showing counts for individual mice with a line connecting the mean count at each dose level. Note that this example provides a better impression of the variability of the data than Figure 2. Raw data from Festing MFW, Diamanti P, Turton JA. 2001. Strain differences in haematological response to chloramphenicol succinate in mice: Implications for toxicological research. Food Chem Toxicol 39:375-383.

Figure 2 Same data as in Figure 1, but just showing the group means and error bars of one standard error about each mean. This type of presentation is not recommended as it tends to obscure individual variability.

Combining Data from Different Studies: Meta-analysis

Sometimes answers to the same essential questions are sought in several independent experiments or trials from different investigators. Formal methods of meta-analysis have been developed that attempt to combine the results of different experiments taking account of sample sizes and apparent quality of the data. Meta-analysis usually forms only part of a systematic review to identify all relevant studies (Egger et al. 2001). There are a number of difficulties in doing such reviews, one of which is publication bias. Many studies are published only if they give positive results because journals are often reluctant to publish studies where differences are not statistically significant. For example, findings that some types of environmental enrichment benefit laboratory mice are more likely to be published than those that find there is no effect. Thus, if only published studies are included in the meta-analysis, the case for environmental enrichment might appear to be overwhelming. Unfortunately, no mechanism exists for finding unpublished studies.

Despite this potential difficulty, bringing together all relevant research evidence in a topic should be generally encouraged. A key aspect of such a review is to assess the methodological quality of the individual studies. Meta-analysis of the results from several studies may then be done for those studies deemed to be scientifically reliable and addressing the same question. Although various statistical methods are available, meta-analysis may not be straightforward, especially for observational studies.

Use of Historical Data

The value of historical data depends on its quality and its reliability. Many factors (e.g., strain, origin, associated microflora, housing, husbandry, and methods of measuring each outcome) can influence individual results so that in nearly all studies, contemporary controls are almost essential, and historical data, particularly from another laboratory, should be treated with considerable caution. Methods of meta-analysis may be appropriate in some cases.

However, when similar experiments are performed repeatedly in the same laboratory, there will often be scope for using historical data. For example, chemicals are often routinely tested to determine whether they produce micronuclei in mice when given by injection. Usually a laboratory will standardize on a single strain and sex of mice and use a standard protocol that includes contemporary controls. Quality control charts, often used in industry, provide one method of using such data (Hayashi et al. 1994). Figure 3 shows a control chart of the mean number of micronuclei in 47 samples of five control mice collected over a period of several months in one laboratory, followed by two samples of mice treated with 35 mg/kg of 1,2-dimethylhydrazine. The control chart shows the mean number of micronuclei among the control samples with upper and lower control limits. One of the samples of mice treated with the 1,2-dimethylhydrazine clearly exceeds the upper control limit and has been flagged by the computer. Careful use of such techniques, which need further development for use in a biological context, might mean that smaller sample sizes could be used in each study.

The need for improved experimental design and statistical analysis of animal experiments, if they are to be considered ethically acceptable, has already been emphasized. However, a recent example re-emphasizes this. A meta-analysis of 44 animal studies on fluid resuscitation (Roberts et al. 2002) reported that only two of the investigators stated how the animals were allocated to the treatment groups, none of the studies was sufficiently large to detect a halving in the risk of death reliably, there was considerable scope for bias to enter into the conclusions, and there was substantial heterogeneity in the results due to the method of bleeding. Presumably the latter could have been detected using a factorial design with bleeding method as a design factor. The authors concluded that the odds ratios were impossible to interpret, and they questioned whether these animal data were of any relevance to human health care. If scientists are to have the privilege of being allowed to do painful experiments on animals, they must ensure that their experiments are beyond criticism.

Acknowledgments

Thanks are due to Peter Sasieni, Cancer Research UK, for helpful comments.

References

Altman DG. 1991. Practical Statistics for Medical Research. London: Chapman and Hall.
Altman DG. 2002. Poor quality medical research: What can journals do? JAMA (In Press).
Altman DG, Gore SM, Gardner MJ, Pocock SJ. 2000. Statistical guidelines for contributors to medical journals. In: Altman DG, Machin D, Bryant TN, Gardner MJ, eds. Statistics with Confidence. 2nd Ed. London: BMJ Books.
Boisvert DPJ. 1997. Editorial policies and animal welfare. In: van Zutphen LFM, Balls M, eds. Animal Alternatives, Welfare and Ethics. Amsterdam: Elsevier. p 399-404.
Cohen J. 1988. Statistical Power Analysis for the Behavioral Sciences. 2nd Ed. Hillsdale: Lawrence Erlbaum Associates.
Cox DR. 1958. Planning Experiments. New York: John Wiley & Sons.
Cox DR, Reid N. 2000. The Theory of the Design of Experiments. Boca Raton: Chapman and Hall/CRC Press.
Dell R, Holleran S, Ramakrishnan R. 2002. Sample size determination. ILAR J 43:207-213. <http://www.national-academies.org/ilar>
Dixon WJ, Massey FJ Jr. 1983. Introduction to Statistical Analysis. 4th Ed. Auckland: McGraw-Hill International Book Co.
Egger M, Davey Smith G, Altman DG, eds. 2001. Systematic Reviews in Health Care. Meta-analysis in Context. 2nd Ed. London: BMJ Books.
Elashoff JD. 1997. nQuery Advisor Version 2.0 User's Guide. Cork: Statistical Solutions.
Everitt BS. 1995. The analysis of repeated measures: A practical review with examples. Statistician 44:113-135.
Everitt BS, Dunn G. 2001. Applied Multivariate Data Analysis. 2nd Ed. London: Arnold.
Fairweather WR, Bhattacharyya A, Ceuppens PP, Heimann G, Hothorn LA, Kodell RL, Lin KK, Mager H, Middleton BJ, Slob W, Soper KA, Stallard N, Ventre J, Wright J. 1998. Biostatistical methodology in carcinogenicity studies. Drug Infor J 32:401-421.
Festing MFW. 1994. Reduction of animal use: Experimental design and quality of experiments. Lab Anim 28:212-221.
Festing MFW. 1995. Use of a multi-strain assay could improve the NTP carcinogenesis bioassay program. Environ Health Perspect 103:44-52.
Festing MFW. 1997. Fat rats and carcinogen screening. Nature 388:321-
Festing MFW. 1999. Warning: The use of genetically heterogeneous mice may seriously damage your research. Neurobiol Aging 20:237-244.
Festing MFW. 2001. Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA. ATLA 29:427-446.
Festing MFW, Lovell DP. 1995. The need for statistical analysis of rodent micronucleus test data: Comment on the paper by Ashby and Tinwell. Mutat Res 329:221-224.
Festing MFW, Lovell DP. 1996. Reducing the use of laboratory animals in toxicological research and testing by better experimental design. J R Stat Soc 58(B-Methodol):127-140.
Festing MFW, van Zutphen LFM. 1997. Guidelines for reviewing manuscripts on studies involving live animals. Synopsis of the workshop. In: van Zutphen LFM, Balls M, eds. Animal Alternatives, Welfare and Ethics. Amsterdam: Elsevier. p 405-410.
Festing MFW, Diamanti P, Turton JA. 2001. Strain differences in haematological response to chloramphenicol succinate in mice: Implications for toxicological research. Food Chem Toxicol 39:375-383.
Festing MFW, Overend P, Gaines Das R, Cortina Borja M, Berdoy M. 2002. The Design of Animal Experiments: Reducing the Use of Animals in Research Through Better Experimental Design. London: Royal Society of Medicine Press Limited.
Finney DJ. 1978. Statistical Method in Biological Assay. 3rd Ed. London: Charles Griffin & Company Ltd.
Hayashi M, Hashimoto S, Sakamoto Y, Hamada C, Sofuni T, Yoshimura I. 1994. Statistical analysis of data in mutagenicity assays: Rodent micronucleus assay. Environ Health Perspect 102(Suppl 1):49-52.
ICMJE [International Committee of Medical Journal Editors]. 2001. Uniform requirements for manuscripts submitted to biomedical journals.
Lipnick RL, Cotruvo JA, Hill RN, Bruce RD, Stitzel KA, Walker AP, Chu I, Goddard M, Segal L, Springer JA, Myers RC. 1995. Comparison of the up-and-down, conventional LD50, and fixed-dose acute toxicity procedures. Food Chem Toxicol 33:223-231.
Matthews JNS, Altman DG, Campbell MJ, Royston P. 1990. Analysis of serial measurements in medical research. Br Med J 300:230-235.
Maxwell SE, Delaney HD. 1989. Designing experiments and analyzing data. Belmont CA: Wadsworth Publishing Company.
McCance I. 1995. Assessment of statistical procedures used in papers in the Australian Veterinary Journal. Aust Vet J 72:322-328.
Mead R. 1988. The Design of Experiments. Cambridge: Cambridge University Press.
Montgomery DC. 1997. Design and Analysis of Experiments. 4th Ed. New York: John Wiley & Sons.
Morrison V, Ashby J. 1995. High resolution rodent bone marrow micronucleus assays of 1,2-dimethylhydrazine: Implications of systemic toxicity and individual responders. Mutagenesis 10:129-135.
Muller KE, Barton CN, Benignus VA. 1984. Recommendations for appropriate statistical practice in toxicological experiments. Neurotoxico
Obrink KJ, Rehbinder C. 1999. Animal definition: A necessity for the validity of animal experiments? Lab Anim 34:121-130.
Rispin A, Farrar D, Margosches E, Gupta K, Stitzel K, Carr G, Greene M, Meyer W, McCall D. 2002. Alternative methods for the LD50 test: The up and down procedure for acute toxicity. ILAR J 43:233-243.
Roberts I, Kwan I, Evans P, Haig S. 2002. Does animal experimentation inform human healthcare? Observations from a systematic review of international animal experiments on fluid resuscitation. Br Med J 324:
Russell WMS, Burch RL. 1959. The Principles of Humane Experimental Technique. London: Methuen & Co. Ltd. [Reissued: 1992, Universities Federation for Animal Welfare, Herts, England.]
Schlede E, Mischke U, Roll R, Kayser D. 1992. A national validation study of the acute-toxic-class method: An alternative to the LD50 test. Arch Toxicol 66:455-470.
Shaw R, Festing MFW, Peers I, Furlong L. 2002. The use of factorial designs to optimize animal experiments and reduce animal use. ILAR J 43:223-232. <http://www.national-academies.org/ilar>
Silver LM. 1995. Mouse Genetics. New York: Oxford University Press.
Sprent P. 1993. Applied Nonparametric Statistical Methods. 2nd Ed. London: Chapman and Hall.
Stokes WS. 2000. Reducing unrelieved pain and distress in laboratory animals using humane endpoints. ILAR J 41:59-61.
Thomas L. 1997. A review of statistical power analysis software. Bull Ecol Soc Am 78:126-139.

Figure 3 Control chart of micronuclei counts per 1000 polychromatic erythrocytes in 47 batches of five control mice and two batches of mice treated with 1,2-dimethylhydrazine. Such a chart provides one method of making use of relatively homogeneous sets of historical control data collected within a single laboratory over a long period of time. (The data are used to illustrate the point, although the real time sequence was not available; raw data were extracted from: Morrison V, Ashby J. 1995. High resolution rodent bone marrow micronucleus assays of 1,2-dimethylhydrazine: Implications of systemic toxicity and individual responders. Mutagenesis 10:129-135.)

Specification of the Animals Used in an Experiment

Scientific experiments should be repeatable, so it is important that the animals, their environment, and their associated microorganisms are described as fully as possible (Obrink and Rehbinder 1999). Often the descriptions of the animals published in scientific papers are totally inadequate (Boisvert 1997). Scientists should also be aware that animals with the same designation from different sources or from one source at different times may be genetically different, and that the microbiological status of animals can influence their response to experimental treatments. The following checklist is based largely on one proposed by Festing and van Zutphen (1997). It should be used to help ensure that all details of the animals relevant to a particular study are fully described.

Specify in the paper as many as possible of the following:

Species (with Latin name if not a common laboratory species), source, conservation status if wild, age and/or body weight, sex.

Length of acclimatization period.

The breed, strain, or stock name. Inbred strains, mutants, transgenes, and clones should be described using internationally accepted nomenclature when available (see <www.informatics.jax.org> for mouse and rat nomenclature). Any genetic quality assurance verifying the genotype should be mentioned.

Microbiological status: Conventional, specified pathogen-free (SPF), germfree/gnotobiotic. When possible, reference should be made to some agreed-upon standards for microbiological characterization such as the FELASA standards.

Type of housing including whether conventional, barrier, isolator, or individually ventilated cages. Room temperature (with diurnal variation), humidity, ventilation, light/dark periods, light intensity. Cage type, model, material, type of floor (solid/mesh), type of bedding, frequency of cage cleaning, number of animals per cage, cage [...]

Diet: Type, composition, manufacturer, feeding regimen (ad libitum, restricted, pair fed), method of sterilization.

Water: ad libitum, bottles or automatic, quality, sterilization.

Statistical Software

Many good statistical packages are now available, and the choice will often depend on which packages are supported by the particular research organization. Researchers are strongly urged to use one of the dedicated statistical packages, rather than a spreadsheet. Such packages have a wider range of statistical methods, the algorithms have usually been optimized over a period of several years, and the manuals often provide more help with the interpretation of the results than is available with a spreadsheet. In most cases, it is easy to paste material from a spreadsheet into a statistical package, so raw data can be kept in the spreadsheet if preferred.

Suggested Reading

There are numerous textbooks on statistics and experimental design. Most are directed at specific disciplines (e.g., agriculture, psychology, clinical medicine), but the methods are general and applicable to animal experiments. Anyone intending to continue with a research career should invest in a personal copy of a good textbook, which they should be able to consult for many years. A review of available textbooks is beyond the scope of this article, but the following books that are quoted herein (and several others not quoted) may be worth consulting, depending on the exact application: Altman (1991), Cox (1958), Cox and Reid (2000), Dixon and Massey (1983), Everitt and Dunn (2001), Festing et al. (2002), Finney (1978), Maxwell and Delaney (1989), Mead (1988), Montgomery (1997), Sprent (1993). Note that more recent editions of some of these books may be available since publication of this issue of ILAR Journal.

Volume 43, Number 4, 2002