
Package 'made4'
June 25, 2021

Version               1.67.0
Title                 Multivariate analysis of microarray data using ADE4
Author                Aedin Culhane
Maintainer            Aedin Culhane
Imports               ade4
Depends               RColorBrewer, gplots, scatterplot3d, Biobase, SummarizedExperiment
Suggests              affy, BiocStyle, knitr, rmarkdown
Description           Multivariate data analysis and graphical display of microarray data. Functions include supervised dimension reduction (between group analysis) and joint dimension reduction of 2 datasets (coinertia analysis). It contains functions that require R package ade4.
Reference             Culhane AC, Thioulouse J, Perriere G, Higgins DG. (2005) MADE4: an R package for multivariate analysis of gene expression data. Bioinformatics. 21(11):2789-90.
                      Culhane AC, Thioulouse J (2006) A multivariate approach to integrating datasets using made4 and ade4. R News: Special Issue on Bioconductor, December.
License               Artistic-2.0
VignetteBuilder       knitr
LazyData              TRUE
URL                   http://www.hsph.harvard.edu/aedin-culhane/
biocViews             Clustering, Classification, DimensionReduction, PrincipalComponent, Transcriptomics, MultipleComparison, GeneExpression, Sequencing, Microarray
git_url               https://git.bioconductor.org/packages/made4
git_branch            master
git_last_commit       54559b0
git_last_commit_date  2021-05-19
Date/Publication      2021-06-25

R topics documented:
  bet.coinertia
  between.graph
  bga
  bga.jackknife

  bga.suppl
  cia
  commonMap
  comparelists
  do3d
  getcol
  graph1D
  heatplot
  html3D
  isDataFrame
  khan
  NCI60
  ord
  overview
  plotarrays
  plotgenes
  prettyDend
  randomiser
  sumstats
  suppl
  topgenes
  Index


bet.coinertia         Between class coinertia analysis

Description
Between class coinertia analysis: a cia of 2 datasets in which the covariance between groups or classes of cases, rather than between individual cases, is maximised.

Usage
  bet.coinertia(df1, df2, fac1, fac2, cia.nf = 2, type = "nsc", ...)

Arguments
  df1       First dataset. A matrix, data.frame, ExpressionSet or marrayRaw-class. If the input is gene expression data in a matrix or data.frame, the rows and columns are expected to contain the variables (genes) and cases (array samples) respectively.
  df2       Second dataset. A matrix, data.frame, ExpressionSet or marrayRaw-class. If the input is gene expression data in a matrix or data.frame, the rows and columns are expected to contain the variables (genes) and cases (array samples) respectively.
  fac1      A factor or vector which describes the classes in df1.
  fac2      A factor or vector which describes the classes in df2.
  cia.nf    Integer indicating the number of coinertia analysis axes to be saved. Default value is 2.
  type      A character string, accepted options are type = "nsc" or type = "pca".
  ...       further arguments passed to or from other methods.

Value
A list of class bet.cia of length 5
  coin         An object of class 'coinertia', sub-class dudi. See coinertia
  coa1, pca1   An object of class 'nsc' or 'pca', with sub-class 'dudi'. See dudi, dudi.pca or dudi.nsc
  coa2, pca2   An object of class 'nsc' or 'pca', with sub-class 'dudi'. See dudi, dudi.pca or dudi.nsc
  bet1         An object of class 'bga', with sub-class 'dudi'. See dudi, bga or bca
  bet2         An object of class 'bga', with sub-class 'dudi'. See dudi, bga or bca.

Note
This is very computationally intensive. The authors of ade4 are currently re-writing the code for coinertia analysis, which should substantially reduce the computational requirements (May 2004).

Author(s)
Aedin Culhane

References
Culhane AC, et al., 2003 Cross platform comparison and visualisation of gene expression data using co-inertia analysis. BMC Bioinformatics. 4:59

See Also
coinertia, cia.

Examples
  ### NEED TO DO
  if (require(ade4, quiet = TRUE)) {}


between.graph         Plot 1D graph of results from between group analysis

Description
Plots a 1D graph of the results of a between group analysis, similar to that in Culhane et al., 2002.

Usage
  between.graph(x, ax = 1, cols = NULL, hor = TRUE, scaled = TRUE, centnames = NULL, varnames = NULL, ...)

Arguments
  x          Object of the class bga resulting from a bga analysis.
  ax         Numeric. The column number of the principal component ($ls and $li) to be used. Default is 1. This is the first component of the analysis.
  cols       Vector of colours. By default colours are obtained using getcol.
  hor        Logical, indicating whether the graph should be plotted horizontally or vertically. The default is a horizontal plot.
  scaled     Logical, indicating whether the coordinates in the graph should be scaled to fit optimally in the plot. Default is TRUE.
  centnames  A vector of variable labels. Default is NULL; if NULL the row names of the centroid $li coordinates will be used.
  varnames   A vector of variable labels. Default is NULL; if NULL the row names of the variable $ls coordinates will be used.
  ...        further arguments passed to or from other methods

Details
This will produce a figure similar to Figure 1 in the paper by Culhane et al., 2002.
between.graph requires both sample and centroid co-ordinates ($ls, $li), which are passed to it via an object of class bga. If cases are to be coloured by class, it also requires a $fac factor, which is also passed to it via an object of class bga.
To plot a 1D graph from another multivariate analysis such as PCA (dudi.pca), COA (dudi.coa), or coinertia analysis, please use graph1D.

Author(s)
Aedin Culhane
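The two sets of coordinates that between.graph needs can be illustrated without made4 or ade4. The following is a minimal base-R sketch under stated assumptions: the data are simulated, the classes are balanced, and the ordination is a plain PCA (via svd) of the class centroids. The names expr, centroid_coords and sample_coords are hypothetical; they only play the roles of the $li (centroid) and $ls (sample) coordinates and are not the package's own code or output.

```r
# Base-R illustration only (not made4/ade4 code): PCA-type between-group
# analysis on simulated data, showing centroid and sample coordinates.
set.seed(1)
classes <- factor(rep(c("A", "B", "C"), each = 6))
n_genes <- 50

# Genes in rows and samples in columns, the layout made4 expects.
expr <- matrix(rnorm(n_genes * length(classes)), nrow = n_genes)
expr[1:10, classes == "B"] <- expr[1:10, classes == "B"] + 3
expr[11:20, classes == "C"] <- expr[11:20, classes == "C"] - 3

# Samples in rows for the ordination; centre each gene.
x <- scale(t(expr), center = TRUE, scale = FALSE)

# Class centroids: one row per class, in the order of levels(classes).
centroids <- rowsum(x, classes) / as.vector(table(classes))

# Ordinate the centroids. With k classes there are at most k - 1 axes.
sv <- svd(centroids)
n_axes <- nlevels(classes) - 1
axes <- sv$v[, seq_len(n_axes), drop = FALSE]

centroid_coords <- centroids %*% axes  # plays the role of $li
sample_coords <- x %*% axes            # plays the role of $ls

# 1D view of axis 1: samples as dots, class centroids as crosses.
stripchart(sample_coords[, 1] ~ classes, pch = 19,
           col = c("red", "blue", "green"), xlab = "Axis 1")
points(centroid_coords[, 1], seq_len(nlevels(classes)),
       pch = 4, cex = 2, lwd = 2)
```

Because projection is linear, each centroid coordinate is the mean of the projected samples of its class, which is why a 1D plot of this kind can show every sample next to its class centroid.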

References
Culhane AC, et al., 2002 Between-group analysis of microarray data. Bioinformatics. 18(12):1600-8.

See Also
graph1D

Examples
  data(khan)
  if (require(ade4, quiet = TRUE)) {
    khan.bga <- bga(khan$train, khan$train.classes)
  }
  between.graph(khan.bga)
  between.graph(khan.bga, ax = 2, lwd = 3, cex = 0.5, col = c("green", "blue", "red", "yellow"))
  between.graph(khan.bga, ax = 2, hor = FALSE, col = c("green", "blue", "red", "yellow"))


bga                   Between group analysis

Description
Discrimination of samples using between group analysis as described by Culhane et al., 2002.

Usage
  bga(dataset, classvec, type = "coa", ...)

  ## S3 method for class 'bga'
  plot(x, axis1 = 1, axis2 = 2, arraycol = NULL, genecol = "gray25", nlab = 10, genelabels = NULL, ...)

Arguments
  dataset     Training dataset. A matrix, data.frame, ExpressionSet or marrayRaw-class. If the input is gene expression data in a matrix or data.frame, the rows and columns are expected to contain the variables (genes) and cases (array samples) respectively.
  classvec    A factor or vector which describes the classes in the training dataset.
  type        Character, "coa", "pca" or "nsc", indicating which data transformation is required. The default value is type = "coa".
  x           An object of class bga. The output from bga or bga.suppl. It contains the projection coordinates from bga, the $ls, $co or $li coordinates to be plotted.
  arraycol, genecol
              Character, colour of points on plot. If arraycol is NULL, arraycol will obtain a set of contrasting colours using getcol, one for each class of cases (microarray samples) on the array (case) plot. genecol is the colour of the points for each variable (gene) on the gene plot.
  nlab        Numeric. An integer indicating the number of variables (genes) at the ends of the axes to be labelled on the gene plot.
  axis1       Integer, the column number for the x-axis. The default is 1.
  axis2       Integer, the column number for the y-axis. The default is 2.
  genelabels  A vector of variable labels. If genelabels = NULL the row.names of the input matrix dataset will be used.
  ...         further arguments passed to or from other methods.

Details
bga performs a between group analysis on the input dataset. This function calls bca. The input format of the dataset is verified using isDataFrame.
Between group analysis is a supervised method for sample discrimination and class prediction. BGA is carried out by ordinating groups (sets of grouped microarray samples), that is, groups of samples are projected into a reduced dimensional space. This is most easily done using PCA or COA of the group means. The choice of PCA or COA is defined by the parameter type.
The user must define microarray sample groupings in advance. These groupings are defined using the input classvec, which is a factor or vector.

Cross-validation and testing of bga results:
bga results should be validated using leave-one-out jackknife cross-validation with bga.jackknife, and by projecting a blind test dataset onto the bga axes using suppl. bga and suppl are combined in bga.suppl, which requires input of both a training and a test dataset. It is important to ensure that the selection of cases for the training and test sets is not biased, and generally many cross-validations should be performed. The function randomiser can be used to randomise the selection of training and test samples.

Plotting and visualising bga results:
1D plots, show one axis only: 1D graphs can be plotted using between.graph and graph1D. between.graph is used for plotting the cases, and requires both the co-ordinates of the cases ($ls) and their centroids ($li). It accepts an object of class bga. graph1D can be used to plot either cases (microarrays) or variables (genes) and only requires a vector of coordinates.
2D plots: Use plot.bga to plot results from bga. plot.bga calls the function plotarrays to draw an xy plot of the cases ($ls). plotgenes is used to draw an xy plot of the variables (genes).
3D plots: 3D graphs can be generated using do3d and html3D. html3D produces a web page in which a 3D plot can be interactively rotated and zoomed, and in which classes or groups of cases can be easily highlighted.

Analysis of the distribution of variance among axes:
It is important to know which cases (microarray samples) are discriminated by the axes. The number of axes or principal components from a bga will equal the number of classes - 1, that is length(levels(classvec)) - 1.
The distribution of variance among axes is described by the eigenvalues ($eig) of the bga analysis. These can be visualised using a scree plot, using scatterutil.eigen as is done in plot.bga. It is also useful to visualise the principal components from a bga, a principal components analysis (dudi.pca), or a correspondence analysis (dudi.coa) using a heatmap. In MADE4 the function heatplot will plot a heatmap with nicer default colours.

Extracting list of top variables (genes):
Use topgenes to get a list of the variables or cases at the ends of the axes. It will return a list of the top n variables (by default n = 5) at the positive, negative or both ends of an axis.
sumstats can be used to return the angle (slope) and distance from the origin of a list of coordinates.
For more details see Culhane et al., 2002 and http://bioinf.ucd.ie/research/BGA.

Value
A list with class bga containing:
  ord   Results of initial ordination. A list of class "dudi" (see dudi)
  bet   Results of between group analysis. A list of class "dudi" (see dudi), "between" (see bca)
  fac   The input classvec, the factor or vector which described the classes in the input dataset

Author(s)
Aedin Culhane

References
Culhane AC, et al., 2002 Between-group analysis of microarray data. Bioinformatics. 18(12):1600-8.

See Also
bga, suppl, suppl.bga, bca, bga.jackknife

Examples
  data(khan)
  if (require(ade4, quiet = TRUE)) {
    khan.bga <- bga(khan$train, classvec = khan$train.classes)
  }
  khan.bga
  plot(khan.bga, genelabels = khan$annotation$Symbol)
  # Provide a view of the principal components (axes) of the bga
  heatplot(khan.bga$bet$ls, dend = "none")


bga.jackknife         Jackknife between group analysis

Description
Performs leave-one-out jackknife analysis of a between group analysis as described by Culhane et al., 2002.

Usage
  bga.jackknife(data, classvec, ...)

Arguments
  data      Input dataset. A matrix or data.frame. If the input is gene expression data in a matrix or data.frame, the columns contain the cases (array samples) which will be jackknifed.
  classvec  A factor or vector which describes the classes in the training dataset
  ...       further arguments passed to or from other methods

Details
Performs a leave-one-out cross-validation of between group analysis bga. Input is a training dataset. This can take 5-10

minutes to compute on a standard gene expression data matrix.
In leave-one-out jackknife analysis, one case (column) is removed. The remaining dataset is subjected to bga. Then the class of the case that was removed is predicted using suppl. This analysis is repeated until all samples have been removed and predicted.

Value
A list containing
  results   The projected co-ordinates of each sample
  summary   A summary of the number and percentage of correctly assigned samples

Author(s)
Aedin Culhane

References
Culhane et al., 2002 Between-group analysis of microarray data. Bioinformatics. 18(12):1600-8.

See Also
bga, bga.suppl, suppl, bca, plot.bga

Examples
  data(khan)
  # NOTE using a very reduced dataset (first 5 genes) to speed up results
  # hence expect poor prediction accuracy
  dim(khan$train)
  print("using only small subset of data")
  if (require(ade4, quiet = TRUE)) {
    bga.jackknife(khan$train[1:5, ], khan$train.classes)
  }


bga.suppl             Between group analysis with supplementary data projection

Description
bga.suppl performs a bga between group analysis with projection of supplementary points using suppl.

Usage
  bga.suppl(dataset, supdata, classvec, supvec = NULL, suponly = FALSE, type = "coa", ...)

Arguments
  dataset   Training dataset. A matrix, data.frame, ExpressionSet or marrayRaw-class. If the input is gene expression data in a matrix or data.frame, the rows and columns are expected to contain the variables (genes) and cases (array samples) respectively.
  supdata   Test or blind dataset. A matrix, data.frame, ExpressionSet or marrayRaw-class. If the input is gene expression data in a matrix or data.frame, the rows and columns are expected to contain the variables (genes) and cases (array samples) respectively. The test dataset supdata and the training dataset dataset must contain the same number of variables (genes).
  classvec  A factor or vector which describes the classes in the training dataset dataset.
  supvec    A factor or vector which describes the classes in the test dataset supdata.
  suponly   Logical indicating whether the returned output should contain the test class assignment results only. The default value is FALSE, that is, the training coordinates, test coordinates and class assignments will all be returned.
  type      Character, "coa", "pca" or "nsc", indicating which data transformation is required. The default value is type = "coa".
  ...       further arguments passed to or from other methods.

Details
bga.suppl calls bga to perform between group analysis (bga) on the training dataset, then it calls suppl to project the test dataset onto the bga axes. It returns the coordinates and class assignment of the cases (microarray samples) in the test dataset as described by Culhane et al., 2002.
The test dataset must contain the same number of variables (genes) as the training dataset.
The input formats of both the training dataset and the test dataset are verified using isDataFrame. Use plot.bga to plot results from bga.

Value
If suponly is FALSE (the default option) bga.suppl returns a list of length 4 containing the results of the bga of the training dataset and the results of the projection of the test dataset onto the bga axes:
  ord     Results of initial ordination. A list of class "dudi" (see dudi).
  bet     Results of between group analysis. A list of class "dudi" (see dudi), "between" (see bca), and "dudi.bga" (see bga)
  fac     The input classvec, the factor or vector which described the classes in the input dataset
  suppl   An object returned by suppl
If suponly is TRUE only the results from suppl will be returned.

Author(s)
Aedin Culhane

References
Culhane AC, et al., 2002 Between-group analysis of microarray data. Bioinformatics. 18(12):1600-8.

See Also
bga, suppl, bca, plot.bga, bga.jackknife

Examples
  data(khan)
  # khan.bga <- bga(khan$train, khan$train.classes)
  if (require(ade4, quiet = TRUE)) {
    khan.bga <- bga.suppl(khan$train, supdata = khan$test,
                          classvec = khan$train.classes, supvec = khan$test.classes)
    khan.bga
    plot.bga(khan.bga, genelabels = khan$annotation$Symbol)
    khan.bga$suppl
  }


cia                   Coinertia analysis: Explore the covariance between two datasets

Description
Performs CIA on two datasets as described by Culhane et al., 2003. Used for meta-analysis of two or more datasets.

Usage
  cia(df1, df2, cia.nf = 2, cia.scan = FALSE, nsc = TRUE, ...)

  ## S3 method for class 'cia'
  plot(x, nlab = 10, axis1 = 1, axis2 = 2, genecol = "gray25",
       genelabels1 = rownames(ciares$co), genelabels2 = rownames(ciares$li), ...)

Arguments
  df1       The first dataset. A matrix, data.frame, ExpressionSet or marrayRaw-class. If the input is gene expression data in a matrix or data.frame, the rows and columns are expected to contain the variables (genes) and cases (array samples) respectively.
  df2       The second dataset. A matrix, data.frame, ExpressionSet or marrayRaw-class. If the input is gene expression data in a matrix or data.frame, the rows and columns are expected to contain the variables (genes) and cases (array samples) respectively.
  cia.nf    Integer indicating the number of coinertia analysis axes to be saved. Default value is 2.
  cia.scan  Logical indicating whether the coinertia analysis eigenvalue (scree) plot should be shown so that the number of axes, cia.nf, can be selected interactively. Default value is FALSE.
  nsc       A logical indicating whether coinertia analysis should be performed using two non-symmetric correspondence analyses dudi.nsc. The default (TRUE) is highly recommended. If FALSE, COA dudi.coa will be performed on df1, and row weighted COA dudi.rwcoa will be performed on df2 using the row weights from df1.
  x         An object of class cia, containing the CIA projected coordinates to be plotted.
  nlab      Numeric. An integer indicating the number of variables (genes) to be labelled on plots.
  axis1     Integer, the column number for the x-axis. The default is 1.
  axis2     Integer, the column number for the y-axis. The default is 2.
  genecol   Character, the colour of genes (variables). The default is "gray25".
  genelabels1, genelabels2
            A vector of variable labels; by default the row.names of each input matrix df1 and df2 are used.
  ...       further arguments passed to or from other methods.

Details
CIA has been successfully applied to the cross-platform comparison (meta-analysis) of microarray gene expression datasets (Culhane et al., 2003). Please refer to this paper and the vignette for help in interpretation of the output from CIA.
Co-inertia analysis (CIA) is a multivariate method that identifies trends or co-relationships in multiple datasets which contain the same samples. That is, the rows or columns of the matrix have to be weighted similarly and thus must be "matchable". In cia, it is assumed that the analysis is being performed on the microarray cases, and thus the columns will be matched between the 2 datasets. Thus please ensure that the order of cases (the columns) in df1 and df2 is equivalent before performing CIA.
CIA simultaneously finds ordinations (dimension reduction diagrams) from the datasets that are most similar. It does this by finding successive axes from the two datasets with maximum covariance. CIA can be applied to datasets where the number of variables (genes) far exceeds the number of samples (arrays), as is the case with microarray analyses.
cia calls coinertia in the ADE4 package. For more information on coinertia analysis please refer to coinertia and several recent reviews (see below).
In the paper by Culhane et al., 2003, the datasets df1 and df2 are transformed using COA and row weighted COA respectively, before coinertia analysis. It is now recommended to perform non-symmetric correspondence analysis (NSC) rather than correspondence analysis (COA) on both datasets.

The RV coefficient
In the object of class cia returned by the analysis, $coinertia$RV gives the RV coefficient. This is a measure of global similarity between the datasets, and is a number between 0 and 1. The closer it is to 1, the greater the global similarity between the two datasets.

Plotting and visualising cia results
plot.cia draws 3 plots.
The first plot uses S.match.col to plot the projection (normalised scores $mY and $mX) of the samples from each dataset onto the one space. Cases (microarray samples) from one dataset are represented by circles, and cases from the second dataset are represented by arrow tips. Each circle and arrow is joined by a line, where the length of the line is proportional to the divergence between the gene expression profiles of that sample in the two datasets. A short line shows good agreement between the two datasets.
The second two plots call plotgenes to show the projection of the variables (genes, $li and $co) from each dataset in the new space. It is important to note the direction of projection of both the variables (genes) and the cases (microarray samples). Variables and cases that are projected in the same direction from the origin have a positive correlation (i.e. those genes are upregulated in those microarray samples).
Please refer to the help on bga for further discussion of the graphing and visualisation functions in MADE4.

Value
An object of the class cia which contains a list of length 4.
  call        list of input arguments, df1 and df2
  coinertia   An object of class "coinertia", sub-class dudi. See coinertia
  coa1        Returns an object of class "coa" or "nsc", with sub-class dudi. See dudi.coa or dudi.nsc
  coa2        Returns an object of class "coa" or "nsc", with sub-class dudi. See dudi.coa or dudi.nsc

Author(s)
Aedin Culhane

References
Culhane AC, et al., 2003 Cross platform comparison and visualisation of gene expression data using co-inertia analysis. BMC Bioinformatics. 4:59

See Also
See also coinertia, plot.cia. CIA and multiple CIA are also implemented in the Bioconductor packages omicade4 and mogsa.

Examples
  data(NCI60)
  print("This will take a few minutes, please wait...")
  if (require(ade4, quiet = TRUE)) {
    # Example data are "G1_Ross_1375.txt" and "G5_Affy_1517.txt"
    coin <- cia(NCI60$Ross, NCI60$Affy)
  }
  attach(coin)
  summary(coin)
  summary(coin$coinertia)
  # $coinertia$RV will give the RV-coefficient, the greater (scale 0-1) the better
  cat(paste("The RV coefficient is a measure of global similarity between the datasets.\n",
            "The two datasets analysed are very similar. ",
            "The RV coefficient of this coinertia analysis is: ", coin$coinertia$RV, "\n", sep = ""))
  plot(coin)
  plot(coin, classvec = NCI60$classes[, 2], clab = 0, cpoint = 3)


commonMap             Plot highlights common points between two 1D plots (bipartite)

Description
commonMap draws two 1D plots, and links the common points between the two.

Usage
  commonMap(x, y, hor = TRUE, cex = 1.5, scaled = TRUE, ...)

Arguments
  x       The coordinates of the first axis
  y       The coordinates of the second axis
  hor     Logical, whether a horizontal line should be drawn on the plot. Default is TRUE.
  cex     Numeric. The amount by which plotting text and symbols should be scaled relative to the default
  scaled  Logical, whether the data in x and y are scaled. Scaling is useful for visualising small or large data values. Set to FALSE if the actual values should be visualised. The default is TRUE.
  ...     further arguments passed to or from other methods

Details
Useful for mapping the genes in common from coinertia analysis.
This draws a 1D graph; x and y are the coordinates from two different analyses, but the rows of each vector correspond (i.e. common genes).

Note
This is useful for examining common points in axes from coinertia analysis, or for comparing results from two different analyses.

Author(s)
Ailis Fagan and Aedin Culhane

See Also
See also between.graph, graph1D

Examples
  a <- rnorm(20)
  b <- rnorm(20)
  par(mfrow = c(2, 2))
  commonMap(a, b)
  commonMap(a, b, hor = FALSE, col = "red", pch = 19)
  commonMap(a, b, col = "blue", cex = 2, pch = 19)
  # If

  # the vectors contain different variables, the rows should define the variables that correspond
  a[15:20] <- NA
  b[10:15] <- NA
  cbind(a, b)
  commonMap(a, b, col = "darkgreen", pch = 18)


comparelists          Return the intersect, difference and union between 2 vectors

Description
This is a very simple function which compares two vectors, x and y. It returns the intersection and unique lists. It is useful for comparing two gene lists.

Usage
  comparelists(dx, dy, ...)

  ## S3 method for class 'comparelists'
  print(x, ...)

Arguments
  dx, dy  A vector.
  x       An object from comparelists.
  ...     further arguments passed to or from other methods.

Details
Reports on the intersect, difference and union between two lists.

Value
An object of class comparelists:
  intersect  Vector containing the intersect between x and y
  Set.Diff   Vector containing the elements unique to x, obtained using setdiff
  XinY       Numeric, indicating the number of elements of x in y
  YinX       Numeric, indicating the number of elements of y in x
  Length.X   Numeric, the number of elements in x
  Length.Y   Numeric, the number of elements in y
  ...        Further arguments passed to or from other methods

Author(s)
Aedin Culhane

See Also
See also intersect, setdiff

Examples
  a <- sample(LETTERS, 20)
  b <- sample(LETTERS, 10)
  z <- comparelists(a, b)
  z$Set.Diff
  z$intersect


do3d                  Generate 3D graph(s) using scatterplot3d

Description
do3d is a wrapper for scatterplot3d. do3d will draw a single 3D xyz plot and will plot each group of points in a different colour, given a factor.
rotate3d calls do3d to draw multiple 3D plots in which each plot is marginally rotated on the x-y axis.

Usage
  do3d(dataset, x = 1, y = 2, z = 3, angle = 40, classvec = NULL, classcol = NULL,
       col = NULL, cex.lab = 0.3, pch = 19, cex.symbols = 1,

       ...)

  rotate3d(dataset, x = 1, y = 2, z = 3, beg = 180, end = 360, step = 12,
           savefiles = FALSE, classvec = NULL, classcol = NULL, col = NULL, ...)

Arguments
  dataset      XYZ coordinates to be plotted. A matrix or data.frame with 3 or more columns. Usually results from multivariate analysis, such as the $co or $li coordinates from a PCA dudi.pca or COA dudi.coa, or the $ls, $co coordinates from bga.
  x            Numeric, the column number for the x-axis, the default is 1 (that is dataset[,1])
  y            Numeric, the column number for the y-axis, the default is 2 (that is dataset[,2])
  z            Numeric, the column number for the z-axis, the default is 3 (that is dataset[,3])
  angle        Numeric, the angle between the x and y axis. Note the result depends on scaling. See scatterplot3d
  classvec     A factor or vector which describes the classes in dataset
  classcol     A factor or vector which lists the colours for each of the classes in the dataset. By default NULL. When NULL, getcol is used to obtain an optimum set of colours for the classes in classvec.
  cex.lab      Numeric. The magnification to be used for the axis annotation relative to the current default text and symbol size. Default is 0.3
  pch          Integer specifying a symbol or single character to be used when plotting points. The default is pch = 19
  cex.symbols  Numeric. The magnification to be used for the symbols relative to the current default text size. Default is 1
  col          A character indicating a colour. To be used if all points are to be one colour. If classvec, classcol and col are all NULL, all points will be drawn in red by default.
  beg          Numeric. The starting angle between the x and y axis for rotate3d. rotate3d will draw plots in which they are rotated from angle beg to angle end
  end          Numeric. The final angle between the x and y axis for rotate3d. rotate3d will draw plots in which they are rotated from angle beg to angle end
  step         Numeric. Increment of the sequence between the starting angle beg and the final angle end.
  savefiles    Logical, indicating whether the plot should be saved as a pdf file. The default is FALSE
  ...          further arguments passed to or from other methods

Details
This calls scatterplot3d to plot a 3D representation of results. It is also worth exploring the package rgl, which enables dynamic 3D plots (that can be rotated):
  library(rgl)
  plot3d(khan.coa$co[, 1], khan.coa$co[, 2], khan.coa$co[, 3], size = 4, col = khan$train.classes)
  rgl.snapshot(file = "test.png", top = TRUE)
  rgl.close()

Value
Produces plots of the xyz coordinates.

Author(s)
Aedin Culhane

See Also
scatterplot3d

Examples
  data(khan)
  if (require(ade4, quiet = TRUE)) {
    khan.coa <- dudi.coa(khan$train, scannf = FALSE, nf = 5)
  }
  par(mfrow = c(2, 1))
  do3d(khan.coa$co, classvec = khan$train.classes)
  do3d(khan.coa$co, col = "blue")
  rotate3d(khan.coa$co, classvec = khan$train.classes)
  khan.bga <- bga(khan$train, khan$train.classes)
  plot.new()
  par(bg = "black")
  do3d(khan.bga$bet$ls, classvec = khan$train.classes)


getcol                Specialised colour palette with set of 21 maximally contrasting colours

Description
Special colour palette developed to maximise the contrast between colours. Colours were selected for visualising groups of points on xy or xyz plots on a white background. Because of this, there are few pastel colours in this palette. getcol contains 2 palettes of 12 and 21 colours.

Usage
  getcol(nc = c(1:3), palette = NULL, test = FALSE)

Arguments
  nc       Numeric. Integer or vector in range 1 to 21. This selects colours from palette
  palette  A character to select either palette "colours1" or "colours2". colours1 contains 12 colours, colours2 contains 21 colours
  test     A logical, if TRUE a plot will be drawn to display the palettes colours1, colours2 and any selected colours.

Details
Colours1 contains the 12 colours "red", "blue", "green", "cyan", "magenta", "yellow", "grey", "black", "brown", "orange", "violet", "purple". These were chosen because they are compatible with rasmol and chime, which are used in html3D.
Colours2 contains 21 colours. These were selected so as to maximise the contrast between groups.
For other colour palettes in R, see colors, palette, rainbow, heat.colors, terrain.colors, topo.colors or cm.colors. Also see the library RColorBrewer.

Value
A vector containing a list of colours.

Author(s)
Aedin Culhane

See Also
See also colors, palette, rainbow, heat.colors, terrain.colors, topo.colors or cm.colors, RColorBrewer

Examples
  getcol(3)
  getcol(c(1:7))
  getcol(10, test = TRUE)
  getcol(c(1:5, 7, 15, 16), palette = "colours2", test = TRUE)


graph1D               Plot 1D graph of axis from multivariate analysis

Description
Draws a 1D plot of an axis from multivariate analysis. Useful for visualising an individual axis from analyses such as PCA dudi.pca or COA dudi.coa. It accepts a factor so that groups of points can be coloured. It can also be used for graphing genes, and will only label n genes at the ends of the axis.

Usage
  graph1D(dfx, classvec = NULL, ax = 1, hor = FALSE, s.nam = row.names(dfx), n = NULL,
          scaled = TRUE, col = "red", width = NULL, ...)

Arguments
  dfx       vector, matrix, or data.frame, which contains

            a column with axis coordinates
  ax        Numeric, indicating the column of the matrix or data.frame to be plotted. The default is 1.
  classvec  Factor, indicating sub-groupings or classes in dfx or dfx[,ax]
  hor       Logical, indicating whether the graph should be drawn horizontally or vertically. The default is vertically.
  s.nam     Vector. Labels of dfx. The default is row.names(dfx)
  n         Numeric. Whether all rows should be plotted; n = 10 would label only the 10 variables at the ends of the axis. By default all variables (rows of dfx) are labelled
  scaled    A logical indicating whether the plot should be scaled to fit. The default is TRUE
  col       A character or vector indicating the colour(s) for points or groups of points. If points are to be coloured according to a factor, length(col) should equal length(levels(classvec))
  width     A vector of length 2, which is the width (of a vertical plot) or height (of a horizontal plot). This can be increased if variable labels are unreadable. The default is c(-2,1)
  ...       further arguments passed to or from other methods

Author(s)
Aedin Culhane

See Also
between.graph

Examples
  a <- rnorm(25)
  graph1D(a, s.nam = letters[1:25])
  graph1D(a, s.nam = letters[1:25], col = "blue", pch = 19, n = 3)
  data(khan)
  if (require(ade4, quiet = TRUE)) {
    khan.coa <- dudi.coa(khan$train, scan = FALSE, nf = 2)
  }
  graph1D(khan.coa$co, ax = 1)


heatplot              Draws heatmap with dendrograms.

Description
heatplot calls heatmap.2 using a red-green colour scheme by default. It also draws dendrograms of the cases and variables using a correlation similarity metric and average linkage clustering as described by Eisen. heatplot is useful for a quick overview or exploratory analysis of data.

Usage
  heatplot(dataset, dend =

           c("both", "row", "column", "none"), cols.default = TRUE, lowcol = "green",
           highcol = "red", scale = "none", classvec = NULL, classvecCol = NULL,
           classvec2 = NULL, distfun = NULL, returnSampleTree = FALSE, method = "ave",
           dualScale = TRUE, zlim = c(-3, 3), scaleKey = TRUE, ...)

Arguments
  dataset       a matrix, data.frame, ExpressionSet or marrayRaw-class. If the input is gene expression data in a matrix or data.frame, the rows and columns are expected to contain the variables (genes) and cases (array samples) respectively.
  dend          A character indicating whether dendrograms should be drawn for both rows and columns "both", just rows "row" or columns "column", or no dendrogram "none". Default is both.
  cols.default  Logical. Default is TRUE. Use blue-brown color scheme.
  lowcol, highcol
                Character indicating colours to be used for down and up regulated genes when drawing the heatmap if the default colors are not used, that is cols.default = FALSE.
  scale         Default is row. Scale and center either "none", "row", or "column".
  classvec, classvec2
                A factor or vector which describes the classes in the columns or rows of the dataset. Default is NULL. If included, a color bar including the class of each column (array sample) or row (gene) will be drawn. It will automatically be added to either the columns or the rows, depending on whether length(as.character(classvec)) == nrow(dataset) or ncol(dataset).
  classvecCol   A vector of length equal to the number of levels in the factor classvec. These are the colors to be used for the row or column colorbar. Colors should be in the same order as levels(factor(classvec))
  distfun       A character, indicating the function used to compute the distance between both rows and columns. Defaults to 1 - Pearson correlation coefficient
  method        The agglomeration method to be used. This should be one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid". See hclust for more details. Default is "ave"
  dualScale     A logical indicating whether to scale the data by row and columns. Default is TRUE
  zlim          A vector of length 2, with the lower and upper limits used for scaling data. Default is c(-3,3)
  scaleKey      A logical indicating whether to draw a heatmap color-key bar. Default is TRUE
  returnSampleTree
                A logical indicating whether to return the sample (column) tree. If TRUE it will return an object of class dendrogram. Default is FALSE.
  ...           further arguments passed to or from other methods.

Details
The hierarchical plot is produced using average linkage cluster analysis with a correlation metric distance. heatplot calls heatmap.2 in the R package gplots.
NOTE: We have changed heatplot scaling in made4 (v1.19.1) in Bioconductor v2.5. heatplot by default dual scales the data to limits of -3, 3. To reproduce the older version of heatplot, use the parameters dualScale = FALSE, scale = "row".

Value
Plots a heatmap with dendrogram of hierarchical cluster analysis. If returnSampleTree is TRUE, it returns an object of class dendrogram, which can then be manipulated.

Note
Because Eisen et al., 1998 use green-red colours for the heatmap, heatplot uses these by default; however, blue-red or yellow-blue schemes are easily obtained by changing lowcol and highcol.

Author(s)
Aedin Culhane

References
Eisen MB, Spellman PT, Brown PO and Botstein D. (1998). Cluster Analysis and Display of Genome-Wide Expression Patterns. Proc Natl Acad Sci USA 95, 14863-8.

See Also
See also hclust, heatmap and dendrogram

Examples
  data(khan)
  ## Change color scheme
  heatplot(khan$train[1:30, ])
  heatplot(khan$train[1:30, ], cols.default = FALSE, lowcol = "white", highcol = "red")
  ## Add labels to rows, columns
  heatplot(khan$train[1:26, ], labCol = c(64:1), labRow = LETTERS[1:26])
  ## Add a color bar
  heatplot(khan$train[1:26, ], classvec = khan$train.classes)
  heatplot(khan$train[1:26, ], classvec = khan$train.classes,
           classvecCol = c("magenta", "yellow", "cyan", "orange"))
  ## Change the scaling to the older made4 version (pre Bioconductor 2.5)
  heatplot(khan$train[1:26, ], classvec = khan$train.classes, dualScale = FALSE, scale = "row")
  ## Getting the members of a cluster and manipulating the tree
  sTree <- heatplot(khan$train, classvec = khan$train.classes, returnSampleTree = TRUE)
  class(sTree)
  plot(sTree)
  ## Cut the tree at the height = 1.0
  lapply(cut(sTree, h = 1)$lower, labels)
  ## Zoom in on the first cluster
  plot(cut(sTree, 1)$lower[[1]])
  str(cut(sTree, 1.0)$lower[[1]])
  ## Visualizing results from an ordination using heatplot
  if (require(ade4, quiet = TRUE)) {
    # save 5 components from correspondence analysis
    res <- ord(khan$train, ord.nf = 5)
    khan.coa = res$ord
  }
  # Provides a view of the components of the Correspondence analysis
  # (gene projection)
  # first 5 components, do not cluster columns, only rows.
  heatplot(khan.coa$li, dend = "row", dualScale = FALSE)
  # Provides a view of the components of the Correspondence analysis
  # (sample projection)
  # The difference between tissues and cell line samples
  # are defined in the first axis.
  # Change the margin size. The default is c(5,5)
  heatplot(khan.coa$co, margins = c(4, 20), dend = "row")
  # Add a color bar, change the heatmap color scheme and no scaling of data
  heatplot(khan.coa$co, classvec2 = khan$train.classes, cols.default = FALSE,
           lowcol = "blue", dend = "row", dualScale = FALSE)
  apply(khan.coa$co, 2, range)


html3D                Produce web page with a 3D graph that can be viewed using Chime web browser plug-in, and/or a pdb file that can be viewed using Rasmol

Description
html3D produces a pdb file that can be viewed using the freeware protein structure viewer Rasmol, and a html web page with a 3D graph that can be rotated and manipulated in a web browser that supports the chime web browser plug-in.

Usage
  html3D(df, classvec = NULL, writepdb = FALSE, filenamebase = "output", writehtml = FALSE,
         title = NULL, scaled = TRUE, xyz.axes = c(1:3), ...)

Arguments
  df            A matrix or data.frame containing the x, y, z coordinates. Typically the output from bga such as the $ls or $co files, or other xyz coordinates ($li or $co) produced by PCA, COA or other dudi
  classvec      factor or vector which describes classes in the df. Default is NULL. If specified each group will be coloured in contrasting colours
  writepdb      Logical. The default is FALSE. If TRUE a file will be saved which can be read into Rasmol.
  writehtml     Logical. The default is FALSE. If TRUE a web html file will be saved which can be viewed in any web browser that supports chime.
  filenamebase  Character. The basename of the html or pdb file(s) to be saved. The default is "output", which will save files output.pdb, output.html, if writepdb or writehtml are TRUE respectively.
  title         Character, the title (header) of the web page saved if writehtml is TRUE. Th

25 edefaultisNULL.scaledLogicalindicatingwh
edefaultisNULL.scaledLogicalindicatingwhetherthedatashouldbescaledforbestt.ThedefaultisTRUExyz.axesvectorindicatingwhichaxestouseforx,yandzaxes.Bydefault,therst3columnsofdf....furtherargumentspassedtoorfromothermethodsDetailsProducesahtmlle,ofa3DgraphwhichcanberotatedusingtheFREEWAREchime(win,Ma-cOS).Chimecanbedownloadedfromhttp://www.mdlchime.com/.html3Dwillcoloursamplesbyclassvecifgivenone,andwillproducechimescripttohigh-lightgroups,spinon/off,andincludebuttonforrestoreforexampleseehttp://bioinf.ucd.ie/research/BGA/supplement.htmlhtml3dcallschime3Dtoproducethehtmlwebpagewitha3Dgraph.Valuehtml3DproducesthepdboutputlewhichcanbereadinRasmolorothermolecularstructureviewers.html3Dproducesahtmllewitha3Dgraphthatcanberotatedandmanipulatedinawebbrowserthatsupportsthechimewebbrowserplug-in.NoteNotechimeisonlyavailableonwindowsorMacOScurrently.Usingthechimeplug-inonLinuxisslightlycomplicatedbutisavailableiftheCrossOverPlug-inisinstalled.InstructionsoninstallingthisandchimeonLinuxareavailableathttp://mirrors.rcsb.org/SMS/STINGm/help/chime_linux.htmlIfyouwishtoviewa3DgraphinRasmol,youwillneedtoexecuteaRasmolscriptsimilartoloadpdbfilename.pdb;setaxeson;selectoff;connect;setambient40;rotatex180;select*;spacefill40html3Dcallschime3Dtoproducethehtmllefromthepdble.TheauthorwouldliketothankWillieTaylor,TheNationalInstituteforMedicalResearch,London,UKforhelpwiththeawkcommandonwhichthi

26 sfunctionisbased.Author(s)AedinCulhane i
sfunctionisbased.Author(s)AedinCulhane isDataFrame25Examplesdata(khan)if(require(ade4,quiet=TRUE)){khan.bga(khan$train.classes)}out.3D(khan.bga$fac,writepdb=TRUE,filenamebase="Khan",writehtml=TRUE)##Notrun:browseURL(paste("file://",file.path(paste(getwd(),"/khan.html",sep="")),sep=""))##End(Notrun) isDataFrameConvertsmicroarrayinputdataintoadataframesuitableforanalysisinADE4. DescriptionConvertsinputdataintoadata.framesuitableforanalysisinADE4.Thisfunctioniscalledbybgaandothermade4functionUsageisDataFrame(dataset,pos=FALSE,trans=FALSE)ArgumentsdatasetAmatrix,data.frame,ExpressionSetormarrayRaw-class.Iftheinputisgeneexpressiondatainamatrixordata.frame.Therowsandcolumnsareexpectedtocontainthevariables(genes)andcases(arraysamples)respectively.posLogicalindicatingwhethertoaddanintegertodataset,togeneratepositivedata.frame.Requiredfordudi.coaordudi.nsc.transLogicalindicatingwhetherdatasetshouldbetransposed.DefaultisFALSE.Detailsbgaandotherfunctionsinmade4callthisfunctionanditisgenerallynotnecessarytocallisDataFramethisdirectly.isDataFramecallsasDataFrame,andwillacceptamatrix,data.frame,ExpressionSetormarrayRaw-classorSummarizedExperimentformat.Itwillalsotransposedataoraddaintegertogenerateapositivedatamatrix.Iftheinputdatacontainsmissingvalues(NA),thesemustrstberemovedorimputed(seetheRlibrariesimpute()orpamr()). 26khanValueReturnsadata.framesuitableforanalysisbyade4ormade4functions.Author(s)Aedin

27 CulhaneSeeAlsoasinBioconductorExamplesda
CulhaneSeeAlsoasinBioconductorExamplesdata(geneData)class(geneData)dim(geneData)dim(isDataFrame(geneData))class(isDataFrame(geneData)) khanMicroarraygeneexpressiondatasetfromKhanetal.,2001.Subsetof306genes. DescriptionKhancontainsgeneexpressionprolesoffourtypesofsmallroundbluecelltumoursofchildhood(SRBCT)publishedbyKhanetal.(2001).ItalsocontainsfurthergeneannotationretrievedfromSOURCEathttp://source.stanford.edu/.Usagedata(khan)FormatKhanisdatasetcontainingthefollowing:•\$train:data.frameof306rowsand64columns.Thetrainingdatasetof64arraysand306geneexpressionvalues•\$test:data.frame,of306rowsand25columns.Thetestdatasetof25arraysand306genesexpressionvalues•\$gene.labels.imagesID:vectorof306Imagecloneidentierscorrespondingtotherownamesof\$trainand\$test.•\$train.classes:factorwith4levels"EWS","BL-NHL","NB"and"RMS",whichcorrespondtothefourgroupsinthe\$traindataset•\$test.classes:factorwith5levels"EWS","BL-NHL","NB","RMS"and"Norm"whichcorrespondtothevegroupsinthe\$testdataset khan27•\$annotation:data.frameof306rowsand8columns.Thistablecontainsfurthergeneanno-tationretrievedfromSOURCEhttp://SOURCE.stanford.eduinMay2004.Foreachofthe306genes,itcontains:–\$CloneIDImageCloneID–\$UGClusterTheUnigeneclustertowhichthegeneisassigned–\$SymbolTheHUGOgenesymbol–\$LLIDThelocusID–\$UGRepAccNucleotidesequenceaccessionnumber–\$LLRepProtAccProtein

28 sequenceaccessionnumber–\$Chromosom
sequenceaccessionnumber–\$Chromosomechromosomelocation–\$CytobandcytobandlocationDetailsKhanetal.,2001usedcDNAmicroarrayscontaining6567clonesofwhich3789wereknowngenesand2778wereESTstostudytheexpressionofgenesinoffourtypesofsmallroundbluecelltumoursofchildhood(SRBCT).Thesewereneuroblastoma(NB),rhabdomyosarcoma(RMS),Burkittlymphoma,asubsetofnon-Hodgkinlymphoma(BL),andtheEwingfamilyoftumours(EWS).Geneexpressionprolesfrombothtumourbiopsyandcelllinesampleswereobtainedandarecontainedinthisdataset.Thedatasetdownloadedfromthewebsitecontainedtheltereddatasetof2308geneexpressionprolesasdescribedbyKhanetal.,2001.Thisdatasetisavailablefromthehttp://bioinf.ucd.ie/people/aedin/R/.InordertoreducethesizeoftheMADE4package,andproducesmallexampledatasets,thetop50genesfromtheendsof3axesfollowingbgawereselected.Thisproducedareduceddatasetsof306genes.Sourcekhancontainsaltereddataof2308geneexpressionprolesaspublishedandprovidedbyKhanetal.(2001)onthesupplementarywebsitetotheirpublicationhttp://research.nhgri.nih.gov/microarray/Supplement/.ReferencesCulhaneAC,etal.,2002Between-groupanalysisofmicroarraydata.Bioinformatics.18(12):1600-8.Khan,J.,Wei,J.S.,Ringner,M.,Saal,L.H.,Ladanyi,M.,Westermann,F.,Berthold,F.,Schwab,M.,An-tonescu,C.R.,Peterson,C.etal.(2001)Classicationanddiagnosticpredictionofcancersusinggeneexpressionprolingandarticialneuralnetworks.Nat.Med.,7,673-679.Examplesda

29 ta(khan)summary(khan) 28NCI60 NCI60Micro
ta(khan)summary(khan) 28NCI60 NCI60MicroarraygeneexpressionprolesoftheNCI60celllines DescriptionNCI60isadatasetofgeneexpressionprolesof60NationalCancerInstitute(NCI)celllines.These60humantumourcelllinesarederivedfrompatientswithleukaemia,melanoma,alongwith,lung,colon,centralnervoussystem,ovarian,renal,breastandprostatecancers.ThispanelofcelllineshavebeensubjectedtoseveraldifferentDNAmicroarraystudiesusingbothAffymetrixandspottedcDNAarraytechnology.ThisdatasetcontainssubsetsfromonecDNAspotted(Rossetal.,2000)andoneAffymetrix(Stauntonetal.,2001)study,andarepre-processedasdescribedbyCulhaneetal.,2003.Usagedata(NCI60)FormatTheformatis:Listof3•\$Ross:data.framecontaining144rowsand60columns.144geneexpressionlogratiomea-surementsoftheNCI60celllines.•\$Affy:data.framecontaining144rowsand60columns.144AffymetrixgeneexpressionaveragedifferencemeasurementsoftheNCI60celllines.•\$classes:Datamatrixof60rowsand2columns.Therstcolumncontainsthenamesofthe60celllinewhichwereanalysed.Thesecondcolumnliststhe9phenotypesofthecelllines,whichareBREAST,CNS,COLON,LEUK,MELAN,NSCLC,OVAR,PROSTATE,RENAL.•\$Annot:Datamatrixof144rowsand4columns.The144rowscontainthe144genesinthe\$Rossand\$Affydatasets,togetherwiththeirUnigeneIDs,andHUGOGeneSymbols.TheGeneSymbolsobtainedforthe\$Rossand\$Affydatasetsdiffered(seenotebelow),hencebotharegiven.ThecolumnsofthematrixaretheIMAGEIDoftheclonesofthe\$Rossdatase

30 t,theHUGOGeneSymbolsoftheseIMAGEcloneIDo
t,theHUGOGeneSymbolsoftheseIMAGEcloneIDobtainedfromSOURCE,theAffymetrixIDofthe\$Affydataset,andtheHUGOGeneSymbolsoftheseAffymetrixIDsobtainedusingannaffy.DetailsThedatasetswereprocessedasdescribedbyCulhaneetal.,2003.TheRossdata.framecontainsgeneexpressionprolesofeachcelllinesintheNCI-60panel,whichweredeterminedusingspottedcDNAarrayscontaining9,703humancDNAs(Rossetal.,2000).ThedataweredownloadedfromTheNCIGenomicsandBioinformaticsGroupDatasetsre-sourcehttp://discover.nci.nih.gov/datasetsNature2000.jsp.Theupdatedversionofthisdataset(updated12/19/01)wasretrieved.Datawereprovidedaslogratiovalues.Inthisstudy,rows(genes)withgreaterthan15andwereremovedfromanalysis,reducingthedatasetto5643spotvaluespercellline.RemainingmissingvalueswereimputedusingaKnearest NCI6029neighbourmethod,with16neighboursandaEuclideandistancemetric(Troyanskayaetal.,2001).Thedataset\$Rosscontainsasubsetofthe144genesofthe1375genessetdescribedbyScherfetal.,2000.Thisdatasetsisavailablefordownloadfromhttp://bioinf.ucd.ie/people/aedin/R/.Inordertoreducethesizeoftheexampledatasets,theUnigeneID'sforeachofthe1375IMAGEID'sforthesegeneswereobtainedusingSOURCEhttp://source.stanford.edu.ThesewerecomparedwiththeUnigeneID'softhe1517genesubsetofthe\$Affydataset.144geneswerecommonbetweenthetwodatasetsandthesearecontainedin\$Ross.TheAffydatawerederivedusinghighdensityHu6800Affymetrixmicroarrayscontaining7129probesets(Stauntonetal.,2001).Thedat

31 asetwasdownloadedfromtheWhiteheadInstitu
asetwasdownloadedfromtheWhiteheadInstituteCan-cerGenomicssupplementaldatatothepaperfromStauntonetal.,http://www-genome.wi.mit.edu/mpr/NCI60/,wherethedatawereprovidedasaveragedifference(perfectmatch-mismatch)values.AsdescribedbyStauntonetal.,anexpressionvalueof100unitswasassignedtoallaveragedifferencevalueslessthan100.Geneswhoseexpressionwasinvariantacrossall60celllineswerenotconsidered,reducingthedatasetto4515probesets.ThisdatasetNCI60\$Affyof1517probeset,containsgenesinwhichtheminimumchangeingeneexpressionacrossall60celllineswasgreaterthan500averagedifferenceunits.Datawerelogged(base2)andmediancentred.Thisdatasetsisavailablefordownloadfromhttp://bioinf.ucd.ie/people/aedin/R/.Inordertoreducethesizeoftheexampledatasets,theUnigeneID'sforeachofthe1517AffymetrixIDofthesegeneswereobtainedusingthefunctionaafUniGeneintheannaffyBioconductorpack-age.These1517UnigeneIDswerecomparedwiththeUnigeneID'softhe1375genesubsetofthe\$Rossdataset.144geneswerecommonbetweenthetwodatasetsandthesearecontainedin\$Affy.SourceThesepre-processeddatasetswereavailableasasupplementtothepaper:CulhaneAC,PerriereG,HigginsDG.Cross-platformcomparisonandvisualisationofgeneex-pressiondatausingco-inertiaanalysis.BMCBioinformatics.2003Nov21;4(1):59.http://www.biomedcentral.com/1471-2105/4/59ReferencesCulhaneAC,PerriereG,HigginsDG.Cross-platformcomparisonandvisualisationofgeneexpres-siondatausingco-inertiaanalysis.BMCBioinformatics.2003

32 Nov21;4(1):59.RossDT,ScherfU,EisenMB,Per
Nov21;4(1):59.RossDT,ScherfU,EisenMB,PerouCM,ReesC,SpellmanP,IyerV,JeffreySS,VandeRijnM,WalthamM,PergamenschikovA,LeeJC,LashkariD,ShalonD,MyersTG,WeinsteinJN,BotsteinD,BrownPO:Systematicvariationingeneexpressionpatternsinhumancancercelllines.NatGenet2000,24:227-235ScherfU,RossDT,WalthamM,SmithLH,LeeJK,TanabeL,KohnKW,ReinholdWC,MyersTG,AndrewsDT,ScudieroDA,EisenMB,SausvilleEA,PommierY,BotsteinD,BrownPO,WeinsteinJN:Ageneexpressiondatabaseforthemolecularpharmacologyofcancer.NatGenet2000,24:236-244.StauntonJE,SlonimDK,CollerHA,TamayoP,AngeloMJ,ParkJ,ScherfU,LeeJK,ReinholdWO,WeinsteinJN,MesirovJP,LanderES,GolubTR:Chemosensitivitypredictionbytranscrip-tionalproling.ProcNatlAcadSciUSA2001,98:10787-10792. 30ordTroyanskayaO,CantorM,SherlockG,BrownP,HastieT,TibshiraniR,BotsteinD,AltmanRB:MissingvalueestimationmethodsforDNAmicroarrays.Bioinformatics2001,17:520-525.Examplesdata(NCI60)summary(NCI60) ordOrdination DescriptionRunprincipalcomponentanalysis,correspondenceanalysisornon-symmetriccorrespondenceanal-ysisongeneexpressiondataUsageord(dataset,type="coa",classvec=NULL,ord.nf=NULL,trans=FALSE,...)##S3methodforclass'ord'plot(x,axis1=1,axis2=2,arraycol=NULL,genecol="gray25",nlab=10,genelabels=NULL,arraylabels=NULL,classvec=NULL,...)ArgumentsdatasetTrainingdataset.Amatrix,data.frame,ExpressionSetormarrayRaw-class.Iftheinputisgeneexpressiondatainamatrixordata.frame.Therowsandcolumnsareexpectedtocontainth

33 evariables(genes)andcases(arraysamples)r
evariables(genes)andcases(arraysamples)respectively.classvecAfactororvectorwhichdescribestheclassesinthetrainingdataset.typeCharacter,"coa","pca"or"nsc"indicatingwhichdatatransformationisre-quired.Thedefaultvalueistype="coa".ord.nfNumeric.Indicatingthenumberofeigenvectortobesaved,bydefault,ifNULL,alleigenvectorswillbesaved.transLogicalindicatingwhether'dataset'shouldbetransposedbeforeordination.UsedbyBGADefaultisFALSE.xAnobjectofclassord.Theoutputfromord.Itcontainstheprojectioncoordi-natesfromord,the\$coor\$licoordinatestobeplotted.arraycol,genecolCharacter,colourofpointsonplot.IfarraycolisNULL,arraycolwillobtainasetofcontrastingcoloursusinggetcol,foreachclassesofcases(microarraysamples)onthearray(case)plot.genecolisthecolourofthepointsforeachvariable(genes)ongeneplot.nlabNumeric.Anintegerindicatingthenumberofvariables(genes)attheendofaxestobelabelled,onthegeneplot. ord31axis1Integer,thecolumnnumberforthex-axis.Thedefaultis1.axis2Integer,thecolumnnumberforthey-axis,Thedefaultis2.genelabelsAvectorofvariableslabels,ifgenelabels=NULLtherow.namesofinputmatrixdatasetwillbeused.arraylabelsAvectorofvariableslabels,ifarraylabels=NULLthecol.namesofinputmatrixdatasetwillbeused....furtherargumentspassedtoorfromothermethods.Detailsordcallseitherdudi.pca,dudi.coaordudi.nscontheinputdataset.TheinputformatofthedatasetisveriedusingisDataFrame.Iftheuserdenesmicroarraysamplegroupings,thesearecoloursonplot

34 sproducedbyplot.ord.Plottingandvisualisi
sproducedbyplot.ord.Plottingandvisualisingbgaresults:2Dplots:plotarraystodrawanxyplotofcases(\$ls).plotgenes,isusedtodrawanxyplotofthevariables(genes).3Dplots:3Dgraphscanbegeneratedusingdo3Dandhtml3D.html3Dproducesawebpageinwhicha3Dplotcanbeinteractivelyrotated,zoomed,andinwhichclassesorgroupsofcasescanbeeasilyhighlighted.1Dplots,showoneaxisonly:1Dgraphscanbeplottedusinggraph1D.graph1Dcanbeusedtoploteithercases(microarrays)orvariables(genes)andonlyrequiresavectorofcoordinates(\$li,\$co)Analysisofthedistributionofvarianceamongaxes:Thenumberofaxesorprincipalcomponentsfromaordwillequalnrowthenumberofrows,orthencol,numberofcolumnsofthedataset(whicheverisless).Thedistributionofvarianceamongaxesisdescribedintheeigenvalues(\$eig)oftheordanalysis.Thesecanbevisualisedusingascreeplot,usingscatterutil.eigenasitdoneinplot.ord.Itisalsousefultovisualisetheprincipalcomponentsfromausingaordorprincipalcomponentsanalysisdudi.pca,orcorrespondenceanalysisdudi.coausingaheatmap.InMADE4thefunctionheatplotwillplotaheatmapwithnicerdefaultcolours.Extractinglistoftopvariables(genes):Usetopgenestogetlistofvariablesorcasesattheendsofaxes.Itwillreturnalistofthetopnvariables(bydefaultn=5)atthepositive,negativeorbothendsofanaxes.sumstatscanbeusedtoreturntheangle(slope)anddistancefromtheoriginofalistofcoordinates.ValueAlistwithaclassordcontaining:ordResultsofinitialordination.Alistofclass"dudi"(seedudi)facTheinputclassvec,thefa

35 ctororvectorwhichdescribedtheclassesinth
ctororvectorwhichdescribedtheclassesintheinputdataset.CanbeNULL. 32overviewAuthor(s)AedinCulhaneSeeAlsoSeeAlsodudi.pca,dudi.coaordudi.nsc,bga,Examplesdata(khan)if(require(ade4,quiet=TRUE)){khan.coa(classvec=khan$train.classes,type="coa")}khan.coaplot(khan.coa,genelabels=khan$annotation$Symbol)plotarrays(khan.coa)#Provideaviewofthefirst5principalcomponents(axes)ofthecorrespondenceanalysisheatplot(khan.coa$ord$co[,1:5],dend="none",dualScale=FALSE) overviewDrawboxplot,histogramandhierarchicaltreeofgeneexpressiondata DescriptionVerysimplewrapperfunctionthatdrawsaboxplot,histogramandhierarchicaltreeofexpressiondataUsageoverview(dataset,labels=NULL,title="",classvec=NULL,hc=TRUE,boxplot=TRUE,hist=TRUE,returnTree=FALSE)ArgumentsdatasetAmatrix,data.frame,ExpressionSetormarrayRaw-class.Iftheinputisgeneexpressiondatainamatrixordata.frame.Therowsandcolumnsareexpectedtocontainthevariables(genes)andcases(arraysamples)respectively.labelsVector,labelstobeplacedonsamplesinplots.Defaultisrownames(dataset).titleCharacter,labeltobeplacedonplots.DefaultisNULL.classvecAfactororvectorwhichdescribestheclassesincolumnsofthedataset.DefaultisNULL.Ifincludedcolumns(arraysamples)onthedendrogramwillbecolouredbyclass.hcLogical.Drawdendrogramofhierarchicalclusteranalysisofcases.DefaultisTRUE. plotarrays33boxplotLogical.Drawboxplot.DefaultisTRUE.histLogical.Drawhistogram.DefaultisTRUE.returnTreeLogical.Returnthehieracrhicalclu

36 steranalysisresults.DefaultisFALSE.Detai
steranalysisresults.DefaultisFALSE.DetailsThehierarchicalplotisproducedusingaveragelinkageclusteranalysiswithPearson'scorrelationmetricasdescribedbyEisenetal.,1999.Author(s)AedinCulhaneSeeAlsoSeealsoasboxplot,hclust,histExamplesdata(khan)logkhan()print(class(logkhan))overview(logkhan,title="SubsetofKhanTrain")overview(logkhan,classvec=khan$train.classes,labels=khan$train.classes,title="SubsetofKhanTrain")overview(logkhan,classvec=khan$train.classes,labels=khan$train.classes,title="SubsetofKhanTrain",boxplot=FALSE,his=FALSE) plotarraysGraphxyplotofvariable(array)projectionsfromordination,betweengroupanalysisorcoinertiaanalysis. DescriptionGraphxyplotofvariablesusings.var,s.groupsors.match.col.Usefulforvisualisingarraycoordi-nates(\$li)resultingfromord,bgaorciaofmicroarraydata.Usageplotarrays(coord,axis1=1,axis2=2,arraylabels=NULL,classvec=NULL,graph=c("groups","simple","labels","groups2","coinertia","coinertia2"),labelsize=1,star=1,ellipse=1,arraycol=NULL,...) 34plotarraysArgumentscoordadata.frameormatrixorobjectfromordbgaorciaanalysiswithatleasttwocolumns,containingx,ycoordinatestobeplottedaxis1Aninteger,thecolumnnumberforthex-axis.Defaultis1,soaxes1isdudi-var[,1]axis2Aninteger,thecolumnnumberforthey-axis.Defaultis2,soaxes2isdudi-var[,2]arraylabelsAvectorofvariableslabels.Defaultisrow.names(coord)classvecAfactororvectorwhichdescribestheclassesincoord.DefaultisNULL.Ifincludedvariableswillbecolour

37 edbyclass.graphAcharacteroftype"groups",
edbyclass.graphAcharacteroftype"groups","simple","labels","groups2","coinertia"or"coin-ertia2"whichspeciesthetypeofplottypeor"graph"tobedrawn.Byde-faultthegraphwillbeselecteddependingontheclassofcooord,andwhetheraclassvectorisspeciedlabelsizeSizeofsamplelabels,bydefault=1starIfdrawinggroups,whethertojoinsamplestocentroidcreatinga"star"ellipseIfdrawinggroups,whethertodrawanellipseorringaroundthesamplesarraycolCharacterwithlengthequaltothenumberoflevelsinthefactorclassvec.Colorsforeachofthelevelsinthefactorclassvec...furtherargumentspassedtoorfromothermethodDetailsplotarrayscallsthefunctions.var,s.groupsors.match.col.Ifyouwishtoreturnatableorlistofthetoparrayattheendofanaxis,usethefunctiontopgenes.ValueAnxyplotNoteplotarraysplotsvariablesusings.var,s.groups,s.match.colwhicharemodiedsversionofs.label,s.class.,ands.match.Author(s)AedinCulhaneSeeAlsoSeeAlsoass.varands.label plotgenes35Examplesdata(khan)if(require(ade4,quiet=TRUE)){khan.bga(khan$train.classes)}attach(khan.bga)par(mfrow=c(2,1))plotarrays(khan.bga)plotarrays(khan.bga,graph="simple")plotarrays(khan.bga,graph="labels")plotarrays(khan.bga,graph="groups")plotarrays(khan.bga,graph="groups2") plotgenesGraphxyplotofvariable(gene)projectionsfromPCAorCOA.Onlylabelvariablesatendsofaxes DescriptionGraphxyplotofvariablesbutonlylabelvariablesatendsofXandYaxes.Usefulforgraphinggenescoordinates(\$co)resultingfromPCAorCOAofmicroarraydata.Us

38 ageplotgenes(coord,nlab=10,axis1=1,axis2
ageplotgenes(coord,nlab=10,axis1=1,axis2=2,genelabels=row.names(coord),boxes=TRUE,colpoints="black",...)Argumentscoordadata.frameormatrixorobjectfromordbgaorciaanalysiswithatleasttwocolumns,containingx,ycoordinatestobeplotted.nlabNumeric.Anintegerindicatingthenumberofvariablesatendsofaxestobelabelled.axis1Aninteger,thecolumnnumberforthex-axis.Defaultis1,soaxis1isdudi-var[,1].axis2Aninteger,thecolumnnumberforthey-axis.Defaultis2,soaxis2isdudi-var[,2].genelabelsAvectorofgene(variable)labels.Defaultisrow.names(coord)boxesAlogical,indicatingwhetheraboxshouldbeplottedsurroundingeachvariablelabel.ThedefaultisTRUE.colpointsThecolourofthepointsontheplot.Thedefaultis"black"....furtherargumentspassedtoorfromothermethod. 36prettyDendDetailsplotgenescallsthefunctiongeneswhichreturnanindexofthe"top"variablesattheendsofthexandyaxes.Ifyouwishtoreturnatableorlistofthetopgenesattheendofanaxis,usethefunctiontopgenes.ValueAnxyplotNoteplotgenesplotsvariablesusings.var,whichisamodiedversionofs.label.Author(s)AedinCulhaneSeeAlsoSeeAlsoass.varands.labelExamplesdata(khan)if(require(ade4,quiet=TRUE)){khan.ord(classvec=khan$train.classes)}par(mfrow=c(2,2))#s.var(khan.ord$co,col=as.numeric(khan$train.classes),clabel=0.8)plotgenes(khan.ord,colpoints="red")plotgenes(khan.ord,colpoints="red",genelabels=khan$annotation$Symbol)plotgenes(khan.ord,colpoints="gray",genelabels=khan$annotation$Symbol,boxes=FALSE) prettyDendDraw

39 hierarchicaltreeofgeneexpressiondatawith
hierarchicaltreeofgeneexpressiondatawithacolorbarfornumerousclassvectors DescriptionFunctionwhichperformsahierarchicalclusteranalysisofdata,drawingadendrogram,withcolor-barsfordifferentsamplecovariatebeneaththedendrogramUsageprettyDend(dataset,labels=NULL,title="",classvec=NULL,covars=1,returnTree=FALSE,getPalette=getcol,...) prettyDend37Argumentsdatasetamatrix,data.frame,ExpressionSetormarrayRaw-class.Iftheinputisgeneexpressiondatainamatrixordata.frame.Therowsandcolumnsareexpectedtocontainthevariables(genes)andcases(arraysamples)respectively.labelsVector,labelstobeplacedonsamplesinplots.Defaultisrownames(dataset).titleCharacter,labeltobeplacedonplots.DefaultisNULL.classvecAfactororvectorormatrixordata.framewhichdescribestheclassesincolumnsofthedataset.DefaultisNULL.covarsNumeric.Thecolumnsofthedata.frameclassvetobeusedasclassvectors.Thesewillbedisplayedascolorbarsunderthedendrogram.Thedefaultis1(column1).returnTreeLogical.Returnthehieracrhicalclusteranalysisresults.DefaultisFALSE.getPaletteFunction,whichgeneratesapaletteofcolors.Thedefaultusesgetcolfunctioninmade4.Otherexamplesareprovidedbelow...furtherargumentspassedtoorfromothermethods.DetailsThehierarchicalplotisproducedusingaveragelinkageclusteranalysiswith1-Pearson'scorrela-tionmetric.Thedefaultsetofcolorsusedtogeneratethecolorbarsoftheplotscanbechanged(seeexamplebelow).Bydefault,ifthereisonlytwolevelsinthefactor,thecolorswillbeblackandgre

40 y.Author(s)AedinCulhaneSeeAlsoSeealsoaso
y.Author(s)AedinCulhaneSeeAlsoSeealsoasoverview,hclustExamplesdata(khan)logkhan()#GetacharactervectorwhichdefineswhichkhansamplesarecelllinesortissuesamplekhanAnnot=cbind(as.character(khan$train.classes),khan$cellType)print(khanAnnot[1:3,])#Add2colorbar,oneforcancersubtype,anotherforcelltypeunderdendrogramprettyDend(logkhan,classvec=khanAnnot,covars=c(1,2),labels=khan$train.classes)#Tochangethepaletteofcolors#Usetopo.colors(),seecolors()formorehelponinbuiltpalettesprettyDend(logkhan,classvec=khanAnnot,covars=c(1,2), 38randomiserlabels=khan$train.classes,getPalette=topo.colors)#TouseRColorBrewerPaletteslibrary(RColorBrewer)#UseRColorBrewerDark2whichcontains8colorsprettyDend(logkhan,classvec=khanAnnot,covars=c(1,2),labels=khan$train.classes,getPalette=function(x)brewer.pal(8,"Dark2")[1:x])#UseRColorBrewerSet1whichcontains9colorsprettyDend(logkhan,classvec=khanAnnot,covars=c(1,2),labels=khan$train.classes,getPalette=function(x)brewer.pal(9,"Set1")[1:x]) randomiserRandomlyreassigntrainingandtestsamples DescriptionThisfunctionisusedtocheckforbiasbetweenatrainingandtestdata.Itreturnanewindex,whichrandomlyre-assignssamplesinthetrainingdatatothetestdatasetandviceversa.Usagerandomiser(ntrain=77,ntest=19)ArgumentsntrainNumeric.AintegerindicatingthenumberofcasesinthetrainingdatasetntestNumeric.AintegerindicatingthenumberofcasesinthetestdatasetDetailsProducesnewindicesthatcanbeusedfortraining/testdatasetsVa

41 lueItreturnsalist,containing2vectorstrai
lueItreturnsalist,containing2vectorstrainAvectoroflengthntrain,whichcanbeusedtoindexanewtrainingdatasettestAvectoroflengthntest,whichcanbeusedtoindexanewtestdatasetAuthor(s)AedinCulhane sumstats39Examplesrandomiser(10,5)train(()ncol=20,nrow=20,dimnames=list(1:20,paste("train",letters[1:20],sep=".")))test(()ncol=10,nrow=20,dimnames=list(1:20,paste("test",LETTERS[1:10],sep=".")))all()colnames(train)colnames(test)newInd(ntest=10)newtrainnewtestcolnames(newtrain)colnames(newtest) sumstatsSummarystatisticsonxyco-ordinates,returnstheslopesanddistancefromoriginofeachco-ordinate. DescriptionGivenadata.frameormatrixcontainingxycoordinates,itreturnstheslopeanddistancefromoriginofeachcoordinate.Usagesumstats(array,xax=1,yax=2)ArgumentsarrayAdata.frameormatrixcontainingxycoordinates,normallya\$co,\$lifromdudisuchasPCAorCOA,or\$lsfrombgaxaxNumeric,anintegerindicatingthecolumnofthexaxiscoordinates.Defaultxax=1yaxNumeric,anintegerindicatingthecolumnofthexaxiscoordinates.Defaultxax=2DetailsInPCAorCOA,thevariables(upregulatedgenes)thataremostassociatedwithacase(microarraysample),arethosethatareprojectedinthesamedirectionfromtheorigin. 40sumstatsVariablesorcasesthathaveagreatercontributiontothevarianceinthedataareprojectedfurtherfromtheorigininPCA.Equallyvariablesandcaseswiththestrongassociationhaveahighchi-squarevalue,andareprojectedwithgreaterdistancefromtheorigininCOA,SeeadescriptionfromCulhaneetal.,2002formor

42 edetails.Althoughtheprojectionofco-ordin
edetails.Althoughtheprojectionofco-ordinatesarebestvisualisedonanxyplot,sumstatsreturnstheslopeanddistancefromoriginofeachx,ycoordinateinamatrix.ValueAmatrix(ncol=3)containingslopeangle(indegrees)distancefromoriginofeachx,ycoordinatesinamatrix.Author(s)AedinCulhaneExamplesdata(khan)if(require(ade4,quiet=TRUE)){khan.bga(khan$train.classes)}plotarrays(khan.bga$bet$ls,classvec=khan$train.classes)st.out()#GetstatsonclassesEWSandBLEWS()st.out[EWS,]BL()st.out[BL,]#AdddashedlinetoplottohighlightminandmaxslopesofclassBLslope.BL.min()slope.BL.max()abline(c(0,slope.BL.min),col="red",lty=5)abline(c(0,slope.BL.max),col="red",lty=5) suppl41 supplProjectionofsupplementarydataontoaxesfromabetweengroupanalysis DescriptionProjectionandclasspredictionofsupplementarypointsontoaxesfromabetweengroupanalysis,bga.Usagesuppl(dudi.bga,supdata,supvec=NULL,assign=TRUE,...)##S3methodforclass'suppl'plot(x,dudi.bga,axis1=1,axis2=2,supvec=x$true.class,supvec.pred=x$predicted,...)Argumentsdudi.bgaAnobjectreturnedbybga.supdataTestorblinddataset.Acceptedformatsareamatrix,data.frame,ExpressionSetormarrayRaw-class.supvecAfactororvectorwhichdescribestheclassesinthetrainingdataset.supvec.predAfactororvectorwhichdescribestheclasseswhichwerepredictedbysuppl.assignAlogicalindicatingwhetherclassassignmentshouldbecalculatedusingthemethoddescribedbyCulhaneetal.,2002.ThedefaultvalueisTRUE.xAnobjectreturnedbysuppl.axis1Integer,thecolumnnumb

43 erforthex-axis.Thedefaultis1.axis2Intege
erforthex-axis.Thedefaultis1.axis2Integer,thecolumnnumberforthey-axis.Thedefaultis2....furtherargumentspassedtoorfromothermethods.DetailsAfterperformingabetweengroupanalysisonatrainingdatasetusingbga,atestdatasetcanbethenprojectedontobgaaxesusingsuppl.supplreturnstheprojectedcoordinatesandassignmentofeachtestcase(array).Thetestdatasetmustcontainthesamenumberofvariables(genes)asthetrainingdataset.TheinputformatofboththetrainingdatasetandtestdatasetareveriedusingisDataFrame.Useplot.bgatoplotresultsfrombga.ValueAlistcontaining:supplAnobjectreturnedbysuppl 42topgenesAuthor(s)AedinCulhaneReferencesCulhaneAC,etal.,2002Between-groupanalysisofmicroarraydata.Bioinformatics.18(12):1600-8.SeeAlsoSeeAlsobga,bca,plot.bga,bga.jackknifeExamplesdata(khan)#khan.bga(khan$train.classes)if(require(ade4,quiet=TRUE)){khan.bga(supdata=khan$test,classvec=khan$train.classes,supvec=khan$test.classes)khan.bgaplot.bga(khan.bga,genelabels=khan$annotation$Symbol)khan.bga$supplplot.suppl(khan.bga$suppl,khan.bga)plot.suppl(khan.bga$suppl,khan.bga,supvec=NULL,supvec.pred=NULL)plot.suppl(khan.bga$suppl,khan.bga,axis1=2,axis2=3,supvec=NULL,supvec.pred=NULL)} topgenesTopgenes,returnsalistofvariablesattheends(positive,negativeorboth)ofanaxis DescriptiontopgeneswillreturnalistofthetopNvariablesfromthepositive,negativeorbothendsofanaxis.Thatis,itreturnsalistofvariablesthathavethemaximumand/orminimumvaluesinavector.Usagetopgenes(x

44 ,n=10,axis=1,labels=row.names(x),ends="b
,n=10,axis=1,labels=row.names(x),ends="both",...)ArgumentsxAvector,matrixordata.frame.Typicallyadataframe\$coor\$lifromdudior\$ls,\$li,\$cofrombga.nAnintegerindicatingthenumberofvariablestobereturned.Defaultis5.axisAnintegerindicatingthecolumnofx.Defaultis1(rstaxis,of\$coor\$lile) topgenes43labelsAvectorofrownames,forx[,axis].Defaultvaluesisrow.names(x)endsAstring,"both","neg","pos",indicatingwhethervariablelabelshouldbereturnfromboth,thenegativeorthepositiveendofanaxis.Thedefaultisboth....furtherargumentspassedtoorfromothermethodsDetailstopgenescallsgenes1d.genes1dissimilartogenes,butreturnsanindexofgenesattheendsofoneaxisnottwoaxes.Givena\$coor\$lileitwillreturnthatvariablesattheendsoftheaxis.ValueReturnsavectororlistofvectors.Author(s)AedinCulhaneSeeAlsoSeeAlsoasgenesExamples#Simpleexamplea()order(a)topgenes(a,labels=c(1:length(a)),ends="neg")#Appliedexampledata(khan)if(require(ade4,quiet=TRUE)){khan.coa()ind(ends="pos")ind.ID(ends="pos",labels=khan$gene.labels.imagesID)ind.symbol(ends="pos",labels=khan$annotation$Symbol)Top10.poscbind("GeneSymbol"=ind.symbol,"CloneID"=ind.ID,"Coordinates"=khan.coa$ord$li[ind,],row.names=c(1:length(ind)))Top10.pos Indexcolorgetcol,18datasetskhan,26NCI60,28hplotbetween.graph,4cia,11commonMap,13do3d,16getcol,18graph1D,19heatplot,20html3D,23overview,32plotarrays,33plotgenes,35prettyDend,36manipbet.coinertia,2bga,5bga.jackknife,8bga.su

45 ppl,9comparelists,15graph1D,19heatplot,2
ppl,9comparelists,15graph1D,19heatplot,20isDataFrame,25ord,30overview,32prettyDend,36randomiser,38sumstats,39suppl,41topgenes,42multivariatebet.coinertia,2between.graph,4bga,5bga.jackknife,8bga.suppl,9cia,11commonMap,13ord,30plotarrays,33plotgenes,35suppl,41as,26asDataFrame,25bca,3,6–8,10,42bet.coinertia,2between.graph,4,6,14bga,3,4,5,7,8,10,12,16,23,25,32,39,41,42bga.jackknife,6,7,8,10,42bga.suppl,5,6,8,9boxplot,33chime3D,24cia,2,3,11coinertia,3,4,12,13commonMap,13comparelists,15data.frame,3,5,8,9,11,16,19,20,23,25–28,30,32,34,35,37,39,41,42dendrogram,22do3D,6,31do3d,16dudi,3,7,10,13,23,31,39,42dudi.coa,4,7,11,13,16,19,31,32dudi.nsc,3,11,13,31,32dudi.pca,3,4,7,16,19,31,32dudi.rwcoa,11ExpressionSet,3,5,9,11,20,25,30,32,37,41factor,2644 INDEX45genes,43genes1d,43getcol,4,18graph1D,4–6,14,19,31hclust,22,33,37heatmap,22heatmap.2,21heatplot,7,20,31hist,33html3D,6,23,31intersect,15isDataFrame,6,25,31khan,26matrix,3,5,8,9,11,16,19,20,23,25,28,30,32,34,35,37,39,41,42NCI60,28ord,30overview,32,37plot.bga,8,10,41,42plot.bga(bga),5plot.cia(cia),11plot.ord(ord),30plot.suppl(suppl),41plotarrays,6,31,33plotgenes,6,31,35prettyDend,36print.comparelists(comparelists),15randomiser,6,38rotate3d(do3d),16s.class,34s.groups,34s.label,34,36s.match,34s.match.col,34s.var,34,36scatterplot3d,16,17scatterutil.eigen,7,31setdiff,15sumstats,7,31,39suppl,6–8,10,41,41suppl.bga,7topgenes,7,31,34,36,42vecto
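The heatplot, overview and prettyDend entries above all describe the same clustering recipe: average linkage on a 1 - Pearson correlation distance. heatplot additionally scales the data to the limits in zlim. The base R sketch below illustrates both ideas on made-up data. It is not made4's implementation: the names toy, cor_dist and clip_to_zlim are hypothetical, and the row-wise centring followed by clipping is only an assumed reading of "dual scaling".

```r
# Toy expression matrix: 6 genes (rows) x 5 arrays (columns); values are made up.
set.seed(1)
toy <- matrix(rnorm(30), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("array", 1:5)))

# 1 - Pearson correlation between columns (samples), returned as a dist object.
cor_dist <- function(m) as.dist(1 - cor(m))

# Average linkage clustering of the samples, converted to a dendrogram.
sample_tree <- hclust(cor_dist(toy), method = "average")
sample_dend <- as.dendrogram(sample_tree)

# Assumed form of the scaling: centre and scale each row, then clip to zlim.
clip_to_zlim <- function(m, zlim = c(-3, 3)) {
  scaled <- t(scale(t(m)))            # row-wise z-scores
  scaled[scaled < zlim[1]] <- zlim[1] # clip values below the lower limit
  scaled[scaled > zlim[2]] <- zlim[2] # clip values above the upper limit
  scaled
}
toy_scaled <- clip_to_zlim(toy, zlim = c(-1, 1))
```

A dendrogram built this way can be cut and inspected with cut(), labels() and plot(), in the same way as the sample tree in the heatplot examples.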

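The sumstats entry above reports, for each x,y coordinate, a slope, an angle in degrees and a distance from the origin. As a rough illustration of those three quantities, here is a minimal base R sketch. The helper xy_stats is hypothetical, and its use of atan2 for the angle is an assumption, so its output may not match the sumstats output column for column.

```r
# Hypothetical helper: slope, angle (degrees) and distance from the origin
# for each row of a two-column coordinate matrix.
xy_stats <- function(coords, xax = 1, yax = 2) {
  x <- coords[, xax]
  y <- coords[, yax]
  cbind(slope = y / x,                   # rise over run through the origin
        angle = atan2(y, x) * 180 / pi,  # direction of the point, in degrees
        dist  = sqrt(x^2 + y^2))         # Euclidean distance from (0, 0)
}

# Three made-up points.
pts <- rbind(a = c(1, 1), b = c(3, 4), c = c(-2, 0))
stats <- xy_stats(pts)
stats
```

Points that share an angle are projected in the same direction from the origin, and a larger dist corresponds to a point that lies further out along that direction.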
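The randomiser entry above returns two index vectors that re-assign pooled samples to new training and test sets. One way such an index could be built is sketched below in base R. The function reassign_samples is a hypothetical stand-in, not the made4 code, and it assumes that the training samples occupy the first ntrain columns of the pooled matrix.

```r
# Hypothetical stand-in for randomiser: shuffle the pooled sample positions
# and split them into new training and test indices.
reassign_samples <- function(ntrain, ntest) {
  shuffled <- sample(seq_len(ntrain + ntest))
  list(train = shuffled[seq_len(ntrain)],
       test  = shuffled[ntrain + seq_len(ntest)])
}

set.seed(42)
idx <- reassign_samples(ntrain = 20, ntest = 10)

# Pooled matrix of 20 variables x 30 samples (made-up values).
pooled <- matrix(seq_len(20 * 30), nrow = 20)
new_train <- pooled[, idx$train]
new_test  <- pooled[, idx$test]
```

Every pooled sample ends up in exactly one of the two new sets, so repeating an analysis on new_train and new_test gives a simple check for bias in the original split.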