Slide 1: Transcriptome Analysis
Microarray Technology and Data Analysis
Roy Williams PhD, Sanford | Burnham Medical Research Institute
Slide 2: Microarray Revolution
Slide 3: Measuring Gene Expression
Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein would be more direct, but is currently harder.
Slide 4: General assumption of microarray technology
Use mRNA transcript abundance level as a measure of expression for the corresponding gene
Proportional to degree of gene expression
Slide 5: How to measure RNA abundance
Several different approaches with similar themes
Illumina bead array – highly redundant oligo array
Affymetrix GeneChip – highly redundant oligo array
Nimblegen – highly redundant long oligo array
2-colour array (very long cDNA; low redundancy)
SAGE (random Sanger sequencing of cDNA library)
Reborn as Next Gen RNA-seq
Slide 6: The Illumina BeadArray Technology
Highly redundant: ~50 copies of a bead
60mer oligos
Absolute expression
Each array is deconvoluted using a colour coding tag system
Human, Mouse, Rat, Custom
Slide 7: Affymetrix Technology
Highly redundant (~25 short oligos per gene)
Absolute expression
PM-MM oligo system valuable for cross-hybridisation detection
Human, Mouse, E. coli, Yeast...
Affy and Illumina arrays have been systematically compared
Slide 8: Spotted Arrays
Low redundancy
cDNA and oligo
Two dyes Cy5/Cy3
Relative expression
Cost and custom
Slide 9: Single Colour Labelling
Slide 10: Microarrays in action
[Figure labels: off, on]
Slide 11: Areas Being Studied with Microarrays
Differential gene expression between two (or more) sample types
Similar gene expression across treatments
Tumour sub-class identification using gene expression profiles
Classification of malignancies into known classes
Identification of “marker” genes that characterize different cell types
Identification of genes associated with clinical outcomes (e.g. survival)
Slide 12: Experimental Design
Slide 13: Microarray Data Analysis Workflow
Slide 14: Recommended Software
Free Software – GenePattern
-- powerful, many plug-in packages and pipelines
-- good video examples/tutorials
GeneSpring GX11
R-Bioconductor (with guidance)
Hierarchical Cluster Explorer – easy clustering
Cytoscape, GSEA – for pathway visualisation
Partek
IPA, Nextbio, GeneGo <= Burnham subscriptions!
Slide 15: Log Transformed Data
2/2 = 1      log2(1) = 0
4/1 = 4      log2(4) = +2
1/4 = 0.25   log2(0.25) = -2
Transformation often performed before normalisation
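The arithmetic on this slide can be reproduced with a minimal Python sketch. The function name `log2_ratio` and the sample/reference framing are illustrative assumptions, not part of the original deck; the point is only that equal up- and down-regulation become symmetric around zero after the log2 transform.

```python
import math

def log2_ratio(sample: float, reference: float) -> float:
    """Return log2(sample / reference); both intensities must be positive."""
    if sample <= 0 or reference <= 0:
        raise ValueError("intensities must be positive before log transformation")
    return math.log2(sample / reference)

# The three ratios from the slide: unchanged, 4-fold up, 4-fold down.
for sample, reference in [(2, 2), (4, 1), (1, 4)]:
    print(f"{sample}/{reference} -> log2 ratio = {log2_ratio(sample, reference):+.1f}")
```

A 4-fold increase and a 4-fold decrease end up the same distance from zero (+2 and -2), which is why the transform is usually applied before normalisation and plotting.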
Slide 16: After QC for low confidence genes (P<0.99)
Note: ~50 replicate beads per array
[Figure: boxplot representation of data spread. x-axis: chip number; y-axis: signal intensity. Annotations: median, outliers, 25% quartile, 75% quartile, "BAD CHIP".]
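The quantities named on the boxplot (median, quartiles, outliers) can be computed per chip with a short Python sketch. The slide does not say how the "BAD CHIP" was identified, so `flag_suspect_chips` and its `max_shift` threshold are purely hypothetical illustrations of one possible rule, and the chip values below are made up.

```python
import statistics

def boxplot_summary(values):
    """Quartiles and Tukey outliers (beyond 1.5 * IQR) for one chip."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

def flag_suspect_chips(chips, max_shift=1.0):
    """Hypothetical rule: flag chips whose median lies more than max_shift
    (in log2 units) from the median of all chip medians."""
    medians = {name: statistics.median(vals) for name, vals in chips.items()}
    centre = statistics.median(medians.values())
    return sorted(name for name, m in medians.items() if abs(m - centre) > max_shift)

# Made-up log2 intensities for three chips; chip3 sits well below the others.
chips = {
    "chip1": [7.0, 7.5, 8.0, 8.5, 9.0],
    "chip2": [7.1, 7.4, 8.1, 8.6, 9.2],
    "chip3": [4.0, 4.5, 5.0, 5.5, 6.0],
}
print(boxplot_summary(chips["chip1"]))
print(flag_suspect_chips(chips))
```

In practice the decision to drop a chip is usually made by eye from the boxplot together with other QC metrics; a fixed threshold like this is only a starting point.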
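Slide 17: The effect of quantile normalisation on the filtered 36 data sets
IMPORTANT: use non-linear normalisation
> library(preprocessCore)  # normalize.quantiles() is provided by preprocessCore in current Bioconductor releases
> Qdata <- normalize.quantiles(as.matrix(Rawdata))  # rows = probes, columns = arrays
[Figure annotation: after normalisation, all arrays have the same range]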
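To show what quantile normalisation does to the data, here is a minimal pure-Python sketch of the idea. It is not the Bioconductor implementation: the function name `quantile_normalise` is an assumption, and ties are broken by sort order rather than averaged.

```python
def quantile_normalise(columns):
    """Quantile-normalise equal-length columns (one list of intensities per chip)."""
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all chips must have the same number of probes")
    # Mean of the k-th smallest value across chips, for every rank k.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns)
                  for k in range(n_rows)]
    normalised = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity on this chip.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, probe in enumerate(order):
            out[probe] = rank_means[rank]
        normalised.append(out)
    return normalised

# Two made-up chips with three probes each.
print(quantile_normalise([[5, 2, 3], [4, 1, 6]]))
```

After the transform every chip contains exactly the same set of values, only assigned to different probes, which is why the boxplots line up to the same range.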
Slide 18: Data Analysis Examples
#1 Illumina arrays with GeneSpring GX11
#2 Affymetrix data, with a GenePattern module
Import, Quality Control, normalize
Detect differentially expressed genes
Pathway analysis
Slide 19: Illumina Analysis Workflow
Check array hybridisation quality
Direct Export file as “sample probe profile”
Import into GENESPRING GX11
Genome Studio Application: process binary .idat files to txt
Normalisation here is optional
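Once the probe profile has been exported to text, it can also be inspected outside GeneSpring. The Python sketch below reads a tab-delimited export and keeps probes that pass a detection filter. The slides do not show the file layout, so the column names (`PROBE_ID`, `<sample>.AVG_Signal`, `<sample>.Detection Pval`), the probe IDs, and the 0.01 cut-off are all assumptions to adjust against your own export.

```python
import csv
import io

def read_probe_profile(handle, sample, max_detection_p=0.01):
    """Return {probe_id: signal} for probes passing the detection filter.

    Column names are assumptions about the export layout; adjust to your file.
    """
    signal_col = f"{sample}.AVG_Signal"
    pval_col = f"{sample}.Detection Pval"
    kept = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        if float(row[pval_col]) <= max_detection_p:
            kept[row["PROBE_ID"]] = float(row[signal_col])
    return kept

# Tiny in-memory stand-in for an exported file (hypothetical probes and values).
example = io.StringIO(
    "PROBE_ID\tS1.AVG_Signal\tS1.Detection Pval\n"
    "ILMN_A\t1520.4\t0.000\n"
    "ILMN_B\t88.1\t0.420\n"
)
print(read_probe_profile(example, "S1"))
```

For a real export, pass an open file handle instead of the `io.StringIO` object.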
Slide 20: GeneSpring GX11 features
Guided workflows
Pathways
GSEA
IPA integration
Ontologies
MySQL
R script API
Slide 21: GeneSpring GX11
Create New Project
Browse to and load Data
Automated install of GenomeDef from Agilent repository
Slide 22: Illumina Advanced Workflow
Slide 23: Grouping Sample Replicates
Slide 24: Check Replicates Are Similar
Slide 25: Scatterplot of replicates
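A replicate scatterplot can be backed by a number. The Python sketch below computes the Pearson correlation between two replicate arrays; using Pearson's r here is an assumption for illustration, since the slides only show the GeneSpring plot, and the intensities are made up.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of intensities."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)

# Made-up log2 intensities for the same probes on two replicate arrays.
rep1 = [7.1, 8.4, 9.9, 11.2, 12.8]
rep2 = [7.3, 8.2, 10.1, 11.0, 12.9]
print(round(pearson_r(rep1, rep2), 3))
```

Good replicates should give a value close to 1, with points hugging the diagonal of the scatterplot.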
Slide 26: Scatterplot of differently treated samples
Slide 27: Filter genes on P-value
Slide 28: Significantly different genes in a Volcano plot
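A volcano plot puts the log2 fold change on the x-axis and -log10 of the P-value on the y-axis. The Python sketch below computes those coordinates and applies a filter. The gene names, the input values, and the cut-offs (P < 0.05 and at least a 2-fold change) are assumptions for illustration; the slides do not state the thresholds used in GeneSpring.

```python
import math

def volcano_points(genes):
    """genes: iterable of (name, control_mean, treated_mean, p_value).

    Returns a list of (name, log2_fold_change, minus_log10_p).
    """
    points = []
    for name, control, treated, p_value in genes:
        if control <= 0 or treated <= 0 or not 0 < p_value <= 1:
            raise ValueError(f"invalid values for {name}")
        points.append((name, math.log2(treated / control), -math.log10(p_value)))
    return points

def select_significant(points, max_p=0.05, min_abs_log2_fc=1.0):
    """Keep genes passing both the P-value and fold-change cut-offs (assumed values)."""
    min_y = -math.log10(max_p)
    return [name for name, log2_fc, y in points
            if y > min_y and abs(log2_fc) >= min_abs_log2_fc]

# Hypothetical genes: (name, control mean, treated mean, P-value).
genes = [
    ("geneA", 100.0, 400.0, 0.001),   # strongly up, significant
    ("geneB", 200.0, 210.0, 0.600),   # unchanged
    ("geneC", 800.0, 100.0, 0.0004),  # strongly down, significant
    ("geneD", 100.0, 300.0, 0.200),   # large change but not significant
]
print(select_significant(volcano_points(genes)))
```

Genes in the upper-left and upper-right corners of the plot are the ones that pass both filters.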
Slide 29: Significant Pathway Determination
Slide 30: Which types of genes are enriched in a cluster?
Idea: Compare your cluster of genes with lists of genes with common properties (function, expression, location).
Find how many genes overlap between your cluster and a gene list.
Calculate the probability of obtaining the overlap by chance. This measures if the enrichment is significant.
This analysis provides an unbiased way of detecting connections between expression and function.
[Figure: overlap diagram. Labels: GeneOntology, Cell cycle, Our, Cell cycle. Values: 25, 0, 7, 15000.]
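The "probability of obtaining the overlap by chance" is commonly computed with a hypergeometric test. The slides do not name the test used, so the Python sketch below is one standard choice rather than the tool's actual method, and the input numbers are hypothetical rather than read off the figure.

```python
from math import comb

def enrichment_p_value(overlap, cluster_size, list_size, genome_size):
    """P(X >= overlap) when cluster_size genes are drawn at random from
    genome_size genes, of which list_size belong to the annotated gene list."""
    max_overlap = min(cluster_size, list_size)
    total = comb(genome_size, cluster_size)
    tail = sum(
        comb(list_size, k) * comb(genome_size - list_size, cluster_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical example: 7 of the 25 genes in our cluster are on a
# 300-gene "cell cycle" list, in a genome of 15000 genes.
p = enrichment_p_value(overlap=7, cluster_size=25, list_size=300, genome_size=15000)
print(f"P(overlap >= 7 by chance) = {p:.3g}")
```

A small P-value means the overlap is larger than expected by chance. When many gene lists are tested, the P-values should also be corrected for multiple testing.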
Slide 31: Send list to IPA for pathway analysis
Slide 32: Significant Pathways sent to Ingenuity Pathway Analysis
Slide 33: Completed Analysis
Gene lists
Data
Pathways
Slide 34: Affymetrix Workflow: GenePattern
Slide 35: Comparative Marker Selection
Slide 36: Paste the URLs for Data files
Slide 37: Send results to next module
Viewer module
Slide 38: Outputs ranked list of genes
List of marker genes can be filtered and exported
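To make the idea of a ranked marker list concrete, the Python sketch below scores each gene with a signal-to-noise statistic, (mean A - mean B) / (sd A + sd B), and sorts by absolute score. This is an illustration of one common ranking statistic, not the GenePattern module's code; the slides do not say which statistic was selected, and the gene names and values are made up.

```python
import statistics

def signal_to_noise(class_a, class_b):
    """(mean_a - mean_b) / (sd_a + sd_b) for one gene across two sample classes."""
    spread = statistics.stdev(class_a) + statistics.stdev(class_b)
    if spread == 0:
        raise ValueError("both classes are constant; score is undefined")
    return (statistics.mean(class_a) - statistics.mean(class_b)) / spread

def rank_markers(expression):
    """expression: {gene: (class_a_values, class_b_values)}.

    Returns (gene, score) pairs, strongest markers first.
    """
    scores = {gene: signal_to_noise(a, b) for gene, (a, b) in expression.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical expression values for three genes in two sample classes.
expression = {
    "g1": ([10, 11, 12], [1, 2, 3]),
    "g2": ([5, 6, 7], [5, 6, 8]),
    "g3": ([1, 2, 3], [10, 12, 14]),
}
for gene, score in rank_markers(expression):
    print(f"{gene}\t{score:+.2f}")
```

Positive scores mark genes higher in class A, negative scores genes higher in class B; the ranked list can then be filtered and exported as the slide describes.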
Slide 39: Nextbio
Compares your gene lists to the Nextbio database
Can reveal unexpected similarities between datasets
Has a very good literature database connected to the results
Contains data from model organisms
Slide 40: Ingenuity Pathway Analysis
Detects networks in your data
Allows you to look for connections between genes and drugs/small molecules
Focused on human and mouse
GeneGo
High-quality hand-annotated ontologies
Has a very good literature database connected to the results
Contains data from model organisms
Slide 41: Start a new core analysis
Slide 42: Ingenuity Data import
Slide 43: IPA determines functions
Slide 44: Overlay drug and disease data
Slide 45: Data Import to Nextbio
Slide 46: The Nextbio Report Page
Slide 47: What else does my gene do?
Slide 48: THE END
Many thanks for coming!