Slide1
Integrative analysis: ChIP-seq and RNA-seq
Vladimir Teif
Intro to NGS analysis
Proficio course 2020
Slide2
NGS data integration
http://determinedtosee.com/wp-content/uploads/2014/08/jigsaw-puzzle.jpg
Slide3
1. Signal + existing annotation
deepTools 2.0
https://github.com/fidelram/deepTools/wiki/Visualizations
Slide4
Comparing cluster heatmaps between two cell conditions
NucTools
Slide5
Histone modifications around TSS
http://www.ie-freiburg.mpg.de/bioinformaticsfac
Slide6
Different datasets in several tracks of a genome browser
Gifford et al., Cell 2013
5mC
Slide7
Heat maps again: signal from data 1 around regions in data 2
Here: nucleosome occupancy around bound CTCF in mouse stem cells
Vainshtein et al., BMC Genomics 2017
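A minimal sketch of the idea behind such heat maps and average profiles: take the signal from dataset 1 in a fixed window around each region centre from dataset 2, then average position by position. The function name `aggregate_profile`, the per-base `coverage` list and the toy numbers are assumptions for illustration; this is not the implementation used by NucTools or deepTools.

```python
def aggregate_profile(coverage, centers, flank):
    """Average per-base signal in a window of +/- flank around each center."""
    width = 2 * flank + 1
    sums = [0.0] * width
    used = 0
    for c in centers:
        start, end = c - flank, c + flank + 1
        if start < 0 or end > len(coverage):
            continue  # skip windows that run past the ends of the signal
        for i in range(width):
            sums[i] += coverage[start + i]
        used += 1
    if used == 0:
        raise ValueError("no window fits inside the coverage array")
    return [s / used for s in sums]


# Toy data (hypothetical): signal dips at positions 3 and 9,
# mimicking depleted occupancy at two bound sites.
coverage = [5, 5, 1, 0, 1, 5, 5, 5, 1, 0, 1, 5, 5]
centers = [3, 9]
profile = aggregate_profile(coverage, centers, flank=2)
print(profile)
```

Stacking the individual windows as rows instead of averaging them gives the heat map; the average of the rows is the profile plotted above it.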
Slide8
Correlation analysis: any 2 datasets can be correlated
http://homer.salk.edu/homer/ngs/quantification.html
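As a rough illustration of correlating two datasets, the sketch below computes the Pearson correlation between two signal tracks that have already been summarised over the same genomic bins. The track values and the helper name `pearson` are hypothetical, and this is not the procedure HOMER uses.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant track")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical binned signals: one value per genomic bin, same bins in each track.
track_a = [1.0, 2.0, 3.0, 4.0, 5.0]
track_b = [2.0, 4.0, 6.0, 8.0, 10.0]   # rises with track_a
track_c = [5.0, 4.0, 3.0, 2.0, 1.0]    # falls as track_a rises

r_ab = pearson(track_a, track_b)
r_ac = pearson(track_a, track_c)
print(r_ab, r_ac)
```

Both tracks must be binned identically (same bin size, same coordinates) before the comparison means anything.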
Slide9
RNA-seq: how many reads per gene
DESeq, edgeR, Cuffdiff
ChIP-seq: where binding is enriched
MACS, SICER, HOMER, PeakSeq, edgeR, DESeq, CisGenome
Slide11
RNA-seq & ChIP-seq together: which protein regulates which gene
ChIP-seq peak size
Gene expression log fold change
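One simple way to put the two data types side by side is to assign each ChIP-seq peak to the nearest transcription start site and then pair the peak size with that gene's log fold change. The sketch below assumes a single chromosome, a distance cutoff of 1000 bp, and made-up gene names, positions and values; all of these are illustrative assumptions, not taken from the slides.

```python
def nearest_gene(peak_pos, tss_by_gene, max_dist):
    """Return the gene whose TSS is closest to peak_pos, or None if none is within max_dist."""
    best_gene, best_dist = None, None
    for gene, tss in tss_by_gene.items():
        dist = abs(peak_pos - tss)
        if dist <= max_dist and (best_dist is None or dist < best_dist):
            best_gene, best_dist = gene, dist
    return best_gene


# Hypothetical inputs, all on one chromosome.
tss_by_gene = {"geneA": 1000, "geneB": 5000, "geneC": 20000}
log2fc = {"geneA": 2.1, "geneB": -0.3, "geneC": 0.0}
peaks = [(1100, 50.0), (4800, 12.0), (12000, 30.0)]  # (position, peak size)

pairs = []
for pos, size in peaks:
    gene = nearest_gene(pos, tss_by_gene, max_dist=1000)
    if gene is not None:  # peaks far from any TSS are left unassigned
        pairs.append((gene, size, log2fc[gene]))

print(pairs)
```

The resulting (peak size, log fold change) pairs are what a scatter plot of binding versus expression would show.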
Slide12
Correlation of regulatory protein binding with gene expression
Pavlaki et al., 2017
Slide13
Functional analysis
https://blog.arduino.cc/2018/08/17/build-a-4-button-arcade-game-out-of-lego/
Slide14
Gene Ontology (GO)
Ontology: a set of concepts and categories in a subject area or domain that shows their properties and the relations between them.
Gene Ontology (GO) is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species.
http://www.geneontology.org
Gene ontology types
Cellular component: the parts of a cell or its extracellular environment;
Molecular function: the elemental activities of a gene product at the molecular level, such as binding or catalysis;
Biological process: operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms.
Slide16
Cellular component
Slide17
Molecular function
drug transporter activity
Slide18
Biological process
Slide19
GO is manually curated
Based on experimental evidence
Based on computational predictions
Based on the claims reported in publications
For example, the Experimental Evidence codes are:
Inferred from Experiment (EXP)
Inferred from Direct Assay (IDA)
Inferred from Physical Interaction (IPI)
Inferred from Mutant Phenotype (IMP)
Inferred from Genetic Interaction (IGI)
Inferred from Expression Pattern (IEP)
Slide20
GO is manually curated
Slide21
Characteristics of GO terms
>40,000 terms and growing
GO is species independent, but some terms may be specific to a certain group (e.g. photosynthesis)
GO is hierarchical (terms can have parents/children)
GO terms are linked by relationships:
is-a
part-of
regulates (and positively/negatively regulates)
occurs-in
enables
involved-in
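The hierarchy and typed relationships above can be pictured as a directed graph in which each term points to its parents. The sketch below walks such a graph to collect all ancestors of a term, following only the is-a and part-of links. The tiny graph is a toy example written by hand for illustration; it is not read from a GO release, and the function name `ancestors` is an assumption.

```python
# Toy graph: term -> list of (relationship, parent term).
parents = {
    "mitotic cell cycle": [("is-a", "cell cycle")],
    "cell cycle": [("is-a", "cellular process")],
    "cellular process": [("is-a", "biological_process")],
    "regulation of cell cycle": [("regulates", "cell cycle")],
    "biological_process": [],
}


def ancestors(term, parents, follow=("is-a", "part-of")):
    """Collect every term reachable from `term` through the relationship types in `follow`."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for relation, parent in parents.get(current, []):
            if relation in follow and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


mitotic_ancestors = ancestors("mitotic cell cycle", parents)
regulation_ancestors = ancestors("regulation of cell cycle", parents)
print(sorted(mitotic_ancestors))
print(sorted(regulation_ancestors))
```

In this sketch a gene annotated to "mitotic cell cycle" would also count towards "cell cycle" and its ancestors, whereas the "regulates" link is deliberately not followed.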
Slide22
GO hierarchy
24th Feb 2006 Jane Lomax EBI
Slide23
GO hierarchy
Slide24
GO hierarchy
http://geneontology.org/page/ontology-structure
Slide25
Anatomy of a GO term
Adapted from Melanie Courtot, 2012
Slide26
How GO analysis tools work
Modified from Jane Lomax, 2006
Input: a gene list and a subset of 'interesting' genes
The tool shows which GO categories have the most interesting genes associated with them, i.e. which categories are 'enriched' for interesting genes
The tool provides a statistical measure to determine whether enrichment is significant
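The first two steps above amount to counting, for each GO category, how many genes fall into it in the full list and in the interesting subset. A minimal sketch, with made-up gene identifiers and annotations (the names `annotations`, `interesting` and `count_terms` are assumptions, not part of any specific tool):

```python
from collections import Counter

# Hypothetical gene -> GO categories mapping for the full gene list.
annotations = {
    "g1": {"mitosis"},
    "g2": {"mitosis", "apoptosis"},
    "g3": {"apoptosis"},
    "g4": {"glucose transport"},
    "g5": {"mitosis"},
    "g6": set(),
}
interesting = {"g1", "g2", "g5"}  # e.g. differentially expressed genes


def count_terms(genes, annotations):
    """Count how many of the given genes are annotated to each category."""
    counts = Counter()
    for gene in genes:
        counts.update(annotations.get(gene, ()))
    return counts


background_counts = count_terms(annotations, annotations)  # iterates over all gene IDs
hit_counts = count_terms(interesting, annotations)

for term, total in background_counts.items():
    print(term, hit_counts[term], "of", total)
```

A category looks enriched when its share among the interesting genes is clearly larger than its share in the full list; whether that difference is significant needs a statistical test.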
Slide27
Whether enrichment is significant…
~60,000 genes in total in the mouse genome
RNA-seq: 2,000 genes differentially regulated
mitosis: 80
apoptosis: 40
positive control of cell proliferation: 30
glucose transport: 20
Slide28
Whether enrichment is significant…
Proliferation: 3-fold enriched
Glucose transport: 4-fold enriched
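A sketch of how fold enrichment and a significance value could be computed for these numbers, using a one-sided hypergeometric test as one common choice (the slides do not state which test was used). The genome-wide category sizes, 300 for proliferation and 150 for glucose transport, are NOT given on the slides; they are assumed here only because they reproduce the quoted 3-fold and 4-fold values with 60,000 genes in total and 2,000 differentially regulated.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """Observed fraction of category genes in the sample divided by the genome-wide fraction."""
    return (k / n) / (K / N)


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N, of which K are in the category."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


N = 60_000  # genes in the genome (from the slide)
n = 2_000   # differentially regulated genes (from the slide)

# k: hits from the slide; K: ASSUMED genome-wide category sizes (hypothetical).
fold_prolif = fold_enrichment(k=30, n=n, K=300, N=N)
fold_glucose = fold_enrichment(k=20, n=n, K=150, N=N)
p_prolif = hypergeom_pvalue(k=30, n=n, K=300, N=N)
p_glucose = hypergeom_pvalue(k=20, n=n, K=150, N=N)

print(fold_prolif, p_prolif)
print(fold_glucose, p_glucose)
```

With these assumed sizes, 10 proliferation genes and 5 glucose transport genes would be expected by chance among the 2,000, against 30 and 20 observed.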
Slide29
Whether enrichment is significant…
Slide30
Other enrichment tests
Slide31
GO tools online: GSEA
http://software.broadinstitute.org/gsea
GO tools online: DAVID
https://david.ncifcrf.gov
GO tools online: GOrilla
http://cbl-gorilla.cs.technion.ac.il
Slide34
GO tools online: EnrichR
http://amp.pharm.mssm.edu/Enrichr/
Slide35
GO analysis, first example
https://www.dovepress.com/role-of-nsc319726-in-ovarian-cancer-based-on-the-bioinformatics-analys-peer-reviewed-fulltext-article-OTT
Unrealistically small P-values
Slide36
GO analysis, typical example
DAVID, GOrilla, GREAT, EnrichR
Calo et al. (2015) Nature 518, 249–253
Slide37
GO analysis, cool and easy to do
Massie et al., EMBO J. (2011) 30, 2719–2733
Slide38
GO analysis: cool, but difficult to do
Zao et al., Cell Death & Disease (2016), 7:e2053
Slide39
GO analysis "manual"
Red - up
Blue - down
Yellow - unchanged
Massie et al., EMBO J. (2011) 30, 2719–2733
Slide40
Problems of GO analysis
Since we do many tests (one for each term i), we encounter the multiple testing problem: the significance of the results is not as high as the individual P-values P_i suggest.
Both the ontology and the annotations are updated regularly. Results can change overnight. Make sure you know which GO versions you have used in each analysis.
When using a tool to study GO enrichment, make sure you know what the software is doing.
Gene lists produced in publications to tackle the same problem using different software solutions can have almost no overlap.
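For the multiple testing problem listed first, one widely used remedy is the Benjamini-Hochberg adjustment, which controls the false discovery rate across all tested terms. The slides do not say which correction to use, so the sketch below is only one option, and the P-values in it are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest P first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values, one per GO term tested.
pvals = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(pvals)
print(adj)
```

Here only the first term stays below 0.05 after adjustment, although three of the four raw P-values were below that threshold.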
Slide41
Combinatorial TF binding
Adapted from http://cs273a.stanford.edu [Bejerano Fall09/10]
Figure labels: gene, proteins, DNA, TF binding site, sequence logo, regulatory region
The expression of a gene is determined by a combination of TFs simultaneously bound to its regulatory DNA region
Slide42
Genes act together in gene networks
Blais and Dynlacht (2005) Genes Dev., 19, 1499-1511
Slide43
Network visualisation
pathwaycommons.org