Joint analysis of regulatory networks and expression profiles


Ron Shamir, School of Computer Science, Tel Aviv University, April 2013. Sources: Igor Ulitsky and Ron Shamir, Identification of Functional Modules using Network Topology and High-Throughput Data, BMC Systems Biology 1:8 (2007).

Presentation Transcript

Slide1

Joint analysis of regulatory networks and expression profiles

Ron Shamir
School of Computer Science, Tel Aviv University
April 2013

Sources:

Igor Ulitsky and Ron Shamir. Identification of Functional Modules using Network Topology and High-Throughput Data. BMC Systems Biology 1:8 (2007).

Igor Ulitsky and Ron Shamir. Identifying functional modules using expression profiles and confidence-scored protein interactions. Bioinformatics 25(9):1158-1164 (2009).

Slide2

Outline

Background
Joint network and expression profiles
  Matisse
  Cezanne

Slide3

Background


Slide4

DNA → (transcription) → RNA → (translation) → protein

DNA: the hard disk
RNA: one program
Protein: its output

Slide5

DNA Microarrays / RNA-seq

Simultaneous measurement of expression levels of all genes / transcripts.
Perform 10^5-10^9 measurements in one experiment.
Allow a global view of cellular processes.
The most important biotechnological breakthroughs of the last/current decade.

http://www.biomedcentral.com/1471-2105/12/323/figure/F2

Slide6

The Raw Data

Matrix: genes × experiments

Entries of the Raw Data matrix: expression levels (ratios / absolute values / …).
Expression pattern for each gene.
Profile for each experiment/condition/sample/chip.
Needs normalization!
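A minimal sketch of how such a matrix might be normalized and turned into gene-gene similarities. The slide only says that normalization is needed; standardizing each gene's profile and using Pearson correlation as the similarity is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def standardize(profile):
    """Center a gene's profile to mean 0 and scale it to unit variance."""
    n = len(profile)
    mean = sum(profile) / n
    var = sum((x - mean) ** 2 for x in profile) / n
    if var == 0:
        raise ValueError("constant profile cannot be standardized")
    std = math.sqrt(var)
    return [(x - mean) / std for x in profile]

def pearson_similarity(expression):
    """expression: dict gene -> list of expression levels, one per experiment.

    Returns a dict mapping frozenset({gene_a, gene_b}) -> Pearson correlation.
    """
    z = {gene: standardize(profile) for gene, profile in expression.items()}
    genes = sorted(z)
    similarity = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            n = len(z[g])
            # Mean product of standardized values equals the Pearson correlation.
            similarity[frozenset((g, h))] = sum(a * b for a, b in zip(z[g], z[h])) / n
    return similarity
```

For example, `pearson_similarity({"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8]})` gives a similarity close to 1 for the pair, since the two profiles differ only by scale.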

Slide7

EXPression ANalyzer and DisplayER (EXPANDER)

http://acgt.cs.tau.ac.il/expander

Clustering: identify clusters of co-expressed genes (CLICK, KMeans, SOM, hierarchical)
Functional enrichment: GO, TANGO
Promoter analysis: analyze TF binding sites of co-regulated genes (PRIMA)
Biclustering: identify homogeneous submatrices (SAMBA)
microRNA function inference: FAME
Visualization

A. Maron, R. Sharan. Bioinformatics 03
A. Maron-Katz, A. Tanay, C. Linhart, I. Steinfeld, R. Sharan, Y. Shiloh, R. Elkon. BMC Bioinformatics 05
Ulitsky et al. Nature Protocols 10

Slide8

Networks of protein-protein interactions (PPIs)

Large, readily available resource
Representation: network with nodes = proteins/genes, edges = interactions

Analysis methods:
Global properties
Motif content analysis
Complex extraction
Cross-species comparison

Slide9

The hairball syndrome


Slide10

Potential inroad into pathways and function
Can the network help to improve the analysis?

Slide11

Analysis of gene expression profiles + a network

Slide12

Goal

Challenge: Detect active functional modules: connected subnetworks of proteins whose genes are co-expressed
“Where is the action in the network in a particular experiment?”

Slide13

Ron Shamir, RNA Antalia, April 08


Slide14


Slide15

Ulitsky & Shamir, BMC Systems Biology 07

Slide16

MATISSE: Modular Analysis for Topology of Interactions and Similarity SEts

http://acgt.cs.tau.ac.il/matisse

Input: expression data and a PPI network
Output: a collection of modules
  Connected PPI subnetworks
  Correlated expression profiles

Figure legend: interaction; high expression similarity

Slide17

Probabilistic model

Event $M_{ij}$: genes $i, j$ are mates (= highly co-expressed)

$P(S_{ij} \mid M_{ij}) \sim N(\mu_m, \sigma_m^2)$
$P(S_{ij} \mid \neg M_{ij}) \sim N(\mu_n, \sigma_n^2)$

$H_0$: $U$ is a set of unrelated genes
$H_1$: $U$ is a module (= connected subnetwork with high internal similarity)

$R_i$: gene $i$ is transcriptionally regulated
$\beta_m$: fraction of mates out of module gene pairs that are transcriptionally regulated, $\beta_m = P(M_{ij} \mid R_i \wedge R_j, H_1)$
$p_m$: fraction of mates out of all gene pairs that are transcriptionally regulated

Slide18

Probabilistic model (2)

Is a connected gene set $U$ a module?
Assuming pair independence:
Define $m_{ij} = \beta_m P(R_i) P(R_j)$
Define $n_{ij} = p_m P(R_i) P(R_j)$
Likelihood ratio: $\Pr(\text{Data} \mid H_1) / \Pr(\text{Data} \mid H_0)$
Taking log: a sum of terms, one per gene pair $(i, j)$

Slide19

Probabilistic model - summary

Similarities: mixture of two Gaussians
For a candidate group $U$, score the likelihood ratio of originating from a module versus from the background
Module score $W_U$ = gene group (log) likelihood ratio = sum over all the gene pairs in $U$
Find connected subgraphs $U$ with high $W_U$
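The per-pair term is not preserved in this transcript, so the following is a reconstruction from the stated definitions rather than the authors' exact formula. If each similarity is drawn from the mates Gaussian with probability $m_{ij}$ under the module hypothesis and $n_{ij}$ under the background, the log-likelihood ratio for one pair is $\log \frac{m_{ij} f_M(S_{ij}) + (1 - m_{ij}) f_N(S_{ij})}{n_{ij} f_M(S_{ij}) + (1 - n_{ij}) f_N(S_{ij})}$, where $f_M$ and $f_N$ denote the two Gaussian densities. A minimal sketch, with hypothetical parameter values and function names:

```python
import math
from itertools import combinations

def normal_pdf(x, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def pair_weight(s, m_ij, n_ij, mates=(0.7, 0.2), non_mates=(0.0, 0.3)):
    """Log-likelihood ratio of one similarity value s.

    mates / non_mates are (mean, std) of the two mixture components;
    the default values are illustrative assumptions, not fitted parameters.
    """
    f_m = normal_pdf(s, *mates)
    f_n = normal_pdf(s, *non_mates)
    module = m_ij * f_m + (1.0 - m_ij) * f_n       # mixture under the module hypothesis
    background = n_ij * f_m + (1.0 - n_ij) * f_n   # mixture under the background
    return math.log(module / background)

def module_score(genes, similarity, m_ij=0.9, n_ij=0.05):
    """Sum of pair weights over all gene pairs in the candidate group."""
    return sum(
        pair_weight(similarity[frozenset(pair)], m_ij, n_ij)
        for pair in combinations(sorted(genes), 2)
    )
```

With these assumed parameters, a highly similar pair contributes a positive weight and a dissimilar pair a negative one, which is why the score rewards groups with high internal similarity.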

Slide20

Complexity

Finding the heaviest connected subgraph: NP-hard even without connectivity constraints (+/- edge weights)
Devised a heuristic for the problem

Slide21

MATISSE workflow

Seed generation

Greedy optimization

Significance filtering

Slide22

Finding seeds

Three seeding alternatives tested
All alternatives build a seed and delete it from the network

Building small seeds around single nodes:
  Best neighbors
  All neighbors
Approximating the heaviest subgraph:
  Delete low-degree nodes and record the heaviest subnetwork found
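The third alternative can be sketched as a peeling procedure: repeatedly delete the node with the lowest weighted degree and remember the heaviest node set seen along the way. This is a simplified illustration under stated assumptions, not the MATISSE implementation: it ignores the connectivity requirement, treats "degree" as the sum of pair weights to the remaining nodes, and the function name is hypothetical.

```python
def peel_heaviest_subgraph(weights):
    """weights: dict mapping frozenset({u, v}) -> pair weight (may be negative).

    Returns (node_set, total_weight) of the heaviest node set encountered
    while peeling nodes in order of lowest weighted degree.
    """
    nodes = set()
    for pair in weights:
        nodes |= pair
    # Weighted degree: sum of weights from a node to all remaining nodes.
    degree = {u: 0.0 for u in nodes}
    for pair, w in weights.items():
        for u in pair:
            degree[u] += w
    total = sum(weights.values())
    best_nodes, best_total = set(nodes), total
    remaining = set(nodes)
    while len(remaining) > 1:
        victim = min(sorted(remaining), key=lambda u: degree[u])
        remaining.remove(victim)
        total -= degree[victim]  # drop every pair that involved the victim
        for pair, w in weights.items():
            if victim in pair:
                (other,) = pair - {victim}
                if other in remaining:
                    degree[other] -= w
        if total > best_total:
            best_nodes, best_total = set(remaining), total
    return best_nodes, best_total
```

On a toy input where three genes have mutually positive weights and a fourth has negative weights to them, the fourth gene is peeled first and the remaining triple is recorded as the heaviest set.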

Slide23

Greedy optimization

Simultaneous optimization of all the seeds

The following steps are considered:
  Node addition
  Node removal
  Assignment change
  Module merge
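A minimal sketch of the greedy idea, restricted to a single module and to the first two moves (node addition and node removal). It is an illustration under simplifying assumptions, not the MATISSE optimizer: assignment changes and module merges are omitted, only one seed is improved, and all names are hypothetical. Each accepted move must keep the module connected in the interaction network and strictly increase the score.

```python
def is_connected(nodes, adjacency):
    """True if the node set induces a connected subgraph of the network."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adjacency.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def score(nodes, weights):
    """Sum of pair weights over all pairs inside the node set."""
    nodes = sorted(nodes)
    return sum(
        weights.get(frozenset((u, v)), 0.0)
        for i, u in enumerate(nodes)
        for v in nodes[i + 1:]
    )

def greedy_improve(seed, adjacency, weights):
    """Apply the best score-improving addition or removal until none exists."""
    module = set(seed)
    current = score(module, weights)
    while True:
        candidates = []
        neighbors = {v for u in module for v in adjacency.get(u, ())} - module
        for v in neighbors:                      # node addition keeps connectivity
            candidates.append(module | {v})
        for u in module:                         # node removal must be checked
            smaller = module - {u}
            if smaller and is_connected(smaller, adjacency):
                candidates.append(smaller)
        best = max(candidates, key=lambda c: score(c, weights), default=None)
        if best is None or score(best, weights) <= current:
            return module, current
        module, current = best, score(best, weights)
```

Because every accepted move strictly increases the score and there are finitely many node sets, the loop terminates at a local optimum.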

Slide24

Front vs. back nodes

Only a fraction of the genes (front nodes) have meaningful similarity values
MATISSE can link them using other genes (back nodes)

Back nodes correspond to:
  Unmeasured transcripts
  Post-translational regulation
  Partially regulated pathways

Slide25

Advantages of MATISSE

No p-values needed for measurements
Works when only a fraction of the genes' expression patterns are informative
Can handle any similarity data
No prespecified number of modules

Slide26

Test case: Yeast osmotic shock

Network: 65,990 PPIs & protein-DNA interactions among 6,246 genes
Expression: 133 experimental conditions – response of perturbed strains to osmotic shock (O’Rourke & Herskowitz 04)
Front nodes: 2,000 genes with the highest variance
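Choosing front nodes by variance can be sketched as follows. The function name and the use of population variance are assumptions for illustration; the slides do not say how ties or missing values were handled.

```python
from statistics import pvariance

def select_front_nodes(expression, k):
    """Return the k genes whose expression profiles have the highest variance.

    expression: dict gene -> list of expression levels, one per condition.
    Ties are broken by gene name so the result is deterministic.
    """
    ranked = sorted(expression, key=lambda g: (-pvariance(expression[g]), g))
    return set(ranked[:k])
```

In the yeast test case this would be called with `k=2000`; all other genes in the network would then act as back nodes.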

Slide27

Pheromone response subnetwork

Figure legend: back nodes; front nodes

Slide28

Performance comparison

Figure panels: % of modules with category enrichment at p < 10^-3; % of annotations enriched at p < 10^-3 in modules

Slide29

GO and promoter analysis


Slide30

Application to stem cells

~150 human stem cell lines of diverse types profiled using microarrays
Clustered profiles into groups
Adjusted Matisse to seek subnetworks that are characteristic of each group
Focused analysis on pluripotent stem cells

F. Müller, L. Laurent, D. Kostka, I. Ulitsky, R. Williams, C. Lu, I. Park, M. Rao, P. Schwartz, N. Schmidt, J. Loring. Nature 08

Slide31

Pluripotent stem cells network

Highlights the key protein machinery underlying pluripotency


Slide32

Ulitsky & Shamir Bioinformatics 2009


Slide33

Accounting for PPI confidence

PPI-based analysis is made difficult by abundant false positive / negative interactions
Various methods can assign confidence (probability) to individual edges
Idea: seek modules that are connected with high probability

Ulitsky & Shamir, Bioinformatics 2009

Slide34

What is a confidently connected module?

With high probability, any two parts of the module are connected by an edge
Accommodates both sparse and dense pathways
Accommodates genes with low-confidence connectivity with many module genes
Confidently-connected modules can be found efficiently

Slide35

Connected with high probability?

Every two genes are connected by a confident path → bias to dense pathways
There is a minimum spanning tree with high-confidence edges → same as ignoring low-confidence edges
Any two parts of the module are connected by an edge with high probability

Slide36

CEZANNE: Co-Expression Zone ANalysis using NEtworks

Edge probability $p(e)$ → edge weight $-\log(1 - p(e))$
For any $W \subset U$, at least one edge connects $W$ with $U \setminus W$ with probability at least $q$ (e.g. 0.95) ⇔ the weight of the minimum cut of $U$ is at least $-\log(1 - q)$
Algorithm: among the subnetworks whose minimum cut exceeds $-\log(1 - q)$, find the one with the maximum co-expression score

Example (nodes A, B, C, D):
P({A},{B,C,D}) = 1 - 0.3*0.3 = 0.91
P({A,C,D},{B}) = 0.94
P({A,B},{C,D}) = 0.94
P({A,B,D},{C}) = 0.994

Figure labels: minimum cut; edge probabilities 0.7, 0.9, 0.7, 0.8
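The confident-connectivity test can be sketched on a four-node example. The edge assignment below (A-B 0.7, A-C 0.7, B-C 0.8, C-D 0.9) is an assumption: it is one assignment that reproduces the four cut probabilities listed on the slide, since the figure itself is not preserved. The brute-force search over bipartitions is only suitable for tiny graphs; a real implementation would presumably use a proper minimum-cut algorithm. Function names are hypothetical, and edge probabilities must be below 1 for the log transform.

```python
import math
from itertools import combinations

# Assumed example network: edge -> probability that the interaction is real.
EDGES = {("A", "B"): 0.7, ("A", "C"): 0.7, ("B", "C"): 0.8, ("C", "D"): 0.9}

def cut_probability(side, edges):
    """Probability that at least one edge crosses between `side` and the rest."""
    p_none = 1.0
    for (u, v), p in edges.items():
        if (u in side) != (v in side):
            p_none *= 1.0 - p
    return 1.0 - p_none

def cut_weight(side, edges):
    """Sum of -log(1 - p(e)) over the edges crossing the cut."""
    return sum(
        -math.log(1.0 - p)
        for (u, v), p in edges.items()
        if (u in side) != (v in side)
    )

def weakest_cut(nodes, edges):
    """Brute-force minimum-weight cut; returns (side, weight)."""
    nodes = sorted(nodes)
    anchor, rest = nodes[0], nodes[1:]
    best = None
    # Fix the anchor on one side so each bipartition is visited once.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            side = {anchor, *extra}
            w = cut_weight(side, edges)
            if best is None or w < best[1]:
                best = (side, w)
    return best

def confidently_connected(nodes, edges, q=0.95):
    """True if every cut is crossed by an edge with probability at least q."""
    return weakest_cut(nodes, edges)[1] >= -math.log(1.0 - q)
```

Under this assumed assignment the weakest cut separates D from the rest, with crossing probability 0.9, so the four-node set would not qualify as confidently connected at q = 0.95.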

Slide37

How to find confidently connected modules?

Seed identification: run MATISSE ignoring edge weights, then “slice” the modules using minimum cut, until all subnetworks are “legal”
Greedy optimization (how to find legal moves?):
  Adding nodes is easy to test (positive edge weights)
  Merging modules is easy to test
  (Re)moving modules: requires maintaining the set of ‘crucial’ nodes in each module
Solvable in minutes on real world examples

Slide38

DNA damage response in S. cerevisiae

47 DNA Damage Response expression profiles (Gasch et al., 01)
Front nodes: 2,074 genes with at least two-fold expression change
Network and confidence values: purification enrichment (PE) scores (Collins et al. 07)

Slide39

DNA damage response modules

| Module size | GO biological process | p-value | GO-slim protein complexes | p-value |
|---|---|---|---|---|
| 346 | ribosome biogenesis and assembly | 1.2·10^-117 | ribosome | 5.9·10^-91 |
| | translation | 1.0·10^-85 | eukaryotic 43S preinitiation complex | 3.8·10^-49 |
| | rRNA processing | 7.5·10^-79 | small nucleolar ribonucleoprotein complex | 1.5·10^-41 |
| | 35S primary transcript processing | 4.6·10^-44 | DNA-directed RNA polymerase III complex | 3.1·10^-17 |
| | ribosome assembly | 4.3·10^-39 | exosome (RNase complex) | 4.4·10^-15 |
| | ribosomal large subunit biogenesis | 9.2·10^-14 | DNA-directed RNA polymerase I complex | 5.7·10^-14 |
| | rRNA modification | 4.4·10^-12 | Noc complex | 3.2·10^-6 |
| 38 | protein catabolism | 1.8·10^-46 | proteasome complex (sensu Eukaryota) | 5.7·10^-71 |
| | proteolysis | 9.0·10^-44 | proteasome core complex (sensu Eukaryota) | 9.4·10^-32 |
| | ubiquitin cycle | 1.1·10^-42 | | |
| 12 | histone acetylation | 3.6·10^-13 | histone acetyltransferase complex | 2.1·10^-12 |
| | chromatin modification | 5.9·10^-11 | | |
| | transcription from RNA polymerase II promoter | 1.4·10^-6 | | |
| 12 | translation | 1.1·10^-14 | ribosome | 1.4·10^-15 |
| 12 | nuclear mRNA splicing, via spliceosome | 3.5·10^-21 | spliceosome complex | 3.5·10^-17 |
| | | | small nuclear ribonucleoprotein complex | 2.5·10^-15 |
| 10 | barbed-end actin filament capping | 4.8·10^-6 | F-actin capping protein complex | 4.8·10^-6 |
| | endocytosis | 1.1·10^-5 | | |
| | cytoskeleton organization and biogenesis | 2.8·10^-5 | | |
| 8 | establishment and/or maintenance of chromatin architecture | 1.1·10^-5 | chromatin remodeling complex | 4.6·10^-6 |
| 7 | glycogen metabolism | 3.0·10^-8 | protein phosphatase type 1 complex | 3.3·10^-5 |
| | sporulation (sensu Fungi) | 2.0·10^-6 | | |
| 6 | translation | 1.1·10^-7 | ribosome | 4.0·10^-8 |
| 6 | tRNA processing | 2.5·10^-14 | ribonuclease P complex | 9.2·10^-8 |
| | rRNA processing | 2.2·10^-9 | | |
| 4 | trehalose biosynthesis | 6.8·10^-14 | alpha,alpha-trehalose-phosphate synthase complex (UDP-forming) | 6.8·10^-14 |
| 4 | ubiquitin-dependent protein catabolism | 5.2·10^-7 | | |
| 3 | pseudohyphal growth | 9.8·10^-7 | cAMP-dependent protein kinase complex | 9.6·10^-7 |
| 3 | proteasome assembly | 3.2·10^-6 | | |
| | protein folding | 3.9·10^-6 | | |

Module labels: Cytoplasmic ribosome biogenesis; Proteasome; Mitochondrial ribosome – small subunit; Mitochondrial ribosome – large subunit; Spliceosome; Novel actin-localized pathway?; Hsp90; PKA; Trehalose biosynthesis; Ribonuclease P

Callouts:
Suggests SWS2 a novel member
Novel pathway enriched with actin-localized proteins; supported in other datasets; similar deletion phenotypes

Slide40

Comparison with prior work

Combined measure of sensitivity (% of annotations enriched) and specificity (% of modules enriched) with p<0.001

Methods compared:
Clustering of only expression data
Clustering expression & network (Hanisch et al., 2002)
Expression similarity + network connectivity
Expression similarity + confident network connectivity
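The two percentages named above can be computed from per-module enrichment p-values as in the sketch below. The input layout and function name are assumptions for illustration, and the slide does not say how the two numbers were combined into a single measure, so they are returned separately.

```python
def enrichment_summary(module_pvalues, all_annotations, threshold=1e-3):
    """Return (sensitivity, specificity) as percentages.

    module_pvalues: dict module -> dict annotation -> enrichment p-value.
    all_annotations: every annotation that was tested.
    sensitivity: % of annotations enriched in at least one module.
    specificity: % of modules enriched for at least one annotation.
    """
    enriched_modules = 0
    enriched_annotations = set()
    for pvalues in module_pvalues.values():
        hits = {a for a, p in pvalues.items() if p < threshold}
        if hits:
            enriched_modules += 1
        enriched_annotations |= hits
    annotations = set(all_annotations)
    sensitivity = 100.0 * len(enriched_annotations & annotations) / len(annotations)
    specificity = 100.0 * enriched_modules / len(module_pvalues)
    return sensitivity, specificity
```

For instance, with two modules of which one is enriched for one of four tested annotations, the function returns a sensitivity of 25% and a specificity of 50%.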

Slide41


Slide42

Summary

Algorithms using co-expression + networks to detect functionally coherent modules
Accommodate both sparse and dense subnetworks
Subnetworks linked to osmotic shock and DNA damage
A general framework for confident connectivity in PPI networks

The next steps:
Co-expression is not the only interesting way to utilize GE data
Scaling to complex human datasets