Evolutionary model for the statistical divergence of paralogous and orthologous gene pairs

Uploaded On 2018-02-26

Yue Zhang, Chunfang Zheng, David Sankoff. Presented by Suzy Sun.


Presentation Transcript

Slide1

Evolutionary model for the statistical divergence of paralogous and orthologous gene pairs generated by whole genome duplication and speciation

Yue Zhang, Chunfang Zheng, David Sankoff

Presented by Suzy Sun

Slide2

Comparative Genomics

Comparative genomics seeks to infer the nature and timing of evolutionary events by examining the distribution of similarities between orthologous and paralogous gene pairs.

Peaks in the distribution are identified as duplications generated by speciation or whole genome duplication (WGD) events.

However, there is no rigorous methodology to calculate the volume of the individual normal distributions.

Introduction

Slide3

Purpose

Analyze duplicate gene similarity distributions based on sequence divergence and fractionation of duplicate genes that result from whole genome duplication (WGD), for:

Series of 2 or 3 WGD

Whole genome triplication followed by WGD

Triplication, followed by speciation, then WGD

Calculate probabilities of possible gene pairs to predict the number of surviving pairs from each event.

Slide4

Gene events

Speciation creates a set of orthologous gene pairs that evolve through random single nucleotide mutations.

Whole genome duplication (WGD) creates a set of paralogous gene pairs that also diverge through random mutation.

Fractionation: one of the two genes is excised, pseudogenized, or otherwise removed as a coding gene.

Slide5

Building blocks

p = proportion of nucleotide positions occupied by the same base in two orthologs/paralogs

G = gene length (number of nucleotides in the coding region)

Assume p follows a normal approximation to the sum of G binomial distributions, divided by G, over time $t \in [0, \infty)$ since the event that gave rise to the gene pair.

Mean: $E[p]$ = [term not preserved] + [term not preserved] $\in [0, 1]$

Variance: $E(p - E[p])^2$ = [formula not preserved]

where [definition not preserved]
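The normal approximation for p can be illustrated with a small simulation. This is a minimal sketch, not the authors' code: it assumes each of the G sites matches independently with the same probability, here called `q` (a hypothetical name; the slide's own expression for E[p] as a function of t is not reproduced). Under that assumption p has mean q and variance q(1 - q)/G. The numeric values are illustrative only.

```python
import random
import statistics


def similarity_moments(q: float, gene_length: int) -> tuple[float, float]:
    """Mean and variance of p = (matching sites) / G, ASSUMING each of the
    G sites matches independently with probability q."""
    return q, q * (1.0 - q) / gene_length


def simulate_similarity(q: float, gene_length: int, rng: random.Random) -> float:
    """Draw one value of p by sampling G independent sites."""
    matches = sum(1 for _ in range(gene_length) if rng.random() < q)
    return matches / gene_length


rng = random.Random(0)           # fixed seed so the run is repeatable
q, G = 0.8, 1000                 # illustrative values, not taken from the slides
draws = [simulate_similarity(q, G, rng) for _ in range(1000)]
mean, var = similarity_moments(q, G)

print("theoretical mean/variance:", mean, var)
print("simulated   mean/variance:", statistics.fmean(draws), statistics.pvariance(draws))
```

Longer genes give a narrower distribution of p, which is why variable gene length matters for the shape of the peaks.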

Slide6

Building blocks

Fractionation can be represented by $u \in [0, 1]$.

u = probability, for a pair of genes, that neither gene is lost over a time interval t

Under the assumption that any gene pair has a constant probability of fractionation, u = [formula not preserved], where [symbol not preserved] is the fractionation parameter.
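The formula for u is not available here, so the following is only a sketch of one form that is consistent with a constant fractionation probability per unit time: exponential survival, u = exp(-rho * t). Both the exponential form and the name `rho` for the fractionation parameter are assumptions made for illustration, not taken from the slides.

```python
import math


def retention_probability(rho: float, t: float) -> float:
    """Probability that neither gene of a pair is lost over an interval of
    length t, ASSUMING a constant loss rate rho (exponential survival)."""
    if rho < 0 or t < 0:
        raise ValueError("rho and t must be non-negative")
    return math.exp(-rho * t)


# Illustrative values only.
u1 = retention_probability(0.5, 1.0)
u2 = retention_probability(0.5, 2.0)
u_total = retention_probability(0.5, 3.0)

# With a constant rate, retention over consecutive intervals multiplies.
print(u1 * u2, u_total)
```

Under this assumption u always lies in [0, 1], equals 1 at t = 0, and decreases as the interval grows.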

Slide7

Gene events

Consider 4 cases:

Two WGD

Three WGD

Whole genome triplication followed by WGD

Whole genome triplication, followed by speciation, followed by WGD

Slide8

Two WGD

Slide9

Two WGD

u is the probability, for a pair of genes, that neither gene is lost over the time interval $t_1$, and similarly v for the time interval $t_2$.

Slide10

Two WGD

u is the probability, for a pair of genes, that neither gene is lost over the time interval $t_1$, and similarly v for the time interval $t_2$.

Slide11

Two WGD

In Figure 1, let

$A = E(t_1 \text{ pairs}) = 4uv^2 + 4uv(1-v) + u(1-v)^2 = u(1+v)^2$

$B = E(t_2 \text{ pairs}) = 2uv^2 + 2uv(1-v) + (1-u)v = v(1+u)$

$C = E(\text{unpaired genes}) = (1-u)(1-v)$

Slide12

Two WGD

In Figure 1, let

P(A) = proportion of $t_1$ pairs = [formula not preserved]

P(B) = proportion of $t_2$ pairs = [formula not preserved]

P(C) = proportion of unpaired genes = [formula not preserved]

Slide13
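The expected counts A, B and C from Slide 11 can be checked by enumerating the possible outcomes for one ancestral gene. This is a minimal sketch under one reading of the model: the first WGD leaves two lineages with probability u (otherwise one), and after the second WGD each lineage holds two genes with probability v (otherwise one). The function names are hypothetical, and the normalization in `proportions` (dividing each count by A + B + C) is an assumption, since the formulas for P(A), P(B) and P(C) are not available here.

```python
from itertools import product


def expected_counts(u: float, v: float) -> tuple[float, float, float]:
    """Expected t1 pairs (A), t2 pairs (B) and unpaired genes (C) for one
    ancestral gene after two WGDs, by enumerating retention outcomes."""
    a = b = c = 0.0
    # Case 1: the first-WGD pair is fractionated (probability 1 - u), leaving
    # one lineage that ends as a single t2 pair or a single unpaired gene.
    b += (1 - u) * v
    c += (1 - u) * (1 - v)
    # Case 2: both lineages survive (probability u); each independently keeps
    # both copies of the second WGD with probability v.
    for keep_x, keep_y in product((True, False), repeat=2):
        prob = u * (v if keep_x else 1 - v) * (v if keep_y else 1 - v)
        genes_x = 2 if keep_x else 1
        genes_y = 2 if keep_y else 1
        a += prob * genes_x * genes_y        # every cross-lineage pair dates to t1
        b += prob * (keep_x + keep_y)        # one t2 pair per duplicated lineage
    return a, b, c


def proportions(u: float, v: float) -> tuple[float, float, float]:
    """ASSUMED normalization: each expected count divided by A + B + C."""
    a, b, c = expected_counts(u, v)
    total = a + b + c
    return a / total, b / total, c / total


u, v = 0.6, 0.3                              # illustrative values only
A, B, C = expected_counts(u, v)
print(A, u * (1 + v) ** 2)
print(B, v * (1 + u))
print(C, (1 - u) * (1 - v))
print(proportions(u, v))
```

The enumeration reproduces the three terms of A and B one by one, which is a useful check that the closed forms u(1+v)^2 and v(1+u) follow from the expanded sums.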

Two WGD

Let $N_p(s)$ = the density at point s of a normal distribution with mean p and variance [formula not preserved].

Probability that a gene pair will have similarity s: [formula not preserved]

Probability of an unpaired gene: [formula not preserved]

The likelihood of a dataset with gene pairs at $s_1, \ldots, s_l$ and k unpaired genes is [formula not preserved].

The log likelihood = log [formula not preserved] is [formula not preserved].
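Since the likelihood formulas are not available here, the following is only a sketch of one plausible form, not the authors' expression. It assumes that each observed similarity is drawn from a two-component normal mixture weighted by P(A) and P(B), and that each unpaired gene contributes a factor P(C). The function names, argument layout and numeric values are all hypothetical.

```python
import math


def normal_pdf(s: float, mean: float, variance: float) -> float:
    """Density at s of a normal distribution with the given mean and variance."""
    return math.exp(-(s - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def log_likelihood(similarities, k_unpaired, weights, means, variances) -> float:
    """Log likelihood of pair similarities s_1..s_l plus k unpaired genes,
    ASSUMING a mixture: weights = (P(A), P(B), P(C)), with one normal
    component for t1 pairs and one for t2 pairs."""
    p_a, p_b, p_c = weights
    total = k_unpaired * math.log(p_c) if k_unpaired else 0.0
    for s in similarities:
        density = (p_a * normal_pdf(s, means[0], variances[0])
                   + p_b * normal_pdf(s, means[1], variances[1]))
        total += math.log(density)
    return total


# Illustrative data: similarities clustered near 0.9 and 0.7.
data = [0.90, 0.70, 0.88]
weights = (0.5, 0.3, 0.2)
good = log_likelihood(data, 2, weights, (0.9, 0.7), (0.001, 0.001))
poor = log_likelihood(data, 2, weights, (0.5, 0.3), (0.001, 0.001))
print(good, poor)
```

Component means placed at the observed peaks give a higher log likelihood than misplaced ones, which is the basis for fitting the event times and retention probabilities to data.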

 Slide14

Three WGD

Slide15

Three WGD

For Figure 2, where u, v, w are the retention probabilities for $t_1$, $t_2$, $t_3$:

$E(t_1 \text{ pairs}) = (1 - 3w^2 + 2w)uv^2 + (2 + 6w^2 + 4w)uv + (1 + w^2 + 2w)u$

$E(t_2 \text{ pairs}) = ((1 + w^2 + 2w)u + 1 + w^2 + 2w)v$

$E(t_3 \text{ pairs}) = -2uv^2w^2 + ((2w^2 - w)u + w)v + uv + w$

$E(\text{unpaired}) = (1-u)(1-v)(1-w)$

Slide16

Whole genome triplication (WGT) + WGD

$E(t_1 \text{ pairs}) = (u' + 3u''')v^2 + (2u' + 6u''') + b + 3u'''$

$E(t_2 \text{ pairs}) = -3u'''v^3 + 3u'''v^2 + (1 + 2u''' - u')v$

$E(\text{unpaired}) = (1 - u''' - u')(1 - v)$

Slide17

Speciation

Slide18

Speciation

Whole genome triplication ($t_1$)

Speciation ($t_2$)

WGD in one of the daughter genomes ($t_3$)

Slide19

Speciation

Slide20

Application to Populus trichocarpa

Slide21

Limitations

Length is variable among genes and genomes

Duplicate genes are produced not only by WGD

Assumption of constant rates of gene divergence

Fractionation rates are not well understood

Slide22

Conclusions

This is the first model that simultaneously processes duplicate gene divergence and fractionation through the course of evolution of one or more species that underwent WGD.

We can predict the location, shape and amplitude of evolutionary signals in pairwise genome comparisons.

Slide23

Thank you!