Transition Bias and Substitution models

Uploaded On 2017-10-19



Presentation Transcript

Slide1

Transition Bias and Substitution models

Xuhua Xia

xxia@uottawa.ca

http://dambe.bio.uottawa.ca

Slide2

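Transitions and Transversions

Transition: the substitution of a purine for a purine or a pyrimidine for a pyrimidine. Symbolized by s.

Transversion: the substitution of a purine for a pyrimidine or vice versa. Symbolized by v.

[Figure: the purines A and G and the pyrimidines C and T, with transitions A <-> G and C <-> T and transversions between a purine and a pyrimidine.]

What is transition bias?

Transition bias refers to the degree by which the s/v ratio deviates from the expected 1/2. The expected ratio is 1/2 because each nucleotide can change by one transition but by two transversions, so with equal rates for all changes transversions would be twice as frequent as transitions. The observed s/v ratio is almost always much larger than 1/2.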
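The definitions of s and v can be turned into a small counting routine. The following is a minimal sketch, not part of the original slides: the function names `is_transition` and `count_s_v` are illustrative, the two sequences are assumed to be already aligned, and any column containing a gap or an ambiguous symbol is simply skipped.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CTU")
NUCLEOTIDES = PURINES | PYRIMIDINES


def is_transition(a, b):
    """True if a and b are both purines or both pyrimidines."""
    return (a in PURINES) == (b in PURINES)


def count_s_v(seq1, seq2):
    """Count transitions (s) and transversions (v) between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    s = v = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip identical sites and anything that is not a plain nucleotide.
        if a == b or a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue
        if is_transition(a, b):
            s += 1
        else:
            v += 1
    return s, v


# Under equal rates, enumerate all 12 possible changes to see where 1/2 comes from.
changes = [(a, b) for a in "ACGT" for b in "ACGT" if a != b]
n_s = sum(is_transition(a, b) for a, b in changes)
n_v = len(changes) - n_s
print(n_s, n_v)                       # transitions vs transversions among the 12 changes
print(count_s_v("AACGT", "GACTT"))    # one A/G transition, one G/T transversion
```

The enumeration over all ordered pairs gives 4 transitions against 8 transversions, which is the 1/2 expectation stated above.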

Slide3

Transition Bias is Ubiquitous. Why?

For both invertebrate and vertebrate genes:

What causes transition bias?
- Mutation bias
- Selection bias

Selection bias in fixation probability
- Protein-coding genes
- RNA genes

Mutation bias

Slide4


Mitochondrial Genetic Code

Synonymous and nonsynonymous

Degeneracy:

Non-degenerate

Two-fold degenerate

Four-fold degenerate
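The degeneracy classes listed above can be computed directly from a genetic code table. The sketch below is an illustration only: it assumes the vertebrate mitochondrial code (the standard code with AGA/AGG read as stop, AUA as Met and UGA as Trp), and the helper name `fold` is hypothetical. A site's fold is taken here as the number of nucleotides at that codon position, including the one present, that leave the amino acid unchanged.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD)}
# Assumed vertebrate mitochondrial differences from the standard code.
CODE.update({"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"})


def fold(codon, pos):
    """Degeneracy of codon position pos (0, 1 or 2): 1 means nondegenerate."""
    codon = codon.upper().replace("U", "T")
    aa = CODE[codon]
    return sum(CODE[codon[:pos] + b + codon[pos + 1:]] == aa for b in BASES)


for codon in ("GGA", "AAU", "UGU"):
    print(codon, [fold(codon, pos) for pos in range(3)])
```

With this code table every third codon position comes out as either two-fold or four-fold degenerate, which is why the mitochondrial code is convenient for separating the two classes.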

Transitions are synonymous and transversions are nonsynonymous at two-fold degenerate sites.

Slide5

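RNA secondary structure

A C -> U transition at the third position of the upper strand:

      CCAAU              CCAAU
Seq1: CACGA        Seq2: CAUGA
      |||||              |||||
      GUGCU              GUGCU

An A -> G transition at the second position of the upper strand:

      CCAAU              CCAAU
Seq1: CACGA        Seq2: CGCGA
      |||||              |||||
      GUGCU              GUGCU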
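The stems drawn above can be checked column by column. This is a minimal sketch under stated assumptions: each stem is given as two aligned strings read in the order drawn, the function name `stem_pairs` is illustrative, and a G/U (wobble) pair is counted as a tolerated pair alongside the Watson-Crick pairs.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def stem_pairs(top, bottom):
    """Label each column of a stem as 'WC', 'GU' (wobble) or 'mismatch'."""
    labels = []
    for x, y in zip(top, bottom):
        if (x, y) in WATSON_CRICK:
            labels.append("WC")
        elif (x, y) in WOBBLE:
            labels.append("GU")
        else:
            labels.append("mismatch")
    return labels


print(stem_pairs("CACGA", "GUGCU"))  # Seq1
print(stem_pairs("CAUGA", "GUGCU"))  # Seq2 after the C -> U transition
print(stem_pairs("CGCGA", "GUGCU"))  # Seq2 after the A -> G transition
print(stem_pairs("CAAGA", "GUGCU"))  # a hypothetical C -> A transversion
```

Both transitions turn a Watson-Crick pair into a G/U pair, whereas the hypothetical transversion in the last line leaves an A opposite a G, which cannot pair.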

A G/U pair, although not as strong as an A/U or C/G pair, generally does not disrupt RNA secondary structure (and occurs frequently in RNA secondary structure).

Slide6


Causes of transition bias

"I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of Science, whatever the matter may be."

Lord Kelvin: Phys. Letter A, vol. 1, "Electrical Units of Measurement", 1883-05-03

Slide7

At Four-fold Degenerate Sites

At four-fold degenerate sites, all nucleotide substitutions are synonymous and subject to roughly the same selection pressure (similar fixation probabilities).

Glycine codons: GGA, GGC, GGG, GGT (the third position is a four-fold degenerate site)

aa     Gly Asn Lys Gly Asp Lys Ala Ala Pro Ala Cys ...
Fold    4   2   2       2   2   4   4   4       2
S1     GGA AAU AAA GGA GAC AAA GCC GCC CCU GCG UGU ...
S2     GGG AAC AAA GAA GAU AAG GCC GCU CCA GGG UGG ...
Change  s                           s   v
S2 aa              Glu                     Gly Trp

Slide8

At Nondegenerate Sites

At nondegenerate sites, all nucleotide substitutions are nonsynonymous and subject to roughly the same selection pressure (similar fixation probabilities).

Glycine codons: GGA, GGC, GGG, GGT (the first and second positions are nondegenerate sites)

aa     Gly Asn Lys Gly Asp Lys Ala Ala Pro Ala Cys ...
S1     GGA AAU AAA GGA GAC AAA GCC GCC CCU GCG UGU ...
S2     GGG AAC AAA GAA GAU AAG GCC GCU CCA GGG UGG ...
Change              s                       v
S2 aa              Glu                     Gly Trp

Slide9

At Two-fold Degenerate Sites

At two-fold degenerate sites, all transitional substitutions are synonymous, and all transversional substitutions are nonsynonymous.

GAA Glu
GAG Glu
GAC Asp
GAT Asp
(the third position is a 2-fold degenerate site)

A transition is about 40 times as likely to become fixed as a transversion.

aa     Gly Asn Lys Gly Asp Lys Ala Ala Pro Ala Cys ...
Fold    4   2   2       2   2   4   4   4       2
S1     GGA AAU AAA GGA GAC AAA GCC GCC CCU GCG UGU ...
S2     GGG AAC AAA GAA GAU AAG GCC GCU CCA GGG UGG ...
Change      s           s   s                   v
S2 aa              Glu                     Gly Trp
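The s and v labels in the S1/S2 comparisons on the last three slides can be reproduced programmatically. The sketch below is an illustration, not the author's procedure: it assumes the vertebrate mitochondrial code, takes the degeneracy of each site from the S1 codon, and uses the hypothetical helper names `fold` and `classify`.

```python
from itertools import product

BASES = "TCAG"
PURINES = {"A", "G"}
# Standard code in TCAG order, then the assumed vertebrate mitochondrial changes.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD)}
CODE.update({"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"})

S1 = "GGA AAU AAA GGA GAC AAA GCC GCC CCU GCG UGU".split()
S2 = "GGG AAC AAA GAA GAU AAG GCC GCU CCA GGG UGG".split()


def fold(codon, pos):
    """Number of nucleotides at position pos that keep the amino acid unchanged."""
    aa = CODE[codon]
    return sum(CODE[codon[:pos] + b + codon[pos + 1:]] == aa for b in BASES)


def classify(seq1, seq2):
    """Group each difference ('s' or 'v') by the degeneracy of the site in seq1."""
    tallies = {}
    for c1, c2 in zip(seq1, seq2):
        c1, c2 = c1.replace("U", "T"), c2.replace("U", "T")
        for pos in range(3):
            if c1[pos] != c2[pos]:
                same_class = (c1[pos] in PURINES) == (c2[pos] in PURINES)
                tallies.setdefault(fold(c1, pos), []).append("s" if same_class else "v")
    return tallies


tallies = classify(S1, S2)
for degeneracy in sorted(tallies):
    print(degeneracy, " ".join(tallies[degeneracy]))
```

Grouping by degeneracy keeps the three site classes separate, so the s/v pattern within each class can be compared without mixing sites under different selection pressure.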

Slide10

Methylation and deamination

[Figure: a methyltransferase transfers a methyl group (H3C-) from a donor to an acceptor.]

Slide11

Methylation and DNA Repair in E. coli

- DNA alphabet: ACGT
- RNA alphabet: ACGU
- DNA duplication and Watson-Crick pairing rule: A-T, C-G

3'--CTAG----CTAGGTAT----C-----C--CTAG-----------5'
    ||||    ||||||||    ?     ?  ||||
5'--GATC----GATCCATA----U-----T--GATC-----...   3'

[Figure labels: four methyl groups (H3C) on the DNA, and the repair proteins mutS, mutH and mutL.]

Spacing of GATC: consequences of being too far apart.

Slide12

Methylation-Modification System

Recognition site in Brevibacterium albidum (C* marks the methylated cytosine):

TGGC*CA
AC*CGGT

Restriction enzyme cut site:

----TGG|CCA---
----ACC|GGT---

[Figure labels: dsDNA phage, bacterial membrane, bacterial genome, transcription and translation, restriction enzyme, methylase.]

Slide13

CpG-Specific DNA Methylation

Mammalian DNA methyltransferase 1 (DNMT1)
- NlsD: NLS-containing domain
- RFDD: replication foci-directing domain
- ZnD: Zn-binding domain
- PBD: polybromo domain
- CatD: the catalytic domain

[Figure: domain map of DNMT1 from position 1 to 1620, showing NlsD, RFDD, ZnD, PBD and CatD, with the boundary positions 343, 350, 609, 613, 746, 748, 1110 and 1124 marked, and the labels CpG and mCpG.]

Fatemi, M., A. Hermann, S. Pradhan and A. Jeltsch, 2001 J Mol Biol 309: 1189-99.

Slide14

CpG-Specific DNA Methylation

5'ATGCGA-------CCGA--------ACGGC--TAA 3'
  ||||||       ||||        |||||
3'TACGCT-------GGCT--------TGCCG--ATT 5'

[Figure labels: three methyl groups (H3C) are drawn on the CpG sites, which are labeled fully methylated, hemi-methylated and unmethylated.]

Note: 5'CG3' = CpG

Slide15

Methylation and Gene Regulation

- Proteins with a methyl-CpG binding domain (MBD): MBD1, MBD2, MBD3 and MeCP2
- Deacetylase: an enzyme that removes an acetyl group
- Histone deacetylases deacetylate lysyl residues in histones (the half-life of an acetyl group is ~10 min). Acetylation removes a positive charge on the lysine ε-amino group and promotes nucleosome melting (and gene expression). Deacetylation tends to decrease or turn off gene expression.

[Figure: mCpG bound by an MBD protein together with histone deacetylase, leading to condensed DNA with repressed transcription. The slide also carries the label "Lysine demethylation".]

Wade, P. A., and A. P. Wolffe, 2001 Nat Struct Biol 8: 575-7.

Slide16

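Methylation and Mutation

[Figure: chemical structures showing cytosine, its methylation to 5-methylcytosine, and the spontaneous deamination of 5-methylcytosine to thymine.]

Cytosine is converted to thymine: methylation followed by spontaneous deamination.

Slide17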

Vertebrate mitochondrion

Slide18

Spontaneous deamination

Slide19

Transversion can erase transitions

Transitions can erase transitions, and transversions can erase transversions.

However, a transversion can erase many transitions occurring before it, and subsequent transitions cannot erase the transversion:

AACGCTTGACG -> AACGCTTAACG -> AACGCTTGACG -> AACGCTTCACG -> AACGCTTTACG

Here the eighth site changes G -> A -> G -> C -> T. Comparing the first and last sequences shows a single transversion (G versus T), although three transitions also occurred.

Although a transition could also erase 2n transversions occurring before it, this is rare because transversions are in general much rarer than transitions:

AACGCTTGACG -> AACGCTTTACG -> AACGCTTAACG -> AACGCTTGACG

Transitions tend to be missed in counting much more frequently than transversions.
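The erasing effect can be made concrete by following a single site through its successive states. This is a small illustrative sketch, with hypothetical helper names `kind` and `summarize`; a history is written as a string of the states the site passes through, ancestral state first.

```python
PURINES = {"A", "G"}


def kind(a, b):
    """None for identical nucleotides, 's' for a transition, 'v' for a transversion."""
    if a == b:
        return None
    return "s" if (a in PURINES) == (b in PURINES) else "v"


def summarize(history):
    """Compare what happened along a history with what the two endpoints show."""
    steps = [kind(a, b) for a, b in zip(history, history[1:])]
    return {
        "transitions": steps.count("s"),
        "transversions": steps.count("v"),
        "observed": kind(history[0], history[-1]),
    }


print(summarize("GAGCT"))  # two transitions, a transversion, then another transition
print(summarize("GTAG"))   # two transversions followed by a transition
print(summarize("GAG"))    # a transition erased by a second transition
```

In the first history only a transversion is visible between the endpoints even though three transitions occurred, which is the asymmetry in counting that the slide describes.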

Slide20

Summary

Selection: Transitions are tolerated more than transversions by natural selection because
- they are more likely to be synonymous in protein-coding sequences than transversions
- they are less likely to disrupt RNA secondary structure than transversions.

Mutation: Transitional mutations occur more frequently than transversional mutations because
- misincorporation during DNA replication occurs more frequently between two purines or between two pyrimidines than between a purine and a pyrimidine
- a purine is more likely to mutate chemically to another purine than to a pyrimidine, and likewise a pyrimidine to another pyrimidine (e.g., through spontaneous deamination).

Bias in counting: Transitions tend to be missed in counting much more frequently than transversions (which necessitates the substitution models).

Slide21

Nucleotide Substitutions

[Figure: the ancestral sequence ACACTCGGATTAGGCT evolves along two lineages into the two observed sequences; the substitutions along the way are labeled single, multiple, coincidental, parallel, convergent and back.]

Observed sequences:
ATACTCAGGTTAAGCT
ACAATCCGGTTAAGCT

Actual number of changes during the evolution of the two daughter sequences: 12.

Observed number of differences between the two daughter sequences: 3.

Correcting for multiple substitutions aims to estimate the true number of changes, i.e., 12.

From WHL

Slide22

Substitution models and phylogenetics

- A substitution model describes the evolutionary process so that multiple hits can be corrected for.
- A phylogenetic reconstruction method implicitly or explicitly assumes a substitution model.
- A phylogenetic method that assumes a wrong substitution model will typically produce wrong trees.
- An alignment made with an inappropriate substitution score matrix will typically be inaccurate (e.g., when the sequences show strong transition bias but the score matrix carries no strong penalty against transversions).
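As an illustration of what correcting for multiple hits looks like in practice, the sketch below uses the standard textbook distance formulas for the JC69 and K80 models; these formulas are not given on the slides and are brought in here as an assumption. For JC69, d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. For K80, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. The function names are illustrative.

```python
import math


def jc69_distance(p):
    """JC69 distance (expected substitutions per site) from the proportion p of differences."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def k80_distance(P, Q):
    """K80 distance from the proportions of transitional (P) and transversional (Q) differences."""
    a = 1 - 2 * P - Q
    b = 1 - 2 * Q
    if a <= 0 or b <= 0:
        raise ValueError("differences too large for the K80 correction")
    return -0.5 * math.log(a) - 0.25 * math.log(b)


# Example: 3 observed differences between two sequences of 16 sites.
p = 3 / 16
print(p, jc69_distance(p))
# The same data split into 1 transitional and 2 transversional differences.
print(k80_distance(1 / 16, 2 / 16))
```

The corrected distance always exceeds the raw proportion of differences, and the gap widens as sequences diverge. With only a handful of sites, as in this example, any such estimate is very imprecise.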

Slide23

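Rate matrix with twelve parameters (rows and columns in the order A, G, C, T):

$$
\begin{array}{c|cccc}
  & A & G & C & T \\ \hline
A & \cdot & a_1 & a_2 & a_3 \\
G & a_7 & \cdot & a_4 & a_5 \\
C & a_8 & a_9 & \cdot & a_6 \\
T & a_{10} & a_{11} & a_{12} & \cdot
\end{array}
$$

Rate matrix with six rate parameters and four frequency parameters:

$$
\begin{array}{c|cccc}
  & A & G & C & T \\ \hline
A & \cdot & a_1\pi_G & a_2\pi_C & a_3\pi_T \\
G & a_1\pi_A & \cdot & a_4\pi_C & a_5\pi_T \\
C & a_2\pi_A & a_4\pi_G & \cdot & a_6\pi_T \\
T & a_3\pi_A & a_5\pi_G & a_6\pi_C & \cdot
\end{array}
$$

The diagonal of a transition probability matrix is subject to the constraint that each row sums up to 1.

Models, expressed as constraints on the twelve parameters of the first matrix:

JC69: $\pi_i = 0.25$; $a_i = c$

F81/TN84: $\pi_A, \pi_C, \pi_G, \pi_T$; $a_i = c$

K80: $\pi_i = 0.25$; $a_1 = a_6 = a_7 = a_{12} = \alpha$; $a_2 = a_3 = a_4 = a_5 = a_8 = a_9 = a_{10} = a_{11} = \beta$

HKY85: $\pi_A, \pi_C, \pi_G, \pi_T$; $a_1 = a_6 = a_7 = a_{12} = \alpha$; $a_2 = a_3 = a_4 = a_5 = a_8 = a_9 = a_{10} = a_{11} = \beta$

TN93: $\pi_A, \pi_C, \pi_G, \pi_T$; $a_1 = a_7 = \alpha_1$; $a_6 = a_{12} = \alpha_2$; $a_2 = a_3 = a_4 = a_5 = a_8 = a_9 = a_{10} = a_{11} = \beta$

GTR

Unrestricted: no equilibrium $\pi_i$

Slide24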

The TN93 model as an example

- Frequency parameters
- Rate ratio parameters

In addition to the illustrated assumptions, the model also assumes that the frequency and rate ratio parameters do not change over time, i.e., the substitution process is stationary.
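To show how the two categories of parameters combine, here is a minimal sketch of a TN93-style instantaneous rate matrix. It rests on assumptions that are not spelled out on the slide: the off-diagonal rate from i to j is taken as a rate parameter times the frequency of j, with one rate for A <-> G (`alpha1`), another for C <-> T (`alpha2`) and a third for all transversions (`beta`), and each diagonal entry is set so that its row sums to zero. The function and parameter names are illustrative, and the numerical values are arbitrary.

```python
NUCS = "AGCT"


def tn93_rate_matrix(pi, alpha1, alpha2, beta):
    """Build a TN93-style rate matrix as a nested dict q[i][j].

    pi: equilibrium frequencies keyed by nucleotide (should sum to 1).
    alpha1: rate parameter for A <-> G, alpha2: for C <-> T, beta: for transversions.
    """
    def rate(i, j):
        pair = {i, j}
        if pair == {"A", "G"}:
            return alpha1
        if pair == {"C", "T"}:
            return alpha2
        return beta

    q = {}
    for i in NUCS:
        row = {j: rate(i, j) * pi[j] for j in NUCS if j != i}
        row[i] = -sum(row.values())  # diagonal: each row of a rate matrix sums to 0
        q[i] = row
    return q


pi = {"A": 0.3, "G": 0.2, "C": 0.2, "T": 0.3}
q = tn93_rate_matrix(pi, alpha1=4.0, alpha2=6.0, beta=1.0)

# Stationarity check: starting from pi, the net flow into every nucleotide is zero.
flow = {j: sum(pi[i] * q[i][j] for i in NUCS) for j in NUCS}
print(flow)
```

Setting `alpha1` equal to `alpha2` gives an HKY85-style matrix, and additionally making all frequencies 0.25 gives a K80-style matrix, mirroring how the models nest.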

Slide25

Substitution Models

There are three types of substitution models in molecular evolution:
- Nucleotide-based
- Amino acid-based
- Codon-based

Substitution models are characterized by two categories of parameters, the frequency parameters and the rate ratio parameters, and different models differ in their assumptions concerning these two categories of parameters.

Substitution models, substitution score matrix and sequence alignment.