
Slide 1

Markov chains

Assume a gene that has three alleles A, B, and C. These can mutate into each other with fixed transition probabilities, collected in a transition matrix (also called a probability matrix).

Left probability matrix: the column sums add to 1.
Right probability matrix: the row sums add to 1.

Transition matrices are always square, and the trace contains the probabilities of no change.

[Diagram: transition graph between the alleles A, B, and C.] For example: 68% of A stays A, 12% mutates into B and 20% into C; 7% mutates from B to A and 10% from C to A.
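As a minimal sketch in Python, the left transition matrix of this example can be written down directly. Only the A column and the entries B→A and C→A are given above; the remaining entries for B and C are assumptions chosen merely so that each column sums to 1.

    import numpy as np

    # Left (column-stochastic) transition matrix for alleles A, B, C.
    # Columns = current state, rows = next state; each column sums to 1.
    P = np.array([
        [0.68, 0.07, 0.10],   # -> A
        [0.12, 0.83, 0.10],   # -> B   (0.83 and 0.10 are assumed)
        [0.20, 0.10, 0.80],   # -> C   (0.10 and 0.80 are assumed)
    ])

    assert np.allclose(P.sum(axis=0), 1.0)   # column sums are 1
    print(np.trace(P))                       # probabilities of no change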

Slide 2

Calculating probabilities

The transition matrix P gives the probabilities to reach another state in the next step. The matrix product P^2 gives the probabilities to reach another state in exactly two steps. The probability to reach any state in exactly n steps is given by the matrix power P^n.

Slide 3

Assume for instance you have a virus with N strains. Assume further that at each generation a strain mutates to another strain with probability a_i→j. The probability to stay is therefore 1 − Σ_j a_i→j. What is the probability that the virus is, after k generations, the same as at the beginning? It is the corresponding diagonal entry of P^k.
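A short sketch of both points, reusing the allele matrix from the previous sketch: n-step probabilities are matrix powers, and the diagonal of P^k answers the virus question (k = 10 is an arbitrary choice here).

    import numpy as np
    from numpy.linalg import matrix_power

    P = np.array([[0.68, 0.07, 0.10],
                  [0.12, 0.83, 0.10],
                  [0.20, 0.10, 0.80]])   # left matrix from the previous sketch

    print(matrix_power(P, 2))            # probabilities in exactly two steps
    k = 10
    print(np.diag(matrix_power(P, k)))   # P(same state after k generations)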

Slide 4

Initial allele frequencies

Given initial allele frequencies, what are the frequencies in the next generation? Multiplying the frequency vector by the transition matrix gives the allele frequencies in the first (and every following) generation.
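The computation is a single matrix-vector product; the initial frequencies below are made up for illustration.

    import numpy as np

    P = np.array([[0.68, 0.07, 0.10],
                  [0.12, 0.83, 0.10],
                  [0.20, 0.10, 0.80]])   # left matrix from the first sketch

    x0 = np.array([0.5, 0.3, 0.2])       # assumed initial frequencies of A, B, C
    x1 = P @ x0                          # frequencies in the next generation
    print(x1, x1.sum())                  # new frequencies still sum to 1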

Slide 5

A Markov chain is a process where the state at step n depends only on the state at step n−1 and the transition probabilities: a Markov chain has no memory.

Andrey Markov (1856–1922)

Transition probabilities might change over time, but the model assumes constant transition probabilities.

Slide 6

Does our mutation process above reach stable allele frequencies, or do they change forever? Do we get stable frequencies?

A vector X with P X = X is a steady-state, stationary probability, or equilibrium vector; the associated eigenvalue is 1. The equilibrium vector is independent of the initial conditions. The largest eigenvalue (principal eigenvalue) of every probability matrix equals 1, and there is an associated stationary probability vector that defines the equilibrium conditions (Perron–Frobenius theorem).
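A sketch of how the stationary vector can be found numerically: take the eigenvector for eigenvalue 1 and rescale it, as the following slides discuss.

    import numpy as np

    P = np.array([[0.68, 0.07, 0.10],
                  [0.12, 0.83, 0.10],
                  [0.20, 0.10, 0.80]])       # left matrix from the first sketch

    eigvals, eigvecs = np.linalg.eig(P)
    i = int(np.argmin(np.abs(eigvals - 1.0)))   # principal eigenvalue 1
    u = np.real(eigvecs[:, i])
    u /= u.sum()                                # rescale: frequencies add to 1
    print(u)                                    # stationary allele frequencies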

Slide 7

Eigenvalues and eigenvectors of probability matrices

Column sums of (left) probability matrices are 1; row sums might differ from 1. The eigenvalues of probability matrices and their transposes are identical. One of the eigenvalues of a probability matrix is 1.

If one of the diagonal entries of P is 1, the matrix is called absorbing. In this case the eigenvector of the largest eigenvalue contains only zeros and a single 1: absorbing chains become monodominant, dominated by one element. To get frequencies, the eigenvector has to be rescaled (normalized).

Slide 8

Normalizing the stationary state vector

Frequencies have to add to unity. The stationary frequencies are therefore u_i = v_i / Σ_j v_j, where v is the eigenvector for eigenvalue 1.

Slide 9

Final frequencies

The eigenvector entries have to be rescaled by their sum. For a population of N = 1000, the expected allele counts are N times the stationary frequencies.

Slide 10

Do all Markov chains converge?

[Diagrams: a chain with a closed part, a chain with a recurrent part, and a periodic chain.]

Recurrent and aperiodic chains are called ergodic. The fundamental limit theorem for probability matrices tells us that every irreducible ergodic transition matrix has a steady-state vector to which the process converges.

In one example chain you can leave every state. In another, state D cannot be left: that chain is absorbing.

Slide 11

Absorbing chains

[Diagram: four-state chain with states A, B, C, D; it is impossible to leave state D. Labels: closed part, absorbing part.]

A chain is called absorbing if it contains states without exit. The other states are called transient. Any absorbing Markov chain finally converges to the absorbing states.

Slide 12

The time to reach the absorbing state

Assume a drunkard going randomly through five streets. In the first street is his home, in the last a bar. At either home or bar he stays.

[Diagram: five states in a row, Home ... Bar, with probability 0.5 of stepping left and 0.5 of stepping right at each intermediate state.]
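Before doing the algebra of the next slides, the expected absorption time can be checked by brute force. A small simulation sketch (states 0–4, with home = 0 and bar = 4 absorbing):

    import random

    def walk(start: int) -> int:
        """Number of steps until absorption at home (0) or bar (4)."""
        state, steps = start, 0
        while state not in (0, 4):
            state += random.choice((-1, 1))   # left or right with p = 0.5
            steps += 1
        return steps

    trials = [walk(2) for _ in range(100_000)]
    print(sum(trials) / len(trials))   # about 4 steps from the middle street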

Slide 13

The canonical form

We rearrange the transition matrix to have the s absorbing states in the upper left corner and the t transient states in the lower right corner. We then have four compartments; in the left-matrix convention used here,

    P = ( I  R )
        ( 0  Q )

After n steps we have

    P^n = ( I  R(I + Q + ... + Q^(n-1)) )
          ( 0  Q^n )

The unknown matrix R(I + Q + ... + Q^(n-1)) contains information about the frequencies to reach an absorbing state from state B, C, or D (the transient part).

Slide 14

Multiplication of probabilities gives ever smaller values, so the powers Q^n shrink towards zero and the sum I + Q + Q^2 + ... is a simple geometric series:

    N = I + Q + Q^2 + ... = (I − Q)^(−1)

N is called the fundamental matrix of Q. The entries n_ij of N contain the expected number of times the process is in state i when started in state j. The entries b_ij of the matrix B = RN contain the probabilities of ending in absorbing state i when started in state j.

Slide 15

Summing the entries of N over all transient states gives, for each starting state, the expected total number of steps the chain spends in the transient states (afterwards it falls into the absorbing state). These sums form the vector t: its entry t_i gives the expected number of steps, starting at state i, before the chain is absorbed.

The drunkard's walk: N yields the expected number of steps to reach the absorbing state, and B = RN the probability of reaching each absorbing state (home or bar) from any of the transient states, as the sketch below shows.
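A sketch putting slides 13–15 together for the drunkard's walk, in the left-matrix convention:

    import numpy as np

    Q = np.array([[0.0, 0.5, 0.0],    # transient -> transient (columns = from)
                  [0.5, 0.0, 0.5],
                  [0.0, 0.5, 0.0]])
    R = np.array([[0.5, 0.0, 0.0],    # transient -> home
                  [0.0, 0.0, 0.5]])   # transient -> bar

    N = np.linalg.inv(np.eye(3) - Q)  # fundamental matrix N = (I - Q)^(-1)
    t = N.sum(axis=0)                 # expected steps before absorption
    B = R @ N                         # absorption probabilities

    print(t)    # [3. 4. 3.] -> 4 steps from the middle street, as simulated
    print(B)    # home: [0.75 0.50 0.25]; bar: [0.25 0.50 0.75]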

Slide 16

Periodic chains do not have stable points.

Slide 17

Expected return (recurrence) times

[Diagram: five-state chain with states A–E and transition probabilities 0.33, 0.33, 0.25, 0.25, 0.05, 0.05, 0.15, 0.25, 0.50, 0.35 on its edges.]

If we start at state D, how long does it take on average to return to D?

The rescaled eigenvector u of the probability matrix P gives the steady-state frequencies of being in state i. If u is this rescaled eigenvector, the expected return time t_ii of state i back to i is given by the inverse of the i-th element u_i of the eigenvector u: t_ii = 1/u_i. In the long run it takes about 9 steps to return to D.
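A sketch of the recipe (the five-state matrix of the diagram is only partially legible, so the allele matrix from the first sketch stands in):

    import numpy as np

    P = np.array([[0.68, 0.07, 0.10],
                  [0.12, 0.83, 0.10],
                  [0.20, 0.10, 0.80]])   # left matrix from the first sketch

    eigvals, eigvecs = np.linalg.eig(P)
    u = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    u /= u.sum()                 # stationary frequencies
    print(1.0 / u)               # expected return times t_ii = 1 / u_i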

Slide 18

First passage times in ergodic chains

If we start at state D, how long does it take on average to reach state A?

[Diagram: the same five-state chain A–E as on the previous slide.]

Applied to the original probability matrix P, the fundamental matrix N of P contains information on the expected number of times the process is in state i when started in state j.

We have to consider all possible ways from D to A. [Worked example on the slide: individual paths such as D→C→A contribute probabilities 0.25 × 0.05 = 0.0125, then 0.012375, 0.00144375, 0.00103125, ...] The inverse of the sum of these probabilities gives the expected number of steps to reach point k from point j.

The fundamental matrix of an ergodic chain: with W the matrix containing only the rescaled stationary point vector in every column, N = (I − P + W)^(−1). The expected average number of steps t_jk to reach state k from state j then comes from the entries of the fundamental matrix N divided through the respective entry of the (rescaled) stationary point vector, as in the sketch below.
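A sketch of the computation using the standard Kemeny–Snell fundamental matrix Z = (I − P + W)^(−1) and the first-passage formula t_jk = (z_kk − z_jk)/u_k, the textbook form of the rule stated above. The demo reuses the allele matrix, since the five-state matrix is only partially legible.

    import numpy as np

    def mean_first_passage(P: np.ndarray) -> np.ndarray:
        """Mean first passage times m[j, k] (from j to k) for an ergodic chain.

        P is a left (column-stochastic) matrix; internally its transpose,
        the row-stochastic form used by Kemeny & Snell, is used.
        """
        A = P.T
        n = A.shape[0]
        eigvals, eigvecs = np.linalg.eig(P)
        u = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
        u /= u.sum()                            # stationary vector
        W = np.tile(u, (n, 1))                  # each row = stationary vector
        Z = np.linalg.inv(np.eye(n) - A + W)    # fundamental matrix
        return (np.diag(Z)[None, :] - Z) / u[None, :]  # (z_kk - z_jk) / u_k

    P = np.array([[0.68, 0.07, 0.10],
                  [0.12, 0.83, 0.10],
                  [0.20, 0.10, 0.80]])   # left matrix from the first sketch
    print(mean_first_passage(P))         # off-diagonal: first passage times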

Slide 19

Average first passage time

Slide 20

You have sunny, cloudy, and rainy days with respective transition probabilities. How long does it take for a sunny day to follow a rainy day? How long does it take for a sunny day to come back?
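The slide gives no numbers, so the matrix below is a made-up example; a Monte Carlo sketch estimates both quantities, and the exact values follow from the mean_first_passage sketch above and from t_ii = 1/u_i.

    import numpy as np

    rng = np.random.default_rng(1)

    P = np.array([[0.6, 0.3, 0.1],    # -> sunny   (all values assumed)
                  [0.3, 0.4, 0.4],    # -> cloudy
                  [0.1, 0.3, 0.5]])   # -> rainy; columns = today, rows = tomorrow
    SUNNY, RAINY = 0, 2

    def days_until(start: int, target: int) -> int:
        """Simulate one run; count days until the target weather occurs."""
        state, days = start, 0
        while True:
            state = rng.choice(3, p=P[:, state])
            days += 1
            if state == target:
                return days

    runs = 50_000
    print(np.mean([days_until(RAINY, SUNNY) for _ in range(runs)]))  # first passage
    print(np.mean([days_until(SUNNY, SUNNY) for _ in range(runs)]))  # return time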

Slide 21

Probabilities of DNA substitution

We assume equal substitution probabilities between the nucleotides A, T, C, and G. If the probability of each particular substitution is p:

The probability that A mutates to T, C, or G is P_¬A = p + p + p = 3p.
The probability of no mutation is p_A = 1 − 3p.

Independent events: the probability that A mutates to T and C mutates to G is P = p × p = p^2.

The probabilities sum to one: p(A→T) + p(A→C) + p(A→G) + p(A→A) = 1.

This is the basis for the construction of evolutionary trees from DNA sequence data.

Slide 22

The probability matrix

The Jukes–Cantor model (JC69) assumes that all substitution probabilities between the four nucleotides A, T, C, G are equal. What is the probability that after 5 generations A did not change? It is the (A, A) entry of P^5.
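A sketch with an assumed per-generation substitution probability p = 0.01 (the slide leaves p symbolic):

    import numpy as np
    from numpy.linalg import matrix_power

    p = 0.01                                         # assumed value
    # JC69 single-generation matrix: 1 - 3p on the diagonal, p elsewhere.
    P = np.full((4, 4), p) + np.eye(4) * (1 - 4 * p)
    print(matrix_power(P, 5)[0, 0])   # probability that A is still A after 5 steps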

Slide 23

Arrhenius model

The Jukes–Cantor model assumes equal substitution probabilities within these 4 nucleotides. For the substitution probability after time t, the transition matrix becomes a substitution matrix.

If each of the three possible substitutions away from a nucleotide occurs at rate p, the probability that nothing changes during time t is the zero term of the Poisson distribution, P_0 = e^(−3pt), and the probability of at least one substitution is 1 − e^(−3pt).

Allowing also for reversions (the standard JC69 solution), the probability to reach a nucleotide from any other after time t is (1/4)(1 − e^(−4pt)), and the probability that a nucleotide doesn't change after time t is (1/4) + (3/4)e^(−4pt).
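A sketch checking these closed forms against the matrix exponential of the JC69 rate matrix (scipy's expm; the rate p and time t are assumed values):

    import numpy as np
    from scipy.linalg import expm

    p, t = 0.02, 10.0                            # assumed rate and time
    R = np.full((4, 4), p) - np.eye(4) * 4 * p   # JC69 rates: p off-diag, -3p diag
    Pt = expm(R * t)                             # substitution matrix after time t

    print(Pt[0, 0], 0.25 + 0.75 * np.exp(-4 * p * t))   # nucleotide unchanged
    print(Pt[0, 1], 0.25 * (1 - np.exp(-4 * p * t)))    # reached from any other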

Slide 24

Probability for a single difference

What is the probability of n differences after time t? We use the principle of maximum likelihood and the Bernoulli distribution. Inverting the result gives the mean time needed to get x different sites in a sequence of n nucleotides; it is also a measure of distance that depends only on the number of substitutions.
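The resulting estimator, in its usual closed form d = −(3/4)·ln(1 − (4/3)·x/n) (the standard JC69 distance, stated here rather than derived), as a sketch:

    import math

    def jc_distance(x: int, n: int) -> float:
        """Jukes-Cantor distance (expected substitutions per site)
        given x observed differences in an alignment of n sites."""
        return -0.75 * math.log(1 - 4 * (x / n) / 3)

    print(jc_distance(30, 1000))   # 3% observed differences -> d about 0.0306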

Slide 25

[Phylogenetic tree: Gorilla, Pan paniscus, Pan troglodytes, Homo sapiens, Homo neandertalensis; axes: time vs. divergence (number of substitutions).]

Phylogenetic trees are the basis of any systematic classification.
