
# Markov Chains Mixing Times


Slide1

Markov Chains Mixing Times: Lecture 5

Omer Zentner

30.11.2016

Slide2

Agenda

Notes from previous lecture

Mixing time

Time reversal & reversed chain

Winning Streak Time Reversal (example)

A bit about time reversal in Random Walks on Groups

Eigenvalues (and bounds for mixing time we get with them)

Eigenvalues example

Slide3

Notes from previous lecture

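We’ve seen the definition of Total Variation Distance (“how similar two distributions are”):

$$\|\mu - \nu\|_{TV} = \max_{A \subseteq \Omega} |\mu(A) - \nu(A)| = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)| = \sum_{x :\, \mu(x) \ge \nu(x)} [\mu(x) - \nu(x)]$$

We’ve used TV to define two distance measures:

Distance from the stationary distribution: $d(t) := \max_{x \in \Omega} \|P^t(x, \cdot) - \pi\|_{TV}$

Distance between rows of the transition matrix: $\bar{d}(t) := \max_{x, y \in \Omega} \|P^t(x, \cdot) - P^t(y, \cdot)\|_{TV}$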
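As a small illustration of total variation distance, the sketch below computes it in two ways on a finite state space: as half the L1 distance, and as the largest discrepancy over all events. The two example distributions are hypothetical, and the brute-force version is only practical for very small spaces.

```python
from itertools import combinations

def tv_distance(mu, nu):
    """Total variation distance as half the L1 distance."""
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

def tv_distance_by_events(mu, nu):
    """Same distance as the max of |mu(A) - nu(A)| over all events A.

    Enumerates every subset, so it is exponential in the number of states.
    """
    states = range(len(mu))
    best = 0.0
    for size in range(len(mu) + 1):
        for event in combinations(states, size):
            gap = abs(sum(mu[x] for x in event) - sum(nu[x] for x in event))
            best = max(best, gap)
    return best

# Hypothetical example distributions on three states.
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]

print(tv_distance(mu, nu), tv_distance_by_events(mu, nu))
```

Both functions should agree up to floating-point error, which is the content of the identity between the max-over-events and half-L1 forms.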

Slide4

Notes from previous lecture

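We’ve seen some properties of the distance measures: $d(t) \le \bar{d}(t) \le 2d(t)$ and $\bar{d}(s+t) \le \bar{d}(s)\,\bar{d}(t)$.

From which we got, for any positive integer $c$: $d(ct) \le \bar{d}(ct) \le \bar{d}(t)^c$.

We’ll now continue with defining Mixing Time.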

Slide5

Mixing Time

Definitions:

$$t_{mix}(\varepsilon) := \min\{t : d(t) \le \varepsilon\}$$

$$t_{mix} := t_{mix}(1/4)$$

Slide6

Mixing Time

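From: $d(ct) \le \bar{d}(t)^c$ and $\bar{d}(t) \le 2d(t)$

We get: $d(\ell\, t_{mix}(\varepsilon)) \le \bar{d}(t_{mix}(\varepsilon))^{\ell} \le (2\varepsilon)^{\ell}$

And with $\varepsilon = 1/4$: $d(\ell\, t_{mix}) \le 2^{-\ell}$, and so $t_{mix}(\varepsilon) \le \lceil \log_2 \varepsilon^{-1} \rceil\, t_{mix}$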
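A brute-force sketch of the mixing time, assuming the usual definitions: d(t) is the largest total variation distance between a row of P^t and π, and the mixing time for ε is the first t with d(t) ≤ ε. The two-state chain is a hypothetical example; the last line checks the bound of ⌈log₂(1/ε)⌉ times the ε = 1/4 mixing time.

```python
import math

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def distance_to_stationarity(Pt, pi):
    """d(t): worst-case total variation distance between a row of P^t and pi."""
    return max(0.5 * sum(abs(a - b) for a, b in zip(row, pi)) for row in Pt)

def mixing_time(P, pi, eps=0.25, t_max=10_000):
    """Smallest t >= 1 with d(t) <= eps, found by repeated multiplication."""
    Pt = P
    for t in range(1, t_max + 1):
        if distance_to_stationarity(Pt, pi) <= eps:
            return t
        Pt = mat_mul(Pt, P)
    raise ValueError("chain did not mix within t_max steps")

# Hypothetical two-state chain with stationary distribution (1/4, 3/4).
P = [[0.7, 0.3], [0.1, 0.9]]
pi = [0.25, 0.75]

t_mix = mixing_time(P, pi)                       # eps = 1/4
t_mix_small = mixing_time(P, pi, eps=0.01)
bound = math.ceil(math.log2(1 / 0.01)) * t_mix   # ceil(log2(1/eps)) * t_mix
print(t_mix, t_mix_small, bound)
```

Repeated multiplication costs O(|Ω|³) per step, so this is only meant for small chains where the definition can be checked directly.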

Slide7

Time Reversal

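The time reversal of an irreducible Markov chain with transition matrix $P$ and stationary distribution $\pi$ is:

$$\hat{P}(x,y) := \frac{\pi(y)\,P(y,x)}{\pi(x)}$$

Let $(X_t)$ be an irreducible Markov chain with transition matrix $P$ and stationary distribution $\pi$.

We write $(\hat{X}_t)$ for the time-reversed chain with transition matrix $\hat{P}$.

$\pi$ is stationary for $\hat{P}$.

For every $x_0, \dots, x_t \in \Omega$, we have:

$$\mathbf{P}_\pi\{X_0 = x_0, \dots, X_t = x_t\} = \mathbf{P}_\pi\{\hat{X}_0 = x_t, \dots, \hat{X}_t = x_0\}$$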

Slide8

Time Reversal

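We’ve seen in lecture 2: if a transition matrix and the stationary distribution have the detailed balance property, $\pi(x)P(x,y) = \pi(y)P(y,x)$, then the chain is reversible:

$$\mathbf{P}_\pi\{X_0 = x_0, \dots, X_n = x_n\} = \mathbf{P}_\pi\{X_0 = x_n, \dots, X_n = x_0\}$$

This means that the distribution of $(X_0, X_1, \dots, X_n)$ is the same as the distribution of $(X_n, X_{n-1}, \dots, X_0)$. This follows from the detailed balance property.

When a chain is reversible, its time reversal is the same as itself. We have:

$$\hat{P}(x,y) = \frac{\pi(y)\,P(y,x)}{\pi(x)} = P(x,y)$$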

Slide9

Time Reversal & reversed chains

Reversible chain example – undirected graph

Non-reversible chain example – biased random walk on the n-cycle
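The two examples can be checked numerically. The sketch below assumes the standard forms: a simple random walk on an undirected graph (stationary distribution proportional to degree) and a walk on the n-cycle that steps +1 with probability p and -1 with probability 1 - p (uniform stationary distribution). The helper names and the specific graph are hypothetical.

```python
def is_reversible(P, pi, tol=1e-12):
    """Check detailed balance pi(x) P(x,y) = pi(y) P(y,x) for all pairs."""
    n = len(P)
    return all(abs(pi[x] * P[x][y] - pi[y] * P[y][x]) <= tol
               for x in range(n) for y in range(n))

def time_reversal(P, pi):
    """Time reversal: P_hat(x,y) = pi(y) P(y,x) / pi(x)."""
    n = len(P)
    return [[pi[y] * P[y][x] / pi[x] for y in range(n)] for x in range(n)]

def graph_walk(adj):
    """Simple random walk on an undirected graph given as adjacency lists."""
    n = len(adj)
    P = [[0.0] * n for _ in range(n)]
    for x, nbrs in enumerate(adj):
        for y in nbrs:
            P[x][y] += 1 / len(nbrs)
    total_degree = sum(len(nbrs) for nbrs in adj)
    pi = [len(nbrs) / total_degree for nbrs in adj]
    return P, pi

def biased_cycle(n, p):
    """Walk on the n-cycle (n >= 3): +1 with probability p, -1 otherwise."""
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        P[x][(x + 1) % n] += p
        P[x][(x - 1) % n] += 1 - p
    return P

# Path graph 0 - 1 - 2: reversible.
P_path, pi_path = graph_walk([[1], [0, 2], [1]])

# Biased walk on the 5-cycle: not reversible; its reversal flips the bias.
n, p = 5, 0.7
P_cyc = biased_cycle(n, p)
uniform = [1 / n] * n
P_cyc_hat = time_reversal(P_cyc, uniform)

print(is_reversible(P_path, pi_path), is_reversible(P_cyc, uniform))
```

For the biased cycle, the reversed chain is again a biased walk, with the roles of p and 1 - p exchanged.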

Slide10

Example – Winning streak

Repeatedly tossing a fair coin, while keeping track of the length of the last run of heads.

Memory is limited: we can only remember the last $n$ results.

$(X_t)$ is a Markov chain with state space $\{0, \dots, n\}$.

The current state of the chain is the length of the current run of heads, capped at $n$.

Slide11

Example – Winning streak

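Transition matrix is given by (non-zero transitions):

$$P(i, 0) = \tfrac{1}{2} \quad \text{for } 0 \le i \le n$$

$$P(i, i+1) = \tfrac{1}{2} \quad \text{for } 0 \le i < n$$

$$P(n, n) = \tfrac{1}{2}$$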

Slide12

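Example – Winning streak

Stationary distribution for $P$:

$$\pi(i) = \begin{cases} 1/2^{\,i+1} & \text{if } i = 0, 1, \dots, n-1 \\ 1/2^{\,n} & \text{if } i = n \end{cases}$$

Can check: $\pi P = \pi$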
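The check can be done exactly with rational arithmetic. This sketch assumes the standard winning-streak transitions (tails resets the streak to 0 with probability 1/2, heads extends it by one with probability 1/2, capped at n) and the candidate stationary distribution π(i) = 2^-(i+1) for i < n, π(n) = 2^-n. The function names are hypothetical.

```python
from fractions import Fraction

def winning_streak(n):
    """Transition matrix of the winning streak chain on {0, ..., n}."""
    half = Fraction(1, 2)
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        P[i][0] += half               # tails: streak resets to 0
        P[i][min(i + 1, n)] += half   # heads: streak grows, capped at n
    return P

def streak_stationary(n):
    """Candidate stationary distribution: 2^-(i+1) for i < n, 2^-n at n."""
    return [Fraction(1, 2 ** (i + 1)) for i in range(n)] + [Fraction(1, 2 ** n)]

def left_multiply(pi, P):
    """Row vector times matrix: (pi P)(y) = sum_x pi(x) P(x, y)."""
    m = len(P)
    return [sum(pi[x] * P[x][y] for x in range(m)) for y in range(m)]

n = 6
P = winning_streak(n)
pi = streak_stationary(n)
print(left_multiply(pi, P) == pi, sum(pi) == 1)
```

Using `Fraction` avoids any floating-point tolerance, so equality here is exact.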

Slide13

Example – Winning streak – Time reversal

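The time reversal is (non-zero transitions):

$$\hat{P}(0, i) = \pi(i) \quad \text{for } 0 \le i \le n$$

$$\hat{P}(i, i-1) = 1 \quad \text{for } 1 \le i < n$$

$$\hat{P}(n, n) = \hat{P}(n, n-1) = \tfrac{1}{2}$$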

Slide14

Example – Winning streak Time reversal

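For the time reversal of the winning streak: after $n$ steps the distribution is stationary, regardless of the initial distribution.

Why?

If $\hat{X}_0 = 0$, the distribution is stationary for all $t > 0$, since $\hat{P}(0, \cdot) = \pi$.

If $\hat{X}_0 = k < n$, the transitions force $\hat{X}_k = 0$, so the distribution is stationary for $t > k$.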

Slide15

Example – Winning streak Time reversal

If $\hat{X}_0 = n$:

The location of $\hat{X}_n$ depends on how much time we spent at $n$.

For $0 < k < n$: there is probability $1/2^k$ of holding at $n$ for $(k-1)$ times, and then proceeding on the $k$-th turn.

In this case $\hat{X}_k = n-1$ and $\hat{X}_n = k-1$ (since $k-1 = (n-1) - (n-k)$).

Also, $\hat{P}^n(n, n) = 1/2^n$.

Altogether, $\hat{P}^n(n, \cdot) = \pi$.

If the initial distribution is not concentrated on a single state, the distribution at time $n$ is a mixture of the distributions for each possible initial state, and is thus stationary.

Slide16

Example – Winning streak Time reversal

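Note (lower bound):

If the chain is started at $n$, and then leaves immediately, then at time $n-1$ it must be at state 1.

Hence $\hat{P}^{n-1}(n, 1) = \tfrac{1}{2}$.

And from the definition of total variation distance we get:

$$d(n-1) \ge |\hat{P}^{n-1}(n, 1) - \pi(1)| = \tfrac{1}{2} - \tfrac{1}{4} = \tfrac{1}{4}$$

Conclusion: for the reverse winning streak chain we have $t_{mix}(\varepsilon) = n$ for any positive $\varepsilon < 1/4$.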
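Both claims about the reversed winning streak can be verified exactly for a small n. The sketch below assumes the standard winning-streak chain and builds the reversal from the general formula P̂(x,y) = π(y)P(y,x)/π(x) rather than hard-coding it; the function names and the choice n = 5 are hypothetical.

```python
from fractions import Fraction

def winning_streak(n):
    """Winning streak chain on {0, ..., n} and its stationary distribution."""
    half = Fraction(1, 2)
    P = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        P[i][0] += half
        P[i][min(i + 1, n)] += half
    pi = [Fraction(1, 2 ** (i + 1)) for i in range(n)] + [Fraction(1, 2 ** n)]
    return P, pi

def time_reversal(P, pi):
    """P_hat(x,y) = pi(y) P(y,x) / pi(x)."""
    m = len(P)
    return [[pi[y] * P[y][x] / pi[x] for y in range(m)] for x in range(m)]

def mat_mul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def mat_pow(A, t):
    """A^t by repeated multiplication, starting from the identity."""
    m = len(A)
    R = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    for _ in range(t):
        R = mat_mul(R, A)
    return R

def d(Pt, pi):
    """Worst-case total variation distance between a row of Pt and pi."""
    return max(sum(abs(a - b) for a, b in zip(row, pi)) / 2 for row in Pt)

n = 5
P, pi = winning_streak(n)
P_hat = time_reversal(P, pi)
d_before = d(mat_pow(P_hat, n - 1), pi)   # still far from stationary
d_at_n = d(mat_pow(P_hat, n), pi)         # exactly stationary
print(d_before, d_at_n)
```

A distance of exactly zero at time n together with a distance of at least 1/4 at time n - 1 is what pins the mixing time of the reversed chain at n for every ε below 1/4.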

Slide17

Example – Winning streak – conclusion

It is possible for reversing a Markov chain to significantly change the mixing time

The mixing time of the Winning-Streak will be discussed in following lectures.

Slide18

Reminder – random walk on a group

A random walk on a group $G$, with increment distribution $\mu$:

$\Omega = G$

$\mu$ is a distribution over $\Omega$

At each step, we randomly choose $h \in G$ according to $\mu$, and multiply the current state by it

Or, in other words: $P(g, hg) = \mu(h)$

We’ve seen that for such a chain, the uniform probability distribution is a stationary distribution. It will sometimes be denoted $U$.

We’ve also seen that if $\mu$ is symmetric ($\mu(g) = \mu(g^{-1})$ for all $g$), the chain is reversible, and $P = \hat{P}$.

Slide19

Mixing Time and Time Reversal

Inverse distribution:

If $\mu$ is a distribution on a group $G$, $\hat{\mu}$ is defined by: $\hat{\mu}(g) := \mu(g^{-1})$, for all $g \in G$.

Let $P$ be the transition matrix of a random walk on the group $G$ with increment distribution $\mu$. Then the random walk with increment distribution $\hat{\mu}$ is the time reversal $\hat{P}$.

Even when $\mu$ is not symmetric, the forward and reverse walk distributions are at the same distance from stationarity.

Slide20

Mixing Time and Time Reversal

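Lemma 4.13

Let $P$ be the transition matrix of a random walk on a group $G$ with increment distribution $\mu$.

Let $\hat{P}$ be the transition matrix of a walk on $G$ with increment distribution $\hat{\mu}$.

Let $\pi$ be the uniform distribution on $G$.

Then, for any $t \ge 0$:

$$\|P^t(\mathrm{id}, \cdot) - \pi\|_{TV} = \|\hat{P}^t(\mathrm{id}, \cdot) - \pi\|_{TV}$$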

Slide21

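Mixing Time and Time Reversal: Lemma 4.13 – proof

Proof:

Let $(X_t) = (\mathrm{id}, X_1, X_2, \dots)$ be a Markov chain with transition matrix $P$ and initial state $\mathrm{id}$.

We can write $X_k = g_k g_{k-1} \cdots g_1$, where $g_1, g_2, \dots$ are random elements chosen independently from $\mu$.

Let $(Y_t)$ be a Markov chain with transition matrix $\hat{P}$ and initial state $\mathrm{id}$, with increments $h_1, h_2, \dots$ chosen independently from $\hat{\mu}$.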

Slide22

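Mixing Time and Time Reversal: Lemma 4.13 – proof cont.

For any fixed elements $a_1, \dots, a_t \in G$, from the definition of $\hat{\mu}$:

$$\mathbf{P}\{g_1 = a_1, \dots, g_t = a_t\} = \mathbf{P}\{h_1 = a_t^{-1}, \dots, h_t = a_1^{-1}\}$$

Summing over all strings such that $a = a_t a_{t-1} \cdots a_1$:

$$P^t(\mathrm{id}, a) = \hat{P}^t(\mathrm{id}, a^{-1})$$

Hence:

$$\sum_{a \in G} \left| P^t(\mathrm{id}, a) - |G|^{-1} \right| = \sum_{a \in G} \left| \hat{P}^t(\mathrm{id}, a^{-1}) - |G|^{-1} \right| = \sum_{a \in G} \left| \hat{P}^t(\mathrm{id}, a) - |G|^{-1} \right|$$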

Slide23

Mixing Time and Time Reversal

Corollary

If $t_{mix}$ is the mixing time of a random walk on a group and $\hat{t}_{mix}$ is the mixing time of the inverse walk, then $t_{mix} = \hat{t}_{mix}$.
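A quick exact check of the equal-distance property, as a sketch on the cyclic group Z_7 written additively (so the inverse of g is -g mod 7). The group, the non-symmetric increment distribution and the function names are all assumptions made for illustration; the lemma itself covers any finite group, including non-abelian ones, which this example does not exercise.

```python
from fractions import Fraction

def step(dist, mu):
    """One step of the walk on Z_n: mass at g moves to g + h with weight mu(h)."""
    n = len(dist)
    new = [Fraction(0)] * n
    for g in range(n):
        for h in range(n):
            new[(g + h) % n] += dist[g] * mu[h]
    return new

def inverse_distribution(mu):
    """mu_hat(g) = mu(-g): the increment distribution of the reversed walk."""
    n = len(mu)
    return [mu[(-g) % n] for g in range(n)]

def tv_to_uniform(dist):
    n = len(dist)
    return sum(abs(p - Fraction(1, n)) for p in dist) / 2

def tv_profile(mu, t_max):
    """Distances to uniform at times 1..t_max, starting from the identity 0."""
    n = len(mu)
    dist = [Fraction(int(g == 0)) for g in range(n)]
    profile = []
    for _ in range(t_max):
        dist = step(dist, mu)
        profile.append(tv_to_uniform(dist))
    return profile

# Hypothetical non-symmetric increments on Z_7: stay, +1 or +2.
mu = [Fraction(1, 6), Fraction(1, 2), Fraction(1, 3), 0, 0, 0, 0]
mu_hat = inverse_distribution(mu)

forward = tv_profile(mu, 10)
backward = tv_profile(mu_hat, 10)
print(forward == backward)
```

Since the two distance profiles coincide at every time, the first time each drops below a given ε is the same, which is the corollary about equal mixing times.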

Slide24

Eigenvalues: spectral representation of a reversible transition matrix

Note: “Because we regard elements of $\mathbb{R}^\Omega$ as functions from Ω to ℝ, we will call eigenvectors of the matrix P eigenfunctions”

Slide25

Eigenvalues

Define another inner product on $\mathbb{R}^\Omega$:

$$\langle f, g \rangle_\pi := \sum_{x \in \Omega} f(x)\, g(x)\, \pi(x)$$

Slide26

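Eigenvalues

Lemma 12.2: Let $P$ be reversible with respect to $\pi$. Then:

The inner-product space $(\mathbb{R}^\Omega, \langle\cdot,\cdot\rangle_\pi)$ has an orthonormal basis of real-valued eigenfunctions $\{f_j\}_{j=1}^{|\Omega|}$ corresponding to real eigenvalues $\{\lambda_j\}$.

The matrix $P$ can be decomposed as $\frac{P^t(x,y)}{\pi(y)} = \sum_{j=1}^{|\Omega|} f_j(x) f_j(y) \lambda_j^t$.

The eigenfunction $f_1$ corresponding to the eigenvalue 1 can be taken to be the constant vector $\mathbf{1}$, in which case $\frac{P^t(x,y)}{\pi(y)} = 1 + \sum_{j=2}^{|\Omega|} f_j(x) f_j(y) \lambda_j^t$.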

Slide27

Lemma 12.2 - proof

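Define a matrix $A$, as follows:

$$A(x,y) := \pi(x)^{1/2}\, \pi(y)^{-1/2}\, P(x,y)$$

$A$ is symmetric: we assumed $P$ is reversible with respect to $\pi$, so we have detailed balance:

$$\pi(x) P(x,y) = \pi(y) P(y,x) \;\Rightarrow\; P(x,y) = \frac{\pi(y) P(y,x)}{\pi(x)}$$

So,

$$A(x,y) = \pi(x)^{1/2}\, \pi(y)^{-1/2}\, \frac{\pi(y) P(y,x)}{\pi(x)} = \pi(y)^{1/2}\, \pi(x)^{-1/2}\, P(y,x) = A(y,x)$$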

Slide28

Lemma 12.2 – proof cont.

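The Spectral Theorem for Symmetric Matrices guarantees that the inner-product space $(\mathbb{R}^\Omega, \langle\cdot,\cdot\rangle)$ has an orthonormal basis $\{\varphi_j\}_{j=1}^{|\Omega|}$, such that $\varphi_j$ is an eigenfunction of $A$ with real eigenvalue $\lambda_j$.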

Slide29

Lemma 12.2 – proof cont.

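Can also see that $\sqrt{\pi}$ is an eigenfunction of $A$, with eigenvalue 1:

$$(A\sqrt{\pi})(x) = \sum_{y \in \Omega} \pi(x)^{1/2}\, \pi(y)^{-1/2}\, P(x,y)\, \pi(y)^{1/2} = \pi(x)^{1/2} \sum_{y \in \Omega} P(x,y) = \sqrt{\pi(x)}$$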

Slide30

Lemma 12.2 – proof cont.

We can decompose $A$ as follows: let $D_\pi$ be the diagonal matrix with $D_\pi(x,x) = \pi(x)$. Then:

$$A = D_\pi^{1/2}\, P\, D_\pi^{-1/2}$$

Slide31

Lemma 12.2 – proof cont.

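Let’s define $f_j := D_\pi^{-1/2} \varphi_j$.

We can see that $f_j$ is an eigenfunction of $P$ with eigenvalue $\lambda_j$:

$$P f_j = P D_\pi^{-1/2} \varphi_j = D_\pi^{-1/2} \left( D_\pi^{1/2} P D_\pi^{-1/2} \right) \varphi_j = D_\pi^{-1/2} A \varphi_j = D_\pi^{-1/2} \lambda_j \varphi_j = \lambda_j f_j$$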

Slide32

Lemma 12.2 – proof cont.

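$\{f_j\}$ are orthonormal with respect to the inner product $\langle\cdot,\cdot\rangle_\pi$:

$$\delta_{ij} = \langle \varphi_i, \varphi_j \rangle = \langle D_\pi^{1/2} f_i, D_\pi^{1/2} f_j \rangle = \langle f_i, f_j \rangle_\pi$$

So we have that $\{f_j\}_{j=1}^{|\Omega|}$ is an orthonormal basis for $(\mathbb{R}^\Omega, \langle\cdot,\cdot\rangle_\pi)$, such that $f_j$ is an eigenfunction of $P$ with real eigenvalue $\lambda_j$.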

Slide33

Lemma 12.2 – proof cont.

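Now, let’s consider the following function:

$$\delta_y(x) = \begin{cases} 1 & \text{if } x = y \\ 0 & \text{if } x \ne y \end{cases}$$

$\delta_y$ can be written as a vector in $(\mathbb{R}^\Omega, \langle\cdot,\cdot\rangle_\pi)$, via basis decomposition with $\{f_j\}$:

$$\delta_y = \sum_{j=1}^{|\Omega|} \langle \delta_y, f_j \rangle_\pi\, f_j = \sum_{j=1}^{|\Omega|} f_j(y)\, \pi(y)\, f_j$$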

Slide34

Lemma 12.2 – proof cont.

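$P^t(x,y) = (P^t \delta_y)(x)$, and $P^t f_j = \lambda_j^t f_j$. So:

$$P^t(x,y) = \sum_{j=1}^{|\Omega|} f_j(y)\, \pi(y)\, \lambda_j^t\, f_j(x)$$

Dividing both sides by $\pi(y)$ completes the proof.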
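The spectral representation can be checked by hand on a two-state chain, where everything is available in closed form. The sketch assumes the chain with P(0,1) = p and P(1,0) = q, whose eigenvalues are 1 and 1 - p - q; the values of p and q, and the explicit second eigenfunction, are choices made for this illustration.

```python
import math

p, q = 0.3, 0.1                      # hypothetical transition probabilities
P = [[1 - p, p], [q, 1 - q]]
pi = [q / (p + q), p / (p + q)]      # stationary distribution

eigenvalues = [1.0, 1 - p - q]
# Eigenfunctions normalized so that <f, f>_pi = 1.
eigenfunctions = [[1.0, 1.0], [math.sqrt(p / q), -math.sqrt(q / p)]]

def inner_pi(f, g):
    """Inner product weighted by pi: sum_x f(x) g(x) pi(x)."""
    return sum(a * b * w for a, b, w in zip(f, g, pi))

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_pow(A, t):
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(t):
        R = mat_mul(R, A)
    return R

def spectral_entry(t, x, y):
    """pi(y) * sum_j f_j(x) f_j(y) lambda_j^t."""
    return pi[y] * sum(f[x] * f[y] * lam ** t
                       for f, lam in zip(eigenfunctions, eigenvalues))

t = 5
Pt = mat_pow(P, t)
max_error = max(abs(Pt[x][y] - spectral_entry(t, x, y))
                for x in range(2) for y in range(2))
print(max_error)
```

The printed error should be at the level of floating-point rounding, confirming that the matrix power and the eigenfunction expansion agree entry by entry.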

Slide35

Absolute spectral gap

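For a reversible transition matrix, we label the eigenvalues of $P$ in decreasing order:

$$1 = \lambda_1 > \lambda_2 \ge \dots \ge \lambda_{|\Omega|} \ge -1$$

Definition: $\lambda_* := \max\{|\lambda| : \lambda \text{ is an eigenvalue of } P,\ \lambda \ne 1\}$

Definition: absolute spectral gap ($\gamma_*$): $\gamma_* := 1 - \lambda_*$

From Lemma 12.1 we get that if $P$ is aperiodic and irreducible, then $-1$ is not an eigenvalue of $P$, so $\gamma_* > 0$.

Side note: the spectral gap of a reversible chain is defined by $\gamma := 1 - \lambda_2$. If the chain is lazy, $\gamma_* = \gamma$.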

Slide36

Relaxation time

Definition: relaxation time ($t_{rel}$): $t_{rel} := \frac{1}{\gamma_*}$

We will now bound a reversible chain’s mixing time in terms of its relaxation time.

Slide37

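Theorem 12.3 (upper bound)

Let $P$ be the transition matrix of a reversible, irreducible Markov chain with state space $\Omega$, and let $\pi_{min} := \min_{x \in \Omega} \pi(x)$. Then:

$$t_{mix}(\varepsilon) \le \log\left(\frac{1}{\varepsilon\, \pi_{min}}\right) t_{rel}$$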

Slide38

Theorem 12.3- proof

Slide39

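Theorem 12.4 (lower bound)

For a reversible, irreducible and aperiodic Markov chain:

$$t_{mix}(\varepsilon) \ge (t_{rel} - 1) \log\left(\frac{1}{2\varepsilon}\right)$$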

Slide40

Theorem 12.4- proof

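Suppose $f$ is an eigenfunction of $P$ with eigenvalue $\lambda \ne 1$.

We’ve seen that eigenfunctions are orthogonal with respect to $\langle\cdot,\cdot\rangle_\pi$, and that $\mathbf{1}$ (vector/function) is an eigenfunction.

So we have $\langle \mathbf{1}, f \rangle_\pi = \sum_{y \in \Omega} f(y)\, \pi(y) = 0$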

Slide41

Theorem 12.4- proof cont.

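Since $\sum_{y} f(y)\pi(y) = 0$, for every $x$:

$$|\lambda^t f(x)| = |P^t f(x)| = \left| \sum_{y \in \Omega} \left[ P^t(x,y) f(y) - \pi(y) f(y) \right] \right| \le \|f\|_\infty\, 2 d(t)$$

So, taking $x$ such that $|f(x)| = \|f\|_\infty$, we get $|\lambda|^t \le 2 d(t)$.

Using $t = t_{mix}(\varepsilon)$: $|\lambda|^{t_{mix}(\varepsilon)} \le 2\varepsilon$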

Slide42

Theorem 12.4- proof cont.

Slide43

Theorem 12.4- proof cont.

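From $|\lambda|^{t_{mix}(\varepsilon)} \le 2\varepsilon$ and $\log\frac{1}{|\lambda|} \le \frac{1}{|\lambda|} - 1$:

$$t_{mix}(\varepsilon) \ge \log\left(\frac{1}{2\varepsilon}\right) \frac{1}{\frac{1}{|\lambda|} - 1} = \log\left(\frac{1}{2\varepsilon}\right) \frac{|\lambda|}{1 - |\lambda|} = \log\left(\frac{1}{2\varepsilon}\right) \left( \frac{1}{1 - |\lambda|} - 1 \right)$$

Maximizing over eigenvalues different from 1, we get:

$$t_{mix}(\varepsilon) \ge (t_{rel} - 1) \log\left(\frac{1}{2\varepsilon}\right)$$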
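To see both relaxation-time bounds at work, the sketch below compares them with a brute-force mixing time for the simple random walk on the 7-cycle. It assumes the known eigenvalues cos(2πj/n) of that walk, takes the upper bound in the form log(1/(ε π_min)) · t_rel and the lower bound as (t_rel - 1) · log(1/(2ε)), and uses odd n so that the chain is aperiodic. All function names are hypothetical.

```python
import math

def cycle_walk(n):
    """Simple random walk on the n-cycle (n >= 3): +1 or -1 with probability 1/2."""
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        P[x][(x + 1) % n] = 0.5
        P[x][(x - 1) % n] = 0.5
    return P

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mixing_time(P, pi, eps):
    """Smallest t >= 1 whose worst-case TV distance to pi is at most eps."""
    Pt, t = P, 1
    while max(0.5 * sum(abs(a - b) for a, b in zip(row, pi)) for row in Pt) > eps:
        Pt, t = mat_mul(Pt, P), t + 1
    return t

n, eps = 7, 0.25
P = cycle_walk(n)
pi = [1 / n] * n

# Assumed closed-form eigenvalues of the cycle walk: cos(2 pi j / n).
lam_star = max(abs(math.cos(2 * math.pi * j / n)) for j in range(1, n))
t_rel = 1 / (1 - lam_star)

lower = (t_rel - 1) * math.log(1 / (2 * eps))
upper = math.log(1 / (eps * min(pi))) * t_rel
t_mix = mixing_time(P, pi, eps)
print(lower, t_mix, upper)
```

For even n the same walk is periodic, the absolute spectral gap vanishes and the loop in `mixing_time` would not terminate, so the example is restricted to odd n.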

Slide44

Mixing time bound using eigenvalues example

We’ve seen random walks on the n-cycle, and random walks on groups.

A random walk on the n-cycle can be viewed as a random walk on an n-element cyclic group.

We will now use this interpretation to find the eigenvalues and eigenfunctions of that chain.

Slide45

Mixing time bound using eigenvalues example

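Random walk on the cycle of n-th roots of unity

Let $\omega = e^{2\pi i/n}$. Then $W_n := \{\omega, \omega^2, \dots, \omega^{n-1}, 1\}$ are the n-th roots of unity.

Since $\omega^n = 1$, we have $\omega^j \omega^k = \omega^{j+k} = \omega^{(j+k) \bmod n}$.

Hence, $W_n$ is a cyclic group of order $n$, generated by $\omega$.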

Slide46

Random walk on the cycle of n-th roots of unity

Now, we will consider the random walk on the n-cycle as the random walk on the multiplicative group $W_n$.

Our increment distribution will be the uniform distribution over $\{\omega, \omega^{-1}\}$.

As usual, let $P$ denote the transition matrix for the walk.

Slide47

Random walk on the cycle of n-th roots of unity

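Now, let’s examine the eigenvalues of $P$.

If $f$ is an eigenfunction of $P$ with eigenvalue $\lambda$, then for $0 \le k \le n-1$:

$$\lambda f(\omega^k) = P f(\omega^k) = \frac{f(\omega^{k-1}) + f(\omega^{k+1})}{2}$$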

Slide48

Random walk on the cycle of n-th roots of unity

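Let’s look at $\varphi_j$, for $0 \le j \le n-1$, when we define $\varphi_j(\omega^k) := \omega^{kj}$, i.e. $\varphi_j(\omega^k) = e^{2\pi i jk/n}$ for $k = 0, 1, \dots, n-1$.

$\varphi_j$ is an eigenfunction of $P$:

$$P\varphi_j(\omega^k) = \frac{\varphi_j(\omega^{k-1}) + \varphi_j(\omega^{k+1})}{2} = \frac{\omega^{jk-j} + \omega^{jk+j}}{2} = \omega^{jk} \left( \frac{\omega^j + \omega^{-j}}{2} \right)$$

The eigenvalue of $\varphi_j$ is $\frac{\omega^j + \omega^{-j}}{2} = \cos(2\pi j/n)$.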

Slide49

Random walk on the cycle of n-th roots of unity

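What is the geometrical meaning of this?

For any $\ell, j$, the average of $\omega^{\ell-j}$ and $\omega^{\ell+j}$ is a scalar multiple of $\omega^\ell$.

Since the chord connecting $\omega^{\ell+j}$ and $\omega^{\ell-j}$ is perpendicular to $\omega^\ell$, the projection of $\omega^{\ell+j}$ onto $\omega^\ell$ has length $\cos(2\pi j/n)$.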

Slide50

Random walk on the cycle of n-th roots of unity - note

Because $\varphi_j$ is an eigenfunction of the real matrix $P$ with a real eigenvalue, both its real part and its imaginary part are eigenfunctions.

In particular, the function $f_j : W_n \to \mathbb{R}$, defined by

$$f_j(\omega^k) := \mathrm{Re}(\varphi_j(\omega^k)) = \cos\left(\frac{2\pi jk}{n}\right)$$

is an eigenfunction.

We note for future reference that $f_j$ is invariant under complex conjugation of the states of the chain.

Slide51

Random walk on the cycle of n-th roots of unity – spectral gap & relaxation time

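$\varphi_1(\omega^k) := \omega^k$ is an eigenfunction with eigenvalue $\lambda_2 = \cos(2\pi/n)$.

We have $\cos(x) = 1 - \frac{x^2}{2} + O(x^4)$, so $\lambda_2 = 1 - \frac{2\pi^2}{n^2} + O(n^{-4})$.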

Slide52

Random walk on the cycle of n-th roots of unity – spectral gap & relaxation time

So, the spectral gap $\gamma$ is of order $n^{-2}$.

The relaxation time ($t_{rel}$) is of order $n^2$.

Note that when $n$ is even, the chain is periodic: $-1$ is an eigenvalue, and so $\gamma_* = 0$.
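The eigenfunction claim and the order of the spectral gap can be checked numerically. This sketch assumes the walk on the n-th roots of unity that multiplies by ω or ω⁻¹ with probability 1/2 each, the candidate eigenfunctions φ_j(ω^k) = ω^(jk), and eigenvalues cos(2πj/n); the function names are hypothetical.

```python
import cmath
import math

def eigen_residual(n, j):
    """Largest |P phi_j(w^k) - cos(2 pi j / n) * phi_j(w^k)| over all k."""
    omega = cmath.rect(1.0, 2 * math.pi / n)      # primitive n-th root of unity
    phi = [omega ** (j * k) for k in range(n)]    # phi_j(w^k) = w^(jk)
    lam = math.cos(2 * math.pi * j / n)
    return max(abs((phi[(k - 1) % n] + phi[(k + 1) % n]) / 2 - lam * phi[k])
               for k in range(n))

def spectral_gap(n):
    """gamma = 1 - lambda_2, with lambda_2 = cos(2 pi / n)."""
    return 1 - math.cos(2 * math.pi / n)

def absolute_spectral_gap(n):
    """gamma_* = 1 - max |lambda| over eigenvalues other than lambda_1 = 1."""
    lam_star = max(abs(math.cos(2 * math.pi * j / n)) for j in range(1, n))
    return 1 - lam_star

n = 12
worst_residual = max(eigen_residual(n, j) for j in range(n))

# gamma * n^2 should approach 2 * pi^2 as n grows.
ratio = spectral_gap(1000) * 1000 ** 2 / (2 * math.pi ** 2)

print(worst_residual, ratio, absolute_spectral_gap(8), absolute_spectral_gap(9))
```

The ratio close to 1 reflects a gap of order n⁻², while the vanishing absolute gap for n = 8 corresponds to the eigenvalue -1 of the periodic even cycle.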

Slide53

THE END

Questions?

Slide54

Thank you!

Slide55

Post credits scene: Ergodic theorem

Slide56

Ergodic Theorem

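Idea: “time averages equal space averages”

Define: $E_\pi(f) := \sum_{x \in \Omega} f(x)\, \pi(x)$

Ergodic Theorem:

Let $f$ be a real-valued function defined on $\Omega$. If $(X_t)$ is an irreducible Markov chain, then for any starting distribution $\mu$:

$$\mathbf{P}_\mu\left\{ \lim_{t \to \infty} \frac{1}{t} \sum_{s=0}^{t-1} f(X_s) = E_\pi(f) \right\} = 1$$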
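A small simulation illustrates the statement. The sketch runs a hypothetical two-state chain from each starting state and compares the time average of a function f along the path with its average under the stationary distribution; the chain, f, the seed and the run length are all choices made for this example, and a finite run only approximates the limit.

```python
import random

def time_average(P, f, x0, steps, rng):
    """Average of f along one simulated trajectory of length `steps`."""
    states = range(len(P))
    x, total = x0, 0.0
    for _ in range(steps):
        total += f[x]
        x = rng.choices(states, weights=P[x])[0]   # sample the next state
    return total / steps

# Hypothetical irreducible two-state chain with stationary distribution (1/4, 3/4).
P = [[0.7, 0.3], [0.1, 0.9]]
pi = [0.25, 0.75]
f = [1.0, 5.0]

space_average = sum(fx * px for fx, px in zip(f, pi))   # E_pi(f)

rng = random.Random(0)   # fixed seed so the run is reproducible
avg_from_0 = time_average(P, f, 0, 200_000, rng)
avg_from_1 = time_average(P, f, 1, 200_000, rng)
print(space_average, avg_from_0, avg_from_1)
```

Both time averages should land close to the space average regardless of the starting state, in line with the theorem's "for any starting distribution".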

Slide57

Ergodic Theorem - Proof

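Suppose the chain starts at state $x$.

Define: $\tau^+_{x,0} := 0$ and $\tau^+_{x,k} := \min\{t > \tau^+_{x,k-1} : X_t = x\}$

Since the chain starts afresh every time it visits $x$, the blocks $X_{\tau^+_{x,k}}, \dots, X_{\tau^+_{x,k+1} - 1}$, for $k = 0, 1, \dots$, are independent of one another.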

Slide58

Ergodic Theorem – Proof

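Strong Law of Large Numbers:

Let $Z_1, Z_2, \dots$ be a sequence of i.i.d. random variables with $E|Z_1| < \infty$. Then

$$\mathbf{P}\left\{ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} Z_k = E(Z_1) \right\} = 1$$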

Slide59

Ergodic Theorem - Proof

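Define: $Y_k := \sum_{s = \tau^+_{x,k-1}}^{\tau^+_{x,k} - 1} f(X_s)$ and $S_t := \sum_{s=0}^{t-1} f(X_s)$

So, $(Y_k)$ is an i.i.d. sequence, and $S_{\tau^+_{x,n}} = \sum_{k=1}^{n} Y_k$.

From the Strong Law of Large Numbers:

$$\mathbf{P}_x\left\{ \lim_{n \to \infty} \frac{S_{\tau^+_{x,n}}}{n} = E_x(Y_1) \right\} = 1$$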

Slide60

Ergodic Theorem – Proof

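Since $\tau^+_{x,n} = \sum_{k=1}^{n} (\tau^+_{x,k} - \tau^+_{x,k-1})$, again by the Strong Law of Large Numbers (writing $\tau^+_x$ for $\tau^+_{x,1}$):

$$\mathbf{P}_x\left\{ \lim_{n \to \infty} \frac{\tau^+_{x,n}}{n} = E_x(\tau^+_x) \right\} = 1$$

thus

$$\mathbf{P}_x\left\{ \lim_{n \to \infty} \frac{S_{\tau^+_{x,n}}}{\tau^+_{x,n}} = \frac{E_x(Y_1)}{E_x(\tau^+_x)} \right\} = 1$$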

Slide61

Ergodic Theorem – Proof

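Note:

$$E_x(Y_1) = E_x\left( \sum_{s=0}^{\tau^+_x - 1} f(X_s) \right) = \sum_{y \in \Omega} f(y)\, E_x\left( \sum_{s=0}^{\tau^+_x - 1} \mathbf{1}_{\{X_s = y\}} \right)$$

(Using 1.25) => $E_x(Y_1) = E_\pi(f)\, E_x(\tau^+_x)$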

Slide62

Ergodic Theorem – Proof

So, …

Slide63

Ergodic Theorem – Proof

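Let $(a_n)$ be a bounded sequence. If, for a sequence of integers $(n_k)$ satisfying $\lim_{k \to \infty} n_k / n_{k+1} = 1$, we have

$$\lim_{k \to \infty} \frac{a_1 + \dots + a_{n_k}}{n_k} = a$$

then

$$\lim_{n \to \infty} \frac{a_1 + \dots + a_n}{n} = a$$

This shows that the theorem holds when we take $\mu = \delta_x$ (initial state is $x$).

Averaging over the starting state completes the proof.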

Slide64