Presentation Transcript

Slide 1: Introduction to Markov chains (part 2)

Haim Kaplan and Uri Zwick
Algorithms in Action
Tel Aviv University
Last updated: April 18, 2016

Slide 2: Reversible Markov chain

A distribution $\pi$ is reversible for a Markov chain with transition matrix $P$ if $\pi(x)\,P(x,y) = \pi(y)\,P(y,x)$ for all states $x, y$ (detailed balance).

A Markov chain is reversible if it has a reversible distribution.

Lemma: A reversible distribution is a stationary distribution.

Proof:

Slide 3: Reversible Markov chain

Summing the detailed balance condition over all states $x$:
$\sum_x \pi(x)\,P(x,y) = \sum_x \pi(y)\,P(y,x) = \pi(y) \sum_x P(y,x) = \pi(y)$,
so $\pi P = \pi$, i.e., $\pi$ is a stationary distribution.
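To make the lemma concrete, here is a minimal numerical check (my own sketch, not part of the slides): a hand-built 3-state chain that satisfies detailed balance with respect to a distribution $\pi$, for which we confirm that $\pi P = \pi$.

```python
import numpy as np

# A made-up 3-state chain on a path 0-1-2 with pi proportional to (1, 2, 3).
pi = np.array([1.0, 2.0, 3.0])
pi /= pi.sum()

P = np.array([
    [1/2, 1/2, 0.0],   # state 0: move to 1 w.p. 1/2, stay w.p. 1/2
    [1/4, 1/4, 1/2],   # state 1: to 0 w.p. (1/2)*(pi0/pi1) = 1/4, to 2 w.p. 1/2
    [0.0, 1/3, 2/3],   # state 2: to 1 w.p. (1/2)*(pi1/pi2) = 1/3, stay w.p. 2/3
])

# Detailed balance: pi(x) P(x, y) == pi(y) P(y, x) for all x, y.
flows = pi[:, None] * P
assert np.allclose(flows, flows.T)

# Stationarity follows, as the lemma promises: pi P == pi.
assert np.allclose(pi @ P, pi)
print("detailed balance holds and pi is stationary:", pi)
```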

Slide 4: Symmetric Markov chain

A Markov chain is symmetric if $P(x,y) = P(y,x)$ for all states $x, y$.

What is the stationary distribution of an irreducible symmetric Markov chain?

Slide 5: Example: Random walk on a graph

Given a connected undirected graph $G = (V, E)$, define a Markov chain whose states are the vertices of the graph. We move from a vertex $v$ to each one of its neighbors with equal probability $1/\deg(v)$.

Consider the distribution $\pi(v) = \deg(v) / 2|E|$.

Slide 6: Example: Random walk on a graph

For $\pi(v) = \deg(v) / 2|E|$ and every edge $\{u, v\}$ of the graph,
$\pi(u)\,P(u,v) = \frac{\deg(u)}{2|E|} \cdot \frac{1}{\deg(u)} = \frac{1}{2|E|} = \pi(v)\,P(v,u)$,
so $\pi$ satisfies detailed balance and is therefore the stationary distribution of the walk.

Where do we use the fact that the graph is undirected?
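As an illustration (again a sketch of mine, on an arbitrary small graph), the following code builds the random-walk transition matrix and checks that $\pi(v) = \deg(v)/2|E|$ satisfies detailed balance and is therefore stationary.

```python
import numpy as np

# A small arbitrary undirected graph with no isolated vertices, as an edge list.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

deg = A.sum(axis=1)
P = A / deg[:, None]           # from v, move to each neighbor w.p. 1/deg(v)
pi = deg / (2 * len(edges))    # the claimed stationary distribution deg(v)/2|E|

# Detailed balance: pi(u) P(u, v) = 1/2|E| = pi(v) P(v, u) for every edge {u, v}.
flows = pi[:, None] * P
assert np.allclose(flows, flows.T)
assert np.allclose(pi @ P, pi)
print("pi =", pi)
```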

Slide 7: Reversible Markov chain

If $X_0$ is drawn from a reversible distribution $\pi$, then for every sequence of states $x_0, x_1, \ldots, x_k$,
$\Pr[X_0 = x_0, X_1 = x_1, \ldots, X_k = x_k] = \Pr[X_0 = x_k, X_1 = x_{k-1}, \ldots, X_k = x_0]$,
i.e., the chain looks the same whether it is run forward or backward in time.

Prove as an exercise.

Slide 8: Another major application of Markov chains

Slide 9: Sampling from large spaces

Given a distribution $\pi$ on a set $\Omega$, we want to draw an object from $\Omega$ with the distribution $\pi$.

Say we want to estimate the average size of an independent set in a graph.

Suppose we could draw an independent set uniformly at random.

Then we can draw multiple times and use the average size of the independent sets we drew as an estimate.

This is also useful for approximate counting.

Slide 10: Markov chain Monte Carlo

Given a distribution $\pi$ on a set $\Omega$, we want to draw an object from $\Omega$ with the distribution $\pi$.

Build a Markov chain over $\Omega$ whose stationary distribution is $\pi$.

Run the chain for a sufficiently long time (until it mixes) from some starting position.

Your position is then a random draw from a distribution close to $\pi$.
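This recipe can be written as a small generic loop. The sketch below uses my own function names and assumes the caller supplies a transition function `step` whose stationary distribution is the target $\pi$; it does not attempt to verify mixing.

```python
import random

def mcmc_sample(initial_state, step, burn_in=10_000, rng=None):
    """Run a Markov chain for `burn_in` transitions and return the final state.

    `step(state, rng) -> new_state` performs one transition of a chain whose
    stationary distribution is the target pi.  The returned state is an
    approximate draw from pi, provided `burn_in` exceeds the mixing time.
    """
    rng = rng or random.Random()
    state = initial_state
    for _ in range(burn_in):
        state = step(state, rng)
    return state
```

The independent-set, coloring, and cut chains sketched further below could all be passed as `step`.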

Slide 11: Independent sets

Say we are given a graph $G = (V, E)$ and we want to sample an independent set of $G$ uniformly at random.

Slide 12: Independent sets

Transitions: At the current independent set $I$, pick a vertex $v$ uniformly at random and flip a fair coin.
Heads: switch to $I \cup \{v\}$ if $I \cup \{v\}$ is an independent set (otherwise stay at $I$).
Tails: switch to $I \setminus \{v\}$.

This chain is irreducible and aperiodic (why?)

Slide 13: Independent sets

Transitions: At the current independent set $I$, pick a vertex $v$ uniformly at random and flip a fair coin.
Heads: switch to $I \cup \{v\}$ if $I \cup \{v\}$ is an independent set (otherwise stay at $I$).
Tails: switch to $I \setminus \{v\}$.

What is the stationary distribution?
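One possible implementation of this chain (a sketch with my own naming, not code from the lecture): the graph is an adjacency mapping, a state is a set of vertices, and a transition follows the coin-flip rule above. The estimation routine also illustrates the idea from slide 9, although a careful estimate would discard a burn-in prefix and use more widely spaced samples.

```python
import random

def independent_set_step(graph, current, rng):
    """One transition: pick a vertex uniformly at random and flip a fair coin.
    Heads: add it if the result is still an independent set; tails: remove it.
    `graph` maps each vertex to the set of its neighbors; `current` is a frozenset."""
    v = rng.choice(list(graph))
    if rng.random() < 0.5:                  # heads: try to add v
        if not (graph[v] & current):        # no neighbor of v is currently in the set
            return current | {v}
        return current
    return current - {v}                    # tails: remove v (a no-op if v is absent)

def estimate_average_size(graph, steps=100_000, seed=0):
    """Walk the chain and average the sizes of the visited independent sets."""
    rng = random.Random(seed)
    state = frozenset()
    total = 0
    for _ in range(steps):
        state = independent_set_step(graph, state, rng)
        total += len(state)
    return total / steps

# Example: a 4-cycle; the true average independent-set size is 8/7.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(estimate_average_size(cycle4))
```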

Slide 14: Independent sets

So if we walk on this chain for a sufficiently long time, we obtain an independent set that is almost uniformly distributed. Let's generalize this.

Slide 15: Gibbs samplers

We have a distribution $\pi$ over functions (labelings) $\sigma : V \to \{1, \ldots, q\}$. There are $q^{|V|}$ such $\sigma$'s (states).

We want to sample from $\pi$.

Slide 16: Gibbs samplers

Chain: At state $\sigma$, pick a vertex $v$ uniformly at random. There are $q$ states $\sigma_1, \ldots, \sigma_q$ in which $\sigma$ is kept fixed outside $v$ ($\sigma_j$ is $\sigma$ with $j$ assigned to $v$). Pick $\sigma_j$ with probability $\pi(\sigma_j) / \sum_{i=1}^{q} \pi(\sigma_i)$.

Slide 17: Gibbs samplers

Claim: This chain is reversible with respect to $\pi$. Need to verify detailed balance: $\pi(\sigma)\,P(\sigma,\tau) = \pi(\tau)\,P(\tau,\sigma)$.

$P(\sigma,\tau) > 0$ iff $\sigma$ and $\tau$ agree on all vertices except possibly one. Otherwise $P(\sigma,\tau) = 0$ and $P(\tau,\sigma) = 0$, so detailed balance holds trivially.

When $\sigma$ and $\tau$ agree outside a single vertex $v$, we need to verify that:

Slide 18: Gibbs samplers

$\pi(\sigma) \cdot \frac{1}{n} \cdot \frac{\pi(\tau)}{\sum_{i=1}^{q} \pi(\sigma_i)} = \pi(\tau) \cdot \frac{1}{n} \cdot \frac{\pi(\sigma)}{\sum_{i=1}^{q} \pi(\sigma_i)}$

(here $1/n$ is the probability of picking the vertex $v$, and the $\sigma_i$ are the $q$ states that agree with $\sigma$ and $\tau$ outside $v$). Both sides are indeed equal. It is easy to check that the chain is aperiodic, so if it is also irreducible then we can use it for sampling.

Slide 19: Gibbs for uniform q-coloring

Transitions: Pick a vertex $v$ uniformly at random, then pick a (new) color for $v$ uniformly at random from the set of colors not attained by a neighbor of $v$.
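A sketch of this transition in code (my own naming). It assumes the current coloring is proper and that the number of colors $q$ exceeds the maximum degree, so a picked vertex always has at least one available color.

```python
import random

def gibbs_coloring_step(graph, coloring, q, rng):
    """One Gibbs update for the uniform distribution over proper q-colorings.
    `graph` maps each vertex to its set of neighbors; `coloring` maps vertex -> color.
    Pick a vertex uniformly at random and recolor it uniformly at random
    among the colors not attained by any of its neighbors."""
    v = rng.choice(list(graph))
    forbidden = {coloring[u] for u in graph[v]}
    allowed = [c for c in range(q) if c not in forbidden]
    new_coloring = dict(coloring)
    new_coloring[v] = rng.choice(allowed)   # nonempty whenever q > max degree
    return new_coloring

# Example usage on a triangle with q = 4 colors.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
rng = random.Random(0)
coloring = {0: 0, 1: 1, 2: 2}
for _ in range(100):
    coloring = gibbs_coloring_step(triangle, coloring, q=4, rng=rng)
print(coloring)
```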

Slide 20: Gibbs for uniform q-coloring

Notice that $\pi(\sigma)$ itself (uniform over all proper $q$-colorings) is hard to compute, but the conditional distribution of the color of a single vertex given all the other colors is easy: it is uniform over the colors not attained by a neighbor.

Slide 21: Gibbs samplers (summary)

Notice that even if $\pi(\sigma)$ may be hard to compute, it is typically easy to compute the conditional probabilities $\pi(\sigma_j) / \sum_{i=1}^{q} \pi(\sigma_i)$.

Chain: At state $\sigma$, pick a vertex $v$ uniformly at random. There are $q$ states $\sigma_1, \ldots, \sigma_q$ consistent with $\sigma$ outside $v$ ($\sigma_j$ is $\sigma$ with $j$ assigned to $v$). Pick $\sigma_j$ with probability $\pi(\sigma_j) / \sum_{i=1}^{q} \pi(\sigma_i)$; call this distribution the conditional distribution of the label at $v$.

Slide 22: Metropolis chain

We want to construct a chain over a state space $\Omega$ with a given stationary distribution $\pi$.

Here the states do not necessarily correspond to labelings of the vertices of a graph.

Slide 23: Metropolis chain

Start with some base chain $\Psi$ over $\Omega$. Say $\Psi(x, y) = \Psi(y, x)$ (symmetric).

We need that the ratio $\pi(y)/\pi(x)$ is easy to compute when we are at $x$.

Slide 24: Metropolis chain

We now modify the chain and obtain a Metropolis chain. At $x$:
1) Suggest a neighbor $y$ with probability $\Psi(x, y)$.
2) Move to $y$ with probability $\min\{1, \pi(y)/\pi(x)\}$ (otherwise stay at $x$).
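These two steps translate directly into code. The sketch below (my own naming) takes an unnormalized weight function `f` proportional to $\pi$, since only the ratio $\pi(y)/\pi(x)$ is needed; this also anticipates the point made on slide 29.

```python
import random

def metropolis_step(state, propose, f, rng):
    """One Metropolis move with a symmetric proposal chain.
    `propose(state, rng)` suggests a neighbor y of the current state x, and
    `f` is any function proportional to the target distribution pi.
    The move is accepted with probability min(1, f(y)/f(x)); otherwise we stay."""
    candidate = propose(state, rng)
    if rng.random() < min(1.0, f(candidate) / f(state)):
        return candidate
    return state

# Example: target pi(x) proportional to x + 1 on {0, ..., 9},
# with a symmetric +/-1 proposal (clamped at the boundary).
def propose(x, rng):
    return max(0, min(9, x + rng.choice([-1, 1])))

rng = random.Random(0)
state = 0
for _ in range(10_000):
    state = metropolis_step(state, propose, lambda x: x + 1, rng)
print(state)
```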

Slide 25: Metropolis chain

The resulting transition probabilities are
$P(x, y) = \Psi(x, y) \min\{1, \pi(y)/\pi(x)\}$ for $y \neq x$, and $P(x, x) = 1 - \sum_{y \neq x} P(x, y)$.

Slide 26

At $x$:
1) Suggest a neighbor $y$ with probability $\Psi(x, y)$.
2) Move to $y$ with probability $\min\{1, \pi(y)/\pi(x)\}$ (otherwise stay at $x$).

A more general presentation: $\Psi$ is not symmetric, but $\Psi(x, y) > 0$ iff $\Psi(y, x) > 0$.

The Metropolis chain with respect to $\Psi$ and $\pi$:

Slide 27: A more general presentation

At $x$: suggest $y$ with probability $\Psi(x, y)$ and move to it with probability $\min\left\{1, \frac{\pi(y)\,\Psi(y,x)}{\pi(x)\,\Psi(x,y)}\right\}$ (otherwise stay at $x$).

Slide 28: Detailed balance conditions

We verify that $\pi(x)\,P(x,y) = \pi(y)\,P(y,x)$ for $y \neq x$. Assume $\pi(y)\,\Psi(y,x) \le \pi(x)\,\Psi(x,y)$. Then
$\pi(x)\,P(x,y) = \pi(x)\,\Psi(x,y) \cdot \frac{\pi(y)\,\Psi(y,x)}{\pi(x)\,\Psi(x,y)} = \pi(y)\,\Psi(y,x) = \pi(y)\,P(y,x)$.

The other case is symmetric.

Slide 29: Metropolis/Gibbs

Often $\pi(x) = f(x)/Z$ where the normalizing constant $Z = \sum_{x} f(x)$ is hard to compute. Since the Gibbs and Metropolis transition probabilities depend only on ratios of $\pi$ values, it is then still possible to compute them.

Slide 30: Metropolis chain for bisection

Slide 31: Metropolis chain for bisection

The quality of a cut is measured by $c(S)$, the number of edges crossing the bisection $(S, \bar{S})$.

We introduce a parameter $T$ (the temperature) and take the exponent of this quality measure.

Our target distribution is proportional to $e^{-c(S)/T}$.

Slide 32: Boltzmann distribution

$\pi_T(x) = \frac{e^{-c(x)/T}}{\sum_{y} e^{-c(y)/T}}$

Generate a Metropolis chain for $\pi_T$.

Slide 33: Boltzmann distribution

Note that $\frac{\pi_T(y)}{\pi_T(x)} = e^{-(c(y)-c(x))/T}$: the ratio depends only on the difference in cost, and the normalizing constant cancels.

Slide 34: The base chain

Consider the chain over the cuts in the graph in which the neighbors of a cut $(S, \bar{S})$ are the cuts we can obtain from $(S, \bar{S})$ by flipping the side of a single vertex. Each of the $n$ neighbors is suggested with probability $1/n$.

This base chain is symmetric.

Slide 35: Metropolis chain for bisection

At $x$:
1) Suggest a neighbor $y$ with probability $1/n$.
2) Move to $y$ with probability $\min\{1, e^{-(c(y)-c(x))/T}\}$ (otherwise stay at $x$).

Slide 36: Properties of the Boltzmann distribution

Let $M$ be the set of global minima of $c$, and let $c^*$ be the minimum cost. Dividing the numerator and denominator by $e^{-c^*/T}$,
$\pi_T(x) = \frac{e^{-(c(x)-c^*)/T}}{|M| + \sum_{y \notin M} e^{-(c(y)-c^*)/T}}$.

Slide 37: Properties of the Boltzmann distribution

As $T \to 0$, each term $e^{-(c(y)-c^*)/T}$ with $y \notin M$ tends to $0$, so $\pi_T(x) \to 1/|M|$ for $x \in M$ and $\pi_T(x) \to 0$ for $x \notin M$.

Slide 38: Properties of the Boltzmann distribution

As $T$ gets smaller, $\pi_T$ gets concentrated on the global minima.

Slide 39: Metropolis chain for bisection

At $x$:
1) Suggest a neighbor $y$ with probability $1/n$.
2) If $c(y) \le c(x)$, move to $y$; otherwise move to $y$ with probability $e^{-(c(y)-c(x))/T}$.
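A possible rendering of this chain in code (a sketch; the cost function below counts only crossing edges, which may be a simplification of the exact quality measure used in the lecture).

```python
import math
import random

def cut_cost(edges, S):
    """Cost of a cut: the number of edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))

def bisection_metropolis_step(vertices, edges, S, T, rng):
    """At cut S: flip the side of a uniformly random vertex; accept the new cut
    if it is at least as good, otherwise accept with probability exp(-delta/T)."""
    v = rng.choice(vertices)
    candidate = S ^ {v}                      # symmetric difference: flip v's side
    delta = cut_cost(edges, candidate) - cut_cost(edges, S)
    if delta <= 0 or rng.random() < math.exp(-delta / T):
        return candidate
    return S

# Example usage at a fixed temperature.
verts = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
rng = random.Random(0)
S = {0, 1}
for _ in range(1000):
    S = bisection_metropolis_step(verts, edges, S, T=1.0, rng=rng)
print(S, cut_cost(edges, S))
```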

Slide 40: Generalization of local search

This is a generalization of local search that allows non-improving moves. We take a non-improving move with a probability that decreases with the amount of degradation in the quality of the bisection.

Slide 41: Generalization of local search

As $T$ decreases, it becomes harder to take non-improving moves. For very small $T$, this is like local search; for very large $T$, this is like a random walk. So which $T$ should we use?

Slide 42: Simulated annealing

Start with a relatively large $T$, perform a number of iterations, decrease $T$, and repeat.
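Wrapping the bisection step from the previous sketch in a cooling loop gives a basic simulated-annealing procedure. This is again only a sketch: it reuses `cut_cost` and `bisection_metropolis_step` from above, and the schedule parameters (starting temperature, final temperature, cooling factor, iterations per temperature) are illustrative placeholders rather than values from the lecture.

```python
import random

def simulated_annealing(vertices, edges, T_start=10.0, T_end=0.1,
                        cooling=0.95, iters_per_T=1000, seed=0):
    """Start at a relatively large temperature, perform a batch of Metropolis
    moves, decrease T, and repeat; return the best cut seen along the way."""
    rng = random.Random(seed)
    S = set(rng.sample(vertices, len(vertices) // 2))   # arbitrary starting cut
    best, best_cost = S, cut_cost(edges, S)
    T = T_start
    while T > T_end:
        for _ in range(iters_per_T):
            S = bisection_metropolis_step(vertices, edges, S, T, rng)
            c = cut_cost(edges, S)
            if c < best_cost:
                best, best_cost = S, c
        T *= cooling                                     # geometric cooling schedule
    return best, best_cost
```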

Slide 43: Motivated by physics

Growing crystals: first we melt the raw material, then we start cooling it. We need to cool carefully/slowly in order to get a good crystal. We want to bring the crystal into a state with the lowest possible energy, and we don't want to get stuck in a local optimum.

Slide 44: Experiments with annealing

Average running times:
Annealing: 6 min
Local search: 1 sec
KL (Kernighan-Lin): 3.7 sec

Slide 45: Experiments with annealing

Slide 46: The annealing parameters

Two parameters control the range of temperatures considered: pick the initial temperature so that you accept a certain fraction of the suggested moves, and "freeze" (stop) when you accept at most a certain fraction of the moves at several consecutive temperatures since the last winner (improvement) was found.
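These two controls might be implemented along the following lines (entirely a sketch of mine; the target acceptance rate, freezing threshold, and patience counter are placeholder values, not the ones used in the experiments).

```python
def initial_temperature(accept_rate_at, target=0.8, T=1.0):
    """Double T until the empirical acceptance rate of suggested moves reaches
    `target`.  `accept_rate_at(T)` is assumed to estimate the acceptance rate,
    e.g. by running a short batch of trial moves at temperature T."""
    while accept_rate_at(T) < target:
        T *= 2.0
    return T

def frozen(accept_rates_since_last_winner, threshold=0.02, patience=5):
    """Declare the run "frozen" once the acceptance rate has stayed below
    `threshold` for `patience` consecutive temperatures since the last
    improvement (the last "winner") was found."""
    recent = accept_rates_since_last_winner[-patience:]
    return len(recent) == patience and all(r <= threshold for r in recent)
```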


Slide 48: After applying local opt to the sample

Slide 49: Tails of 2 runs

The left and right plots show the tails of two runs with different parameter settings.

Same quality for half the time!

Slide 50: Running time/quality tradeoff

Two natural parameters control this tradeoff. Doubling the first doubles the running time. Changing the second should also double the running time, but the experiment shows that it grows only by a smaller factor.


Slide 52: Simulated annealing summary

A modification of local search that allows escaping from local minima. It has many applications (the original paper has 36,316 citations):
VLSI design
Protein folding
Scheduling/assignment problems