Presentation Transcript

Slide 1

Viterbi, Forward, and Backward
Algorithms for Hidden Markov Models

Prof. Carolina Ruiz
Computer Science Department
Bioinformatics and Computational Biology Program
WPI

Slide 2

Resources used for these slides

Durbin, Eddy, Krogh, and Mitchison. "Biological Sequence Analysis". Cambridge University Press. 1998. Sections 3.1-3.3.

Prof. Moran's Algorithms in Computational Biology course (Technion Univ.):
Ydo Wexler & Dan Geiger's Markov Chain Tutorial.
Hidden Markov Models (HMMs) Tutorial.

Slide 3

HMM: Coke/Pepsi Example

Hidden states:
start: fake start state
A: the prices of Coke and Pepsi are the same
R: "Red sale": Coke is on sale (cheaper than Pepsi)
B: "Blue sale": Pepsi is on sale (cheaper than Coke)

Emissions:
C: Coke
P: Pepsi

Transition probabilities p(next state | current state), read off the state diagram:

         to A   to R   to B
start    0.6    0.1    0.3
A        0.2    0.1    0.7
R        0.1    0.1    0.8
B        0.4    0.3    0.3

Emission probabilities p(observation | state):

          C      P
A        0.6    0.4
R        0.9    0.1
B        0.5    0.5

Slide 4
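The transition and emission probabilities above can be written down compactly. A minimal sketch in Python (the dictionary encoding is my own; the numbers are the ones used in the worked examples on the following slides):

```python
# The Coke/Pepsi HMM, encoded as plain dictionaries.
# "start" is the fake start state; A, R, B are the hidden states.

# transition probabilities p(next state | current state)
TRANS = {
    "start": {"A": 0.6, "R": 0.1, "B": 0.3},
    "A":     {"A": 0.2, "R": 0.1, "B": 0.7},
    "R":     {"A": 0.1, "R": 0.1, "B": 0.8},
    "B":     {"A": 0.4, "R": 0.3, "B": 0.3},
}

# emission probabilities p(observation | state); C = Coke, P = Pepsi
EMIT = {
    "A": {"C": 0.6, "P": 0.4},
    "R": {"C": 0.9, "P": 0.1},
    "B": {"C": 0.5, "P": 0.5},
}

# sanity check: every row is a probability distribution
for row in list(TRANS.values()) + list(EMIT.values()):
    assert abs(sum(row.values()) - 1.0) < 1e-9
```

Note that each row of the transition table sums to 1, as it must for a well-formed HMM.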

1. Finding the most likely trajectory

Given an HMM and a sequence of observables x1,x2,…,xL, determine the most likely sequence of states that generated x1,x2,…,xL:

S* = (s*1,s*2,…,s*L)
   = argmax over s1,s2,…,sL of p(s1,s2,…,sL | x1,x2,…,xL)
   = argmax over s1,s2,…,sL of p(s1,s2,…,sL ; x1,x2,…,xL) / p(x1,x2,…,xL)
   = argmax over s1,s2,…,sL of p(s1,s2,…,sL ; x1,x2,…,xL)

(the last step holds because the denominator p(x1,x2,…,xL) does not depend on the choice of states)

Slide 5

= argmax over s1,s2,…,sL of p(s1,s2,…,sL ; x1,x2,…,xL)
= argmax over s1,s2,…,sL of p(s1,s2,…,sL-1 ; x1,x2,…,xL-1) p(sL|sL-1) p(xL|sL)

This inspires a recursive formulation of S*. Viterbi's idea: this can be calculated using dynamic programming. Let

v(k,t) = max p(s1,…,st = k ; x1,…,xt)

that is, the probability of a most probable path up to time t that ends on state k. By the above derivation:

v(k,t) = max p(s1,…,st-1 ; x1,…,xt-1) p(st=k|st-1) p(xt|st=k)
       = max over j of v(j,t-1) p(st=k|st-1=j) p(xt|st=k)
       = p(xt|st=k) max over j of v(j,t-1) p(st=k|st-1=j)

Slide 6

Viterbi’s Algorithm - Example

Given: the Coke/Pepsi HMM, and the sequence of observations CPC.
Find the most likely path S* = (s*1,s*2,s*3) that generated x1,x2,x3 = CPC.

Initialization:

v       t=0   x1 = C   x2 = P   x3 = C
start   1     0        0        0
A       0
R       0
B       0

Slide 7

Viterbi’s Algorithm - Example

v       t=0   x1 = C                 x2 = P   x3 = C
start   1     0                      0        0
A       0     0.36 (parent: start)
R       0     0.09 (parent: start)
B       0     0.15 (parent: start)

v(k,1) = p(x1|s1=k) max over j of v(j,0) p(s1=k|j):

A: p(C|A) max {v(start,0)p(A|start), 0, 0, 0} = p(C|A) v(start,0) p(A|start) = 0.6*1*0.6 = 0.36; parent: start
R: p(C|R) max {v(start,0)p(R|start), 0, 0, 0} = 0.9*1*0.1 = 0.09; parent: start
B: p(C|B) max {v(start,0)p(B|start), 0, 0, 0} = 0.5*1*0.3 = 0.15; parent: start

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the most likely path S* = (s*1,s*2,s*3) that generated x1,x2,x3 = CPC.

Slide 8

Viterbi’s Algorithm - Example

v       t=0   x1 = C                 x2 = P               x3 = C
start   1     0                      0                    0
A       0     0.36 (parent: start)   0.0288 (parent: A)
R       0     0.09 (parent: start)   0.0045 (parent: B)
B       0     0.15 (parent: start)   0.126 (parent: A)

v(k,2) = p(x2|s2=k) max over j of v(j,1) p(s2=k|j):

A: p(P|A) max {v(start,1)p(A|start), v(A,1)p(A|A), v(R,1)p(A|R), v(B,1)p(A|B)}
   = 0.4 * max{0, 0.36*0.2, 0.09*0.1, 0.15*0.4} = 0.4*0.072 = 0.0288; parent: A
R: p(P|R) max {v(start,1)p(R|start), v(A,1)p(R|A), v(R,1)p(R|R), v(B,1)p(R|B)}
   = 0.1 * max{0, 0.36*0.1, 0.09*0.1, 0.15*0.3} = 0.1*0.045 = 0.0045; parent: B
B: p(P|B) max {v(start,1)p(B|start), v(A,1)p(B|A), v(R,1)p(B|R), v(B,1)p(B|B)}
   = 0.5 * max{0, 0.36*0.7, 0.09*0.8, 0.15*0.3} = 0.5*0.252 = 0.126; parent: A

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the most likely path S* = (s*1,s*2,s*3) that generated x1,x2,x3 = CPC.

Slide 9

Viterbi’s Algorithm - Example

v       t=0   x1 = C                 x2 = P               x3 = C
start   1     0                      0                    0
A       0     0.36 (parent: start)   0.0288 (parent: A)   0.03024 (parent: B)
R       0     0.09 (parent: start)   0.0045 (parent: B)   0.03402 (parent: B)
B       0     0.15 (parent: start)   0.126 (parent: A)    0.0189 (parent: B)

v(k,3) = p(x3|s3=k) max over j of v(j,2) p(s3=k|j):

A: p(C|A) max {v(start,2)p(A|start), v(A,2)p(A|A), v(R,2)p(A|R), v(B,2)p(A|B)}
   = 0.6 * max{0, 0.0288*0.2, 0.0045*0.1, 0.126*0.4} = 0.6*0.0504 = 0.03024; parent: B
R: p(C|R) max {v(start,2)p(R|start), v(A,2)p(R|A), v(R,2)p(R|R), v(B,2)p(R|B)}
   = 0.9 * max{0, 0.0288*0.1, 0.0045*0.1, 0.126*0.3} = 0.9*0.0378 = 0.03402; parent: B
B: p(C|B) max {v(start,2)p(B|start), v(A,2)p(B|A), v(R,2)p(B|R), v(B,2)p(B|B)}
   = 0.5 * max{0, 0.0288*0.7, 0.0045*0.8, 0.126*0.3} = 0.5*0.0378 = 0.0189; parent: B

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the most likely path S* = (s*1,s*2,s*3) that generated x1,x2,x3 = CPC.

Slide 10

Viterbi’s Algorithm - Example

v       t=0   x1 = C                 x2 = P               x3 = C
start   1     0                      0                    0
A       0     0.36 (parent: start)   0.0288 (parent: A)   0.03024 (parent: B)
R       0     0.09 (parent: start)   0.0045 (parent: B)   0.03402 (parent: B)
B       0     0.15 (parent: start)   0.126 (parent: A)    0.0189 (parent: B)

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the most likely path S* = (s*1,s*2,s*3) that generated x1,x2,x3 = CPC.

Hence, the most likely path that generated CPC is: start A B R. This maximum-likelihood path is extracted from the table as follows:
- The last state of the path is the one with the highest value in the right-most column (here R, with v(R,3) = 0.03402).
- The previous state in the path is the one recorded as the parent of the last.
- Keep following the parent trail backwards until you arrive at start.

Slide 11
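The table filling and backtracking can be reproduced with a short script. This is a sketch of Viterbi's algorithm in Python (the function and variable names are my own; the HMM parameters are the ones from the slides):

```python
# Viterbi's algorithm on the Coke/Pepsi HMM from the slides.

STATES = ["A", "R", "B"]
TRANS = {  # p(next state | current state)
    "start": {"A": 0.6, "R": 0.1, "B": 0.3},
    "A":     {"A": 0.2, "R": 0.1, "B": 0.7},
    "R":     {"A": 0.1, "R": 0.1, "B": 0.8},
    "B":     {"A": 0.4, "R": 0.3, "B": 0.3},
}
EMIT = {   # p(observation | state)
    "A": {"C": 0.6, "P": 0.4},
    "R": {"C": 0.9, "P": 0.1},
    "B": {"C": 0.5, "P": 0.5},
}

def viterbi(obs):
    """Return (probability of the best path, most likely state sequence)."""
    # first column: v(k,1) = p(x1|k) * p(k|start)
    v = {k: EMIT[k][obs[0]] * TRANS["start"][k] for k in STATES}
    parents = [{k: "start" for k in STATES}]
    for x in obs[1:]:
        new_v, par = {}, {}
        for k in STATES:
            # v(k,t) = p(xt|k) * max_j v(j,t-1) p(k|j), remembering the argmax
            best_j = max(STATES, key=lambda j: v[j] * TRANS[j][k])
            new_v[k] = EMIT[k][x] * v[best_j] * TRANS[best_j][k]
            par[k] = best_j
        v = new_v
        parents.append(par)
    # backtrack: best state in the last column, then follow parent pointers
    last = max(STATES, key=lambda k: v[k])
    path = [last]
    for par in reversed(parents):
        path.append(par[path[-1]])
    return v[last], list(reversed(path))
```

Running `viterbi("CPC")` reproduces the table: the best final value is 0.03402 in state R, and backtracking yields start, A, B, R.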

2. Calculating the probability of a sequence of observations

Given an HMM and a sequence of observations x1,x2,…,xL, determine p(x1,x2,…,xL):

p(x1,x2,…,xL) = Σ over s1,s2,…,sL of p(s1,s2,…,sL ; x1,x2,…,xL)
              = Σ over s1,s2,…,sL of p(s1,s2,…,sL-1 ; x1,x2,…,xL-1) p(sL|sL-1) p(xL|sL)

Slide 12

Let f(k,t) = p(st = k ; x1,…,xt)

that is, the probability of x1,…,xt requiring st = k. In other words, the sum of the probabilities of all the paths that emit (x1,…,xt) and end in state st = k.

f(k,t) = p(st = k ; x1,…,xt-1, xt)
       = Σ over j of p(st-1=j ; x1,x2,…,xt-1) p(st=k|st-1=j) p(xt|st=k)
       = p(xt|st=k) Σ over j of p(st-1=j ; x1,x2,…,xt-1) p(st=k|st-1=j)
       = p(xt|st=k) Σ over j of f(j,t-1) p(st=k|st-1=j)

Slide 13

Forward Algorithm - Example

Given: the Coke/Pepsi HMM, and the sequence of observations CPC.
Find the probability that the HMM emits x1,x2,x3 = CPC. That is, find p(CPC).

Initialization:

f       t=0   x1 = C   x2 = P   x3 = C
start   1     0        0        0
A       0
R       0
B       0

Slide 14

Forward Algorithm - Example

f       t=0   x1 = C   x2 = P   x3 = C
start   1     0        0        0
A       0     0.36
R       0     0.09
B       0     0.15

f(k,1) = p(x1|s1=k) Σ over j of f(j,0) p(s1=k|j):

A: p(C|A) (f(start,0)p(A|start) + 0 + 0 + 0) = p(C|A) f(start,0) p(A|start) = 0.6*1*0.6 = 0.36
R: p(C|R) (f(start,0)p(R|start) + 0 + 0 + 0) = 0.9*1*0.1 = 0.09
B: p(C|B) (f(start,0)p(B|start) + 0 + 0 + 0) = 0.5*1*0.3 = 0.15

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the probability that the HMM emits x1,x2,x3 = CPC. That is, find p(CPC).

Slide 15

Forward Algorithm - Example

f       t=0   x1 = C   x2 = P    x3 = C
start   1     0        0         0
A       0     0.36     0.0564
R       0     0.09     0.009
B       0     0.15     0.1845

f(k,2) = p(x2|s2=k) Σ over j of f(j,1) p(s2=k|j):

A: p(P|A) (f(start,1)p(A|start) + f(A,1)p(A|A) + f(R,1)p(A|R) + f(B,1)p(A|B))
   = 0.4 * (0 + 0.36*0.2 + 0.09*0.1 + 0.15*0.4) = 0.4*0.141 = 0.0564
R: p(P|R) (f(start,1)p(R|start) + f(A,1)p(R|A) + f(R,1)p(R|R) + f(B,1)p(R|B))
   = 0.1 * (0 + 0.36*0.1 + 0.09*0.1 + 0.15*0.3) = 0.1*0.09 = 0.009
B: p(P|B) (f(start,1)p(B|start) + f(A,1)p(B|A) + f(R,1)p(B|R) + f(B,1)p(B|B))
   = 0.5 * (0 + 0.36*0.7 + 0.09*0.8 + 0.15*0.3) = 0.5*0.369 = 0.1845

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the probability that the HMM emits x1,x2,x3 = CPC. That is, find p(CPC).

Slide 16

Forward Algorithm - Example

f       t=0   x1 = C   x2 = P    x3 = C
start   1     0        0         0
A       0     0.36     0.0564    0.05159
R       0     0.09     0.009     0.05570
B       0     0.15     0.1845    0.05102

f(k,3) = p(x3|s3=k) Σ over j of f(j,2) p(s3=k|j):

A: p(C|A) (f(start,2)p(A|start) + f(A,2)p(A|A) + f(R,2)p(A|R) + f(B,2)p(A|B))
   = 0.6 * (0 + 0.0564*0.2 + 0.009*0.1 + 0.1845*0.4) = 0.6*0.08598 = 0.05159
R: p(C|R) (f(start,2)p(R|start) + f(A,2)p(R|A) + f(R,2)p(R|R) + f(B,2)p(R|B))
   = 0.9 * (0 + 0.0564*0.1 + 0.009*0.1 + 0.1845*0.3) = 0.9*0.06189 = 0.05570
B: p(C|B) (f(start,2)p(B|start) + f(A,2)p(B|A) + f(R,2)p(B|R) + f(B,2)p(B|B))
   = 0.5 * (0 + 0.0564*0.7 + 0.009*0.8 + 0.1845*0.3) = 0.5*0.10203 = 0.05102

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the probability that the HMM emits x1,x2,x3 = CPC. That is, find p(CPC).

Slide 17

Forward Algorithm - Example

f       t=0   x1 = C   x2 = P    x3 = C
start   1     0        0         0
A       0     0.36     0.0564    0.05159
R       0     0.09     0.009     0.05570
B       0     0.15     0.1845    0.05102

Hence, the probability of CPC being generated by this HMM is:

p(CPC) = Σ over j of f(j,3) = 0.05159 + 0.05570 + 0.05102 = 0.15831

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the probability that the HMM emits x1,x2,x3 = CPC. That is, find p(CPC).

Slide 18
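The forward recurrence above fits in a few lines of code. A minimal sketch in Python (function names are my own; the parameters are the slides' HMM):

```python
# Forward algorithm on the Coke/Pepsi HMM: p(obs) = sum over j of f(j, L).

STATES = ["A", "R", "B"]
TRANS = {  # p(next state | current state)
    "start": {"A": 0.6, "R": 0.1, "B": 0.3},
    "A":     {"A": 0.2, "R": 0.1, "B": 0.7},
    "R":     {"A": 0.1, "R": 0.1, "B": 0.8},
    "B":     {"A": 0.4, "R": 0.3, "B": 0.3},
}
EMIT = {   # p(observation | state)
    "A": {"C": 0.6, "P": 0.4},
    "R": {"C": 0.9, "P": 0.1},
    "B": {"C": 0.5, "P": 0.5},
}

def forward(obs):
    """Return p(obs) by filling the forward table column by column."""
    # first column: f(k,1) = p(x1|k) * p(k|start)
    f = {k: EMIT[k][obs[0]] * TRANS["start"][k] for k in STATES}
    for x in obs[1:]:
        # f(k,t) = p(xt|k) * sum over j of f(j,t-1) p(k|j)
        f = {k: EMIT[k][x] * sum(f[j] * TRANS[j][k] for j in STATES)
             for k in STATES}
    return sum(f.values())
```

`forward("CPC")` returns approximately 0.15831, matching the sum of the last column above. Note the code is identical to Viterbi's except that max over predecessors is replaced by a sum, and no parent pointers are needed.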

3. Calculating the probability of st = k given a sequence of observations

Given an HMM and a sequence of observations x1,x2,…,xL, determine the probability that the state visited at time t was k: p(st=k | x1,x2,…,xL), where 1 <= t <= L.

p(st=k | x1,x2,…,xL) = p(x1,x2,…,xL ; st=k) / p(x1,x2,…,xL)

Note that p(x1,x2,…,xL) can be found using the forward algorithm. We'll focus now on determining p(x1,x2,…,xL ; st=k).

Slide 19

p(x1,…,xt,…,xL ; st=k)
= p(x1,…,xt ; st=k) p(xt+1,…,xL | x1,…,xt ; st=k)
= p(x1,…,xt ; st=k) p(xt+1,…,xL | st=k)
= f(k,t) b(k,t)

where f(k,t) comes from the forward algorithm and b(k,t) from the backward algorithm:

b(k,t) = p(xt+1,…,xL | st=k)
       = Σ over j of p(st+1=j|st=k) p(xt+1|st+1=j) p(xt+2,…,xL | st+1=j)
       = Σ over j of p(st+1=j|st=k) p(xt+1|st+1=j) b(j,t+1)

Slide 20

Backward Algorithm - Example

Given: the Coke/Pepsi HMM, and the sequence of observations CPC.
Find the probability that the HMM emits xt+1,…,xL given that st=k: p(xt+1,…,xL | st=k).

Initialization (the right-most column, b(k,3) = 1 for every state k):

b       x1 = C   x2 = P   x3 = C
A                         1
R                         1
B                         1

Slide 21

Backward Algorithm - Example

b       x1 = C   x2 = P   x3 = C
A                0.56     1
R                0.55     1
B                0.66     1

b(k,2) = Σ over j of p(s3=j|s2=k) p(x3|s3=j) b(j,3), with x3 = C:

A: p(A|A)p(C|A)b(A,3) + p(R|A)p(C|R)b(R,3) + p(B|A)p(C|B)b(B,3)
   = 0.2*0.6*1 + 0.1*0.9*1 + 0.7*0.5*1 = 0.56
R: p(A|R)p(C|A)b(A,3) + p(R|R)p(C|R)b(R,3) + p(B|R)p(C|B)b(B,3)
   = 0.1*0.6*1 + 0.1*0.9*1 + 0.8*0.5*1 = 0.55
B: p(A|B)p(C|A)b(A,3) + p(R|B)p(C|R)b(R,3) + p(B|B)p(C|B)b(B,3)
   = 0.4*0.6*1 + 0.3*0.9*1 + 0.3*0.5*1 = 0.66

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the probability that the HMM emits xt+1,…,xL given that st=k: p(xt+1,…,xL | st=k).

Slide 22

Backward Algorithm - Example

b       x1 = C   x2 = P   x3 = C
A       0.2813   0.56     1
R       0.2919   0.55     1
B       0.2051   0.66     1

b(k,1) = Σ over j of p(s2=j|s1=k) p(x2|s2=j) b(j,2), with x2 = P:

A: p(A|A)p(P|A)b(A,2) + p(R|A)p(P|R)b(R,2) + p(B|A)p(P|B)b(B,2)
   = 0.2*0.4*0.56 + 0.1*0.1*0.55 + 0.7*0.5*0.66 = 0.2813
R: p(A|R)p(P|A)b(A,2) + p(R|R)p(P|R)b(R,2) + p(B|R)p(P|B)b(B,2)
   = 0.1*0.4*0.56 + 0.1*0.1*0.55 + 0.8*0.5*0.66 = 0.2919
B: p(A|B)p(P|A)b(A,2) + p(R|B)p(P|R)b(R,2) + p(B|B)p(P|B)b(B,2)
   = 0.4*0.4*0.56 + 0.3*0.1*0.55 + 0.3*0.5*0.66 = 0.2051

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the probability that the HMM emits xt+1,…,xL given that st=k: p(xt+1,…,xL | st=k).

Slide 23

Backward Algorithm - Example

b       x1 = C   x2 = P   x3 = C
A       0.2813   0.56     1
R       0.2919   0.55     1
B       0.2051   0.66     1

Given: the Coke/Pepsi HMM, and the sequence of observations CPC. Find the probability that the HMM emits xt+1,…,xL given that st=k: p(xt+1,…,xL | st=k).

We can calculate the probability of CPC being generated by this HMM from the Backward table as follows:

p(CPC) = Σ over j of b(j,1) p(j|start) p(C|j)
       = (0.2813*0.6*0.6) + (0.2919*0.1*0.9) + (0.2051*0.3*0.5) = 0.15831

though we can obtain the same probability from the Forward table (as we did on a previous slide).

Slide 24
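The backward pass above can be sketched in code as well. A minimal Python version (function names are mine; the parameters are the slides' HMM):

```python
# Backward algorithm on the Coke/Pepsi HMM from the slides.
# backward(obs) returns one column per time step; column t-1 holds
# b(k,t) = p(x_{t+1},...,x_L | s_t = k).

STATES = ["A", "R", "B"]
TRANS = {  # p(next state | current state)
    "start": {"A": 0.6, "R": 0.1, "B": 0.3},
    "A":     {"A": 0.2, "R": 0.1, "B": 0.7},
    "R":     {"A": 0.1, "R": 0.1, "B": 0.8},
    "B":     {"A": 0.4, "R": 0.3, "B": 0.3},
}
EMIT = {   # p(observation | state)
    "A": {"C": 0.6, "P": 0.4},
    "R": {"C": 0.9, "P": 0.1},
    "B": {"C": 0.5, "P": 0.5},
}

def backward(obs):
    b = {k: 1.0 for k in STATES}        # initialization: b(k,L) = 1
    cols = [b]
    for x in reversed(obs[1:]):         # x plays the role of x_{t+1}
        # b(k,t) = sum over j of p(j|k) p(x_{t+1}|j) b(j,t+1)
        b = {k: sum(TRANS[k][j] * EMIT[j][x] * b[j] for j in STATES)
             for k in STATES}
        cols.append(b)
    return list(reversed(cols))         # cols[t-1] is the column for time t

def prob_from_backward(obs):
    """p(obs) = sum over j of b(j,1) p(j|start) p(x1|j)."""
    b1 = backward(obs)[0]
    return sum(TRANS["start"][j] * EMIT[j][obs[0]] * b1[j] for j in STATES)
```

`backward("CPC")` reproduces the columns above (0.2813/0.2919/0.2051, then 0.56/0.55/0.66, then all ones), and `prob_from_backward("CPC")` again gives approximately 0.15831.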

3. (cont.) Using the Forward and Backward tables to calculate the probability of st = k given a sequence of observations

Example: Given the Coke/Pepsi HMM and the sequence of observations CPC, find the probability that the state visited at time 2 was B, that is, p(s2=B | CPC). In other words, given that the person drank C, P, C, what's the probability that Pepsi was on sale during the 2nd week?

Based on the calculations we did on the previous slides:

p(s2=B | CPC) = p(CPC ; s2=B) / p(CPC)
= [ p(x1=C, x2=P ; s2=B) p(x3=C | x1=C, x2=P ; s2=B) ] / p(x1=C, x2=P, x3=C)
= [ p(x1=C, x2=P ; s2=B) p(x3=C | s2=B) ] / p(CPC)
= [ f(B,2) b(B,2) ] / p(CPC)
= [ 0.1845 * 0.66 ] / 0.15831 = 0.7691

Here, p(CPC) was calculated by summing up the last column of the Forward table. So there is a high probability that Pepsi was on sale during week 2, given that the person drank Pepsi that week!
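Putting the two tables together, this posterior can be computed directly. A sketch (function names are mine) that rebuilds both tables and forms f(k,t) b(k,t) / p(obs):

```python
# Posterior p(s_t = k | obs) = f(k,t) * b(k,t) / p(obs) on the Coke/Pepsi HMM.

STATES = ["A", "R", "B"]
TRANS = {  # p(next state | current state)
    "start": {"A": 0.6, "R": 0.1, "B": 0.3},
    "A":     {"A": 0.2, "R": 0.1, "B": 0.7},
    "R":     {"A": 0.1, "R": 0.1, "B": 0.8},
    "B":     {"A": 0.4, "R": 0.3, "B": 0.3},
}
EMIT = {   # p(observation | state)
    "A": {"C": 0.6, "P": 0.4},
    "R": {"C": 0.9, "P": 0.1},
    "B": {"C": 0.5, "P": 0.5},
}

def forward_cols(obs):
    """Forward table: cols[t-1][k] = f(k,t)."""
    f = {k: EMIT[k][obs[0]] * TRANS["start"][k] for k in STATES}
    cols = [f]
    for x in obs[1:]:
        f = {k: EMIT[k][x] * sum(f[j] * TRANS[j][k] for j in STATES)
             for k in STATES}
        cols.append(f)
    return cols

def backward_cols(obs):
    """Backward table: cols[t-1][k] = b(k,t)."""
    b = {k: 1.0 for k in STATES}
    cols = [b]
    for x in reversed(obs[1:]):
        b = {k: sum(TRANS[k][j] * EMIT[j][x] * b[j] for j in STATES)
             for k in STATES}
        cols.append(b)
    return list(reversed(cols))

def posterior(obs, t, k):
    f, b = forward_cols(obs), backward_cols(obs)
    p_obs = sum(f[-1].values())          # p(obs), from the Forward table
    return f[t - 1][k] * b[t - 1][k] / p_obs
```

`posterior("CPC", 2, "B")` evaluates to about 0.769, matching the slide's 0.7691 (the small difference is rounding in the hand calculation). As a check, the posteriors over A, R, B at any fixed time sum to 1.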