
If we substitute the joint distribution $\Pr(X = x, Z = z \mid \theta)$, along with $\gamma$ and $\xi$, into (3.5), we obtain

$$
\begin{aligned}
Q(\theta, \theta^{\text{old}}) ={}& \gamma(z_{10}) \log(1 - \pi) + \gamma(z_{11}) \log \pi
 + \sum_{t=2}^{T} \sum_{j=0}^{1} \sum_{k=0}^{1} \xi(z_{t-1,j}, z_{tk}) \log A_{jk} \\
& + \sum_{t=1}^{T} \Big[ \gamma(z_{t0})\, I(x_t = 0) + \gamma(z_{t1}) \log \Pr(X_t = x_t \mid Z_t = 1, \phi) \Big]. \qquad (3.6)
\end{aligned}
$$

Next, we seek an efficient procedure for evaluating the quantities $\gamma(z_{tk})$ and $\xi(z_{t-1,j}, z_{tk})$. The forward–backward algorithm (Baum and Eagon, 1967; Baum and Sell, 1968) is used to accomplish this. First, we define the forward variable as

$$
\alpha(z_{t,k}) = \Pr(X_1 = x_1, \ldots, X_t = x_t, Z_t = k \mid \theta), \qquad k = 0, 1.
$$

$\alpha$ can be solved for inductively:

(1) Initialization:

$$
\alpha(z_{1,0}) = 1 - \pi, \qquad \alpha(z_{1,1}) = \pi \Pr(X_1 = x_1 \mid Z_1 = 1, \phi).
$$

(2) Induction: For $k = 0, 1$ and $1 \le t \le T - 1$,

$$
\alpha(z_{t+1,k}) = \big[ \alpha(z_{t,0}) A_{0k} + \alpha(z_{t,1}) A_{1k} \big] \Pr(X_{t+1} = x_{t+1} \mid Z_{t+1} = k, \phi). \qquad (3.7)
$$

Below, we will use the fact that $\Pr(X = x \mid \theta) = \alpha(z_{T,0}) + \alpha(z_{T,1})$.

We next need to define the backward variable, the probability of the partial observation sequence from $t + 1$ to $T$:

$$
\beta(z_t) = \Pr(X_{t+1} = x_{t+1}, \ldots, X_T = x_T \mid Z_t = z_t, \theta).
$$

Copyright © 2014. Imperial College Press. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law. EBSCO Publishing: eBook Collection (EBSCOhost), AN: 779681; Heard, Nicholas, Adams, Niall M.; Data Analysis for Network Cyber-security.
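The forward recursion (3.7) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `forward` and `brute_force_likelihood`, the argument `emit` (a table of emission probabilities already evaluated at the observed data) and the numbers in the example are all hypothetical. The state-0 initial term is multiplied by the state-0 emission probability, which follows from the definition of $\alpha$ and coincides with $1 - \pi$ whenever that probability equals 1. The brute-force sum over all state paths is included only to check the identity $\Pr(X = x \mid \theta) = \alpha(z_{T,0}) + \alpha(z_{T,1})$ on a short chain.

```python
from itertools import product


def forward(pi, A, emit):
    """Unscaled forward variables alpha[t][k] for a two-state chain.

    pi   : Pr(Z_1 = 1)
    A    : 2x2 transition matrix, A[j][k] = Pr(Z_t = k | Z_{t-1} = j)
    emit : emit[t][k] = Pr(X_t = x_t | Z_t = k), evaluated at the data
           (hypothetical input; the emission model is not fixed here)
    """
    T = len(emit)
    alpha = [[0.0, 0.0] for _ in range(T)]
    # Initialization (t = 1 in the text, index 0 here).
    alpha[0][0] = (1.0 - pi) * emit[0][0]
    alpha[0][1] = pi * emit[0][1]
    # Induction: propagate one step, then weight by the next emission.
    for t in range(T - 1):
        for k in (0, 1):
            pred = alpha[t][0] * A[0][k] + alpha[t][1] * A[1][k]
            alpha[t + 1][k] = pred * emit[t + 1][k]
    return alpha


def brute_force_likelihood(pi, A, emit):
    """Pr(X = x | theta) by summing over all 2**T state paths (check only)."""
    T = len(emit)
    prior = [1.0 - pi, pi]
    total = 0.0
    for path in product((0, 1), repeat=T):
        p = prior[path[0]] * emit[0][path[0]]
        for t in range(1, T):
            p *= A[path[t - 1]][path[t]] * emit[t][path[t]]
        total += p
    return total


# Made-up example: three observations, the second impossible under state 0.
pi = 0.3
A = [[0.9, 0.1], [0.2, 0.8]]
emit = [[1.0, 0.1], [0.0, 0.25], [1.0, 0.1]]

alpha = forward(pi, A, emit)
likelihood = alpha[-1][0] + alpha[-1][1]
print(likelihood, brute_force_likelihood(pi, A, emit))
```

The recursion costs a number of operations linear in $T$, whereas the brute-force sum grows as $2^T$, which is why the latter is usable only as a sanity check.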
J. Neil, C. Storlie, C. Hash and A. Brugh

$\beta(z_t)$ can be solved for inductively as follows:

(1) Initialization:

$$
\beta(z_{T,k}) = 1, \qquad k = 0, 1.
$$

(2) Induction: For $k = 0, 1$ and $t = T - 1, \ldots, 1$,

$$
\beta(z_{t,k}) = A_{k,0} \Pr(X_{t+1} = x_{t+1} \mid Z_{t+1} = 0)\, \beta(z_{t+1,0}) + A_{k,1} \Pr(X_{t+1} = x_{t+1} \mid Z_{t+1} = 1, \phi)\, \beta(z_{t+1,1}). \qquad (3.8)
$$

Finally,

$$
\gamma(z_t) = \frac{\alpha(z_t)\, \beta(z_t)}{\Pr(X = x \mid \theta)} \qquad (3.9)
$$

$$
\xi(z_{t-1}, z_t) = \frac{\alpha(z_{t-1}) \Pr(X_t = x_t \mid Z_t = z_t, \phi) \Pr(Z_t = z_t \mid Z_{t-1} = z_{t-1})\, \beta(z_t)}{\Pr(X = x \mid \theta)}. \qquad (3.10)
$$

The M Step. In the M step, we maximize (3.6) with respect to $\theta$. Maximization with respect to $\pi$ and $A$ is easily achieved using appropriate Lagrange multipliers. Taking the derivative with respect to $\mu$ results in a closed-form update as well:

$$
\hat{\pi} = \frac{\gamma(z_{11})}{\sum_{j=0}^{1} \gamma(z_{1j})}, \qquad
\hat{A}_{jk} = \frac{\sum_{t=2}^{T} \xi(z_{t-1,j}, z_{tk})}{\sum_{l=0}^{1} \sum_{t=2}^{T} \xi(z_{t-1,j}, z_{tl})}, \quad j = 0, 1, \; k = 0, 1,
$$

$$
\hat{\mu} = \frac{\sum_{t=1}^{T} \gamma(z_{t1})\, x_t}{\sum_{t=1}^{T} \gamma(z_{t1})}.
$$

The size parameter update is not closed form. From (3.6), we see that it comes down to maximizing the $\gamma(z_{t1})$-weighted sum of the terms $\log \Pr(X_t = x_t \mid Z_t = 1, \phi)$ with respect to $s$, which we achieve through a numerical grid optimization routine.

Scaling. For moderate lengths of chains, the forward and backward variables quickly become too small for the precision of the machine. One cannot simply work with logarithms, as one can for independent and identically distributed (i.i.d.) data, since here we have sums of products of small numbers. Therefore a rescaling has been developed, and is described in Bishop (2006). Define a normalized version of $\alpha$ as

$$
\hat{\alpha}(z_t) = \Pr(Z_t = z_t \mid X_1 = x_1, \ldots, X_t = x_t, \theta) = \frac{\alpha(z_t)}{\Pr(X_1 = x_1, \ldots, X_t = x_t \mid \theta)}.
$$
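To make the E and M steps concrete, the following Python sketch computes $\gamma$ and $\xi$ with a rescaled forward–backward pass and then applies the closed-form updates for $\hat{\pi}$, $\hat{A}$ and $\hat{\mu}$. It is a sketch under assumptions rather than the authors' code. The scaling factors `c[t]` and the rescaled backward recursion follow the standard scheme in Bishop (2006), which this excerpt cites but does not spell out. The names `e_step`, `m_step` and `emit`, and all numbers in the example, are hypothetical. The size parameter update is omitted because it needs the emission model and a numerical optimizer.

```python
import math


def e_step(pi, A, emit):
    """Rescaled forward-backward pass for a two-state chain.

    emit[t][k] = Pr(X_t = x_t | Z_t = k), evaluated at the data
    (hypothetical input). Returns gamma, xi and the log-likelihood.
    """
    T = len(emit)
    prior = [1.0 - pi, pi]
    alpha = [[0.0, 0.0] for _ in range(T)]  # normalized forward variables
    beta = [[1.0, 1.0] for _ in range(T)]   # rescaled backward variables
    c = [0.0] * T                           # c[t] = Pr(x_t | x_1, ..., x_{t-1})
    for t in range(T):
        for k in (0, 1):
            if t == 0:
                pred = prior[k]
            else:
                pred = alpha[t - 1][0] * A[0][k] + alpha[t - 1][1] * A[1][k]
            alpha[t][k] = pred * emit[t][k]
        c[t] = alpha[t][0] + alpha[t][1]
        alpha[t] = [a / c[t] for a in alpha[t]]
    for t in range(T - 2, -1, -1):
        for j in (0, 1):
            s = sum(A[j][k] * emit[t + 1][k] * beta[t + 1][k] for k in (0, 1))
            beta[t][j] = s / c[t + 1]
    # With this rescaling, gamma is simply the product of the two variables.
    gamma = [[alpha[t][k] * beta[t][k] for k in (0, 1)] for t in range(T)]
    # xi[t - 1][j][k] is the posterior of the transition j -> k into step t.
    xi = [[[alpha[t - 1][j] * A[j][k] * emit[t][k] * beta[t][k] / c[t]
            for k in (0, 1)] for j in (0, 1)] for t in range(1, T)]
    loglik = sum(math.log(ct) for ct in c)
    return gamma, xi, loglik


def m_step(gamma, xi, x):
    """Closed-form updates for pi, A and mu (size parameter not handled).

    Assumes each state has positive posterior mass as a transition origin;
    otherwise the row normalization below would divide by zero.
    """
    pi_hat = gamma[0][1] / (gamma[0][0] + gamma[0][1])
    A_hat = []
    for j in (0, 1):
        row = [sum(step[j][k] for step in xi) for k in (0, 1)]
        A_hat.append([r / (row[0] + row[1]) for r in row])
    weight = sum(g[1] for g in gamma)
    mu_hat = sum(g[1] * xt for g, xt in zip(gamma, x)) / weight
    return pi_hat, A_hat, mu_hat


# Made-up example: counts x and emission probabilities evaluated at x.
x = [0, 2, 0]
emit = [[1.0, 0.1], [0.0, 0.25], [1.0, 0.1]]
gamma, xi, loglik = e_step(0.3, [[0.9, 0.1], [0.2, 0.8]], emit)
pi_hat, A_hat, mu_hat = m_step(gamma, xi, x)
print(pi_hat, A_hat, mu_hat, loglik)
```

Because every `alpha[t]` is renormalized to sum to one, no quantity shrinks with $T$, and the log-likelihood is recovered as the sum of the logarithms of the scaling factors.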