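If we substitute the joint distribution $\Pr(X = x, Z = z \mid \theta)$, along with $\gamma$ and $\xi$, into (3.5), we obtain

$$
\begin{aligned}
Q(\theta, \theta^{\mathrm{old}}) ={}& \gamma(z_{10}) \log(1 - \pi) + \gamma(z_{11}) \log \pi + \sum_{t=2}^{T} \sum_{j=0}^{1} \sum_{k=0}^{1} \xi(z_{t-1,j}, z_{tk}) \log A_{jk} \\
&+ \sum_{t=1}^{T} \bigl[ \gamma(z_{t0})\, I(x_t = 0) + \gamma(z_{t1}) \log \Pr(X_t = x_t \mid Z_t = 1, \phi) \bigr].
\end{aligned}
\tag{3.6}
$$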
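As a rough illustration, (3.6) can be evaluated numerically once $\gamma$ and $\xi$ are available. The sketch below transcribes the three groups of terms as printed. The function name `q_function`, the array layouts, and the argument `log_emis1` (the log emission probabilities under state 1, whose form depends on $\phi$ and is not specified here) are assumptions made for this example, and all entries of `A` are assumed strictly positive so that the logarithms are finite.

```python
import numpy as np

def q_function(gamma, xi, pi, A, x, log_emis1):
    """Evaluate Q(theta, theta_old) following (3.6) term by term.

    gamma     : (T, 2) array, gamma[t, k] = gamma(z_{tk}), 0-based t
    xi        : (T-1, 2, 2) array, xi[t, j, k] pairs time t (state j) with t+1 (state k)
    pi        : initial probability of state 1
    A         : (2, 2) transition matrix with strictly positive entries (assumed)
    x         : (T,) observations
    log_emis1 : (T,) values of log Pr(X_t = x_t | Z_t = 1, phi) (hypothetical input)
    """
    gamma = np.asarray(gamma, dtype=float)
    xi = np.asarray(xi, dtype=float)
    A = np.asarray(A, dtype=float)
    x = np.asarray(x)
    log_emis1 = np.asarray(log_emis1, dtype=float)

    # Initial-state terms: gamma(z_10) log(1 - pi) + gamma(z_11) log(pi).
    initial = gamma[0, 0] * np.log(1.0 - pi) + gamma[0, 1] * np.log(pi)
    # Transition terms: triple sum of xi * log A_jk (log A broadcasts over time).
    transition = np.sum(xi * np.log(A))
    # Emission terms: indicator for state 0, log emission probability for state 1.
    emission = np.sum(gamma[:, 0] * (x == 0) + gamma[:, 1] * log_emis1)
    return initial + transition + emission

# Tiny made-up example with T = 2.
gamma = [[0.5, 0.5], [0.25, 0.75]]
xi = [[[0.2, 0.3], [0.05, 0.45]]]
A = [[0.5, 0.5], [0.5, 0.5]]
x = [0, 3]
log_emis1 = np.log([0.1, 0.2])
q = q_function(gamma, xi, 0.5, A, x, log_emis1)
```

In an EM iteration, a quantity of this form would be maximized over $\pi$, $A$, and $\phi$ with $\gamma$ and $\xi$ held fixed.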
Next, we seek an efficient procedure for evaluating the quantities $\gamma(z_{tk})$ and $\xi(z_{t-1,j}, z_{tk})$. The forward–backward algorithm (Baum and Eagon, 1967; Baum and Sell, 1968) is used to accomplish this. First, we define the forward variable as

$$
\alpha(z_{t,k}) = \Pr(X_1 = x_1, \ldots, X_t = x_t, Z_t = k \mid \theta), \qquad k = 0, 1.
$$
$\alpha$ can be solved for inductively:

(1) Initialization:

$$
\alpha(z_{1,0}) = 1 - \pi, \qquad \alpha(z_{1,1}) = \pi \Pr(X_1 = x_1 \mid Z_1 = 1, \phi).
$$
(2) Induction: For $k = 0, 1$ and $1 \le t \le T - 1$,

$$
\alpha(z_{t+1,k}) = \bigl[ \alpha(z_{t,0}) A_{0k} + \alpha(z_{t,1}) A_{1k} \bigr] \Pr(X_{t+1} = x_{t+1} \mid Z_{t+1} = k, \phi).
\tag{3.7}
$$
Below, we will use the fact that $\Pr(X = x \mid \theta) = \alpha(z_{T,0}) + \alpha(z_{T,1})$.
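The forward recursion and the likelihood identity above can be sketched in a few lines. This is a minimal, unscaled illustration rather than a definitive implementation. The function name `forward` and the array `b` of precomputed emission probabilities, `b[t, k]` $= \Pr(X_t = x_t \mid Z_t = k)$, are assumptions introduced for this example. The sketch multiplies the state-0 initial term by `b[0, 0]`, which reduces to the initialization $\alpha(z_{1,0}) = 1 - \pi$ when that emission probability equals 1.

```python
import numpy as np

def forward(pi, A, b):
    """Unscaled forward pass for a two-state hidden Markov model.

    pi : probability that Z_1 = 1, so Pr(Z_1 = 0) = 1 - pi
    A  : (2, 2) transition matrix, A[j, k] = Pr(Z_{t+1} = k | Z_t = j)
    b  : (T, 2) array of emission probabilities for the observed data,
         b[t, k] = Pr(X_t = x_t | Z_t = k), 0-based t (hypothetical input)
    Returns alpha with shape (T, 2) and the likelihood Pr(X = x | theta).
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    T = b.shape[0]
    alpha = np.zeros((T, 2))

    # Initialization: prior state probabilities times the first emission.
    alpha[0] = np.array([1.0 - pi, pi]) * b[0]

    # Induction: alpha_{t+1}(k) = [sum_j alpha_t(j) A_jk] * emission at time t+1.
    for t in range(T - 1):
        alpha[t + 1] = (alpha[t] @ A) * b[t + 1]

    # Likelihood: sum of the forward variables at the final time point.
    likelihood = alpha[-1].sum()
    return alpha, likelihood

# Made-up example: state 0 emits the value 0 with probability one,
# and the observed sequence is (0, nonzero, 0).
A = [[0.9, 0.1], [0.2, 0.8]]
b = [[1.0, 0.1], [0.0, 0.5], [1.0, 0.2]]
alpha, likelihood = forward(0.3, A, b)
```

For long sequences the unscaled values of $\alpha$ shrink towards zero, so practical implementations usually rescale at each step or work with logarithms.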
We next need to define the backward variable, the probability of the partial observation sequence from $t + 1$ to $T$:

$$
\beta(z_t) = \Pr(X_{t+1} = x_{t+1}, \ldots, X_T = x_T \mid Z_t = z_t, \theta).
$$
Copyright © 2014. Imperial College Press. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under
U.S. or applicable copyright law.
AN: 779681 ; Heard, Nicholas, Adams, Niall M..; Data Analysis for Network Cybersecurity