CS228 Problem Set #1
CS 228, Winter 2011-2012
Problem Set #1
This assignment is due at 12 noon on January 23. Submissions should be placed in the filing cabinet labeled "CS228 Homework Submission Box" located in the lobby outside Gates 187.
A Hidden Markov Model (HMM) is a dynamic Bayesian network with two variables for each time slice t: a state variable S(t) and an output variable O(t). In a standard HMM, the output variable is always observed for all time slices t, while the hidden state variable is never observed. The state at each time slice depends only on the state at the previous time slice (i.e., S(t) depends only on S(t−1)), and the output at each time slice depends only on the state at that time slice (O(t) depends only on S(t)). For the purposes of this exercise, we will assume that the variables S(t) ∈ {s_1, ..., s_K} and O(t) ∈ {o_1, ..., o_N} for all t, where K denotes the number of states, and N denotes the number of possible observations.
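The two conditional independence assumptions above imply that the joint distribution over a length-T trajectory factorizes as follows (stated here for reference; the initial-state prior P(S(1)) is an additional ingredient the text has not yet specified):

P(S(1), ..., S(T), O(1), ..., O(T)) = P(S(1)) · ∏_{t=2}^{T} P(S(t) | S(t−1)) · ∏_{t=1}^{T} P(O(t) | S(t))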
HMMs are defined by their transition model P(S′ | S) and observation model P(O | S). We will use the variables f and g to represent these where necessary, i.e., P(S′ = s_j | S = s_i) = f_ij and P(O = o_j | S = s_i) = g_ij. In this question, we will consider only stationary transition and observation models, i.e., these models f and g are the same for all time steps t. We can thus represent an HMM with the following 2TBN, where shaded nodes denote observed variables:
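As a concrete illustration (not part of the assignment), the parameters f and g can be stored as row-stochastic matrices and used to sample a trajectory from the model. This is a minimal sketch; the specific matrix entries and the uniform distribution over the initial state are assumptions, since the text does not specify them:

```python
import numpy as np

rng = np.random.default_rng(0)

K, N = 3, 4  # number of hidden states and possible observations (example sizes)

# f[i, j] = P(S' = s_j | S = s_i): each row is a distribution over next states.
f = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# g[i, j] = P(O = o_j | S = s_i): each row is a distribution over observations.
g = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.1, 0.7, 0.1, 0.1],
              [0.1, 0.1, 0.1, 0.7]])

# Stationarity means these same matrices are reused at every time step.
assert np.allclose(f.sum(axis=1), 1.0) and np.allclose(g.sum(axis=1), 1.0)

def sample_trajectory(T, f, g, rng):
    """Sample (states, observations) of length T from the HMM.

    Assumes a uniform initial-state distribution, which the problem
    text does not specify.
    """
    states, obs = [], []
    s = rng.choice(f.shape[0])  # uniform initial state (assumption)
    for _ in range(T):
        states.append(int(s))
        obs.append(int(rng.choice(g.shape[1], p=g[s])))
        s = rng.choice(f.shape[0], p=f[s])
    return states, obs

states, obs = sample_trajectory(10, f, g, rng)
print(states)
print(obs)
```

In a real task only `obs` would be visible, and the goal of inference would be to recover information about `states`.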
Despite their simplicity, HMMs are used extensively in real-world applications in which we suspect that there is an underlying sequence of hidden states which are generating the observed outcomes. For example, HMMs are the method of choice for speech recognition, with the hidden states representing the actual word that the speaker is saying, and the observed states representing the audio recording of the word. In this case, the transition model would be based on the language and context (e.g., in English, the word "San" might be very likely to transition to the word "Francisco").
However, standard HMMs are often unable to represent more complex distributions. In this exercise, we will investigate a series of extensions to the standard Hidden Markov Model that allow it to encode a richer class of distributions, and apply these more expressive models to the problem of sequence alignment.