SameTitle(title, title)
SameVenue(venue, venue)

During weight learning, each formula is converted to conjunctive normal form (CNF), and a weight is learned for each of its clauses. Since neither a weight nor a period is specified, a default prior weight is used.
er-col.mln:
Time taken for MC-SAT sampling = 9 mins, 29.05 secs
Time taken for SampleSat = 3 mins, 20.92 secs
Time-Results: Init 0.02 Run 569.05 Total 569.07
Total time taken = 9 mins, 35.9 secs
5
5.1 Area Under Precision-Recall Curve
For er-th.mln
For
6 MLN Models used
er-th corresponds to model MLN(B), because it has the four reverse predicate equivalence rules connecting each word to the corresponding field/record match predicate. For example: HasWordAuthor(a1, w) ∧ HasWordAuthor(a2, w) ⇒ SameAuthor(a1, a2).
8 Playing with Alchemy
I tried some other things with weight learning and inference algorithms.
Change number of iterations: I changed the maximum number of iterations in weight learning by setting the flag dNumIter = 150, and ran weight learning and inference algorithms.
13. SEQUENTIAL DATA
Figure 13.9 Example of the state transition diagram for a 3-state left-to-right hidden Markov model. Note that once a state has been vacated, it cannot later be re-entered.
13.2. Hidden Markov Models
Figure 13.11 Top row: examples of on-line handwritten digits. Bottom row: synthetic digits sampled generatively from a left-to-right hidden Markov model that has been trained on a data set of 45 handwritten digits.
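Generating synthetic sequences from a trained left-to-right HMM, as in the bottom row of the figure, amounts to ancestral sampling: draw the first state from the initial distribution, then repeatedly draw the next state from the corresponding row of the transition matrix and an observation from that state's emission density. A minimal sketch with hypothetical 1-D Gaussian emissions; the parameters `pi`, `A`, `means`, and `stds` are made up for illustration, not the digit model of the figure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state left-to-right HMM with 1-D Gaussian emissions.
pi = np.array([1.0, 0.0, 0.0])            # always start in state 1
A = np.array([[0.8, 0.2, 0.0],            # left-to-right: no backward transitions
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 1.0]])
means, stds = np.array([-2.0, 0.0, 2.0]), np.array([0.5, 0.5, 0.5])

def sample_hmm(n):
    """Ancestral sampling: z_1 ~ pi, z_n ~ A[z_{n-1}], x_n ~ N(mu_k, sigma_k)."""
    states, obs = [], []
    z = rng.choice(3, p=pi)
    for _ in range(n):
        states.append(int(z))
        obs.append(rng.normal(means[z], stds[z]))
        z = rng.choice(3, p=A[z])
    return states, obs

states, obs = sample_hmm(10)
```

Because the transition matrix is upper triangular, every sampled state sequence is nondecreasing, which is exactly the "no re-entry" property noted in Figure 13.9.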
One of th
Section 9.2
exponentially with the length of the chain. In fact, the summation in (13.11) corresponds to summing over exponentially many paths through the lattice diagram in
Figure 13.7.
We have already encountered a similar difficulty
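The contrast can be made concrete: summing p(X, Z) over every path Z through the lattice explicitly costs O(K^N), while the forward recursion reorganizes the same sum into O(N K^2) work and gives an identical answer. A toy check with hypothetical discrete-emission parameters (`pi`, `A`, `B`, and the sequence `x` are made up):

```python
import itertools
import numpy as np

# Toy HMM: K = 2 states, discrete emissions (hypothetical parameters).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],     # B[k, s] = p(x = s | z = k)
              [0.2, 0.8]])
x = [0, 1, 1, 0, 1]           # observed sequence
N, K = len(x), 2

# Naive: sum over all K**N paths through the lattice -- O(K**N).
naive = sum(
    pi[path[0]] * B[path[0], x[0]]
    * np.prod([A[path[n - 1], path[n]] * B[path[n], x[n]] for n in range(1, N)])
    for path in itertools.product(range(K), repeat=N)
)

# Forward recursion: alpha_n(k) = p(x_n|k) * sum_j alpha_{n-1}(j) A_{jk} -- O(N K^2).
alpha = pi * B[:, x[0]]
for n in range(1, N):
    alpha = B[:, x[n]] * (alpha @ A)

assert np.isclose(alpha.sum(), naive)
```

With K = 2 and N = 5 the naive sum already visits 32 paths; for a realistic chain length it would be astronomically many, while the recursion stays linear in N.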
and make use of the definitions of $\gamma$ and $\xi$, we obtain

$$Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\text{old}}) = \sum_{k=1}^{K} \gamma(z_{1k}) \ln \pi_k + \sum_{n=2}^{N} \sum_{j=1}^{K} \sum_{k=1}^{K} \xi(z_{n-1,j}, z_{nk}) \ln A_{jk} + \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma(z_{nk}) \ln p(\mathbf{x}_n \mid \boldsymbol{\phi}_k). \tag{13.17}$$
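As a sanity check on the structure of (13.17), its three terms can be evaluated directly once the posteriors γ and ξ are available. The sketch below uses randomly generated (hypothetical) posteriors and parameters rather than the output of a real E step; `log_emit[n, k]` stands in for ln p(x_n | φ_k) evaluated at the data:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 6, 3

# Hypothetical E-step posteriors: gamma[n, k] = E[z_nk], and
# xi[n, j, k] = E[z_{n-1,j} z_nk] (each xi[n] sums to 1 over (j, k)).
gamma = rng.dirichlet(np.ones(K), size=N)                      # N x K
xi = rng.dirichlet(np.ones(K * K), size=N - 1).reshape(N - 1, K, K)

# Hypothetical current parameters and log-emission table.
pi = np.full(K, 1 / K)
A = rng.dirichlet(np.ones(K), size=K)
log_emit = np.log(rng.uniform(0.1, 1.0, size=(N, K)))

# Q(theta, theta_old), term by term as in (13.17).
Q = (gamma[0] @ np.log(pi)            # initial-state term
     + np.sum(xi * np.log(A))         # transition term: sum over n, j, k
     + np.sum(gamma * log_emit))      # emission term: sum over n, k
```

Each term is an expectation of a log probability, so Q is a single (negative) number; in EM the M step adjusts π, A, and φ to increase it.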
Exercise 13.5
The goal of the E step will be to evaluate the quantities $\gamma(z_{nk})$ and $\xi(z_{n-1,j}, z_{nk})$ efficiently.
Gaussian emission densities we have $p(\mathbf{x} \mid \boldsymbol{\phi}_k) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$, and maximization of the function $Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\text{old}})$ then gives

$$\boldsymbol{\mu}_k = \frac{\sum_{n=1}^{N} \gamma(z_{nk})\, \mathbf{x}_n}{\sum_{n=1}^{N} \gamma(z_{nk})} \tag{13.20}$$

$$\boldsymbol{\Sigma}_k = \frac{\sum_{n=1}^{N} \gamma(z_{nk})\, (\mathbf{x}_n - \boldsymbol{\mu}_k)(\mathbf{x}_n - \boldsymbol{\mu}_k)^{\mathrm{T}}}{\sum_{n=1}^{N} \gamma(z_{nk})} \tag{13.21}$$
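Equations (13.20) and (13.21) are weighted maximum-likelihood estimates, with the responsibilities γ(z_nk) acting as the weights. A minimal sketch, assuming hypothetical data `X` and responsibilities `gamma` rather than the output of a real E step:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, D = 200, 2, 2

# Hypothetical data and E-step responsibilities gamma[n, k].
X = rng.normal(size=(N, D))
gamma = rng.dirichlet(np.ones(K), size=N)

Nk = gamma.sum(axis=0)                       # effective count per state
mu = (gamma.T @ X) / Nk[:, None]             # (13.20): weighted means
Sigma = np.empty((K, D, D))
for k in range(K):
    diff = X - mu[k]                         # (13.21): weighted covariances
    Sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k]
```

These are exactly the Gaussian-mixture M-step formulas of Section 9.2, with γ(z_nk) here computed under the HMM posterior rather than an i.i.d. mixture posterior.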
For the case of
the messages that are propagated along the chain (Jordan, 2007). We shall focus on
the most widely used of these, known as the alpha-beta algorithm.
As well as being of great practical importance in its own right, the forward
represents a vector of length $K$ whose entries correspond to the expected values of $z_{nk}$. Using Bayes' theorem, we have

$$\gamma(\mathbf{z}_n) = p(\mathbf{z}_n \mid \mathbf{X}) = \frac{p(\mathbf{X} \mid \mathbf{z}_n)\, p(\mathbf{z}_n)}{p(\mathbf{X})}. \tag{13.32}$$
Note that the denominator p(X) is implicitly conditioned on the
Figure 13.12 Illustration of the forward recursion (13.36) for evaluation of the $\alpha$ variables. In this fragment of the lattice, we see that the quantity $\alpha(z_{n1})$ is obtained by taking the elements $\alpha(z_{n-1,j})$
Figure 13.13 Illustration of the backward recursion (13.38) for evaluation of the $\beta$ variables. In this fragment of the lattice, we see that the quantity $\beta(z_{n1})$ is obtained by taking the components $\beta(z_{n+1,k})$
Thus we can evaluate the likelihood function by computing this sum, for any convenient choice of n. For instance, if we only want to evaluate the likelihood function,
then we can do this by running the recursion from the start
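The two recursions and this likelihood identity can be sketched directly: for every choice of n, the sum over k of α(z_nk) β(z_nk) gives the same value p(X). All parameters here are hypothetical toy values:

```python
import numpy as np

# Toy discrete-emission HMM (hypothetical parameters).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1],
              [0.3, 0.7]])
B = np.array([[0.8, 0.2],
              [0.1, 0.9]])
x = [0, 0, 1, 0, 1, 1]
N, K = len(x), 2

# Forward pass: alpha[n, k] = p(x_1..x_n, z_n = k).
alpha = np.empty((N, K))
alpha[0] = pi * B[:, x[0]]
for n in range(1, N):
    alpha[n] = B[:, x[n]] * (alpha[n - 1] @ A)

# Backward pass: beta[n, k] = p(x_{n+1}..x_N | z_n = k), with beta[N-1] = 1.
beta = np.empty((N, K))
beta[N - 1] = 1.0
for n in range(N - 2, -1, -1):
    beta[n] = A @ (B[:, x[n + 1]] * beta[n + 1])

# p(X) = sum_k alpha[n, k] * beta[n, k] for ANY n.
liks = [(alpha[n] * beta[n]).sum() for n in range(N)]
```

In particular, taking n = N makes the β factor trivial, which is why the likelihood alone can be obtained from the forward pass only.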
Exercise 13.12
This completes the E step, and we use the results to find a revised set of parameters $\boldsymbol{\theta}^{\text{new}}$ using the M-step equations from Section 13.2.1. We then continue to alternate between E and M steps until some convergence criterion is satisfied.
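The full alternation can be sketched for a discrete-emission HMM: each iteration runs forward-backward (E step) and then re-estimates the initial distribution, transition matrix, and emission table (M step). This is a toy Baum-Welch sketch on made-up data, not the book's code; the final check is the EM guarantee that the log likelihood never decreases:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy discrete-emission HMM: K states, M symbols, hypothetical data.
K, M = 2, 3
x = rng.integers(0, M, size=50)
N = len(x)

pi = np.full(K, 1 / K)
A = rng.dirichlet(np.ones(K), size=K)
B = rng.dirichlet(np.ones(M), size=K)      # B[k, s] = p(x = s | z = k)

log_liks = []
for _ in range(20):
    # E step: forward-backward.
    alpha = np.empty((N, K))
    beta = np.empty((N, K))
    alpha[0] = pi * B[:, x[0]]
    for n in range(1, N):
        alpha[n] = B[:, x[n]] * (alpha[n - 1] @ A)
    beta[N - 1] = 1.0
    for n in range(N - 2, -1, -1):
        beta[n] = A @ (B[:, x[n + 1]] * beta[n + 1])
    pX = alpha[-1].sum()
    log_liks.append(np.log(pX))
    gamma = alpha * beta / pX
    # xi[n, j, k] = p(z_n = j, z_{n+1} = k | X).
    xi = alpha[:-1, :, None] * A[None] \
        * (B[:, x[1:]].T * beta[1:])[:, None, :] / pX

    # M step: re-estimate pi, A, B from the expected counts.
    pi = gamma[0]
    A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    for s in range(M):
        B[:, s] = gamma[x == s].sum(axis=0)
    B /= gamma.sum(axis=0)[:, None]
```

In practice the unscaled α and β here would underflow on long chains; the scaling factors introduced later in this section fix that without changing the algorithm.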
Figure 13.14 A fragment of the factor graph representation for the hidden Markov model.
Section 10.1
Note that in (13.44), the influence of all data from x1 to xN is summarized in the K values
Figure 13.15 A simplified form of factor graph to describe the hidden Markov model.
To derive the alpha-beta algorithm, we denote the final hidden variable $\mathbf{z}_N$ as the root node, and first pass messages from the leaf node $h$
we obtain the beta recursion given by (13.38). Again, we can verify that the beta variables themselves are equivalent by noting that (8.70) implies that the initial message sent by the root variable node is $\mu_{\mathbf{z}_N \rightarrow f_N}(\mathbf{z}_N) = 1$,
From the product rule, we then have

$$p(\mathbf{x}_1, \ldots, \mathbf{x}_n) = \prod_{m=1}^{n} c_m \tag{13.57}$$

and so

$$\alpha(\mathbf{z}_n) = p(\mathbf{z}_n \mid \mathbf{x}_1, \ldots, \mathbf{x}_n)\, p(\mathbf{x}_1, \ldots, \mathbf{x}_n) = \widehat{\alpha}(\mathbf{z}_n) \prod_{m=1}^{n} c_m.$$
Note that at each stage of the forward message passing phase, used to evaluate $\widehat{\alpha}(\mathbf{z}_n)$
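The scaled recursion can be sketched alongside the unscaled one: normalizing α̂(z_n) to sum to one at each step yields the factor c_n as the normalizer, and by (13.57) the product of the c_m recovers the likelihood. Toy discrete-emission parameters, all hypothetical:

```python
import numpy as np

# Toy discrete-emission HMM (hypothetical parameters).
pi = np.array([0.5, 0.5])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])
x = [0, 1, 1, 0, 1, 0, 0, 1]
N = len(x)

# Scaled forward pass: alpha_hat[n] = p(z_n | x_1..x_n) sums to one at
# each step; the normalizer c[n] is p(x_n | x_1..x_{n-1}).
alpha_hat = pi * B[:, x[0]]
c = [alpha_hat.sum()]
alpha_hat = alpha_hat / c[0]
for n in range(1, N):
    a = B[:, x[n]] * (alpha_hat @ A)
    c.append(a.sum())
    alpha_hat = a / c[-1]

# (13.57): the likelihood is the product of the scaling factors,
# accumulated safely in log space.
log_lik_scaled = np.sum(np.log(c))

# Compare against the unscaled recursion (feasible here since N is tiny).
alpha = pi * B[:, x[0]]
for n in range(1, N):
    alpha = B[:, x[n]] * (alpha @ A)
```

Working with the c_m in log space is what makes long chains numerically safe: each factor is O(1) even when the unscaled α would underflow.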
Section 13.3
Finally, we note that there is an alternative formulation of the forward-backward algorithm (Jordan, 2007) in which the backward pass is defined by a recursion based on the quantities $\gamma(\mathbf{z}_n) = \widehat{\alpha}(\mathbf{z}_n)\, \widehat{\beta}(\mathbf{z}_n)$ instead of using $\widehat{\beta}(\mathbf{z}_n)$.