Maximum likelihood II
Peter Beerli
October 3, 2005
1 Conditional likelihoods revisited
Some of this is already covered by the algorithm in the last chapter, but elaborating on the
practical procedures, and on why we use them, may give an even better understanding. This section
follows the Felsenstein book, pages 251-255. We express the likelihood of the tree in Figure 1 as
\[
\operatorname{Prob}(D^{(i)} \mid T) = \sum_{z} \sum_{y} \sum_{w} \sum_{x} \operatorname{Prob}(A, A, C, G, G, w, y, x, z \mid T) \tag{1}
\]
where $T = (t_1, t_2, t_3, t_4, t_5, t_6, t_7, t_8)$. Each summation is over all 4 nucleotides. The above probability can be separated into
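\[
\begin{aligned}
\operatorname{Prob}(A, A, C, G, G, w, y, x, z \mid T) = {} & \operatorname{Prob}(z) \\
& \times \operatorname{Prob}(w \mid y, t_3)\, \operatorname{Prob}(A \mid w, t_1)\, \operatorname{Prob}(A \mid w, t_2) \\
& \times \operatorname{Prob}(y \mid t_5, z)\, \operatorname{Prob}(C \mid y, t_4) \\
& \times \operatorname{Prob}(x \mid z, t_6)\, \operatorname{Prob}(G \mid x, t_7)\, \operatorname{Prob}(G \mid x, t_8)
\end{aligned}
\]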
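The factorization above can be checked numerically by summing the product over all 4^4 = 256 assignments of the interior states w, y, x, z. The following is only a minimal sketch under assumptions that are not part of the text: the Jukes-Cantor model supplies the transition probabilities, the root probability is set to 1/4, the branch lengths are made-up values, and the names `jc_prob`, `joint_prob`, and `site_likelihood_brute_force` are hypothetical.

```python
import itertools
import math

NUCS = "ACGT"

def jc_prob(child, parent, t):
    """Prob(child | parent, t) under the Jukes-Cantor model (an assumption);
    t is the branch length in expected substitutions per site."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if child == parent else 0.25 - 0.25 * e

def joint_prob(w, y, x, z, t, root_freq=0.25):
    """One term of the sum: the joint probability of the tips A, A, C, G, G
    and one particular assignment of the interior states w, y, x, z."""
    t1, t2, t3, t4, t5, t6, t7, t8 = t
    return (root_freq                                   # Prob(z), assumed 1/4
            * jc_prob(w, y, t3) * jc_prob("A", w, t1) * jc_prob("A", w, t2)
            * jc_prob(y, z, t5) * jc_prob("C", y, t4)
            * jc_prob(x, z, t6) * jc_prob("G", x, t7) * jc_prob("G", x, t8))

def site_likelihood_brute_force(t):
    """Sum the joint probability over all 4**4 interior-state assignments."""
    return sum(joint_prob(w, y, x, z, t)
               for w, y, x, z in itertools.product(NUCS, repeat=4))

# Hypothetical branch lengths t1..t8, chosen only for illustration.
branch_lengths = (0.1, 0.1, 0.2, 0.3, 0.1, 0.4, 0.1, 0.1)
print(site_likelihood_brute_force(branch_lengths))
```

Every one of the 256 terms multiplies nine factors, so the cost of this direct approach grows exponentially with the number of interior nodes.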
The root probability Prob(z) is often assumed to be given by the stationary base frequencies. All parts are
easy to calculate, and if we order the terms of the sum in formula 1 and move the summations as
far to the right as possible, we get a summation pattern that has the same structure as our tree
((C, (A, A)), (G, G)):
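\[
\begin{aligned}
\operatorname{Prob}(D^{(i)} \mid T) = \sum_{z} \operatorname{Prob}(z)
& \Biggl( \sum_{y} \operatorname{Prob}(y \mid t_5, z)\, \operatorname{Prob}(C \mid y, t_4) \\
& \quad \times \biggl( \sum_{w} \operatorname{Prob}(w \mid y, t_3)\, \operatorname{Prob}(A \mid w, t_1)\, \operatorname{Prob}(A \mid w, t_2) \biggr) \Biggr) \\
& \times \biggl( \sum_{x} \operatorname{Prob}(x \mid z, t_6)\, \operatorname{Prob}(G \mid x, t_7)\, \operatorname{Prob}(G \mid x, t_8) \biggr)
\end{aligned}
\]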
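The nested sums can be evaluated from the tips toward the root, storing one conditional likelihood per nucleotide at each interior node. The sketch below illustrates this and compares the result with the flat sum over all 256 assignments. It rests on assumptions not made in the text: Jukes-Cantor transition probabilities, a root probability of 1/4, made-up branch lengths, and hypothetical function names.

```python
import itertools
import math

NUCS = "ACGT"

def jc_prob(child, parent, t):
    """Prob(child | parent, t) under the Jukes-Cantor model (an assumption)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if child == parent else 0.25 - 0.25 * e

def site_likelihood_pruning(t, root_freq=0.25):
    """Evaluate the nested sums for the tree ((C, (A, A)), (G, G))."""
    t1, t2, t3, t4, t5, t6, t7, t8 = t
    # Node w: parent of the two A tips.
    L_w = {w: jc_prob("A", w, t1) * jc_prob("A", w, t2) for w in NUCS}
    # Node y: parent of tip C and of node w (innermost sum, over w).
    L_y = {y: jc_prob("C", y, t4)
              * sum(jc_prob(w, y, t3) * L_w[w] for w in NUCS)
           for y in NUCS}
    # Node x: parent of the two G tips.
    L_x = {x: jc_prob("G", x, t7) * jc_prob("G", x, t8) for x in NUCS}
    # Root z: product of the sum over y and the sum over x.
    L_z = {z: sum(jc_prob(y, z, t5) * L_y[y] for y in NUCS)
              * sum(jc_prob(x, z, t6) * L_x[x] for x in NUCS)
           for z in NUCS}
    # Outermost sum over z, weighted by Prob(z) (assumed 1/4 here).
    return sum(root_freq * L_z[z] for z in NUCS)

def site_likelihood_flat(t, root_freq=0.25):
    """Reference value: the unfactored sum over all 4**4 assignments."""
    t1, t2, t3, t4, t5, t6, t7, t8 = t
    return sum(root_freq
               * jc_prob(w, y, t3) * jc_prob("A", w, t1) * jc_prob("A", w, t2)
               * jc_prob(y, z, t5) * jc_prob("C", y, t4)
               * jc_prob(x, z, t6) * jc_prob("G", x, t7) * jc_prob("G", x, t8)
               for w, y, x, z in itertools.product(NUCS, repeat=4))

# Hypothetical branch lengths t1..t8, chosen only for illustration.
branch_lengths = (0.1, 0.1, 0.2, 0.3, 0.1, 0.4, 0.1, 0.1)
print(site_likelihood_pruning(branch_lengths))
print(site_likelihood_flat(branch_lengths))
```

Both functions should agree up to floating-point rounding. The nested version does a small, fixed amount of work per interior node instead of enumerating every combination of interior states, which is the practical reason for moving the summations to the right.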