STAT244  HW7 solutions  Due 2/3/2017
Total: 100 pts.
Notation:
1. If $n \in \mathbb{N}$, i.e., $n$ is an integer, $[n]$ denotes $\{1, 2, \ldots, n-1, n\}$ (sometimes denoted as $\overline{1, n}$);
2. $l(\theta \mid x)$ will denote the loglikelihood for parameter $\theta$ and data $x$;
3. $\hat{\theta}_{MLE}$ stands for the MLE of $\theta$;
4. $df$ stands for degrees of freedom.
Problem 1 [20 pts]
Call “This, it, thus, and” class I words; class II is “everything else”. For each of $N = 215$ groups of $n = 5$ of James Mill’s sentences, the number of class I words was counted.

# of class I words    0    1    2    3    4    5
# of groups          87   11   51   42   20    4

Test whether a Bin$(n, \theta)$ fits these data.
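Solution: first we find $\hat{\theta}_{MLE}$. Let $x = (x_1, \ldots, x_N)$ denote the counts of class I words within given groups.

$$l(\theta \mid x) = \sum_{i=1}^{N} \log\binom{5}{x_i} + \log\theta \cdot \sum_{i=1}^{N} x_i + \log(1-\theta) \sum_{i=1}^{N} (n - x_i)$$

$$\partial_\theta l(\theta \mid x) = N\left(\frac{\bar{x}}{\theta} - \frac{n - \bar{x}}{1-\theta}\right) = 0 \quad \text{when } \theta = \hat{\theta}_{MLE} = \frac{\bar{x}}{n}$$

$$\partial^2_{\theta\theta} l(\theta \mid x) = -N\left(\frac{\bar{x}}{\theta^2} + \frac{n - \bar{x}}{(1-\theta)^2}\right) \le 0 \quad \forall \theta$$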
We have $87 \times 0 + 11 \times 1 + 51 \times 2 + 42 \times 3 + 20 \times 4 + 4 \times 5 = 339$ sentences with class I words in total. Thus, $\hat{\theta}_{MLE} = \frac{339/215}{5} \approx .315$. [5 pts]
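As a quick numerical check of the arithmetic above, the following Python sketch recomputes the total and the closed-form estimate $\bar{x}/n$ from the table, then compares it with a brute-force maximizer of the loglikelihood over a grid. The variable names and the grid step of 0.001 are illustrative assumptions, not part of the original solution.

```python
import math

# Observed data from the problem: counts[i] groups had a count of i.
counts = [87, 11, 51, 42, 20, 4]
n = 5                      # group size
N = sum(counts)            # number of groups (215)

total = sum(i * c for i, c in enumerate(counts))   # sum of all x_i
x_bar = total / N
theta_hat = x_bar / n      # closed-form MLE, x_bar / n

def loglik(theta):
    """Binomial loglikelihood l(theta | x), evaluated from the grouped counts."""
    return sum(
        c * (math.log(math.comb(n, i))
             + i * math.log(theta)
             + (n - i) * math.log(1 - theta))
        for i, c in enumerate(counts)
    )

# Brute-force check on an assumed grid 0.001, 0.002, ..., 0.999.
grid = [k / 1000 for k in range(1, 1000)]
theta_grid = max(grid, key=loglik)

print(total, round(theta_hat, 3), theta_grid)
```

Because the loglikelihood is concave in $\theta$, the grid maximizer should land within one grid step of the closed-form value.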
# of sentences with class I words, $i$      0       1       2       3      4     5
Observed, $O_i$                            87      11      51      42     20     4
Expected, $E_i$                         32.35    74.5   68.61   31.59   7.27   .67
[5 pts]
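The expected counts are $E_i = N \binom{n}{i} \hat{\theta}^i (1-\hat{\theta})^{n-i}$ with $\hat{\theta} = \hat{\theta}_{MLE}$. A minimal Python sketch of that computation follows; the helper name `binom_pmf` is hypothetical, and because the sketch uses the unrounded $\hat{\theta}$, its values may differ from the table in the second decimal place.

```python
import math

observed = [87, 11, 51, 42, 20, 4]   # O_i for i = 0..5
n = 5
N = sum(observed)
theta_hat = sum(i * o for i, o in enumerate(observed)) / (N * n)

def binom_pmf(i, n, theta):
    """P(X = i) for X ~ Bin(n, theta)."""
    return math.comb(n, i) * theta**i * (1 - theta)**(n - i)

# Expected number of groups in each cell under the fitted binomial.
expected = [N * binom_pmf(i, n, theta_hat) for i in range(n + 1)]

for i, (o, e) in enumerate(zip(observed, expected)):
    print(i, o, round(e, 2))
```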
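Under the null hypothesis, the test statistic is
$$X = \sum_{i=0}^{5} \frac{(E_i - O_i)^2}{E_i} \approx 193.238$$
which is approximately $\chi^2$-distributed with $df = 6 - 1 - 1 = 4$ (six cells, minus one for the fixed total, minus one for the estimated parameter $\theta$). [5 pts]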
The $p$-value is nearly $0$, so we have strong evidence to reject the null hypothesis, i.e.,
we conclude that the count of sentences with class I words does not fit a binomial
distribution.
[5 pts]
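To make "nearly 0" concrete, the sketch below recomputes the statistic and its upper-tail probability using only the standard library. It relies on the closed form of the $\chi^2$ survival function for even degrees of freedom, $P(\chi^2_{2k} > x) = e^{-x/2} \sum_{j=0}^{k-1} (x/2)^j / j!$; the function name `chi2_sf_even_df` is hypothetical. Since the expected counts are not rounded here, the statistic may differ slightly from 193.238.

```python
import math

observed = [87, 11, 51, 42, 20, 4]
n = 5
N = sum(observed)
theta_hat = sum(i * o for i, o in enumerate(observed)) / (N * n)
expected = [N * math.comb(n, i) * theta_hat**i * (1 - theta_hat)**(n - i)
            for i in range(n + 1)]

# Pearson chi-square statistic.
chi2_stat = sum((e - o) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 1    # cells - 1 - (number of estimated parameters)

def chi2_sf_even_df(x, df):
    """Upper tail P(chi2_df > x); this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires even df")
    k = df // 2
    return math.exp(-x / 2) * sum((x / 2) ** j / math.factorial(j)
                                  for j in range(k))

p_value = chi2_sf_even_df(chi2_stat, df)
print(round(chi2_stat, 2), df, p_value)
```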
Problem 2 [15 pts]
The members of a community are classified by blood type:

  0     A     B    AB   Total
121   120    79    33     353

Theory has it that the probabilities of those types depend on gene frequency parameters $r, p, q$, where $r + p + q = 1$, $P\{\text{"0"}\} = r^2$, $P\{\text{"A"}\} = p^2 + 2pr$, $P\{\text{"B"}\} = q^2 + 2qr$ and $P\{\text{"AB"}\} = 2pq$.
Using numerical methods (see Chapter 5), we can get $\hat{r}_{MLE} = .580$, $\hat{p}_{MLE} = .246$ and $\hat{q}_{MLE} = .173$.
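The solution does not say which numerical method is meant. One common choice for this model is the EM ("gene counting") iteration, sketched below in Python as an assumption about how such estimates could be obtained, not as the method of Chapter 5. The function name `abo_em`, the starting point and the iteration count are all illustrative.

```python
def abo_em(n_o, n_a, n_b, n_ab, iters=200):
    """EM (gene-counting) iteration for allele frequencies (r, p, q)."""
    n = n_o + n_a + n_b + n_ab
    r, p, q = 1 / 3, 1 / 3, 1 / 3          # arbitrary starting point
    for _ in range(iters):
        # E-step: split phenotypes A and B into expected genotype counts,
        # using P{A} = p^2 + 2pr and P{B} = q^2 + 2qr.
        n_aa = n_a * p * p / (p * p + 2 * p * r)
        n_ao = n_a - n_aa
        n_bb = n_b * q * q / (q * q + 2 * q * r)
        n_bo = n_b - n_bb
        # M-step: each person carries two alleles, so divide by 2n.
        p = (2 * n_aa + n_ao + n_ab) / (2 * n)
        q = (2 * n_bb + n_bo + n_ab) / (2 * n)
        r = (2 * n_o + n_ao + n_bo) / (2 * n)
    return r, p, q

r_hat, p_hat, q_hat = abo_em(121, 120, 79, 33)
print(round(r_hat, 3), round(p_hat, 3), round(q_hat, 3))
```

Each update keeps $r + p + q = 1$ by construction, so the constraint does not need to be imposed separately.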
