Assignment Answers
Categorical Data Analysis, CHL 5407H
1.
You are asked to design a cross-sectional study investigating the
exercise habits of new graduate students at the University of Toronto.
Students will be classified as being inactive if they exercise less than
twice a week while students who exercise at least twice a week are
classified as being active. How many students would have to be enrolled to
ensure that a 90% confidence interval about the estimated proportion
of students being active is no wider than $\pm 0.05$?
An approximate 90% confidence interval for $\pi$ is given by
$$\hat{\pi} \pm 1.645 \sqrt{\hat{\pi}(1 - \hat{\pi})/n}.$$
As discussed in class, this confidence interval will be widest when
$\hat{\pi} = 0.5$. Therefore solving the equation
$$0.05 = 1.645 \sqrt{0.5 \times 0.5 / n}$$
gives $n = 270.60$. Thus a sample of 271 students is needed to be able to
estimate the proportion of students being active to within $\pm 0.05$,
with 90% confidence.
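The arithmetic above can be checked with a short Python sketch. The function name and its parameters are illustrative assumptions, not part of the assignment; it uses the exact normal quantile rather than the rounded value 1.645, which changes the unrounded answer slightly but not the final sample size.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(half_width, confidence, p=0.5):
    """Smallest n so that z * sqrt(p(1-p)/n) <= half_width.

    p = 0.5 is the worst case (widest interval), as used in the answer.
    """
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve half_width = z * sqrt(p(1-p)/n) for n and round up.
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


n = sample_size_for_proportion(half_width=0.05, confidence=0.90)
print(n)
```

Rounding up rather than to the nearest integer is deliberate: rounding down would leave the interval slightly wider than the target.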
2.
Data on smoking history from four studies of lung cancer patients are
provided in Table 1. Use these data to answer the following questions.
(a)
Construct a hypothesis test using these data to test the null
hypothesis that the true proportion of smokers does not vary across
the four studies. Provide a brief explanation of your results.
We are testing the null hypothesis that the true proportion of smokers does
not vary across the four studies versus the alternative hypothesis that the
true proportion of smokers does vary across the four studies. We can test
this null hypothesis using a Pearson chi-square statistic and comparing it to
a chi-square distribution with 3 degrees of freedom.
$$\chi^2_P = \sum_{i=1}^{2} \sum_{j=1}^{4} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
= \frac{(3 - [25 \times 86/397])^2}{25 \times 86/397} + \cdots + \frac{(70 - [372 \times 82/397])^2}{372 \times 82/397}
= 12.6004 \quad (p = .0056).$$
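As a rough check on the calculation, the sketch below recomputes the two cell contributions written out in the formula and the p-value for the reported statistic. It uses only the numbers quoted in the formula (Table 1 itself is not reproduced here), and the function names are illustrative assumptions. The p-value uses the closed-form upper tail of a chi-square distribution with 3 degrees of freedom, so no external library is needed.

```python
import math


def expected_count(row_total, col_total, grand_total):
    """Expected cell count under the null: row total * column total / N."""
    return row_total * col_total / grand_total


def cell_contribution(observed, expected):
    """One term (O - E)^2 / E of the Pearson chi-square statistic."""
    return (observed - expected) ** 2 / expected


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df.

    Closed form valid for 3 degrees of freedom only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# First and last terms shown in the formula.
first = cell_contribution(3, expected_count(25, 86, 397))
last = cell_contribution(70, expected_count(372, 82, 397))

# p-value for the reported statistic on 3 degrees of freedom.
p_value = chi2_sf_df3(12.6004)
print(round(first, 4), round(last, 4), round(p_value, 4))
```

The printed p-value should agree with the reported .0056 to four decimal places; the remaining six cell contributions would require the full table.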
We therefore reject the null hypothesis at the 5% level, concluding that the
observed difference among the four studies is statistically significant.
Note in particular that there was little difference in the proportion of smokers
between study 1, study 2 or study 3 (see Appendix 1). Rather, the statistical
significance is almost entirely a consequence of the relatively smaller
proportion of smokers in study 4. One way of demonstrating this is using the