STAT 3504
Review Problems I: Midterm
1. A marketing researcher, having collected data on breakfast cereal expenditures by families with 1, 2, 3, 4 and 5 children living at home, plans to use an ordinary regression model to estimate the mean expenditures at each of these five family size levels. However, the researcher is undecided between fitting a linear or a quadratic model, and the data do not give clear evidence in favour of one model or the other. A colleague suggests: “For your purposes you might simply use an ANOVA model”. Is this a useful suggestion? Explain.
2. Text #16.5
3. A student asks: “Why is the F test for inequality of factor level means not a two-tailed test, since any differences among the factor level means can occur in either direction?” Explain using the expected mean squares for MSE and MSTR.
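For reference, the expected mean squares for the single-factor fixed-effects ANOVA model can be written as below. The notation is an assumption borrowed from problem 4 ($r$ factor levels, $n_i$ observations at level $i$, $n_T$ observations in total); $\mu_{\cdot}$ denotes the weighted mean of the factor level means.

```latex
\begin{align*}
E\{MSE\}  &= \sigma^2 \\
E\{MSTR\} &= \sigma^2 + \frac{\sum_{i=1}^{r} n_i (\mu_i - \mu_{\cdot})^2}{r - 1},
\qquad \mu_{\cdot} = \frac{\sum_{i=1}^{r} n_i \mu_i}{n_T}
\end{align*}
```

The second term of $E\{MSTR\}$ is a sum of squares, so it can never be negative, whichever direction the factor level means differ in.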
4. A rehabilitation center researcher was interested in examining the relationship between physical fitness prior to surgery of persons undergoing corrective knee surgery and time required in physical therapy until successful rehabilitation. Patient records in the center were examined and 24 male subjects ranging in age from 18 to 30 years who had undergone similar corrective knee surgery during the past year were selected for the study. The number of days required for successful completion of physical therapy and the prior physical fitness status (below average, average, above average) for each patient are given below.
                                   j
i (level)         1    2    3    4    5    6    7    8    9   10
Below average    29   42   38   40   43   40   30   42
Average          30   35   39   28   31   31   29   35   29   33
Above average    26   32   21   20   23   22
$\sum_{i=1}^{r} \sum_{j=1}^{n_i} Y_{ij}^2 = 25664$, $Y_{1\cdot} = 304$, $Y_{2\cdot} = 320$, $Y_{3\cdot} = 144$
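The printed summary figures can be checked against the data with a short script. This is only a sketch for checking study work, not part of the problem. The dictionary `days` and the other variable names are illustrative, and the values are transcribed from the table above.

```python
# Days in physical therapy, transcribed from the data table (illustrative names).
days = {
    "below average": [29, 42, 38, 40, 43, 40, 30, 42],
    "average": [30, 35, 39, 28, 31, 31, 29, 35, 29, 33],
    "above average": [26, 32, 21, 20, 23, 22],
}

# Factor level totals Y_i. and the overall sum of squared observations.
level_totals = {level: sum(ys) for level, ys in days.items()}
sum_of_squares = sum(y * y for ys in days.values() for y in ys)
n_total = sum(len(ys) for ys in days.values())

print(level_totals)
print(sum_of_squares, n_total)
```

The totals and the sum of squares should match the figures printed with the data, and the observation count should equal the 24 subjects described in the problem.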
Do this problem by hand, not using ANY software (you will not have Excel or anything else on the midterm). Assume all assumptions are satisfied.
a) Obtain the fitted values.
b) Obtain the residuals for the "above average" factor level (i = 3). For this factor level, find $E\{e_{3j}\}$ (up to a proportionality constant), assuming the errors are normally distributed.
c) Obtain the analysis of variance table. Include the expected mean squares.
d) Test at a significance level of 0.01 whether the mean number of days required for rehabilitation differs between the 3 fitness groups.
e) Estimate with a 99% confidence interval the mean number of days in rehabilitation required for persons of average physical fitness. Assume that it had been decided in advance of looking at the data that this was the C.I. of interest.
f) Use the Bonferroni procedure with 95% family confidence to obtain confidence intervals for $\mu_3 - \mu_2$ and $\mu_2 - \mu_1$. Interpret your results. What is the per comparison level of significance here?
g) Would the Tukey procedure have been more efficient in part (f)? Explain.
h) Under what conditions are the confidence intervals of (f) valid?
i) If the researcher had wished to estimate $\mu_3 - \mu_1$ as well as the other 2 comparisons in (f), would the t-value per confidence interval need to be modified? Would this also be the case if the Tukey procedure had been used?
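After working parts (a), (c) and (d) by hand, the arithmetic can be checked with a sketch like the one below. It uses only the summary figures printed with the data. The variable names are illustrative, and the level sizes `[8, 10, 6]` are an assumption obtained by counting the entries in each row of the data table. The critical F and t values are not computed here and still have to be read from tables.

```python
# Summary figures as printed in the problem statement (names are illustrative).
level_sizes = [8, 10, 6]        # n_i, counted from the rows of the data table
level_totals = [304, 320, 144]  # Y_1., Y_2., Y_3.
sum_sq = 25664                  # sum of all squared observations

r = len(level_sizes)
n_total = sum(level_sizes)
grand_total = sum(level_totals)

# Fitted values are the factor level sample means.
fitted = [t / n for t, n in zip(level_totals, level_sizes)]

# Sums of squares via the usual computational formulas.
correction = grand_total**2 / n_total
ssto = sum_sq - correction
sstr = sum(t**2 / n for t, n in zip(level_totals, level_sizes)) - correction
sse = ssto - sstr

# Mean squares and the F* statistic on (r - 1, n_total - r) degrees of freedom.
mstr = sstr / (r - 1)
mse = sse / (n_total - r)
f_star = mstr / mse

print("fitted values:", fitted)
print("SSTR, SSE, SSTO:", sstr, sse, ssto)
print("MSTR, MSE, F*:", mstr, round(mse, 4), round(f_star, 4))
```

The same `mse` and `fitted` values feed the interval estimates in parts (e) and (f), once the appropriate t multiple has been looked up.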
Winter '10, Ann