STA 302 / 1001 H1F – Fall 2006
Test 1
October 18, 2006
LAST NAME:
SOLUTIONS
FIRST NAME:
STUDENT NUMBER:
ENROLLED IN: (circle one) STA 302 / STA 1001
INSTRUCTIONS:
• Time: 90 minutes
• Aids allowed: calculator.
• A table of values from the t distribution is on the last page (page 7).
• Total points: 50
Some formulae:
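$b_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \dfrac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2}$

$b_0 = \bar{Y} - b_1 \bar{X}$

$\mathrm{Var}(b_1) = \dfrac{\sigma^2}{\sum (X_i - \bar{X})^2}$

$\mathrm{Var}(b_0) = \sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{X}^2}{\sum (X_i - \bar{X})^2} \right)$

$\mathrm{Cov}(b_0, b_1) = -\dfrac{\sigma^2 \bar{X}}{\sum (X_i - \bar{X})^2}$

$\mathrm{SSTO} = \sum (Y_i - \bar{Y})^2$

$\mathrm{SSE} = \sum (Y_i - \hat{Y}_i)^2$

$\mathrm{SSR} = b_1^2 \sum (X_i - \bar{X})^2 = \sum (\hat{Y}_i - \bar{Y})^2$

$\sigma^2\{\hat{Y}_h\} = \mathrm{Var}(\hat{Y}_h) = \sigma^2 \left( \dfrac{1}{n} + \dfrac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right)$

$\sigma^2\{\mathrm{pred}\} = \mathrm{Var}(Y_h - \hat{Y}_h) = \sigma^2 \left( 1 + \dfrac{1}{n} + \dfrac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right)$

$r = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}$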
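As an illustration of how the estimator formulae above fit together, the following is a minimal sketch in plain Python. The data values and all variable names (`x`, `y`, `sxx`, `sxy`, `syy`, `b1_alt`) are made up for this example and do not come from the test. It checks that the two expressions for the slope agree and computes the intercept and the correlation.

```python
from math import sqrt

# Hypothetical data, chosen only so the arithmetic is easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Centered sums of squares and cross-products.
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

# Slope from the centered form of the formula.
b1 = sxy / sxx

# Slope from the computational form: (sum XY - n*Xbar*Ybar) / (sum X^2 - n*Xbar^2).
b1_alt = (sum(xi * yi for xi, yi in zip(x, y)) - n * xbar * ybar) / (
    sum(xi ** 2 for xi in x) - n * xbar ** 2
)

# Intercept and sample correlation.
b0 = ybar - b1 * xbar
r = sxy / sqrt(sxx * syy)

print(b1, b1_alt, b0, r)
```

For these made-up values the centered sums are 10 (X), 6 (Y) and 6 (cross-product), so both slope expressions give 0.6 and the intercept is 2.2, up to floating-point rounding.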
Marks by question: 1 | 2ab | 2cdef | 2gh | 3
1. A simple linear regression model $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$ is fit using least squares to $n$ data points. Assume that the Gauss-Markov conditions hold and that the error terms are normally distributed with mean 0 and variance $\sigma^2$.
(a) (3 marks) What is the probability distribution of $b_1$? What is the probability distribution of $\beta_1$?
$b_1 \sim N\left( \beta_1, \dfrac{\sigma^2}{\sum (X_i - \bar{X})^2} \right)$ (2 marks)
$\beta_1$ is not random (1 mark)
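The distribution stated for $b_1$ can be justified in a few lines. The sketch below is not part of the original solution; it treats the $X_i$ as fixed constants and introduces the weights $k_i$ as notation for this derivation only.

```latex
\begin{align*}
b_1 &= \sum k_i Y_i,
  \qquad k_i = \frac{X_i - \bar{X}}{\sum_j (X_j - \bar{X})^2},
  \qquad \sum k_i = 0, \quad \sum k_i X_i = 1, \quad
  \sum k_i^2 = \frac{1}{\sum (X_i - \bar{X})^2} \\
E(b_1) &= \sum k_i (\beta_0 + \beta_1 X_i)
  = \beta_0 \sum k_i + \beta_1 \sum k_i X_i = \beta_1 \\
\mathrm{Var}(b_1) &= \sigma^2 \sum k_i^2
  = \frac{\sigma^2}{\sum (X_i - \bar{X})^2}
\end{align*}
```

Because $b_1$ is a linear combination of the independent, normally distributed $Y_i$, it is itself normally distributed with this mean and variance.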
(b) (4 marks) Describe the method of least squares. How is it related to $R^2$?
Find the slope and intercept of the line that minimizes the sum of squares of the vertical
deviations of the data points from the line. (2 marks)
The quantity that is being minimized (SSE), expressed as a proportion of the total variation in the $Y$'s (SSTO), is $1 - R^2$. (2 marks)
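The relationship between least squares and $R^2$ can be checked numerically. The following is a minimal sketch in plain Python; the data and the helper name `sse` are invented for the example, and $R^2$ is taken to be SSR/SSTO, the standard definition, which this part of the test does not spell out.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n


def sse(intercept, slope):
    """Sum of squared vertical deviations of the data from a candidate line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Least-squares estimates from the formula sheet.
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

sse_min = sse(b0, b1)                       # the minimized quantity
ssto = sum((yi - ybar) ** 2 for yi in y)    # total variation in Y
ssr = b1 ** 2 * sxx                         # regression sum of squares
r_squared = ssr / ssto                      # assumed definition of R^2

# Any nearby line has a larger sum of squared deviations.
perturbed = [sse(b0 + d, b1 + e) for d in (-0.1, 0.1) for e in (-0.05, 0.05)]

print(sse_min / ssto, 1 - r_squared, min(perturbed) > sse_min)
```

For this data set SSE is 2.4 and SSTO is 6, so SSE/SSTO and $1 - R^2$ both equal 0.4 up to rounding, and every perturbed line gives a larger SSE than the least-squares line.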
(c) (3 marks) Suppose the regression model is being used to predict blood pressure as a function of weight. Explain the difference between a confidence interval for the mean response at a new $X$ and a prediction interval at a new $X$ in this context. (Do not discuss the details of the formulae for calculating the intervals.)
For a particular weight, there is a probability distribution for the possible values of blood pressure. The confidence interval for the mean of $Y$ at that weight gives an interval that captures the mean of this probability distribution ($\beta_0 + \beta_1 X$) $100(1 - \alpha)\%$ of the time. The prediction interval gives an interval that captures the actual value of the blood