STA 414S/2104S: Take-Home Midterm Test. Due March 25, 2010 before 2 pm.
Please work alone.
1. (Adapted from Exercise 2.7, HTF). Suppose we have a sample $(y_1, x_1), \dots, (y_N, x_N)$, and we assume the model
$$y_i = f(x_i) + \epsilon_i, \tag{1}$$
where $f(\cdot)$ is an unknown regression function, $\epsilon_i \sim N(0, \sigma^2)$, and the $\epsilon_i$'s are independent.
A fairly wide class of estimators considered in the course are of the form
$$\hat{f}(x_0) = \sum_{i=1}^{N} \ell_i(x_0; x)\, y_i,$$
where $x = (x_1, \dots, x_N)$.
(a) Show that linear regression and $k$-nearest-neighbour regression are members of
this class of estimators, and describe the weights $\ell_i(x_0; x)$ in each of these cases.
(b) (STA 2104 only) Decompose the conditional mean-squared error
$$E_{y \mid x}\{\hat{f}(x_0) - f(x_0)\}^2,$$
where the expectation is over the conditional distribution of $y_1, \dots, y_N$, given $x_1, \dots, x_N$.
2. (Adapted from Exercise 2.1, R.A. Berk). Figure 1 shows a plot of Ozone against
Temperature, from a database of daily measurements in New York over 154 summer
days. The following fits are summarized in the code extract:
> data(airquality);attach(airquality)
> library(gam)