Stats 203  Problem Set 1
Courtesy of Lee Shoa Long Clarke
January 31, 2010
1a.
By definition, ∑_{i=1}^n e_i = ∑_{i=1}^n (y_i − ŷ_i). Using the definition of ȳ, we can rewrite this:

∑_{i=1}^n (y_i − ŷ_i) = ∑_{i=1}^n y_i − ∑_{i=1}^n ŷ_i = nȳ − ∑_{i=1}^n ŷ_i

Since ŷ_i = β̂₀ + β̂₁ x_i, we can substitute to get:

∑_{i=1}^n (y_i − ŷ_i) = nȳ − ∑_{i=1}^n (β̂₀ + β̂₁ x_i)
                      = nȳ − nȳ + nβ̂₁x̄ − β̂₁ ∑_{i=1}^n x_i      (using β̂₀ = ȳ − β̂₁x̄)
                      = nβ̂₁x̄ − β̂₁ ∑_{i=1}^n x_i
                      = nβ̂₁x̄ − nβ̂₁x̄
                      = 0
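As a quick numerical sanity check (a sketch using numpy, with made-up data), we can fit a simple least-squares line and verify that the residuals sum to zero:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 3.0 * x + rng.normal(0, 1, size=50)  # hypothetical data

# Least-squares estimates for the simple linear model
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

e = y - (b0 + b1 * x)  # residuals e_i = y_i - yhat_i
print(e.sum())         # numerically zero (up to floating-point error)
```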
1b.
No. The fact that ∑_{i=1}^n e_i = 0 is a consequence of how we estimate ŷ_i (least-squares estimation), whereas the assumption that the ε_i are i.i.d. normal with mean 0 reflects our belief that there is no inherent bias in our measurement of Y. Even if the errors are not i.i.d. N(0, σ²), least-squares estimation will still give ∑_{i=1}^n e_i = 0.
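To illustrate the point (again a numpy sketch with fabricated data), we can use errors drawn from a heavily skewed, decidedly non-normal distribution and check that the least-squares residuals still sum to zero:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
# Errors are centered-exponential: skewed, far from normal
eps = rng.exponential(2.0, size=100) - 2.0
y = 1.0 + 0.5 * x + eps

# Least-squares fit as in 1a
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

e = y - (b0 + b1 * x)
print(e.sum())  # still numerically zero despite the non-normal errors
```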
2a.
Let X be the n×2 matrix with rows (1, x_i), and let Y = (y_1, …, y_n)ᵀ and β = (β₀, β₁)ᵀ.

β̂ = argmin_β L(β) = argmin_β (Y − Xβ)ᵀ(Y − Xβ)

Hence β̂ satisfies

∂L/∂β = −2Xᵀ(Y − Xβ) = 0
β̂ = (XᵀX)⁻¹XᵀY
Ŷ = Xβ̂ = X(XᵀX)⁻¹XᵀY
r can be expressed as a linear transformation of Y:

r = Y − Ŷ = Y − X(XᵀX)⁻¹XᵀY = (I_n − X(XᵀX)⁻¹Xᵀ)Y
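The matrix formulas above are easy to check numerically. The sketch below (numpy, made-up data) computes β̂ = (XᵀX)⁻¹XᵀY and r = (I_n − H)Y with H = X(XᵀX)⁻¹Xᵀ, and confirms that r agrees with the directly computed residuals Y − Xβ̂:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
x = rng.uniform(0, 5, size=n)
Y = 4.0 - 1.5 * x + rng.normal(0, 0.5, size=n)

X = np.column_stack([np.ones(n), x])           # design matrix with rows (1, x_i)
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)   # (X^T X)^{-1} X^T Y
H = X @ np.linalg.solve(X.T @ X, X.T)          # X (X^T X)^{-1} X^T
r = (np.eye(n) - H) @ Y                        # residuals as a linear map of Y
```

Using `np.linalg.solve` instead of explicitly inverting XᵀX is the standard numerically stable way to apply (XᵀX)⁻¹.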
2b.
Define P = (XᵀX)⁻¹Xᵀ, so that β̂ = PY, and let H = XP = X(XᵀX)⁻¹Xᵀ, the projection matrix onto the column space of X, so that r = (I_n − H)Y. Then

Cov(β̂, r) = Cov(PY, (I_n − H)Y)
           = P Cov(Y, Y)(I_n − H)ᵀ
           = P σ²I_n (I_n − H)          since H is symmetric
           = σ²(P − PH)
           = σ²(P − P)                  since PH = (XᵀX)⁻¹XᵀX(XᵀX)⁻¹Xᵀ = P
           = 0

In addition, Y is normally distributed and any linear transformation of a normal random vector is still normal; hence β̂ and r are jointly normal. Since uncorrelated jointly normal random variables are independent, β̂ and r are independent.
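The three matrix identities the covariance calculation relies on (H symmetric, H idempotent, and P(I_n − H) = 0) can be verified numerically; this is a sketch with an arbitrary simulated design matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
X = np.column_stack([np.ones(n), rng.uniform(0, 5, size=n)])

P = np.linalg.solve(X.T @ X, X.T)   # P = (X^T X)^{-1} X^T, so beta_hat = P Y
H = X @ P                           # projection onto col(X); r = (I - H) Y
I = np.eye(n)

print(np.allclose(H, H.T))                          # H is symmetric
print(np.allclose(H @ H, H))                        # H is idempotent
print(np.allclose(P @ (I - H), np.zeros_like(P)))   # P - PH = 0, so Cov = 0
```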
2c.
For simplicity, let's rewrite the model in matrix notation: Y = Xβ + ε. Since each ε_i is distributed N(0, σ²), we see that

∑_{i=1}^n ε_i² = εᵀε ∼ σ²χ²_n

We can rewrite this:

εᵀε = (Y − Xβ)ᵀ(Y − Xβ)
    = (Y − Xβ + Xβ̂ − Xβ̂)ᵀ(Y − Xβ + Xβ̂ − Xβ̂)
    = (Y − Xβ̂ + X(β̂ − β))ᵀ(Y − Xβ̂ + X(β̂ − β))
    = (Y − Xβ̂)ᵀ(Y − Xβ̂) + (β̂ − β)ᵀXᵀX(β̂ − β) + 2(β̂ − β)ᵀXᵀ(Y − Xβ̂)
    = (Y − Xβ̂)ᵀ(Y − Xβ̂) + (β̂ − β)ᵀXᵀX(β̂ − β)

The last equality follows from the fact that

(β̂ − β)ᵀXᵀ(Y − Xβ̂) = (β̂ − β)ᵀ(XᵀY − XᵀXβ̂) = 0,

since β̂ = (XᵀX)⁻¹XᵀY implies XᵀXβ̂ = XᵀY. Now, using Var(β̂) = σ²(XᵀX)⁻¹ and E[β̂] = β, we have β̂ − β ∼ N(0, σ²(XᵀX)⁻¹), so

(β̂ − β)ᵀXᵀX(β̂ − β) = σ²(β̂ − β)ᵀ Var(β̂)⁻¹(β̂ − β) ∼ σ²χ²₂,

and since (Y − Xβ̂)ᵀ(Y − Xβ̂) = rᵀr is independent of β̂ (hence of the second term) by 2b, the decomposition of εᵀε ∼ σ²χ²_n gives

(Y − Xβ̂)ᵀ(Y − Xβ̂) ∼ σ²χ²_{n−2}.
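A Monte Carlo sketch (numpy, made-up parameters) checks the distributional claim: if the residual sum of squares is distributed σ²χ²_{n−2}, then RSS/σ² should average to n − 2 across many simulated datasets:

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma, reps = 25, 2.0, 5000
x = rng.uniform(0, 5, size=n)
X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix
beta = np.array([1.0, 0.5])             # hypothetical true coefficients

rss = np.empty(reps)
for k in range(reps):
    Y = X @ beta + rng.normal(0, sigma, size=n)
    r = Y - H @ Y                       # residual vector
    rss[k] = r @ r                      # residual sum of squares

# If RSS ~ sigma^2 chi^2_{n-2}, then E[RSS / sigma^2] = n - 2 = 23
print(rss.mean() / sigma**2)  # close to 23
```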