The multiple linear model can also be expressed in the matrix format

$$
y = X\beta + \varepsilon,
$$

where

$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1k} \\
x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{pmatrix},
\qquad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{k-1} \end{pmatrix},
\qquad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_n \end{pmatrix}.
\tag{3.17}
$$
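As a concrete illustration, the matrix form (3.17) can be set up numerically. The following NumPy sketch uses simulated data; the dimensions, coefficient values, and error scale are illustrative assumptions, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 50, 3  # n observations, k coefficients (beta_0, ..., beta_{k-1})

# Design matrix X: a column of ones for the intercept beta_0,
# remaining columns are simulated predictor values.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

beta = np.array([2.0, -1.0, 0.5])    # (assumed) true coefficient vector
eps = rng.normal(scale=0.3, size=n)  # error vector epsilon

# The model y = X beta + epsilon in one matrix expression.
y = X @ beta + eps
```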
The matrix form of the multiple regression model allows us to discuss and present many properties of the regression model more conveniently and efficiently. As we will see later, simple linear regression is a special case of multiple linear regression and can also be expressed in this matrix format.
The least squares estimate of $\beta$ can be obtained through the least squares principle:

$$
b = \arg\min_{\beta}\,\bigl[(y - X\beta)'(y - X\beta)\bigr],
$$

where $b = (b_0, b_1, \cdots, b_{k-1})'$ is the $k$-dimensional vector of estimates of the regression coefficients.
Theorem 3.12. The least squares estimate of $\beta$ for the multiple linear regression model $y = X\beta + \varepsilon$ is $b = (X'X)^{-1}X'y$, assuming $(X'X)$ is a nonsingular matrix. Note that this is equivalent to assuming that the column vectors of $X$ are linearly independent.
Proof. To obtain the least squares estimate of $\beta$ we need to minimize the residual sum of squares by solving the following equation:

$$
\frac{\partial}{\partial b}\bigl[(y - Xb)'(y - Xb)\bigr] = 0,
$$

or equivalently,

$$
\frac{\partial}{\partial b}\bigl[y'y - 2y'Xb + b'X'Xb\bigr] = 0.
$$

By taking the partial derivative with respect to each component of $b$ we obtain the following normal equation of the multiple linear regression model:

$$
X'Xb = X'y.
$$

Since $X'X$ is nonsingular, it follows that $b = (X'X)^{-1}X'y$. This completes the proof. $\Box$
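Theorem 3.12 can be checked numerically. The NumPy sketch below (simulated data; the dimensions and coefficients are illustrative assumptions) solves the normal equation $X'Xb = X'y$ directly rather than forming $(X'X)^{-1}$ explicitly, which is numerically preferable, and cross-checks the result against NumPy's dedicated least squares routine:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 2.0, -0.5, 0.25])       # assumed true coefficients
y = X @ beta + rng.normal(scale=0.1, size=n)

# Least squares estimate b = (X'X)^{-1} X'y, obtained by
# solving the normal equation X'X b = X'y.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(b, b_lstsq))  # True
```

The residual vector $y - Xb$ is orthogonal to the columns of $X$, which is exactly what the normal equation states.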
We now discuss statistical properties of the least squares estimates of the regression coefficients. We first discuss the unbiasedness of the least squares estimator $b$.
Theorem 3.13. The estimator $b = (X'X)^{-1}X'y$ is an unbiased estimator of $\beta$. In addition,

$$
\mathrm{Var}(b) = (X'X)^{-1}\sigma^2.
\tag{3.18}
$$
Proof. We notice that

$$
E(b) = E\bigl((X'X)^{-1}X'y\bigr) = (X'X)^{-1}X'E(y) = (X'X)^{-1}X'X\beta = \beta.
$$

This completes the proof of the unbiasedness of $b$. We now discuss how to calculate the variance of $b$. The variance of $b$ can be computed directly:
$$
\begin{aligned}
\mathrm{Var}(b) &= \mathrm{Var}\bigl((X'X)^{-1}X'y\bigr) \\
&= (X'X)^{-1}X'\,\mathrm{Var}(y)\,\bigl((X'X)^{-1}X'\bigr)' \\
&= (X'X)^{-1}X'X(X'X)^{-1}\sigma^2 \\
&= (X'X)^{-1}\sigma^2,
\end{aligned}
$$

using $\mathrm{Var}(y) = \sigma^2 I$. $\Box$
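Both conclusions of Theorem 3.13 can be checked by simulation. In the sketch below (NumPy; the design matrix, $\beta$, and $\sigma$ are illustrative assumptions), the empirical mean and covariance of $b$ over many replicated error draws should approximate $\beta$ and $(X'X)^{-1}\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, -2.0])   # assumed true coefficients
sigma = 0.5                    # assumed error standard deviation

# Draw many independent error vectors, giving one y (and one b) per replicate.
reps = 20000
Y = X @ beta + rng.normal(scale=sigma, size=(reps, n))   # shape (reps, n)
B = np.linalg.solve(X.T @ X, X.T @ Y.T).T                # each row is one b

emp_mean = B.mean(axis=0)            # should be close to beta (unbiasedness)
emp_cov = np.cov(B, rowvar=False)    # should be close to (X'X)^{-1} sigma^2
theory_cov = sigma**2 * np.linalg.inv(X.T @ X)
```

Note that the design matrix $X$ is held fixed across replicates; only the error vector is redrawn, matching the classical model assumptions.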
Another parameter in the classical linear regression is the variance $\sigma^2$, a quantity that is unobservable. Statistical inference on the regression coefficients and regression model diagnostics depend heavily on the estimation of the error variance $\sigma^2$. In order to estimate $\sigma^2$, consider the residual sum of squares:
$$
e'e = (y - Xb)'(y - Xb) = y'\bigl[I - X(X'X)^{-1}X'\bigr]y = y'Py.
$$

This is actually a distance measure between the observed $y$ and the fitted regression value $\hat{y}$.
Note that it is easy to verify that $P = \bigl[I - X(X'X)^{-1}X'\bigr]$ is idempotent, i.e.,

$$
P^2 = \bigl[I - X(X'X)^{-1}X'\bigr]\bigl[I - X(X'X)^{-1}X'\bigr] = \bigl[I - X(X'X)^{-1}X'\bigr] = P.
$$
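Both the idempotency of $P$ and the identity $e'e = y'Py$ are easy to verify numerically. A minimal NumPy sketch with simulated data (all numeric values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

# P = I - X (X'X)^{-1} X', the projection onto the orthogonal
# complement of the column space of X.
P = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

b = np.linalg.solve(X.T @ X, X.T @ y)  # least squares estimate
e = y - X @ b                          # residual vector

print(np.allclose(P @ P, P))           # idempotent: P^2 = P
print(np.allclose(e @ e, y @ P @ y))   # e'e = y'Py
```

Since $e = Py$ exactly, the second identity follows from $e'e = y'P'Py = y'Py$, using the symmetry and idempotency of $P$.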