Imbens, Lecture Notes 18, ARE213 Spring ’06
ARE213
Econometrics
Spring 2006 UC Berkeley Department of Agricultural and Resource Economics
Endogeneity II:
Two-Stage-Least-Squares, Control Function,
and Limited-Information-Maximum-Likelihood Estimation
1. Two-Stage-Least-Squares
A more systematic way to combine the multiple instruments is through two-stage-least-squares (TSLS) estimation. Let us do this in more generality. The equation of interest is
$$
Y_i = X_i \beta + \varepsilon_i = X_{i1} \beta_1 + X_{i2} \beta_2 + \varepsilon_i .
$$
Let $\sigma^2$ be the variance of $\varepsilon_i$. The vector of covariates $X_i$ can be split into two parts, a possibly endogenous part $X_{i1}$ and an exogenous part $X_{i2}$. The vector of instruments is $Z_i$. It can be split into the excluded instruments $Z_{i1}$ and the exogenous covariates $X_{i2}$, or $Z_i = (Z_{i1}, X_{i2})$. Typically the common part $X_{i2}$ of the vectors $Z_i$ and $X_i$ will at least contain the intercept.
The TSLS estimation method consists of two stages. In the first stage all the endogenous
regressors are regressed on all the instruments and exogenous variables. That is, we estimate
$$
X_{i1} = Z_i \Pi + \eta_i = Z_{i1} \Pi_1 + X_{i2} \Pi_2 + \eta_i .
$$
Note that $X_{i1}$ is a $K$-vector, so that with $Z_i$ an $L$-vector, $\Pi$ is an $L \times K$ matrix of parameters.
Estimating this by least squares leads to
$$
\hat{\Pi} = (Z'Z)^{-1} Z'X_1 .
$$
We then calculate the predicted values for $X_1$ based on this regression:
$$
\hat{X}_1 = Z \hat{\Pi} .
$$
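The first stage can be sketched numerically as follows. This is only an illustration on simulated data, not the Angrist-Krueger sample: the sample size, the coefficient values, and all variable names (`Z1`, `X2`, `X1`, `Pi_hat`, `X1_hat`) are assumptions made for the example. It uses a single endogenous regressor, two excluded instruments, and an intercept as the only exogenous covariate.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500

# Simulated data (illustrative assumption, not from the notes).
X2 = np.ones((N, 1))                    # exogenous covariates: intercept only
Z1 = rng.normal(size=(N, 2))            # two excluded instruments
Z = np.hstack([Z1, X2])                 # Z_i = (Z_i1, X_i2)
eta = rng.normal(size=(N, 1))           # first-stage error
X1 = Z @ np.array([[0.5], [-0.3], [1.0]]) + eta   # endogenous regressor

# First stage: Pi_hat = (Z'Z)^{-1} Z'X_1, computed with a linear solve
# instead of forming the inverse explicitly.
Pi_hat = np.linalg.solve(Z.T @ Z, Z.T @ X1)

# Predicted values of the endogenous regressor: X1_hat = Z Pi_hat.
X1_hat = Z @ Pi_hat

print("Pi_hat shape:", Pi_hat.shape)
print("Pi_hat:", Pi_hat.ravel())
```

By construction the first-stage residuals `X1 - X1_hat` are orthogonal to every column of `Z`, which is a quick sanity check on the computation.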
Note that if we have a similar equation for $X_{i2}$,
$$
X_{i2} = Z_i \Pi + \eta_i = Z_{i1} \Pi_1 + X_{i2} \Pi_2 + \eta_i ,
$$
the result would be $\Pi_2 = I$ and $\Pi_1 = 0$, so that the predicted value is $\hat{X}_2 = X_2$. Hence in the end we could treat all regressors symmetrically and just regress $X$ on $Z$ to get
$$
\hat{X} = Z (Z'Z)^{-1} Z'X .
$$
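The claim that the exogenous regressors are reproduced exactly by the first stage can be checked numerically. The sketch below uses simulated data; the dimensions, coefficients, and variable names are assumptions made for illustration. It regresses the full matrix $X = (X_1, X_2)$ on $Z = (Z_1, X_2)$ and inspects the block of $\hat{\Pi}$ that belongs to $X_2$.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200

# Simulated data (illustrative assumption, not from the notes).
X2 = np.hstack([np.ones((N, 1)), rng.normal(size=(N, 1))])   # exogenous part
Z1 = rng.normal(size=(N, 2))                                  # excluded instruments
Z = np.hstack([Z1, X2])                                       # Z = (Z1, X2)
X1 = Z1 @ np.array([[0.8], [0.4]]) + rng.normal(size=(N, 1))  # endogenous part
X = np.hstack([X1, X2])                                       # X = (X1, X2)

# Regress every column of X on Z: Pi_hat = (Z'Z)^{-1} Z'X, then X_hat = Z Pi_hat.
Pi_hat = np.linalg.solve(Z.T @ Z, Z.T @ X)
X_hat = Z @ Pi_hat

# Block of Pi_hat for the X2 columns: rows 0-1 multiply Z1, rows 2-3 multiply X2.
Pi1_for_X2 = Pi_hat[:2, 1:]   # numerically zero
Pi2_for_X2 = Pi_hat[2:, 1:]   # numerically the identity

print(np.round(Pi1_for_X2, 10))
print(np.round(Pi2_for_X2, 10))
print("max |X2_hat - X2|:", np.abs(X_hat[:, 1:] - X2).max())
```

Only the column of `X_hat` belonging to the endogenous regressor differs from the corresponding column of `X`.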
In the second stage the outcome is regressed on the predicted regressors:
$$
Y_i = \hat{X}_i \beta + \nu_i = (Z_i \hat{\Pi}) \beta + \nu_i .
$$
We can write the estimator for $\beta$ as:
$$
\hat{\beta} = \Bigl( (X'Z) \cdot (Z'Z)^{-1} \cdot (Z'X) \Bigr)^{-1} \cdot (X'Z) \cdot (Z'Z)^{-1} \cdot (Z'Y) .
$$
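The closed-form expression and the literal two-step procedure give the same estimate, because $\hat{X} = Z(Z'Z)^{-1}Z'X$ is a projection of $X$ onto the columns of $Z$, so that $\hat{X}'\hat{X} = X'Z(Z'Z)^{-1}Z'X$ and $\hat{X}'Y = X'Z(Z'Z)^{-1}Z'Y$. The sketch below checks this on simulated data; the data-generating process and all names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1000

# Simulated data (illustrative assumption): the shock u enters both the
# regressor X1 and the outcome error, which makes X1 endogenous.
const = np.ones((N, 1))
Z1 = rng.normal(size=(N, 3))
u = rng.normal(size=(N, 1))
X1 = Z1 @ np.array([[0.6], [0.4], [0.2]]) + u + rng.normal(size=(N, 1))
eps = u + rng.normal(size=(N, 1))

X = np.hstack([X1, const])     # regressors: endogenous X1 plus intercept
Z = np.hstack([Z1, const])     # instruments: excluded Z1 plus intercept
Y = X @ np.array([[1.0], [2.0]]) + eps

# Closed form: ((X'Z)(Z'Z)^{-1}(Z'X))^{-1} (X'Z)(Z'Z)^{-1}(Z'Y).
ZZinv_ZX = np.linalg.solve(Z.T @ Z, Z.T @ X)
ZZinv_ZY = np.linalg.solve(Z.T @ Z, Z.T @ Y)
beta_closed = np.linalg.solve(X.T @ Z @ ZZinv_ZX, X.T @ Z @ ZZinv_ZY)

# Two steps: form X_hat = Z (Z'Z)^{-1} Z'X, then run OLS of Y on X_hat.
X_hat = Z @ ZZinv_ZX
beta_two_step = np.linalg.lstsq(X_hat, Y, rcond=None)[0]

print("closed form:", beta_closed.ravel())
print("two-step:   ", beta_two_step.ravel())
```

The two coefficient vectors agree up to floating-point error. The standard errors reported by the second-stage OLS regression are a different matter and are not valid for TSLS.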
In large samples
$$
\sqrt{N} \cdot (\hat{\beta} - \beta) \sim \mathcal{N}\Bigl( 0, \ \sigma^2 \cdot \bigl( (X'Z) \cdot (Z'Z)^{-1} \cdot (Z'X) \bigr)^{-1} \Bigr) .
$$
The error variance is $E[(Y - X\beta)^2]$, estimated as $\sum_i (Y_i - X_i \hat{\beta})^2 / N$. Note that this variance is not the variance you would get as the standard OLS variance from regressing $Y_i$ on $\hat{X}_i$.
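The difference between the two variance estimates can be seen in a small simulation. The sketch below is an illustration under assumed data, and the helper `tsls` is a hypothetical function written for this example, not part of any library. It computes the residual variance with the original regressors $X_i$, as in the formula above, alongside the naive version that uses $\hat{X}_i$, which is what a second-stage OLS routine would report. The covariance matrix is computed as $\hat{\sigma}^2 (X'Z(Z'Z)^{-1}Z'X)^{-1}$, taken here as the finite-sample counterpart for $\hat{\beta}$ itself; that scaling is an assumption of the sketch.

```python
import numpy as np

def tsls(Y, X, Z):
    """Hypothetical helper: TSLS coefficients plus two residual-variance estimates."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)        # first-stage fitted values
    beta = np.linalg.solve(X_hat.T @ X, X_hat.T @ Y)     # TSLS estimator
    n = Y.shape[0]
    # Correct estimate: residuals use the original regressors X.
    sigma2 = float(np.sum((Y - X @ beta) ** 2)) / n
    # Naive estimate: residuals use the predicted regressors X_hat.
    sigma2_naive = float(np.sum((Y - X_hat @ beta) ** 2)) / n
    # Covariance of beta_hat: sigma2 * (X'Z (Z'Z)^{-1} Z'X)^{-1} = sigma2 * (X_hat'X_hat)^{-1}.
    cov = sigma2 * np.linalg.inv(X_hat.T @ X_hat)
    return beta, sigma2, sigma2_naive, cov

rng = np.random.default_rng(3)
N = 2000

# Simulated data (illustrative assumption): u makes X1 endogenous.
const = np.ones((N, 1))
Z1 = rng.normal(size=(N, 3))
u = rng.normal(size=(N, 1))
X1 = Z1 @ np.array([[0.6], [0.4], [0.2]]) + u + rng.normal(size=(N, 1))
X = np.hstack([X1, const])
Z = np.hstack([Z1, const])
Y = X @ np.array([[1.0], [2.0]]) + u + rng.normal(size=(N, 1))

beta, sigma2, sigma2_naive, cov = tsls(Y, X, Z)
print("beta_hat:", beta.ravel())
print("sigma2 using X:    ", sigma2)
print("sigma2 using X_hat:", sigma2_naive)
print("standard errors:", np.sqrt(np.diag(cov)))
```

In this design the naive residual also contains the first-stage error multiplied by $\hat{\beta}$, so the two variance estimates differ substantially.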
Let us see what we get from this for the Angrist-Krueger data. The first stage is the same regression we did before:
$$
\begin{array}{rcccc}
\mathrm{educ}_i = & 12.6881 & {}+ 0.0566 \cdot \mathrm{qob}_{2i} & {}+ 0.1173 \cdot \mathrm{qob}_{3i} & {}+ 0.1514 \cdot \mathrm{qob}_{4i} \\
 & (0.0115) & (0.0163) & (0.0160) & (0.0163)
\end{array}
$$