Identification, ctd.
The coefficients $\beta_1, \dots, \beta_k$ are said to be:
• exactly identified if $m = k$. There are just enough instruments to estimate $\beta_1, \dots, \beta_k$.
• overidentified if $m > k$. There are more than enough instruments to estimate $\beta_1, \dots, \beta_k$. If so, you can test whether the instruments are valid (a test of the "overidentifying restrictions"); we'll return to this later.
• underidentified if $m < k$. There are too few instruments to estimate $\beta_1, \dots, \beta_k$. If so, you need to get more instruments!
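The counting rule above can be sketched in a few lines of Python. The function name `identification_status` is hypothetical and used only for illustration. It checks nothing beyond the comparison of $m$ (number of instruments) with $k$ (number of endogenous regressors), so it says nothing about whether the instruments are relevant or exogenous.

```python
def identification_status(m: int, k: int) -> str:
    """Classify identification by comparing the instrument count m
    with the number of endogenous regressors k (counting rule only)."""
    if m < 0 or k < 1:
        raise ValueError("need m >= 0 instruments and k >= 1 endogenous regressors")
    if m == k:
        return "exactly identified"
    if m > k:
        return "overidentified"
    return "underidentified"


# Example: one endogenous regressor with 1, 2, or 0 instruments.
statuses = [identification_status(m, 1) for m in (1, 2, 0)]
```

With one endogenous regressor, a single instrument gives exact identification, two give overidentification, and none leaves the coefficient underidentified.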
The general IV regression model: Summary of jargon
$$Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki} + \beta_{k+1} W_{1i} + \dots + \beta_{k+r} W_{ri} + u_i$$
• $Y_i$ is the dependent variable
• $X_{1i}, \dots, X_{ki}$ are the endogenous regressors (potentially correlated with $u_i$)
• $W_{1i}, \dots, W_{ri}$ are the included exogenous variables or included exogenous regressors (uncorrelated with $u_i$)
• $\beta_0, \beta_1, \dots, \beta_{k+r}$ are the unknown regression coefficients
• $Z_{1i}, \dots, Z_{mi}$ are the $m$ instrumental variables (the excluded exogenous variables)
• The coefficients are overidentified if $m > k$; exactly identified if $m = k$; and underidentified if $m < k$.
TSLS with a single endogenous regressor
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 W_{1i} + \dots + \beta_{1+r} W_{ri} + u_i$$
• $m$ instruments: $Z_{1i}, \dots, Z_{mi}$
• First stage
  • Regress $X_1$ on all the exogenous regressors: regress $X_1$ on $W_1, \dots, W_r, Z_1, \dots, Z_m$ by OLS
  • Compute predicted values $\hat{X}_{1i}$, $i = 1, \dots, n$
• Second stage
  • Regress $Y$ on $\hat{X}_1, W_1, \dots, W_r$ by OLS
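The two stages can be sketched with NumPy as below. This is a minimal illustration, not a reference implementation. The helper names (`ols`, `tsls_single_endogenous`), the simulated data, and all coefficient values are assumptions made for the example. A constant is added to both stages, matching the intercept $\beta_0$ in the model.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients of y on the columns of X (X already includes a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def tsls_single_endogenous(y, x1, W, Z):
    """Two-stage least squares with one endogenous regressor x1,
    included exogenous regressors W (n x r), and instruments Z (n x m).
    Returns coefficients ordered as [constant, x1, W_1, ..., W_r]."""
    n = y.shape[0]
    const = np.ones((n, 1))
    # First stage: regress X1 on a constant, W_1..W_r and Z_1..Z_m by OLS.
    first_stage_X = np.hstack([const, W, Z])
    x1_hat = first_stage_X @ ols(x1, first_stage_X)  # predicted values of X1
    # Second stage: regress Y on a constant, the predicted X1, and W_1..W_r.
    second_stage_X = np.hstack([const, x1_hat[:, None], W])
    return ols(y, second_stage_X)


# Simulated data (illustrative assumption): true coefficient on X1 is 1.5.
rng = np.random.default_rng(0)
n = 20_000
W = rng.normal(size=(n, 1))          # one included exogenous regressor (r = 1)
Z = rng.normal(size=(n, 2))          # two instruments (m = 2 > k = 1)
u = rng.normal(size=n)               # regression error
v = 0.8 * u + rng.normal(size=n)     # makes X1 correlated with u
x1 = 1.0 + 0.5 * W[:, 0] + Z @ np.array([1.0, -1.0]) + v
y = 2.0 + 1.5 * x1 - 0.7 * W[:, 0] + u

beta_tsls = tsls_single_endogenous(y, x1, W, Z)
beta_ols = ols(y, np.hstack([np.ones((n, 1)), x1[:, None], W]))
```

In this simulation `beta_tsls[1]` should land close to the true value 1.5, while `beta_ols[1]` is pushed upward because $X_1$ is correlated with $u$. The sketch returns point estimates only. The standard errors reported by a plain second-stage OLS are not valid for TSLS, because they ignore that $\hat{X}_1$ is itself estimated.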