ECON 103, Lecture 15A: Instrumental Variables I
Maria Casanova

March 4 (NEW VERSION!) March 9th (version 0) May 26th

Requirements for this lecture: Chapter 12 of Stock and Watson

0. Introduction
In lecture 12 we covered five threats to the internal validity of the linear regression model. All five arose because the error term was correlated with the regressor, which caused the OLS estimator of the unknown population coefficients to be biased.

Two of those threats to validity are:
- Omitted variable bias
- Simultaneous causality bias

Instrumental variables (IV) regression can be used to obtain a consistent estimator of the unknown coefficients in the presence of omitted variable bias or simultaneous causality bias.
How does IV work? - Intuition

Consider the following model:

$Y = \beta_0 + \beta_1 X + \varepsilon$

Think of the variation in $X$ as having two sources:
- One part that is correlated with the error term
- One part that is not correlated with it

IV uses one or more additional variables $Z$, called instrumental variables or instruments, to isolate the variation in $X$ that is not correlated with $\varepsilon$. In this way the source of bias is avoided, so that consistent estimates of $\beta_1$ can be obtained.
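As a reminder of why the correlated part of $X$ matters, the large-sample limit of the OLS slope can be written as below. This is the standard textbook result, stated here as background and not taken from the slide itself.

```latex
\hat{\beta}_1^{OLS} \;\xrightarrow{\;p\;}\; \beta_1 + \frac{\mathrm{cov}(X, \varepsilon)}{\mathrm{var}(X)}
```

The second term vanishes only when $X$ and $\varepsilon$ are uncorrelated, which is why IV tries to use only the part of the variation in $X$ for which that holds.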
Example 1: omitted variable bias

Consider the following model for the average test score in class $j$:

$\text{AvTestScore}_j = \beta_0 + \beta_1 \text{Size}_j + \varepsilon_j$

Income would be an omitted variable in this model if:
- Income had an effect on average test scores, AND
- Income was correlated with class size.

If income increases average test scores and is negatively correlated with class size, then the OLS estimate of $\beta_1$ would be biased downwards.
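The direction of the bias can be checked with the usual omitted variable bias formula. The sketch below assumes, for illustration only, that the true model also contains income with coefficient $\beta_2$ and an error $u_j$ that is uncorrelated with class size; $\beta_2$ and $u_j$ are symbols introduced here, not in the slides.

```latex
% Assumed true model (illustrative):
\text{AvTestScore}_j = \beta_0 + \beta_1 \text{Size}_j + \beta_2 \text{Income}_j + u_j

% Large-sample limit of the OLS slope when Income is omitted:
\hat{\beta}_1^{OLS} \;\xrightarrow{\;p\;}\; \beta_1
  + \beta_2 \, \frac{\mathrm{cov}(\text{Size}_j, \text{Income}_j)}{\mathrm{var}(\text{Size}_j)}
```

With $\beta_2 > 0$ (income raises test scores) and a negative covariance between income and class size, the extra term is negative, so the OLS estimate lies below $\beta_1$.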
Now imagine that we can isolate two different sources of variation in class sizes:
- On the one hand, there is some variation in class sizes that is due to income (richer neighborhoods can afford more teachers, i.e. smaller classes).
- But class size can also vary for other reasons.

For example, imagine that in year 2000 county A and county B have two state schools each, one in a poor and one in a rich neighborhood:

Table: Average number of students per class in 2000

                     County A   County B
Rich neighborhood       17         17
Poor neighborhood       25         25
While the two counties have the same budget, in 2000 County A makes some lucky investment decisions that produce an unexpected windfall in 2001. This allows county A to hire extra teachers in 2001.

Table: Average number of students per class in 2001

                     County A   County B
Rich neighborhood       15         17
Poor neighborhood       21         25

If we can isolate the variation in class size due to the unexpected returns, we can measure the effect of a change in class size that is not due to neighborhood income. We can do this with IV regression. This procedure will give us consistent estimates of $\beta_1$.
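The variation in class size that the windfall generates can be read off the two tables. The short sketch below only does that arithmetic; averaging the two schools in a county with equal weights is an assumption made here for illustration, and the helper names are hypothetical.

```python
# Average class sizes (students per class) taken from the 2000 and 2001 tables.
class_size = {
    ("A", "rich"): {2000: 17, 2001: 15},
    ("A", "poor"): {2000: 25, 2001: 21},
    ("B", "rich"): {2000: 17, 2001: 17},
    ("B", "poor"): {2000: 25, 2001: 25},
}


def change(county, neighborhood):
    """Change in average class size between 2000 and 2001 for one school."""
    sizes = class_size[(county, neighborhood)]
    return sizes[2001] - sizes[2000]


def county_mean_change(county):
    """Unweighted mean change over the two schools in a county (assumption)."""
    return sum(change(county, n) for n in ("rich", "poor")) / 2


# Class-size change in the windfall county relative to the other county.
windfall_shift = county_mean_change("A") - county_mean_change("B")
print("Change in class size associated with the windfall:", windfall_shift)
```

Within each neighborhood type, the change in County A relative to County B is not driven by neighborhood income, which is the kind of variation an IV regression would exploit.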
Example 2: Simultaneous causality

Remember the model we saw in lecture 12B:

$\text{Crime}_j = \beta_0 + \beta_1 \text{Pol}_j + \varepsilon_j$

There is simultaneous causality in the model if the number of policemen per capita has an effect on crime, and at the same time the number of crimes per capita has an effect on the size of the police force. We saw in lecture 12B that this leads to a correlation between the regressor and the error term.

If we could isolate the variation in the size of the police force that is not caused by changes in the crime rate, we could obtain a consistent estimate of $\beta_1$.
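One way to see where the correlation comes from is to write down a second equation for the police force. The equation below, with coefficients $\gamma_0, \gamma_1$ and error $u_j$, is a hypothetical illustration and not part of the slides; it also assumes $\mathrm{cov}(u_j, \varepsilon_j) = 0$ and $\gamma_1 \beta_1 \neq 1$.

```latex
% Assumed reverse-causality equation (illustrative):
\text{Pol}_j = \gamma_0 + \gamma_1 \text{Crime}_j + u_j

% Substitute the crime equation and solve for Pol_j:
\text{Pol}_j = \frac{\gamma_0 + \gamma_1 \beta_0 + \gamma_1 \varepsilon_j + u_j}{1 - \gamma_1 \beta_1}

% Hence the regressor is correlated with the error term:
\mathrm{cov}(\text{Pol}_j, \varepsilon_j) = \frac{\gamma_1 \, \sigma_\varepsilon^2}{1 - \gamma_1 \beta_1}
```

Under these assumptions the covariance is nonzero whenever $\gamma_1 \neq 0$, that is, whenever crime feeds back into the size of the police force.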
Imagine that we have data on policemen and crimes per capita for every county in the state of CA. We discover that when an out-of-control fire threatens a city, it is common for the county to send policemen to help the fire department with logistics.

Measuring the number of fires that required police cooperation in each county every month for one year would identify variation in policemen per capita that is unrelated to the crime rate. We could use the number of such fires as an instrument to estimate $\beta_1$ through an IV regression.

1. IV regression with 1 regressor and 1 instrument
We start from the population regression:

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$,

where $X_i$ and $\varepsilon_i$ are correlated. Since the first least squares assumption is violated, the OLS estimator is inconsistent. IV regression uses an instrumental variable $Z$ to isolate the variation in $X$ that is not correlated with $\varepsilon$.

Terminology:
- An endogenous variable is one that is correlated with $\varepsilon$.
- An exogenous variable is one that is uncorrelated with $\varepsilon$.
Conditions for a valid instrument

For an instrumental variable $Z$ to be valid, it must satisfy two conditions:

1. Instrument relevance: variation in the instrument is related to variation in $X_i$, i.e. $\mathrm{corr}(Z_i, X_i) \neq 0$.
2. Instrument exogeneity: the part of the variation in $X_i$ captured by the instrumental variable is exogenous, i.e. not correlated with the error term: $\mathrm{corr}(Z_i, \varepsilon_i) = 0$.
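A short derivation shows why both conditions are needed to recover $\beta_1$ in the model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$. This is the standard argument for the one-regressor, one-instrument case; $s_{ZY}$ and $s_{ZX}$ denote sample covariances.

```latex
% Take the covariance of both sides of the model with Z_i:
\mathrm{cov}(Z_i, Y_i) = \beta_1 \, \mathrm{cov}(Z_i, X_i) + \mathrm{cov}(Z_i, \varepsilon_i)

% Exogeneity sets the last term to zero; relevance allows division by cov(Z_i, X_i):
\beta_1 = \frac{\mathrm{cov}(Z_i, Y_i)}{\mathrm{cov}(Z_i, X_i)}

% Replacing population covariances with sample covariances gives the estimator:
\hat{\beta}_1^{TSLS} = \frac{s_{ZY}}{s_{ZX}}
```

If exogeneity fails, the term $\mathrm{cov}(Z_i, \varepsilon_i)$ does not drop out; if relevance fails, the ratio is undefined because the denominator is zero.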
If $Z$ satisfies the two conditions to be a valid instrument, then $\beta_1$ can be estimated using an IV estimator called two-stage least squares (TSLS). As its name suggests, TSLS proceeds in two stages:

1. First, we isolate the part of $X$ that is uncorrelated with $\varepsilon$ by regressing $X$ on $Z$ using OLS:

   $X_i = \pi_0 + \pi_1 Z_i + v_i$

   Because $Z_i$ is uncorrelated with $\varepsilon_i$, $\pi_0 + \pi_1 Z_i$ is also uncorrelated with $\varepsilon_i$. We don't know $\pi_0$ and $\pi_1$, so we estimate them and then compute the predicted values of $X_i$, i.e.

   $\hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i$

2. Second, we replace $X_i$ with $\hat{X}_i$ in the regression of interest, and regress $Y_i$ on $\hat{X}_i$ using OLS:

   $Y_i = \beta_0 + \beta_1 \hat{X}_i + \varepsilon_i$

   Because $\hat{X}_i$ is uncorrelated with $\varepsilon_i$, the first least squares assumption holds. Thus the estimate $\hat{\beta}_1$ obtained by OLS in the second regression is consistent. The resulting estimator is called the Two Stage Least Squares (TSLS) estimator, $\hat{\beta}_1^{TSLS}$.

2. The general IV regression model
The equation of interest is:

$Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki} + \beta_{k+1} W_{1i} + \dots + \beta_{k+r} W_{ri} + \varepsilon_i$

where:
- $Y_i$ is the dependent variable
- $X_{1i}, \dots, X_{ki}$ are the endogenous regressors (potentially correlated with $\varepsilon_i$)
- $W_{1i}, \dots, W_{ri}$ are the included exogenous regressors (uncorrelated with $\varepsilon_i$)
- $\beta_0, \dots, \beta_{k+r}$ are the unknown regression coefficients
- $Z_{1i}, \dots, Z_{mi}$ are the instrumental variables (or excluded exogenous regressors)

The population first-stage regression relates $X$ to the exogenous variables, i.e. the $Z$'s and the $W$'s:

$X_i = \pi_0 + \pi_1 Z_{1i} + \dots + \pi_m Z_{mi} + \pi_{m+1} W_{1i} + \dots + \pi_{m+r} W_{ri} + v_i$

where $\pi_0, \dots, \pi_{m+r}$ are the unknown regression coefficients.

When there are multiple endogenous regressors $X_{1i}, \dots, X_{ki}$, each endogenous regressor requires its own first-stage regression. Each of these first-stage regressions includes as independent variables all the instruments ($Z$'s) and all the included exogenous variables ($W$'s).
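The first-stage and second-stage regressions described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's implementation: the function name `tsls_general`, the simulated data and the coefficient values are all assumptions made here, and the sketch returns point estimates only (no standard errors).

```python
import numpy as np


def tsls_general(y, X_endog, W_exog, Z_instr):
    """TSLS point estimates, ordered as (constant, X coefficients, W coefficients)."""
    n = len(y)
    const = np.ones((n, 1))
    # First-stage regressors: constant, all instruments Z, all included exogenous W.
    exog = np.hstack([const, Z_instr, W_exog])
    # One first-stage regression per endogenous regressor (one per column of X_endog).
    pi_hat, *_ = np.linalg.lstsq(exog, X_endog, rcond=None)
    X_hat = exog @ pi_hat
    # Second stage: regress y on the constant, the predicted X's and the W's.
    second_stage = np.hstack([const, X_hat, W_exog])
    beta_hat, *_ = np.linalg.lstsq(second_stage, y, rcond=None)
    return beta_hat


# Simulated data (all values are assumptions): k = 1, r = 1, m = 2.
rng = np.random.default_rng(1)
n = 200_000
W = rng.normal(size=(n, 1))
Z = rng.normal(size=(n, 2))
eps = rng.normal(size=n)
v = 0.7 * eps + rng.normal(size=n)  # correlation with eps makes X endogenous
X = (0.5 + 0.6 * Z[:, 0] - 0.4 * Z[:, 1] + 0.3 * W[:, 0] + v).reshape(-1, 1)
y = 1.0 + 2.0 * X[:, 0] - 1.0 * W[:, 0] + eps

beta = tsls_general(y, X, W, Z)
print("TSLS estimates (constant, X, W):", beta)
```

In this simulation the true coefficients are 1, 2 and -1, so the printed estimates should be close to those values in a sample this large.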
The conditions for an instrument to be valid in the general IV model:

1. Relevance: If there is only one $X$ but multiple $Z$'s, relevance requires that at least one $Z$ is useful for predicting $X$, given $W$. When there are multiple $X$'s, the instruments must provide enough information about the exogenous movements in the $X$'s to sort out their separate effects on $Y$ (otherwise, there will be perfect multicollinearity in the second stage).
2. Exogeneity: each instrument $Z$ must be uncorrelated with the error term $\varepsilon_i$.

3. The IV regression assumptions

The IV regression assumptions modify the least squares assumptions that we covered in lecture 6 and lecture 7. Under the IV regression assumptions, the TSLS estimator ...

1. ... is consistent
2. ... is approximately normally distributed in large samples.

Ass1: The conditional distribution of $\varepsilon_i$ given $W_i$ has zero mean:

$E(\varepsilon_i \mid W_i) = 0$

→ Notice that the first least squares assumption required the error term to have conditional mean zero given all the regressors, while the first IV assumption only requires this given the included exogenous variables.
Ass2: The $(X_i, W_i, Z_i, Y_i)$ variables are independently and identically distributed.

Ass3: Large outliers are unlikely.

Ass4: There is no perfect collinearity among the included exogenous regressors (e.g. $W_1$ and $W_2$).

→ Assumptions 2 and 3 are the same as the second and third least squares assumptions.
→ Assumption 4 does not rule out perfect collinearity among the endogenous regressors ($X$), but this will be implicit in assumption 5.

Ass5: The two conditions for a valid instrument hold:
1. Relevance: [When there are multiple $X$'s] the instruments must provide enough information about the exogenous movements in the $X$'s to sort out their separate effects on $Y$ (otherwise, there will be perfect multicollinearity in the second stage).
2. Exogeneity: each instrument $Z$ must be uncorrelated with the error term $\varepsilon_i$.

4. Inference with the TSLS estimator

The IV regression assumptions ensure that the TSLS estimator is approximately normally distributed in large samples. The procedures for statistical inference we saw for the OLS estimator extend to TSLS regression.
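Concretely, the usual large-sample tools carry over. As a sketch, with $\beta_{1,0}$ denoting a hypothesized value (a symbol introduced here for illustration) and $SE(\hat{\beta}_1^{TSLS})$ the correctly computed TSLS standard error:

```latex
% t-statistic for H_0: \beta_1 = \beta_{1,0}
t = \frac{\hat{\beta}_1^{TSLS} - \beta_{1,0}}{SE(\hat{\beta}_1^{TSLS})}

% 95% confidence interval for \beta_1
\hat{\beta}_1^{TSLS} \pm 1.96 \times SE(\hat{\beta}_1^{TSLS})
```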
The TSLS standard errors:

When we run the two steps separately to compute the TSLS estimator we obtain incorrect standard errors, because we do not take into account that the regressors $\hat{X}$ are predicted from the first-stage regression. Using the command ivreg in Stata ensures that we get the appropriate standard errors.

As was the case with OLS, the error term $\varepsilon$ may be heteroskedastic. When in doubt, you should use the robust option in Stata in order to obtain heteroskedasticity-robust standard errors.

...
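The point about running the two steps by hand can be illustrated with a small simulation. The sketch below is written in Python, not Stata, and makes several assumptions for simplicity: one regressor, one instrument, simulated data with made-up coefficients, and homoskedasticity-only standard errors (not the robust ones recommended above). The naive standard error uses the second-stage residuals $Y_i - \hat{\beta}_0 - \hat{\beta}_1 \hat{X}_i$, while the corrected one uses $Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$ with the actual $X_i$.

```python
import numpy as np


def fit(y, x):
    """OLS of y on a constant and x; returns the coefficients and the design matrix."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X


# Simulated data (all values are assumptions made for illustration).
rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)
eps = rng.normal(size=n)
x = 1.0 + 0.5 * z + 0.8 * eps + rng.normal(size=n)  # x is correlated with eps
y = 2.0 + 3.0 * x + eps

# Stage 1: regress x on z and form the predicted values.
pi_hat, Z_mat = fit(x, z)
x_hat = Z_mat @ pi_hat

# Stage 2: regress y on the predicted values.
beta_hat, Xhat_mat = fit(y, x_hat)
XtX_inv = np.linalg.inv(Xhat_mat.T @ Xhat_mat)

# Residuals a by-hand second-stage OLS would use (built from x_hat).
resid_naive = y - Xhat_mat @ beta_hat
# Residuals the TSLS standard error should use (built from the actual x).
resid_correct = y - (beta_hat[0] + beta_hat[1] * x)


def slope_se(resid):
    """Homoskedasticity-only standard error of the slope for a given residual vector."""
    sigma2 = resid @ resid / (n - 2)
    return float(np.sqrt(sigma2 * XtX_inv[1, 1]))


se_naive = slope_se(resid_naive)
se_correct = slope_se(resid_correct)
print(f"slope: {beta_hat[1]:.3f}, naive SE: {se_naive:.4f}, corrected SE: {se_correct:.4f}")
```

The point estimate is the same either way; only the residual variance, and therefore the standard error, differs. Dedicated IV routines handle this correction automatically.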
This note was uploaded on 03/15/2010 for the course ECON 103 taught by Professor Sandrablack during the Winter '07 term at UCLA.