LASSO METHODS FOR GAUSSIAN INSTRUMENTAL VARIABLES MODELS

A. BELLONI, V. CHERNOZHUKOV, AND C. HANSEN

Abstract. In this note, we propose to use sparse methods (e.g. LASSO, Post-LASSO, √LASSO, and Post-√LASSO) to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments in the canonical Gaussian case. The methods apply even when the number of instruments is much larger than the sample size. We derive asymptotic distributions for the resulting IV estimators and provide conditions under which these sparsity-based IV estimators are asymptotically oracle-efficient. In simulation experiments, a sparsity-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. We illustrate the procedure in an empirical example using the Angrist and Krueger (1991) schooling data.

1. Introduction

Instrumental variables (IV) methods are widely used in applied statistics, econometrics, and more generally for estimating treatment effects in situations where the treatment status is not randomly assigned; see, for example, [1, 4, 5, 7, 16, 21, 26, 27, 29, 30] among many others. Identification of the causal effects of interest in this setting may be achieved through the use of observed instrumental variables that are relevant in determining the treatment status but are otherwise unrelated to the outcome of interest. In some situations, many such instrumental variables are available, and the researcher is left with the question of which set of the instruments to use in constructing the IV estimator. We consider one approach to answering this question, based on sparse-estimation methods in a simple Gaussian setting.

Date: First version: June 2009. This version: December 7, 2010.
arXiv:1012.1297v1 [stat.ME] 6 Dec 2010
Throughout the paper we consider the Gaussian simultaneous equation model:

\[ y_{1i} = y_{2i}\alpha_1 + w_i'\alpha_2 + \epsilon_i, \tag{1.1} \]
\[ y_{2i} = D(x_i) + v_i, \tag{1.2} \]
\[ \begin{pmatrix} \epsilon_i \\ v_i \end{pmatrix} \sim N\left( 0, \begin{pmatrix} \sigma_\epsilon^2 & \sigma_{\epsilon v} \\ \sigma_{\epsilon v} & \sigma_v^2 \end{pmatrix} \right), \tag{1.3} \]

where $y_{1i}$ is the response variable, $y_{2i}$ is the endogenous variable, $w_i$ is a $k_w$-vector of control variables, $x_i = (z_i', w_i')'$ is a vector of instrumental variables (IV), and $(\epsilon_i, v_i)$ are disturbances that are independent of $x_i$. The function $D(x_i) = \mathrm{E}[y_{2i} \mid x_i]$ is an unknown, potentially complicated function of the instruments.

Given a sample $(y_{1i}, y_{2i}, x_i)$, $i = 1, \ldots, n$, from the model above, the problem is to construct an IV estimator for $\alpha_0 = (\alpha_1, \alpha_2')'$ that enjoys good finite-sample properties and is asymptotically efficient.

We consider the case of fixed design, namely we treat the covariate values $x_1, \ldots, x_n$ as fixed. This includes random sampling as a special case; indeed, in this case $x_1, \ldots, x_n$ represent a realization of this sample on which we condition throughout.

Note that for convenience, the notation has been collected in Appendix A.
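To make the setup in (1.1)-(1.3) concrete, the following is a minimal simulation sketch in Python, not the authors' implementation. It rests on several simplifying assumptions that are not in the text: there are no controls $w_i$, the function $D(x_i)$ is linear and sparse in the instruments $z_i$, both disturbance variances equal one, and the first stage uses a plain LASSO fitted by coordinate descent with a fixed, hand-picked penalty `lam` rather than a data-driven penalty or √LASSO. The function names (`simulate`, `lasso_cd`, `iv_estimate`) are hypothetical and chosen only for illustration.

```python
import random

def simulate(n=200, p=50, alpha1=1.0, rho=0.6, seed=0):
    """Draw (y1, y2, Z, D) from model (1.1)-(1.3) with no controls w_i.

    Illustrative assumptions: D(x_i) depends linearly on the first three
    instruments only, and sigma_eps = sigma_v = 1 with corr(eps, v) = rho.
    """
    rng = random.Random(seed)
    Z = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    D = [z[0] + 0.5 * z[1] + 0.25 * z[2] for z in Z]
    y1, y2 = [], []
    for d in D:
        v = rng.gauss(0, 1)
        eps = rho * v + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        y2.append(d + v)                    # equation (1.2)
        y1.append(alpha1 * (d + v) + eps)   # equation (1.1) without w_i
    return y1, y2, Z, D

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def iv_estimate(y1, y2, instrument):
    """Just-identified IV estimate: sum(A_i * y1_i) / sum(A_i * y2_i)."""
    return dot(instrument, y1) / dot(instrument, y2)

def lasso_cd(Z, y, lam, sweeps=100):
    """Coordinate descent for (1/(2n)) * ||y - Z b||^2 + lam * ||b||_1."""
    n, p = len(Z), len(Z[0])
    cols = [[Z[i][j] for i in range(n)] for j in range(p)]
    b = [0.0] * p
    resid = list(y)
    for _ in range(sweeps):
        for j in range(p):
            cj = cols[j]
            norm = dot(cj, cj) / n
            # Correlation of column j with the partial residual (b_j removed).
            rho_j = dot(cj, resid) / n + norm * b[j]
            # Soft-thresholding update.
            new = max(abs(rho_j) - lam, 0.0) / norm
            new = new if rho_j > 0 else -new
            if new != b[j]:
                delta = new - b[j]
                resid = [r - delta * c for r, c in zip(resid, cj)]
                b[j] = new
    return b

y1, y2, Z, D = simulate()
b = lasso_cd(Z, y2, lam=0.2)            # first stage: sparse fit of y2 on z
D_hat = [dot(z, b) for z in Z]          # estimated instrument

alpha_ols = dot(y2, y1) / dot(y2, y2)   # ignores endogeneity of y2
alpha_oracle = iv_estimate(y1, y2, D)   # uses the true D(x_i), infeasible
alpha_lasso = iv_estimate(y1, y2, D_hat)

print("OLS estimate:      ", alpha_ols)
print("Oracle IV estimate:", alpha_oracle)
print("LASSO IV estimate: ", alpha_lasso)
print("Selected instruments:", [j for j, bj in enumerate(b) if bj != 0.0])
```

Because $\epsilon_i$ and $v_i$ are positively correlated in this design, the OLS estimate should sit above the true $\alpha_1 = 1$, while the two IV estimates should land closer to it. The oracle estimate is infeasible in practice since $D(x_i)$ is unknown; it is included only as a benchmark for the LASSO-based instrument.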