LASSO METHODS FOR GAUSSIAN INSTRUMENTAL VARIABLES MODELS

A. BELLONI, V. CHERNOZHUKOV, AND C. HANSEN

Abstract. In this note, we propose to use sparse methods (e.g. LASSO, Post-LASSO, √LASSO, and Post-√LASSO) to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments in the canonical Gaussian case. The methods apply even when the number of instruments is much larger than the sample size. We derive asymptotic distributions for the resulting IV estimators and provide conditions under which these sparsity-based IV estimators are asymptotically oracle-efficient. In simulation experiments, a sparsity-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. We illustrate the procedure in an empirical example using the Angrist and Krueger (1991) schooling data.

1. Introduction

Instrumental variables (IV) methods are widely used in applied statistics, econometrics, and more generally for estimating treatment effects in situations where the treatment status is not randomly assigned; see, for example, [1, 4, 5, 7, 16, 21, 26, 27, 29, 30] among many others. Identification of the causal effects of interest in this setting may be achieved through the use of observed instrumental variables that are relevant in determining the treatment status but are otherwise unrelated to the outcome of interest. In some situations, many such instrumental variables are available, and the researcher is left with the question of which set of the instruments to use in constructing the IV estimator. We consider one approach to answering this question based on sparse-estimation methods in a simple Gaussian setting.

Date: First version: June 2009. This version: December 7, 2010.
arXiv:1012.1297v1 [stat.ME] 6 Dec 2010

Throughout the paper we consider the Gaussian simultaneous equation model:

$$y_{1i} = y_{2i}\alpha_1 + w_i'\alpha_2 + \zeta_i, \tag{1.1}$$

$$y_{2i} = D(x_i) + v_i, \tag{1.2}$$

$$\begin{pmatrix} \zeta_i \\ v_i \end{pmatrix} \sim N\left(0, \begin{pmatrix} \sigma_\zeta^2 & \sigma_{\zeta v} \\ \sigma_{\zeta v} & \sigma_v^2 \end{pmatrix}\right), \tag{1.3}$$

where $y_{1i}$ is the response variable, $y_{2i}$ is the endogenous variable, $w_i$ is a $k_w$-vector of control variables, $x_i = (z_i', w_i')'$ is a vector of instrumental variables (IV), and $(\zeta_i, v_i)$ are disturbances that are independent of $x_i$. The function $D(x_i) = \mathrm{E}[y_{2i} \mid x_i]$ is an unknown, potentially complicated function of the instruments. Given a sample $(y_{1i}, y_{2i}, x_i)$, $i = 1, \ldots, n$, from the model above, the problem is to construct an IV estimator for $\alpha = (\alpha_1, \alpha_2')'$ that enjoys good finite-sample properties and is asymptotically efficient.

We consider the case of fixed design, namely we treat the covariate values $x_1, \ldots, x_n$ as fixed.
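To make the setup in (1.1)-(1.3) concrete, the following is a minimal Python sketch, not the authors' implementation. It simulates one sample from the model and forms a sparsity-based IV estimate using a LASSO first stage followed by a Post-LASSO refit. Several things here are assumptions made only for illustration: the first stage is taken to be sparse and linear in the instruments, the only control is an intercept, the first-stage noise level `sigma_v` is treated as known, and the penalty `c * sigma_v * sqrt(2 log(2p) / n)` with `c = 1.1` is a common textbook-style choice rather than the data-driven penalty mentioned in the abstract. The function names (`simulate`, `lasso_cd`, `sparse_iv`) and the variable `zeta` for the disturbance in (1.1) are hypothetical.

```python
import numpy as np


def simulate(n=200, p=300, s=5, alpha1=1.0, alpha2=0.5, rho=0.6, seed=0):
    """Draw one sample from an assumed sparse-linear instance of (1.1)-(1.3)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, p))        # instruments z_i (p may exceed n)
    w = np.ones((n, 1))                    # controls w_i: intercept only
    gamma = np.zeros(p)
    gamma[:s] = 1.0                        # assumption: D(x_i) = z_i' gamma, sparse
    cov = [[1.0, rho], [rho, 1.0]]         # joint Gaussian disturbances, as in (1.3)
    zeta, v = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    y2 = z @ gamma + v                     # first stage (1.2)
    y1 = y2 * alpha1 + w @ np.array([alpha2]) + zeta   # structural equation (1.1)
    return y1, y2, z, w


def lasso_cd(X, y, lam, n_sweeps=100):
    """Minimize (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    r = y.astype(float).copy()             # current residual y - Xb
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual (b_j removed).
            rho_j = X[:, j] @ r / n + col_sq[j] * b[j]
            # Soft-thresholding update for coordinate j.
            b_new = np.sign(rho_j) * max(abs(rho_j) - lam, 0.0) / col_sq[j]
            if b_new != b[j]:
                r -= X[:, j] * (b_new - b[j])
                b[j] = b_new
    return b


def sparse_iv(y1, y2, z, w, sigma_v=1.0, c=1.1):
    """LASSO selection, Post-LASSO refit, then IV with the fitted instrument."""
    n, p = z.shape
    lam = c * sigma_v * np.sqrt(2.0 * np.log(2 * p) / n)   # assumed penalty level
    selected = np.flatnonzero(lasso_cd(z, y2, lam))
    # Post-LASSO: unpenalized least squares on the selected instruments and controls.
    X_sel = np.hstack([z[:, selected], w])
    coef, *_ = np.linalg.lstsq(X_sel, y2, rcond=None)
    d_hat = X_sel @ coef                   # estimate of D(x_i)
    A = np.column_stack([y2, w])           # regressors in (1.1)
    A_hat = np.column_stack([d_hat, w])    # estimated instruments
    alpha_hat = np.linalg.solve(A_hat.T @ A, A_hat.T @ y1)
    return alpha_hat, selected


y1, y2, z, w = simulate()
alpha_hat, selected = sparse_iv(y1, y2, z, w)
print("selected instruments:", selected)
print("IV estimate of (alpha_1, alpha_2):", alpha_hat)
```

In this sketch the number of instruments (`p = 300`) exceeds the sample size (`n = 200`), which is the regime the abstract refers to. The hand-written coordinate descent is used only to keep the example dependency-free; a tested LASSO solver would be the more natural choice in practice.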
This note was uploaded on 12/26/2011 for the course ECON 245a taught by Professor Staff during the Fall '08 term at UCSB.