Cluster Sampling Lecture

Imbens/Wooldridge, Lecture Notes 8, Summer ’07

What's New in Econometrics? NBER, Summer 2007
Lecture 8, Tuesday, July 31st, 2.00-3.00 pm
Cluster and Stratified Sampling

These notes consider estimation and inference with cluster samples and samples obtained by stratifying the population. The main focus is on true cluster samples, although the case of applying cluster-sample methods to panel data is treated, including recent work where the sizes of the cross section and time series are similar. Wooldridge (2003, extended version 2006) contains a survey, but some recent work is discussed here.

1. THE LINEAR MODEL WITH CLUSTER EFFECTS

This section considers linear models estimated using cluster samples (of which a panel data set is a special case). For each group or cluster $g$, let $\{(y_{gm}, x_g, z_{gm}) : m = 1, \ldots, M_g\}$ be the observable data, where $M_g$ is the number of units in cluster $g$, $y_{gm}$ is a scalar response, $x_g$ is a $1 \times K$ vector containing explanatory variables that vary only at the group level, and $z_{gm}$ is a $1 \times L$ vector of covariates that vary within (as well as across) groups.

1.1 Specification of the Model

The linear model with an additive error is

$$y_{gm} = \alpha + x_g \beta + z_{gm} \gamma + v_{gm}, \quad m = 1, \ldots, M_g; \; g = 1, \ldots, G. \tag{1.1}$$

Our approach to estimation and inference in equation (1.1) depends on several factors, including whether we are interested in the effects of aggregate variables ($\beta$) or individual-specific variables ($\gamma$). In addition, we need to make assumptions about the error terms. In the context of pure cluster sampling, an important issue is whether the $v_{gm}$ contain a common group effect that can be separated in an additive fashion, as in

$$v_{gm} = c_g + u_{gm}, \quad m = 1, \ldots, M_g, \tag{1.2}$$

where $c_g$ is an unobserved cluster effect and $u_{gm}$ is the idiosyncratic error. (In the statistics literature, (1.1) and (1.2) are referred to as a “hierarchical linear model.”) One important issue is whether the explanatory variables in (1.1) can be taken to be appropriately exogenous.
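To make the error structure in (1.2) concrete, the following is a minimal simulation sketch, not part of the original notes. It assumes, purely for illustration, that the cluster effect and the idiosyncratic error are independent normal draws with mean zero and that every cluster has the same size. The function names and parameters (`simulate_cluster_errors`, `pairwise_error_correlation`, `sd_c`, `sd_u`) are hypothetical. Under these assumptions, two errors from the same cluster have correlation equal to the share of the total error variance due to the cluster effect.

```python
import random
import statistics


def simulate_cluster_errors(n_clusters, cluster_size, sd_c, sd_u, seed=0):
    """Draw composite errors v = c + u as in (1.2).

    Illustrative assumptions (not from the notes): c and u are independent,
    normal, mean zero, and all clusters share the same size.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        c_g = rng.gauss(0.0, sd_c)  # cluster effect, shared within the cluster
        # Each member adds its own idiosyncratic draw to the shared effect.
        clusters.append([c_g + rng.gauss(0.0, sd_u) for _ in range(cluster_size)])
    return clusters


def pairwise_error_correlation(clusters):
    """Sample correlation between the first two members' errors across clusters."""
    first = [v[0] for v in clusters]
    second = [v[1] for v in clusters]
    mean_1, mean_2 = statistics.fmean(first), statistics.fmean(second)
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    return cov / (var_1 * var_2) ** 0.5


if __name__ == "__main__":
    with_effect = simulate_cluster_errors(5000, 2, sd_c=1.0, sd_u=1.0, seed=1)
    without_effect = simulate_cluster_errors(5000, 2, sd_c=0.0, sd_u=1.0, seed=2)
    print("correlation with cluster effect:", pairwise_error_correlation(with_effect))
    print("correlation without cluster effect:", pairwise_error_correlation(without_effect))
```

With equal standard deviations for the two components, the within-cluster correlation should come out near one half, and near zero when the cluster effect is switched off. This within-cluster dependence is the reason inference has to account for clustering even when observations are independent across clusters.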
Under (1.2), exogeneity issues are usefully broken down by separately considering $c_g$ and $u_{gm}$. Throughout we assume that the sampling scheme generates observations that are independent across $g$. This assumption can be restrictive, particularly when the clusters are large geographical units. We do not consider problems of “spatial correlation” across clusters, although, as we will see, fixed effects estimators have advantages in such settings.

We treat two kinds of sampling schemes. The simplest case also allows the most flexibility for robust inference: from a large population of relatively small clusters, we draw a large number of clusters ($G$), where cluster $g$ has $M_g$ members. This setup is appropriate, for example, in randomly sampling a large number of families, classrooms, or firms from a large population. The key feature is that the number of groups is large enough relative to the group

This note was uploaded on 12/26/2011 for the course ECON 245a taught by Professor Staff during the Fall '08 term at UCSB.
