Cluster Sample Methods in Applied Econometrics

CLUSTER-SAMPLE METHODS IN APPLIED ECONOMETRICS: AN EXTENDED ANALYSIS

Jeffrey M. Wooldridge
Department of Economics
Michigan State University
East Lansing, MI 48824-1038
(517) 353-5972
wooldri1@msu.edu

This version: June 2006

ABSTRACT

This is an expanded version of Wooldridge (2003), which provided an overview of cluster-sample methods in linear models. Here I include additional details for linear models and provide, as much as possible, a parallel treatment for nonlinear models, including a summary of strategies for dealing with data sets having a small number of clusters.

Keywords: Cluster Correlation; Generalized Estimating Equations; Minimum Distance Estimation; Panel Data; Robust Variance Matrix; Unobserved Effect

JEL Classification Codes: C13, C21, C23
1. INTRODUCTION

In Wooldridge (2003), I provided a brief overview of econometric approaches to analyzing cluster samples in the context of a linear regression model. I considered cases with both large and small cluster sizes (relative to the number of clusters). That treatment was necessarily terse, and some subtle issues were only briefly mentioned or neglected entirely.

The asymptotic theory for the case with a large number of clusters (and relatively small cluster sizes), either in linear or nonlinear models, has been pretty well worked out; for a summary, see, for example, Wooldridge (2002). Just as importantly, popular statistical packages, such as Stata®, allow for computation of variance matrices that are robust to arbitrary cluster correlation for a variety of linear and nonlinear estimation methods. Still, while accounting for clustering in data is much more common than it was 10 years ago, inference methods robust to cluster correlation are still not used routinely in all relevant applications. I think that is partly because empirical researchers are not entirely sure when certain estimators are robust to various kinds of misspecification. I hope this expanded paper helps to fill that gap.

For nonlinear models, there are some open modeling questions for cases where the group sizes vary – a common situation with true cluster samples – and one wants to allow correlation between the unobserved group effect (or heterogeneity) and the covariates that vary within group. I discuss the modeling issues, and offer some tentative solutions, in Section 3.1. This is very preliminary and hopefully generates some interest in the problem.

The case of a small number of clusters has received much attention recently, particularly for linear models. In Wooldridge (2003), I summarized two possible ways to estimate the effects of cluster-level covariates on individual-specific outcomes when the number of clusters is small (while the sizes of the clusters are moderately large). One approach, suggested by Donald and Lang (2001), is to effectively treat the number of groups as the number of observations, and use finite sample analysis (with individual-specific unobservables becoming unimportant – relative to the cluster effect – as the cluster sizes get large). A second approach
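To make the first approach concrete, the following is a minimal sketch of the idea of treating the number of groups as the number of observations: the individual outcomes are collapsed to cluster averages, and those averages are regressed on the cluster-level covariate. It is an illustration under stated assumptions, not the procedure as laid out by Donald and Lang (2001). The function name `group_means_regression`, the single-covariate setup, and the toy data are all hypothetical, and comparing the resulting t statistic with a t distribution on G - 2 degrees of freedom presumes classical (normal, homoskedastic) errors at the cluster level.

```python
import numpy as np


def group_means_regression(y, x_group, group_ids):
    """Regress cluster averages of y on a single cluster-level covariate.

    y         : (N,) individual outcomes
    x_group   : (N,) covariate assumed constant within each cluster
    group_ids : (N,) cluster labels
    Returns (slope, standard error, degrees of freedom G - 2).
    """
    y = np.asarray(y, dtype=float)
    x_group = np.asarray(x_group, dtype=float)
    group_ids = np.asarray(group_ids)

    groups = np.unique(group_ids)
    G = groups.size
    if G < 3:
        raise ValueError("need at least 3 clusters so that G - 2 > 0")

    # Step 1: collapse the data to one observation per cluster.
    ybar = np.array([y[group_ids == g].mean() for g in groups])
    xg = np.array([x_group[group_ids == g][0] for g in groups])

    # Step 2: OLS of the cluster averages on (1, x_g), using G observations.
    X = np.column_stack([np.ones(G), xg])
    beta, *_ = np.linalg.lstsq(X, ybar, rcond=None)
    resid = ybar - X @ beta

    # Classical (homoskedastic) variance estimate with G - 2 degrees of freedom.
    df = G - 2
    sigma2 = resid @ resid / df
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1]), df


# Made-up toy data: 4 clusters with 2 individuals each.
g_toy = np.array([0, 0, 1, 1, 2, 2, 3, 3])
x_toy = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_toy = np.array([-1.0, 1.0, 0.0, 2.0, 1.0, 1.0, 1.0, 3.0])

slope, se, df = group_means_regression(y_toy, x_toy, g_toy)
print(f"slope = {slope:.3f}, se = {se:.3f}, t = {slope / se:.3f}, df = {df}")
```

Because inference rests on only G cluster-level observations, the t statistic would be compared with critical values from a t distribution with `df` degrees of freedom rather than with standard normal critical values.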

