Imbens, Lecture Notes 4, ARE213 Spring 06

ARE213 Econometrics
Spring 2006
UC Berkeley
Department of Agricultural and Resource Economics

Ordinary Least Squares IV: Clustering and Variance Estimation (W 6.3.4, Mo)

When we looked at the standard linear model $Y_i = X_i'\beta + \varepsilon_i$, where $X_i$ is an $L$-dimensional column vector, or in matrix notation, $Y = X\beta + \varepsilon$, we assumed we had independent observations. Often that is not quite true. In general this makes progress difficult, but progress can be made if we impose some additional structure.

Suppose that the pairs $(Y_i, X_i)$ are clustered. Let $S_i$ be the index for the cluster, so that with $K$ clusters $S_i \in \{1, \ldots, K\}$. Within each cluster the $(Y_i, X_i)$ are correlated, but $(Y_i, X_i)$ pairs from different clusters are independent. Clusters could be states, or classrooms, or any other grouping that could be expected to lead to a particular form of dependence. To do asymptotics we assume that the number of observations per cluster is fixed and the number of clusters increases. Let us initially also assume that the number of observations per cluster is the same for all clusters, and equal to $M$. More generally the sample size in cluster or group $k$ is $M_k$. The total sample size is $N = \sum_{k=1}^{K} M_k$, equal to $M \cdot K$ in the special case with a constant group size. This structure will be seen to greatly affect standard errors if the variable of interest varies only between clusters.

It is useful to introduce some additional notation and give some preliminary results. Let $Z$ be the $N \times K$ matrix of group or cluster indicators with typical element $Z_{ij} = 1\{S_i = j\}$. For example, with three clusters and ten observations, of which the first two are from cluster one, the next five are from cluster two, and the last three are from cluster three, we would have

$$
Z = \begin{pmatrix}
1 & 0 & 0 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 1 & 0 \\
0 & 1 & 0 \\
0 & 1 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & 1 \\
0 & 0 & 1
\end{pmatrix}.
$$
If $\iota_N$ is the $N$-dimensional vector with all elements equal to one, then $Z'\iota_N$ gives a $K$-vector with the $k$th element equal to $M_k$, the group size of cluster $k$. With $Y$ an $N$-dimensional vector, $(Z'Z)^{-1}(Z'Y)$ is the $K$-vector with group means:

$$
\left( (Z'Z)^{-1}(Z'Y) \right)_k = \sum_{i=1}^{N} 1\{S_i = k\} \, Y_i / M_k
$$

...
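The two indicator-matrix results above can be checked numerically on the ten-observation example. The following is a minimal sketch using NumPy, not part of the original notes: the outcome vector `Y` is a hypothetical choice made only for illustration, and the variable names are assumptions.

```python
import numpy as np

# Cluster labels S_i for the ten-observation example:
# two observations in cluster 1, five in cluster 2, three in cluster 3.
S = np.array([1, 1, 2, 2, 2, 2, 2, 3, 3, 3])
K = 3
N = S.size

# Z[i, j] = 1{S_i = j + 1}: the N x K matrix of cluster indicators.
Z = (S[:, None] == np.arange(1, K + 1)[None, :]).astype(float)

# Z' iota_N recovers the cluster sizes M_k.
iota = np.ones(N)
M = Z.T @ iota

# Hypothetical outcome vector (1, 2, ..., 10), chosen only for illustration.
Y = np.arange(1.0, N + 1.0)

# (Z'Z)^{-1} (Z'Y) gives the K-vector of cluster means.
# Z'Z is diagonal with the M_k on its diagonal, so this is Z'Y divided by M_k.
group_means = np.linalg.solve(Z.T @ Z, Z.T @ Y)

print("cluster sizes:", M)
print("cluster means:", group_means)
```

With this illustrative `Y`, the cluster sizes are 2, 5 and 3, and the cluster means are 1.5, 5 and 9, matching the direct averages within each cluster. Using `np.linalg.solve` rather than forming the inverse explicitly is a design choice for numerical stability; because $Z'Z$ is diagonal here, elementwise division by `M` would give the same result.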