Imbens, Lecture Notes 10, ARE213 Spring ’06

ARE213 Econometrics
Spring 2006
UC Berkeley Department of Agricultural and Resource Economics

Maximum Likelihood Estimation V: Truncation, Censoring, and Corner Solutions (W 16.4-16.8, T)

I. Introduction

Here we look at a set of complications with the standard linear model where part of the information is missing. Suppose we have a normal linear model

$Y_i^* = X_i\beta + \varepsilon_i$, with $\varepsilon_i \mid X_i \sim N(0, \sigma^2)$.

If we observe $(Y_i^*, X_i)$ for a random sample, we can estimate $\beta$ by least squares,

$\hat{\beta} = (X'X)^{-1}(X'Y^*)$,

which is optimal (minimum variance unbiased estimator, best linear unbiased estimator, maximum likelihood estimator, etcetera).

Here we want to look at three complications. First, the truncated regression model. Suppose we do not have a random sample from the population, but a random sample conditional on $Y_i^* \geq 0$. (More generally, we can have a random sample conditional on $Y_i^* \in \mathcal{Y} \subset \mathbb{R}$, but the main ideas are illustrated just as well in the simple case. One generalization, known as stratified sampling, is concerned with the case where $\mathbb{R}$ is partitioned into $J$ strata, and we have $J$ random samples, one from each of the strata, with the sampling probabilities for each of the strata potentially different from their population shares. See for example Imbens and Lancaster (1995).)

The second is censoring. In that case we have a random sample from the population, but we only observe $Y_i^*$ if $Y_i^*$ is positive. If $Y_i^*$ is negative we only observe $X_i$. The difference with truncated samples is that (a) we know whether $Y_i^*$ is negative, and (b) we always observe $X_i$.

The third case is what Wooldridge calls corner solutions. This is often not distinguished from censoring. We observe the same data as in censoring, but here we are interested not in the distribution of $Y_i^*$, but in the distribution of $Y_i = \max(Y_i^*, 0)$.
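The three data structures described above can be made concrete with a small simulation. The following is a minimal sketch, not part of the original notes: the parameter values (`beta = (1.0, 2.0)`, `sigma = 1.0`), the sample size, and the helper names `simulate` and `ols` are all assumptions chosen for illustration. It draws from the normal linear model, builds a truncated sample, a censored sample, and a corner-solution outcome, and compares least squares on the full and the truncated sample.

```python
import numpy as np


def simulate(n=20000, beta=(1.0, 2.0), sigma=1.0, seed=0):
    """Draw (X, Y*) from Y* = X beta + eps, eps | X ~ N(0, sigma^2).

    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])  # intercept plus one covariate
    y_star = X @ np.asarray(beta) + sigma * rng.normal(size=n)
    return X, y_star


def ols(X, y):
    """Least squares coefficients (X'X)^{-1} X'y."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


X, y_star = simulate()

# Full random sample: least squares recovers beta up to sampling error.
b_full = ols(X, y_star)

# Truncation: units with Y* < 0 never enter the sample,
# so neither Y* nor X is seen for them.
keep = y_star >= 0
X_trunc, y_trunc = X[keep], y_star[keep]
b_trunc = ols(X_trunc, y_trunc)

# Censoring: X is always observed; Y* is observed only when positive.
# NaN marks "known to be non-positive, value not recorded".
observed = y_star > 0
y_cens = np.where(observed, y_star, np.nan)

# Corner solution: same information, but the recorded outcome
# Y = max(Y*, 0) is itself the object of interest.
y_corner = np.maximum(y_star, 0.0)

print("OLS, full sample:     ", b_full)
print("OLS, truncated sample:", b_trunc)
print("share of sample lost to truncation:", 1 - keep.mean())
```

In this setup, least squares on the truncated sample gives a slope visibly below the one used to generate the data, which is the motivation for treating the selection rule explicitly in the likelihood.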
What is the difference? An example of censoring is topcoding in social security earnings data sets: we only observe earnings up to the social security maximum and otherwise observe the maximum. In that case we are obviously interested in the actual earnings and their relation to covariates, not the observed minimum of actual earnings and the social security maximum. An example of a corner solution is hours worked. These are non-negative, and to take account of that we may wish to model a latent variable $Y_i^*$ as linear in covariates, with the observed $Y_i$ equal to the maximum of $Y_i^*$ and zero. We remain interested though in the distribution of the observed variable, actual hours worked, not in the distribution of the latent variable....
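The distinction can be written out in terms of the conditional mean each case targets. The display below is a sketch under the normal linear model stated in the introduction; the symbol $c$ for the top-coding threshold is an assumption introduced here, and $\Phi$ and $\phi$ denote the standard normal distribution and density functions.

```latex
% Censoring (top-coding at c): the latent variable is the object of interest.
\[
  Y_i = \min(Y_i^*, c), \qquad
  \text{target: } E[Y_i^* \mid X_i] = X_i\beta .
\]
% Corner solution: the observed variable is the object of interest.
\[
  Y_i = \max(Y_i^*, 0), \qquad
  \text{target: } E[Y_i \mid X_i]
    = \Phi\!\left(\frac{X_i\beta}{\sigma}\right) X_i\beta
    + \sigma\,\phi\!\left(\frac{X_i\beta}{\sigma}\right) .
\]
```

The second expression follows from $E[Y_i \mid X_i] = \Pr(Y_i^* > 0 \mid X_i)\, E[Y_i^* \mid X_i, Y_i^* > 0]$ together with the mean of a truncated normal. It is nonlinear in $X_i\beta$, so in the corner solution case $\beta$ alone does not describe the effect of covariates on the outcome of interest.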
This note was uploaded on 08/01/2008 for the course ARE 213 taught by Professor Imbens during the Spring '06 term at Berkeley.