This preview shows pages 1–3. Sign up to view the full content.
This preview has intentionally blurred sections. Sign up to view the full version.View Full Document
Unformatted text preview: Economics 20 Prof. Patricia M. Anderson Final Review Binary Dependent Variables Using OLS on a binary dependent variable is referred to as a linear probability model (LPM). The biggest problem with a LPM is that the predicted values are not constrained to be between 0 and 1. An alternative to estimating P( y = 1 | x ) = + x is to model the probability as a function, G ( + x 29, where 0 < G(z) < 1. When G(z) is the standard normal cdf we call this a probit model. When G(z) is the logistic function, we call this a logit model. Both are similar functions increasing in z . Since these are now nonlinear in parameters, OLS is inappropriate and we must use maximum likelihood estimation. Interpreting probits and logits is more complicated than interpreting the LPM, since ( 29 j j x g x p + = , where g ( z ) is d G /d z . For the logit, you can estimate g by p (1- p ), while for the probit it is best to let Stata compute the derivative for you. A worse approximation, but one you can always do if necessary, is to multiply the logit coefficients by .25 and the probit coefficients by .4 to compare them both to the LPM. In any case, the sign and significance of coefficients can always be compared across models. Since probits and logits are estimated by maximum likelihood, we cant form an F statistic to test exclusion restrictions. Instead, we can use a likelihood ratio test. Estimate the restricted and unrestricted models, then form LR = 2( L ur L r ) ~ 2 q where L is the log likelihood. Similarly, we cannot form an R 2 as a goodness-of-fit measure. One alternative is a pseudo R 2 defined as 1 - L ur / L r where the restricted model is the model with just an intercept. The Tobit Model Suppose we have an unobserved variable y* such that y* = x + u , u| x ~ Normal(0 , 2 ), and we only observe y = max(c, y *) or y = min(c, y *), where c is constant. Then a Tobit model can be used to estimate and . Thus, the estimated coefficients represent the effect of x on the latent variable y *, not on the observed variable y . Would need to scale by ( x / 29 to get the effect on y. If the errors are not normal or are heteroskedastic then the Tobit model will usually be meaningless. The best use of the Tobit is for the case of top-coded data, that is where y = min(c, y *), and c is the highest value the survey folks are willing to report. This is usually done for confidentiality reasons, but as researchers we are interested in the underlying values, y*. Difference-in-Differences With either pooled cross-sections or panel data it is possible to do difference-in-differences estimation. The idea is to compare treatment and control groups before and after a treatment. We call it difference-in-differences estimation because without any other controls it is just the difference in the means across groups in the before-after differences in the means. In a regression framework, this is just y it = + 1 treatment...
View Full Document