Propensity Score Weighting (psw)

Recall the expressions for $\tau_{ate}$ and $\tau_{att}$ in terms of (expectations of) functions of $w$, $y$ and the propensity score (bottom of p.9 and top of p.10). These expressions are “weighted” means of $y$, where the weights are functions of $w$ and the propensity score. A “natural” way to use them to estimate program effects is to estimate the propensity score and then take the sample averages of the above expressions. This can be expressed as follows:
$$\hat{\tau}_{ate,psw} = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{w_i y_i}{\hat{p}(\mathbf{x}_i)} - \frac{(1-w_i)\,y_i}{1-\hat{p}(\mathbf{x}_i)}\right] = \frac{1}{N}\sum_{i=1}^{N}\frac{(w_i-\hat{p}(\mathbf{x}_i))\,y_i}{\hat{p}(\mathbf{x}_i)\,(1-\hat{p}(\mathbf{x}_i))}$$

$$\hat{\tau}_{att,psw} = \frac{1}{N}\sum_{i=1}^{N}\frac{(w_i-\hat{p}(\mathbf{x}_i))\,y_i}{\hat{\rho}\,(1-\hat{p}(\mathbf{x}_i))}$$

where $\hat{p}(\mathbf{x})$ is an estimate of $p(\mathbf{x})$ and $\hat{\rho}$ is the fraction of observations for which $w = 1$.

The propensity score can be estimated using a logit (or probit) functional form. It is best to use a flexible functional form with lots of squared terms and interaction terms for the $\mathbf{x}$ variables. Hirano, Imbens and Ridder (2003) showed that if you increase the number of these terms as the sample size gets larger (this is known as a series estimator), this method is asymptotically efficient (in a semiparametric sense).

Of course, we also need to get standard errors for these two estimates. This can be done using regression methods. To get started, define $\mathbf{d}_i$ as the “score” of the log likelihood of the propensity score function $p(\mathbf{x}, \gamma)$ [see p.7 of Lecture 14], that is, the vector of first derivatives with respect to the parameters $\gamma$, with one element for each term in $\mathbf{x}$:

$$\mathbf{d}_i = \mathbf{d}_i(w_i, \mathbf{x}_i, \gamma) = \frac{\nabla_\gamma p(\mathbf{x}_i,\gamma)'\,(w_i - p(\mathbf{x}_i,\gamma))}{p(\mathbf{x}_i,\gamma)\,(1-p(\mathbf{x}_i,\gamma))}$$
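To make this score concrete, here is a short worked sketch for the logit case. It assumes $p(\mathbf{x}_i,\gamma)$ is a logit in a vector $\mathbf{h}_i$ of $\mathbf{x}$ terms (levels, squares and interactions); the shorthand $p_i = p(\mathbf{x}_i,\gamma)$ is introduced here only to keep the algebra short.

$$
\begin{aligned}
p_i &= \frac{\exp(\mathbf{h}_i'\gamma)}{1+\exp(\mathbf{h}_i'\gamma)} \\
\nabla_\gamma p(\mathbf{x}_i,\gamma)' &= p_i\,(1-p_i)\,\mathbf{h}_i \\
\mathbf{d}_i &= \frac{p_i\,(1-p_i)\,\mathbf{h}_i\,(w_i-p_i)}{p_i\,(1-p_i)} = \mathbf{h}_i\,(w_i-p_i)
\end{aligned}
$$

The $p_i(1-p_i)$ terms cancel, so under a logit each element of the score is a regressor multiplied by the "residual" $w_i - p_i$.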
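Then define $k_i$ as:

$$k_i = \frac{(w_i - p(\mathbf{x}_i,\gamma))\,y_i}{p(\mathbf{x}_i,\gamma)\,(1-p(\mathbf{x}_i,\gamma))}$$

so that $\hat{\tau}_{ate,psw}$ is the sample average of the estimated $k_i$. Use your estimates of $\gamma$ to estimate $\mathbf{d}_i$ and $k_i$ for all observations in the data. Denoting these estimates as $\hat{\mathbf{d}}_i$ and $\hat{k}_i$, regress $\hat{k}_i$ on a constant and all the elements in $\hat{\mathbf{d}}_i$, and save the estimated residuals from this regression, which can be denoted as $\hat{e}_i$. (The constant absorbs the mean of $k_i$, which is $\tau_{ate}$, so the residuals are centered at zero.) Then the (asymptotic) standard error of $\hat{\tau}_{ate,psw}$ can be calculated as:

$$\frac{\left[\frac{1}{N}\sum_{i=1}^{N}\hat{e}_i^{\,2}\right]^{1/2}}{\sqrt{N}}$$

If you use a logit functional form for $p(\mathbf{x})$, then this is even simpler to do: $\mathbf{d}_i$ simplifies to:

$$\hat{\mathbf{d}}_i = \mathbf{h}_i\,(w_i - \hat{p}_i)$$

where $\mathbf{h}_i$ is simply the vector of $\mathbf{x}$ variables, including squared and interaction terms, and $\hat{p}_i = \exp(\mathbf{h}_i'\hat{\gamma})/(1 + \exp(\mathbf{h}_i'\hat{\gamma}))$.

Wooldridge shows on pp.923-924 that the standard error for $\hat{\tau}_{att,psw}$ can be expressed as: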
$$\frac{1}{\hat{\rho}}\cdot\frac{\left[\frac{1}{N}\sum_{i=1}^{N}\left(\hat{r}_i - \hat{\tau}_{att,psw}\,w_i\right)^2\right]^{1/2}}{\sqrt{N}}$$

where the $\hat{r}_i$’s are the residuals from a regression of $\hat{q}_i$ on $\hat{\mathbf{d}}_i$, where $\hat{q}_i = [w_i - p(\mathbf{x}_i, \hat{\gamma})]\,y_i/[1 - p(\mathbf{x}_i, \hat{\gamma})]$.

Propensity Score Regression

On pages 924-927, Wooldridge discusses another popular method using propensity scores. This is very simple; just regress $y$ on a constant, $w$, and the estimated propensity score, and take the coefficient on $w$ as the estimated program effect. However, he points out (p.927) that this method is, in general, inefficient.
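The weighting estimators, their regression-based standard errors, and the propensity score regression described in these notes can be sketched together in a few lines of NumPy. This is only an illustration under stated assumptions, not Wooldridge's or the lecturer's code. The data are simulated with a constant treatment effect of 2, so ATE and ATT coincide. The logit is fitted with a hand-written Newton-Raphson loop. All function names (`fit_logit`, `psw_estimates`, `pscore_regression`) are hypothetical. The regression used for the ATE standard error includes a constant so that the residuals are centered at zero.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logit(h, w, max_iter=50, tol=1e-10):
    """Logit MLE by Newton-Raphson (illustrative helper, not a library API)."""
    gamma = np.zeros(h.shape[1])
    for _ in range(max_iter):
        p = logistic(h @ gamma)
        score = h.T @ (w - p)                        # sum over i of d_i = h_i (w_i - p_i)
        info = (h * (p * (1.0 - p))[:, None]).T @ h  # information matrix (minus the Hessian)
        step = np.linalg.solve(info, score)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    return gamma

def ols_residuals(target, regressors):
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    return target - regressors @ coef

def psw_estimates(y, w, h):
    """Return (ate, se_ate, att, se_att, p_hat) using a logit propensity score."""
    n = len(y)
    p = logistic(h @ fit_logit(h, w))
    rho = w.mean()                                   # fraction with w = 1
    d = h * (w - p)[:, None]                         # logit score for each observation

    # ATE: sample mean of k_i; standard error from residuals of k on (1, d).
    # Including the constant is an assumption of this sketch (it centers the residuals).
    k = (w - p) * y / (p * (1.0 - p))
    ate = k.mean()
    e = ols_residuals(k, np.column_stack([np.ones(n), d]))
    se_ate = np.sqrt(np.mean(e ** 2)) / np.sqrt(n)

    # ATT: sample mean of q_i divided by rho; standard error from residuals of q on d.
    q = (w - p) * y / (1.0 - p)
    att = q.mean() / rho
    r = ols_residuals(q, d)
    se_att = np.sqrt(np.mean((r - att * w) ** 2)) / (rho * np.sqrt(n))
    return ate, se_ate, att, se_att, p

def pscore_regression(y, w, p):
    """Coefficient on w from regressing y on a constant, w and the propensity score."""
    X = np.column_stack([np.ones(len(y)), w, p])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

# Simulated data (an assumption for illustration): true effect of w on y is 2.
rng = np.random.default_rng(0)
n = 5000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
w = (rng.uniform(size=n) < logistic(0.5 * x1 - 0.5 * x2)).astype(float)
y = 1.0 + 2.0 * w + x1 + x2 + rng.normal(size=n)

# Flexible specification: levels, squares and the interaction of the x variables.
h = np.column_stack([np.ones(n), x1, x2, x1 ** 2, x2 ** 2, x1 * x2])

ate, se_ate, att, se_att, p_hat = psw_estimates(y, w, h)
beta_w = pscore_regression(y, w, p_hat)
print(f"ATE (psw): {ate:.3f}  (se {se_ate:.3f})")
print(f"ATT (psw): {att:.3f}  (se {se_att:.3f})")
print(f"Propensity score regression, coefficient on w: {beta_w:.3f}")
```

In practice the logit step would normally come from a standard statistics package; the hand-written Newton-Raphson loop is used here only to keep the sketch free of dependencies beyond NumPy.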