
Propensity Score Weighting (psw)

Recall the expressions for $\tau_{ate}$ and $\tau_{att}$ in terms of (expectations of) functions of $w$, $y$ and the propensity score (bottom of p.9 and top of p.10). These expressions are “weighted” means of $y$, where the weights are functions of $w$ and the propensity score. A “natural” way to use them to estimate program effects is to estimate the propensity score and then take the sample averages of the above expressions. This can be expressed as follows:
$$\hat{\tau}_{ate,psw} = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{w_i y_i}{\hat{p}(\mathbf{x}_i)} - \frac{(1-w_i)\,y_i}{1-\hat{p}(\mathbf{x}_i)}\right] = \frac{1}{N}\sum_{i=1}^{N}\frac{(w_i-\hat{p}(\mathbf{x}_i))\,y_i}{\hat{p}(\mathbf{x}_i)\,(1-\hat{p}(\mathbf{x}_i))}$$

$$\hat{\tau}_{att,psw} = \frac{1}{N}\sum_{i=1}^{N}\frac{(w_i-\hat{p}(\mathbf{x}_i))\,y_i}{\hat{\rho}\,(1-\hat{p}(\mathbf{x}_i))}$$

where $\hat{p}(\mathbf{x})$ is an estimate of $p(\mathbf{x})$ and $\hat{\rho}$ is the fraction of observations for which $w = 1$.

The propensity score can be estimated using a logit (or probit) functional form. It is best to use a flexible functional form with lots of squared terms and interaction terms for the $\mathbf{x}$ variables. Hirano, Imbens and Ridder (2003) showed that if you increase the number of these terms as the sample size gets larger (this is known as a series estimator), this method is asymptotically efficient (in a semiparametric sense).

Of course, we also need to get standard errors for these two estimates. This can be done using regression methods. To get started, define $\mathbf{d}_i$ as the “score” (the vector of first derivatives with respect to $\gamma$, one element for each $\mathbf{x}$ variable) of the log likelihood of the propensity score function $p(\mathbf{x}, \gamma)$ [see p.7 of Lecture 14], which has the parameters $\gamma$:

$$\mathbf{d}_i = \mathbf{d}_i(w_i, \mathbf{x}_i, \gamma) = \frac{\nabla_{\gamma}\,p(\mathbf{x}_i,\gamma)'\,(w_i - p(\mathbf{x}_i,\gamma))}{p(\mathbf{x}_i,\gamma)\,(1-p(\mathbf{x}_i,\gamma))}$$
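Before continuing with the standard errors, the two point estimates defined above can be sketched numerically. The following is a minimal illustration, not part of the original notes: the simulated data, the hand-written Newton-Raphson logit routine `fit_logit`, and the helper `psw_estimates` are all assumptions made for the example, and the true treatment effect is set to 2 so the output can be checked by eye.

```python
import numpy as np


def fit_logit(h, w, n_iter=50, tol=1e-10):
    """Logit maximum likelihood by Newton-Raphson (h includes a constant)."""
    gamma = np.zeros(h.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-h @ gamma))
        score = h.T @ (w - p)                        # gradient of the log likelihood
        info = (h * (p * (1.0 - p))[:, None]).T @ h  # negative Hessian
        step = np.linalg.solve(info, score)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    return gamma


def psw_estimates(y, w, p_hat):
    """Sample analogues of the weighted means for tau_ate and tau_att."""
    rho_hat = w.mean()  # fraction of observations with w = 1
    tau_ate = np.mean((w - p_hat) * y / (p_hat * (1.0 - p_hat)))
    tau_att = np.mean((w - p_hat) * y / (rho_hat * (1.0 - p_hat)))
    return tau_ate, tau_att


# Hypothetical simulated data: x[:, 0] affects both treatment and outcome,
# and the treatment effect is a constant 2.
rng = np.random.default_rng(0)
N = 5000
x = rng.normal(size=(N, 2))
true_p = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] - 0.5 * x[:, 1])))
w = (rng.uniform(size=N) < true_p).astype(float)
y = 1.0 + 2.0 * w + x[:, 0] + rng.normal(size=N)

# Flexible logit: constant, levels, squares and the interaction of the x's.
h = np.column_stack([np.ones(N), x, x ** 2, x[:, 0] * x[:, 1]])
gamma_hat = fit_logit(h, w)
p_hat = 1.0 / (1.0 + np.exp(-h @ gamma_hat))

tau_ate, tau_att = psw_estimates(y, w, p_hat)
print(f"tau_ate_psw = {tau_ate:.3f}, tau_att_psw = {tau_att:.3f}")
```

In practice a packaged logit or probit routine would normally replace `fit_logit`; it is written out here only to keep the sketch self-contained.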
Then define $k_i$ as:

$$k_i = \frac{(w_i - p(\mathbf{x}_i,\gamma))\,y_i}{p(\mathbf{x}_i,\gamma)\,(1-p(\mathbf{x}_i,\gamma))}$$

Use your estimates of $\gamma$ to estimate $\mathbf{d}_i$ and $k_i$ for all observations in the data. Denoting these estimates as $\hat{\mathbf{d}}_i$ and $\hat{k}_i$, regress $\hat{k}_i$ on all the elements in $\hat{\mathbf{d}}_i$ and save the estimated residuals from this regression, which can be denoted as $\hat{e}_i$. Then the (asymptotic) standard error of $\hat{\tau}_{ate,psw}$ can be calculated as:

$$\frac{1}{\sqrt{N}}\left[\frac{1}{N}\sum_{i=1}^{N}\hat{e}_i^{\,2}\right]^{1/2}$$

If you use a logit functional form for $p(\mathbf{x})$, then this is even simpler to do: $\mathbf{d}_i$ simplifies to:

$$\mathbf{d}_i = \mathbf{h}_i\,(w_i - \hat{p}_i)$$

where $\mathbf{h}_i$ is simply the $\mathbf{x}$ variables, including squared and interaction terms, and $\hat{p}_i = \exp(\mathbf{h}_i'\hat{\gamma})/(1 + \exp(\mathbf{h}_i'\hat{\gamma}))$.

Wooldridge shows on pp.923-924 that the standard error for $\hat{\tau}_{att,psw}$ can be expressed as:
$$\frac{1}{\sqrt{N}}\cdot\frac{1}{\hat{\rho}}\left[\frac{1}{N}\sum_{i=1}^{N}\left(\hat{r}_i - \hat{\tau}_{att,psw}\,w_i\right)^2\right]^{1/2}$$

where the $\hat{r}_i$’s are the residuals from a regression of $\hat{q}_i$ on $\hat{\mathbf{d}}_i$, where $\hat{q}_i = [w_i - p(\mathbf{x}_i, \hat{\gamma})]\,y_i/[1 - p(\mathbf{x}_i, \hat{\gamma})]$.

Propensity Score Regression

On pages 924-927, Wooldridge discusses another method using propensity scores. This is very simple; just regress $y$ on a constant, $w$, and the estimated propensity score. However, he points out (p.927) that this method is, in general, inefficient.
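The regression-based recipes in this part of the notes (the standard errors for the two weighting estimators, and the propensity score regression) can be sketched together as follows. This is an illustration under stated assumptions rather than Wooldridge's own code: the data are simulated with a true effect of 2, the logit is fitted by a hand-written Newton-Raphson loop, and all function names are hypothetical. One further assumption is flagged in the code: a constant is added to the regression of $\hat{k}_i$ on $\hat{\mathbf{d}}_i$ so that the residuals are centered, which the notes do not state explicitly.

```python
import numpy as np


def fit_logit(h, w, n_iter=50, tol=1e-10):
    """Logit maximum likelihood by Newton-Raphson (h includes a constant)."""
    gamma = np.zeros(h.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-h @ gamma))
        info = (h * (p * (1.0 - p))[:, None]).T @ h
        step = np.linalg.solve(info, h.T @ (w - p))
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    return gamma


def ols(target, regressors):
    """Return OLS coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    return coef, target - regressors @ coef


def psw_with_se(y, w, h):
    """PSW estimates of the ATE and ATT with regression-based standard errors."""
    n = len(y)
    p = 1.0 / (1.0 + np.exp(-h @ fit_logit(h, w)))
    rho = w.mean()
    d = h * (w - p)[:, None]           # logit score: h_i (w_i - p_i), one row per i
    k = (w - p) * y / (p * (1.0 - p))  # k_i
    q = (w - p) * y / (1.0 - p)        # q_i
    tau_ate, tau_att = k.mean(), q.mean() / rho

    # ATE: residuals from regressing k on d.
    # ASSUMPTION: a constant is included so the residuals have mean zero.
    _, e = ols(k, np.column_stack([np.ones(n), d]))
    se_ate = np.sqrt(np.mean(e ** 2)) / np.sqrt(n)

    # ATT: residuals from regressing q on d, then centered with tau_att * w.
    _, r = ols(q, d)
    se_att = np.sqrt(np.mean((r - tau_att * w) ** 2)) / (rho * np.sqrt(n))
    return tau_ate, se_ate, tau_att, se_att, p


# Hypothetical simulated data with a constant treatment effect of 2.
rng = np.random.default_rng(0)
N = 5000
x = rng.normal(size=(N, 2))
true_p = 1.0 / (1.0 + np.exp(-(0.5 * x[:, 0] - 0.5 * x[:, 1])))
w = (rng.uniform(size=N) < true_p).astype(float)
y = 1.0 + 2.0 * w + x[:, 0] + rng.normal(size=N)
h = np.column_stack([np.ones(N), x, x ** 2, x[:, 0] * x[:, 1]])

tau_ate, se_ate, tau_att, se_att, p_hat = psw_with_se(y, w, h)
print(f"ATE (psw): {tau_ate:.3f}  (se {se_ate:.3f})")
print(f"ATT (psw): {tau_att:.3f}  (se {se_att:.3f})")

# Propensity score regression: y on a constant, w, and the estimated score.
coef, _ = ols(y, np.column_stack([np.ones(N), w, p_hat]))
print(f"Coefficient on w in the propensity score regression: {coef[1]:.3f}")
```

Because the simulated effect is constant, all three numbers should land near 2 in this design; with heterogeneous effects the ATE and ATT estimates would generally differ.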