Propensity Score Weighting (psw)
Recall the expressions for $\tau_{ate}$ and $\tau_{att}$ in terms of
(expectations of) functions of w, y and the propensity
score (bottom of p.9 and top of p.10).
These expressions
are “weighted” means of y, where the weights are
functions of w and the propensity score.
A “natural” way
to use them to estimate program effects is to estimate the
propensity score and then take the sample averages of the
above expressions.
This can be expressed as follows:
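$$\hat{\tau}_{ate,psw} = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{w_i y_i}{\hat{p}(\mathbf{x}_i)} - \frac{(1-w_i)\,y_i}{1-\hat{p}(\mathbf{x}_i)}\right] = \frac{1}{N}\sum_{i=1}^{N}\frac{[w_i-\hat{p}(\mathbf{x}_i)]\,y_i}{\hat{p}(\mathbf{x}_i)\,[1-\hat{p}(\mathbf{x}_i)]}$$

$$\hat{\tau}_{att,psw} = \frac{1}{N}\sum_{i=1}^{N}\frac{[w_i-\hat{p}(\mathbf{x}_i)]\,y_i}{\hat{\rho}\,[1-\hat{p}(\mathbf{x}_i)]}$$

where $\hat{p}(\mathbf{x})$ is an estimate of $p(\mathbf{x})$ and $\hat{\rho}$ = the fraction of
observations for which w = 1.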
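As a rough illustration, the two sample averages can be computed in a few lines once estimated propensity scores are in hand. The sketch below assumes NumPy arrays; the function name psw_estimates and its arguments are hypothetical and do not come from these notes.

```python
import numpy as np

def psw_estimates(y, w, p_hat):
    """Return (ate, att) propensity score weighting estimates.

    y     : outcomes
    w     : treatment indicator (1 = treated, 0 = not treated)
    p_hat : estimated propensity scores, strictly between 0 and 1
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    if np.any((p_hat <= 0.0) | (p_hat >= 1.0)):
        raise ValueError("propensity scores must lie strictly between 0 and 1")

    rho_hat = w.mean()  # fraction of observations with w = 1

    # Sample average of (w - p) y / [p (1 - p)]
    ate = np.mean((w - p_hat) * y / (p_hat * (1.0 - p_hat)))
    # Sample average of (w - p) y / [rho (1 - p)]
    att = np.mean((w - p_hat) * y / (rho_hat * (1.0 - p_hat)))
    return ate, att
```

When every estimated propensity score equals the treated fraction, both estimates reduce to the simple difference in mean outcomes between treated and untreated observations, which is a convenient sanity check.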
The propensity score can be estimated using a logit (or
probit) functional form.
It is best to use a flexible
functional form with many squared terms and interaction
terms for the $\mathbf{x}$ variables.
Hirano, Imbens and Ridder
(2003) showed that if you increase the number of these
terms as the sample size gets larger (this is known as a
series estimator) this method is asymptotically efficient
(in a semiparametric sense).
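One way to build such a flexible logit is sketched below. It is only an illustration: the helper names flexible_terms and fit_logit are hypothetical, the logit is fitted by a plain Newton-Raphson loop written with NumPy rather than by any particular statistics package, and the choice of "levels, squares and pairwise interactions" is one assumption about what a flexible functional form might contain.

```python
import numpy as np
from itertools import combinations

def flexible_terms(x):
    """Build h = [1, x, x^2, pairwise interactions] from an (N, K) array.

    Note: squares of 0/1 dummy variables duplicate the original columns
    and should be dropped before fitting, or the Hessian is singular.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n_vars = x.shape[1]
    cols = [np.ones(x.shape[0])]
    cols += [x[:, j] for j in range(n_vars)]
    cols += [x[:, j] ** 2 for j in range(n_vars)]
    cols += [x[:, j] * x[:, k] for j, k in combinations(range(n_vars), 2)]
    return np.column_stack(cols)

def fit_logit(h, w, max_iter=100, tol=1e-10):
    """Logit maximum likelihood by Newton-Raphson; returns (gamma_hat, p_hat)."""
    h = np.asarray(h, dtype=float)
    w = np.asarray(w, dtype=float)
    gamma = np.zeros(h.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(h @ gamma)))
        grad = h.T @ (w - p)                          # score of the log likelihood
        hess = (h * (p * (1.0 - p))[:, None]).T @ h   # minus the Hessian
        step = np.linalg.solve(hess, grad)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    p_hat = 1.0 / (1.0 + np.exp(-(h @ gamma)))
    return gamma, p_hat
```

A typical call would be h = flexible_terms(x) followed by gamma_hat, p_hat = fit_logit(h, w); the resulting p_hat can then be plugged into the weighting formulas.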
Of course, we also need to get standard errors for these
two estimates.
This can be done using regression
methods.
To get started, define $\mathbf{d}_i$ as the “score” (the vector
of first derivatives with respect to the parameters $\gamma$, one
for each $\mathbf{x}$ variable) of the log likelihood
of the propensity score function $p(\mathbf{x}, \gamma)$ [see p.7 of Lecture
14]:

$$\mathbf{d}_i = \mathbf{d}_i(w_i, \mathbf{x}_i, \gamma) = \frac{\nabla_{\gamma}\, p(\mathbf{x}_i,\gamma)'\,[w_i - p(\mathbf{x}_i,\gamma)]}{p(\mathbf{x}_i,\gamma)\,[1 - p(\mathbf{x}_i,\gamma)]}$$
Then define $k_i$ as:

$$k_i = \frac{[w_i - p(\mathbf{x}_i,\gamma)]\,y_i}{p(\mathbf{x}_i,\gamma)\,[1 - p(\mathbf{x}_i,\gamma)]}$$
Use your estimates of $\gamma$ to estimate $\mathbf{d}_i$ and $k_i$ for all
observations in the data. Denoting these estimates as $\hat{\mathbf{d}}_i$
and $\hat{k}_i$, regress $\hat{k}_i$ on a constant and all the elements in
$\hat{\mathbf{d}}_i$ and save the
estimated residuals from this regression, which can be
denoted as $\hat{e}_i$.
Then the (asymptotic) standard error of $\hat{\tau}_{ate,psw}$
can be calculated as:

$$\frac{1}{\sqrt{N}}\left[\frac{1}{N}\sum_{i=1}^{N}\hat{e}_i^{\,2}\right]^{1/2}$$
If you use a logit functional form for $p(\mathbf{x})$, then this is
even simpler to do: $\mathbf{d}_i$ simplifies to:

$$\mathbf{d}_i = \mathbf{h}_i\,(w_i - \hat{p}_i)$$

where $\mathbf{h}_i$ is simply the $\mathbf{x}$ variables, including squared and
interaction terms, and
$\hat{p}_i = \exp(\mathbf{h}_i'\hat{\gamma})/(1 + \exp(\mathbf{h}_i'\hat{\gamma}))$.
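For the logit case, the regression-based standard error of the ATE estimate might be computed along the following lines. This is a sketch under stated assumptions: the function name psw_ate_with_se is hypothetical, h is taken to be the matrix of terms used in the logit, and the auxiliary regression includes a constant so that the residuals are centered around zero.

```python
import numpy as np

def psw_ate_with_se(y, w, h, p_hat):
    """PSW estimate of the ATE and its asymptotic standard error (logit case).

    h     : (N, M) matrix of terms used in the logit (the h_i vectors)
    p_hat : fitted logit probabilities
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    h = np.asarray(h, dtype=float)
    n = y.shape[0]

    # k_i = (w_i - p_i) y_i / [p_i (1 - p_i)]; its sample mean is the ATE estimate
    k_hat = (w - p_hat) * y / (p_hat * (1.0 - p_hat))
    ate = k_hat.mean()

    # Logit score: d_i = h_i (w_i - p_i)
    d_hat = h * (w - p_hat)[:, None]

    # Regress k_i on a constant and d_i (assumption: constant included)
    regressors = np.column_stack([np.ones(n), d_hat])
    coef, *_ = np.linalg.lstsq(regressors, k_hat, rcond=None)
    e_hat = k_hat - regressors @ coef

    # [ (1/N) sum e_i^2 ]^(1/2) / sqrt(N)
    se = np.sqrt(np.mean(e_hat ** 2)) / np.sqrt(n)
    return ate, se
```

Using least squares via np.linalg.lstsq keeps the sketch robust if some columns of the score matrix are nearly collinear; only the residuals are needed, not the coefficients.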
Wooldridge shows on pp.923–924 that the standard error
for $\hat{\tau}_{att,psw}$
can be expressed as:
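$$\frac{1}{\hat{\rho}}\left[\frac{1}{N}\sum_{i=1}^{N}\left(\hat{r}_i - \hat{\tau}_{att,psw}\,w_i\right)^{2}\right]^{1/2}\Big/\sqrt{N}$$

where the $\hat{r}_i$’s are the residuals from a regression of $\hat{q}_i$
on $\hat{\mathbf{d}}_i$, where
$\hat{q}_i = [w_i - p(\mathbf{x}_i,\hat{\gamma})]\,y_i/[1 - p(\mathbf{x}_i,\hat{\gamma})]$.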
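The ATT standard error can be sketched in the same way. The code below simply follows the formula as written in these notes for the logit case; the function name psw_att_with_se is hypothetical, and the regression of the q terms on the score is run without a constant, which is how the regression is described here.

```python
import numpy as np

def psw_att_with_se(y, w, h, p_hat):
    """PSW estimate of the ATT and its standard error, per the formula in the notes."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    h = np.asarray(h, dtype=float)
    n = y.shape[0]

    rho_hat = w.mean()  # fraction of observations with w = 1

    # q_i = (w_i - p_i) y_i / (1 - p_i); the ATT estimate is mean(q_i) / rho
    q_hat = (w - p_hat) * y / (1.0 - p_hat)
    att = q_hat.mean() / rho_hat

    # Logit score d_i = h_i (w_i - p_i); residuals from regressing q_i on d_i
    d_hat = h * (w - p_hat)[:, None]
    coef, *_ = np.linalg.lstsq(d_hat, q_hat, rcond=None)
    r_hat = q_hat - d_hat @ coef

    # (1/rho) [ (1/N) sum (r_i - att * w_i)^2 ]^(1/2) / sqrt(N)
    se = np.sqrt(np.mean((r_hat - att * w) ** 2)) / (rho_hat * np.sqrt(n))
    return att, se
```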
Propensity Score Regression
On pages 924–927, Wooldridge discusses another
popular method using propensity scores.
This is very
simple; just regress y on a constant, w, and the estimated
propensity score.
However, he points out (p.927) that this
method is, in general, inefficient.
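A minimal sketch of this regression, assuming the estimated propensity scores are already available, is given below. The function name propensity_score_regression is hypothetical; the coefficient on w is the one read as the program effect estimate.

```python
import numpy as np

def propensity_score_regression(y, w, p_hat):
    """OLS of y on a constant, w and the estimated propensity score.

    Returns the coefficient vector [constant, w, p_hat]; the second
    element (the coefficient on w) is the treatment effect estimate.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    regressors = np.column_stack([np.ones(y.shape[0]), w, p_hat])
    coef, *_ = np.linalg.lstsq(regressors, y, rcond=None)
    return coef
```

Note that standard errors from this regression would still need to account for the fact that the propensity score is estimated; the sketch returns point estimates only.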
Spring '14, Glewwe, Paul W.