1. INTRODUCTION
In this paper I further study inverse probability weighted (IPW) M-estimation in the
context of nonrandomly missing data. In previous work, I considered IPW M-estimation to
account for variable probability sampling [Wooldridge (1999)] and for attrition and
nonresponse [Wooldridge (2002a)]. The current paper extends this work by allowing a
more general class of missing data mechanisms. In particular, I allow the selection
probabilities to come from a conditional maximum likelihood estimation problem that does
not necessarily require that the conditioning variables always be observed. In addition,
for the case of exogenous selection – to be defined precisely in Section 4 – I study the
properties of the IPW M-estimator when the selection probability model is misspecified.
In Wooldridge (2002a), I adopted essentially the same selection framework as Robins
and Rotnitzky (1995), Rotnitzky and Robins (1995), and Robins, Rotnitzky, and Zhao
(1995). Namely, under an ignorability assumption, the probability of selection is obtained
from a probit or logit on a set of always-observed variables. A key restriction, that the
conditioning variables are always observed, rules out some interesting cases. A leading
one is where the response variable is a censored survival time or duration, where the
censoring times are random and vary across individuals; see, for example, Koul, Susarla,
and van Ryzin (1981) and Honoré, Khan, and Powell (2002). A related problem arises
when one variable, say, medical cost or welfare cost, is unobserved because a duration,