Detecting Discrimination in Audit and Correspondence Studies
David Neumark

Audit and correspondence studies
Fictitious individuals who are identical except for race, sex, or ethnicity apply for jobs
  Audit studies: actual testers, observe job offers
  Correspondence studies: paper/online applications, observe callbacks
Evidence of group differences in outcomes (for example, blacks getting fewer job offers than whites) is viewed as compelling evidence of discrimination (Pager, 2007; Riach and Rich, 2002)
A/C studies are nearly unanimous in finding evidence of discrimination
  Blacks, Hispanics, and women in the United States (Mincy, 1993; Neumark, 1996; Bertrand and Mullainathan [BM], 2004)
  Moroccans in Belgium and the Netherlands (Smeeters and Nayer, 1998; Bovenkerk et al., 1995)
  Lower castes in India (Banerjee et al., 2008)

Criticisms: controlling for mean differences
Difficult to control for all productivity-relevant differences employers can observe
Difficult to control for experimenter effects
  E.g., in Urban Institute study, white and minority testers were aware of the purpose of the test, and even, in their training, informed about “the pervasive problem of discrimination in the United States”
These arguments are sometimes a stretch, and A/C studies are surely better than regression studies (“residual approach” to discrimination)
Controlling for a subset of relevant characteristics can lead to more bias than not controlling for any, when characteristics controlled and not controlled are negatively correlated, and can exacerbate effects of unimportant differences (but is negative correlation likely?)
These criticisms (Heckman and Siegelman [HS], 1993) have been addressed in many ways, including the switch from audit to correspondence studies

Criticisms: distributional differences
Best-case scenario: no mean differences in observables or unobservables
  More likely to hold in correspondence study
  No average observable differences between groups, and information rich enough that no reason to assume unobservable group differences
HS show that even this case is problematic
  Differences in variance of unobservable can lead to over- or underestimate of discrimination (equivalently, effect of discrimination on hiring is unidentified)
  Potentially devastating criticism that has been ignored (most recently, Pager’s 2007 review of the contributions and critiques of these studies)
  Assumption of different variances of unobservables is common in models of statistical discrimination (Aigner and Cain, 1977; Lundberg and Startz, 1983)

Goals of paper
Develop method to (1) test for difference in variance of unobservables and (2) identify discrimination even when variances differ, in correspondence study
  Requires data that are not usually collected in correspondence study, but can be easily added so that this method can be easily implemented
Implement method using existing data set from one correspondence study (BM) that has such data
Assess estimator

Intuition behind problem of differences in variance of unobservables
Ignore any mean differences
Correspondence study controls for one characteristic $X^I$, and employers care about sum of $X^I$ and $X^{II}$
  $X^I$ and $X^{II}$ uncorrelated (not required), and $X^{II}$ not observed
Hire only if expected sum exceeds some critical value (hiring standard) with sufficiently high probability
  $X^I$ low relative to hiring standard: employer will favor group with high variance of unobservable, since there is higher probability that this group will have a high sum $X^I + X^{II}$
  $X^I$ high relative to hiring standard: employer less likely to hire from group with high variance of unobservable, since this group is more likely to have low sum $X^I + X^{II}$
Statistical discrimination models assume higher variance for blacks vs. whites, but we don’t know how this will bias results (if true) since we don’t know whether $X^I$ is high or low relative to the hiring standard

Intuition behind solution
In hiring/callback probit we only identify ratios of coefficients (in latent variable model) to standard deviation of unobservable
In typical study, latent variable model coefficients of “controls” like resume characteristics should be zero, since applicants designed to be equally qualified
If data also include applicants with different levels of qualifications, coefficients should be nonzero, so we can learn something from these data
If we rule out differences in latent variable coefficients of these variables for blacks and whites (e.g.), then differences in probit coefficients on these variables are informative about differences in standard deviation of unobservables
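
The first point can be illustrated with a small numerical sketch. It assumes a one-regressor hiring rule with made-up values; the name `hire_prob` and every number are illustrative and do not come from the paper. Scaling the latent coefficient, the threshold, and the standard deviation by a common factor leaves each hiring probability unchanged, which is why a probit can recover only ratios to the standard deviation.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def hire_prob(beta, c, sigma, x):
    """Pr(hire | x) when hire iff beta*x + e > c, with e ~ N(0, sigma^2)."""
    return PHI((beta * x - c) / sigma)


# Hypothetical latent-variable parameters (illustration only).
base = dict(beta=0.5, c=1.0, sigma=1.0)
scaled = dict(beta=1.0, c=2.0, sigma=2.0)  # every parameter doubled

for x in (0.0, 1.0, 2.0, 3.0):
    p_base = hire_prob(x=x, **base)
    p_scaled = hire_prob(x=x, **scaled)
    # The two parameter sets imply the same probability at every x,
    # so callback data cannot tell them apart.
    print(f"x={x:.1f}  base={p_base:.4f}  scaled={p_scaled:.4f}")
```

If the latent coefficient is the same for two groups, a difference in their probit slopes can then only come from a difference in the standard deviation, which is the lever the method uses.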

Formal setup
Let productivity depend on two individual characteristics
  $X' = (X^I, X^{II})$
Let $R$ be a dummy for race, with $R = 1$ for blacks and $R = 0$ for whites
Assume productivity is additive
  $P(X', F) = X^I + X^{II} + F$
Treatment of a worker (valuation of productivity, implicitly or explicitly) depends on $P$ and possibly $R$ (if there is discrimination), assumed also additive, so
  $T(P(X', F), R) = P + \gamma' R$
Discrimination against blacks implies $\gamma' < 0$
  Simple example would be determination of wages in Becker’s employer discrimination model

Audit or correspondence study
Two testers (one with $R = 1$ and one with $R = 0$ in each pair) are sent to firms to apply for jobs
Researcher attempts to standardize productivity based on observable characteristics
Denote expected productivity for blacks and whites, based on what the firm observes, as $P_B^*$ and $P_W^*$
  Study tries to set $P_B^* = P_W^*$
Outcome $T$ is observed for each tester, so each “test” yields observation on
  $T(P_B^*, 1) - T(P_W^*, 0) = P_B^* + \gamma' - P_W^*$
If $P_B^* = P_W^*$, then averaging across tests yields an estimate of $\gamma'$
Alternatively, we can estimate $\gamma'$ from regression of $T$ on $R$
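
A minimal simulation sketch of the averaging argument, under assumptions that are not in the slides: a continuous outcome $T$, independent normal noise around each tester's outcome, and made-up values for the discrimination parameter and the standardized productivity (`GAMMA`, `P_STAR`, and `run_tests` are hypothetical names).

```python
import random
from statistics import mean

GAMMA = -0.3   # hypothetical discrimination parameter (gamma')
P_STAR = 2.0   # standardized expected productivity, same for both testers


def run_tests(n_tests, noise_sd=1.0, seed=0):
    """Return paired differences T(black) - T(white) across n_tests firms."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_tests):
        # Assumed noise: each observed outcome deviates from P + gamma'*R.
        t_black = P_STAR + GAMMA + rng.gauss(0.0, noise_sd)
        t_white = P_STAR + rng.gauss(0.0, noise_sd)
        diffs.append(t_black - t_white)
    return diffs


diffs = run_tests(10_000)
# With one black and one white tester per firm, the mean paired difference
# equals the slope from a regression of T on the race dummy R.
gamma_hat = mean(diffs)
print(f"estimate of gamma': {gamma_hat:.3f}")
```

Because productivity is standardized to the same level within each pair, it cancels in every difference and only the race term and noise remain.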

Unbiased test requires equal means for uncontrolled variables
Audit study controls only one component of productivity, e.g., setting $X_B^I = X_W^I = X^{I*}$, and $X^{II}$ is unobserved
In this case each test yields an observation on
  $T(P_B^*, 1) - T(P_W^*, 0) = P_B^* + \gamma' - P_W^*$
  $= X_B^I + E(X_B^{II}) + \gamma' - (X_W^I + E(X_W^{II})) = \gamma' + E(X_B^{II}) - E(X_W^{II})$
  Study yields an unbiased estimate only if $E(X_B^{II}) = E(X_W^{II})$
Audit studies can control for some observable qualifications, but not possible to control for all observable differences
Correspondence studies instead send resumes, typically randomized by group, so there are no observable differences, and perhaps also no unobserved differences
But equal means is necessary, not sufficient, for unbiasedness

Taste and statistical discrimination
In correspondence study, still can’t rule out differences in group means of unobservables (uncontrolled); if $E(X_B^{II}) \neq E(X_W^{II})$, we estimate $\gamma' + E(X_B^{II}) - E(X_W^{II})$
  In this case, estimated difference reflects taste discrimination plus mean difference in unobservable assumed by employer (statistical discrimination)
  With rich controls, though, one might be skeptical about unobserved mean differences
EEOC describes as illegal “employment decisions based on stereotypes or assumptions about the abilities, traits, or performance of individuals of a certain sex, race, age, religion, or ethnic group”
  So correspondence study may estimate “illegal” discrimination, without isolating taste discrimination
Some recent correspondence studies try to distinguish taste from statistical discrimination (difficult)

Distributional differences in best-case scenario
Consensus that correspondence studies meet higher standard of validity
But HS show that even in best case, when group means equal for uncontrolled variables (conditional on observed, or simply all variables), results can be biased
Key reason is that outcome is hiring, dependent on expected productivity exceeding a threshold

Bias from distributional differences (I)
HS: even with group means equal for uncontrolled variables, results can be biased because hiring outcome depends on expected productivity exceeding a threshold ($c'$)
Hiring rules are
  $T(P(X', F) \mid R = 1) = 1$ if $\beta_I' X_B^I + X_B^{II} + \gamma' > c'$
  $T(P(X', F) \mid R = 0) = 1$ if $\beta_I' X_W^I + X_W^{II} > c'$
Audit study controls for $X^I$, with $X_B^I = X_W^I = X^{I*}$
$X_B^{II}$ and $X_W^{II}$ are normally distributed, with mean zero (without loss of generality) and standard deviations $\sigma_B^{II}$ and $\sigma_W^{II}$

Bias from distributional differences (II)
Probabilities that blacks and whites are hired are
  $\Pr[T(P(X^{I*}, X_B^{II}) \mid R = 1) = 1] = \Phi[(\beta_I' X^{I*} + \gamma' - c')/\sigma_B^{II}]$
  $\Pr[T(P(X^{I*}, X_W^{II}) \mid R = 0) = 1] = \Phi[(\beta_I' X^{I*} - c')/\sigma_W^{II}]$
Difference between two expressions (or ratio) is supposed to be informative about discrimination, but we can’t distinguish influence of nonzero $\gamma'$ from unequal standard deviations $\sigma_W^{II} \neq \sigma_B^{II}$
E.g., $\sigma_W^{II} > \sigma_B^{II}$, and $X^{I*}$ set at low level
  With $\beta_I' X^{I*} < c'$ and $\sigma_W^{II} > \sigma_B^{II}$, first probability (black callback) is lower than second even if $\gamma' = 0$
Different outcomes possible depending on relative magnitudes of $\sigma_W^{II}$ vs. $\sigma_B^{II}$ and $\beta_I' X^{I*}$ (or $X^{I*}$) vs. $c'$
So with different variances of unobservables, effect of discrimination on callbacks is unidentified
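
The two hiring probabilities can be evaluated directly to see the sign of the spurious gap flip. This is a sketch with hypothetical parameter values (no discrimination, whites given the larger standard deviation); `callback_prob` and all constants are illustrative names and numbers, not estimates from any study.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def callback_prob(beta_x, gamma, c, sigma):
    """Pr(hire) = Phi((beta*X* + gamma - c) / sigma) from the hiring rule."""
    return PHI((beta_x + gamma - c) / sigma)


# Hypothetical values: gamma' = 0 (no discrimination), sigma_W > sigma_B.
GAMMA, C = 0.0, 1.0
SIGMA_B, SIGMA_W = 1.0, 2.0

for label, beta_x in [("low standardization", 0.0),
                      ("high standardization", 2.0)]:
    p_black = callback_prob(beta_x, GAMMA, C, SIGMA_B)
    p_white = callback_prob(beta_x, 0.0, C, SIGMA_W)
    # The gap is nonzero in both cases although gamma' is zero.
    print(f"{label}: black={p_black:.3f} white={p_white:.3f} "
          f"gap={p_black - p_white:+.3f}")
```

With the standardized level below the threshold the black callback rate is lower, and with it above the threshold the black rate is higher, even though the race parameter is zero in both cases.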

Bias from distributional differences (III)
Heckman (1998): X2 is unobservable, superscript “1” denotes blacks, and c1, c0 are hiring thresholds
So despite absence of discrimination, can get evidence in either direction
  E.g., with higher variance for whites (0), and low level of standardization, we “find” discrimination against blacks, because employers want high X2 when X1 low
Similar spurious evidence can occur in cases when there is discrimination

Solution (I)
A higher variance for one group, cet. par., implies a smaller effect of observed characteristics on employment for that group
  Intuition: in the limit, if the variance of unobserved $X^{II}$ were infinite for a group, then $X^I$ (the observed productivity-related variable) would have no effect on the evaluation of whether an applicant from that group meets the standard for hiring
So information from a correspondence study on how variation in observable qualifications is related to employment outcomes can be informative about the relative variances of the unobservables
And since we saw that identification problem comes down to distinguishing between relative variance of unobservable and $\gamma'$, this can identify discrimination

Solution (II)
We are interested in
  $\Phi[(\beta_I' X^{I*} + \gamma' - c')/\sigma_B^{II}] - \Phi[(\beta_I' X^{I*} - c')/\sigma_W^{II}]$
Simplify by normalizing one variance, and writing the other in terms of the relative standard deviation $\sigma_{BR}^{II} = \sigma_B^{II}/\sigma_W^{II}$, so we are now estimating
  $\Phi[(\beta_I X^{I*} + \gamma - c)/\sigma_{BR}^{II}] - \Phi[\beta_I X^{I*} - c]$
where now all the coefficients are normalized with respect to $\sigma_W^{II}$
  So identification of $\gamma$ now gives us the marginal effect of race
With meaningful variation in $X^{I*}$ we can identify $\beta_I/\sigma_{BR}^{II}$ and $\gamma/\sigma_{BR}^{II}$ from the probit for blacks, and $\beta_I$ from the probit for whites, in which case the ratio identifies $\sigma_{BR}^{II}$, and therefore $\gamma$
With joint estimation, we can do inference on the parameter $\sigma_{BR}^{II}$
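
The identification step is simple arithmetic once the group-specific probit coefficients are in hand. The sketch below builds those coefficients from hypothetical "true" values and then backs out the relative standard deviation and the race effect; `recover_structural` and all numbers are illustrative, and it works with exact population coefficients, not estimates.

```python
def recover_structural(slope_b, intercept_b, slope_w, intercept_w):
    """Back out (sigma_BR, gamma) from group-specific probit coefficients.

    Blacks: index = (beta*x + gamma - c) / sigma_BR
    Whites: index = beta*x - c          (white s.d. normalized to 1)
    Assumes beta is the same for both groups (the identifying assumption).
    """
    sigma_br = slope_w / slope_b                  # ratio of slopes
    gamma = intercept_b * sigma_br - intercept_w  # undo scaling, remove -c
    return sigma_br, gamma


# Hypothetical "true" values used only to construct the probit coefficients.
BETA, C, GAMMA, SIGMA_BR = 0.5, 1.0, -0.3, 1.25

slope_b, intercept_b = BETA / SIGMA_BR, (GAMMA - C) / SIGMA_BR
slope_w, intercept_w = BETA, -C

sigma_hat, gamma_hat = recover_structural(slope_b, intercept_b,
                                          slope_w, intercept_w)
print(f"sigma_BR={sigma_hat:.3f} gamma={gamma_hat:.3f}")
```

Without variation in the standardized characteristic there is no slope to compare, so the ratio, and with it the race effect, could not be separated from the variance difference.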

Identifying assumption
Assumption that $\beta_I$ is the same for blacks and whites is necessary for the ratio of the two coefficients to identify $\sigma_{BR}^{II}$
  Untestable with data on only one productivity-related characteristic
Multiple productivity-related characteristics yield testable restriction
  E.g., if there are two such characteristics we estimate
    $\Phi[(\beta_I^B X^{I*} + \beta_{II}^B Z^{I*} + \gamma - c)/\sigma_{BR}^{II}] - \Phi[\beta_I^W X^{I*} + \beta_{II}^W Z^{I*} - c]$
  allowing different coefficients for blacks and whites
  If the only reason coefficients for blacks and whites differ is because $\sigma_{BR}^{II} \neq 1$, then we must have $\beta_I^B/\beta_I^W = \beta_{II}^B/\beta_{II}^W$
So I start by estimating a probit model with a full set of race interactions, and testing this constraint
  In application, constraint is not rejected

Implementation
Estimation of $\beta_I/\sigma_{BR}^{II}$ and $\beta_I$ can be done via a heteroskedastic probit model (e.g., Williams, 2009) that allows the variance of the unobservable to vary with race
We pool the data for blacks and whites, and estimate a probit model with $\mathrm{Var}(\varepsilon_{ij}) = [\exp(\mu + \omega R_i)]^2$
  Normalize $\mu = 0$ (equivalent to standard normalization in probit, but for just whites in this case)
Estimate via MLE
  Estimate of $\exp(\omega)$ is exactly the estimate of $\sigma_{BR}^{II}$
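
A sketch of the likelihood that such an estimation would maximize, written with the Python standard library only. This is not the author's estimation code; it assumes a single qualification variable, simulated data, and made-up parameter values, and names such as `het_probit_loglik` and `simulate` are hypothetical. It evaluates the log-likelihood at two parameter vectors and leaves the maximization itself to a numerical optimizer.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def het_probit_loglik(params, data):
    """Log-likelihood with Var(eps) = [exp(omega * R)]^2 (mu normalized to 0).

    params = (const, beta, gamma, omega); data = list of (x, r, y).
    """
    const, beta, gamma, omega = params
    total = 0.0
    for x, r, y in data:
        index = (const + beta * x + gamma * r) / math.exp(omega * r)
        p = min(max(PHI(index), 1e-12), 1.0 - 1e-12)  # keep the logs finite
        total += math.log(p) if y else math.log(1.0 - p)
    return total


def simulate(n, params, seed=0):
    """Draw (x, r, y) from the model; all values here are made up."""
    const, beta, gamma, omega = params
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, r = rng.gauss(0.0, 1.0), rng.randrange(2)
        eps = rng.gauss(0.0, math.exp(omega * r))  # s.d. differs by race
        data.append((x, r, int(const + beta * x + gamma * r + eps > 0)))
    return data


TRUE = (-1.0, 1.0, -0.3, 0.5)   # exp(0.5) is the black/white s.d. ratio
data = simulate(20000, TRUE)
ll_true = het_probit_loglik(TRUE, data)
ll_homo = het_probit_loglik((-1.0, 1.0, -0.3, 0.0), data)  # forces ratio = 1
print(f"s.d. ratio exp(omega) = {math.exp(TRUE[3]):.3f}")
print("heteroskedastic parameters fit better:", ll_true > ll_homo)
```

Setting the variance parameter to zero reproduces the ordinary probit, so the comparison above is the kind of contrast a test of equal standard deviations relies on.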

Marginal effects (I)
Typically, to translate probit coefficient estimates into magnitudes that can be interpreted as the marginal effects of a variable ($Z_k$, generically, with coefficient $\beta_k$, when $Z$ is the vector of controls with coefficients $\beta$), we use
  $\partial P(\text{hire})/\partial Z_k = \beta_k \phi(Z\beta)$
  $\phi(\cdot)$ is the standard normal density, standard deviation of the unobservable is normalized to 1
  Evaluated at the means of $Z$
When $Z_k$ is a dummy variable, such as race, the difference in the cumulative normal distribution functions is often used instead, although the difference is usually trivial
In heteroskedastic probit model, if variances of unobservables differ by race, then when race “changes” both the variance and the level of the latent variable (the valuation of a worker’s productivity) that determines hires can shift

Marginal effects (II)
I want to isolate effect on latent variable, since differential treatment of blacks and whites based only on differences in variances of unobservables should not be interpreted as discrimination (more later)
  With continuous version of partial derivative, can decompose the effect of a change in $Z_k$ into these two components
For heteroskedastic probit (HP) model with $\mathrm{Var}(\varepsilon_{ij}) = [\exp(W\omega)]^2$, overall partial derivative of $P(\text{hire})$ with respect to $Z_k$ is
  $\partial P(\text{hire})/\partial Z_k = \phi\{Z\beta/\exp(W\omega)\} \cdot \{\beta_k - Z\beta \cdot \omega_k\}/\exp(W\omega)$
  First term (before minus sign) is partial derivative with respect to changes in $Z_k$ affecting only the level of the latent variable, corresponding to the counterfactual of $Z_k$ changing the valuation of the worker without changing the variance of the unobservable
  Second term is partial derivative with respect to changes via the variance of the unobservable
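
The decomposition can be checked numerically. The sketch below evaluates the two terms of the partial derivative at a hypothetical point and compares their sum with a finite-difference derivative of the hiring probability; `decompose`, `hire_prob`, and all values are illustrative assumptions, not figures from the application.

```python
import math
from statistics import NormalDist

N01 = NormalDist()  # standard normal distribution


def hire_prob(zb, w_omega):
    """P(hire) = Phi(Z*beta / exp(W*omega))."""
    return N01.cdf(zb / math.exp(w_omega))


def decompose(zb, w_omega, beta_k, omega_k):
    """Split dP/dZ_k into its (level, variance) components."""
    sd = math.exp(w_omega)
    dens = N01.pdf(zb / sd)
    level = dens * beta_k / sd              # Z_k shifts the latent index only
    variance = -dens * zb * omega_k / sd    # Z_k shifts the s.d. only
    return level, variance


# Hypothetical evaluation point and coefficients (illustration only).
ZB, W_OMEGA, BETA_K, OMEGA_K = -1.2, 0.2, -0.4, 0.3
level, variance = decompose(ZB, W_OMEGA, BETA_K, OMEGA_K)

# Finite-difference check: move Z_k by h in both the index and the variance.
h = 1e-5
numeric = (hire_prob(ZB + BETA_K * h, W_OMEGA + OMEGA_K * h)
           - hire_prob(ZB - BETA_K * h, W_OMEGA - OMEGA_K * h)) / (2 * h)
print(f"level={level:.4f} variance={variance:.4f} "
      f"total={level + variance:.4f}")
print(f"numeric total={numeric:.4f}")
```

With a negative index and a variance that rises with the variable, the two components have opposite signs, so the overall effect is smaller in magnitude than the level effect alone.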

Application/prior evidence (I)
Bertrand and Mullainathan (2004) do correspondence study of race discrimination, based on black-sounding names
  Explicitly designed resumes of varying quality to test whether returns to higher qualifications were lower for blacks
Evidence of discrimination against blacks
The quality variation does matter
Evidence generally consistent with lower returns to higher qualifications for blacks

Replication (marginal effects)

                                      Males and females                         Females
                                      (1)           (2)           (3)           (4)           (5)           (6)
Black                                 −.033 (.006)  −.030 (.006)  −.030 (.006)  −.033 (.008)  −.030 (.007)  −.030 (.007)
Female                                .009 (.012)   .001 (.011)   .001 (.011)   …             …             …
Selected individual resume controls
  Bachelor’s degree                                 .009 (.009)   .009 (.009)                 .019 (.010)   .019 (.010)
  Experience × 10^-1                                .080 (.029)   .076 (.028)                 .080 (.034)   .076 (.033)
  Experience^2 × 10^-2                              .022 (.011)   .021 (.010)                 .019 (.013)   .018 (.012)
  Academic honors                                   .039 (.015)   .040 (.015)                 .026 (.017)   .028 (.017)
  Special skills                                    .056 (.009)   .055 (.009)                 .060 (.010)   .059 (.010)
Other controls:
  Individual resume characteristics                 X             X                           X             X
  Neighborhood characteristics                                    X                                         X
Mean callback rate                    .080          .080          .080          .082          .082          .082
N                                     4,784         4,784         4,784         3,670         3,670         3,670

Application/prior evidence (II)
Lower coefficients for blacks are consistent with a larger variance for blacks, i.e., $\sigma_{BR}^{II} > 1$
If BM study has chosen low levels of the control variables on which to standardize applicants (and BM explicitly state that they tried to avoid overqualification even of the higher-quality resumes, p. 995), then the HS analysis would imply that there is a bias towards finding discrimination in favor of blacks
  With low level of standardization, employers want higher probability of high value of unobservable
  In this case, evidence of discrimination would be stronger absent the bias from differences in the distribution of unobservables
We don’t really know whether level of controls is low or high relative to hiring standard

Fully interactive specifications

                                                          Males and females             Females
                                                          (1)           (2)           (3)           (4)
A. Estimates from basic probit (Table 1)
  Black                                                   −.030 (.006)  −.030 (.006)  −.030 (.007)  −.030 (.007)
B. Heteroskedastic probit model
  Black (unbiased estimates)                              −.024 (.007)  −.026 (.007)  −.026 (.008)  −.027 (.008)
  Effect of race through level                            −.086 (.038)  −.070 (.040)  −.072 (.040)  −.054 (.040)
  Effect of race through variance                         .062 (.042)   .044 (.043)   .046 (.045)   .028 (.044)
  Standard deviation of unobservables, black/white        1.37          1.26          1.26          1.15
  Wald test statistic, null hypothesis that ratio of
    standard deviations = 1 (p-value)                     .22           .37           .37           .56
  Wald test statistic, null hypothesis that ratios of
    coefficients for whites relative to blacks are
    constant, fully interactive probit model (p-value)    .62           .42           .17           .35
  Test overidentifying restrictions: include in
    heteroskedastic probit model interactions for
    variables with white coefficient < black coefficient,
    Wald test for joint significance of interactions
    (p-value)                                             .83           .33           .34           .56
  Number of overidentifying restrictions                  3             6             2             6
Other controls:
  Individual resume characteristics                       X             X             X             X
  Neighborhood characteristics                                          X                           X