The Redesign of the Matching Market for American Physicians:
Some Engineering Aspects of Economic Design
By ALVIN E. ROTH AND ELLIOTT PERANSON*

We report on the design of the new clearinghouse adopted by the National Resident
Matching Program, which annually fills approximately 20,000 jobs for new physicians. Because the market has complementarities between applicants and between
positions, the theory of simple matching markets does not apply directly. However,
computational experiments show the theory provides good approximations. Furthermore, the set of stable matchings, and the opportunities for strategic manipulation, are surprisingly small. A new kind of “core convergence” result explains
this; that each applicant interviews only a small fraction of available positions is
important. We also describe engineering aspects of the design process. (JEL C78,
B41, J44)

The entry-level labor market for new physicians in the United States is organized via a
centralized clearinghouse called the National
Resident Matching Program (NRMP). Each
year, approximately 20,000 jobs are ﬁlled in a
process in which graduating physicians and
other applicants interview at residency programs throughout the country and then compose
and submit Rank Order Lists (ROLs) to the
NRMP, each indicating an applicant’s preference ordering among the positions for which
she has interviewed. Similarly, the residency
programs submit ROLs of the applicants they
have interviewed, along with the number of
positions they wish to ﬁll. The NRMP processes
these ROLs and capacities to produce a matching of applicants to residency programs.
The clearinghouse used in this market dates
from the early 1950’s. It replaced a decentralized process that suffered a market failure when
residency programs and applicants started to
seek each other out individually through informal channels, earlier and earlier in advance of employment, rather than waiting to participate
in the larger market. (By the 1940’s, contracts
were typically being signed two years in advance of employment.) Although the matching
algorithm has been adapted over time to meet
changes in the structure of medical employment, roughly the same form of clearinghouse
market mechanism has been used since 1951
(see Roth, 1984). The kind of market failure that
gave rise to this clearinghouse has since been
seen in many markets (Roth and Xiaolin Xing,
1994), a number of which have also organized
clearinghouses in response.
In the mid 1990’s, in an environment of rapidly changing healthcare ﬁnancing with many
implications for the medical labor market, the
market began to suffer a crisis of conﬁdence
concerning whether the matching algorithm was
unreasonably favorable to employers at the expense of applicants, and whether applicants
could “game the system” by strategically manipulating the ROLs they submitted. The controversy was most clearly expressed in an
exchange in Academic Medicine (Peranson and
Richard R. Randlett, 1995a, b; Kevin J.
Williams, 1995a, b). In reaction to this exchange, groups such as the American Medical
Student Association together with Ralph Nader’s Public Citizen Health Research Group
(1995), and the Medical Student Section of the
American Medical Association (AMA-MSS,
1995) advocated that the matching algorithm be changed and/or that the description of the match
be changed to give applicants more accurate
advice about how to participate.1

* Roth: Department of Economics, and Graduate School
of Business Administration, Harvard University, Cambridge, MA 02138 (email: al_roth@harvard.edu); Peranson: National Matching Services, Inc., 595 Bay Street, Suite
301, Box 29, Toronto, ON M5G 2C2, Canada. We thank
Aljosa Feldin for able assistance with the theoretical computations reported in Section VI. Parts of this work were
sponsored by the National Resident Matching Program, and
parts by the National Science Foundation.

VOL. 89 NO. 4 ROTH AND PERANSON: MATCHING MARKET FOR PHYSICIANS

Medical-school personnel responsible for advising students about the job market began to
report that many students believed the NRMP did
not function in the best interest of students, and
that students were discussing the possibility of
different kinds of strategic behavior. Given the
prior history of market failure due to lack of confidence in the market in this and other entry-level
professional labor markets, these reports deserved
and received the most serious attention.
In this atmosphere, in the fall of 1995 the Board
of Directors of the NRMP commissioned the design of a new algorithm for conducting the annual
match, and a study comparing it to the existing
NRMP algorithm. The present paper reports how
the new algorithm was designed, how the two
algorithms were compared, and what was learned
about the market in the process. (In May 1997, the
NRMP Board of Directors decided to switch to
the new algorithm, and the ﬁrst match using the
new algorithm was successfully completed in
March 1998.)
In the course of designing, testing, and evaluating the new clearinghouse algorithm, some
surprising properties of large labor markets
emerged. The high transaction costs involved in
interviewing place a practical limit on how
many interviews are conducted, and one consequence of this is that the set of stable outcomes
is very small, and there are very few opportunities for participants to engage in strategic manipulation of their stated preferences when it
comes to making and accepting offers. (Neither
of these would be the case in the absence of
transaction costs.)

1 At around the same time, the Antitrust Division of the
Department of Justice initiated a wide-ranging discovery
process concerning these markets. This ultimately gave rise
to a fairly narrowly focused consent decree involving the
practices of the Association of Family Practice Residency
Directors (U.S. District Court for the Western District of
Missouri, 1996).

Aside from describing these new facts, and
presenting some theoretical computation to explain them, we also describe in this paper the
process by which the new clearinghouse algorithm was designed, evaluated, and compared
with the existing algorithm. At each stage, this process involved computational experiments.
This process resembles engineering practice
rather than theorem-proving or hypothesis-testing. But despite the fact that economists are
increasingly called upon to design markets,
there is little or no economic literature devoted
to the engineering aspects of economic design
and the practical problems of moving from theory about simple markets to workable institutions for complex markets. Yet if we fail to
develop such an “engineering” literature, we
will fail to proﬁt from design experience in a
cumulative way. The present paper then, in addition to presenting some new results, is intended to take a step in the direction of an
engineering literature as well, by describing
how those facts were learned, and how they
impacted design decisions.2
A rough analogy may be helpful for thinking
about how the different parts of this paper hang
together. Consider the design of suspension
bridges. The Newtonian physics they embody is
beautiful both in mathematics and in steel, and
college students can be taught to derive the
curves that describe the shape of the supporting
cables. But no bridge could be built based only
on this elegant theoretical treatment, in which
the only force is gravity, and all beams are
perfectly rigid. Real bridges are built of steel
and rest on rock and soil and water, and so
bridge design also concerns metal fatigue, soil
mechanics, and the forces of waves and wind.
Many design questions concerning these real-world complications cannot be answered analytically but, instead, must be explored using
physical or computational models. Often these
involve estimating magnitudes of phenomena
missing from the simple Newtonian model,
some of which are small enough to be of little
consequence, while others will cause the bridge
to fall down if not adequately addressed. Just as
no suspension bridges could be built without an understanding of the underlying physics, neither
could any be built without understanding many
additional features, also physical in nature, but
more varied and complex than addressed by the
simple model. These additional features, and
how they are related to and interact with that
part of the physics captured by the simple
model, are the concern of the scientific literature
of engineering. Some of this is less elegant than
the Newtonian model, but it is what makes
bridges stand. Just as important, it allows
bridges designed on the same basic Newtonian
model to be built longer, stronger, and lighter
over time, as the complexities and how to deal
with them become better understood.

2 Some beginnings of such a literature can also be found
in connection with the design of electricity markets (Robert
B. Wilson, 1993) and the auction of the radio spectrum (see
e.g., John McMillan, 1994, 1995; R. Preston McAfee and
McMillan, 1996; Lawrence M. Ausubel et al., 1997; Peter
Cramton, 1997; John O. Ledyard et al., 1997; Paul Milgrom,
1997; Charles R. Plott, 1997; David J. Salant, 1997). There
is, of course, already something of an engineering-oriented
literature in finance; for an innovative example, see Robert
J. Shiller (1993).

THE AMERICAN ECONOMIC REVIEW

For the design of the medical labor-market
clearinghouse, the underlying theory is the theory
of two-sided matching. Simple models of two-sided matching markets have proved to be elegant
and tractable, and very useful in understanding the
organization and evolution of many markets. But
the theory concentrates on simple models in which
no worker needs more than one job, and there are
no married couples or other connections between
workers or between positions. There is a large
body of theory relevant to design problems (see
e.g., Roth and Marilda Sotomayor, 1990), but
none of the theorems applies directly to the medical market, although many of the counterexamples do. That is, many of the existing theorems
rest on assumptions not met in the complex medical market, and many of the medical market’s
complexities are known to open the door to the
possibility of serious design problems. But the
counterexamples do not give any guidance to the
magnitude of these problems, and for this we will
have to rely on computational exploration, both of
the data from the medical market itself, and of
simpler models which will help explain what is
going on in the complex market. In both cases, the
computational explorations will be guided by the
theory, which will make possible computational
experiments that would be impossible to conduct
by brute force on such large markets. It seems
likely that, as game theory moves from simple
conceptual problems to complex design problems,
we will need to make more general use of this
interaction among theory, computational investigation of market data, and theoretical computation, and that this in turn will produce new
problems and directions for traditional theory.
SEPTEMBER 1999

This paper is organized as follows. Section I gives an overview of the medical market and the
design problem, and it presents some necessary
background by discussing stable matchings and
why they are important, how complex markets
differ from simple markets with respect to stable matchings, and how the algorithm used by
the NRMP prior to this study is structured.
Section II presents statistics describing the market and previous match results. These demonstrate that three of the four match variations that
make the NRMP a complex market are present
in substantial numbers. Section III describes
how the new algorithm was designed, including
the role of computational experiments. Section
IV compares the performance of the two algorithms on the data from recent matches, and
Section V looks at the possibilities for strategic
behavior when each of the two algorithms is
employed. In studying the possibilities for strategic behavior, we will ﬁrst treat the ROL data
as if they were the true preferences of the
agents, and then (in Section VI) show why this
is justiﬁed; we will also explain why the set of
stable matchings turns out to be so small. Section VII presents some thoughts on the interplay
among theory, computational experiments, and
theoretical computation in the design of market
mechanisms. The theory of simple markets
framed the questions that needed to be answered
in the course of this design and suggested how
to construct and evaluate computational experiments on the complex system to answer these
questions. The magnitudes determined by the
computational experiments were then explained
with theoretical computations on simple markets, providing results which, with the aid of
theory, could be unambiguously interpreted.
This interplay was what gave the present design
effort its “engineering” ﬂavor, and we suspect
that this will generalize to other design efforts.
Section VIII contains concluding remarks.
I. Background to the Present Study

The considerable body of theory that has
been developed for two-sided matching markets, together with multiple opportunities to observe empirically both the successful and
unsuccessful clearinghouse organization of
other entry-level labor markets, provided a general road map for both the design and evaluation
of a new clearinghouse algorithm. Specifically, there was strong empirical evidence that successful clearinghouses are generally those that
produce matchings that are stable in the sense
that they do not create “blocking pairs” of
agents, not matched to one another, who would
mutually prefer to be matched to one another
than to accept the matching produced by the
clearinghouse. The theory clearly shows that, in
sufﬁciently simple markets (simple in a way
that will shortly be made precise), systematic
welfare comparisons can be made between different stable matchings, with some being relatively favorable to ﬁrms and unfavorable to
workers, and some the reverse. In addition, for
sufﬁciently simple markets the theory allows
strong conclusions to be drawn about the opportunity and scope for strategic behavior. (For
an overview of the theory, relevant parts of
which will be reviewed below, see Roth and
Sotomayor [1990].)
The goal of the design was to construct an
algorithm that would produce stable matchings as
favorable as possible to applicants, while meeting
the speciﬁc constraints of the medical market. The
comparisons between the new and existing algorithms were to focus both on how many applicants
and residency programs could be expected to receive more-preferred or less-preferred matches
under the two algorithms and on how the different
algorithms might inﬂuence the opportunity or
need for strategic behavior by applicants and programs. Closely related issues were what advice
could be given to participants in the match when it
is conducted with one or the other of the algorithms, and what kinds of changes in the behavior
of match participants might be anticipated if the
matching algorithm were changed.
These questions were at the heart of the controversy that spilled into the medical journals in
1995. Much of that discussion referred to results
in the theoretical literature concerning simple two-sided matching markets. But, although the NRMP
originated as a simple market, it has become more
complex particularly since the early 1980’s, as it
has developed complementarities and linkages between positions and between applicants. These
arise through four kinds of “match variations,”
introduced to accommodate the changing structure of the medical labor market, namely:
(i) couples in the applicant pool who seek
two positions close to one another;
(ii) applicants who seek second-year positions
in the match and, if they are successful,
have supplemental Rank Order Lists
which must be consulted to match them to
prerequisite first-year positions;
(iii) residency programs with positions that revert to other programs if they remain unfilled;3 and
(iv) programs that wish to fill an even number
of positions if they cannot fill all their
positions.
These linkages can be shown to allow situations
in which many of the conclusions reached about
simpler markets no longer apply.
It was therefore necessary, both in designing
the new algorithm and in making comparisons to
the existing algorithm, ﬁrst to conduct computational experiments to determine the extent to
which the predictions of the theory of simple
matching markets applied to the NRMP. These
computational experiments, as well as those employed to compare the two algorithms, were conducted on the ROLs submitted by all applicants
and residency programs in the four most recent
matches (1993, 1994, 1995, and 1996) and in the
1987 match. The recent matches were selected to
have contemporary patterns of preferences among
applicants and residency programs, and 1987 was
selected for a comparison over a longer period,
and speciﬁcally because it had the lowest rate of
unmatched U.S. seniors in the available data set
(6.0 percent, as opposed to the historically high
rate of 7.5 percent for 1996).
A number of specialty matches are also run
under the auspices of the NRMP, and these are
largely free of the match variations which add
complexity to the general resident match. The
existing theory of simple matching markets
therefore provides accurate predictions about
the nature and direction of changes to be anticipated in these matches if the existing NRMP
algorithm were replaced by the new algorithm.
However, the theory offers little guidance as to
the magnitude of the changes to be expected,
and for this purpose, computational experiments
on the data of past matches were also needed.
These were conducted for the Thoracic Surgery
match, for the five years 1991–1994 and 1996.

3 Typically these reversions arise when, for example, the
director of a second-year postgraduate residency program
arranges with the director of a prerequisite first-year program that his residents will spend their first year in that
prerequisite program. However, if the second-year program
then fails to match with as many residents as were anticipated, this leaves vacancies in the first-year program that
can be filled by other applicants.

The design of the new algorithm and the
comparisons of the two algorithms will be discussed in detail in the body of the paper. The
general conclusions can be summarized by noting that, both for the NRMP and the specialty
matches, the effects of changing from the existing algorithm to the newly designed algorithm
are in the directions predicted by the theory for
simple markets, but the size of these changes is
small, and the opportunities for proﬁtable strategic behavior are comparably small for both
applicants and programs under either algorithm.
In the course of explaining why the differences
are so small, we will present a new kind of “core
convergence” result, which shows that the size of
the set of stable matchings becomes small as the
size of the market increases, even when preferences are uncorrelated, provided that the number
of positions for which an applicant can interview
remains small (and not otherwise).
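
The flavor of this claim can be explored with a small computational experiment. The following Python sketch is not the computation reported in this paper. It assumes a one-to-one market with n workers and n firms, uniformly random (uncorrelated) preferences, and each worker listing only k firms, and it uses the deferred-acceptance procedure described in Section I.A. The names propose, random_market, stable_extremes, and count_differences are hypothetical, chosen only for illustration.

```python
import random

def propose(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance; returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    held = {}                          # receiver -> proposer currently held
    nxt = {p: 0 for p in proposer_prefs}
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        while nxt[p] < len(proposer_prefs[p]):
            r = proposer_prefs[p][nxt[p]]
            nxt[p] += 1
            if p not in rank[r]:
                continue               # r finds p unacceptable
            current = held.get(r)
            if current is None or rank[r][p] < rank[r][current]:
                held[r] = p            # r holds p and releases its old offer
                if current is not None:
                    free.append(current)
                break
    return {p: r for r, p in held.items()}

def random_market(n, k, seed):
    """Assumption: each worker lists k random firms; firms rank their applicants randomly."""
    rng = random.Random(seed)
    workers = [f"w{i}" for i in range(n)]
    firms = [f"f{j}" for j in range(n)]
    worker_prefs = {w: rng.sample(firms, k) for w in workers}
    firm_prefs = {f: [w for w in workers if f in worker_prefs[w]] for f in firms}
    for f in firms:
        rng.shuffle(firm_prefs[f])
    return worker_prefs, firm_prefs

def stable_extremes(n, k, seed=0):
    """Worker-optimal and firm-optimal stable matchings, both as {worker: firm}."""
    worker_prefs, firm_prefs = random_market(n, k, seed)
    worker_opt = propose(worker_prefs, firm_prefs)
    firm_opt = {w: f for f, w in propose(firm_prefs, worker_prefs).items()}
    return worker_opt, firm_opt

def count_differences(n, k, seed=0):
    """Number of workers matched differently at the two extreme stable matchings."""
    worker_opt, firm_opt = stable_extremes(n, k, seed)
    return sum(worker_opt[w] != firm_opt.get(w) for w in worker_opt)

for k in (3, 10, 50):
    print(k, count_differences(200, k))
```

A worker whose match is the same under both extreme stable matchings has the same match at every stable matching, so the count printed for each k is a rough measure of the size of the stable set in this toy market.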
A. Stable Matchings in Simple and Complex
Matching Markets
Centralized matching mechanisms often arise
to solve market failures due to unraveling of appointment dates. Perhaps the most important and
least controversial empirical ﬁnding about centralized matching algorithms is that they are most
often successful if the matchings they produce are
stable (Roth, 1984, 1990, 1991; Roth and Xing,
1994; John Kagel and Roth, 2000). In a simple
matching market, a matching between applicants
and residency programs is stable if there is no
applicant or program matched to an unacceptable
(unlisted) partner, and if there are no applicant–
program pairs such that the applicant prefers the
program to his/her current match, and the program
also prefers the applicant to one of its current
matches (or vacant position).4
Therefore, this study, and the controversy
which preceded it, focused on choices among
algorithms that produce stable matchings. The
reason for the controversy is that there can be
systematic differences among stable matchings.
Appendix C gives formal definitions of stability
in simple and complex matching markets, but
the basic ideas can be conveyed by considering
the “deferred acceptance algorithm” first formally studied by David Gale and Lloyd Shapley
(1962).5 There are two basic versions of this
algorithm, in each of which one side of the
market (firms or workers) makes offers, which
the other side can reject or hold to see whether
any better offers are forthcoming.

4 Among the programs and applicants who have interviewed one another, programs do not list applicants with
whom they are unwilling to match, and applicants do not list
programs with whom they are unwilling to match. (Unmatched programs and applicants can be matched in the
post-match secondary market called the “scramble,” which
takes place primarily in the 24-hour period before the official public announcements of the match results.) Also, programs and applicants generally do not list applicants or
programs with whom they have not had interviews [and this
is of course an equilibrium, since the clearinghouse produces a stable matching, at which an applicant (program)
cannot be matched to a program (applicant) without being
listed, so there is no incentive to list programs or applicants
with whom one has not had an interview]. There is also a
charge to applicants who list more than 15 residency programs, which may dissuade some applicants from listing
some programs.

5 Although Gale and Shapley (1962) discussed the algorithm in an abstract setting, it appears that, in various forms,
equivalent algorithms have been developed in applied contexts both before and since, with the initial NRMP algorithm, dating from 1951, being the first we know of (Roth,
1984).

In the worker-proposing version of the algorithm, each worker begins by applying for the
position at the top of her preference list. Each
firm rejects any unacceptable candidates, and if
it has q positions it temporarily holds the (up to)
q most-preferred applications it has so far received and rejects the rest. A candidate who is
rejected at any step of the algorithm next applies
to her next-highest-ranked position (if any remain) among those not yet applied to. The algorithm stops at any step in which no new
applications are made, at which point each
worker is matched to the firm (if any) holding
her application.
In a simple market the resulting matching
must be stable (i.e., there are no firm–worker
blocking pairs) since, if a worker w prefers firm
f to her final match, she must have applied to
firm f and been rejected, and hence firm f does
not prefer her to any of the workers whose
applications it held when the algorithm stopped.
Furthermore, Gale and Shapley showed that,
when preferences are strict, the particular stable
matching produced by the worker-proposing
version of the algorithm gives each worker her
most-preferred position among those she can
get at any stable matching. Even more striking,
the firm-proposing version of the algorithm
gives every firm that fills q positions its q most-preferred workers among those it can be
matched to at any stable matching (Roth, 1985;
Roth and Sotomayor, 1989). Much of the controversy about the organization of the NRMP
focused on this difference between these two
versions of the deferred-acceptance algorithm.
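
The worker-proposing procedure and the stability condition described above can be sketched in a few lines of Python. This is a minimal illustration for a simple market only (no couples or other match variations), not the NRMP's code. The function names deferred_acceptance and blocking_pairs, and the three-worker example, are hypothetical. The sketch assumes strict preferences and that every firm a worker lists has an entry in firm_prefs.

```python
def deferred_acceptance(worker_prefs, firm_prefs, capacity):
    """Worker-proposing deferred acceptance; returns {firm: [workers held]}."""
    rank = {f: {w: i for i, w in enumerate(prefs)}
            for f, prefs in firm_prefs.items()}
    held = {f: [] for f in firm_prefs}
    next_choice = {w: 0 for w in worker_prefs}
    free = list(worker_prefs)
    while free:
        w = free.pop()
        if next_choice[w] >= len(worker_prefs[w]):
            continue                      # list exhausted: w stays unmatched
        f = worker_prefs[w][next_choice[w]]
        next_choice[w] += 1
        if w not in rank[f]:
            free.append(w)                # f finds w unacceptable
            continue
        held[f].append(w)
        held[f].sort(key=rank[f].get)     # most-preferred application first
        if len(held[f]) > capacity[f]:
            free.append(held[f].pop())    # reject the least-preferred application
    return held

def blocking_pairs(matching, worker_prefs, firm_prefs, capacity):
    """All (worker, firm) pairs that would prefer each other to their match."""
    rank = {f: {w: i for i, w in enumerate(prefs)}
            for f, prefs in firm_prefs.items()}
    match_of = {w: f for f, ws in matching.items() for w in ws}
    pairs = []
    for w, prefs in worker_prefs.items():
        for f in prefs:
            if f == match_of.get(w):
                break                     # remaining firms are less preferred
            if w not in rank[f]:
                continue
            ws = matching[f]
            if len(ws) < capacity[f] or any(rank[f][w] < rank[f][v] for v in ws):
                pairs.append((w, f))
    return pairs

# Hypothetical example: three workers, two firms with one position each.
worker_prefs = {"a": ["X", "Y"], "b": ["X", "Y"], "c": ["Y", "X"]}
firm_prefs = {"X": ["b", "a", "c"], "Y": ["a", "c", "b"]}
capacity = {"X": 1, "Y": 1}
matching = deferred_acceptance(worker_prefs, firm_prefs, capacity)
print(matching, blocking_pairs(matching, worker_prefs, firm_prefs, capacity))
```

In this example worker c ends up unmatched, yet no blocking pair exists, because each firm holds an applicant it prefers to c.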
But a deferred-acceptance algorithm may fail to
produce a stable matching in a market with
some of the complexities of the NRMP, such as
the presence of couples who submit Rank Order
Lists of pairs of positions. The key to the stability
of the outcome in simple markets is that (in the
worker-proposing version of the algorithm) no
firm ever regrets having rejected a worker’s
application, since it only does so when it has an
application it prefers, and it will be matched to
this preferred applicant unless it receives applications it prefers even more. However, in a
market containing couples, suppose that a firm
f1 receives an application from a worker w1,
and rejects an application from a less-preferred
worker w in order to hold w1’s application.
Suppose further that w1 is married to w2, whose
application is being held by firm f2, because the
pair (f1, f2) is high on the preference list submitted by the couple c = (w1, w2). Finally,
suppose that firm f2 now receives an application
it prefers and rejects the application of w2. In
order for the couple c now to apply to its next-choice pair of firms, (f3, f4), w1 must be withdrawn from firm f1. Thus, firm f1 now regrets
having rejected worker w, and there may be a
potential instability involving f1 and w (and, if
w is part of a couple, this instability may
involve another firm as well; see Appendix C).
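
The scenario just described can be replayed with a deliberately naive extension of deferred acceptance to couples. The Python sketch below is a hypothetical illustration, not the NRMP algorithm or the algorithm designed in this paper. It assumes each firm has a single position, treats a couple as one unit that applies to a pair of firms and is withdrawn from both if either member is rejected, and adds a fifth agent x (an assumption) to play the applicant that f2 prefers to w2.

```python
from collections import deque

def naive_couples_da(units, firm_prefs):
    """units: name -> (members, choices); each choice names one firm per member.
    Returns {firm: worker held or None}. Firms have one position each."""
    rank = {f: {w: i for i, w in enumerate(p)} for f, p in firm_prefs.items()}
    holder = {f: None for f in firm_prefs}
    unit_of = {m: u for u, (members, _) in units.items() for m in members}
    next_choice = {u: 0 for u in units}
    queue = deque(units)

    def accepts(f, w):
        current = holder[f]
        return w in rank[f] and (current is None or rank[f][w] < rank[f][current])

    while queue:
        u = queue.popleft()
        members, choices = units[u]
        for f in holder:                  # withdraw applications still held for u
            if holder[f] in members:
                holder[f] = None
        if next_choice[u] >= len(choices):
            continue                      # list exhausted: u stays unmatched
        firms = choices[next_choice[u]]
        next_choice[u] += 1
        if all(accepts(f, w) for f, w in zip(firms, members)):
            for f, w in zip(firms, members):
                displaced, holder[f] = holder[f], w
                if displaced is not None and unit_of[displaced] not in queue:
                    queue.append(unit_of[displaced])
        else:
            queue.append(u)               # rejected: try the next choice later
    return holder

units = {
    "w": (("w",), [("f1",)]),                            # single worker
    "c": (("w1", "w2"), [("f1", "f2"), ("f3", "f4")]),   # the couple
    "x": (("x",), [("f2",)]),                            # applicant f2 prefers to w2
}
firm_prefs = {"f1": ["w1", "w"], "f2": ["x", "w2"], "f3": ["w1"], "f4": ["w2"]}
result = naive_couples_da(units, firm_prefs)
print(result)
```

In the resulting matching f1 is left with a vacancy while w, whom f1 finds acceptable and who lists f1, is unmatched, so f1 and w form a blocking pair.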
The differences between simple and complex
markets involve more than the failure of the
deferred-acceptance algorithm to produce stable
matchings: they extend to the nonemptiness and
structure of the set of stable matchings itself.
Some of the important differences are summarized below, by noting theorems about simple matching markets that do not hold when the
market contains couples or other linkages that
create complementarities between positions or
applicants (see Roth and Sotomayor [1990] for
a comprehensive treatment and more detailed
references to the literature):
(i) In simple matching markets, firm- and
worker-optimal stable matchings exist
for all possible ROLs and are produced
by the firm- and worker-proposing variants of the deferred-acceptance algorithm (Gale and Shapley, 1962; Roth,
1985).
(i′) In markets with complementarities, no
stable matching may exist, and even
when stable matchings exist there may
be no optimal stable matchings for either
side of the market (Roth, 1984; Brian
Aldershof and Olivia Carducci, 1996).
(ii) In simple markets, the same applicants
are matched at every stable matching,
and the same positions are filled. (That
is, any applicant who is unmatched at
one stable matching is unmatched at every stable matching, and the positions
that are unfilled are the same at every
stable matching.) Furthermore, a firm
that fills only some of its positions at a
stable matching fills them with the same
workers at every stable matching (Roth,
1986).
(ii′) In markets with complementarities, different stable matchings may have different applicants matched and different
positions filled (Aldershof and Carducci,
1996).
(iii) In simple markets, when the applicant-proposing algorithm is used (but not
when the program-proposing algorithm
is used), it is a dominant strategy for
applicants to submit ROLs corresponding to their true preferences. No parallel
assertion can be made about residency
programs that have more than one position (Roth, 1982, 1985).
(iii′) In markets with complementarities, no
algorithm exists that chooses a stable
matching whenever one exists and makes
it a dominant strategy for all agents to
state their true preferences (Roth, 1985;
Aljosa Feldin, 1999).

Therefore, a major focus of this study was to
assess the extent to which these theoretical possibilities play a role in the actual NRMP matches. In
the course of this report it will become clear that,
while it has always been possible to find stable
matchings in the previous years’ NRMP matches
(a stable matching has been found in every match
at least since the mid 1970’s), it appears that no
stable matching is precisely program-optimal or
applicant-optimal in any of the years we have
examined. However, we will show that applicant-proposing and program-proposing algorithms
continue to perform approximately as in the case
of simple markets.

B. The Preexisting NRMP Algorithm

The preexisting NRMP algorithm (the one
in use in 1995 when this study began, and
used through the 1997 matches) is the result
of incremental modifications over a period of
years. It is primarily, but not entirely, a
program-proposing algorithm and deals with
match variations through a three-phase process. The first phase produces an initial match
by ignoring most match variations, using the
program-proposing deferred-acceptance algorithm, modified to let couples hold on to many
offers until a late stage in the algorithm. The
match produced in this way will in general
not be stable (because of the way it handles
couples, and because the other match variations are ignored), so the second phase of the
program identifies potential instabilities. The
third phase of the program uses an algorithm
to fix these instabilities one by one and produces a stable match. The processing in this
third phase does not always have residency
programs proposing. Instead, couples propose
in part of the algorithm designed to fix instabilities due to couples, and applicants also
propose in part of the algorithm that fixes
instabilities related to supplemental (first-year) matches. Thus the 1995 NRMP algorithm is a hybrid: program-proposing in its
first phase (which performs the bulk of the
matching), and applicant-proposing in some
parts of its third phase.
The NRMP specialty matches like Thoracic
Surgery are run using an algorithm that is technically a little different from the original NRMP
algorithm (it does not handle some of the NRMP match variations such as the use of
supplemental lists to form multiyear matches,
and it is organized in a single phase). But when
no match variations are present, the specialty-match algorithm and the 1995 NRMP algorithm
are functionally equivalent to the program-proposing deferred-acceptance algorithm in that
they all produce the program-optimal stable
matching.

II. Descriptive Statistics and Original NRMP Match Results

A. The NRMP in the Years 1987 and 1993–1996

Table 1 gives the descriptive statistics of the
NRMP match in the ﬁve years we consider.
Notice that, in each year, a substantial number
of the more than 20,000 applicants who participate do so in ways that utilize the match variations that the NRMP allows: about 4 percent
participate as couples, and 8–12 percent submit
supplemental Rank Order Lists. In addition, in
the 1990’s about 7 percent of the 3,000–4,000
programs that participate in each year have positions that could revert to other programs if
they remain unﬁlled (accounting for almost 6
percent of the total quota of positions). Thus,
the match variations are a substantial part of the
match. Before investigating how these match
variations change the properties of stable
matches and of strategic behavior, the ﬁrst task
is the design of an applicant-proposing algorithm to produce stable matches that meet the
match-variation requirements of thousands of
participants.
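
The percentages quoted above can be recomputed from the Table 1 figures. The short Python sketch below does this arithmetic. The figures are as read from Table 1, and the choice of denominators (primary ROLs for applicants, active programs for programs, and total quota before the match for positions) is an assumption, since the text does not state which totals were used. The names table1 and shares are hypothetical.

```python
# year: (primary ROLs, applicants with supplemental ROLs, coupled applicants,
#        active programs, programs specifying reversion,
#        positions to be reverted if unfilled, total quota before match)
table1 = {
    1987: (20071, 1572, 694, 3225, 69, 225, 19973),
    1993: (20916, 2515, 854, 3677, 247, 1329, 22737),
    1994: (22353, 2312, 892, 3715, 276, 1467, 22801),
    1995: (22937, 2098, 998, 3800, 285, 1291, 22806),
    1996: (24749, 2436, 1008, 3830, 282, 1272, 22578),
}

def shares(year):
    """Percentage shares for one year, under the assumed denominators."""
    rols, suppl, coupled, programs, rev_prog, rev_pos, quota = table1[year]
    return {
        "coupled": 100 * coupled / rols,
        "supplemental": 100 * suppl / rols,
        "reversion_programs": 100 * rev_prog / programs,
        "reversion_positions": 100 * rev_pos / quota,
    }

for year in table1:
    print(year, {name: round(value, 1) for name, value in shares(year).items()})
```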
B. Specialty Matches: Thoracic Surgery in
the Years 1991–1994 and 1996
In contrast, the Thoracic Surgery match is a
simple match, with no match variations. Its basic descriptive statistics and match results are
given in Table 1B.

TABLE 1—DESCRIPTIVE STATISTICS AND ORIGINAL MATCH RESULTS FOR (A) THE NRMP AND (B) THORACIC SURGERY

A. NRMP
Category                                      1987    1993    1994    1995    1996
Applicants (Active, ROL returned):
  Primary ROLs                              20,071  20,916  22,353  22,937  24,749
  Applicants with supplemental ROLs          1,572   2,515   2,312   2,098   2,436
  Results
    Primary matches                         16,117  17,209  17,725  18,170  18,316
    Supplemental matches                       577   1,294   1,152     990     725
  Couples
    Applicants who are coupled                 694     854     892     998   1,008
    Coupled applicants who matched             646     794     817     899     912
Programs:
  Active programs                            3,225   3,677   3,715   3,800   3,830
  Active programs with ROL returned          3,170   3,622   3,662   3,745   3,758
  Potential reversions of unfilled positions
    Programs specifying reversion               69     247     276     285     282
    Positions to be reverted if unfilled       225   1,329   1,467   1,291   1,272
  Programs requesting even/odd matching          4       2       6       7       8
Quotas:a
  Total quota before match                  19,973  22,737  22,801  22,806  22,578
  Changes during match processing
    Quota increases
      Programs                                  22     120     143     124     130
      Positions                                 45     357     357     327     336
    Quota decreases
      Programs                                  23     127     142     128     138
      Positions                                 46     338     338     303     326
  Total quota after match (final quota)     19,972  22,756  22,820  22,830  22,588
Results:
  Positions filled                          16,694  18,503  18,877  19,160  19,041
  Positions unfilled                         3,278   4,253   3,943   3,670   3,547
  Programs filled                            2,100   2,309   2,440   2,599   2,589

B. Thoracic Surgery
Category                                      1991    1992    1993    1994    1996
Applicant ROLs                                 127     183     200     197     176
Active programs                                 67      89      91      93      92
Program ROLs                                    62      86      90      93      92
Total quota                                     93     132     141     146     143
Positions filled                                79     123     136     140     132

a Quotas include positions in active programs with no ROL returned. Changes during the match are caused primarily by
reversions. In some cases, one position is reverted simultaneously to two programs, causing a net increase in the number of
positions offered. In addition, a few positions may be dropped from the match during processing to accommodate requests
for even/odd matching.
III. Design of the Applicant-Proposing Algorithm

The process by which the applicant-proposing algorithm was designed is roughly as follows.
First, a conceptual design was formulated and circulated for comment (Roth, 1996a). This was
based on an algorithm for simple markets, modified to deal with the complexities of the NRMP.
In order for the design to be coded into a working algorithm, a number of choices had to be
made concerning the sequence in which proposals would be made. The sequencing of proposals
can be shown to have no effect on the outcome of simple matches, but it could potentially
affect the outcome when the NRMP match variations are present. Thus, like the overall
architecture of the algorithm, the sequencing of proposals is a design question about which
the existing theory gives some general guidance that falls short of a complete engineering
specification. Consequently, we performed computational experiments before making sequencing
choices. In what follows, we first present the conceptual design (from Roth [1996a]) and then
discuss the sequencing experiments and implementation decisions.
A. The Conceptual Design
The algorithm described here is based on the instability-chaining algorithm in Roth and
John H. Vande Vate (1990) (which finds stable matchings by resolving applicant–program
instabilities one at a time) and on the general design of phase 3 of the preexisting NRMP
algorithm.
The object of the algorithm is to produce a
stable matching, by assembling a set A(k) of
residency programs and applicants and a tentative
matching M(k) with the property that there are no
instabilities within the set A(k), and no applicant
or program in A(k) is matched to anyone outside
of A(k). When the set A(k) has grown to include
all applicants and programs, the resulting match is
stable, and the algorithm stops.
In the applicant-proposing algorithm, the initial set, A(0), consists of all positions
offered in the match, and the initial tentative matching has all positions vacant. The
algorithm begins by selecting an applicant S(1) from the set of applicants in the match and
adding S(1) to A(0) to make the new set A(1).
At any step k of the algorithm, at which a new applicant S(k) has just been added to form the
set A(k), the new tentative matching M(k) is formed as follows. First, applicant S(k)
[= S(k, 1)] proposes down his Rank Order List [of programs that also rank S(k)], from the top,
until the first program is reached that either has a vacant position or prefers S(k) to its
least-preferred current tentative match. In the latter case, this least-preferred applicant,
S(k, 2), is now rejected by the program in question, and this applicant now proposes down her
ROL in a similar way, and so on. Each applicant S(k, n) displaced in this way similarly
proposes down his or her ROL.
At some point in this process, an applicant S(k, n) may be displaced who is a member of a
couple, or who is displaced from a primary (second-year) position for which she also holds a
supplemental (first-year) position. In either case, a second position now potentially becomes
vacant, as the spouse of S(k, n) is withdrawn from his tentative match, or as S(k, n) is
withdrawn from her supplemental match. In either case, the program whose position is left
vacant, P(k, n), is added to a “program stack” to be held for later processing.
If S(k, n) is a couple, then both couple members [S(k, n, a) and S(k, n, b)] now propose down
their joint ROL of pairs of programs, and they each may displace another applicant. Also, if
any S(k, n) has a supplemental ROL associated with her new tentative match, she proposes down
it as well, which may also result in the displacement of another applicant. Thus, both couples
and supplemental matches may simultaneously displace more than one applicant. One displaced
applicant is processed immediately, and any others are added to an “applicant stack” for later
processing.
Applicants propose down their ROLs in this
way until the applicant stack is empty. (Applicants
continue throughout to be able to propose to programs which may be on the program stack.) A
residency program is then selected from the program stack, and all of the applicants in A(k) with
whom it might form instabilities [i.e., all of the
applicants in A(k) who are preferred by the program to its leastpreferred current tentative match
and who prefer this program to their current
match] are added to the applicant stack, which is
processed as before, with applicants proposing
down their ROLs from the top.
When both the applicant and program stacks are empty, the tentative matching thus produced
is M(k): no instabilities for M(k) are contained in the set A(k), and no applicant or program
in A(k) is matched by M(k) to anyone outside of A(k). The algorithm is now ready to pick a new
applicant S(k + 1), and start the process again, for the set A(k + 1).
When all applicants have been included in the set A(k), even/odd requests and program
reversions are adjusted, which causes additions to the applicant and program stacks, which are
handled as above. When these stacks are empty, the algorithm stops, and the last tentative
match becomes final.
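
For a simple match (no couples, supplemental lists, reversions, or even/odd requests), the procedure above reduces to admitting applicants one at a time and letting each displaced applicant propose further down his or her list. The following Python sketch illustrates only that special case. The function name `applicant_proposing`, the dictionary layout, and the toy data are assumptions made for illustration; they are not taken from the NRMP implementation.

```python
def applicant_proposing(applicant_rols, program_rols, quotas):
    """Sequential applicant-proposing match for a simple market (illustrative).

    applicant_rols: {applicant: [programs, most preferred first]}
    program_rols:   {program: [applicants, most preferred first]}
    quotas:         {program: number of positions}
    Returns {applicant: program, or None if unmatched}.
    """
    # rank[p][a] = position of applicant a on program p's list (lower is better)
    rank = {p: {a: i for i, a in enumerate(rol)} for p, rol in program_rols.items()}
    held = {p: [] for p in program_rols}           # tentative matches, M(k)
    next_choice = {a: 0 for a in applicant_rols}   # next ROL entry to propose to
    match = {a: None for a in applicant_rols}

    for new_applicant in sorted(applicant_rols):   # S(k), ascending applicant code
        stack = [new_applicant]                    # applicants waiting to propose
        while stack:
            a = stack.pop()
            rol = applicant_rols[a]
            while next_choice[a] < len(rol):
                p = rol[next_choice[a]]
                next_choice[a] += 1
                if a not in rank.get(p, {}) or quotas.get(p, 0) == 0:
                    continue                       # p did not rank a, or has no positions
                if len(held[p]) < quotas[p]:
                    held[p].append(a)              # vacant position
                    match[a] = p
                    break
                worst = max(held[p], key=lambda x: rank[p][x])
                if rank[p][a] < rank[p][worst]:
                    held[p].remove(worst)          # least-preferred applicant is displaced
                    held[p].append(a)
                    match[a], match[worst] = p, None
                    stack.append(worst)            # displaced applicant proposes next
                    break
    return match


# Toy data (hypothetical): two programs with one position each.
example_applicants = {"a1": ["p1", "p2"], "a2": ["p1", "p2"], "a3": ["p1"]}
example_programs = {"p1": ["a3", "a1", "a2"], "p2": ["a1", "a2"]}
example_quotas = {"p1": 1, "p2": 1}
example_match = applicant_proposing(example_applicants, example_programs, example_quotas)
```

In the toy data, admitting `a3` displaces `a1` from `p1`, and `a1` in turn displaces `a2` from `p2`, which mirrors the chain S(k, 1), S(k, 2), ... described in the text. The program stack and the handling of couples are deliberately omitted.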
In a match with no match variations, the applicant and position stacks would always become
empty, and the final match would be the applicant-optimal stable matching. When the match
variations are present, there is a possibility that at some stages of the algorithm the
position stacks would never become empty (i.e., a cycle would occur, in which the same
positions reappeared on the stack). Therefore, “loop-detectors” need to be added to each
stage k. Every loop must involve a position becoming unmatched and made vacant either because
a couple or a supplemental assignment has been withdrawn from the position, or a position has
been withdrawn from an applicant (e.g., in satisfying an even/odd constraint). Thus, a
loop-detector can work by keeping a log of when positions become unmatched in these ways
[i.e., recording which applicant is unmatched from which position, during the processing of
some step A(k)]. If the same pairs appear multiple times, a loop is in progress. How to
proceed at this point may depend on the nature of the loop. (It is observed in Roth and
Vande Vate [1990] that certain kinds of inessential loops can be rendered harmless by
randomizing the order in which applicants and positions are processed from the stacks. Loops
due to the nonexistence of a stable matching would be more serious, but the prior experience
of the NRMP suggests that these may be rare.)
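
The logging idea behind a loop-detector can be sketched in a few lines of Python. The class name `LoopDetector` and the repeat threshold of two are assumptions made for illustration; the text says only that a loop is in progress when the same applicant–position pairs appear multiple times within a stage.

```python
from collections import Counter


class LoopDetector:
    """Log of (applicant, position) un-matchings within one stage k (illustrative)."""

    def __init__(self, threshold=2):
        self.threshold = threshold   # assumed number of repeats that signals a loop
        self.log = Counter()

    def record(self, applicant, position):
        """Record that `applicant` was unmatched from `position`.

        Returns True if this pair has now been seen `threshold` times or more.
        """
        self.log[(applicant, position)] += 1
        return self.log[(applicant, position)] >= self.threshold

    def reset(self):
        """Clear the log when a new stage begins."""
        self.log.clear()
```

What to do once `record` returns True is left open here, as in the text: the appropriate response may depend on the nature of the loop.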
Thus, the existing theory suggests the general architecture for an applicant-proposing
algorithm that can deal with instabilities one at a time as they are detected, and it provides
guidance on how the algorithm may possibly fail to find a stable matching. But to determine
how often it might fail to produce a stable matching we need some computational experiments.
The experiments reported next, which will help determine the details of the algorithm design,
will also show that failures are rare: we will not observe even a single failure when we
explore different versions of the algorithm on previous years’ ROL data.
B. Sequencing Questions and Implementation
Decisions
In a simple match, without the NRMP match variations, the applicant-proposing algorithm just
described always produces the applicant-optimal stable match, and the program-proposing
algorithm always produces the program-optimal stable match, regardless of the order in which
proposals are processed within the algorithm. One consequence of the fact that these optimal
stable matches do not exist in general when the match variations are present is that the order
in which applicants and programs are processed may have an effect on the match produced. Thus
the sequence in which applicants and programs are processed at various points in the algorithm
needs to be considered as part of the design of the applicant-proposing algorithm.
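
The order-independence property for simple matches can be checked directly on a small example. The sketch below is a hedged illustration, not the experiment reported in the paper: it runs an applicant-proposing procedure under every admission order of four hypothetical applicants and collects the distinct outcomes. The function name `deferred_acceptance`, the `order` parameter, and the toy data are all assumptions.

```python
from itertools import permutations


def deferred_acceptance(applicant_rols, program_rols, quotas, order):
    """Applicant-proposing match for a simple market.

    `order` is the sequence in which applicants are admitted for processing.
    Returns {applicant: program} for matched applicants only.
    """
    rank = {p: {a: i for i, a in enumerate(r)} for p, r in program_rols.items()}
    held = {p: [] for p in program_rols}
    pointer = {a: 0 for a in applicant_rols}
    for entrant in order:
        pending = [entrant]
        while pending:
            a = pending.pop()
            while pointer[a] < len(applicant_rols[a]):
                p = applicant_rols[a][pointer[a]]
                pointer[a] += 1
                if a not in rank.get(p, {}):
                    continue                  # program did not rank this applicant
                held[p].append(a)
                if len(held[p]) <= quotas[p]:
                    break                     # a fills a vacant position
                worst = max(held[p], key=rank[p].get)
                held[p].remove(worst)         # program rejects its least-preferred
                if worst != a:
                    pending.append(worst)     # a is held; the displaced applicant proposes
                    break
                # otherwise a itself was rejected and keeps proposing
    return {a: p for p, members in held.items() for a in members}


# Toy market (hypothetical): p1 has one position, p2 has two.
applicants = {"a1": ["p1", "p2"], "a2": ["p1", "p2"],
              "a3": ["p2", "p1"], "a4": ["p1", "p2"]}
programs = {"p1": ["a3", "a1", "a2", "a4"], "p2": ["a1", "a2", "a4", "a3"]}
quotas = {"p1": 1, "p2": 2}

# Collect the distinct matchings produced over all 24 admission orders.
outcomes = {
    frozenset(deferred_acceptance(applicants, programs, quotas, order).items())
    for order in permutations(applicants)
}
```

In this toy market `outcomes` contains a single matching, as the theory of simple markets predicts. With couples, supplemental lists, or reversions the same check could in principle return more than one matching, which is why the sequencing experiments were needed.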
Two issues were considered in conducting
and evaluating experiments related to the sequencing of operations in the algorithms.
(i) Do sequencing differences cause substantial or predictable changes in the match
result (e.g., do applicants or programs selected ﬁrst do better or worse than their
counterparts selected later)?6
(ii) Does the sequence of processing affect the likelihood that an algorithm will produce a
stable matching? (In connection with this latter point, recall that instability-chaining
algorithms can cycle—even when stable matchings exist, and certainly when they do not.
Therefore, one objective was to consider how sequencing decisions might influence the
frequency of “loops” occurring in the algorithm.)
Experiments to test the effect of sequencing
were conducted using data from three NRMP
matches: 1993, 1994, and 1995.
1. Sequencing Experiments on the Preexisting
NRMP Algorithm.—We investigated the effect of
different sequencing of operations in variants of
the preexisting NRMP algorithm, in part to establish a baseline against which to compare the
algorithm to be designed. In the preexisting algorithm, programs are processed in ascending
sequence by six-digit program code number. To test the sensitivity of the results to this
sequencing, computational experiments were run on the ROL data in which this sequencing was
reversed (i.e., programs were processed in descending order by program code number). As
expected, the results showed differences, but the differences were small: the largest
difference was in 1994 when only four out of 3,662 programs that submitted ROLs received a
different match under the alternative ordering, as did four out of 22,353 applicants. Not only
are these differences very small, they do not appear to be systematic.7 (A fuller account of
the results of these experiments appears in Appendix A.)

6 Even in a simple matching market, the order in which proposals are made can matter in
versions of the Roth and Vande Vate (1990) instability-chaining algorithm in which members of
both sides of the market may be chosen to make the next proposal (in contrast to versions in
which all proposals are made by one side of the market). Yosef Blum and Uriel G. Rothblum
(1999) show that, in such a version of the algorithm, late proposers have an advantage over
early proposers.

The preexisting NRMP algorithm was also
investigated for its sensitivity to the sequence in
which reversions are processed. Rather than
simply changing the order in which reversions
occurred, the experiments involved setting the
input program quotas to be the ﬁnal postmatch
quotas produced by the preexisting NRMP algorithm. All further reversion processing was
then eliminated. These experiments then provided an indication of the differences caused,
not only by changing the order of reversions,
but also by altering the time when reversions
enter into the match processing (i.e., all required
reversions were assumed to take place simultaneously, at the beginning of match processing).
No more than two programs or applicants were
observed to be affected by such changes in any
of the three years 1993–1995 (see Appendix A).
Finally, it should be noted that no loops were
detected in any of these experiments on the preexisting NRMP algorithm. Consequently, despite
the presence of match variations, sequencing does
not appear to play a signiﬁcant role in the operation of the preexisting NRMP algorithm.
7 We use the term “very small” informally, but not merely to express an opinion of changes
that affect on the order of 0.01 percent of applicants. These changes are also at least an
order of magnitude smaller than the main effects we will find due to changes between
program-proposing and applicant-proposing algorithms. Since the effects appear to be
unsystematic, they do not appear to have any welfare implications, on average.

2. Sequencing Experiments on the Applicant-Proposing Algorithm.—Computational experiments
were conducted to measure the impact of:
(i) the sequence in which applicants are admitted to the algorithm for processing;
(ii) the sequence in which couples are processed relative to other applicants; and
(iii) the sequence in which applicants ranked by a program are processed when attempting to
fill a program that has been selected from the program stack.
To understand the results of the computational
experiments (which are tabulated in detail in
Appendix A) it is useful to compare the outcomes from each experiment to those from
a fixed baseline. We chose as a baseline an
applicant-proposing algorithm in which applicants were processed in ascending order by their
applicant codes, regardless of whether they
were single or members of couples. (In all
cases, when a member of a couple was processed, so was the other member. When applicants were processed in ascending code order, a
couple was selected for processing based on the
code number of the spouse with the lower applicant code.) When a program was selected
from the program stack, applicants were processed in ascending sequence by program rank
number. All of these experiments were carried
out on the ROL data from the NRMP matches in
1993, 1994, and 1995.
The experiments were conducted in a partial
factorial design. The handling of couples had
three treatments (couples intermixed with singles, couples ﬁrst, and couples last); the order of
introducing applicants into the match had two
treatments (ascending order by applicant code
or descending order); and the order of processing applicants when a program is pulled from
the stack had two treatments (ascending order
by program rank or descending order). The results are that none of these sequencing decisions
had a large or a systematic effect on the matching produced. In two-thirds of the cases, the
match was the same as in the baseline case. (In the majority of the remaining cases only two
applicants received different matches, and the maximum number of applicants affected was 12
out of 22,937, which occurred when a couple received a worse match and initiated a chain of
displacements. This happened in two of the 18 cases and involved the same 12 applicants in
both cases.)
However, there was an effect of sequencing on the internal processing of the algorithm. The
number of loops encountered was fewest when couples were introduced to the match after single
applicants. This is not too surprising in view of the fact that no loops would occur in the
absence of match variations. The results indicate that loops are least likely to occur when
the couples are introduced into the larger market with some tentative matches already
assembled, as opposed to when couples enter first, so that the initial tentative matches
involve only couples. Introducing couples last reduces the numbers of loops (and hence the
potential that in some future match it would be difficult to find a stable matching) without
changing the prospects of couples or single applicants in the match.
Finally, experiments related to the sequence in which reversions are processed were performed
on an applicant-proposing algorithm. These experiments were similar to those performed on the
preexisting NRMP algorithm. Again, no substantial changes were induced by changing the order
in which reversions were handled; no changes at all resulted in the 1993 match, and only two
applicants and programs were affected in the 1994 and 1995 matches (see Appendix A). Thus, for
both the preexisting NRMP algorithm and the applicant-proposing algorithm, there is almost no
difference between the results obtained with reversion processing and the results obtained by
setting the quotas to the final quotas after reversions and eliminating further reversion
processing. (This point simplifies the design of some of the experiments to compare the two
algorithms, in connection with strategic behavior by residency programs, to be discussed later
in this article.)
Based on the sequencing experiments described above, it was decided to sequence all proposals
by couples after proposals by single applicants, since this was the order that produced the
fewest internal loops.8 Note that we did not at any point choose to randomize the processing
order (randomization was shown in Roth and Vande Vate [1990] to allow the algorithm to escape
from certain kinds of loops). The reason is that loops do not appear to be a problem with the
processing sequences selected, and it was felt that a desirable feature of the match is that it
should be precisely reproducible from the ROL data.

8 The full details of the sequencing decisions are as follows:
(i) All single applicants are admitted to the algorithm for processing before any couples are
admitted.
(ii) Single applicants are admitted for processing in ascending sequence by applicant code.
(iii) Couples are admitted for processing in ascending sequence by the lower of the two
applicant codes of the couple.
(iv) When a program is selected from the program stack for processing, the applicants ranked
by the program are processed in ascending order by program rank number.
(v) The processing of programs requesting even numbers of matches or reversions of unfilled
positions is deferred until all applicants have been admitted for processing.
(vi) Programs requesting even numbers of matches are processed in ascending sequence by
program code. An applicant deleted from a program in order to leave an even number of matches
in the program is placed on the applicant stack for processing.
(vii) Programs requesting reversions of unfilled positions are processed in ascending sequence
by the program code of the program “donating” the unfilled position(s). A program that
“receives” a reverted position is placed on the program stack for processing.
(viii) After all reversions have been processed, the requests for reversions are reprocessed,
in case any new reversions of unfilled positions are required as a result of changes made in
the processing of reversions that have been processed since the last time this reversion
request was considered.
(ix) When no further processing is required to satisfy all reversions, requests for even
numbers of matches are reprocessed as in point (vi) above, and if any changes are made,
requests for reversions are reprocessed as in points (vii) and (viii) above. This iterative
processing continues until no further changes are made by even processing or reversion
processing. (The possible need for a reverted position to be “unreverted” is checked as part of
the check for stability, by using original quotas for programs which have lost positions
through reversions.)

IV. Differences in the Matches Produced by the Two Algorithms

A. The NRMP

The preexisting NRMP algorithm and the newly designed applicant-proposing algorithm were
compared in terms of the matches that they produce for the ROLs submitted in 1987 and
1993–1996. Table 2 gives the results of these comparisons. The first half of the table
concentrates on the comparisons from the point of view of applicants; the second half is from
the point of view of programs.

TABLE 2—COMPARISON OF RESULTS BETWEEN ORIGINAL NRMP ALGORITHM AND APPLICANT-PROPOSING ALGORITHM

Result                                            1987    1993    1994    1995    1996
Applicants:
  Number of applicants affected                     20      16      20      14      21
    Applicant-proposing result preferred            12      16      11      14      12
    Current NRMP result preferred                    8       0       9       0       9
    U.S. applicants affected                        17       9      17      12      18
    Independent applicants affected                  3       7       3       2       3
  Difference in result by rank number
    1 rank                                          12      11      13       8       8
    2 ranks                                          3       1       4       2       6
    3 ranks                                          2       3       2       2       3
    More than 3 ranks                                2       1       1       2       3
      (max)                                          9       4       5       6       6
  New matched                                        0       0       0       0       1
  New unmatched                                      1       0       0       0       0
Programs:
  Number of programs affected                       20      15      23      15      19
    Applicant-proposing result preferred             8       0      12       1      10
    Current NRMP result preferred                   12      15      11      14       9
  Difference in result by rank number
    5 or fewer ranks                                 5       3       9       6       3
    6–10 ranks                                       5       3       3       5       3
    11–15 ranks                                      0       5       1       3       1
    More than 15 ranks                               9       4       6       0      11
      (max)                                        178      36      31             191
  Programs with new position(s) filled               0       0       2       1       1
  Programs with new unfilled position(s)             1       0       2       0       0

Only about 0.1 percent of applicants are affected by the change in algorithms, and of these,
most prefer the match they receive under the applicant-proposing algorithm. Note that in two
of the five years the number of applicants matched changed by one (one fewer in 1987, one more
in 1996). Recall that in a simple match a change from one stable matching to another would
never change the number of applicants matched; so here is another case in which the match
variations cause a difference, but a difference which turns out to be very small and
unsystematic.
Equally few programs are affected by the change of algorithms, and these constitute about
0.5 percent of all programs. Most, but not all, of the programs prefer the match they receive
under the preexisting NRMP algorithm, but in 1994 and 1996 slightly more programs would even
have preferred the applicant-proposing algorithm to the preexisting NRMP algorithm. Most
programs that receive a different match have only one applicant different between the matches
produced by the two algorithms. The majority of differences have to do with filling a position
with a different applicant; only a small number of positions move from being filled to
unfilled or vice versa. Again, this is a consequence of the match variations; as already noted
in the case of applicants, it turns out to be both very rare and unsystematic.
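
The orders of magnitude quoted above can be reproduced from the counts in Tables 1 and 2. The short Python calculation below is a sketch under one assumption that the text does not state: the denominators are taken to be applicants with primary ROLs and active programs with an ROL returned. The variable names are illustrative.

```python
# Counts copied from Tables 1A and 2, for 1987 and 1993-1996.
years = [1987, 1993, 1994, 1995, 1996]
primary_rols = [20071, 20916, 22353, 22937, 24749]      # Table 1A
applicants_affected = [20, 16, 20, 14, 21]              # Table 2
programs_with_rol = [3170, 3622, 3662, 3745, 3758]      # Table 1A
programs_affected = [20, 15, 23, 15, 19]                # Table 2


def percent(part, whole):
    """Share of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole


applicant_shares = [percent(a, n) for a, n in zip(applicants_affected, primary_rols)]
program_shares = [percent(p, m) for p, m in zip(programs_affected, programs_with_rol)]

for year, a_share, p_share in zip(years, applicant_shares, program_shares):
    print(f"{year}: applicants {a_share:.2f}%, programs {p_share:.2f}%")
```

Under this choice of denominators the applicant shares fall between roughly 0.06 and 0.10 percent and the program shares between roughly 0.4 and 0.6 percent, consistent with the "about 0.1 percent" and "about 0.5 percent" figures in the text.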
It may be helpful at this point to consider an example of precisely how the match variations
can cause a deviation from the predictions of the theory for simple markets, namely how it can
be that a few applicants do worse with the applicant-proposing algorithm than with the
program-proposing algorithm. If switching to the applicant-proposing algorithm causes
applicant A to improve his match from his second to his first choice, it may be that the first
choice now requires a supplemental match that was not required before. If this new
supplemental match displaces a previously matched but less-preferred applicant in a program,
that displaced applicant is forced to go further down his or her list (i.e., does worse).
Furthermore, matching that applicant may displace another applicant, who may displace another,
and so on, causing a chain of applicants who do worse (even though, as expected of the
applicant-proposing algorithm, this chain of events began with an applicant who did better than
he would have if the program-proposing algorithm had been used).
It is worth noting that when we refer to “only
0.1 percent” of applicants, we are talking about a
change whose small size we will explain in what
follows. But this does not necessarily imply that
the associated change in welfare is small. Indeed,
in the debate that led to this study, and after our
report was circulated to the interested parties, a
great deal of discussion stemmed from the view
that the difference in welfare was likely to be large
for the affected applicants, and likely to be small
for the affected programs. This contributed to the
decision to adopt the applicant-proposing algorithm, a decision strongly lobbied for by the
student organizations, and eventually unanimously adopted by the NRMP Board.9

9 The argument about the size, and relative size, of the welfare effects for applicants and
programs can be paraphrased in part roughly as follows. Both programs and applicants have some
uncertainty in their rankings. There may not be that much difference between a program’s 7th
and 17th ranked candidates. Similarly, applicants may not be able to judge clearly whether they
will get a better educational experience at their first- or second-choice programs. But
applicants can clearly judge other factors in their preferences, such as whether they would
prefer to live in Seattle or Miami, where these programs may be located. Therefore, a change of
algorithms may have a big effect on the affected applicants, and only a small one on the
affected programs.

B. Thoracic Surgery

Because there are no match variations in the Thoracic Surgery matches for the years we
consider, they are simple matches and are well described by the existing theory. Consequently
we know that the applicants will all do as well as possible at the stable match produced by
the applicant-proposing algorithm, and the programs will all do as poorly as possible at that
stable matching. What the theory does not tell us is how large this effect will be; for that
we need to look at the data (see Table 3). As discussed in the introduction, the effect turns
out to be minimal: in the five years we studied, only four applicants and four programs would
have been affected by a change in algorithms; in three of the five years, the
applicant-proposing algorithm would have produced the same match as the program-proposing
algorithm, indicating that this was the only stable matching in those years. (August
Colenbrander [1996] reports similarly small differences in the specialty matches he
maintains.)
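
The comparison behind this paragraph can be illustrated on a toy simple market: compute the applicant-optimal and program-optimal stable matchings and list the applicants whose match differs. An empty list indicates that the two coincide, so the stable matching is unique. The sketch below assumes one position per program to keep the code short; the function names and the data are hypothetical and are not the Thoracic Surgery data.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance; returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    holds = {}                               # receiver -> proposer tentatively held
    pointer = {p: 0 for p in proposer_prefs}
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        while pointer[p] < len(proposer_prefs[p]):
            r = proposer_prefs[p][pointer[p]]
            pointer[p] += 1
            if p not in rank.get(r, {}):
                continue                     # receiver did not rank this proposer
            current = holds.get(r)
            if current is None or rank[r][p] < rank[r][current]:
                holds[r] = p
                if current is not None:
                    free.append(current)     # displaced proposer proposes again
                break
    return {p: r for r, p in holds.items()}


def affected_applicants(applicant_prefs, program_prefs):
    """Applicants matched differently by the two one-sided algorithms."""
    applicant_optimal = deferred_acceptance(applicant_prefs, program_prefs)
    program_side = deferred_acceptance(program_prefs, applicant_prefs)
    program_optimal = {a: p for p, a in program_side.items()}
    return sorted(a for a in applicant_prefs
                  if applicant_optimal.get(a) != program_optimal.get(a))


# Market with two stable matchings (hypothetical preferences).
applicants = {"a1": ["p1", "p2"], "a2": ["p2", "p1"]}
programs_two = {"p1": ["a2", "a1"], "p2": ["a1", "a2"]}
# Market with a unique stable matching (hypothetical preferences).
programs_one = {"p1": ["a1", "a2"], "p2": ["a1", "a2"]}

differ = affected_applicants(applicants, programs_two)
same = affected_applicants(applicants, programs_one)
```

In the first market both applicants are matched differently by the two algorithms; in the second, `same` is empty, the situation reported for three of the five Thoracic Surgery years.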
V. Differences in Sensitivity to Participant Behavior

The comparisons of match outcomes discussed in the previous section are all based on Rank
Order Lists that were submitted for matches made by the preexisting NRMP and specialty match
algorithms. While the changes observed when the match was instead produced by the
applicant-proposing algorithm were small, a comparison of the algorithms also requires us to
consider whether participants might have reason to submit different kinds of ROLs if the new
algorithm were to be substituted for the preexisting one. For this purpose, we consider
whether participants could have favorably influenced the match, under either algorithm, by
submitting different ROLs. The idea is to assess both how many participants could do so and
how the number is different for the two algorithms. This will also allow us to determine what
kinds of advice can be given to participants about how to participate in the match, under
either algorithm.
Once again, this is a subject about which the
theory of simple matching markets tells us a
great deal for markets without the match variations found in the NRMP. To see how well the
theory for simple markets approximates the
NRMP matches, and also to assess the size of
the effects to expect, again required computational experiments on the data. A quick review
of the theory will help organize the discussion.

TABLE 3—DIFFERENCE IN RESULT WHEN ALGORITHM CHANGED FROM PREEXISTING SPECIALTY MATCH TO
APPLICANT-PROPOSING

Year    Difference
1991    none
1992    2 applicants improve, 2 programs do worse
1993    2 applicants improve, 2 programs do worse
1994    none
1996    none

A. Strategic Behavior in Simple and Complex Matching Markets

In a simple matching market, without match variations, it has been shown (Roth, 1982) that
there do not exist any stable matching algorithms that completely remove the possibility that
some applicant or program can get a better match by submitting an ROL that differs from the
applicant or program’s straightforward preferences. However, we have already noted the
following:

    In simple markets, when the applicant-proposing algorithm is used, but not when the
    program-proposing algorithm is used, no applicant can possibly improve his match by
    submitting an ROL that is different from his true preferences. (Recall also that no
    parallel assertion can be made about residency programs that have more than one position.)

Therefore, in simple markets, we would find strategic opportunities for applicants only when
the program-proposing algorithm is used, and the theory tells us what these might be.
Specifically, consider the ROL of some applicant, and define a truncation of that ROL to be a
shorter ROL that is the same as the original ROL for as many programs as it ranks. We can then
say the following:

    In simple markets when the program-proposing algorithm is used, every applicant who can do
    better than to submit his true preferences as his ROL can do so by submitting a truncation
    of his true preferences. That is, if (holding all other ROLs constant) an applicant would
    be matched to his kth choice if he submitted his true preferences, and his jth choice
    (with j < k) if he submitted some other ROL, then he can be matched to his jth choice by
    submitting a truncation of his true preferences at the jth choice. Furthermore, no part of
    his original ROL below the kth choice has any effect on the match (Roth and Vande Vate,
    1991).
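
A two-applicant toy market shows how a truncation can pay off under a program-proposing algorithm. The sketch below assumes one position per program and uses hypothetical names (`truncate`, `program_proposing`) and hypothetical preferences; it illustrates the definition above and is not drawn from NRMP data.

```python
def truncate(rol, j):
    """Truncation of `rol` that keeps its first j choices, in the same order."""
    return rol[:j]


def program_proposing(applicant_rols, program_rols):
    """Program-proposing deferred acceptance, one position per program (assumed).

    Returns {applicant: program} for matched applicants.
    """
    rank = {a: {p: i for i, p in enumerate(rol)} for a, rol in applicant_rols.items()}
    held = {}                                # applicant -> program tentatively accepted
    pointer = {p: 0 for p in program_rols}
    free = sorted(program_rols, reverse=True)
    while free:
        p = free.pop()
        while pointer[p] < len(program_rols[p]):
            a = program_rols[p][pointer[p]]
            pointer[p] += 1
            if p not in rank.get(a, {}):
                continue                     # applicant did not list this program
            current = held.get(a)
            if current is None or rank[a][p] < rank[a][current]:
                held[a] = p
                if current is not None:
                    free.append(current)     # rejected program proposes again
                break
    return held


true_rols = {"a1": ["p1", "p2"], "a2": ["p2", "p1"]}
program_rols = {"p1": ["a2", "a1"], "p2": ["a1", "a2"]}

# With true preferences, a1 receives his second choice (k = 2).
truthful = program_proposing(true_rols, program_rols)

# a1 truncates his list at his first choice (j = 1); a2's list is unchanged.
manipulated_rols = dict(true_rols, a1=truncate(true_rols["a1"], 1))
manipulated = program_proposing(manipulated_rols, program_rols)
```

With true preferences `a1` is matched to `p2`; after truncating at his first choice he is matched to `p1`, his match under the applicant-proposing algorithm. As the text notes later, an applicant would generally lack the information needed to choose such a truncation safely.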
It can also be shown that truncations are the kind of manipulation that can potentially be
identified with the least information about others’ preferences (Roth and Rothblum, 1999).
In simple markets, the reason that all successful manipulations can (also) be accomplished by
truncations is that, in a simple market, a deferred-acceptance algorithm never “backtracks”:
no information in an agent’s ROL is used beyond the point at which that agent is matched.
Although we cannot apply this result directly to the complex market, we can do computational
experiments to assess how good an approximation is provided by concentrating only on
truncations in the investigation of possible strategic manipulations in the NRMP.
Specifically, if we find that information about agents’ preferences among options below the
point at which they are matched has little effect on the match, then we can be confident that
investigating truncations will give us a comparably good approximation for the magnitude of
possible strategic manipulations in the complex
NRMP market.10

10 This would free us from the computationally impossible task of investigating all possible
manipulations by all participants.

For simple markets, the theory also tells us which applicants can potentially profit from
manipulation, and how much:

    In simple markets, when the program-proposing algorithm is used, the only applicants who
    can do better than to submit their true preferences are those who would have received a
    different match from the applicant-proposing algorithm. Furthermore, the best such
    applicants can do is to obtain the applicant-optimal match, and they can do this by
    submitting to the program-proposing algorithm the truncation of their true preferences
    that stops at the match they would have gotten from the applicant-proposing match (see
    Gabrielle Demange et al., 1987; Roth and Sotomayor, 1990).

It is important to note that, even in the case of
a simple match without match variations, an
applicant generally would not have the information needed to submit such a truncation (and if
he submitted a truncation that was one program
too short he would become unmatched). But
this result shows that, in a simple match, we can
identify an upper bound on the number of applicants who could possibly proﬁt from manipulating their Rank Order Lists, by seeing how
many applicants receive different matches at the
two algorithms.
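The two results quoted above for simple markets can be illustrated on a toy example. The following is a minimal sketch, assuming a one-to-one market with strict preferences; the function names and the two-applicant market are hypothetical and are not taken from the NRMP data or software. It counts the applicants who are matched differently by the two algorithms (the upper bound just described) and checks that one of them can obtain his applicant-optimal match from the program-proposing algorithm by truncating his list at that match.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance; returns {proposer: receiver}.

    Preference lists are ordered best-first and may be incomplete (truncated).
    """
    # rank[r][p]: position of proposer p on receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(lst)} for r, lst in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next entry p will try
    held = {}                                     # receiver -> proposer tentatively held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue                              # list exhausted: p stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if p not in rank.get(r, {}):
            free.append(p)                        # r did not list p
        elif r not in held:
            held[r] = p
        elif rank[r][p] < rank[r][held[r]]:
            free.append(held[r])                  # r trades up, releasing its old hold
            held[r] = p
        else:
            free.append(p)                        # r keeps its current hold
    return {p: r for r, p in held.items()}


def program_proposing(applicant_prefs, program_prefs):
    """Program-proposing match, reported from the applicants' side."""
    match = deferred_acceptance(program_prefs, applicant_prefs)
    return {a: p for p, a in match.items()}


# Hypothetical two-by-two market with two stable matchings.
applicants = {"a1": ["P1", "P2"], "a2": ["P2", "P1"]}
programs = {"P1": ["a2", "a1"], "P2": ["a1", "a2"]}

applicant_optimal = deferred_acceptance(applicants, programs)
program_optimal = program_proposing(applicants, programs)

# Upper bound from the theory: applicants matched differently by the two algorithms.
could_profit = sorted(a for a in applicants
                      if applicant_optimal.get(a) != program_optimal.get(a))

# a1 truncates his true list at his applicant-optimal match (P1).
cut = applicants["a1"].index(applicant_optimal["a1"]) + 1
truncated = dict(applicants, a1=applicants["a1"][:cut])
after_truncation = program_proposing(truncated, programs)
```

In this toy market both applicants appear in `could_profit`, and `after_truncation` gives a1 his applicant-optimal program. As the text notes, a real applicant would generally lack the information needed to choose such a truncation.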
We cannot directly apply this upper bound
to the NRMP, because it depends for its proof
on the existence of optimal stable matchings
for each side of the market, which we know
(from the sequencing experiments) do not
exist in the NRMP data. But the theory of
simple matches allows us to use the computational results reported in Table 2 as a numerical benchmark against which to compare
the computational estimates we will make of
the scope for possible manipulation. That is,
we can compare the estimates we get of how
many applicants can potentially proﬁt from
strategically stating their ROLs with the numbers of applicants who were observed to get
different matches from the two algorithms. If
these numbers are close for the program-proposing algorithm (and close to zero for the
applicant-proposing algorithm), then the theory of simple matches provides a comparably
close approximation for the situation in the
complex NRMP market.
The case of programs that have more than
one position is not so simple, even in the case
of simple matches. Programs may, at least in
theory, proﬁt both from truncating their
ROLs, and from reducing the number of positions they submit to the match (either by
making early arrangements with some applicants or by withholding positions to be ﬁlled
by unmatched applicants after the match).
The temptation for this latter kind of manipulation can be shown to be larger with the
program-proposing algorithm than with the
applicant-proposing algorithm (see Tayfun
Sönmez, 1997, 1999). Thus, in addition to
experiments with truncations of ROLs, we must also conduct computational experiments
involving reductions in stated capacities.
B. Experiments to Determine Upper Bounds
for Proﬁtable Strategic Behavior
1. Preliminary Experiments: Truncation of
ROLs at the Match Point.—As noted above,
in a simple market, if an applicant is matched
to his k th-choice program, or if the lowest-ranked applicant a program is matched to is
its k th-choice applicant, truncation of the
ROL at the k th entry would have no influence
on the match. This is because, in a simple
match, the applicant- or program-proposing
algorithms never have to “backtrack” on an
ROL. But in the NRMP, backtracking can
occur, because of the match variations. Thus,
before exploring what truncations, if any,
could have a strategic effect on the match, it
was first necessary to see whether truncations
at the match point [i.e., deleting the k + 1 and
higher (less-preferred) choices for a participant who was matched to his k th choice]
could influence the result of the match under
either algorithm, and how much. The truncations of applicant ROLs and program ROLs
were investigated separately, for each algorithm, for the 1993, 1994, and 1995 matches.
In the majority of cases no change was produced when all ROLs were truncated at the
match point; and in no case were more than
three applicants affected by such truncations.
(Over the more than 60,000 applicants involved in these experiments, only four were
affected by truncations of applicants’ ROLs;
see Table B1 in Appendix B for the detailed
results.) Thus, truncations at the match point,
while not entirely without effect, do not play
a substantial role; they affect on the order of
0.01 percent of applicants, an order of magnitude smaller than the effects of changing
algorithms.
Because we have now seen that information
beyond the match point inﬂuences the outcome
for only a tiny percentage of participants, concentrating on truncations will give us a comparably good approximation for the numbers of
participants who could potentially proﬁt from
any kind of strategic manipulation of ROLs.
The computational experiments which follow, therefore, will concentrate on identifying an upper bound on the number of participants who (if
they had the necessary information) could potentially proﬁt from strategic behavior involving truncations of their ROLs above the match
point, and (for programs) reductions in the number of positions they offer in the match below
the number of applicants to which they were
matched.
2. Experiments to Determine Upper Bounds.
—As discussed above, the kinds of strategic
manipulation to be considered involve truncation of ROLs by applicants or programs, and
reductions in stated numbers of positions (quotas) by programs. Since we want to know how
often a single agent can proﬁtably manipulate
the stated ROL, we could in principle conduct a
separate experiment for each participant, but
this would be computationally infeasible. Consequently we need to design an efﬁcient experiment that will let us tightly bound the number
of individuals who can potentially proﬁt from
manipulating their ROLs.
The manipulations involving program quotas
raise the question of how to handle reversions of
positions when quotas are to be different from
those in the data. Similar questions arise when
truncating program ROLs, as this may increase
unﬁlled positions. All of the experiments concerning strategic behavior of programs handle
reversions by ﬁxing quotas at the ﬁnal quotas
observed after the match with the original match
data. None of the results are likely to be sensitive to this simpliﬁcation, as shown by the results of the sequencing experiments discussed
in Section III and detailed in Appendix A.
For each of the strategic manipulations
whose potential magnitude is to be assessed, the
chief difﬁculty in designing the experiments is
that a change in a single ROL or quota has two
kinds of effects: it may potentially change the
match of the applicant or residency program
whose ROL or quota is changed, but it may also
potentially change the match of other applicants
and residency programs. To see why, suppose
that the ROL of some applicant is truncated
above his current match point, and that the
match under one of the algorithms is rerun after
making (only) this change. Then the applicant
whose ROL was changed may do better (by
being matched to a more-preferred choice) or worse (by being unmatched instead of
matched). At the same time, other applicants
may do better (and other residency programs
may do worse) because of the availability of the
position previously held by the applicant whose
list was truncated.
This means that, if we truncate a group of
applicant ROLs, for example, and see how many
of the applicants in this group receive a better
match as a consequence of this change, we will be
looking at an overestimate of the number of applicants in the group who could have beneﬁted
from truncating their own ROL; many of them
will have instead proﬁted from someone else’s
truncation (even if that person himself became
unmatched as a result of his own truncation).
Thus, the number obtained in this way would be
an upper bound on the number that would have
beneﬁted by truncating their own ROL, while
holding all others’ constant. But we do not have to
settle for this upper estimate; we can reﬁne it
iteratively, by now continuing to truncate the
ROLs only of those applicants whose match improved as a result of the previous (collective)
truncations. This will allow us to further eliminate
from the set of truncations those who proﬁted
from the truncations of applicants who were themselves harmed by their own truncation. Proceeding in this way, we can continue until no more
reductions in the sample are achieved. This ﬁnal
number will still be an upper bound, of course,
since even in a group of truncators who all do
better when they all truncate their preferences,
some may be proﬁting from the truncated ROLs
of the others, not from their own truncation.
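The iterative refinement described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the NRMP software: it assumes a simple one-to-one market, uses a hypothetical `program_proposing` routine as the matching algorithm, and the two small markets are invented for the example. Every matched applicant starts truncated just above his original match; applicants who end up unmatched have their lists restored, and the process repeats until the set of truncators stops shrinking.

```python
def program_proposing(applicant_prefs, program_prefs):
    """One-to-one program-proposing deferred acceptance; returns {applicant: program}."""
    rank = {a: {p: i for i, p in enumerate(lst)} for a, lst in applicant_prefs.items()}
    next_choice = {p: 0 for p in program_prefs}
    held = {}                                   # applicant -> program tentatively held
    free = list(program_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(program_prefs[p]):
            continue                            # program has exhausted its list
        a = program_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if p not in rank.get(a, {}):
            free.append(p)                      # applicant did not list this program
        elif a not in held:
            held[a] = p
        elif rank[a][p] < rank[a][held[a]]:
            free.append(held[a])                # applicant trades up
            held[a] = p
        else:
            free.append(p)
    return held


def truncation_upper_bound(applicant_prefs, program_prefs, run_match):
    """Iteratively shrink the set of applicants who improve when all of them truncate."""
    base = run_match(applicant_prefs, program_prefs)
    truncators = set(base)                      # start with every matched applicant
    while True:
        stated = dict(applicant_prefs)
        for a in truncators:
            cut = applicant_prefs[a].index(base[a])
            stated[a] = applicant_prefs[a][:cut]    # drop the match and everything below
        result = run_match(stated, program_prefs)
        # A truncator who is still matched must hold a more-preferred program.
        improved = {a for a in truncators if a in result}
        if improved == truncators:
            return truncators                   # converged: still only an upper bound
        truncators = improved                   # restore the lists of everyone else


# Hypothetical market with two stable matchings plus one uncontested pair.
cyclic_bound = truncation_upper_bound(
    {"a1": ["P1", "P2"], "a2": ["P2", "P1"], "a3": ["P3"]},
    {"P1": ["a2", "a1"], "P2": ["a1", "a2"], "P3": ["a3"]},
    program_proposing,
)

# Hypothetical market with perfectly aligned preferences (unique stable matching).
aligned_bound = truncation_upper_bound(
    {"b1": ["Q1", "Q2"], "b2": ["Q1", "Q2"]},
    {"Q1": ["b1", "b2"], "Q2": ["b1", "b2"]},
    program_proposing,
)
```

In the aligned market, b2 appears to improve in the first round only because b1's truncation freed Q1; once b1's list is restored, b2 drops out and the bound falls to zero. This is the kind of overestimate that the iteration is designed to remove.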
Experiments were conducted separately for
applicants and for programs, and separately for
each of the two algorithms. A computational
experiment for applicants in a given year started
by truncating all ROLs just above the (lowest)
match point (i.e., every applicant’s primary
ROL was truncated just above the match he
received when no ROLs were truncated using
the algorithm in question). For example, if an
applicant originally matched to rank 3 on his
primary ROL, the truncated ROL contained
only his ﬁrst two choices. Of course, many
applicants were left unmatched by this truncation, while others received preferred matches
(these were the only two possibilities at this
stage). Then at the next step, the ROLs of all
those who had truncated their lists but did not improve were restored to their original length,
and the process was repeated with the smaller
number of truncations that remained. This process was repeated until it converged. Computational experiments for programs were structured
similarly, starting with every program’s ROL
truncated just above the lowest-ranked match it
received. The full results for the NRMP for
1987 and 1993–1996 are given in Appendix B.
(Table B2 reports the results of the truncation of
applicant ROLs for each algorithm; Table B3
reports the results for programs.) The results
can be summarized by looking at the ﬁnal upper
bounds of the number of applicants and the
number of programs that could possibly beneﬁt
from truncating their ROLs.
The results are reported and analyzed below,
ﬁrst for the NRMP matches, and then for Thoracic Surgery.
a. Results for the NRMP.—The truncation
experiments with applicants’ ROLs yield the
upper bounds shown in Table 4 for the two
algorithms in the years studied. As expected,
more applicants can beneﬁt from list truncation
under the preexisting NRMP algorithm than
under the applicant-proposing algorithm. Note
that the number of applicants who could even
potentially benefit from truncating their ROLs
under the preexisting NRMP algorithm is in
each year almost exactly equal to the number of
applicants who received a preferred match under the applicant-proposing match (line 2 of
Table 2). We will return to this point in a
moment, but note that it suggests that this upper
bound is very close to the precise number that
would be predicted in the absence of match
variations.
TABLE 4—UPPER LIMIT OF THE NUMBER OF APPLICANTS WHO COULD BENEFIT BY TRUNCATING THEIR LISTS AT ONE ABOVE THEIR ORIGINAL MATCH POINT

                Upper limit
Year    Preexisting NRMP algorithm    Applicant-proposing algorithm
1987    12                            0
1993    22                            0
1994    13                            2
1995    16                            2
1996    11                            9

The truncation experiments with programs’
ROLs yield the upper bounds shown in Table
5. As expected, some programs can benefit
from list truncation under either algorithm.
However, consistently more programs benefit
from list truncation under the applicant-proposing
algorithm than under the preexisting NRMP algorithm.
Note that, although the
numbers of programs in these upper bounds
remain small, they are in many cases about
twice as large as the number of programs that
received a preferred match at the stable
matching produced by the algorithm other
than the one being manipulated. (That is, referring
back to Table 2, we see, for example,
that in 1995 only 14 programs preferred the
matching produced by the preexisting NRMP
algorithm to the one produced by the applicant-proposing
algorithm, but we now find
36 programs in our upper bound of programs
that could potentially profit from a manipulation
of the applicant-proposing algorithm.) It
therefore seemed worthwhile to examine
these upper bounds further and see if they
were overestimates.

TABLE 5—UPPER LIMIT OF THE NUMBER OF PROGRAMS THAT COULD BENEFIT BY TRUNCATING THEIR LISTS AT ONE ABOVE THE ORIGINAL MATCH POINT

Year    Preexisting NRMP algorithm    Applicant-proposing algorithm
1987    15                            27
1993    12                            28
1994    15                            27
1995    23                            36
1996    14                            18
For each algorithm, this was first done by
taking a 50-percent sample of the programs
contained in the upper bound for 1995 and
restarting the truncation experiment with only
these programs having truncated ROLs. The
idea is that, if each of these programs can in fact
benefit from its own truncation, the experiment
would stop after the first iteration, with no further
reductions in the upper bound. But if in fact
the upper bound is an overestimate, and some of
the programs in it are benefiting not from their
own truncated ROLs, but from the truncation of
one of the other ROLs in the upper bound, then
on average half of such “false positives” in our
50-percent sample would have been benefiting
from the truncation by one of the programs in
the other 50 percent, which are no longer truncated.
In this case we would iterate until the
number of truncators who improved their outcome
again stabilized at a new, lower upper
bound. This is in fact what happened; the new
estimates for 1995 (equal to twice the number
obtained from the 50-percent sample) are compared
to the old ones in Table 6. These results
confirm that the numbers of programs that can
benefit from the ROL truncations stated earlier
are indeed overestimates.

TABLE 6—REFINED ESTIMATE OF THE UPPER LIMIT OF THE NUMBER OF PROGRAMS THAT COULD IMPROVE THEIR RESULTS BY TRUNCATING THEIR OWN ROLS IN 1995

Estimate                                   Preexisting NRMP algorithm    Applicant-proposing algorithm
Original results                           23                            36
Current estimate (still an upper limit)    12                            22
A further analysis was undertaken for each of
the ﬁve years, to compare the speciﬁc individual
programs and applicants who appear in these
upper bounds as potentially beneﬁting from
ROL truncations with the programs and applicants whose results changed when the algorithm
changed. This analysis indicated that those who
could beneﬁt from ROL truncations were, for
the most part, those who did differently (generally worse) when the algorithm was changed
from their side proposing to the other side proposing (without ROL truncations). For example, the applicants who can beneﬁt from ROL
truncations when the program-proposing algorithm is used are very largely the same as those
who benefit when the algorithm is changed to an
applicant-proposing algorithm with no ROL
truncations. Thus, in this respect also, it appears
that the theory for simple markets provides a
good approximation of the situation in the
NRMP match.
We next turn to the question of capacity
manipulation by programs. Recall that in an
actual match this could be considered by a program in the context of either an early agreement
(for example with an independent applicant) or
in anticipation that some positions would be
filled postmatch.
An initial experiment was run setting all
program quotas to the number of positions ﬁlled
with the algorithm in question and the original
data. (This is analogous to the initial experiment
involving truncations of the ROLs at the match
point, rather than above it.) In a simple match
without NRMP match variations, this would be
expected to have no impact on the results. However, with NRMP match data some differences
were observed, as seen in Table 7. With the
applicant-proposing algorithm, the differences
are negligible. However, more differences were
observed with the preexisting NRMP algorithm,
and the results obtained by setting the quotas to
the original positions ﬁlled tended to produce
better results for the programs.
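The benchmark claim for simple matches, that setting every quota to the number of positions actually filled leaves the result unchanged, can be checked on a toy example. The sketch below assumes a many-to-one market without match variations and an applicant-proposing algorithm; the function name, the applicants, the programs and the quotas are all hypothetical.

```python
def applicant_proposing(applicant_prefs, program_prefs, quotas):
    """Many-to-one applicant-proposing deferred acceptance.

    Returns {program: sorted list of applicants it holds at the end}.
    """
    rank = {p: {a: i for i, a in enumerate(lst)} for p, lst in program_prefs.items()}
    held = {p: [] for p in program_prefs}
    next_choice = {a: 0 for a in applicant_prefs}
    free = list(applicant_prefs)
    while free:
        a = free.pop()
        if next_choice[a] >= len(applicant_prefs[a]):
            continue                              # list exhausted: a stays unmatched
        p = applicant_prefs[a][next_choice[a]]
        next_choice[a] += 1
        if a not in rank.get(p, {}):
            free.append(a)                        # program did not list this applicant
            continue
        held[p].append(a)
        if len(held[p]) > quotas[p]:
            # Over quota: release the least-preferred applicant currently held.
            worst = max(held[p], key=lambda x: rank[p][x])
            held[p].remove(worst)
            free.append(worst)
    return {p: sorted(members) for p, members in held.items()}


# Hypothetical market: program P offers 4 positions but fills only 3.
applicant_prefs = {
    "a1": ["P", "Q"], "a2": ["P", "Q"], "a3": ["Q", "P"],
    "a4": ["Q"], "a5": ["Q", "P"],
}
program_prefs = {"P": ["a1", "a2", "a3", "a5"], "Q": ["a3", "a4", "a5", "a1", "a2"]}
quotas = {"P": 4, "Q": 2}

original = applicant_proposing(applicant_prefs, program_prefs, quotas)
filled = {p: len(members) for p, members in original.items()}
rerun = applicant_proposing(applicant_prefs, program_prefs, filled)
```

Here `rerun` coincides with `original`: a program that ends with unfilled positions never rejected a listed applicant, so lowering its quota to the number filled changes nothing. The differences reported in Table 7 therefore reflect the NRMP match variations rather than this simple benchmark.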
In order to identify programs that could improve their remaining matches by further reducing their quotas, an iterative technique was
employed similar to that used to investigate the
effects of ROL truncations. After several iterations revised downward the upper bounds obtained in this way, the resulting upper bounds
on the number of programs that could potentially profit from stating lower quotas were as
shown in Table 8. Again, these numbers are still
estimates of the upper bound; further reﬁnement
is still possible. However, given the size of
these numbers, it seems clear that only a very
small number of programs (less than 1 percent)
could improve their remaining matches by reducing their quotas. This does not appear to be
an advisable strategy for programs to follow
with either algorithm.
b. Results for Thoracic Surgery.—Because
the Thoracic Surgery match does not have
match variations, the theory tells us precisely
which applicants and programs could improve
their match by an optimal manipulation. As a
check on our computational procedures, we
conﬁrmed these predictions by running the
same computational experiments on ROL truncations as described for the NRMP matches.
The results, summarized in Table 9, are as expected. Thus, in Thoracic Surgery as in the
larger and more complex NRMP match, the
opportunities for strategic manipulation are essentially nonexistent under either algorithm.
(Colenbrander [1996] reaches essentially the
same conclusions about the specialty matches
he maintains.)

TABLE 7—RESULTS WITH INPUT QUOTAS SET TO POSITIONS FILLED, COMPARED TO ORIGINAL RESULTS

                      1993                        1994                        1995
              Preexisting  Applicant-     Preexisting  Applicant-     Preexisting  Applicant-
              NRMP         proposing      NRMP         proposing      NRMP         proposing
Result        algorithm    algorithm      algorithm    algorithm      algorithm    algorithm
Programs
  Improve     12           2              9            none           25           none
  Do worse    none         none           3            2              9            2
Applicants
  Improve     none         none           3            2              6            2
  Do worse    12           2              9            none           27           none

TABLE 8—REVISED ESTIMATE OF THE UPPER BOUND OF THE NUMBER OF PROGRAMS THAT COULD IMPROVE THEIR REMAINING MATCHES BY REDUCING QUOTAS

Year    Preexisting NRMP algorithm    Applicant-proposing algorithm
1987    28                            8
1993    16                            24
1994    32                            16
1995    8                             16
1996    44                            32

VI. Why the Differences Are Small: Insights
from the Theory of Simple Markets
All the results to this point can be characterized by noting that the theory of simple
matches, without match variations, gives a good
prediction of the direction of each of the comparisons, and, in addition, the size of all the
changes has been very small. This section explores what insights we can get from simple
markets to help explain why these differences
are so small. The results in this section are
based on computational comparisons similar to
those discussed earlier, but now concerning hypothetical markets without any match variations.
The small differences between algorithms we
have been seeing reflect that, in each of the
years studied, the set of stable matchings has
been small, as measured by the number of participants
who receive different matches from the
program-proposing and applicant-proposing algorithms.11
It is therefore of interest to consider
how the set of stable matchings looks in comparably
large markets when we concentrate on
simple matches. For this purpose, we consider
the very simple matching markets with n firms
(each with one position) and n applicants, as n
approaches the size of the markets we are studying, namely, the specialty markets like Thoracic
Surgery and the general NRMP match.
One factor that strongly influences the size of
the set of stable matchings (which coincides
with the core in this simple model) is the correlation of preferences among programs and
among applicants. When preferences are highly
correlated (i.e., when similar programs tend to
agree which are the most desirable applicants,
and applicants tend to agree which are the most
desirable programs), the set of stable matchings
is small. (When preferences are perfectly correlated, then there is a unique stable matching, so
both algorithms would produce the same matching.) However, as the correlation of preferences
goes down, the size of the set of stable matchings grows, and more and more participants
would be matched differently by the two algorithms. This is true independently of the size of
the market.
11 This is the natural measure for the size of the set of
stable matchings in the present context, since the concern is
with how many market participants will be affected by a
change in algorithms. Note, however, that it is different
from the more common measure of the size of the set of
stable matchings, the number of distinct stable matchings. If
20 applicants receive different assignments at different stable
matchings, there could be as many as 2^10 = 1,024
different stable matchings, in case the 20 applicants can be
resolved into 10 independent pairwise interchanges of positions, or there could be as few as two stable matchings, if
all 20 applicants are involved in a single irreducible cycle.
In either case, if there are 20,000 jobs being filled, we have
been focusing on the approximately 20 applicants who
receive different assignments when we conclude that the set
of stable matchings is small.

TABLE 9—THORACIC SURGERY: (A) NUMBERS OF APPLICANTS WHO COULD IMPROVE MATCHES BY TRUNCATING THEIR
ROLS; (B) NUMBERS OF PROGRAMS THAT COULD IMPROVE MATCHES BY TRUNCATING THEIR ROLS

Algorithm              1991    1992                    1993                    1994    1996
A. Applicants Who Could Improve by Truncating Their ROLs:
Preexisting NRMP       none    2 applicants improve    2 applicants improve    none    none
                               (same ones who did      (same ones who did
                               better when the         better when the
                               algorithm changed)      algorithm changed)
Applicant-proposing    none    none                    none                    none    none
B. Programs That Could Improve by Truncating Their ROLs:
Preexisting NRMP       none    none                    none                    none    none
Applicant-proposing    none    2 programs improve      2 programs improve      none    none
                               (same programs that     (same programs that
                               did worse when the      did worse when the
                               algorithm changed)      algorithm changed)

It turns out, however, that the size of the
market also plays a critical role, in an interesting way. Consider the case in which preferences
are uncorrelated (so the set of stable matchings
is large). If every applicant could somehow
interview and be interviewed for all of the positions, then the set of stable matchings would
grow larger and larger (even as a percentage of
the number of applicants who could get different stable matchings) as the number of applicants and positions grew. Figure 1 shows that
this percentage grows to over 90 percent by the
time n reaches 1,000.
Of course, in a real market there is a limit to
how many interviews an applicant can have, or
a program can conduct. When we take this into
account, we see that the set of stable matchings
quickly becomes very small as the market becomes large.
Specifically, let k equal the number of interviews a candidate can have, and let n equal the
number of applicants and positions in the market. Then, even when preferences are completely
uncorrelated, as k / n becomes small, the
set of stable matchings becomes small. For example,
if k = 15 (not an unreasonable approximation
for the NRMP) and n = 10,000, fewer
than 0.1 percent of applicants would receive a
different match from the two algorithms.12 That
is, even with completely uncorrelated preferences,
we see in this simple market the same
one-in-a-thousand order of magnitude that we
see in the NRMP. And for simple markets the
size of the specialty matches like Thoracic Surgery,
with n on the order of 100 positions, if we
suppose that applicants interview at no more
than k = 10 programs we find only about 2
percent of applicants receiving different
matches from the two algorithms. Figure
2 graphs the curves for fixed k , as n goes from
10 to 10,000.
12 The variance (based on 1,000 randomly generated
simple markets) is well under 0.001 percent.
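A computation of this kind can be sketched as follows. This is a minimal illustration, not the authors' code, and the way markets are generated is an assumption: each of n applicants lists k programs drawn uniformly at random in random order, and each single-position program ranks, in random order, exactly the applicants who listed it. The function names are hypothetical. The quantity returned is C(n)/n, the share of applicants matched differently by the applicant-proposing and program-proposing algorithms.

```python
import random


def da(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance with incomplete lists; returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(lst)} for r, lst in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}
    held = {}                                   # receiver -> proposer tentatively held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue                            # list exhausted: p stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if p not in rank.get(r, {}):
            free.append(p)
        elif r not in held:
            held[r] = p
        elif rank[r][p] < rank[r][held[r]]:
            free.append(held[r])
            held[r] = p
        else:
            free.append(p)
    return {p: r for r, p in held.items()}


def random_market(n, k, rng):
    """Uncorrelated preferences: each applicant interviews at k random programs."""
    programs = list(range(n))
    applicant_prefs = {a: rng.sample(programs, k) for a in range(n)}
    program_prefs = {p: [] for p in programs}
    for a, lst in applicant_prefs.items():
        for p in lst:
            program_prefs[p].append(a)          # programs rank only their interviewees
    for lst in program_prefs.values():
        rng.shuffle(lst)
    return applicant_prefs, program_prefs


def fraction_different(n, k, seed=0):
    """Share of applicants matched differently by the two algorithms, C(n)/n."""
    rng = random.Random(seed)
    applicant_prefs, program_prefs = random_market(n, k, rng)
    applicant_optimal = da(applicant_prefs, program_prefs)
    program_optimal = {a: p for p, a in da(program_prefs, applicant_prefs).items()}
    changed = sum(1 for a in applicant_prefs
                  if applicant_optimal.get(a) != program_optimal.get(a))
    return changed / n


# Example: hold k fixed and let the market grow.
shares = {n: fraction_different(n, k=5, seed=1) for n in (50, 200, 800)}
```

Averaging `fraction_different` over many seeds for each (n, k) pair would give curves of the kind plotted in Figure 2; a single draw, as in `shares`, is noisy. With k = 1 the two algorithms necessarily agree, which makes a convenient sanity check.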
Especially in view of the fact that preferences
are not uncorrelated in the medical matches, this
means that the orders of magnitude of the effects studied in the actual matches are very
comparable to what we should expect of simple
matches with similar values of k and n . Thus
(once we look at both k and n ) these simple
markets turn out to provide a good approximation not only for the direction of the effects we
are seeing, but also for their size.
The reason this is important for the present
study of the NRMP and specialty matches is
that, in the theoretical study of simple markets, we can look at what would happen when
we know agents’ true preferences, not just the
ROLs they submit to the match (whereas in
the study of real matches we have been using
as data the submitted ROLs). One theoretical
possibility for why we find such small potential for strategic manipulation is that our data
has been collected after such manipulation
has already taken place. That is, one counterhypothesis
might explain our results by positing
that there are substantial opportunities
for strategic manipulation but that these have
been exhausted by the time we look at the
ROLs submitted to the match, because the
participants have already behaved strategically in an optimal way. Another counterhypothesis
could be that the hybrid nature of the
preexisting NRMP algorithm in fact produces
matches that are far from the worst possible
stable matching for applicants, and that the
set of stable matchings is therefore substantially
larger than we detect. The results discussed
in this section show that these
hypotheses are implausible, because when we
look at similarly sized artificial matches, in
which we can examine the hypothetical
participants’ true preferences, we find that the
set of stable matchings is close to the size we
have computed from the ROL data. Thus the
study of simple markets provides an explanation of not only the direction of the effects we
have been examining, but also their small
size.13
13 It remains an open problem to develop analytical
results that explain why the core of this simple market
shrinks as the market grows when the number of interviews
an applicant can go on remains constant. The fact that every
worker who does get a different job at different stable
matchings is involved in certain sorts of preference cycles
may provide an avenue for obtaining such results.

FIGURE 1. SIZE OF THE SET OF STABLE MATCHINGS AS A FRACTION OF n , WHEN k = n (UNCORRELATED PREFERENCES)
Note: C ( n ) is the number of applicants who get different stable matches, when the market size is n .

VII. Theory and Computation in Economic
Design: Some Methodological Reflections
Perhaps the first rule of any design effort is
that “details matter.” The details determine
what outcomes are even feasible, and so they
matter in the most basic aspects of design; and
they have implications for all of the market’s
properties, so they matter for the subtlest aspects of the design’s consequences. Thus, every
design effort will be different. But if we are to
develop a body of knowledge about design
practice in economics, we need to think about
the methodological issues that may be common
to many design efforts. This section is an attempt to put the methodological issues encountered in the NRMP design and evaluation into a
context that may be useful for other design
efforts. Speciﬁcally, this design effort involved
the continual interplay among various aspects
of simple theory, computational experiments,
and theoretical computation. The simple theory
guided the design of computational experiments
on the complex system, which provided unpredicted results that were then explained by theoretical computation.
The reason why there are gaps between theory and design is that, just as design is detailed,
theoretical models must often be sparse, to be
useful for organizing and directing work in a
variety of applications whose connections may
become apparent only with the benefit of theory.
Much of this paper has therefore been
concerned with filling the gaps between simple
abstract markets and complex real ones. But
before we discuss the filling of gaps, it is useful
to recall the essential role played by the theory
of simple matching markets. This role ranged
from suggesting the basic design of the clearinghouse
algorithm and the comparisons of the
algorithms, to directing attention to aspects of
the market in which problems might be anticipated, and to offering insights into how these
might be overcome.

FIGURE 2. SIZE OF THE SET OF STABLE MATCHINGS AS A FRACTION OF n FOR DIFFERENT VALUES OF k (UNCORRELATED PREFERENCES)
Notes: C ( n ) is the number of applicants who get different stable matches, when the market size is n ; k is the number of
programs on an applicant’s ROL.
It was the existing simple theory, and the
empirical studies it permitted to be conducted
on ﬁeld data, that pointed to the importance of
stable matchings. Although counterexamples
showed that stable matchings might not exist in
the complex American medical market (Roth,
1984), the theory of simple markets suggested a
general architecture for an algorithm to ﬁnd
stable matchings. Furthermore, it showed that
algorithms in which proposals were issued by applicants could be expected to produce stable
matchings as favorable as possible to applicants. In short, the body of theory that existed
prior to the start of this design (e.g., as summarized in Roth and Sotomayor [1990]) already
constituted a rough road map for the mechanism
design and evaluation reported here.
At the same time, the existing body of theory,
through counterexamples designed to explore
its limits (inspired by empirical studies of existing markets), pointed to questions that needed
to be answered. These included the role of sequencing in design of the algorithm, the frequency with which the algorithm might fail to
ﬁnd a stable matching, and the frequency with
which opportunities for strategic manipulation
might arise. These all required estimations of
magnitudes, which in turn required computational experiments on the data. Some of these
computational experiments were straightforward to conduct. But for estimating how often
strategic opportunities might arise, the theory played an essential role in the design of the
computational experiments.
Speciﬁcally, although the main conclusions
about strategic behavior do not carry over
from the simple to the complex market, the
theory of the simple case gives us not only
ﬁnal conclusions, but also insight into the
way that strategic behavior works. In the case
of misrepresentation of ROLs, the way an
applicant might gain an advantage, in either
the simple or complex markets, is to state an
ROL that causes him, at some point in the
algorithm, to make a rejection that would not
have been made if he had submitted his true
preferences. This rejection causes a residency
program to have a vacancy and hence make an
offer to another applicant, who in turn may
make a different rejection than he would have
if the original applicant had stated his true
preferences. It is the propagation of this “vacancy chain” through the market that raises
the possibility that an applicant could do better than to state his true preferences.14 The
fact that the potential advantage comes from
rejections being made implies in the simple
model that the possibility of proﬁtable strategic misrepresentations of ROLs can be investigated by looking at only the small subset of
misrepresentations that consist of truncations.
To see if this was approximately true for the
complex market required a computational experiment, and (when this proved to be the
case) it became computationally feasible to
investigate the strategic properties of the
complex market, through an experiment concentrating on truncations. Thus, the theory
allowed us to see what computational experiments would give us the answer to a question
that the theory alone could not answer.
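The vacancy-chain mechanism described above can be illustrated with a minimal sketch. It assumes a one-to-one market (one position per program) and a program-proposing deferred acceptance procedure; the function `program_proposing_da` and the four-agent example are hypothetical illustrations, not the NRMP software.

```python
def program_proposing_da(program_rols, applicant_rols):
    """Deferred acceptance with programs proposing, one position per program.

    Each ROL lists acceptable partners, most preferred first.
    Returns {applicant: program} for every matched applicant.
    """
    rank = {a: {p: i for i, p in enumerate(rol)}
            for a, rol in applicant_rols.items()}
    next_offer = {p: 0 for p in program_rols}  # next index on each program's ROL
    held = {}                                  # applicant -> offer currently held
    vacant = list(program_rols)                # programs with an open position
    while vacant:
        p = vacant.pop()
        if next_offer[p] >= len(program_rols[p]):
            continue                           # ROL exhausted; position stays open
        a = program_rols[p][next_offer[p]]
        next_offer[p] += 1
        if p not in rank.get(a, {}):
            vacant.append(p)                   # a did not rank p: offer rejected
        elif a not in held:
            held[a] = p                        # a holds its first acceptable offer
        elif rank[a][p] < rank[a][held[a]]:
            vacant.append(held[a])             # a rejects the held offer, creating a vacancy
            held[a] = p
        else:
            vacant.append(p)                   # a keeps the held offer and rejects p
    return held


# Hypothetical example: two programs, two applicants.
programs = {"p1": ["a2", "a1"], "p2": ["a1", "a2"]}
true_rols = {"a1": ["p1", "p2"], "a2": ["p2", "p1"]}
truncated_rols = {"a1": ["p1"], "a2": ["p2", "p1"]}  # a1 drops p2 from its list

truthful = program_proposing_da(programs, true_rols)
truncated = program_proposing_da(programs, truncated_rols)
```

With true preferences a1 is matched to p2. When a1 truncates its list, it rejects p2's offer; p2 then offers to a2, who prefers it to p1, and p1's offer eventually reaches a1, so a1 ends up at its first choice.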
While computational experiments on the data
allow us to get answers that may not be available from simple theory, they do not necessarily
let us understand why the answers are what they
are. In addition, results obtained from exploring
a large and complex data set with a large and
new piece of software (the new algorithm) need
to be checked in some way, to make sure that the results are not due to some unanticipated
artifact of the way the algorithm deals with the
complexities of the data.15 That is, although
properly constructed computational experiments on the data offer us answers to questions
we cannot answer with theory alone, we need
both to check and to understand these answers
before we can have the confidence in them that
we would like to have before recommending
that the new algorithm be considered for use in
the market.

14 The propagation of vacancy chains as such in simple markets is a topic that was explored in the course of this design effort, and is reported in Yossi Blum et al. (1997).
We addressed these issues in two ways: by
computational experiments on the data from
the Thoracic Surgery matches and by theoretical computation to determine how the size of
the set of stable matches behaved in large
simple markets. The ﬁrst of these allowed us
to exercise the software on a medical market
free of the match variations present in the
general medical market. The theory therefore
permitted us to interpret the comparisons between the two algorithms as unambiguously
measuring the size of the set of stable matchings. The small size of this set therefore had
no possibility of resulting from some aspect
of how the algorithm deals with match variations.
The computations on thousands of randomly generated simple markets with ﬁxed
length of ROLs and varying numbers of participants allowed us to see how the size of the
set of stable matchings shrinks as the market
grows, which establishes a new kind of core
convergence result. This shows that the match
variations in the medical market do not substantially contribute to the size of the set of
stable matchings, since the results on the market data are entirely consistent with the results for similarly sized simple markets.
Given the theoretical results on strategic misrepresentation, this core convergence result
also shows that it is always a best reply for all but a tiny percentage of participants in large
simple markets to state their true preferences.

15 Of course, it was necessary to check directly that the program worked, and in fact it was easy to confirm that the matchings it produced were stable as well as feasible with regard to all the match variations. The question we are referring to here is not whether the program does what it was designed to do, but rather whether the apparent small size of the set of stable matchings might have to do with some aspect of how the program handles the match variations.
Note that we distinguish between what we
call the “computational experiments” on the
actual NRMP data and the “theoretical computation” on the randomly generated simple
markets. This has to do with our view that
“theory” resides in the simplicity of the model
and systematic nature of the conclusions,
rather than the body of mathematical technique traditionally associated with theory.
The theoretical computations tell us how the
difference between the applicant- and firm-optimal stable matches varies with the size of
a simple market. This new computational result, combined with existing theory, allows us
to interpret this as precisely measuring the
size of the core of the market, and to determine the implications this has for the possibility of profitable strategic manipulation.
The theorems explaining why the core must
converge as it does will surely follow (see
Feldin [1999] for some progress in this
direction).
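A theoretical computation of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the market is one-to-one, each applicant lists `k` programs chosen uniformly at random, each program ranks the applicants who listed it in random order, and all function names are hypothetical.

```python
import random


def deferred_acceptance(proposer_rols, receiver_rols):
    """One-to-one deferred acceptance; returns {proposer: receiver}."""
    rank = {r: {p: i for i, p in enumerate(rol)}
            for r, rol in receiver_rols.items()}
    nxt = {p: 0 for p in proposer_rols}   # next index on each proposer's ROL
    held = {}                             # receiver -> proposer currently held
    free = list(proposer_rols)
    while free:
        p = free.pop()
        rol = proposer_rols[p]
        while nxt[p] < len(rol):
            r = rol[nxt[p]]
            nxt[p] += 1
            if p not in rank[r]:
                continue                  # r finds p unacceptable
            cur = held.get(r)
            if cur is None:
                held[r] = p
                break
            if rank[r][p] < rank[r][cur]:
                held[r] = p               # r trades up and releases cur
                free.append(cur)
                break
    return {p: r for r, p in held.items()}


def random_market(n, k, rng):
    """n applicants and n programs; every applicant has an ROL of length k."""
    applicants = {("a", i): [("p", j) for j in rng.sample(range(n), k)]
                  for i in range(n)}
    programs = {("p", j): [] for j in range(n)}
    for a, rol in applicants.items():
        for p in rol:
            programs[p].append(a)
    for rol in programs.values():
        rng.shuffle(rol)                  # random strict ranking of those who applied
    return applicants, programs


def count_different(n, k, rng):
    """Applicants whose match differs between the two extreme stable matchings."""
    applicants, programs = random_market(n, k, rng)
    app_opt = deferred_acceptance(applicants, programs)    # applicant -> program
    prog_opt = deferred_acceptance(programs, applicants)   # program -> applicant
    by_applicant = {a: p for p, a in prog_opt.items()}
    return sum(1 for a in applicants if app_opt.get(a) != by_applicant.get(a))
```

Calling `count_different(n, k, random.Random(seed))` for increasing `n` with `k` held fixed, and dividing by `n`, gives the share of applicants who receive different matches at the applicant-optimal and program-optimal stable matchings, which is the quantity whose behavior in large markets the text describes.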
In summary, the design process discussed
here involved interplay among various aspects of simple theory, computational experiments, and theoretical computation.16 We
suspect that, as we build a body of engineering practice in economics, this will prove to
be a general pattern.
VIII. Concluding Remarks

The crisis of confidence that threatened to
undermine participation in the NRMP was
serious precisely because the kind of market
failure which the NRMP was initially developed to correct arose when residency programs and applicants lost conﬁdence in the
existing market. But by the time of this modern crisis, the historical market failure and
how it was corrected by the NRMP were
understood; and so was the fact that similar
market failures in British medical markets
had occurred and been corrected with stable
matching mechanisms, while unstable mechanisms had failed (Roth, 1990, 1991). In addition, the general class of market failures due
to unraveling of appointment dates had been
identified in many markets (Roth and Xing,
1994). Therefore, although physicians who
had participated in the unraveling of the
American medical market and in the formation of the NRMP were no longer active, it
was not difficult to communicate to the participants in the modern market why it was
desirable to focus on changes in the market
that would not reignite the unraveling of appointment dates (Roth, 1996b).17 Thus, although what we knew about two-sided
matching markets did not provide an immediate solution to the design of a new market
for physicians, it provided clear guidelines
and suggested clear approaches.

16 Laboratory experiments also have a role to play, although not one we will discuss here (but see Kagel and Roth [2000]).
It was nevertheless troubling to us at the
outset of this design effort that not only did
none of the standard theorems about simple
matching markets apply directly to the medical market, but counterexamples to the conclusions of many of them were known to exist
when the complications of the actual market
were present. These counterexamples had the
potential to be of great importance, as in the
possibility that different stable matchings
might yield different levels of employment (a
possibility that does not arise in simple markets). Indeed, our results show that in this
market this possibility is real and so cannot be
ruled out with better theory. But of the more than 100,000 applicants in the years we studied in detail, only two applicants (one in 1987
and one in 1996) would have changed from
employed to unemployed or vice versa at the
different stable matchings we consider (see
Table 2). Because this difference was both
tiny and unsystematic, it did not play a role in
the market design.

17 Indeed the initial study proposal (Roth, 1995) quoted Hippocrates’s famous dictum that, when preparing to treat a disease,

    The physician must be able to tell the antecedents, know the present, and foretell the future, must mediate these things, and have two special objects in view with regard to disease, namely, to do good or to do no harm.

In this connection it is worth mentioning that, particularly because confidence in the market was the key issue, the study was conducted in an unusually public way, with progress reports posted regularly on the internet (see http://www.economics.harvard.edu/ aroth/nrmp.html) and widely distributed to interested organizations of physicians and medical students. A final report, briefly summarizing the overall results as in Tables 1 and 2, was presented to the medical community at large in Roth and Peranson (1997).
This and the related results about the small
number of applicants who receive different
matches at the different stable matchings
point to a need to develop theory in ways that
will tell us not only about the possibility of
different effects, but also about their probability and likely magnitudes. It seems to us
that questions about magnitudes of the sort
we encountered in the course of this design
will often arise in efforts to employ economic
theory in the design of institutions for complex markets. Theoretical computation can be
a big help in this effort, as it was in the
present case in clarifying the unexpected consequences of the simple fact that applicants
can interview at only a small fraction of the
available positions.
More generally, just as there is a chemical-engineering literature (and not just literature
about theoretical and laboratory chemistry)
and a medical literature (and not just a biology literature), economists need to develop a
scientific literature concerned with practical
problems of design. An engineering-oriented
design literature, and the theory that supports
it, will be different from the basic science on
which it depends, both in emphasis and in
method. If we do not develop such a literature, the practical problems of design will be
relegated to the arena of “just consulting,”
and we will fail to beneﬁt from the accumulation of knowledge which is so evident in
other kinds of engineering.

APPENDIX A: RESULTS OF THE COMPUTATIONAL EXPERIMENTS CONCERNED WITH SEQUENCING

Table A1 presents results from the sequencing experiments on the preexisting NRMP algorithm. Table A2 summarizes results of the experiments related to sequencing in the applicant-proposing algorithm. Table A3 compares the results when input quotas are set to final quotas and reversion processing is eliminated to the initial results.

TABLE A1—EFFECTS OF SEQUENCE IN WHICH PROGRAMS ARE PROCESSED

A. Results with Programs Processed in Descending Code Order Compared to Original Results with Preexisting NRMP Algorithm

Result            1993    1994    1995
Programs
  Improve         none    2       2
  Do worse        2       2       none
Applicants
  Improve         2       2       none
  Do worse        none    2       2

B. Sequencing of Reversions: Results with Input Quotas Set to Final Quotas and Reversion Processing Eliminated, Compared to Original Results with Preexisting NRMP Algorithm

Result            1993    1994    1995^a
Programs
  Improve         2       none    none
  Do worse        none    2       2
Applicants
  Improve         none    2       2
  Do worse        2       none    none

Notes: In 1994, when some programs and applicants did better while others did worse, there was no correlation between the change in result and the code numbers of the applicants and programs.
^a Subsequently, the years 1987 and 1996 were also examined with similar results: no applicants were affected in 1987, two were affected in 1996.

TABLE A2—SUMMARY OF RESULTS OF EXPERIMENTS RELATED TO SEQUENCING IN THE APPLICANT-PROPOSING ALGORITHM

Sequence of processing  1993                1994                                          1995

A. Baseline Results (When Program Selected from Stack, Applicants Processed in Ascending Program Rank Number Sequence)

Applicants ascending; singles and couples intermixed
  Match result^a        —                   —                                             —
  Loops detected        3                   6                                             4

B. Applicant and Couples Processing Sequence (When Program Selected from Stack, Applicants Processed in Ascending Program Rank Number Sequence)

Applicants descending; couples last
  Match result          same                same                                          same
  Loops detected        0                   0                                             0
Applicants ascending; couples last
  Match result          same                same                                          same
  Loops detected        2                   0                                             0
Applicants ascending; couples first
  Match result          2 applicants worse  2 applicants worse                            same
  Loops detected        3                   6                                             1
Applicants descending; couples first
  Match result          2 applicants worse  2 applicants worse                            same
  Loops detected        1                   59                                            3

C. Sequence of Processing Applicants Ranked by Program Selected from Program Stack (When Program Selected from Stack, Applicants Processed in Descending Program Rank Number Sequence)

Applicants ascending; singles and couples intermixed
  Match result          same                9 applicants improved, 3 applicants worse^b   same
  Loops detected        17                  148                                           62
Applicants ascending; couples last
  Match result          same                9 applicants improved, 3 applicants worse^b   same
  Loops detected        2                   0                                             0

^a This is the base result to which others are compared.
^b In part C, the results for the two experiments for 1994 (couples intermixed and couples last) were the same. In both cases, the differences in the results in part C as compared to the baseline results in part A were caused by chains resulting from two applicants doing worse in part C when compared to part A.

TABLE A3—RESULTS WITH INPUT QUOTAS SET TO FINAL QUOTAS AND REVERSION PROCESSING ELIMINATED, COMPARED TO INITIAL RESULTS WITH APPLICANT-PROPOSING ALGORITHM

Result            1993    1994    1995^a
Programs
  Improve         none    none    none
  Do worse        none    2       2
Applicants
  Improve         none    2       2
  Do worse        none    none    none

^a Subsequently, 1987 and 1996 were also examined, with no applicants affected in 1987 and a single chain of nine affected in 1996.

APPENDIX B: RESULTS OF THE COMPUTATIONAL EXPERIMENTS CONCERNED WITH TRUNCATION OF ROLS AND CAPACITY REDUCTIONS
The results of truncation at the match point are reported in Table B1. Table B2 shows the results for iterative truncations of applicant ROLs, while Table B3 shows the corresponding results for iterative truncations of program ROLs.

TABLE B1—TRUNCATIONS AT THE MATCH POINT

Difference in Result for Both the Preexisting NRMP Algorithm and the Applicant-Proposing Algorithm When Applicant ROLs Are Truncated at the Match Point:
  1993: none
  1994: 2 applicants improve, same positions filled
  1995: 2 applicants improve, same positions filled

Difference in Result for the Preexisting NRMP Algorithm When Program ROLs Are Truncated at the Match Point:
  1993: none
  1994: none
  1995: 2 applicants do worse, same positions filled

Difference in Result for the Applicant-Proposing Algorithm When Program ROLs Are Truncated at the Match Point:
  1993: none
  1994: 3 applicants do worse, same number of positions filled, but not same positions (3 programs filled 1 less position; 1 program filled 1 more position; 1 program filled 2 more positions; 1 additional position was reverted from one program to another)
  1995: none

TABLE B2—RESULTS FOR ITERATIVE TRUNCATIONS OF APPLICANT ROLS

     1987                            1993                            1994                            1995                            1996
     Original NRMP   Applicant-      Original NRMP   Applicant-      Original NRMP   Applicant-      Original NRMP   Applicant-      Original NRMP   Applicant-
     algorithm       proposing       algorithm       proposing       algorithm       proposing       algorithm       proposing       algorithm       proposing
Run  T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I
1    16,117  4,324   16,116  4,317   17,209  4,546   17,209  4,536   17,725  4,935   17,725  4,934   18,170  5,763   18,170  5,758   18,316  5,805   18,317  5,806
2    4,324   1,894   4,317   1,887   4,546   2,093   4,536   2,082   4,935   2,361   4,934   2,359   5,763   2,907   5,758   2,899   5,805   2,915   5,806   2,917
3    1,894   898     1,887   891     2,093   1,036   2,082   1,023   2,361   1,185   2,359   1,183   2,907   1,572   2,899   1,559   2,915   1,569   2,917   1,571
4    898     437     891     429     1,036   514     1,023   498     1,185   602     1,183   598     1,572   857     1,559   844     1,569   861     1,571   864
5    437     203     429     194     514     258     498     241     602     292     598     287     857     473     844     460     861     481     864     482
6    203     93      194     84      258     135     241     116     292     151     287     143     473     251     460     238     481     271     482     271
7    93      41      84      31      135     73      116     52      151     75      143     66      251     136     238     124     271     157     271     155
8    41      24      31      13      73      48      52      25      75      40      66      31      136     79      124     67      157     89      155     87
9    24      18      13      6       48      34      25      12      40      27      31      17      79      45      67      31      89      57      87      55
10   18      14      6       2       34      27      12      5       27      18      17      7       45      31      31      17      57      36      55      33
11   14      12      2       0       27      24      5       2       18      14      7       3       31      22      17      8       36      24      33      21
12   12      12      —       —       24      22      2       0       14      13      3       2       22      18      8       4       24      19      21      15
13   —       —       —       —       22      22      —       —       13      13      2       2       18      16      4       2       19      15      15      13
14   —       —       —       —       —       —       —       —       —       —       —       —       16      16      2       2       15      14      13      12
15   —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       14      13      12      11
16   —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       13      12      11      10
17   —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       12      11      10      9
18   —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       11      11      9       9

Notes: Columns labeled “T” report the number of matches involving truncated ROLs. Columns labeled “T&I” report the number of matches involving truncated ROLs and “improved” matches.
TABLE B3—RESULTS FOR ITERATIVE TRUNCATIONS OF PROGRAM ROLS

     1987                            1993                            1994                            1995                            1996
     Original NRMP   Applicant-      Original NRMP   Applicant-      Original NRMP   Applicant-      Original NRMP   Applicant-      Original NRMP   Applicant-
     algorithm       proposing       algorithm       proposing       algorithm       proposing       algorithm       proposing       algorithm       proposing
Run  T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I     T       T&I
1    2,967   1,345   2,967   1,349   3,342   1,457   3,342   1,462   3,369   1,514   3,369   1,517   3,444   1,538   3,444   1,541   3,410   1,445   3,410   1,444
2    1,345   670     1,349   675     1,457   740     1,462   748     1,514   809     1,517   813     1,538   783     1,541   790     1,445   727     1,444   725
3    670     347     675     353     740     382     748     394     809     441     813     444     783     420     790     431     727     384     725     384
4    347     186     353     194     382     201     394     216     441     249     444     255     420     237     431     248     384     213     384     212
5    186     100     194     110     201     107     216     122     249     138     255     145     237     130     248     141     213     114     212     115
6    100     55      110     66      107     64      122     79      138     79      145     86      130     77      141     89      114     71      115     72
7    55      33      66      44      64      37      79      52      79      44      86      52      77      50      89      62      71      50      72      52
8    33      21      44      33      37      22      52      37      44      31      52      39      50      35      62      47      50      35      52      39
9    21      17      33      29      22      15      37      30      31      23      39      32      35      29      47      41      35      26      39      30
10   17      15      29      27      15      13      30      28      23      20      32      30      29      26      41      38      26      21      30      25
11   15      15      27      27      13      12      28      28      20      18      30      29      26      24      38      36      21      19      25      23
12   —       —       —       —       12      12      —       —       18      16      29      28      24      23      36      36      19      18      23      22
13   —       —       —       —       —       —       —       —       16      15      28      27      23      23      —       —       18      17      22      21
14   —       —       —       —       —       —       —       —       15      15      27      27      —       —       —       —       17      16      21      20
15   —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       16      15      20      19
16   —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       15      14      19      18
17   —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       —       14      14      18      18

Notes: Columns labeled “T” report the number of matches involving truncated ROLs. Columns labeled “T&I” report the number of matches involving truncated ROLs and “improved” matches.

APPENDIX C: FORMAL DEFINITIONS OF STABILITY

Simple Matching Markets
For markets without linkages between positions
we use the “college admissions” model as reformulated in Roth (1985) and Roth and Sotomayor
(1990 Ch. 5). There are two finite and disjoint sets,
$F = \{f_1, \ldots, f_n\}$ and $W = \{w_1, \ldots, w_m\}$, of firms
and workers. For each firm $f$ in $F$, there is a
positive integer $q_f$, which indicates the number of
(identical) positions $f$ has to offer.
An outcome is a matching of workers to
ﬁrms, such that each worker is matched to at
most one ﬁrm, and each ﬁrm is matched to at
most its quota of workers. It will be convenient
to denote a ﬁrm that has some number of unﬁlled positions as matched to itself in each of
those positions, and similarly an unmatched
worker will be matched to herself. To give a
formal deﬁnition, deﬁne for any set X an unordered family of elements of X to be a collection
of elements, not necessarily distinct, in which
the order is immaterial.
A matching $\mu$ is a function from the set $F \cup W$ into the set of unordered families of elements of $F \cup W$ such that:
(i) $|\mu(w)| = 1$ for every worker $w$ and $\mu(w) = w$ if $\mu(w) \notin F$;
(ii) $|\mu(f)| = q_f$ for every firm $f$, and if the number of workers in $\mu(f)$, say $r$, is less than $q_f$, then $\mu(f)$ contains $q_f - r$ copies of $f$;
(iii) $\mu(w) = f$ if and only if $w$ is in $\mu(f)$.
Each worker has preferences over the firms
(and the possibility of remaining unmatched in
the market), and each firm has preferences over
the workers (and the possibility of leaving a
position unfilled). All preferences are transitive
and strict (recall that, in the markets we consider, participants are obliged to submit rank
orders which are necessarily strict). We will
write $f_i >_w f_j$ to indicate that worker $w$ prefers
$f_i$ to $f_j$. Similarly, $w_i >_f w_j$ represents firm $f$’s
preferences $P(f)$ over individual workers. Firm $f$
is acceptable to worker $w$ if $f >_w w$, and worker $w$
is acceptable to firm $f$ if $w >_f f$ (i.e., an acceptable
firm is one that is preferable to being unmatched,
and an acceptable worker is one which the firm
prefers to leaving a position unfilled).

Each worker’s preferences over alternative
matchings correspond exactly to her preferences
over her own assignments at the two matchings.
Things are not quite so simple for ﬁrms, because
even though we have described ﬁrms’ preferences
over workers, each ﬁrm with a quota greater than
1 must be able to compare groups of workers in
order to compare alternative matchings. It will be
sufﬁcient for our purposes to assume merely that a
ﬁrm’s preferences over groups of employees it
could be matched with (i.e., over groups of not
more than qf workers) are such that, for any two
assignments that differ in only one worker, it
prefers the assignment containing the more-preferred worker (and is indifferent between them
if it is indifferent between the workers). Any preferences of this sort are called responsive to the
ﬁrm’s preferences over individual workers (Roth,
1985).
A matching $\mu$ is individually irrational if
$\mu(w) = f$ for some worker $w$ and firm $f$ such that
either the worker is unacceptable to the firm or the
firm is unacceptable to the worker. Such a matching will also be said to be blocked by the unhappy
agent. A firm $f$ and worker $w$ will be said together
to block a matching $\mu$ if they are not matched to
one another at $\mu$ but would both prefer to be
matched to one another than to (one of) their
present assignments. That is, $\mu$ is blocked by the
firm–worker pair $(f, w)$ if $\mu(w) \neq f$ and if
$f >_w \mu(w)$ and $w >_f \sigma$ for some $\sigma$ in $\mu(f)$. (Note
that $\sigma$ may equal either some worker $w'$ in $\mu(f)$,
or, if one of firm $f$’s positions is unfilled at $\mu(f)$,
$\sigma$ may equal $f$.) Matchings blocked in this way by
an individual or by a pair of agents are unstable in
the sense that there are agents with both the incentive (because preferences are responsive) and
the power (under rules that allow any firm and
worker to conclude an agreement with each other)
to disrupt such matchings. We can now define a
matching to be stable if it is not blocked by any
individual or any firm–worker pair.18
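The stability definition for the simple market can be checked mechanically. The following is a minimal sketch, assuming preferences are given as ROLs (lists of acceptable partners, most preferred first) and a matching as a worker-to-firm dictionary; the function names and the three-worker example are hypothetical.

```python
def individually_rational(matching, firm_rols, worker_rols, quotas):
    """True if no agent is matched to an unacceptable partner or over quota."""
    counts = {}
    for w, f in matching.items():
        if f is None:
            continue                          # unmatched workers never block
        if f not in worker_rols[w] or w not in firm_rols[f]:
            return False
        counts[f] = counts.get(f, 0) + 1
    return all(counts.get(f, 0) <= quotas[f] for f in firm_rols)


def blocking_pairs(matching, firm_rols, worker_rols, quotas):
    """All (firm, worker) pairs that block an individually rational matching."""
    assigned = {f: [] for f in firm_rols}
    for w, f in matching.items():
        if f is not None:
            assigned[f].append(w)
    pairs = []
    for w, rol in worker_rols.items():
        current = matching.get(w)
        # Firms that w strictly prefers to its current assignment.
        better = rol if current is None else rol[:rol.index(current)]
        for f in better:
            f_rol = firm_rols[f]
            if w not in f_rol:
                continue                      # w is unacceptable to f
            if len(assigned[f]) < quotas[f]:
                pairs.append((f, w))          # f prefers w to an unfilled position
            elif any(f_rol.index(w) < f_rol.index(v) for v in assigned[f]):
                pairs.append((f, w))          # f prefers w to some current worker
    return pairs


def is_stable(matching, firm_rols, worker_rols, quotas):
    return (individually_rational(matching, firm_rols, worker_rols, quotas)
            and not blocking_pairs(matching, firm_rols, worker_rols, quotas))


# Hypothetical example: f1 has two positions, f2 has one.
firm_rols = {"f1": ["w1", "w2", "w3"], "f2": ["w1", "w2", "w3"]}
worker_rols = {w: ["f1", "f2"] for w in ("w1", "w2", "w3")}
quotas = {"f1": 2, "f2": 1}
stable = {"w1": "f1", "w2": "f1", "w3": "f2"}
unstable = {"w1": "f1", "w3": "f1", "w2": "f2"}
```

In the second matching, worker w2 prefers f1 to its assignment and f1 prefers w2 to w3, so the pair (f1, w2) blocks it.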
18 This definition of stability appears to account only for coalitions of size 1 or 2, but in fact it accounts for coalitions of any size (i.e., stable matchings are in the core; see Roth and Sotomayor [1990]).

Complex Matches

In the medical markets served by the NRMP,
the employers are residency programs, and the
workers are physicians applying to those programs. The simple model of the previous section does not allow for the variety of matching
requirements observed in the medical market,
for which purpose we will have to distinguish
between different kinds of applicants and different kinds of residency programs.
Let the set of applicants be $A = A_1 \cup A_2 \cup C$, where $A_1$ is the set of (single) applicants
who seek no more than one position, $A_2$ is the
set of applicants who may want two jobs, and
who submit supplemental lists of first-year jobs
in connection with any second-year position on
their ROLs that requires a complementary first-year position (and does not come with one automatically), and $C$ is the set of couples, who
submit a single ROL listing pairs of positions. A
member of $C$ is a couple $\{a_i, a_j\}$ such that $a_i$ is
in the set $A_3$ (of husbands) and $a_j$ is in the set
$A_4$, and the sets $A_1$, $A_2$, $A_3$, and $A_4$ are sets
of applicants, who together make up the entire
population of individual applicants, which will
be denoted $A' = A_1 \cup A_2 \cup A_3 \cup A_4$. (The
$A_i$ may not be disjoint, since members of a
couple may also submit supplemental lists.) The
reason for denoting the set of applicants both as
$A$ and as $A'$ is that, from the point of view of
a potential employer, the members of a couple
$c = \{a_i, a_j\}$ are two distinct applicants who
seek distinct positions (typically in different
residency programs), while from the point of
view of the couple they are one agent with
preferences over pairs of positions.
The set of residency programs is $R = \{r_1, \ldots, r_n\}$, and associated with each program
$r$ is a positive integer $q_r$ indicating how many
positions it seeks to fill. However, for some
programs $r$, $q_r$ may not be a constant at every
point in the matching process. There are two
reasons why $q_r$ may change. A residency program $r$ may have an agreement with another
residency program $r'$ (typically within the same
hospital) that if $r$ can only fill $k < q_r$ positions,
the remaining $q_r - k$ positions will be added to
the capacity of $r'$. In such a situation, the algorithm will change $q_r$ to $k$ and $q_{r'}$ to $q_{r'} + (q_r - k)$. (It can also happen that the $q_r - k$ unfilled
positions revert to more than one other residency program, and so the total number of
positions need not remain constant, and different positions from a given program may revert
to different programs.) The other reason why quotas may vary is that some residency programs wish to have an even number of residents, so a residency program $r$ with quota $q$
may have its quota reduced to $q' = q - 2$ in
the event that it can only be matched to a
maximum of $q - 1$ residents. (These quota
adjustments take place after an initial attempt to
make a stable match, and they cause the matching algorithm to continue from the current
match; in what follows, discussion of stability
will refer to the current quota of a program $r$ at
any point in the algorithm, except as indicated.)
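The two quota adjustments can be written out as simple arithmetic. This is a minimal sketch under stated assumptions: each donor program reverts all of its unfilled positions to a single recipient, no recipient is itself a donor, and the function names `revert_unfilled` and `even_quota` are hypothetical rather than part of the NRMP algorithm.

```python
def revert_unfilled(quotas, filled, reversions):
    """Apply reversions of the form donor -> recipient.

    quotas:     {program: current quota q_r}
    filled:     {program: number of positions k the program could fill}
    reversions: {donor program r: recipient program r'}
    Returns a new quota dictionary; the inputs are not modified.
    """
    new_quotas = dict(quotas)
    for donor, recipient in reversions.items():
        k = filled[donor]
        unfilled = quotas[donor] - k          # q_r - k
        if unfilled > 0:
            new_quotas[donor] = k             # q_r becomes k
            new_quotas[recipient] += unfilled # q_r' becomes q_r' + (q_r - k)
    return new_quotas


def even_quota(quota, max_matchable):
    """Quota of a program that wants an even number of residents."""
    if max_matchable == quota - 1:
        return quota - 2                      # drop the odd position as well
    return quota


# Hypothetical example: r fills 3 of 5 positions and reverts the rest to r2.
adjusted = revert_unfilled({"r": 5, "r2": 3}, {"r": 3, "r2": 3}, {"r": "r2"})
```

In this single-recipient case the total number of positions is unchanged (5 + 3 before, 3 + 5 after); as the text notes, that need not hold when positions revert to more than one program.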
Applicants in the set $A_1$ submit ROLs over
residency programs and hence have preferences
just like the workers in the simple model discussed earlier. Applicants in the set $A_2$ have on
their ROLs at least one second-year program
that requires (but does not supply) first-year
training as well, and these applicants submit a
supplemental ROL for each such position, indicating their preferences for first-year positions,
conditional on being matched to a given second-year position. Each couple $c = \{a_i, a_j\}$ in the
set $C$ submits, as a single ROL, a ranked list of
ordered pairs of positions [i.e., an ordered list of
elements of $R \times R$ whose first element is some
$(r_i, r_j)$ which is the couple’s first-choice pair of
positions for $a_i$ and $a_j$, respectively, and so
forth]. Each residency program submits as its
ROL an ordered list of members of $A'$ (i.e., of
individual applicants, whether or not they are
members of a couple).
Having thus defined the form in which different kinds of agents state their preferences, we
can now define stable matchings. A matching $\mu$
with range $R \cup A'$ is defined as in the simple
market, except that for an applicant $a$ in the set
$A_2$ it may be that $|\mu(a)| = 1$ or $2$ if $\mu(a)$
matches $a$ to a program for which it has submitted a supplemental ROL. In case $|\mu(a)| = 2$
we will write $\mu(a) = (r_1, r_2)$, where $r_1$ is the
(second-year) residency program on $a$’s primary ROL, and $r_2$ is the (first-year) residency
program on $a$’s supplemental ROL when $a$ is
matched with $r_1$. (When $|\mu(a)| = 1$ it must be
that $r = \mu(a)$ is on $a$’s primary ROL.)
As in the case of the simple market considered
earlier, we will say that a matching is stable if it is
not blocked by any individual agent or by a pair of
agents consisting of an individual and a residency
program, or by a couple together with one or two
residency programs.

A matching $\mu$ is blocked by an individual applicant (in the set $A_1$ or $A_2$), or by a residency
program, if $\mu$ matches that agent to some individual or residency program not on its ROL, precisely as in the simple model. A matching is
blocked by an individual couple $\{a_i, a_j\}$ if they are
matched to a pair $(r_i, r_j)$ not on their ROL. Of
course no individual or couple blocks a matching
at which the individual or couple is unmatched.
A residency program $r$ and an applicant $a$ in
the set $A_1$ together block a matching $\mu$
precisely as in the simple market, if they are not
matched to one another and would both prefer
to be. A residency program $r$ and an applicant $a$
in the set $A_2$ together block a matching $\mu$ if $r$
prefers $a$ to one of its matches under $\mu$ [i.e.,
$a >_r a_j$ for some $a_j$ in $\mu(r)$], and if either
$r >_a r_1 \in \mu(a)$ where the preferences $>_a$
correspond to $a$’s primary ROL, or $r >_a r_2 \in \mu(a)$ where $>_a$ corresponds to $a$’s supplemental ROL for the position $r_1 \in \mu(a)$.
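A couple $c = \{a_1, a_2\}$ and residency programs
$r$ and $r'$ block a matching $\mu$ if $(r, r') >_c \mu(c)$ and
if either:
(i) $a_1 \notin \mu(r)$, $a_1 >_r a_i$ for some $a_i \in \mu(r)$, and either $a_2 \in \mu(r')$ or $a_2 >_{r'} a_j$ for some $a_j \in \mu(r')$; or
(ii) $a_2 \notin \mu(r')$, $a_2 >_{r'} a_j$ for some $a_j \in \mu(r')$, and either $a_1 \in \mu(r)$ or $a_1 >_r a_i$ for some $a_i \in \mu(r)$.

REFERENCES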
Aldershof, Brian and Carducci, Olivia M. “Stable Matchings with Couples.” Discrete Applied
Mathematics, August 1996, 68(1–2), pp.
203– 07.
AMAMSS. “Resolutions for the 1995 Interim
Meeting.” [Online: http://www.bcm.tmc.edu/
amamss/i95res.htm#11 ], 1995.
American Medical Students’ Association and
Public Citizen Health Research Group. “Re port on Hospital Bias in the NRMP.” [Online:
http://pubweb.acns.nwu.edu/ alan/nrmp2.
html ], 1995.
Ausubel, Lawrence M.; Cramton, Peter;
McAfee, R. Preston and McMillan, John. “Synergies in Wireless Telephony: Evidence from the Broadband PCS Auctions.”
Journal of Economics and Management SEPTEMBER 1999 Strategy (Special Issue: Market Design and
the Spectrum Auctions), Fall 1997, 6(3), pp.
497–527.
Blum, Yossi; Roth, Alvin E. and Rothblum,
Uriel G. “Vacancy Chains and Equilibration in SeniorLevel Labor Markets.” Journal of
Economic Theory, October 1997, 76(2), pp.
362– 411.
Blum, Yosef and Rothblum, Uriel G. “ ‘Timing Is
Everything’ and Marital Bliss.” Journal of
Economic Theory, 1999 (forthcoming).
Colenbrander, August. “Match Algorithms Revisited.” Academic Medicine, May 1996,
71(5), pp. 414 –15.
Cramton, Peter. “The FCC Spectrum Auctions:
An Early Assessment.” Journal of Economics
and Management Strategy (Special Issue:
Market Design and the Spectrum Auctions),
Fall 1997, 6(3), pp. 431–95.
Demange, Gabrielle; Gale, David and Sotomayor,
Marilda. “A Further Note on the Stable Matching Problem.” Discrete Applied Mathematics, 1987, 16, pp. 217–22.
Feldin, Aljosa. “Core Convergence in TwoSided Matching Markets: Some Theoretical
Considerations.” Mimeo, University of Pittsburgh, 1999.
Gale, David and Shapley, Lloyd. “College Admissions and the Stability of Marriage.”
American Mathematical Monthly, January
1962, 69(1), pp. 9 –15.
Kagel, John H. and Roth, Alvin E. “The Dynamics of Reorganization in Matching Markets:
A Laboratory Experiment Motivated by a
Natural Experiment.” Quarterly Journal of
Economics, 2000 (forthcoming).
Ledyard, John O.; Porter, David and Rangel, Antonio. “Experiments Testing Multiobject Al location Mechanisms.” Journal of Economics
and Management Strategy (Special Issue:
Market Design and the Spectrum Auctions),
Fall 1997, 6(3), pp. 639 –75.
McAfee, R. Preston and McMillan, John. “Analyzing the Airwaves Auction.” Journal of
Economic Perspectives, Winter 1996, 10(1),
pp. 159 –75.
McMillan, John. “Selling Spectrum Rights.”
Journal of Economic Perspectives, Summer
1994, 8(3), pp. 145– 62.
. “Why Auction the Spectrum?” Telecommunications Policy, April 1995, 19, pp.
191–99. VOL. 89 NO. 4 ROTH AND PERANSON: MATCHING MARKET FOR PHYSICIANS Milgrom, Paul. “Auction Theory in Practice: The Simultaneous Ascending Auction.”
Mimeo, Stanford University, 1997.
Peranson, E. and Randlett, R. R. “The NRMP
Matching Algorithm Revisited: Theory versus Practice.” Academic Medicine, June
1995a, 70(6), pp. 477– 84.
. “Comments on Williams’ ‘A Reexamination of the NRMP Matching Algorithm’.”
Academic Medicine, June 1995b, 70(6), pp.
490 –94.
Plott, Charles R. “Laboratory Experimental
Testbeds: Application to the PCS Auction.”
Journal of Economics and Management
Strategy (Special Issue: Market Design and
the Spectrum Auctions), Fall 1997, 6(3), pp.
605–38.
Roth, Alvin E. “The Economics of Matching:
Stability and Incentives.” Mathematics of
Operations Research, 1982, 7, pp. 617–28.
. “The Evolution of the Labor Market
for Medical Interns and Residents: A Case
Study in Game Theory.” Journal of Political
Economy, December 1984, 92(6), pp. 991–
1016.
. “The College Admissions Problem Is
Not Equivalent to the Marriage Problem.”
Journal of Economic Theory, August 1985,
36(2), pp. 277– 88.
. “On the Allocation of Residents to
Rural Hospitals: A General Property of TwoSided Matching Markets.” Econometrica,
March 1986, 54(2), pp. 425–27.
. “New Physicians: A Natural Experiment in Market Organization.” Science, December 14, 1990, 250, pp. 1524 –28.
. “A Natural Experiment in the Organization of EntryLevel Labor Markets: Regional Markets for New Physicians and
Surgeons in the United Kingdom.” American
Economic Review, June 1991, 81(3), pp.
415– 40.
. “Proposed Research Program: Evaluation of Changes to Be Considered in
the NRMP Algorithm.” Consultant’s report
to the National Resident Matching Program
[Online: http://www.economics.harvard.edu/
alroth/nrmp.html ], 1995.
. “Interim Report No. 1: Evaluation of the
Current NRMP Algorithm, and Preliminary
Design of an Applicant-Proposing Algorithm.”
Consultant’s report to the National Resident Matching Program [Online: http://www.
economics.harvard.edu/ alroth/nrmp.html ],
March 1996a.
. “The NRMP as a Labor Market.”
Journal of the American Medical Association, April 3, 1996b, 275(13), pp. 1054–56.
Roth, Alvin E. and Peranson, Elliott. “The Effects
of the Change in the NRMP Matching Algorithm.” Journal of the American Medical Association, September 3, 1997, 278(9), pp.
729–32.
Roth, Alvin E. and Rothblum, Uriel G. “Truncation
Strategies in Matching Markets: In Search of
Practical Advice for Participants.” Econometrica, January 1999, 67(1), pp. 21–43.
Roth, Alvin E. and Sotomayor, Marilda. “The
College Admissions Problem Revisited.”
Econometrica, May 1989, 57(3), pp. 559–70.
. Two-sided matching: A study in game-theoretic modeling and analysis. Econometric Society Monograph Series. Cambridge:
Cambridge University Press, 1990.
Roth, Alvin E. and Vande Vate, John H. “Random Paths to Stability in Two-Sided Matching.” Econometrica, November 1990, 58(6),
pp. 1475–80.
. “Incentives in Two-Sided Matching
with Random Stable Mechanisms.” Economic
Theory, January 1991, 1(1), pp. 31–44.
Roth, Alvin E. and Xing, Xiaolin. “Jumping the
Gun: Imperfections and Institutions Related
to the Timing of Market Transactions.”
American Economic Review, September
1994, 84(4), pp. 992–1044.
Salant, David J. “Up in the Air: GTE’s Experience in the MTA Auction for Personal Communication Services Licenses.” Journal of
Economics and Management Strategy (Special Issue: Market Design and the Spectrum
Auctions), Fall 1997, 6(3), pp. 549–72.
Shiller, Robert J. Macro markets: Creating institutions for managing society’s largest economic risks. Clarendon Lectures in
Economics. Oxford: Oxford University
Press, 1993.
Sönmez, Tayfun. “Manipulation via Capacities
in Two-Sided Matching Markets.” Journal of
Economic Theory, November 1997, 77(1),
pp. 197–204.
. “Can Prearranged Matches Be
Avoided in Two-Sided Matching Markets?”
Journal of Economic Theory, May 1999, 86(1), pp. 148–56.
U.S. District Court for the Western District of
Missouri, Western Division. United States of America v. Association of Family Practice
Residency Directors, Final judgment. May
28, 1996.
Williams, K. J. “A Reexamination of the NRMP
Matching Algorithm.” Academic Medicine, June 1995a, 70(6), pp. 470–76.
. “Comments on Peranson and
Randlett’s ‘The NRMP Matching Algorithm Revisited: Theory versus Practice’.”
Academic Medicine, June 1995b, 70(6), pp.
485–89.
Wilson, Robert B. Nonlinear pricing. Oxford:
Oxford University Press, 1993.