Population Games and Evolutionary Dynamics

William H. Sandholm

April 29, 2008

[Figure: A chaotic attractor of the replicator dynamic]

CONTENTS

0 Introduction

Part I: Population Games

1 Population Games
1.0 Introduction
1.1 Population Games
    1.1.1 Populations, Strategies, and States
    1.1.2 Payoffs
    1.1.3 Best Responses and Nash Equilibria
    1.1.4 Prelude to Evolutionary Dynamics
1.2 Examples
    1.2.1 Random Matching in Normal Form Games
    1.2.2 Congestion Games
    1.2.3 Two Simple Externality Models
1.3 The Geometry of Population Games and Nash Equilibria
    1.3.1 Drawing Two-Strategy Games
    1.3.2 Displacement Vectors and Tangent Spaces
    1.3.3 Orthogonal Projections
    1.3.4 Drawing Three-Strategy Games
    1.3.5 Tangent Cones and Normal Cones
    1.3.6 Normal Cones and Nash Equilibria
1.A Affine Spaces, Tangent Spaces, and Orthogonal Projections
    1.A.1 Affine Spaces
    1.A.2 Affine Hulls of Convex Sets
    1.A.3 Orthogonal Projections
1.B The Moreau Decomposition Theorem
1.N Notes
2 Potential Games, Stable Games, and Supermodular Games
2.0 Introduction
2.1 Full Potential Games
    2.1.1 Full Population Games
    2.1.2 Definition and Characterization
    2.1.3 Examples
    2.1.4 Nash Equilibria of Full Potential Games
    2.1.5 The Geometry of Nash Equilibrium in Full Potential Games
    2.1.6 Efficiency in Homogeneous Full Potential Games
    2.1.7 Inefficiency Bounds for Congestion Games
2.2 Potential Games
    2.2.1 Motivating Examples
    2.2.2 Definition, Characterizations, and Examples
    2.2.3 Potential Games and Full Potential Games
    2.2.4 Passive Games and Constant Games
2.3 Stable Games
    2.3.1 Definition
    2.3.2 Examples
    2.3.3 Invasion
    2.3.4 Global Neutral Stability and Global Evolutionary Stability
    2.3.5 Nash Equilibrium and Global Neutral Stability in Stable Games
2.4 Supermodular Games
    2.4.1 Definition
    2.4.2 Examples
    2.4.3 Best Response Monotonicity in Supermodular Games
    2.4.4 Nash Equilibria of Supermodular Games
2.A Multivariate Calculus
    2.A.1 Univariate Calculus
    2.A.2 The Derivative as a Linear Map
    2.A.3 Differentiation as a Linear Operation
    2.A.4 The Product Rule and the Chain Rule
    2.A.5 Homogeneity and Euler's Theorem
    2.A.6 Higher Order Derivatives
    2.A.7 The Whitney Extension Theorem
    2.A.8 Vector Integration and the Fundamental Theorem of Calculus
    2.A.9 Potential Functions and Integrability
2.B Affine Calculus
    2.B.1 Linear Forms and the Riesz Representation Theorem
    2.B.2 Dual Characterizations of Multiples of Linear Forms
    2.B.3 Derivatives of Functions on Affine Spaces
    2.B.4 Affine Integrability
2.N Notes
Part II: Deterministic Evolutionary Dynamics

3 Revision Protocols and Evolutionary Dynamics
3.0 Introduction
3.1 Revision Protocols and Mean Dynamics
    3.1.1 Revision Protocols
    3.1.2 Mean Dynamics
    3.1.3 Target Protocols and Target Dynamics
3.2 Examples
3.3 Evolutionary Dynamics
3.A Ordinary Differential Equations
    3.A.1 Basic Definitions
    3.A.2 Existence, Uniqueness, and Continuity of Solutions
    3.A.3 Ordinary Differential Equations on Compact Convex Sets
3.N Notes
4 Deterministic Dynamics: Families and Properties
4.0 Introduction
4.1 Principles for Evolutionary Modeling
4.2 Desiderata for Revision Protocols and Evolutionary Dynamics
    4.2.1 Limited Information
    4.2.2 Incentives and Aggregate Behavior
4.3 Families of Evolutionary Dynamics
4.4 Imitative Dynamics
    4.4.1 Definition
    4.4.2 Examples
    4.4.3 Biological Derivations of the Replicator Dynamic
    4.4.4 Extinction and Invariance
    4.4.5 Monotone Percentage Growth Rates and Positive Correlation
    4.4.6 Rest Points and Restricted Equilibria
4.5 Excess Payoff Dynamics
    4.5.1 Definition and Interpretation
    4.5.2 Incentives and Aggregate Behavior
4.6 Pairwise Comparison Dynamics
    4.6.1 Definition
    4.6.2 Incentives and Aggregate Behavior
    4.6.3 Desiderata Revisited
4.7 Multiple Revision Protocols and Combined Dynamics
4.N Notes
5 Best Response and Projection Dynamics
5.0 Introduction
5.1 The Best Response Dynamic
    5.1.1 Definition and Examples
    5.1.2 Construction and Properties of Solution Trajectories
    5.1.3 Incentive Properties
5.2 Perturbed Best Response Dynamics
    5.2.1 Revision Protocols and Mean Dynamics
    5.2.2 Perturbed Optimization: A Representation Theorem
    5.2.3 Logit Choice and the Logit Dynamic
    5.2.4 Perturbed Incentive Properties via Virtual Payoffs
5.3 The Projection Dynamic
    5.3.1 Definition
    5.3.2 Solution Trajectories
    5.3.3 Incentive Properties
    5.3.4 Revision Protocols and Connections with the Replicator Dynamic
5.A Differential Inclusions
    5.A.1 Basic Theory
    5.A.2 Differential Equations Defined by Projections
5.B The Legendre Transform
    5.B.1 Legendre Transforms of Functions on Open Intervals
    5.B.2 Legendre Transforms of Functions on Multidimensional Domains
5.C Perturbed Optimization
    5.C.1 Proof of the Representation Theorem
    5.C.2 Additional Results
5.N Notes
Part III: Convergence and Nonconvergence

6 Global Convergence of Evolutionary Dynamics
6.0 Introduction
6.1 Potential Games
    6.1.1 Potential Functions as Lyapunov Functions
    6.1.2 Gradient Systems for Potential Games
6.2 Stable Games
    6.2.1 The Projection and Replicator Dynamics in Strictly Stable Games
    6.2.2 Integrable Target Dynamics
    6.2.3 Impartial Pairwise Comparison Dynamics
    6.2.4 Summary
6.3 Supermodular Games
    6.3.1 The Best Response Dynamic in Two-Player Normal Form Games
    6.3.2 Stochastically Perturbed Best Response Dynamics
6.4 Dominance Solvable Games
    6.4.1 Dominated and Iteratively Dominated Strategies
    6.4.2 The Best Response Dynamic
    6.4.3 Imitative Dynamics
6.A Limit and Stability Notions for Deterministic Dynamics
    6.A.1 ω-Limits and Notions of Recurrence
    6.A.2 Stability of Sets of States
6.B Stability Analysis via Lyapunov Functions
    6.B.1 Lyapunov Stable Sets
    6.B.2 ω-Limits and Attracting Sets
    6.B.3 Asymptotically Stable and Globally Asymptotically Stable Sets
6.C Cooperative Differential Equations
6.N Notes
7 Local Stability under Evolutionary Dynamics
7.0 Introduction
7.1 Non-Nash Rest Points of Imitative Dynamics
7.2 Local Stability in Potential Games
7.3 Evolutionarily Stable States
    7.3.1 Definition
    7.3.2 Variations
    7.3.3 Regular ESS
7.4 Local Stability via Lyapunov Functions
    7.4.1 The Replicator and Projection Dynamics
    7.4.2 Target and Pairwise Comparison Dynamics: Interior ESS
    7.4.3 Target and Pairwise Comparison Dynamics: Boundary ESS
7.5 Linearization of Imitative Dynamics
    7.5.1 The Replicator Dynamic
    7.5.2 General Imitative Dynamics
7.6 Linearization of Perturbed Best Response Dynamics
    7.6.1 Deterministically Perturbed Best Response Dynamics
    7.6.2 The Logit Dynamic
7.A Matrix Analysis
    7.A.1 Rank and Invertibility
    7.A.2 Eigenvectors and Eigenvalues
    7.A.3 Similarity, (Block) Diagonalization, and the Spectral Theorem
    7.A.4 Symmetric Matrices
    7.A.5 The Real Jordan Canonical Form
    7.A.6 The Spectral Norm and Singular Values
    7.A.7 Hines's Lemma
7.B Linear Differential Equations
    7.B.1 Examples
    7.B.2 Solutions
    7.B.3 Stability and Hyperbolicity
7.C Linearization of Nonlinear Differential Equations
7.N Notes

8 Nonconvergence of Evolutionary Dynamics
8.0 Introduction
8.1 Conservative Properties of Evolutionary Dynamics
    8.1.1 Constants of Motion in Null Stable Games
    8.1.2 Preservation of Volume
8.2 Games with Nonconvergent Evolutionary Dynamics
    8.2.1 Circulant Games
    8.2.2 Continuation of Attractors for Parameterized Games
    8.2.3 Mismatching Pennies
    8.2.4 The Hypnodisk Game
8.3 Chaotic Evolutionary Dynamics
8.4 Survival of Dominated Strategies
8.A Three Classical Theorems on Nonconvergent Dynamics
    8.A.1 Liouville's Theorem
    8.A.2 The Poincaré–Bendixson and Bendixson–Dulac Theorems
8.B Attractors and Continuation
    8.B.1 Attractors and Repellors
    8.B.2 Continuation of Attractors
8.N Notes

Part IV: Stochastic Evolutionary Models

9 Stochastic Evolution and Deterministic Approximation
9.0 Introduction
9.1 The Markov Process
9.2 Finite Horizon Deterministic Approximation
    9.2.1 Kurtz's Theorem
    9.2.2 Deterministic Approximation of the Stochastic Evolutionary Process
9.3 Extensions
    9.3.1 Finite Population Effects
    9.3.2 Discrete Time Models
9.A The Exponential and Poisson Distributions
    9.A.1 Basic Properties
    9.A.2 The Poisson Limit Theorem
9.B Countable State Markov Processes
    9.B.1 Countable Probability Models
    9.B.2 Uncountable Probability Models and Measure Theory
    9.B.3 Distributional Properties and Sample Path Properties
    9.B.4 Countable State Markov Chains
    9.B.5 Countable State Markov Processes
9.C Kurtz's Theorem in Discrete Time
9.N Notes

10 Infinite Horizon Behavior and Equilibrium Selection
10.0 Introduction

Bibliography

Frequently Used Definitions
Classes of games

(2.8)  $\Pi F(x) = \nabla f(x)$ ($\equiv \Pi \nabla f(x)$)  — potential game
(2.14) $(y - x)'(F(y) - F(x)) \leq 0$  — stable game
(2.22) $\tilde{\Sigma} y \geq \tilde{\Sigma} x$ implies that $\tilde{\Sigma} F(y) \geq \tilde{\Sigma} F(x)$  — supermodular game

General equations for mean dynamics

(M)    $\dot{x}_i^p = \sum_{j \in S^p} x_j^p \, \rho_{ji}^p(F^p(x), x^p) - x_i^p \sum_{j \in S^p} \rho_{ij}^p(F^p(x), x^p)$  — mean dynamic
(3.3)  $\dot{x}_i^p = m^p \tau_i^p(F^p(x), x^p) - x_i^p \sum_{j \in S^p} \tau_j^p(F^p(x), x^p)$  — target dynamic
(3.5)  $\dot{x}^p = m^p \sigma^p(F^p(x), x^p) - x^p$  — exact target dynamic

Properties of evolutionary dynamics

(NS) $V_F(x) = 0$ if and only if $x \in NE(F)$  — Nash stationarity
(PC) $V_F(x)^p \neq 0$ implies that $V_F(x)^p{}' F^p(x) > 0$  — positive correlation

Six fundamental evolutionary dynamics

(R)    $\dot{x}_i^p = x_i^p \hat{F}_i^p(x)$  — replicator dynamic
(BNN)  $\dot{x}_i^p = m^p [\hat{F}_i^p(x)]_+ - x_i^p \sum_{j \in S^p} [\hat{F}_j^p(x)]_+$  — BNN dynamic
(S)    $\dot{x}_i^p = \sum_{j \in S^p} x_j^p [F_i^p(x) - F_j^p(x)]_+ - x_i^p \sum_{j \in S^p} [F_j^p(x) - F_i^p(x)]_+$  — Smith dynamic
(BR)   $\dot{x}^p \in m^p M^p(F^p(x)) - x^p$  — best response dynamic
(L)    $\dot{x}_i^p = m^p \dfrac{\exp(\eta^{-1} F_i^p(x))}{\sum_{j \in S^p} \exp(\eta^{-1} F_j^p(x))} - x_i^p$  — logit(η) dynamic
(P)    $\dot{x} = \Pi_{TX(x)}(F(x))$  — projection dynamic

CHAPTER ZERO
Introduction

Part I
Population Games

CHAPTER ONE
Population Games

1.0 Introduction

Population games are used to model strategic interactions with these five traits:
(i) The number of agents is large.
(ii) Individual agents are small: Any one agent's behavior has little or no effect on other agents' payoffs.
(iii) The number of roles is finite: Each agent is a member of one of a finite number of populations. Members of a population choose from the same set of strategies, and their payoffs are identical functions of own behavior and opponents' behavior.
(iv) Agents interact anonymously: Each agent's payoffs depend on opponents' behavior only through the distribution of opponents' choices.
(v) Payoffs are continuous: The dependence of each agent's payoffs on the distribution of opponents' choices is continuous.
Applications fitting this description can be found in a variety of disciplines, including economics (externalities, macroeconomic spillovers, centralized markets), biology (animal conflict, genetic natural selection), transportation science (highway network congestion, mode choice), and computer science (selfish routing of Internet traffic). Population games provide a unified framework for studying these and other topics, helping us to identify the forces that drive parallel conclusions in seemingly disparate fields.
The most convenient way to define population games is to assume that the set of agents forms a continuum, as doing so enables us to study these games using tools from analysis. Of course, real populations are finite. Still, the continuum assumption is appropriate when the effects of individuals' choices on opponents' payoffs are small, or, more generally, when individuals ignore these effects when deciding how to act. In subsequent chapters we will draw explicit links between the finite and continuous models.

1.1 Population Games

1.1.1 Populations, Strategies, and States

Let $\mathcal{P} = \{1, \ldots, \mathscr{p}\}$ be a society consisting of $\mathscr{p} \geq 1$ populations of agents. Agents in
population p form a continuum of mass $m^p > 0$. (Thus, $\mathscr{p}$ is the number of populations, while p is an arbitrary population.)

The set of strategies available to agents in population p is denoted $S^p = \{1, \ldots, n^p\}$, and has typical elements i, j, and (in the context of normal form games) $s^p$. We let $n = \sum_{p \in \mathcal{P}} n^p$ equal the total number of pure strategies in all populations.

During game play, each agent in population p selects a (pure) strategy from $S^p$. The set of population states (or strategy distributions) for population p is $X^p = \{x^p \in \mathbb{R}^{n^p}_+ : \sum_{i \in S^p} x_i^p = m^p\}$. The scalar $x_i^p \in \mathbb{R}_+$ represents the mass of players in population p choosing strategy $i \in S^p$. Elements of $X_v^p$, the set of vertices of $X^p$, are called pure population states, since at these states all agents choose the same strategy.

Elements of $X = \prod_{p \in \mathcal{P}} X^p = \{x = (x^1, \ldots, x^{\mathscr{p}}) \in \mathbb{R}^n_+ : x^p \in X^p\}$, the set of social states, describe behavior in all $\mathscr{p}$ populations at once. The elements of $X_v = \prod_{p \in \mathcal{P}} X_v^p$ are the vertices of X, and are called the pure social states.
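The notation above can be made concrete with a short numerical sketch. The society below — two populations with masses $m^1 = 1$ and $m^2 = 2$ and with 2 and 3 strategies respectively — is a hypothetical choice of ours, not from the text; the function simply checks the constraints defining the set of social states X.

```python
# Hypothetical two-population society: a social state x concatenates the
# two population states, so here x lives in R^5 (n = 2 + 3 strategies).
import numpy as np

masses = {1: 1.0, 2: 2.0}          # m^p for each population p
num_strategies = {1: 2, 2: 3}      # n^p for each population p

def is_social_state(x, masses, num_strategies, tol=1e-9):
    """Check that x = (x^1, x^2) is nonnegative and that each
    population state sums to its population's mass m^p."""
    if np.any(np.asarray(x) < -tol):
        return False
    offset = 0
    for p in sorted(masses):
        xp = x[offset:offset + num_strategies[p]]
        if abs(sum(xp) - masses[p]) > tol:
            return False
        offset += num_strategies[p]
    return True

x = [0.4, 0.6, 1.0, 0.5, 0.5]      # x^1 = (0.4, 0.6), x^2 = (1.0, 0.5, 0.5)
print(is_social_state(x, masses, num_strategies))                          # True
print(is_social_state([1.0, 0.6, 1.0, 0.5, 0.5], masses, num_strategies))  # False
```

The second call fails because $x^1$ sums to 1.6 rather than $m^1 = 1$.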
When there is just one population ($\mathscr{p} = 1$), we assume that its mass is 1, and we omit the superscript p from all of our notation: thus, the strategy set is $S = \{1, \ldots, n\}$, the state space is $X = \{x \in \mathbb{R}^n_+ : \sum_{i \in S} x_i = 1\}$, the simplex in $\mathbb{R}^n$, and the set of pure states $X_v = \{e_i : i \in S\}$ is the set of standard basis vectors in $\mathbb{R}^n$.

1.1.2 Payoffs

We generally take the sets of populations and strategies as fixed and identify a game with its payoff function. A payoff function $F \colon X \to \mathbb{R}^n$ is a continuous map that assigns each social state a vector of payoffs, one for each strategy in each population. $F_i^p \colon X \to \mathbb{R}$ denotes the payoff function for strategy $i \in S^p$, while $F^p \colon X \to \mathbb{R}^{n^p}$ denotes the payoff functions for all strategies in $S^p$.
While our standing assumption is that F is continuous, we often impose the stronger requirements that F be Lipschitz continuous or continuously differentiable ($C^1$). These additional assumptions will be made explicit whenever we use them.

We define
$$\bar{F}^p(x) = \frac{1}{m^p} \sum_{i \in S^p} x_i^p F_i^p(x)$$
to be the (weighted) average payoff obtained by members of population p at social state x. Similarly, we let
$$\bar{F}(x) = \sum_{p \in \mathcal{P}} \sum_{i \in S^p} x_i^p F_i^p(x) = \sum_{p \in \mathcal{P}} m^p \bar{F}^p(x)$$
denote the aggregate payoff achieved by the society as a whole.

1.1.3 Best Responses and Nash Equilibria

To describe optimal behavior, we define population p's pure best response correspondence,
$b^p \colon X \rightrightarrows S^p$, which specifies the strategies in $S^p$ that are optimal at each social state x:
$$b^p(x) = \operatorname*{argmax}_{i \in S^p} F_i^p(x).$$
Let $\Delta^p = \{y^p \in \mathbb{R}^{n^p}_+ : \sum_{i \in S^p} y_i^p = 1\}$ denote the simplex in $\mathbb{R}^{n^p}$. The mixed best response correspondence for population p, $B^p \colon X \rightrightarrows \Delta^p$, is given by
$$B^p(x) = \{y^p \in \Delta^p : y_i^p > 0 \Rightarrow i \in b^p(x)\}.$$
In words, $B^p(x)$ is the set of probability distributions in $\Delta^p$ whose supports only contain pure strategies that are optimal at x. Geometrically, $B^p(x)$ is the convex hull of the vertices of $\Delta^p$ corresponding to elements of $b^p(x)$.
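A small numerical illustration of these correspondences (the payoff vector below is a hypothetical example of ours, not from the text): $b^p(x)$ collects the indices attaining the maximum of $F^p(x)$, and a mixed strategy lies in $B^p(x)$ exactly when its support is contained in that set.

```python
# Illustrative sketch of pure and mixed best responses at a fixed state x,
# given the payoff vector F^p(x) for one population.
import numpy as np

def pure_best_responses(payoffs, tol=1e-9):
    """b^p(x): indices of maximal entries of the payoff vector F^p(x)."""
    payoffs = np.asarray(payoffs, dtype=float)
    return [i for i, v in enumerate(payoffs) if v >= payoffs.max() - tol]

def in_mixed_best_response(y, payoffs, tol=1e-9):
    """y is in B^p(x) iff its support contains only pure best responses."""
    support = {i for i, w in enumerate(y) if w > tol}
    return support <= set(pure_best_responses(payoffs, tol))

F = [2.0, 3.0, 3.0]                                # hypothetical F^p(x)
print(pure_best_responses(F))                      # [1, 2]
print(in_mixed_best_response([0.0, 0.5, 0.5], F))  # True
print(in_mixed_best_response([0.2, 0.8, 0.0], F))  # False
```

The last call returns False because the support of (0.2, 0.8, 0.0) includes strategy 0, which is suboptimal at this state.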
Social state $x \in X$ is a Nash equilibrium of the game F if each agent in every population chooses a best response to x:
$$NE(F) = \{x \in X : x^p \in m^p B^p(x) \text{ for all } p \in \mathcal{P}\}.$$
We will see in Section 1.3.6 that the Nash equilibria of a population game can also be characterized in a purely geometric way.

Nash equilibria always exist:

Theorem 1.1.1. Every population game admits at least one Nash equilibrium.

Theorem 1.1.1 can be proved by applying Kakutani's Theorem to the profile of best response correspondences. But we will see that in each of the three classes of games we focus on in Chapter 2—potential games (Sections 2.1 and 2.2), stable games (Section 2.3), and supermodular games (Section 2.4)—existence of Nash equilibrium can be established without recourse to fixed point theorems.

1.1.4 Prelude to Evolutionary Dynamics

In traditional game-theoretic analyses, it is usual to assume that players follow some
Nash equilibrium of the game at hand. But because population games involve large
numbers of agents, the equilibrium assumption is quite strong, making it more appealing
to rely on less demanding assumptions. Therefore, rather than assume equilibrium play,
we suppose that individual agents gradually adjust their choices to their current strategic
environment. We then ask whether or not the induced behavior trajectories converge to
Nash equilibrium. When they do, the Nash prediction can be justified; when they do not, the Nash prediction may be unwarranted.
The question of convergence to equilibrium is a central issue in this book. We will
see later on that in the three classes of games studied in Chapter 2, convergence results
can be established in some generality—that is, without being overly specific about the exact nature of the agents' revision protocols. But in this chapter and the next, we confine ourselves to introducing population games and studying their equilibria.

1.2 Examples

To fix ideas, we offer four examples of population games. These examples and many more will be developed and analyzed through the remainder of the book.

1.2.1 Random Matching in Normal Form Games

Let us begin with the canonical example of evolutionary game theory.
Example 1.2.1. Random matching in a single population to play a symmetric game. A symmetric two-player normal form game is defined by a strategy set $S = \{1, \ldots, n\}$ and a payoff matrix $A \in \mathbb{R}^{n \times n}$. $A_{ij}$ is the payoff a player obtains when he chooses strategy i and his opponent chooses strategy j; this payoff does not depend on whether the player in question is called player I or player II. Below is the bimatrix corresponding to A when n = 3.

                        Player II
                 1           2           3
           1  A11, A11   A12, A21   A13, A31
Player I   2  A21, A12   A22, A22   A23, A32
           3  A31, A13   A32, A23   A33, A33
single (unit mass) population are randomly matched to play A. Assuming that agents
evaluate probability distributions over payoﬀs by taking expectations (i.e., that the entries
of the matrix A are von Neumann–Morgenstern utilities), the payoﬀ to strategy i when the
population state is x is Fi (x) = j∈S Ai j x j . It follows that the population game associated
with A is described by the linear map F(x) = Ax. §
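As a quick sketch of this construction, consider the standard Rock–Paper–Scissors payoff matrix (our illustrative choice of A, not from the text): the population game is literally a matrix–vector product, and at the uniform state every strategy earns the same payoff, consistent with that state being a Nash equilibrium.

```python
# Single-population random matching: F(x) = Ax, with A the standard
# Rock-Paper-Scissors payoff matrix (win = 1, lose = -1, tie = 0).
import numpy as np

A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)

def F(x):
    """Payoff vector of the population game induced by matching to play A."""
    return A @ np.asarray(x, dtype=float)

x_uniform = np.ones(3) / 3
payoffs = F(x_uniform)
print(payoffs)                               # [0. 0. 0.]
print(np.allclose(payoffs, payoffs[0]))      # True: all strategies earn the same
```

Since each row of A sums to zero, all three strategies earn payoff 0 at the uniform state, so no strategy does strictly better than another there.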
Example 1.2.2. Random matching in two populations. A (possibly asymmetric) two-player game is defined by two strategy sets, $S^1 = \{1, \ldots, n^1\}$ and $S^2 = \{1, \ldots, n^2\}$, and two payoff matrices, $U^1 \in \mathbb{R}^{n^1 \times n^2}$ and $U^2 \in \mathbb{R}^{n^1 \times n^2}$. The corresponding bimatrix when $n^1 = 2$ and $n^2 = 3$ is as follows.

                            Player II
                  1             2             3
           1  U¹₁₁, U²₁₁   U¹₁₂, U²₁₂   U¹₁₃, U²₁₃
Player I
           2  U¹₂₁, U²₂₁   U¹₂₂, U²₂₂   U¹₂₃, U²₂₃

To define the corresponding population game, we suppose that there are two unit mass populations, one corresponding to each player role. One agent from each population is drawn at random and matched to play the game $(U^1, U^2)$. The payoff functions for populations 1 and 2 are given by $F^1(x) = U^1 x^2$ and $F^2(x) = (U^2)' x^1$, so the entire population game is described by the linear map
$$F(x) = \begin{pmatrix} F^1(x) \\ F^2(x) \end{pmatrix} = \begin{pmatrix} 0 & U^1 \\ (U^2)' & 0 \end{pmatrix} \begin{pmatrix} x^1 \\ x^2 \end{pmatrix} = \begin{pmatrix} U^1 x^2 \\ (U^2)' x^1 \end{pmatrix}. \quad §$$

Example 1.2.3. Random matching in $\mathscr{p}$ populations. To generalize the previous example, we
define a $\mathscr{p}$-player normal form game. Let $S^p = \{1, \ldots, n^p\}$ denote player p's strategy set and $S = \prod_{q \in \mathcal{P}} S^q$ the set of pure strategy profiles; player p's payoff function $U^p$ is a map from S to $\mathbb{R}$.

In the population game, agents in $\mathscr{p}$ unit mass populations are randomly matched to play the normal form game $U = (U^1, \ldots, U^{\mathscr{p}})$, with one agent from each population p being drawn to serve in player role p. This procedure yields a population game with the multilinear (i.e., linear in each $x^p$) payoff function
$$F_{s^p}^p(x) = \sum_{s^{-p} \in S^{-p}} U^p(s^1, \ldots, s^{\mathscr{p}}) \prod_{r \neq p} x_{s^r}^r, \quad \text{where } S^{-p} = \prod_{q \neq p} S^q. \quad §$$

We conclude with an observation relating the Nash equilibria of population games
generated by random matching to those of the underlying normal form games.
Observation 1.2.4.
(i) In the single population case (Example 1.2.1), the Nash equilibria of F are the symmetric Nash equilibria of the symmetric normal form game $U = (A, A')$.
(ii) In the multipopulation cases (Examples 1.2.2 and 1.2.3), the Nash equilibria of F are the Nash equilibria of the normal form game $U = (U^1, \ldots, U^{\mathscr{p}})$.

1.2.2 Congestion Games

Because of the linearity of the expectation operator, random matching in normal form games generates population games with linear or multilinear payoffs. Moreover, when $\mathscr{p} \geq 2$, each agent's payoffs are independent of the behavior of other members of his population. Outside of the random matching context, neither of these properties need hold. Our next class of examples provides a case in point.
Example 1.2.5. Congestion games. Consider the following model of highway congestion. A
collection of towns is connected by a network of links (Figure 1.2.1). For each ordered pair
of towns there is a population of agents, each of whom needs to commute from the ﬁrst
town in the pair (where he lives) to the second (where he works). To accomplish this, the
agent must choose a path connecting the two towns. The payoﬀ the agent obtains is the
negation of the delay on the path he takes. The delay on the path is the sum of the delays
on its constituent links, while the delay on a link is a function of the number of agents
who use that link.
Congestion games are used to study not only highway congestion, but also more
general settings involving “symmetric” externalities. To deﬁne a congestion game, we
begin with a ﬁnite collection of facilities (e.g., links in a highway network), denoted Φ.
Every strategy i ∈ S^p requires the use of some collection of facilities Φ^p_i ⊆ Φ (e.g., the links in route i). The set ρ^p(φ) = {i ∈ S^p : φ ∈ Φ^p_i} contains those strategies in S^p that require facility φ. Each facility φ has a cost function c_φ : R_+ → R whose argument is the facility's utilization level u_φ, the total mass of agents using the facility:

u_\phi(x) = \sum_{p \in P} \sum_{i \in \rho^p(\phi)} x^p_i.

Figure 1.2.1: A highway network.

Payoffs in the congestion game are obtained by summing the appropriate facility costs
and multiplying by −1.
F^p_i(x) = -\sum_{\phi \in \Phi^p_i} c_\phi(u_\phi(x)).

Since driving on a link increases the delays experienced by other drivers on that link,
cost functions in models of highway congestion are increasing; they are typically convex
as well. On the other hand, when congestion games are used to model settings with
positive externalities (e.g., consumer technology choice), cost functions are decreasing.
Evidently, payoffs in congestion games depend on own-population behavior, and need
only be linear if the underlying cost functions are linear themselves.
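The definitions above can be sketched in code. The network, route names, and linear cost function below are invented for illustration; only the structure — the facility sets Φ_i, the utilization levels u_φ, and the payoffs F_i — follows the text:

```python
# A minimal congestion-game sketch (facilities, routes, and costs made up).
# Routes A and B each use one link; route C uses one link from each.
facilities = ["a", "b"]
routes = {"A": {"a"}, "B": {"b"}, "C": {"a", "b"}}  # Φ_i for each strategy i

def utilization(x):
    """u_φ(x): total mass of agents whose chosen route uses facility φ."""
    return {f: sum(mass for r, mass in x.items() if f in routes[r])
            for f in facilities}

def payoff(route, x, cost=lambda u: u):
    """F_i(x) = -Σ_{φ∈Φ_i} c_φ(u_φ(x)); the increasing cost c_φ(u) = u
    is an illustrative choice, not the only admissible one."""
    u = utilization(x)
    return -sum(cost(u[f]) for f in routes[route])

x = {"A": 0.5, "B": 0.3, "C": 0.2}  # a single-population state
print(payoff("C", x))  # route C pays for both links: -((0.5+0.2) + (0.3+0.2))
```

Because route C's delay depends on the masses choosing A and B, payoffs here are not independent of own-population behavior, unlike in random matching.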
Congestion games are the leading examples of potential games (Sections 2.1 and 2.2); congestion games with increasing cost functions are also stable games (Section 2.3). §

1.2.3 Two Simple Externality Models

We conclude this section with two simpler models of externalities.
Example 1.2.6. Asymmetric negative externalities. Agents from a single population choose
from a set of n activities. There are externalities both within and across activities; the
increasing C^1 function c_{ij} : [0, 1] → R represents the cost imposed on agents who choose activity i by agents who choose activity j. Payoffs in this game are described by

F_i(x) = -\sum_{j \in S} c_{ij}(x_j).

If "own activity" externalities are strong, in the sense that the derivatives of the cost functions satisfy

2c'_{ii}(x_i) \geq \sum_{j \neq i} \left( c'_{ij}(x_j) + c'_{ji}(x_i) \right),

then F is a stable game (Section 2.3 of Chapter 2). §
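When the costs are linear, both the payoffs and the own-externality condition in Example 1.2.6 reduce to simple matrix computations. The sketch below uses a made-up cost matrix; under the assumption c_ij(x_j) = a_ij x_j, the derivative condition becomes a diagonal-dominance check:

```python
import numpy as np

# Illustrative linear-cost version of Example 1.2.6: c_ij(x_j) = a[i][j]*x_j.
a = np.array([[3.0, 1.0, 1.0],
              [0.5, 4.0, 1.0],
              [1.0, 0.5, 5.0]])   # a[i][j]: marginal cost imposed on i by j

def F(x):
    """F_i(x) = -Σ_j c_ij(x_j) = -(a @ x)_i for linear costs."""
    return -(a * x).sum(axis=1)

def own_externalities_strong(a):
    """For linear costs, 2c'_ii >= Σ_{j≠i} (c'_ij + c'_ji) reads
    2*a_ii >= Σ_{j≠i} (a_ij + a_ji) for every activity i."""
    off = a + a.T - np.diag(2 * np.diag(a))   # zero diagonal, a_ij + a_ji off it
    return all(2 * a[i, i] >= off[i].sum() for i in range(len(a)))

x = np.array([0.2, 0.3, 0.5])
print(F(x), own_externalities_strong(a))
```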
Example 1.2.7. Search with positive externalities. Consider this simple model of macroeconomic spillovers. Members of a single population choose levels of search eﬀort from the
set S = {1, . . . , n}. Stronger eﬀorts increase the likelihood of ﬁnding trading partners, so
that payoﬀs are increasing both in own search eﬀort and in aggregate search eﬀort. In
particular, payoﬀs are given by
F_i(x) = m(i)\, b(a(x)) - c(i),

where a(x) = \sum_{k=1}^{n} k x_k represents aggregate search effort, the increasing function b : R_+ → R represents the benefits of search as a function of aggregate effort, the increasing function m : S → R is the benefit multiplier, and the arbitrary function c : S → R captures search costs. In Section 2.4, we will show that F is a supermodular game. §

1.3 The Geometry of Population Games and Nash Equilibria

In low-dimensional cases, we can present the payoff vectors generated by a population
game in pictures. Doing so provides a way of visualizing the strategic forces at work;
moreover, the geometric insights we obtain can be extended to games that we cannot
draw.

1.3.1 Drawing Two-Strategy Games

The population games that are easiest to draw are two-strategy games: i.e., games played
by a single population of agents who choose between a pair of strategies. When drawing
a two-strategy game, we represent the simplex as a subset of R^2.

Figure 1.3.1: Payoffs in 12 Coordination.

We synchronize the drawing with the layout of the payoff matrix by using the vertical coordinate to represent
the mass on the ﬁrst strategy and the horizontal coordinate to represent the mass on the
second strategy. We then select a group of states spaced evenly through the simplex; from
each state x, we draw an arrow representing the payoﬀ vector F(x) that corresponds to
x. (Actually, we draw scaled-down versions of the payoff vectors in order to make the
diagrams easier to read.)
In Figures 1.3.1 and 1.3.2, we present the payoﬀ vectors generated by the twostrategy
coordination game F^{C2} and the Hawk-Dove game F^{HD}:

F^{C2}(x) = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ 2x_2 \end{pmatrix}; \qquad F^{HD}(x) = \begin{pmatrix} -1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_H \\ x_D \end{pmatrix} = \begin{pmatrix} 2x_D - x_H \\ x_D \end{pmatrix}.

Let us focus on the coordination game F^{C2}. At the pure state e_1 = (1, 0) at which all agents play strategy 1, the payoffs to the two strategies are F^{C2}_1(e_1) = 1 and F^{C2}_2(e_1) = 0; hence, the arrow representing F^{C2}(e_1) points directly upward from state e_1. At the interior Nash equilibrium x* = (x*_1, x*_2) = (2/3, 1/3), each strategy earns a payoff of 2/3, so the arrow representing payoff vector F^{C2}(x*) = (2/3, 2/3) is drawn at a right angle to the simplex at x*. Similar logic explains how the payoff vectors are drawn at other states, and how the Hawk-Dove figure is constructed as well.
The diagrams of FC2 and FHD help us visualize the incentives faced by agents playing
these games. In the coordination game, the payoﬀ vectors “push outward” toward the
two axes, reflecting an incentive structure that drives the population toward the two pure Nash equilibria. In contrast, payoff vectors in the Hawk-Dove game "push inward", away from the axes, reflecting forces leading the population toward the interior Nash equilibrium x* = (1/2, 1/2).

Figure 1.3.2: Payoffs in the Hawk-Dove game.

1.3.2 Displacement Vectors and Tangent Spaces

To draw games with more than two strategies we need to introduce two new objects:
TX, the tangent space of the state space X; and Φ, the orthogonal projection of Rn onto TX.
We summarize the relevant concepts in this subsection and the next; for a fuller treatment,
see the Appendix.
To start, let us focus on a singlepopulation game F. Imagine that the population is
initially at state x, and that a group of agents of mass ε switch from strategy i to strategy j.
These revisions move the state from x to x + ε(e j − ei ): the mass of agents playing strategy
i goes down by ε, while the mass of agents playing strategy j goes up by ε. Vectors like
ε(e j − ei ), which represent the eﬀects of such strategy revisions on the population state, are
called displacement vectors. (Since these vectors are tangent to the state space X, we also
call them tangent vectors—more on this below.)
In Figure 1.3.3, we illustrate displacement vectors for two-strategy games. In this setting, displacement vectors can only point in two directions: when agents switch from strategy 1 to strategy 2, the state moves in direction e_2 − e_1, represented by an arrow pointing southeast; when agents switch from strategy 2 to strategy 1, the state moves in direction e_1 − e_2, represented by an arrow pointing northwest. Both of these vectors are tangent to the state space X.

Figure 1.3.3: Displacement vectors for two-strategy games.
(Two clariﬁcations are in order here. First, remember that a vector is characterized
by its direction and its length, not where we position its base. When we draw an arrow
representing the vector z, we use the context to determine an appropriate position x for
the arrow’s base; the arrow takes the form of a directed line segment from x to x + z.
Second, since we are mainly interested in displacement vectors’ relative sizes, we rescale
them before drawing them, just as we did with payoﬀ vectors in Figures 1.3.1 and 1.3.2.)
Now consider a threestrategy game: a game with one population and three strategies,
whose state space X is thus the simplex in R3 . A “threedimensional” picture of X is
provided in Figure 1.3.4, where X is situated within the plane in R3 that contains it. This
plane is called the aﬃne hull of X, and is denoted by aﬀ (X) (see Appendix 1.A.2). For
future reference, note that displacement vectors drawn from states in X are situated in the
plane aﬀ (X).
Instead of representing the state space X explicitly in R3 , it is more common to present
it as a twodimensional equilateral triangle (Figure 1.3.5). When we follow this approach,
our sheet of paper itself represents the aﬃne hull aﬀ (X), and so arrows drawn on the paper
represent displacement vectors. Figure 1.3.5 presents arrows describing the 3 × 2 = 6
displacement vectors of the form e j − ei , which correspond to switches between distinct
ordered pairs of strategies. Each of these arrows is parallel to some edge of the simplex.

Figure 1.3.4: The simplex in R^3.

Figure 1.3.5: Displacement vectors for three-strategy games.

For purposes of orientation, note that if we resituate the simplex from Figure 1.3.5 in
threedimensional space (i.e., in Figure 1.3.4), then each of these six arrows is obtained by
subtracting one standard basis vector from another.
Switches between pairs of strategies are not the only ways of generating displacement
vectors—they can also come from switches involving three or more strategies, and, in
multipopulation settings, from switches occurring within more than one population. The
set of all displacement vectors from states in X forms a subspace of Rn ; this subspace is
called the tangent space TX.
To formally define TX, let us first consider population p ∈ P in isolation. The state space for population p is X^p = \{x^p ∈ R^{n^p}_+ : \sum_{i ∈ S^p} x^p_i = m^p\}. The tangent space of X^p, denoted TX^p, is the smallest subspace of R^{n^p} that contains all vectors describing motions between points in X^p. In other words, if x^p, y^p ∈ X^p, then y^p − x^p ∈ TX^p, and TX^p is the span of all vectors of this form. It is not hard to see that TX^p = R^{n^p}_0 ≡ \{z^p ∈ R^{n^p} : \sum_{i ∈ S^p} z^p_i = 0\}: that is, TX^p contains exactly those vectors in R^{n^p} whose components sum to zero. The restriction on the sum embodies the fact that changes in the population state leave the population's mass constant.
The definition above is sufficient for studying single-population games. What if there are multiple populations? In this case, any change in the social state x ∈ X = \prod_{p ∈ P} X^p is a combination of changes occurring within the individual populations. Therefore, the grand tangent space TX is just the product of the tangent spaces for each set X^p: in other words, TX = \prod_{p ∈ P} TX^p.

1.3.3 Orthogonal Projections

Suppose we would like to draw a diagram representing a three-strategy game F.
One possibility is to draw a “threedimensional” representation of F in the fashion of
Figure 1.3.4. We would place more modest demands on our drafting skills if we instead
represented F in just two dimensions. But this simplification comes at a cost: since three-dimensional payoff vectors F(x) ∈ R^3 will be presented as two-dimensional objects, some
of the information contained in these vectors will be lost.
From a geometric point of view, the most natural way to proceed is pictured in Figure
1.3.6: instead of drawing an arrow from state x corresponding to the vector F(x) itself, we
instead draw the arrow closest to F(x) among those that lie in the plane aﬀ (X). This arrow
represents a vector in the tangent space TX: namely, the orthogonal projection of F(x) onto
TX.
Let Z be a linear subspace of Rn . The orthogonal projection of Rn onto Z is a linear map
that sends each π ∈ Rn to the closest point to π in Z. Each orthogonal projection can be
represented by a matrix P_Z ∈ R^{n×n} via the map π ↦ P_Z π, and it is common to identify the projection with its matrix representation. We treat orthogonal projections in some detail in Appendix 1.A.3; here we focus only on the orthogonal projections we need.

Figure 1.3.6: Projected payoff vectors for three-strategy games.
First consider population p ∈ P in isolation. The orthogonal projection of R^{n^p} onto the tangent space TX^p, denoted Φ ∈ R^{n^p × n^p}, is defined by Φ = I − (1/n^p)\mathbf{1}\mathbf{1}', where \mathbf{1} = (1, . . . , 1)' is the vector of ones; thus (1/n^p)\mathbf{1}\mathbf{1}' is the matrix whose entries are all 1/n^p. If π^p is a payoff vector in R^{n^p}, the projection of π^p onto TX^p is

\Phi \pi^p = \pi^p - \frac{1}{n^p} \mathbf{1}\mathbf{1}' \pi^p = \pi^p - \mathbf{1} \left( \frac{1}{n^p} \sum_{k \in S^p} \pi^p_k \right).

The ith component of Φπ^p is the difference between the actual payoff to strategy i and the
unweighted average payoﬀ of all strategies in Sp . Thus, Φπp discards information about
average payoﬀs while retaining information about relative payoﬀs of diﬀerent strategies in
S^p. This interpretation is important from a game-theoretic point of view, since incentives,
and hence Nash equilibria, only depend on payoﬀ diﬀerences. Therefore, when incentives
(as opposed to, e.g., eﬃciency) are our main concern, we do not need to know the actual
payoﬀ vectors πp ; looking at the projected payoﬀ vectors Φπp is enough.
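A quick numerical sketch of this interpretation (the payoff vector below is arbitrary): applying Φ = I − (1/n)11' subtracts the unweighted average payoff from every component, and the result lies in the tangent space.

```python
import numpy as np

# Φ = I - (1/n) 11' subtracts the mean payoff from each component.
n = 3
Phi = np.eye(n) - np.ones((n, n)) / n

pi = np.array([1.0, 4.0, 7.0])       # an arbitrary payoff vector (mean 4)
print(Phi @ pi)                       # relative payoffs: [-3., 0., 3.]
print((Phi @ pi).sum())               # components sum to 0, so Φπ ∈ TX
```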
In multipopulation settings, the tangent space TX = \prod_{p ∈ P} TX^p has a product structure; hence, the orthogonal projection onto TX, denoted Φ ∈ R^{n×n}, has a block diagonal structure: Φ = diag(Φ^1, . . . , Φ^p). (Note that the blocks on the diagonal of Φ are generally not identical: the pth block is an element of R^{n^p × n^p}.) If we apply Φ to the society's payoff vector π = (π^1, . . . , π^p), the resulting vector Φπ = (Φπ^1, . . . , Φπ^p) lists the relative payoffs in each population.

Figure 1.3.7: Payoffs and projected payoffs in 12 Coordination.

1.3.4 Drawing Three-Strategy Games

Before using orthogonal projections to draw three-strategy games, let us see how this
device affects our pictures of two-strategy games. Applying the projection Φ = I − (1/2)\mathbf{1}\mathbf{1}' to the payoff vectors from the coordination game F^{C2} and the Hawk-Dove game F^{HD} yields

\Phi F^{C2}(x) = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} \begin{pmatrix} x_1 \\ 2x_2 \end{pmatrix} = \begin{pmatrix} \tfrac{1}{2}x_1 - x_2 \\ x_2 - \tfrac{1}{2}x_1 \end{pmatrix}  and  \Phi F^{HD}(x) = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} \begin{pmatrix} 2x_D - x_H \\ x_D \end{pmatrix} = \begin{pmatrix} \tfrac{1}{2}(x_D - x_H) \\ \tfrac{1}{2}(x_H - x_D) \end{pmatrix}.
We draw the projected payoﬀs along with the original payoﬀs in Figures 1.3.7 and 1.3.8.
The projected payoﬀ vectors ΦF(x) lie in the tangent space TX, and so are represented
by arrows running parallel to the simplex X. Projecting away the orthogonal component
of payoﬀs makes the “outward force” in the coordination game and the “inward force” in
the HawkDove game more transparent. Indeed, Figures 1.3.7 and 1.3.8 are suggestive of
evolutionary dynamics for these two games—a topic we take up starting in Chapter 3.
Now, let us consider the three-strategy coordination game F^{C3} and the Rock-Paper-Scissors game F^{RPS}:

Figure 1.3.8: Payoffs and projected payoffs in the Hawk-Dove game.

F^{C3}(x) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 \\ 2x_2 \\ 3x_3 \end{pmatrix}; \qquad F^{RPS}(x) = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_R \\ x_P \\ x_S \end{pmatrix} = \begin{pmatrix} x_S - x_P \\ x_R - x_S \\ x_P - x_R \end{pmatrix}.

These games are pictured in Figures 1.3.9 and 1.3.10. The arrows in Figure 1.3.9 represent
the projected payoff vectors ΦF^{C3}(x), defined by

\Phi F^{C3}(x) = \left( I - \tfrac{1}{3}\mathbf{1}\mathbf{1}' \right) \begin{pmatrix} x_1 \\ 2x_2 \\ 3x_3 \end{pmatrix} = \frac{1}{3} \begin{pmatrix} 2x_1 - 2x_2 - 3x_3 \\ -x_1 + 4x_2 - 3x_3 \\ -x_1 - 2x_2 + 6x_3 \end{pmatrix}.
But in the RockPaperScissors game, the column sums of the payoﬀ matrix all equal 0,
implying that the maps FRPS and ΦFRPS are identical; by drawing one, we draw the other.
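This claim is easy to verify numerically; a minimal sketch: since 1'A x equals the matrix's column sums dotted with x, zero column sums make the mean payoff vanish, so the projection leaves F(x) unchanged.

```python
import numpy as np

# When the payoff matrix's column sums are 0, F and ΦF coincide
# (checked here for the Rock-Paper-Scissors matrix at an arbitrary state).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)
Phi = np.eye(3) - np.ones((3, 3)) / 3

assert np.allclose(A.sum(axis=0), 0)       # column sums are zero
x = np.array([0.2, 0.5, 0.3])              # an arbitrary state
assert np.allclose(Phi @ (A @ x), A @ x)   # ΦF(x) = F(x)
print("ΦF = F at", x)
```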
As with that of FC2 , the diagram of the coordination game FC3 shows forces pushing
outward toward the extreme points of the simplex. In contrast, Figure 1.3.10 displays
a property that cannot occur with just two strategies: instead of driving toward some
Nash equilibrium, the arrows in Figure 1.3.10 cycle around the simplex. Thus, the ﬁgure
suggests that in the RockPaperScissors game, evolutionary dynamics need not converge
to Nash equilibrium, but instead may avoid equilibrium in perpetuity. We return to
questions of convergence and nonconvergence of evolutionary dynamics beginning in
later chapters.

Figure 1.3.9: Projected payoffs in 123 Coordination.
Figure 1.3.10: Payoffs (= projected payoffs) in Rock-Paper-Scissors.

1.3.5 Tangent Cones and Normal Cones

To complete our introduction to the geometric approach to population games, we
explain how one can ﬁnd a game’s Nash equilibria by examining a picture of the game.
To begin this discussion, note that the constraint that deﬁnes vectors z as lying in the
tangent space TX—the constraint that keeps population masses constant—is not always
enough to ensure that motion in direction z is feasible. Motions in every direction in TX
are feasible if we begin at a state x in the interior of the state space X. But if x^p_i = 0 for some strategy i ∈ S^p, then motion in any direction z with z^p_i < 0 would cause the mass of
agents playing strategy i to become negative, taking the state out of X.
To describe the feasible displacement directions from an arbitrary state x ∈ X, we
introduce the notion of a tangent cone. To begin, recall that the set K ⊆ Rn is a cone if
whenever it contains the vector z, it also contains the vector αz for every α > 0. Most often
one is interested in convex cones (i.e., cones that are convex sets). The polar of the convex
cone K is a new convex cone
K° = \{ y ∈ R^n : y'z ≤ 0 for all z ∈ K \}.
In words, the polar cone of K contains all vectors that form a weakly obtuse angle with
each vector in K (Figure 1.3.11).
Exercise 1.3.1. Let K be a convex cone. Show that
(i) K◦ is a closed convex cone, and K◦ = (cl(K))◦ . (Hence, K◦ contains the origin.)
(ii) K is a subspace of Rn if and only if K is symmetric, in the sense that K = −K.
Moreover, in this case, K◦ = K⊥ .
(iii) (K◦ )◦ = cl(K). (Hint: To show that (K◦ )◦ ⊆ cl(K), use the separating hyperplane
theorem.)
The last result above tells us that (K◦ )◦ = K for any closed convex cone K; thus, polarity
deﬁnes an involution on the set of closed convex cones.
Another fundamental result about closed convex cones and their polar cones, the
Moreau Decomposition Theorem, is not needed until later chapters. But as the preceding
discussion provides the proper context to present this result, we do so in Appendix 1.B.
If C ⊂ Rn is a closed convex set, then the tangent cone of C at state x ∈ C, denoted TC(x),
is the closed convex cone
TC(x) = cl\{ z ∈ R^n : z = α(y − x) for some y ∈ C and some α ≥ 0 \}.

Figure 1.3.11: A convex cone and its polar cone.

If C ⊂ R^n is a polytope (i.e., the convex hull of a finite number of points), then the closure
operation is redundant. In this case, TC(x) is the set of directions of motion from x that
initially remain in C; more generally, TC(x) also contains the limits of such directions. (To
see the diﬀerence, construct TC(x) for x ∈ bd(C) when C is a square and when C is a circle.)
If x is in the relative interior of C (i.e., the interior of C relative to aﬀ (C)), then TC(x) is just
TC, the tangent space of C; otherwise, TC(x) is a strict subset of TC.
Finally, deﬁne the normal cone of C at x to be the polar of the tangent cone of C at x: that
is, NC(x) = (TC(x))◦ . By deﬁnition, NC(x) is a closed convex cone, and it contains every
vector that forms a weakly obtuse angle with every feasible displacement vector at x.
In Figures 1.3.12 and 1.3.13, we sketch examples of tangent cones and normal cones
when X is the state space for a two-strategy game (i.e., the simplex in R^2) and for a three-strategy game (the simplex in R^3). Since the latter figure is two-dimensional, with the sheet of paper representing the affine hull of X, the figure actually displays the projected normal cones Φ(NX(x)).

1.3.6 Normal Cones and Nash Equilibria

At first glance, normal cones might appear to be less relevant to game theory than
tangent cones. Theorem 1.3.2 shows that this impression is false: normal cones and Nash equilibria are intimately linked.

Figure 1.3.12: Tangent cones and normal cones for two-strategy games.
Figure 1.3.13: Tangent cones and normal cones for three-strategy games.
Theorem 1.3.2. Let F be a population game. Then x ∈ NE(F) if and only if F(x) ∈ NX(x).
Proof. x ∈ NE(F) ⇔ [x^p_i > 0 ⇒ F^p_i(x) ≥ F^p_j(x)] for all i, j ∈ S^p, p ∈ P
⇔ (x^p)'F^p(x) ≥ (y^p)'F^p(x) for all y^p ∈ X^p, p ∈ P
⇔ (y^p − x^p)'F^p(x) ≤ 0 for all y^p ∈ X^p, p ∈ P
⇔ (z^p)'F^p(x) ≤ 0 for all z^p ∈ TX^p(x^p), p ∈ P
⇔ F^p(x) ∈ NX^p(x^p) for all p ∈ P
Exercise 1.3.3. Justify the last equivalence above.
Theorem 1.3.2 tells us that state x is a Nash equilibrium if and only if the payoﬀ vector
F(x) lies in the normal cone of the state space X at x. This result provides us with a simple,
purely geometric description of Nash equilibria of population games. Its proof is very
simple: some algebra shows that x is a Nash equilibrium if and only if it solves a variational
inequality problem—that is, if it satisﬁes
(1.1)    (y − x)'F(x) ≤ 0 for all y ∈ X.

Applying the definitions of tangent and normal cones then yields the result.
In many cases, it is more convenient to speak in terms of projected payoﬀ vectors and
projected normal cones. Corollary 1.3.4 restates Theorem 1.3.2 in these terms.
Corollary 1.3.4. x ∈ NE(F) if and only if ΦF(x) ∈ Φ(NX(x)).
Proof. Clearly, F(x) ∈ NX(x) implies that ΦF(x) ∈ Φ(NX(x)). The reverse implication follows from the facts that NX(x) = Φ(NX(x)) + (TX)⊥ (see Exercise 1.3.5) and that
Φ((TX)⊥ ) = {0} (which is the equation that deﬁnes Φ as the orthogonal projection of Rn
onto TX).
Exercise 1.3.5.
(i) Using the notions of relative and average payoﬀs discussed in Section
1.3.3, explain the intuition behind Corollary 1.3.4 in the single population case.
(ii) Prove that NX(x) = Φ(NX(x)) + (TX)⊥ .
(iii) Only one of the two statements to follow is equivalent to x ∈ NE(F) : F(x) ∈
Φ(NX(x)), or ΦF(x) ∈ NX(x). Which is it?

In Figures 1.3.7 through 1.3.10, we mark the Nash equilibria of our four population
games with dots. In the twostrategy games FC2 and FHD , the Nash equilibria are those
states x at which the payoﬀ vector F(x) lies in the normal cone NX(x), as Theorem 1.3.2
requires. In both these games and in the threestrategy games FC3 and FRPS , the Nash
equilibria are those states x at which the projected payoﬀ vector ΦF(x) lies in the projected
normal cone Φ(NX(x)), as Corollary 1.3.4 demands. Even if the dots were not drawn, we
could locate the Nash equilibria of all four games by examining the arrows alone.
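The variational inequality (1.1) also makes locating Nash equilibria easy to automate. The sketch below checks the inequality only at the vertices y = e_i of the simplex, which suffices because every state is a convex combination of vertices; the function name is invented for illustration:

```python
import numpy as np

def is_nash(F, x, tol=1e-9):
    """Test (1.1): x ∈ NE(F) iff (y - x)'F(x) <= 0 for all y ∈ X.
    Checking the simplex's vertices e_i is enough, since any y is a
    convex combination of them."""
    payoff = F(x)
    return all((np.eye(len(x))[i] - x) @ payoff <= tol for i in range(len(x)))

FC2 = lambda x: np.array([x[0], 2 * x[1]])   # the two-strategy coordination game
print(is_nash(FC2, np.array([1.0, 0.0])))    # pure equilibrium
print(is_nash(FC2, np.array([2/3, 1/3])))    # interior equilibrium
print(is_nash(FC2, np.array([0.5, 0.5])))    # not an equilibrium
```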
Exercise 1.3.6. Compute the Nash equilibria of the four games studied above, and verify
that the equilibria appear in the correct positions in Figures 1.3.7 through 1.3.10.
Exercise 1.3.7. Two-population, two-strategy games. Let F be a game played by two unit-mass
populations (p = 2) with two strategies for each (n1 = n2 = 2).
(i) Describe the state space X, tangent space TX, and orthogonal projection Φ for this
setting.
(ii) Show that the state space X can be represented on a sheet of paper by a unit
square, with the upper left vertex representing the state at which all agents in both
populations play strategy 1, and with the upper right vertex representing the state
at which all agents in population 1 play strategy 1 and all agents in population 2
play strategy 2. Explain how the projected payoﬀ vectors ΦF(x) can be represented
as arrows in this diagram.
(iii) At (a) a point in the interior of the square, (b) a nonvertex boundary point, and (c)
a vertex, draw the tangent cone TX(x) and the projected normal cone Φ(NX(x)),
and give algebraic descriptions of each.
(iv) Suppose we draw projected payoﬀ vectors ΦF(x) in the manner you described in
part (ii) and projected normal cones in the manner you described in part (iii). Verify
that in each of the cases considered in part (iii), the arrow representing ΦF(x) is
contained in the sketch of Φ(NX(x)) if and only if x is a Nash equilibrium of F.

Appendix

1.A Affine Spaces, Tangent Spaces, and Orthogonal Projections

The simplex in R^n, the state space for single-population games, is an (n − 1)-dimensional
subset of Rn ; state spaces for multipopulation games are Cartesian products of scalar
multiples of simplices. For this reason, linear subspaces, affine spaces, and orthogonal projections all play important roles in the study of population games.

1.A.1 Affine Spaces

The set Z ⊆ R^n is a (linear) subspace of R^n if it is closed under linear combination: if z, ẑ ∈ Z and a, b ∈ R, then az + bẑ ∈ Z as well. Suppose that Z is a subspace of R^n of
dimension dim(Z) < n, and that the set A is a translation of Z by some vector v ∈ Rn :
A = Z + {v} = {x ∈ Rn : x = z + v for some z ∈ Z}.
Then we say that A is an aﬃne space of dimension dim(A) = dim(Z).
Observe that any vector representing a direction of motion through A is itself an
element of Z: if x, y ∈ A, then y − x = (z y + v) − (zx + v) = z y − zx for some zx and z y in Z;
since Z is closed under linear combinations, z y − zx ∈ Z. For this reason, the set Z is called
the tangent space of A, and we often write TA in place of Z.
Since the origin is an element of Z, the translation vector v in the deﬁnition A = Z + {v}
can be any element of A. But is there a “natural” choice of v? Recall that the orthogonal
complement of Z, denoted by Z⊥ , contains the vectors in Rn orthogonal to all elements of Z:
that is, Z⊥ = \{v ∈ R^n : v'z = 0 for all z ∈ Z\}. It is easy to show that the set A ∩ Z⊥ contains a single element, which we denote by z⊥_A, and that this orthogonal translation vector is the closest point in Z⊥ to every point in A (in the language of Section 1.A.3 below, P_{Z⊥}x = z⊥_A
for all x ∈ A). We will see that for many purposes, this translation vector is the most
convenient choice.
Example 1.A.1. Consider the subspace R^n_0 = \{z ∈ R^n : \mathbf{1}'z = 0\} and the affine space A = R^n_0 + \{e_1\} = \{z ∈ R^n : \mathbf{1}'z = 1\}, where \mathbf{1} = (1, . . . , 1)'. Since (R^n_0)⊥ = span(\{\mathbf{1}\}) and A ∩ span(\{\mathbf{1}\}) = \{(1/n)\mathbf{1}\}, the vector (1/n)\mathbf{1} is the orthogonal translation vector that generates A. In particular, A = R^n_0 + \{(1/n)\mathbf{1}\}, and (1/n)\mathbf{1} is the closest point in span(\{\mathbf{1}\}) to every x ∈ A. We illustrate the case in which n = 2 in Figure 1.A.1; note again our convention of using the vertical axis to represent the first component of x = (x_1, x_2). §

Figure 1.A.1: The state space and its affine hull for two-strategy games.

1.A.2 Affine Hulls of Convex Sets

Let Y ⊆ R^n. The affine hull of Y, denoted aff(Y), is the smallest affine space that contains
Y. This set can be described as

(1.2)    aff(Y) = \left\{ x ∈ R^n : x = \sum_{i=1}^{k} λ^i y^i \text{ for some } \{y^i\}_{i=1}^{k} ⊂ Y \text{ and } \{λ^i\}_{i=1}^{k} ⊂ R \text{ with } \sum_{i=1}^{k} λ^i = 1 \right\}.

The vector x is called an affine combination of the vectors y^i. If we also required the λ^i to be
nonnegative, x would instead be a convex combination of the y^i, and (1.2) would become
conv(Y), the convex hull of Y.
Now suppose that Y is itself convex, let A = aﬀ (Y) be its aﬃne hull, and let Z = TA
be the tangent space of A; then we also call Z = TY the tangent space of Y, as Z contains
directions of motion from points in the (relative) interior of Y that stay in Y. We also call
dim(Y) = dim(Z) the dimension of Y.
In constructing the aﬃne hull of a convex set as in (1.2), it is enough to take aﬃne
combinations of a ﬁxed set of dim(Y) + 1 points in Y. To accomplish this, let d = dim(Y),
ﬁx y0 ∈ Y arbitrarily, and choose y1 , . . . , yd so that { y1 − y0 , . . . , yd − y0 } is a basis for Z. Then
letting λ^0 = 1 − \sum_{i=1}^{d} λ^i, we see that

Z + \{y^0\} = span(\{y^1 − y^0, . . . , y^d − y^0\}) + \{y^0\}
  = \{ x ∈ R^n : x = \sum_{i=1}^{d} λ^i (y^i − y^0) + y^0 for some \{λ^i\}_{i=1}^{d} ⊂ R \}
  = \{ x ∈ R^n : x = \sum_{i=0}^{d} λ^i y^i for some \{λ^i\}_{i=0}^{d} ⊂ R with \sum_{i=0}^{d} λ^i = 1 \} = aff(Y).
Example 1.A.2. Population states. Let X^p = \{x^p ∈ R^{n^p}_+ : \mathbf{1}'x^p = m^p\} be the set of population states for a population of mass m^p. This convex set has affine hull aff(X^p) = \{x^p ∈ R^{n^p} : \mathbf{1}'x^p = m^p\} and tangent space TX^p = \{z^p ∈ R^{n^p} : \mathbf{1}'z^p = 0\} = R^{n^p}_0 (cf. Example 1.A.1). §
Example 1.A.3. Social states. Let X = \prod_{p ∈ P} X^p be the set of social states for a collection of populations P = \{1, . . . , p\} with masses m^1, . . . , m^p. This convex set has affine hull aff(X) = \prod_{p ∈ P} aff(X^p) and tangent space TX = \prod_{p ∈ P} R^{n^p}_0. Thus, if z = (z^1, . . . , z^p) ∈ TX, then each z^p has components that sum to zero. §

1.A.3 Orthogonal Projections

If V and W are subspaces of R^n, their sum is V + W = span(V ∪ W), the set of linear
combinations of elements of V and W . If V ∩ W = {0}, every x ∈ V + W has a unique
decomposition x = v + w with v ∈ V and w ∈ W . In this case, we write V + W as V ⊕ W ,
and call it the direct sum of V and W . For instance, V ⊕ V ⊥ = Rn for any subspace V ⊆ Rn .
Every matrix A ∈ Rn×n deﬁnes a linear operator from Rn to itself via x → Ax. To
understand the action of this operator, remember that the ith column of A is the image
of the standard basis vector ei , and, more generally, that Ax is a linear combination of the
columns of A.
We call the linear operator P ∈ Rn×n a projection onto the subspace V ⊆ Rn if there is a
second subspace W ⊆ Rn satisfying V ∩ W = {0} and V ⊕ W = Rn such that
(i) Px = x for all x ∈ V , and
(ii) Py = 0 for all y ∈ W .
If W = V ⊥ , we call P the orthogonal projection onto V , and write PV in place of P.
Every projection onto V maps all points in Rn to points in V . While for any given
subspace V there are many projections onto V , the orthogonal projection onto V is unique.
For example,

P_1 = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix} \quad and \quad P_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}

both define projections of R^2 onto the horizontal axis \{x ∈ R^2 : x_1 = 0\}. (Recall again our
convention of representing x1 on the vertical axis.) However, since P2 maps the vertical
30 Figure 1.A.2: A projection. Figure 1.A.3: An orthogonal projection. axis {x ∈ R2 : x2 = 0} to the origin, it is the orthogonal projection. The action of the two
projections is illustrated in Figures 1.A.2 and 1.A.3 below. The latter ﬁgure illustrates a
geometrically obvious property of orthogonal projections: the orthogonal projection of Rn
onto V maps each point y ∈ Rn to the closest point to y in V :
P_V y = \operatorname{argmin}_{v \in V} \lVert y - v \rVert^2.

Projections admit simple algebraic characterizations. Recall that the matrix A ∈ R^{n×n}
is idempotent if A2 = A. It is easy to see that projections are represented by idempotent
matrices: once the ﬁrst application of P projects Rn onto the subspace V , the second
application of P does nothing more. In fact, we have
Theorem 1.A.4. (i) P is a projection if and only if P is idempotent.
Figure 1.A.4: The orthogonal projection Φ in R^2.

(ii) P is an orthogonal projection if and only if P is symmetric idempotent.
Example 1.A.5. The orthogonal projection onto R^{n^p}_0. In Example 1.A.2, we saw that the set of population states X^p = \{x^p ∈ R^{n^p}_+ : \mathbf{1}'x^p = m^p\} has tangent space TX^p = R^{n^p}_0 = \{z^p ∈ R^{n^p} : \mathbf{1}'z^p = 0\}. We can decompose the space R^{n^p} into the direct sum R^{n^p}_0 ⊕ R^{n^p}_1, where R^{n^p}_1 = (R^{n^p}_0)⊥ = span(\{\mathbf{1}\}). The orthogonal projection of R^{n^p} onto R^{n^p}_1 is Ξ = (1/n^p)\mathbf{1}\mathbf{1}', the matrix whose entries all equal 1/n^p; to verify this, note that Ξz^p = 0 for z^p ∈ R^{n^p}_0 and Ξ\mathbf{1} = \mathbf{1}. The orthogonal projection of R^{n^p} onto R^{n^p}_0 is Φ = I − Ξ, since Φz^p = z^p for z^p ∈ R^{n^p}_0 and Φ\mathbf{1} = \mathbf{0} (see Figure 1.A.4 for the case of n^p = 2). Both Ξ and Φ are clearly symmetric, and since

Ξ^2 = \left(\tfrac{1}{n^p}\mathbf{1}\mathbf{1}'\right)\left(\tfrac{1}{n^p}\mathbf{1}\mathbf{1}'\right) = \tfrac{1}{n^p}\mathbf{1}\mathbf{1}' = Ξ  and  Φ^2 = (I − Ξ)(I − Ξ) = I − 2Ξ + Ξ^2 = I − 2Ξ + Ξ = I − Ξ = Φ,

both are idempotent as well. §
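These computations can be confirmed numerically; a minimal sketch for n^p = 4:

```python
import numpy as np

# Ξ = (1/n)11' and Φ = I - Ξ are symmetric and idempotent, so by
# Theorem 1.A.4 each is an orthogonal projection.
n = 4
Xi = np.ones((n, n)) / n
Phi = np.eye(n) - Xi

for P in (Xi, Phi):
    assert np.allclose(P, P.T)     # symmetric
    assert np.allclose(P @ P, P)   # idempotent

z = np.array([1.0, -2.0, 0.5, 0.5])  # components sum to 0, so z ∈ R^n_0
assert np.allclose(Xi @ z, 0)        # Ξ annihilates R^n_0
assert np.allclose(Phi @ z, z)       # Φ fixes R^n_0
print("Ξ and Φ are symmetric idempotent")
```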
More generally, it is easy to show that if P is the orthogonal projection of Rn onto V , then
I − P is the orthogonal projection of Rn onto V ⊥ . Or, in the notation introduced above,
PV⊥ = I − PV .
Example 1.A.6. The orthogonal projection onto TX. Recall from Example 1.A.3 that the set of social states X = \prod_{p ∈ P} X^p has tangent space TX = \prod_{p ∈ P} R^{n^p}_0. We can decompose R^n into the direct sum \prod_{p ∈ P} R^{n^p}_0 ⊕ \prod_{p ∈ P} R^{n^p}_1 = TX ⊕ \prod_{p ∈ P} span(\{\mathbf{1}\}). The orthogonal projection of R^n onto \prod_{p ∈ P} span(\{\mathbf{1}\}) is the block diagonal matrix Ξ = diag(Ξ^1, . . . , Ξ^p), while the orthogonal projection of R^n onto TX is Φ = I − Ξ = diag(Φ^1, . . . , Φ^p). Of course, Ξ and Φ are both symmetric idempotent. §

Example 1.A.7. Ordinary least squares. Suppose we have a collection of n > k data points,
symmetric idempotent. § 32 Example 1.A.7. Ordinary least squares. Suppose we have a collection of n > k data points,
{(x^i, y^i)}_{i=1}^n, where each x^i ∈ R^k contains k components of "explanatory" data and each y^i ∈ R is the corresponding component of "explainable" data. We write these data as

    X = ( (x^1)' ; ... ; (x^n)' ) ∈ R^{n×k}  and  y = ( y^1 ; ... ; y^n ) ∈ R^n,

and assume that X is of full rank. We seek the best linear predictor: the map x ↦ x'β that minimizes the sum of squared prediction errors ∑_{i=1}^n ( y^i − (x^i)'β )² = ‖y − Xβ‖².

(The prediction function x ↦ x'β is a truly linear function of x, in the sense that the input vector 0 generates a prediction of 0. Typically, one seeks an affine prediction function, that is, one that allows for a nonzero constant term. To accomplish this, one sets x^i_1 = 1 for all i, leaving only k − 1 components of true explanatory data. In this case, the component β_1 serves as a constant term in the affine prediction function (x_2, ..., x_k) ↦ β_1 + ∑_{i=2}^k x_i β_i.)

Let span(X) = {Xb : b ∈ R^k} be the column span of X. That β ∈ R^k minimizes ‖y − Xβ‖² is equivalent to the requirement that Xβ be the closest point to y in the column span of X:

    Xβ = argmin_{v ∈ span(X)} ‖y − v‖².

Both calculus and geometry tell us that for this to be true, the vector of prediction errors y − Xβ must be orthogonal to span(X), and hence to each column of X:

    X'(y − Xβ) = 0.

One can verify that X ∈ R^{n×k} and X'X ∈ R^{k×k} have the same null space, and hence the same (full) rank. Therefore, (X'X)^{−1} exists, and we can solve the previous equation for β:

    β = (X'X)^{−1} X'y.
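As a sketch of these computations, assuming numpy; the sample size, the data, and the coefficient vector beta_true below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
# First column of ones, so beta[0] acts as the constant term of an affine predictor.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# beta = (X'X)^{-1} X'y, computed by solving the normal equations.
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals are orthogonal to every column of X: X'(y - X beta) = 0.
assert np.allclose(X.T @ (y - X @ beta), 0, atol=1e-8)

# P = X (X'X)^{-1} X' is the symmetric idempotent projection onto span(X).
P = X @ np.linalg.solve(X.T @ X, X.T)
assert np.allclose(P, P.T)
assert np.allclose(P @ P, P)
```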
To this point, we have taken X ∈ R^{n×k} and y ∈ R^n as given and used them to find the vector β ∈ R^k, which we have viewed as defining a map from vectors of explanatory data x ∈ R^k to predictions x'β ∈ R. Now, let us take X alone as given and consider the map from vectors of "explainable" data y ∈ R^n to vectors of predictions Xβ = X(X'X)^{−1}X'y ∈ R^n. By construction, this linear map P = X(X'X)^{−1}X' ∈ R^{n×n} is the orthogonal projection of R^n onto span(X). P is clearly symmetric (since the inverse of a symmetric matrix is symmetric), and since

    P² = X(X'X)^{−1}X'X(X'X)^{−1}X' = X(X'X)^{−1}X' = P,

it is idempotent as well. §

1.B The Moreau Decomposition Theorem

A basic fact about projection onto subspaces holds that for any vector v ∈ R^n and any
subspace Z ⊆ R^n, the sum v = P_Z(v) + P_{Z^⊥}(v) is the unique decomposition of v into the sum of elements of Z and Z^⊥. The Moreau Decomposition Theorem is a generalization of this result that replaces the subspace Z and its orthogonal complement with a closed convex cone and its polar cone. We use this theorem repeatedly in Chapter 5 in our analysis of the projection dynamic.

To state this result, we need an appropriate analogue of orthogonal projection for the context of closed, convex sets. To this end, we define Π_C : R^n → C, the (closest point) projection of R^n onto the closed convex set C, by

    Π_C(y) = argmin_{x ∈ C} ‖y − x‖.

This definition generalizes that of the projection P_Z onto the subspace Z ⊆ R^n to cases in which the target set is not linear, but merely closed and convex. With this definition in hand, we can state our new decomposition theorem; an illustration is provided in Figure 1.B.1.
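For a concrete instance of Π_C, take C = K = R^n_+, whose polar cone is K° = R^n_−; the closest-point projections are then just the componentwise positive and negative parts of v. A minimal sketch, assuming numpy, that previews the decomposition formalized in Theorem 1.B.1 below (the test vector is arbitrary):

```python
import numpy as np

def proj_nonneg(v):      # Pi_K for K = R^n_+: componentwise positive part
    return np.maximum(v, 0.0)

def proj_nonpos(v):      # Pi_{K°} for K° = R^n_-: componentwise negative part
    return np.minimum(v, 0.0)

v = np.array([2.0, -1.5, 0.0, 3.0, -0.2])
vK, vKo = proj_nonneg(v), proj_nonpos(v)

assert np.allclose(vK + vKo, v)     # v decomposes as v_K + v_K°
assert np.isclose(vK @ vKo, 0.0)    # and the two pieces are orthogonal
```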
Theorem 1.B.1 (The Moreau Decomposition Theorem). Let K ⊆ R^n and K° ⊆ R^n be a closed convex cone and its polar cone, and let v ∈ R^n. Then the following are equivalent:
(i) v_K = Π_K(v) and v_{K°} = Π_{K°}(v).
(ii) v_K ∈ K, v_{K°} ∈ K°, v = v_K + v_{K°}, and v_K'v_{K°} = 0.

Figure 1.B.1: The Moreau Decomposition Theorem.

1.N Notes

Congestion games are introduced in Beckmann et al. (1956); see the notes to Chapter 2 for further references. For the biological motivation for the Hawk-Dove game, see Maynard Smith (1982, Chapter 2).
cones and Nash equilibria is known from the literature on variational inequalities; see
Harker and Pang (1990) and Nagurney (1999). For more on aﬃne spaces, tangent cones,
normal cones, the Moreau Decomposition Theorem, and related notions, see HiriartUrruty and Lemar´ chal (2001). The algebra of orthogonal projections is explained, e.g., in
e
Friedberg et al. (1989, Section 6.6). 35 36 CHAPTER TWO
Potential Games, Stable Games, and Supermodular Games 2.0 Introduction In the previous chapter, we oﬀered a general deﬁnition of population games and
characterized their Nash equilibria geometrically. Still, since any continuous map F from
the state space X to Rn deﬁnes a population game, population games with even a moderate
number of strategies can be diﬃcult to analyze. In this chapter, we deﬁne three important
classes of population games: potential games, stable games, and supermodular games. From
an economic point of view, each deﬁnition places constraints on the sorts of externalities
agents impose on one another through their choices in the game. From a mathematical
point of view, each deﬁnition imposes a structure on payoﬀ functions that renders their
analysis relatively simple.
We show through examples that potential games, stable games, and supermodular
games each encompass a variety of interesting applications. We also establish the basic
properties of each class of games. Among other things, we show that for games in each
class, existence of Nash equilibrium can be proved using elementary methods. Beginning
in Chapter 6, we investigate the behavior of evolutionary dynamics in the three classes of
games; there, our assumptions on the structure of externalities will allow us to establish a
range of global convergence results.
The deﬁnitions of our three classes of games only require continuity of the payoﬀ functions. If we instead make the stronger assumption that payoﬀs are smooth (in particular,
continuously diﬀerentiable), we can avail ourselves of the tools of calculus. Doing so not
only simpliﬁes computations, but also allows us to express our deﬁnitions and results
in simple, useful, and intuitively appealing ways. The techniques from calculus that we require are reviewed in Appendices 2.A and 2.B.

2.1 Full Potential Games

In potential games, all information about payoffs that is relevant to agents' incentives can be captured in a single scalar-valued function. The existence of this function, the game's potential function, underlies potential games' many attractive properties. In this section, we consider full potential games, which can be analyzed using standard multivariate calculus techniques (Appendix 2.A), but at the expense of requiring an extension of the payoff functions' domain. In Section 2.2, we introduce a definition of potential games that does not use this device, but that instead requires analyses that rely on affine calculus (Appendix 2.B).

2.1.1 Full Population Games

To understand the issues alluded to above, consider a game F played by a single
population of agents. Since population states for this game are elements of X = {x ∈ R^n_+ : ∑_{k∈S} x_k = 1}, the simplex in R^n, the payoff F_i to strategy i is a real-valued function with domain X.

In looking for useful properties of population games, a seemingly natural characteristic to consider is the marginal effect of adding new agents playing strategy j on the payoffs of agents currently choosing strategy i. This effect is captured by the partial derivative ∂F_i/∂x_j. But herein lies the difficulty: if F_i is only defined on the simplex, then even if the function F is differentiable, the partial derivative ∂F_i/∂x_j does not exist.

To ensure that partial derivatives exist, we extend the domain of the game F from the state space X = {x ∈ R^n_+ : ∑_{k∈S} x_k = 1} to the entire positive orthant R^n_+. In multipopulation settings, the analogous extension is from the original set of social states X = {x = (x^1, ..., x^p) ∈ R^n_+ : ∑_{i∈S^p} x^p_i = m^p} to R^n_+. In either setting, we call the game with payoffs defined on the positive orthant a full population game. In many interesting cases, one can interpret the extensions of payoffs as specifying the values that payoffs would take were the population sizes to change; see Section 2.1.3.

2.1.2 Definition and Characterization

With these preliminaries addressed, we are now prepared to define full potential
games.

Definition. Let F : R^n_+ → R^n be a full population game. We call F a full potential game if there exists a continuously differentiable function f : R^n_+ → R satisfying

(2.1)    ∇f(x) = F(x) for all x ∈ R^n_+.

Property (2.1) can be stated more explicitly as

    ∂f/∂x^p_i (x) = F^p_i(x) for all p ∈ P, i ∈ S^p, and x ∈ R^n_+.

The function f, which is unique up to the addition of a constant, is called the full potential function for the game F. It represents the game's payoffs in an integrated form.
p
p
which F j (x) > Fi (x), so that an agent choosing strategy i ∈ Sp would be better oﬀ choosing
strategy j ∈ Sp . Now suppose some small group of agents switch from strategy i to
p
p
strategy j. These switches are represented by the displacement vector z = e j − ei , where
p
ei is the (i, p)th standard basis vector in Rn . The marginal impact that these switches have
on the value of potential is therefore
∂f
(x) =
∂z f (x) z = ∂f p (x)
∂x j − ∂f p (x)
∂x i p p = F j (x) − Fi (x) > 0. In other words, proﬁtable strategy revisions increase potential. More generally, we will
see in later chapters that the “uphill” directions of the potential function include all
directions in which reasonable adjustment processes might lead. This fact underlies the
many attractive properties that potential games possess.
If the map F : Rn → Rn is C1 (continuously diﬀerentiable), it is well known that F
+
admits a potential function if and only if its derivative matrices DF(x) are symmetric
(see Appendix 2.A.9). In the current gametheoretic context, we call this condition full
externality symmetry.
Observation 2.1.1. Suppose the population game F is C1 . Then F is a full potential game if and
only if it satisﬁes full externality symmetry:
(2.2) DF(x) is symmetric for all x ∈ Rn
+ 39 More explicitly, F is a potential game if and only if
q p (2.3) ∂Fi q (x) ∂x j = ∂F j p (x) ∂x i for all i ∈ Sp , j ∈ Sq , p, q ∈ P , and x ∈ Rn .
+ Observation 2.1.1 characterizes smooth full potential games in terms of a simple, economically meaningful property: condition (2.2) requires that the eﬀect on the payoﬀs to
strategy i ∈ Sp of introducing new agents choosing strategy j ∈ Sq always equals the eﬀect
on the payoﬀs to strategy j of introducing new agents choosing strategy i. 2.1.3 Examples Our ﬁrst two examples build on ones studied in Chapter 1.
Example 2.1.2. Random matching in normal form games with common interests. Suppose a
single population is randomly matched to play symmetric two player normal form game
A ∈ Rn×n , generating the population game F(x) = Ax. While earlier we used this formula
to deﬁne F on the state space X, here we will use it to deﬁne F on all of Rn . (While this
+
choice works very well in the present example, it is not always innocuous, as will see in
Section 2.2.)
The symmetric normal form game A has common interests if both players always receive
the same payoﬀ. This means that Ai j = A ji for all i and j, or, equivalently, that the matrix
A is symmetric. Since DF(x) = A, this is precisely what we need for F to be a full potential
game. The full potential function for F is
1
f (x) = 2 x Ax, which is onehalf of x Ax = i∈S xi Fi (x) = F(x), the aggregate payoﬀ function for F.
To cover the multipopulation case, call the normal form game U = (U1 , . . . , Up ) a
common interest game if there is a function V : S → R such that Up (s) = V (s) for all s ∈ S
and p ∈ P . As before, this means that under any pure strategy proﬁle, all p players earn
the same payoﬀ. This normal form game generates the full population game
p Fsp (x) = xrr
s V (s)
s−p ∈S−p rp 40 on Rn . Aggregate payoﬀs in F are given by
+
p p xsp Fsp (x) = p F(x) =
p∈P sp ∈Sp ∂f p (x)
∂xsp r∈P s∈S Hence, if we let f (x) = s∈S = V (s) 1
xrr = p F(x), we obtain
s p xrr = Fsp (x).
s V (s)
s−p ∈S−p r∈P xrr .
s V (s) rp So once again, random matching in a common interest game generates a full potential
game in which potential is proportional to aggregate payoﬀs. §
Exercise 2.1.3. In the multipopulation case, check directly that condition (2.2) holds.
Example 2.1.4. Congestion games. For ease of exposition, suppose that the congestion
game F models behavior in a traﬃc network. In this environment, an agent taking path
j ∈ Sq aﬀects the payoﬀs of agents choosing path i ∈ Sp through the marginal increases in
p
q
congestion on the links φ ∈ Φi ∩ Φ j that the two paths have in common. But the marginal
eﬀect of an agent taking path i on the payoﬀs of agents choosing path j is identical:
q p ∂Fi q (x) ∂x j =− cφ (uφ (x)) =
p q φ∈Φi ∩Φ j ∂F j p (x). ∂x i In other words, congestion games satisfy condition (2.2), and so are full potential games.
The full potential function for the congestion game F can be written explicitly as
uφ (x) f (x) = −
φ∈Φ cφ (z) dz. 0 Hence, potential is typically unrelated to aggregate payoﬀs, which are given by
pp F(x) = xi Fi (x) = −
p∈P i∈Sp uφ (x)cφ (uφ (x)).
φ∈Φ In Section 2.1.6, we oﬀer conditions under which potential and aggregate payoﬀs are
directly linked. §
Example 2.1.5. Cournot competition. Consider a unit mass population of ﬁrms who choose
production quantities from the set S = {1, . . . , n}. The ﬁrms’ aggregate production is given
41 by a(x) = i∈S i xi . Let p : R+ → R+ denote inverse demand, a decreasing function of
aggregate production. Let the ﬁrms’ production cost function c : S → R be arbitrary.
Then the payoﬀ to a ﬁrm producing quantity i ∈ S at population state x ∈ X is Fi (x) =
i p(a(x)) − c(i).
It is easy to check that F is a full potential game with full potential function
a(x) f (x) = xi c(i). p(z) dz −
0 i∈S In contrast, aggregate payoﬀs in F are
F(x) = xi Fi (x) = a(x)p(a(x)) −
i∈S xi c(i).
i∈S The diﬀerence between the two is
a(x) f (x) − F(x) = p(z) − p(a(x)) dz,
0 which is simply consumers’ surplus. Thus, the full potential function f = F + ( f − F)
measures the total surplus received by ﬁrms and consumers. (Total surplus diﬀers from
aggregate payoﬀs because the latter ignores consumers, who are not modeled as active
agents.) §
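As a sketch of how the potential characterization can be used computationally, assuming numpy; the inverse demand p(a) = 10 − 2a and costs c(i) = i² are hypothetical. One can maximize f over a grid on the simplex and recover the Nash equilibrium:

```python
import numpy as np

q = np.array([1.0, 2.0, 3.0])              # quantities S = {1, 2, 3}
c = q ** 2                                  # production costs c(i) = i^2

def payoff(x):
    a = q @ x
    return q * (10 - 2 * a) - c             # F_i(x) = i * p(a(x)) - c(i)

def potential(x):                           # f(x) = int_0^a p(z) dz - sum_i x_i c(i)
    a = q @ x
    return 10 * a - a ** 2 - c @ x

# p is strictly decreasing, so f is concave; grid search locates its maximizer.
best_x, best_f = None, -np.inf
for i in range(101):
    for j in range(101 - i):
        x = np.array([i, j, 100 - i - j]) / 100.0
        if potential(x) > best_f:
            best_x, best_f = x, potential(x)

assert np.allclose(best_x, [0.0, 0.5, 0.5])
```

At the maximizer the two used strategies earn a common payoff of 6 while the unused strategy 1 earns only 4, so the maximizer of potential is indeed a Nash equilibrium.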
Example 2.1.6. Games generated by variable externality pricing schemes. Population games can be viewed as models of externalities for environments with many agents. One way to force agents to internalize the externalities they impose upon others is to introduce pricing schemes. Given an arbitrary full population game F with aggregate payoff function F̄, define an augmented game F̃ as follows:

    F̃^p_i(x) = F^p_i(x) + ∑_{q∈P} ∑_{j∈S^q} x^q_j ∂F^q_j/∂x^p_i (x).

The double sum represents the marginal effect that an agent choosing strategy i has on other agents' payoffs.

Suppose that when the game F is played, a social planner charges each agent choosing strategy i a tax equal to this double sum, and that each agent's payoff function is separable in this tax. The population game generated by this intervention is F̃.

Now observe that

(2.4)    ∂F̄/∂x^p_i (x) = ∂/∂x^p_i ∑_{q∈P} ∑_{j∈S^q} x^q_j F^q_j(x) = F^p_i(x) + ∑_{q∈P} ∑_{j∈S^q} x^q_j ∂F^q_j/∂x^p_i (x) = F̃^p_i(x).

Equation (2.4) tells us that the augmented game F̃ is a full potential game, and that its full potential function is the aggregate payoff function of the original game F. Hence, changes in strategy which are profitable in the augmented game increase efficiency with respect to the payoffs of the original game. §

2.1.4 Nash Equilibria of Full Potential Games

We saw in Section 2.1.2 that in full potential games, profitable strategy revisions increase potential. It is therefore natural to expect that Nash equilibria of full potential
games are related to local maximizers of potential. To investigate this idea, consider the nonlinear program

    max f(x)  subject to  ∑_{i∈S^p} x^p_i = m^p for all p ∈ P, and x^p_i ≥ 0 for all i ∈ S^p and p ∈ P.

The Lagrangian for this maximization problem is

    L(x, µ, λ) = f(x) + ∑_{p∈P} µ^p ( m^p − ∑_{i∈S^p} x^p_i ) + ∑_{p∈P} ∑_{i∈S^p} λ^p_i x^p_i,

so the Kuhn-Tucker first order necessary conditions for maximization are

(2.5)    ∂f/∂x^p_i (x) = µ^p − λ^p_i    for all i ∈ S^p and p ∈ P,
(2.6)    λ^p_i x^p_i = 0                for all i ∈ S^p and p ∈ P, and
(2.7)    λ^p_i ≥ 0                      for all i ∈ S^p and p ∈ P.

Let

    KT(f) = { x ∈ X : (x, µ, λ) satisfies (2.5)-(2.7) for some λ ∈ R^n and µ ∈ R^p }.

Theorem 2.1.7 shows that the Kuhn-Tucker first order conditions for maximizing f on X
Theorem 2.1.7. If F is a full potential game with full potential function f, then NE(F) = KT( f ).
Proof. If x is a Nash equilibrium of F, then since F = f , the KuhnTucker conditions
p
p
p
are satisﬁed by x, µp = max j∈Sp F j (x), and λi = µp − Fi (x). Conversely, if (x, µ, λ) satisﬁes the
p ∂f
p (x)
∂xi
p
p
µ − λj KuhnTucker conditions, then for every p ∈ P , (2.5) and (2.6) imply that Fi (x) =
p = µp for all i in the support of xp . Furthermore, (2.5) and (2.7) imply that F j (x) =
≤ µp
p
for all j ∈ Sp . Hence, the support of xp is a subset of argmax j∈Sp F j (x), and so x is a Nash
equilibrium of F.
Note that the multiplier µp represents the equilibrium payoﬀ in population p, and that the
p
multiplier λi represents the “payoﬀ slack” of strategy i ∈ Sp .
Since the set X satisﬁes constraint qualiﬁcation, satisfaction of the KuhnTucker conditions is necessary for local maximization of the full potential function. Thus, Theorem
2.1.7, along with the fact that a continuous function on a compact set achieves its maximum, gives us a simple proof of existence of Nash equilibrium in full potential games.
On the other hand, the KuhnTucker conditions are not suﬃcient for maximizing
potential. Therefore, while all local maximizers of potential are Nash equilibria, not all
Nash equilibria locally maximize potential.
Example 2.1.8. Consider again the 123 Coordination game introduced in Chapter 1: F1 (x) 1 0 0 x1 x1 F(x) = F2 (x) = 0 2 0 x2 = 2x2 . F (x) 0 0 3 x 3x 3
3
3
3
The full potential function for this game is the convex function f (x) = 1 (x1 )2 + (x2 )2 + 2 (x3 )2 .
2
The three pure states, e1 = (1, 0, 0), e2 = (0, 1, 0), and e3 = (0, 0, 1), all locally maximize
potential, and so are Nash equilibria. To focus on one instance, note that the KuhnTucker
conditions are satisﬁed at state e1 by the multipliers µ = 1, λ1 = 0, and λ2 = λ3 = 1. The
632
global minimizer of potential, ( 11 , 11 , 11 ), is a state at which payoﬀs to all three strategies
are equal, and is therefore a Nash equilibrium as well; the KuhnTucker conditions are
6
satisﬁed here with multipliers µ = 11 and λ1 = λ2 = λ3 = 0. Finally, at each of the boundary
2
states ( 2 , 1 , 0), ( 3 , 0, 1 ), and (0, 3 , 5 ), the strategies which are played receive equal payoﬀs,
33
4
4
5
which exceed the payoﬀ accruing to the unused strategy; thus, these states are Nash
equilibria as well. These states, coupled with the appropriate multipliers, also satisfy the
KuhnTucker conditions: for example, x = ( 2 , 1 , 0) satisﬁes the conditions with µ = 2 ,
33
3
λ1 = λ2 = 0 and λ3 = 2 . This exhausts the set of Nash equilibria of F.
3 44 Figure 2.1.1: The potential function of 123 Coordination. Figures 2.1.1 and 2.1.2 contain a graph and a contour plot of the full potential function
f , and show the connection between this function and the Nash equilibria of F. §
The previous example demonstrates that in general, potential games can possess Nash
equilibria that do not maximize potential. But if the full potential function f is concave, the
KuhnTucker conditions are not only necessary for maximizing f ; they are also suﬃcient.
This fact gives us the following corollary to Theorem 2.1.7.
Corollary 2.1.9. (i) If f is concave on X, then NE(F) is the convex set of maximizers of f on
X.
(ii) If f is strictly concave on X, then NE(F) is a singleton containing the unique maximizer of
f on X.
Example 2.1.10. A network of highways connects Home and Work. The two towns are
separated by a river. Highways A and D are expressways that go around bends in the
river, and that do not become congested easily: cA (u) = cD (u) = 4 + 20u. Highways B and
C cross the river over two short but easily congested bridges: cB (u) = cC (u) = 2 + 30u2 . In
order to create a direct path between the towns, a city planner considers building a new
expressway E that includes a third bridge over the river. Delays on this new expressway
are described by cE (u) = 1 + 20u. The highway network as a whole is pictured in Figure
2.1.3.
45 1 2 3 Figure 2.1.2: A contour plot of the potential function of 123 Coordination. Before link E is constructed, there are two paths from Home to Work: path 1 traverses
links A and B, while path 2 traverses links C and D. The equilibrium driving pattern
splits the drivers equally over the two paths, yielding an equilibrium driving time (=
–equilibrium payoﬀ) of 23.5 on each.
After link E is constructed, drivers may also take path 3, which uses links C, E, and B.
(We assume that traﬃc on link E only ﬂows to the right.) The resulting population game
has payoﬀ functions F1 (x) −(6 + 20x1 + 30(x1 + x3 )2 ) F (x) = 2 F(x) = 2 −(6 + 20x2 + 30(x2 + x3 ) ) F (x) −(5 + 20x + 30(x + x )2 + 30(x + x )2 ) 3
3
1
3
2
3
and full potential function
f (x) = − 6x1 + 6x2 + 5x3 + 10((x1 )2 + (x2 )2 + (x3 )2 + (x1 + x3 )3 + (x2 + x3 )3 ) .
Figures 2.1.4 and 2.1.5 contain a graph and a contour plot of the full potential function.
Note that the full potential function for the twopath game is the restriction of f to the
states at which x3 = 0.
Evidently, the full potential function f is concave. (This is no coincidence—see Exercise
2.1.11 below.) The unique maximizer of potential on X, the state x ≈ (.4616, .4616, .0768),
is the unique Nash equilibrium of the game. In this equilibrium, the driving time on each
46 A
B Work Home
C E
D Figure 2.1.3: A highway network. Figure 2.1.4: The potential function of a congestion game. 47 1 2 3 Figure 2.1.5: A contour plot of the congestion game’s potential function. path is approximately 23.93, which exceeds the original equilibrium time of 23.5. In other
words, adding an additional link to the network actually increases equilibrium driving
times—a phenomenon known as Braess’ paradox.
The intuition behind this phenomenon is easy to see. By opening up the new link E, we
make it possible for a single driver on path 3 to use both of the easily congested bridges,
B and C. But while using path 3 is bad for the population as a whole, it is appealing to
individual drivers, as drivers do not account for the negative externalities their use of the
bridges imposes on others. §
Exercise 2.1.11. Uniqueness of equilibrium in congestion games.
(i) Let F be a congestion game with cost functions cφ and full potential function f .
Show that if each cφ is increasing, then f is concave, which implies that NE(F) is
the convex set of maximizers of f on X. (Hint: Fix y, z ∈ X, let x(t) = (1 − t) y + tz,
and show that g(t) = f (x(t)) is concave.)
(ii) Construct a congestion game in which each cφ is strictly increasing but in which
NE(F) is not a singleton.
(iii) Show that in case (ii), the equilibrium link utilization levels uφ are unique. (Hint:
Since f (x) only depends on the state x through the utilization levels uφ (x), we
can deﬁne a function g : U → R on U = {vφ }φ∈Φ : vφ = uφ (x) for some x ∈ Rn by
+
g(uφ (x)) = f (x). Show that x maximizes f on X if and only if uφ (x) maximizes g on
U.) 48 Exercise 2.1.12. Example 2.1.6 shows that by adding statedependent congestion charges
to a congestion game, a planner can ensure that drivers use the network eﬃciently, in
the sense of minimizing average travel times. Show that these congestion charges can be
imposed on a linkbylink basis, and that the price on each link need only depend on the
number of drivers on that link.
Exercise 2.1.13. Show that Cournot competition (Example 2.1.5) with a strictly decreasing
inverse demand function generates a potential game with a strictly concave potential
function, and hence admits a unique Nash equilibrium.
Exercise 2.1.14. Entry and exit. When we deﬁne a full population game F : Rn → Rn , we
+
specify the payoﬀs of each of the n strategies for all possible vectors of population masses.
It is only a small additional step to allow agents to enter and leave the game. Fixing a
vector of population masses (m1 , . . . , mp ), we deﬁne a population game with entry and exit by
p
assuming that the set of feasible social states is X = {x = (x1 , . . . , xp ) ∈ Rn : i∈Sp xi ≤ mp },
+
and that an agent who exits the game receives a payoﬀ of 0.
(i) State an appropriate deﬁnition of Nash equilibrium for population games with
entry and exit.
(ii) A population game with entry and exit is a potential game if it satisﬁes full externality
symmetry (2.2). Prove an analogue of Theorem 2.1.7 for such games. 2.1.5 The Geometry of Nash Equilibrium in Full Potential Games Theorem 2.1.7 shows that if F is a potential game with potential function f , then the set
of states satisfying the Kuhn-Tucker first order conditions for maximizing f are precisely the Nash equilibria of F. We now offer a geometric proof of this result, and discuss its implications.

The nonlinear program from Section 2.1.4 seeks to maximize the function f on the polytope X. What do the Kuhn-Tucker conditions for this program mean?

The Kuhn-Tucker conditions adapt the classical approach to optimization based on linearization to settings with both equality and inequality constraints. In the current context, these conditions embody the following construction: To begin, one linearizes the objective function f at the state x ∈ X of interest, replacing it with the function l_{f,x}(y) = f(x) + ∇f(x)'(y − x). Then, one determines whether the linearized function reaches its maximum on X at state x. Of course, this method can accept states that are not maximizers: for instance, if x is an interior local minimizer of f, then the linearization l_{f,x} is a constant function, and so is maximized everywhere in X. But because X is a polytope (in particular, since constraint qualification holds), x must maximize l_{f,x} on X if it is to maximize f on X.

With this interpretation of the Kuhn-Tucker conditions in hand, we can offer a simple geometric proof that NE(F) = KT(f). The analysis employs our normal cone characterization of Nash equilibrium from Chapter 1.

Theorem 2.1.7: If F is a full potential game with full potential function f, then NE(F) = KT(f).

Second proof.
    x ∈ KT(f) ⇔ x maximizes l_{f,x} on X
             ⇔ ( z ∈ TX(x) ⇒ ∇f(x)'z ≤ 0 )
             ⇔ ∇f(x) ∈ NX(x)
             ⇔ F(x) ∈ NX(x)
             ⇔ x ∈ NE(F). ∎
conditions for f on X is equivalent to maximizing the linearized version of f on X. This
in turn is equivalent to the requirement that if z is in the tangent cone of X at x—that is,
if z is a feasible displacement direction from x—then z forms a weakly obtuse angle with
the gradient vector f (x), representing the direction in which f increases fastest. But this
is precisely what it means for f (x) to lie in the normal cone NX(x). The deﬁnition of
potential tells us that we can replace f (x) with F(x); and we know from Chapter 1 that
F(x) ∈ NX(x) means that x is a Nash equilibrium of F.
This argument sheds new light on Theorem 2.1.7. The KuhnTucker conditions, which
provide a way of ﬁnding the maximizers of the function f , are stated in terms of the
gradient vectors f (x). At ﬁrst glance, it seems rather odd to replace f (x) with some
nonintegrable map F: after all, what is the point of the KuhnTucker conditions when there
is no function to maximize? But from the geometric point of view, replacing f (x) with F
makes perfect sense. When the KuhnTucker conditions are viewed in geometric terms—
namely, in the form f (x) ∈ NX(x)—they become a restatement of the Nash equilibrium
condition; the fact that f (x) is a gradient vector plays no role. So to summarize, the
Nash equilibrium condition F(x) ∈ NX(x) is identical to the KuhnTucker conditions, but
applies whether or not the map F is integrable.
Exercise 2.1.15. Let F be a full potential game with full potential function f . Let C ⊆ NE(F)
be smoothly connected, in the sense that if x, y ∈ C, then there exists a piecewise C1 path
α : [0, 1] → C with α(0) = x and α(1) = y. Show that f is constant on C. (Hint: Use the
Fundamental Theorem of Calculus and the fact that F(x) ∈ NX(x) for all x ∈ NE(F), along
50 with the fact that when α(t) = x and α is diﬀerentiable at x, both α (t) and −α (t) are in
TX(x).) 2.1.6 Eﬃciency in Homogeneous Full Potential Games We saw in Section 2.1.3 that when agents are matched to play normal form games with
common interests, the full potential function of the resulting population game is proportional to the game’s aggregate payoﬀ function. How far can we push this connection?
Deﬁnition. We call a full potential game F homogeneous of degree k if each of its payoﬀ
p
functions Fi : Rn → R is a homogeneous function of degree k, where k −1.
+
Example 2.1.16. Random matching in normal form games with common interests. In the single
population setting, each payoﬀ function F(x) = Ax is linear, so the full potential game
F is homogeneous of degree 1. With p ≥ 2 populations, the payoﬀs Fp to population
p’s strategies are multilinear in (x1 , . . . , xp−1 , xp+1 , . . . , xp ), so the full potential game F is
homogeneous of degree p − 1. §
Example 2.1.17. Isoelastic congestion games. Let F be a congestion game with cost functions
cφ . For each facility φ ∈ Φ, let
ηφ (u) = ucφ (u)
cφ (u) denote φ’s cost elasticity, which is well deﬁned whenever cφ (u) 0. We call a congestion
game isoelastic with elasticity η ∈ R if ηφ = η for all φ ∈ Φ. Thus, a congestion game is
isoelastic if all facilities in Φ are equally sensitive to congestion at all levels of use.
Isoelasticity implies that all cost functions are of the form cφ (u) = aφ uη , where the aφ are
arbitrary (i.e., positive or negative) scalar constants. (Notice that η cannot be negative, as
this would force facility costs to become inﬁnite at u = 0.) Since each uφ is linear in x, each
p
payoﬀ function Fi is a sum of functions that are homogeneous of degree η in x, and so is
itself homogeneous of degree η. Therefore, any isoelastic congestion game with elasticity
η is a homogeneous potential game of degree η. §
The eﬃciency properties of homogeneous potential games are consequences of the
following theorem.
Theorem 2.1.18. The full potential game F is homogeneous of degree k −1 if and only if the
1
normalized aggregate payoﬀ function k+1 F(x) is a full potential function for F and is homogeneous
of degree k + 1 0.
51 1
Proof. If the potential game F is homogeneous of degree k
−1, then k+1 F(x) =
pp
1
p∈P
j∈Sp x j F j (x) is clearly homogeneous of degree k + 1. Therefore, condition (2.2)
k +1
and Euler’s law imply that ∂
p
∂xi 1
1 F(x) = k+1
k+1
q∈P 1 = k+1 q ∂F j q
p x j p (x) + Fi (x) ∂x i
j∈Sq p ∂Fi q
p x j q (x) + Fi (x) ∂x
q q∈P j∈S j 1
p
p
kFi (x) + Fi (x)
k+1
p
= Fi (x),
= 1
1
so k+1 F is a full potential function for F. On the other hand, if k+1 F is homogeneous of
p
1
degree k +1 0 and is a full potential function for F, then each payoﬀ function Fi = ∂∂p ( k+1 F)
xi
is homogeneous of degree k, so the converse statement follows. To understand the connection between homogeneity and eﬃciency, consider the expression ∂∂p F(x), which represents the impact of an agent who chooses strategy i on
xi
aggregate payoﬀs. Recalling Example 2.1.6, we split this impact into two terms. The
q ﬁrst term, q q ∂F j
j x j ∂xp (x),
i represents the impact of this agent’s behavior on his opponents’
p payoﬀs. The second term, Fi (x), represents the agent’s own payoﬀs. In homogeneous
potential games, these two eﬀects are precisely balanced: the payoﬀ an agent receives
from choosing a strategy is directly proportional to the social impact of his choice. For
this reason, selfinterested behavior leads to desirable social outcomes.
Observe that if a potential game is homogeneous of degree less than −1, its full potential function is negatively proportional to aggregate payoﬀs. In this case, selfinterested
behavior leads to undesirable social outcomes. To remove this case from consideration,
we call a potential game positively homogeneous if its full potential function is homogeneous
of strictly positive degree, so that the game itself is homogeneous of degree k > −1.
With this deﬁnition in hand, we can present a result on the eﬃciency of Nash equilibria.
We call the social state x locally eﬃcient in game F (x ∈ LE(F)) if there exists an ε > 0 such
that F̄(x) ≥ F̄(y) for all y ∈ X within ε of x. If this inequality holds for all y ∈ X, we call x
globally eﬃcient (x ∈ GE(F)).
Corollary 2.1.19.
(i) If the full potential game F is positively homogeneous, then LE(F) ⊆ NE(F).
(ii) If in addition its full potential function f is concave, then GE(F) = LE(F) = NE(F).
Exercise 2.1.20. Establish these claims.
Exercise 2.1.21. Let F be a congestion game with nondecreasing affine cost functions: cφ(u) = aφ u + bφ. Suppose that within each population, the fixed cost of each route is equal:

    Σ_{φ∈Φ_i^p} bφ = b^p for all i ∈ S^p and p ∈ P.

Show that NE(F) = GE(F).

2.1.7 Inefficiency Bounds for Congestion Games

The results from the previous section provide stringent conditions under which Nash
equilibria of congestion games are eﬃcient. Since exact eﬃciency rarely obtains, it is
natural to ask just how ineﬃcient equilibrium behavior can be. We address this question
in the context of congestion games with nondecreasing cost functions—in other words,
congestion games in which congestion is a bad.
It will be convenient to use notation tailored to the questions at hand. Given the
facilities φ and the nondecreasing cost functions cφ , we let
    C_i^p(x) = −F_i^p(x) = Σ_{φ∈Φ_i^p} cφ(uφ(x))

denote the cost of strategy i ∈ S^p at state x, and let

    C(x) = −F̄(x) = Σ_{p∈P} Σ_{i∈S^p} x_i^p C_i^p(x) = Σ_{φ∈Φ} uφ(x) cφ(uφ(x))

denote social cost at state x. We refer to the resulting congestion game either as C or as (C, m) (to emphasize the population masses m). When we introduce alternative cost functions γφ, we replace C with Γ in the notation above.
One approach to bounding the ineﬃciency of equilibria is to compare the equilibrium
social cost to the minimal social cost in a game with additional agents.
Proposition 2.1.22. Let C be a congestion game with nondecreasing cost functions. Let x∗ be a
Nash equilibrium of (C, m), and let y be a feasible state in (C, 2m). Then C(x∗ ) ≤ C( y).
Exercise 2.1.23. This exercise outlines a proof of Proposition 2.1.22.
(i) Define the cost functions γφ by γφ(u) = max{cφ(uφ(x*)), cφ(u)}. Show that u(γφ(u) − cφ(u)) ≤ cφ(uφ(x*)) uφ(x*).
(ii) Show that Γ_i^p(y) ≥ min_{j∈S^p} C_j^p(x*).
(iii) Use parts (i) and (ii) to show that Γ(y) − C(y) ≤ C(x*) and that Γ(y) ≥ 2C(x*), and conclude that C(x*) ≤ C(y).
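Proposition 2.1.22 can be illustrated on a small hypothetical two-link network (an assumed example, not from the text) with c1(u) = 1 and c2(u) = u: the equilibrium cost with unit mass is compared against the minimal cost with doubled mass.

```python
import numpy as np

def social_cost(x2, mass):
    # C(x) = u1*c1(u1) + u2*c2(u2) with c1(u) = 1, c2(u) = u
    x1 = mass - x2
    return x1 * 1.0 + x2 ** 2

# Nash equilibrium of (C, m = 1): all drivers on link 2, since c2(1) = 1 = c1
C_eq = social_cost(1.0, 1.0)                       # = 1.0

# minimal social cost in the doubled game (C, m = 2), via a fine grid
grid = np.linspace(0.0, 2.0, 200001)
C_min_doubled = ((2.0 - grid) + grid ** 2).min()   # minimized at x2 = 1/2

print(C_eq, C_min_doubled)                         # 1.0 1.75: C(x*) <= C(y) as claimed
```

The grid search is a crude stand-in for the exact minimization, but it makes the inequality of the proposition concrete.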
Exercise 2.1.24. This exercise applies Proposition 2.1.22 to settings with fixed population masses but varying cost functions.
(i) Show that the equilibrium social cost under cost functions c̃φ(u) = (1/2) cφ(u/2) is bounded above by the minimal social cost under cost functions cφ.
(ii) Let C be a congestion game with cost functions cφ(u) = (kφ − u)^{−1} for some capacities
kφ > 0. (We assume that population masses are small enough that no edge can
reach its capacity.) Using part (i), show that the equilibrium social cost when
capacities are 2k is bounded above by the minimal social cost when capacities are k.
In other words, doubling the capacities of the edges reduces costs at least as much
as enforcing eﬃcient behavior under the original capacities.
A more direct way of understanding ineﬃciency is to bound a game’s ineﬃciency ratio:
the ratio between the game’s equilibrium social cost and its minimal feasible social cost.
Example 2.1.25. A highway network consisting of two parallel links is to be traversed by a unit mass of drivers. The links' cost functions are c1(u) = 1 and c2(u) = u. In the unique Nash equilibrium of this game, all drivers travel on route 2, creating a social cost of 1. The efficient state, which minimizes C(x) = x1 + (x2)², is xmin = (1/2, 1/2); it generates a social cost of C(xmin) = 3/4. Thus, the inefficiency ratio in this game is 4/3.
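The numbers in Example 2.1.25 are easy to reproduce; the sketch below also evaluates the general-degree formula for the two-link network quoted after Theorem 2.1.28.

```python
import numpy as np

# Example 2.1.25: c1(u) = 1, c2(u) = u, unit mass of drivers.
x2 = np.linspace(0.0, 1.0, 100001)
cost = (1.0 - x2) + x2 ** 2        # C(x) = x1 + (x2)^2
C_min = cost.min()                 # efficient cost, attained at x2 = 1/2
C_eq = 1.0                         # equilibrium: all drivers on link 2

print(C_min, C_eq / C_min)         # 0.75 and the inefficiency ratio 4/3

# The same two-link construction with c2(u) = u^k gives the ratio
# (1 - k(k+1)^(-(k+1)/k))^(-1); k = 1 recovers 4/3.
k = 1.0
ratio = 1.0 / (1.0 - k * (k + 1) ** (-(k + 1) / k))
print(ratio)                       # 1.333...
```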
The next result describes an easily established upper bound on ineﬃciency ratios.
Proposition 2.1.26. Suppose that the cost functions cφ are nondecreasing and satisfy u cφ(u) ≤ α ∫_0^u cφ(z) dz for all u ≥ 0. If x* ∈ NE(C) and x ∈ X, then C(x*) ≤ αC(x).
Exercise 2.1.27.
(i) Prove Proposition 2.1.26. (Hint: Use a potential function argument.)
(ii) Show that if cost functions in C are polynomials of degree at most k with nonnegative coeﬃcients, then the ineﬃciency ratio in C is at most k + 1.
Exercise 2.1.27 tells us that the inefficiency ratio of a congestion game with affine cost functions cannot exceed 2. Is it possible to establish a smaller upper bound? We saw in Example 2.1.25 that inefficiency ratios as high as 4/3 can arise in very simple games with affine cost functions. Amazingly, 4/3 is the highest possible inefficiency ratio for congestion games with cost functions of this form.

Theorem 2.1.28. Let C be a congestion game whose cost functions cφ are nonnegative, nondecreasing, and affine: cφ(u) = aφ u + bφ with aφ, bφ ≥ 0. If x* ∈ NE(C) and x ∈ X, then
C(x*) ≤ (4/3) C(x).
To prove Theorem 2.1.28, we introduce the auxiliary cost functions γφ (u) = 2aφ u + bφ .
For intuition, note that when facility φ is used by u agents, the marginal externality imposed by one of these agents on each other user of the facility is c′φ(u) = aφ, so that the total externalities imposed by this agent on all other users are u c′φ(u) = aφ u. Thus, the cost
functions γφ are the ones generated by an externality pricing scheme under which agents
are always made to pay for the externalities they currently impose on others. Since the
aggregate payoﬀ function is concave, it follows that the Nash equilibria of the new game
Γ are the efficient states in the original game C (cf. Example 2.1.6).
Lemma 2.1.29.
(i) γφ(u/2) = cφ(u) for all u ≥ 0.
(ii) γφ(u) = (d/du)(u cφ(u)) for all u ≥ 0.
(iii) u cφ(u) ≥ v cφ(v) + (u − v) γφ(v) for all u, v ≥ 0.
(iv) NE(Γ, m/2) = GE(C, m/2).
(v) x* ∈ NE(C, m) ⇔ x*/2 ∈ GE(C, m/2).
(vi) C(x/2) ≥ (1/4) C(x) for all x ∈ R^n_+.
Proof. Parts (i) and (ii) are immediate. Part (iii) follows directly from part (ii) and the convexity of the function u ↦ u cφ(u). Part (ii) implies that the concave function −Σ_{φ∈Φ} uφ(x) cφ(uφ(x)) = −C(x) is a potential function for Γ, which with Corollary 2.1.9 yields part (iv). Parts (v) and (vi) are easily verified by direct calculation.
Proof of Theorem 2.1.28: Let x* ∈ NE(C, m), and let x be an arbitrary feasible state for (C, m). Parts (iv) and (v) of the lemma imply that x*/2 ∈ NE(Γ, m/2). Hence, −Γ(x*/2)′(y − x*/2) ≤ 0, that is, Γ(x*/2)′(y − x*/2) ≥ 0, for all y feasible in (Γ, m/2) by the normal cone characterization of Nash equilibrium. Therefore:

    C(x) = Σ_{φ∈Φ} cφ(uφ(x)) uφ(x)
         ≥ Σ_{φ∈Φ} [ cφ(uφ(x*/2)) uφ(x*/2) + γφ(uφ(x*/2)) (uφ(x) − uφ(x*/2)) ]       (by (iii))
         = C(x*/2) + Σ_{p∈P} Σ_{i∈S^p} Γ_i^p(x*/2) (x_i^p − (1/2)(x*)_i^p)
         = C(x*/2) + Σ_{p∈P} Σ_{i∈S^p} Γ_i^p(x*/2) · (1/2)(x*)_i^p + Γ(x*/2)′(x − x*)
         ≥ C(x*/2) + Σ_{p∈P} Σ_{i∈S^p} Γ_i^p(x*/2) · (1/2)(x*)_i^p                   (since Γ(x*/2)′(x − x*) = 2Γ(x*/2)′(x/2 − x*/2) ≥ 0)
         = C(x*/2) + (1/2) Σ_{p∈P} Σ_{i∈S^p} C_i^p(x*) (x*)_i^p                      (by (i))
         = C(x*/2) + (1/2) C(x*)
         ≥ (3/4) C(x*)                                                               (by (vi)),

and rearranging gives C(x*) ≤ (4/3) C(x).

That the highest inefficiency ratio for a given class of cost functions can be realized in a very simple network is true quite generally. Consider a two-link network with link cost functions c1(u) = 1 and c2(u) = u^k, where k ≥ 1. With a unit mass population, the Nash equilibrium for this network is x* = (0, 1), and has social cost C(x*) = 1; the efficient state is xmin = (1 − (k+1)^{−1/k}, (k+1)^{−1/k}), and has social cost C(xmin) = 1 − k(k+1)^{−(k+1)/k}. Remarkably, it is possible to show that the resulting inefficiency ratio of (1 − k(k+1)^{−(k+1)/k})^{−1} is the highest possible in any network whose cost functions are polynomials of degree at most k. See the Notes for further details.

2.2 Potential Games

To define full potential games, we first defined full population games by extending the domain of payoffs from the state space X to the positive orthant R^n_+. While this device for introducing potential functions is simple, it is often artificial. By using ideas from affine calculus (Appendix 2.B), we can define potential functions for population games without recourse to changes in domain.

2.2.1 Motivating Examples

We can motivate the developments to come not only by parsimony, but also by generality, as the following two examples show.
Example 2.2.1. Random matching in symmetric normal form potential games. Recall that the symmetric normal form game C ∈ R^{n×n} is a common interest game if C is a symmetric matrix, so that both players always receive the same payoff. We call the symmetric normal form game A ∈ R^{n×n} a potential game if A = C + 1r′ for some common interest game C and some arbitrary vector r ∈ R^n. Thus, each player's payoff is the sum of a common interest term and a term that only depends on his opponent's choice of strategy. (For the latter point, note that A_ij = C_ij + r_j.)
Suppose a population of agents is randomly matched to play game A. Since the second
payoﬀ term has no eﬀect on agents’ incentives, it is natural to expect our characterization
of equilibrium from the previous section to carry over to the current setting. But this does
not follow directly from our previous definitions. Suppose we define the full population game F : R^n_+ → R^n as in Example 2.1.2: F(x) = Ax. Then the resulting derivative matrix is DF(x) = A = C + 1r′, and so

    (∂F_i/∂x_j)(x) = C_ij + r_j,  but  (∂F_j/∂x_i)(x) = C_ji + r_i.
Therefore, unless r is a constant vector (in which case A itself is symmetric), the full
population game F deﬁned above is not a full potential game. §
Example 2.2.2. Two-strategy games. Recall that the population game F : X → R^n is a
two-strategy game if p = 1 and n = 2. In this setting, the state space X is the simplex in R², which can be viewed as a relabelling of the unit interval. Because all continuous functions defined on the unit interval are integrable, it seems natural to expect two-strategy games to admit potential functions. If we wanted to show that F defines a full potential game, we would first need to extend its domain to R²_+. Once we do this, the domain is no longer one-dimensional, so our intuition about the existence of a potential function is lost. §

2.2.2 Definition, Characterizations, and Examples

Example 2.2.2 suggests that the source of our difficulties is the extension of payoffs
from the original state space X to the full-dimensional set R^n_+. As the definition of full potential games relied on this extension, our new notion of potential games will require some additional ideas. The key concepts are the tangent spaces and orthogonal projections introduced in Chapter 1, which we briefly review here.

Recall that the state space for population p is given by X^p = {x^p ∈ R^{n^p}_+ : Σ_{i∈S^p} x_i^p = m^p}. The tangent space of X^p, denoted TX^p, is the smallest subspace of R^{n^p} that contains all directions of motion through X^p; it is defined by TX^p = R^{n^p}_0 ≡ {z^p ∈ R^{n^p} : Σ_{i∈S^p} z_i^p = 0}. The matrix Φ ∈ R^{n^p × n^p}, representing the orthogonal projection of R^{n^p} onto TX^p, is defined by Φ = I − (1/n^p) 11′. If π^p ∈ R^{n^p} is a payoff vector, then the projected payoff vector Φπ^p represents relative payoffs under π^p: it preserves the differences between components of π^p while normalizing their sum to zero. Changes in the social state x ∈ X = Π_{p∈P} X^p are represented by elements of TX = Π_{p∈P} TX^p, the tangent space of X. The matrix Φ ∈ R^{n×n}, representing the orthogonal projection of R^n onto TX, is the block diagonal matrix diag(Φ, . . . , Φ). If π = (π¹, . . . , πᵖ) ∈ R^n is a payoff vector for the society, then Φπ = (Φπ¹, . . . , Φπᵖ) normalizes each of the p pieces of the vector π separately.
With these preliminaries in hand, we are ready for our new deﬁnition.
Definition. Let F : X → R^n be a population game. We call F a potential game if it admits a potential function: a C¹ function f : X → R that satisfies

(2.8)    ∇f(x) = ΦF(x) for all x ∈ X.

Since the potential function f has domain X, the gradient vector ∇f(x) is by definition an
element of the tangent space TX (see Appendix 2.B.3). Our deﬁnition of potential games
requires that this gradient vector always equal ΦF(x), the projection of the payoﬀ vector
F(x) onto the subspace TX.
At the cost of sacrificing parsimony, one can define potential games without affine calculus by using a function defined throughout R^n_+ to play the role of the potential function f. To do so, one simply includes the projection Φ on both sides of the analogue of equation (2.8).

Observation 2.2.3. If F is a potential game with potential function f : X → R, then any C¹ extension f̃ : R^n_+ → R of f satisfies

(2.9)    Φ∇f̃(x) = ΦF(x) for all x ∈ X.

Conversely, if the population game F admits a function f̃ satisfying condition (2.9), then F is a potential game, and the restriction f = f̃|_X is a potential function for F.

This observation is immediate from the relevant definitions. In particular, if f̃ and f agree on X, then for all x ∈ X the gradient vectors ∇f̃(x) and ∇f(x) define identical linear operators on TX, implying that Φ∇f̃(x) = Φ∇f(x). But since Φ∇f(x) = ∇f(x) by definition, it follows that Φ∇f̃(x) = ∇f(x); this equality and definition (2.8) yield the result.
Like full potential games, potential games can be characterized by a symmetry condition on the payoﬀ derivatives DF(x). Since potential games generalize full potential
games, the new symmetry condition is less restrictive than the old one.
Theorem 2.2.4. Suppose the population game F : X → Rn is C1 . Then F is a potential game if
and only if it satisﬁes externality symmetry:
(2.10)    DF(x) is symmetric with respect to TX × TX for all x ∈ X.

Proof. Immediate from Theorem 2.B.8 in Appendix 2.B.
Condition (2.10) demands that at each state x ∈ X, the derivative DF(x) deﬁne a
symmetric bilinear form on TX × TX:

    z′DF(x)ẑ = ẑ′DF(x)z for all z, ẑ ∈ TX and x ∈ X.

Observation 2.2.5 offers a version of this condition that does not require affine calculus,
just as Observation 2.2.3 did for definition (2.8).

Observation 2.2.5. Suppose that the population game F : X → R^n is C¹, and let F̃ : R^n_+ → R^n be any C¹ extension of F. Then F satisfies externality symmetry (and so is a potential game) if and only if

    ΦDF̃(x)Φ is symmetric for all x ∈ X.
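Observation 2.2.5 can be checked directly with random numbers (an assumed single-population example, not from the text): for F̃(x) = Ax with A = C + 1r′, the derivative matrix A is not symmetric, but its projected version ΦAΦ is.

```python
import numpy as np

n = 4
rng = np.random.default_rng(1)
C = rng.standard_normal((n, n))
C = C + C.T                              # symmetric common interest part
r = rng.standard_normal(n)
A = C + np.outer(np.ones(n), r)          # A_ij = C_ij + r_j

Phi = np.eye(n) - np.ones((n, n)) / n    # orthogonal projection onto TX
M = Phi @ A @ Phi

print(np.allclose(A, A.T))               # False: DF(x) = A itself is not symmetric
print(np.allclose(M, M.T))               # True: ΦAΦ is, so F is a potential game
```

The second check works because Φ1 = 0, so the passive component 1r′ vanishes under projection.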
The next exercise characterizes externality symmetry in a more intuitive way.
Exercise 2.2.6. Show that externality symmetry (2.10) holds if and only if the previous equality holds whenever z = e_j^p − e_i^p and ẑ = e_l^q − e_k^q. In other words, show that (2.10) is equivalent to

(2.11)    ∂(F_j^p − F_i^p)/∂(e_l^q − e_k^q) (x) = ∂(F_l^q − F_k^q)/∂(e_j^p − e_i^p) (x) for all i, j ∈ S^p, k, l ∈ S^q, p, q ∈ P, and x ∈ X.

The left hand side of equation (2.11) captures the change in the payoff to strategy j ∈ S^p
relative to strategy i ∈ Sp as agents switch from strategy k ∈ Sq to strategy l ∈ Sq . This
eﬀect must equal the change in the payoﬀ of l relative to k as agents switch from i to j, as
expressed on the right hand side of (2.11). This description is akin to that of full externality
symmetry (2.2) (see the discussion after equation (2.3)), but it only refers to relative payoﬀs
and to feasible changes in the social state.
Exercise 2.2.7. Let F be a C¹ single population game. Show that F is a potential game if and only if it satisfies triangular integrability:

    ∂F_i/∂(e_j − e_k) (x) + ∂F_j/∂(e_k − e_i) (x) + ∂F_k/∂(e_i − e_j) (x) = 0 for all i, j, k ∈ S and x ∈ X.

We now return to the examples that led off the section.
Example 2.2.8. Two-strategy games revisited. If F : X → R² is a smooth two-strategy game, its state space X is the simplex in R², whose tangent space TX is spanned by the vector d = e1 − e2. If z and ẑ are vectors in TX, then z = kd and ẑ = k̂d for some real numbers k and k̂; thus, however F is defined, we have that z′DF(x)ẑ = kk̂ d′DF(x)d = ẑ′DF(x)z for all x ∈ X. In other words, F is a potential game. Even if F is merely continuous, the function f : X → R defined by

(2.12)    f(x1, 1 − x1) = ∫_0^{x1} (F1(t, 1 − t) − F2(t, 1 − t)) dt

is a potential function for F, so F is still a potential game. (If you think that a 1/2 is needed on the right hand side of equation (2.12), convince yourself that it is not.) §

Exercise 2.2.9. Random matching in symmetric normal form potential games. Let A = C + 1r′ be a symmetric normal form potential game: C ∈ R^{n×n} is symmetric, and r ∈ R^n is arbitrary. Define the population game F : X → R^n by F(x) = Ax. Use one of the derivative conditions above to verify that F is a potential game, and find a potential function f : X → R for F.
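A quick numeric check of equation (2.12), using an arbitrary (assumed) two-strategy game F(x) = Ax: the derivative of the integral along the simplex recovers the payoff difference F1 − F2, with no factor of 1/2.

```python
import numpy as np

A = np.array([[0.0, 2.0], [1.0, 0.0]])   # an arbitrary two-strategy game

def payoff_diff(t):
    p = A @ np.array([t, 1.0 - t])
    return p[0] - p[1]                   # F1 - F2 along the simplex

def f(x1, steps=20001):
    # trapezoidal approximation of the integral in (2.12)
    t = np.linspace(0.0, x1, steps)
    vals = np.array([payoff_diff(s) for s in t])
    return np.sum((vals[1:] + vals[:-1]) / 2.0 * np.diff(t))

x1, h = 0.37, 1e-4
deriv = (f(x1 + h) - f(x1 - h)) / (2.0 * h)
print(np.isclose(deriv, payoff_diff(x1), atol=1e-6))   # True: f'(x1) = F1 - F2
```

For this affine game the integrand 2 − 3t is linear, so the trapezoidal rule is exact and the check is sharp.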
Exercise 2.2.10. Random matching in normal form potential games. The normal form game U = (U¹, . . . , Uᵖ) is a potential game if there is a potential function V : S → R and auxiliary functions W^p : S^{−p} → R such that

    U^p(s) = V(s) + W^p(s^{−p}) for all s ∈ S and p ∈ P.

In a normal form potential game, each player's payoff is the sum of a common payoff term and a term that only depends on opponents' behavior. It is easy to show that pure strategy profile s ∈ S is a Nash equilibrium of U if and only if it is a local maximizer of the potential function V.
(i) Define the full population game F̃ : R^n_+ → R^n by

    F̃_{s^p}^p(x) = Σ_{s^{−p}∈S^{−p}} (V(s) + W^p(s^{−p})) Π_{r≠p} x_{s^r}^r = Σ_{s^{−p}∈S^{−p}} U^p(s) Π_{r≠p} x_{s^r}^r.

Show that F̃ is not a full potential game.
(ii) Define the population game F : X → R^n using the equation from part (i). By verifying condition (2.10), show that F is a potential game.
(iii) Construct a potential function for F.

2.2.3 Potential Games and Full Potential Games

What is the relationship between full potential games and potential games? In the
former case, condition (2.1) requires that payoffs be completely determined by the potential function, which is defined on R^n_+; in the latter, condition (2.8) asks only that relative payoffs be determined by the potential function, now defined just on X.

To understand the relationship between the two definitions, take a potential game
F : X → R^n with potential function f : X → R as given, and extend f to a full potential function f̃ : R^n_+ → R. Theorem 2.2.11 shows that the link between the full potential game F̃ ≡ ∇f̃ and the original game F depends on how the extension f̃ is chosen.

Theorem 2.2.11. Let F : X → R^n be a potential game with potential function f : X → R. Let f̃ : R^n_+ → R be any C¹ extension of f, and define the full potential game F̃ : R^n_+ → R^n by F̃(x) = ∇f̃(x). Then
(i) The population games F and F̃|_X have the same relative payoffs: ΦF̃(x) = ΦF(x) for all x ∈ X.
(ii) One can choose the extension f̃ in such a way that F and F̃|_X are identical.

Part (i) of the theorem shows that the full potential game F̃ generated from an arbitrary extension of the potential function f exhibits the same relative payoffs as F on their common domain X. It follows that F and F̃ have the same best response correspondences and Nash equilibria, but may exhibit different average payoff levels. Part (ii) of the theorem shows that by choosing the extension f̃ appropriately, we can make F and F̃ identical on X. To accomplish this, we construct the extension f̃ in such a way (equation
(2.13) below) that its derivatives at states in X evaluated in directions orthogonal to TX
encode information about average payoﬀs from the original game F.
In conclusion, Theorem 2.2.11(ii) demonstrates that if population masses are ﬁxed, so
that the relevant set of social states is X, then deﬁnition (2.1), while more diﬃcult to check,
does not entail a loss of generality relative to definition (2.8).

Proof of Theorem 2.2.11: Part (i) follows from the fact that ΦF̃(x) = Φ∇f̃(x) = ∇f(x) = ΦF(x) for all x ∈ X; compare the discussion following Observation 2.2.3.
To prove part (ii), we first extend f and F from the state space X to its affine hull aff(X). Let f̂ : aff(X) → R be a C¹ extension of f : X → R, and let g^p : aff(X) → R be a continuous extension of population p's average payoff function, (1/n^p) 1′F^p : X → R. (The existence of these extensions follows from the Whitney Extension Theorem.) Then define Ĝ : aff(X) → R^n by Ĝ^p(x) = 1g^p(x), so that F(x) = ΦF(x) + (I − Φ)F(x) = ∇f̂(x) + Ĝ(x) for all x ∈ X. If after this we define F̂ : aff(X) → R^n by F̂(x) = ∇f̂(x) + Ĝ(x), then F̂ is a continuous extension of F, and ∇f̂(x) = ΦF̂(x) for all x ∈ aff(X).

With this groundwork complete, we can extend f to all of R^n_+ via

(2.13)    f̃(y) = f̂(ξ(y)) + (y − ξ(y))′ F̂(ξ(y)),

where ξ(y) = Φy + z⊥_TX is the closest point to y in aff(X). (Here, z⊥_TX is the orthogonal translation vector that sends TX to aff(X): namely, (z⊥_TX)^p = (m^p/n^p) 1.) Theorem 2.B.10 shows that F̃|_X = ∇f̃|_X is identical to F.

Theorem 2.2.11 implies that all of our results from Sections 2.1.4 and 2.1.5 on Nash
equilibria of full potential games apply unchanged to potential games. On the other hand,
the eﬃciency results from Section 2.1.6 do not. In particular, the proof of Theorem 2.1.18
depends on the game F being a full population game, as the application of Euler’s Theorem
makes explicit use of the partial derivatives of F. In fact, to establish that a potential game
F has eﬃciency properties of the sorts described in Section 2.1.6, one must show that F
can be extended to a homogeneous full potential game. This should come as no surprise:
since the potential function f : X → R only captures relative payoﬀs, it cannot be used to
prove eﬃciency results, which depend on both relative and average payoﬀs.
Exercise 2.2.12. Consider population games with entry and exit (Exercise 2.1.14). Which
derivative condition is the right one for deﬁning potential games in this context, (2.2) or
(2.10)? Why?

Exercise 2.2.13. Prove this simple “converse” to Theorem 2.2.11: Suppose F̃ : R^n_+ → R^n is a full potential game with full potential function f̃ : R^n_+ → R. Let F = F̃|_X and f = f̃|_X. Then F is a potential game with potential function f.

2.2.4 Passive Games and Constant Games

We conclude this section by introducing two simple classes of population games.
Deﬁnition. The population game H : X → Rn is a passive game if for each state x ∈ X and each
population p ∈ P , the payoﬀs to all of population p’s strategies are equal:
    H_i^p(x) = H_j^p(x) for all i, j ∈ S^p, p ∈ P, and x ∈ X.
Deﬁnition. The population game K : X → Rn is a constant game if all strategies’ payoﬀs are
independent of the state: that is, if K(x) = π for all x ∈ X, or, more explicitly, if
    K_i^p(x) = π_i^p for all i ∈ S^p, p ∈ P, and x ∈ X.
In a passive game, an agent’s own behavior has no bearing on his payoﬀs; in a constant
game, each agent’s behavior is the sole determinant of his payoﬀs.
The following two propositions provide some alternate characterizations of these
games.
Proposition 2.2.14. The following statements are equivalent:
(i) H is a passive game.
(ii) There are functions c^p : X → R such that H^p(x) = c^p(x)1 for all p ∈ P and x ∈ X.
(iii) H(x) ∈ (TX)⊥ for all x ∈ X.
(iv) ΦH(x) = 0 for all x ∈ X.
(v) z′H(x) = 0 for all z ∈ TX and x ∈ X.
(vi) H is a potential game whose potential function is constant.

Proposition 2.2.15. The following statements are equivalent:
(i) K is a constant game.
(ii) DK(x) = 0 for all x ∈ X.
(iii) K is a potential game that admits a linear potential function.
In particular, if K(x) = π is a constant game, then k(x) = π′x is a potential function for K.
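The characterizations in Propositions 2.2.14 and 2.2.15 are easy to test numerically; the sketch below uses a hypothetical single-population example (the particular payoff functions are assumptions, not from the text).

```python
import numpy as np

n = 3
Phi = np.eye(n) - np.ones((n, n)) / n    # projection onto TX

def H(x):
    # passive game: every strategy earns the same payoff c(x)
    return np.sum(x ** 2) * np.ones(n)

x = np.array([0.2, 0.5, 0.3])
z = np.array([1.0, -1.0, 0.0])           # a tangent vector

print(np.allclose(Phi @ H(x), 0.0))      # (iv): ΦH(x) = 0
print(np.isclose(z @ H(x), 0.0))         # (v): z'H(x) = 0

pi = np.array([2.0, -1.0, 0.5])          # constant game K(x) = pi
k = lambda y: pi @ y                     # linear potential k(x) = pi'x
eps = 1e-6
grad_k = np.array([(k(x + eps * e) - k(x - eps * e)) / (2 * eps) for e in np.eye(n)])
print(np.allclose(grad_k, pi))           # gradient of k is the constant payoff pi
```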
One reason that passive and constant games are interesting is that adding them to a
population game from a certain class (the potential games, the stable games, the supermodular games) results in a new game from the same class. For instance, suppose that F is
a potential game with potential function f , let H be a passive game, and let K be a constant
game with potential function k. Evidently, F + H is also a potential game with potential
function f ; thus, adding H to F leaves the Nash equilibria of F unchanged. F + K is also
a potential game, but its potential function is not f , but f + k; thus, NE(F) and NE(F + K)
generally diﬀer. Similar observations are true for stable games and for supermodular
games: adding a passive game or a constant game to a game from either of these classes
keeps us in the class, but only adding passive games leaves incentives unchanged.
When payoﬀs are smooth, the invariances just described can be represented in terms
of payoﬀ derivatives. As an illustration, recall that the C1 population game F : X → Rn is
a potential game if and only if it satisﬁes externality symmetry:
(2.10)    DF(x) is symmetric with respect to TX × TX for all x ∈ X.

The first TX tells us that condition (2.10) constrains the effects of left multiplication of
DF(x) by elements of TX; this restricts the purview of (2.10) to changes in relative payoﬀs.
The second TX tells us that (2.10) constrains the eﬀects of right multiplication of DF(x) by
elements of TX; this reﬂects that we can only evaluate how payoﬀs change in response
to feasible changes in the state. In summary, the action of the derivative matrices DF(x) on
TX × TX captures changes in relative payoﬀs due to feasible changes in the state. We have
seen that this action is enough to characterize potential games, and we will soon ﬁnd that
it is enough to characterize stable and supermodular games as well.

It follows from this discussion that the additions to F that do not affect the action of its
derivative matrices on TX × TX are the ones that do not alter F’s class. These additions
are characterized by the following proposition.
Proposition 2.2.16. Let G be a C1 population game. Then DG(x) is the null bilinear form on
TX × TX for all x ∈ X if and only if G = H + K, where H is a passive game and K is a constant
game.
Exercise 2.2.17. Prove Propositions 2.2.14, 2.2.15, and 2.2.16. (Hints: For Proposition 2.2.15,
prove the equivalence of (i) and (iii) using the Fundamental Theorem of Calculus. For
2.2.16, use the previous propositions, along with the fact that DG(x) is the null bilinear
form on TX × TX if and only if ΦDG(x) = 0.)
Exercise 2.2.18.
(i) Suppose H(x) = Ax is a single population passive game. Describe A.
(ii) Suppose K(x) = Ax is a single population constant game. Describe A.

2.3 Stable Games

There are a variety of well-known classes of games whose Nash equilibria lie in a
single convex component: for instance, two-player zero-sum games, wars of attrition,
games with an interior ESS or NSS, and potential games with concave potential functions.
This shared property of these seemingly disparate examples springs from a common
source: all of these examples are stable games.

2.3.1 Definition

The common structure in the examples above is captured by the following definition.
Definition. The population game F : X → R^n is a stable game if

(2.14)    (y − x)′(F(y) − F(x)) ≤ 0 for all x, y ∈ X.

If the inequality in condition (2.14) holds strictly whenever x ≠ y, we call F a strictly stable game, while if this inequality always binds, we call F a null stable game.

For a first intuition, imagine for the moment that F ≡ ∇f is also a full potential game. In this case, condition (2.14) is simply the requirement that the potential function f be concave. Our definition of stable games thus extends the defining property of concave
potential games to games whose payoﬀs are not integrable.
Stable games whose payoﬀs are diﬀerentiable can be characterized in terms of the
action of their derivative matrices DF(x) on TX × TX.
Theorem 2.3.1. Suppose the population game F is C1 . Then F is a stable game if and only if it
satisﬁes selfdefeating externalities:
(2.15) DF(x) is negative semideﬁnite with respect to TX for all x ∈ X. Before proving Theorem 2.3.1, let us provide some intuition for condition (2.15). This
condition asks that
z DF(x)z ≤ 0 for all z ∈ TX and x ∈ X.
This requirement is in turn equivalent to p∈P p
p ∂Fi
(x)
zi
∂z i∈Sp ≤ 0 for all z ∈ TX and all x ∈ X. To interpret this expression, recall that the displacement vector z ∈ TX describes the
aggregate eﬀect on the population state of strategy revisions by a small group of agents.
The derivative (∂F_i^p/∂z)(x) represents the marginal effect that these revisions have on the payoffs of agents currently choosing strategy i ∈ S^p. Condition (2.15) considers a weighted sum of
these eﬀects, with weights given by the changes in the use of each strategy, and requires
that this weighted sum be negative.
Intuitively, a game exhibits self-defeating externalities if the improvements in the payoffs of strategies to which revising agents are switching are always exceeded by the improvements in the payoffs of strategies which revising agents are abandoning. For example, suppose the tangent vector z takes the form z = e_j^p − e_i^p, representing switches by some members of population p from strategy i to strategy j. In this case, the requirement in condition (2.15) reduces to (∂F_j^p/∂z)(x) ≤ (∂F_i^p/∂z)(x): that is, any performance gains that the switches create for the newly chosen strategy j are dominated by the performance gains created for the abandoned strategy i.
Exercise 2.3.2.
(i) Characterize the C¹ two-strategy stable games using a derivative condition.
(ii) Recall the Hawk-Dove game introduced in Chapter 1:

    F(x) = [ −1  2 ; 0  1 ] [ x_H ; x_D ] = [ 2x_D − x_H ; x_D ].

Verify that F is a stable game. Also, fill in the numerical details of the argument from the previous paragraph for this specific choice of payoff function.
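A sketch of the computation behind part (ii): with DF(x) equal to the constant matrix A and TX spanned by z = (1, −1), stability reduces to checking the sign of z′Az.

```python
import numpy as np

A = np.array([[-1.0, 2.0], [0.0, 1.0]])  # Hawk-Dove payoff matrix
z = np.array([1.0, -1.0])                # spans the tangent space TX

print(z @ A @ z)   # -2.0: negative, so condition (2.15) holds and F is stable
```

Since z′Az = −2 < 0 and every tangent vector is a multiple of z, the game is in fact strictly stable.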
Proof of Theorem 2.3.1: To begin, suppose that F is a stable game. Fix x ∈ X and z ∈ TX; we want to show that z′DF(x)z ≤ 0. Since F is C¹, it is enough to consider x in the interior of X. In this case, yε = x + εz lies in X whenever ε is sufficiently small, and so

    F(yε) = F(x) + DF(x)(yε − x) + o(|yε − x|)

by the definition of DF(x). Premultiplying by (yε − x)′ and rearranging yields

    (yε − x)′(F(yε) − F(x)) = (yε − x)′DF(x)(yε − x) + o(|yε − x|²).

Since the left hand side is nonpositive and since yε − x = εz, it follows that ε²z′DF(x)z + o(ε²) ≤ 0, and hence that z′DF(x)z ≤ 0.

Next, suppose that condition (2.15) holds. Then if we let α(t) = ty + (1 − t)x, the Fundamental Theorem of Calculus implies that

    (y − x)′(F(y) − F(x)) = ∫_0^1 (y − x)′ DF(α(t)) (y − x) dt ≤ 0,

since the integrand is nonpositive for each t by (2.15).
Exercise 2.3.3. The derivative condition that characterizes potential games, externality symmetry (2.10), requires that z′DF(x)ẑ = ẑ′DF(x)z. That z and ẑ are chosen separately means that DF(x) is treated as a bilinear form. Exercise 2.2.6 shows that in order to check that (2.10) holds for all z and ẑ in TX, it is enough to show that it holds for all z and ẑ in a basis for TX—for example, the set of vectors of the form e_j^p − e_i^p.

In contrast, self-defeating externalities (2.15), which requires that z′DF(x)z ≤ 0, places the same vector z on both sides of DF(x), thus viewing DF(x) as a quadratic form. Explain why the conclusion of Exercise 2.2.6 does not extend to the present setting. Also, construct a 3 × 3 symmetric game A such that z′Az ≤ 0 whenever z is of the form e_j − e_i but such that F(x) = Ax is not a stable game.

2.3.2 Examples

Example 2.3.4. Random matching in symmetric normal form games with an interior evolutionarily
or neutrally stable state. Let A be a symmetric normal form game. State x ∈ X is an
evolutionarily stable state (or an evolutionarily stable strategy, or simply an ESS) of A if
(2.16)    x′Ax ≥ y′Ax for all y ∈ X; and
(2.17)    x′Ax = y′Ax implies that x′Ay > y′Ay.

Condition (2.16) says that x is a symmetric Nash equilibrium of A. Condition (2.17) says
that x performs better against any alternative best reply y than y performs against itself.
(Alternatively, (2.16) says that no y ∈ X can strictly invade x, and (2.16) and (2.17) together
say that if y can weakly invade x, then x can strictly invade y—see Section 2.3.3 below.) If
we weaken condition (2.17) to
(2.18)    If x′Ax = y′Ax, then x′Ay ≥ y′Ay,

then a state satisfying conditions (2.16) and (2.18) is called a neutrally stable state (NSS).
Suppose that the ESS x lies in the interior of X. Then as x is an interior Nash equilibrium, all pure and mixed strategies are best responses to it: for all y ∈ X, we have that x′Ax = y′Ax, or, equivalently, that (x − y)′Ax = 0. Next, we can rewrite the inequality in condition (2.17) as (x − y)′Ay > 0. Subtracting this last expression from the previous one yields (x − y)′A(x − y) < 0. But since x is in the interior of X, all tangent vectors z ∈ TX are proportional to x − y for some choice of y ∈ X. Therefore, z′DF(x)z = z′Az < 0 for all z ∈ TX, and so F is a strictly stable game. Similar reasoning shows that if F admits an interior NSS, then F is a stable game. §
Example 2.3.5. Random matching in Rock–Paper–Scissors. In Rock–Paper–Scissors, Paper covers Rock, Scissors cut Paper, and Rock smashes Scissors. If a win in a match is worth w > 0, a loss is worth −l < 0, and a draw is worth 0, we obtain the symmetric normal form game

    A = (  0  −l   w )
        (  w   0  −l ),   where w, l > 0.
        ( −l   w   0 )

When w = l, we refer to A as (standard) RPS; when w > l, we refer to A as good RPS; and when w < l, we refer to A as bad RPS. In all cases, the unique symmetric Nash equilibrium of A is (1/3, 1/3, 1/3).
To determine the parameter values for which this game generates a stable population game, define d = w − l. Since y′Ay = (1/2) y′(A + A′)y, it is enough to determine when the symmetric matrix

    Â = A + A′ = ( 0  d  d )
                 ( d  0  d )
                 ( d  d  0 )

is negative semidefinite with respect to TX. Now Â has one eigenvalue of 2d corresponding to the eigenvector 1, and two eigenvalues of −d corresponding to the orthogonal eigenspace TX. Thus, z′Âz = −d z′z for each z ∈ TX. Since z′z > 0 whenever z ≠ 0, we conclude that F is stable if and only if d ≥ 0. Thus, good RPS is strictly stable, standard RPS is stable, and bad RPS is neither. §
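These eigenvalue computations are easy to confirm numerically. The sketch below (our own NumPy script, not from the text; the helper names are assumptions) projects A + A′ onto the tangent space with Φ = I − (1/n)11′ and inspects the eigenvalues of Φ(A + A′)Φ; the single zero eigenvalue in each case corresponds to the direction 1, which Φ annihilates:

```python
import numpy as np

def rps(w, l):
    """Payoff matrix for Rock-Paper-Scissors with win w and loss -l."""
    return np.array([[0.0, -l, w],
                     [w, 0.0, -l],
                     [-l, w, 0.0]])

def tangent_eigenvalues(A):
    """Eigenvalues of Phi (A + A') Phi, where Phi = I - (1/n) 11' is the
    orthogonal projection onto TX; negativity on TX means stability."""
    n = A.shape[0]
    Phi = np.eye(n) - np.ones((n, n)) / n
    return np.linalg.eigvalsh(Phi @ (A + A.T) @ Phi)

good = tangent_eigenvalues(rps(3, 1))  # w > l: strictly stable on TX
std = tangent_eigenvalues(rps(1, 1))   # w = l: null stable
bad = tangent_eigenvalues(rps(1, 3))   # w < l: not stable
```

For good RPS (w = 3, l = 1, so d = 2) the eigenvalues on TX are −d = −2 with multiplicity two, matching the formula z′Âz = −d z′z above.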
Exercise 2.3.6. Random matching in wars of attrition. A war of attrition is a two player symmetric normal form game. Strategies represent amounts of time committed to waiting for a scarce resource. If the two players choose times i and j > i, then the j player obtains the resource, worth v, while both players pay a cost of c_i: once the first player leaves, the other seizes the resource immediately. If both players choose time i, the resource is split, so payoffs are v/2 − c_i each. Show that for any resource value v ∈ R and any cost vector c ∈ R^n satisfying c_1 ≤ c_2 ≤ . . . ≤ c_n, random matching in a war of attrition generates a stable game. §
Example 2.3.7. Random matching in symmetric zero-sum games. A symmetric two player normal form game A is symmetric zero-sum if A is skew-symmetric: that is, if A_ji = −A_ij for all i, j ∈ S. This condition ensures that under single population random matching, the total utility generated in any match is zero. Since payoffs in the resulting single population game are F(x) = Ax, we find that z′DF(x)z = z′Az = 0 for all vectors z ∈ R^n, and so F is a null stable game. §
Example 2.3.8. Random matching in standard zero-sum games. A two player normal form game U = (U_1, U_2) is zero-sum if U_2 = −U_1, so that the two players' payoffs always add up to zero. Random matching of two populations to play U generates the population game

    F(x¹, x²) = (   0     U_1 ) (x¹)  =  (    0      U_1 ) (x¹)
                ( (U_2)′   0  ) (x²)     ( −(U_1)′    0  ) (x²).

If z is a vector in R^n = R^{n¹+n²}, then

    z′DF(x)z = (z¹)′U_1 z² − (z²)′(U_1)′z¹ = 0,

so F is a null stable game. §
Exercise 2.3.9. Random matching in multi-zero-sum games. Let U be a p player normal form game in which each player p ∈ P chooses a single strategy from S^p to simultaneously play a distinct zero-sum contest with each of his p − 1 opponents. We call such a U a multi-zero-sum game.
(i) When p < q, let Z^pq ∈ R^{n^p × n^q} denote player p's payoff matrix for his zero-sum contest against player q. Define the normal form game U in terms of the Z^pq matrices.
(ii) Let F be the p population game generated by random matching in U. Show that z′DF(x)z = 0 for all x ∈ X and z ∈ R^n, and hence that F is a null stable game.
The previous example and exercise show that random matching across multiple populations can generate a null stable game. Proposition 2.3.10 reveals that null stable games are the only stable games that can be generated in this way.
Proposition 2.3.10. Suppose F is a C¹ stable game without own-population interactions: F^p(x) is independent of x^p for all p ∈ P. Then F is a null stable game.

Proof. By Theorem 2.3.1, F is stable if and only if for all x ∈ X, DF(x) is negative semidefinite with respect to TX. This requirement on DF(x) can be restated as (i) ΦDF(x)Φ is negative semidefinite (with respect to R^n); or as (ii) Φ(DF(x) + DF(x)′)Φ is negative semidefinite; or (since the previous matrix is symmetric) as (iii) Φ(DF(x) + DF(x)′)Φ has all eigenvalues nonpositive. By similar logic, F is null stable if and only if for all x ∈ X, Φ(DF(x) + DF(x)′)Φ has all eigenvalues zero (and so is the null matrix).
Let D_qF^p(x) be the (p, q)th block of the derivative matrix DF(x). Since F^p is independent of x^p, it follows that D_pF^p(x) = 0, and hence that Φ(D_pF^p(x) + D_pF^p(x)′)Φ = 0. Since this product is the (p, p)th block of the symmetric matrix Φ(DF(x) + DF(x)′)Φ, the latter has zero trace, and so its eigenvalues sum to zero. Therefore, the only way Φ(DF(x) + DF(x)′)Φ can be negative semidefinite is if all of its eigenvalues are zero. In other words, if F is stable, it is null stable.
Proposition 2.3.10 tells us that within-population interactions are required to obtain a strictly stable game. Thus, strictly stable games can arise when there is matching within a single population to play a symmetric normal form game, but not when there is random matching in multiple populations to play a standard normal form game.
On the other hand, strictly stable games can arise in multipopulation matching settings that allow matches both across and within populations (see the Notes). Moreover, in general population games—for instance, in congestion games—within-population interactions are the norm, and strictly stable games are not uncommon. Our remaining examples illustrate this point.
Example 2.3.11. (Perturbed) concave potential games. We call F : X → R^n a concave potential game if it is a potential game whose potential function f : X → R is concave. Then since y − x ∈ TX, since the orthogonal projection matrix Φ is symmetric, and since ∇f ≡ ΦF, we find that

    (y − x)′(F(y) − F(x)) = (Φ(y − x))′(F(y) − F(x))
                          = (y − x)′(ΦF(y) − ΦF(x))
                          = (y − x)′(∇f(y) − ∇f(x)) ≤ 0,

so F is a stable game. If the inequalities above are satisfied strictly, then they will continue to be satisfied if the payoff functions are slightly perturbed. In other words, perturbations of strictly concave potential games remain strictly stable games. §
Example 2.3.12. Negative dominant diagonal games. We call the full population game F a negative dominant diagonal game if it satisfies

    ∂F_i^p/∂x_i^p (x) ≤ 0   and   |∂F_i^p/∂x_i^p (x)| ≥ (1/2) Σ_{(j,q)≠(i,p)} ( |∂F_j^q/∂x_i^p (x)| + |∂F_i^p/∂x_j^q (x)| )

for all i ∈ S^p, p ∈ P, and x ∈ X. The first condition says that choosing strategy i ∈ S^p imposes a negative externality on other users of this strategy. The second condition requires that this externality exceed the average of (i) the total externalities that strategy i imposes on other strategies and (ii) the total externalities that other strategies impose on strategy i. These conditions are precisely what is required for the matrix DF(x) + DF(x)′ to have a negative dominant diagonal. The dominant diagonal condition implies that all of the eigenvalues of DF(x) + DF(x)′ are nonpositive; since DF(x) + DF(x)′ is also symmetric, it is negative semidefinite. Therefore, DF(x) is negative semidefinite too, and so F is a stable game. §

2.3.3 Invasion

In Section 2.3.4, we introduce new equilibrium concepts that are of basic importance
for stable games: global neutral stability and global evolutionary stability. These concepts
are best understood in terms of the notion of invasion to be presented now.
Let F : X → R^n be a population game, and let x, y ∈ X be two social states. We say that y can weakly invade x (written y ∈ I_F(x)) if (y − x)′F(x) ≥ 0, and that y can strictly invade x (written y ∈ I_F°(x)) if (y − x)′F(x) > 0.
The intuition behind these definitions is simple. Consider a single population of agents who play the game F, and whose initial behavior is described by the state x ∈ X. Now imagine that a very small group of agents decide to switch strategies. After these agents select their new strategies, the distribution of choices within their group is described by some y ∈ X, but since the group is so small, the impact of its behavior on the overall population state is negligible. Thus, the average payoff in the invading group is at least as high as that in the incumbent population if y′F(x) ≥ x′F(x), or equivalently, if y ∈ I_F(x). Similarly, the average payoff in the invading group exceeds that in the incumbent population if y ∈ I_F°(x).
The interpretation of invasion does not change much when there are multiple populations. If we write (y − x)′F(x) as Σ_p (y^p − x^p)′F^p(x), we see that if y ∈ I_F°(x), there must be some population p for which the small group switching to y^p outperforms the incumbent population playing x^p at social state x.
the vector y − x is a feasible displacement direction from state x. If in addition y ∈ IF (x),
then the direction y − x is not only feasible, but also respects the incentives provided by
the underlying game.
The invasion conditions also have simple geometric interpretations. That y ∈ IF (x)
means that the angle between the displacement vector y − x and the payoﬀ vector F(x) is
weakly acute; if y ∈ IF (x), this angle is strictly acute. Figure 2.3.1 sketches the set IF (x) at
various states x in a two strategy game. Figure 2.3.2 does the same for a threestrategy
game. To draw the latter case, we need the observation that
y ∈ IF (x) ⇔ ( y − x) F(x) > 0
⇔ (Φ( y − x)) F(x) > 0
⇔ ( y − x) ΦF(x) > 0.
In other words, y ∈ IF (x) if and only if the angle between the displacement vector y − x
and the projected payoﬀ vector ΦF(x) is strictly acute. 2.3.4 Global Neutral Stability and Global Evolutionary Stability Before introducing our new solution concepts, we ﬁrst characterize Nash equilibrium
in terms of invasion: a Nash equilibrium is a state that no other state can strictly invade.
Proposition 2.3.13. x ∈ NE(F) if and only if I_F°(x) = ∅.

[Figure 2.3.1: Invasion in a two strategy game.]
[Figure 2.3.2: Invasion in a three strategy game.]

Proof. x ∈ NE(F) ⇔ (y − x)′F(x) ≤ 0 for all y ∈ X ⇔ I_F°(x) = ∅.
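Proposition 2.3.13 suggests a simple computational test for Nash equilibrium: since y ↦ (y − x)′F(x) is linear in y, its maximum over the simplex is attained at a vertex, so it is enough to check that no pure strategy strictly invades. A minimal sketch (our own helper names; standard RPS from Example 2.3.5 with w = l = 1 serves as the test game):

```python
import numpy as np

def can_strictly_invade(y, x, F):
    """True when y is a strict invader of x: (y - x)'F(x) > 0."""
    return (y - x) @ F(x) > 1e-12

def is_nash(x, F, n):
    """x is a Nash equilibrium iff no state strictly invades it; by
    linearity it suffices to check the simplex vertices."""
    return all(not can_strictly_invade(np.eye(n)[i], x, F) for i in range(n))

A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])  # standard RPS
F = lambda x: A @ x

x_star = np.full(3, 1/3)   # the unique symmetric equilibrium
vertex = np.eye(3)[0]      # pure Rock, which Paper strictly invades
```

At x∗ every payoff is zero, so no state strictly invades; at pure Rock, the Paper vertex earns a strictly higher payoff and the test fails.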
With this background at hand, we call x ∈ X a globally neutrally stable state (GNSS) if

    (y − x)′F(y) ≤ 0 for all y ∈ X.

Similarly, we call x a globally evolutionarily stable state (GESS) if

    (y − x)′F(y) < 0 for all y ∈ X − {x}.

We let GNSS(F) and GESS(F) denote the sets of globally neutrally stable states and globally evolutionarily stable states, respectively.
To see the reason for our nomenclature, note that the inequalities used to define GNSS and GESS are the same ones used to define NSS and ESS in symmetric normal form games (Example 2.3.4), but that they are now required to hold not just at those states y that are optimal against x, but at all y ∈ X. NSS and ESS also require a state to be a Nash equilibrium, but our new solution concepts implicitly require this as well—see Proposition 2.3.15 below.
It is easy to describe both of these concepts in terms of the notion of invasion.

Observation 2.3.14. (i) GNSS(F) = ∩_{y∈X} I_F(y), and so is convex.
(ii) x ∈ GESS(F) if and only if x ∈ ∩_{y∈X−{x}} I_F°(y).
In words: a GNSS is a state that can weakly invade every state (or, equivalently, every other state), while a GESS is a state that can strictly invade every other state.
Our new solution concepts can also be described in geometric terms. For example, x is a GESS if a small motion from any state y ≠ x in the direction F(y) (or ΦF(y)) moves the state closer to x (see Figure 2.3.3). If we allow not only these acute motions, but also orthogonal motions, we obtain the weaker notion of GNSS.
We conclude this section by relating our new solution concepts to Nash equilibrium.

Proposition 2.3.15. (i) If x ∈ GNSS(F), then x ∈ NE(F).
(ii) If x ∈ GESS(F), then NE(F) = {x}. Hence, if a GESS exists, it is unique.

Proof. To prove part (i), let x ∈ GNSS(F) and let y ≠ x. Define x_ε = εy + (1 − ε)x. Since x is a GNSS, (x − x_ε)′F(x_ε) ≥ 0 for all ε ∈ (0, 1]. Simplifying and dividing by ε yields (x − y)′F(x_ε) ≥ 0 for all ε ∈ (0, 1], so taking ε to zero yields (y − x)′F(x) ≤ 0. In other words, x ∈ NE(F).
[Figure 2.3.3: The geometric definition of GESS.]
[Figure 2.3.4: Why every GNSS is a Nash equilibrium.]

To prove part (ii), it is enough to show that if x is a GESS, then no y ≠ x is Nash. But if x ∈ GESS(F), then x ∈ I_F°(y); since I_F°(y) is nonempty, Proposition 2.3.13 implies that y ∉ NE(F).
ESS.
The proof that every GNSS is Nash is easy to explain in pictures. In Figure 2.3.4, we
draw the GNSS x and an arbitrary state y, and place the state xε on the segment between
y and x. Since x is a GNSS, the angle between F(xε ) and x − xε , and hence between ΦF(xε )
and x − xε , is weakly acute. Taking ε to zero, it is apparent that the angle between ΦF(x)
and y − x, and hence between y − x and ΦF(x), must be weakly obtuse. Since y was
arbitrary, x is a Nash equilibrium. 2.3.5 Nash Equilibrium and Global Neutral Stability in Stable Games Proposition 2.3.15 tells us that every GNSS of an arbitrary game F is a Nash equilibrium.
Theorem 2.3.16 shows that much more can be said if F is stable: in this case, the sets of globally neutrally stable states and Nash equilibria coincide. Together, this fact and Observation 2.3.14 imply that the Nash equilibria of any stable game form a convex set. In fact, if we can replace certain of the weak inequalities that define stable games with strict ones, then the Nash equilibrium is actually unique.

Theorem 2.3.16. (i) If F is a stable game, then NE(F) = GNSS(F), and so is convex.
(ii) If in addition F is strictly stable at some x ∈ NE(F) (that is, if (y − x)′(F(y) − F(x)) < 0 for all y ≠ x), then NE(F) = GESS(F) = {x}.

Proof. Suppose that F is stable, and let x ∈ NE(F). To establish part (i), it is enough to show that x ∈ GNSS(F) (the reverse inclusion is Proposition 2.3.15(i)). So fix an arbitrary y ≠ x. Since F is stable,

(2.19)    (y − x)′(F(y) − F(x)) ≤ 0.

And since x ∈ NE(F), (y − x)′F(x) ≤ 0. Adding these inequalities yields

(2.20)    (y − x)′F(y) ≤ 0.

As y was arbitrary, x is a GNSS.
Turning to part (ii), suppose that F is strictly stable at x. Then inequality (2.19) holds strictly, so inequality (2.20) holds strictly as well. This means that x is a GESS of F, and hence the unique Nash equilibrium of F.

[Figure 2.3.5: The GESS of good RPS.]

Example 2.3.17. Rock–Paper–Scissors revisited. Recall from Example 2.3.5 that good RPS is
a (strictly) stable game; standard RPS is a zero-sum game, and hence a (weakly) stable game. The unique Nash equilibrium of both of these games is x∗ = (1/3, 1/3, 1/3). In Figure 2.3.5, for a selection of states x, we draw the projected payoff vectors ΦF(x) generated by good RPS (with w = 3 and l = 1), as well as the vector from x to x∗. For each x, the angle between this pair of vectors is acute, reflecting the fact that the Nash equilibrium x∗ is a GESS. In Figure 2.3.6, we perform the same exercise for standard RPS. In this case, the vectors ΦF(x) and x∗ − x always form a right angle, so x∗ is a GNSS but not a GESS. §
that DF(x∗ ) is negative deﬁnite with respect to TX × TX, then x∗ is a GESS, and hence the
unique Nash equilibrium of F.
Exercise 2.3.19. Pseudostable games. We call the population game F pseudostable if for all x, y ∈ X, (y − x)′F(x) ≤ 0 implies that (x − y)′F(y) ≥ 0. In other words, if y cannot strictly invade x, then x can weakly invade y.
(i) Show that every stable game is pseudostable.
(ii) Show that if F is pseudostable, then NE(F) = GNSS(F), and so is convex.
[Figure 2.3.6: The GNSS of standard RPS.]

(A smooth real-valued function f is pseudoconcave if its gradient ∇f is pseudostable. Given facts (i) and (ii) above and the discussion in Section 2.1.5, it should be no surprise that many results from concave programming (e.g., the convexity of the set of maximizers) remain true when the objective function is only pseudoconcave.)
In addition to its role in establishing that the set of Nash equilibria of a stable game is convex, the concept of global neutral stability enables us to carry out an important theoretical exercise: that of devising an elementary proof of the existence of Nash equilibrium in stable games—in other words, one that does not rely on an appeal to a fixed point theorem. The heart of the proof, Proposition 2.3.20, is a finite analogue of the result we seek.

Proposition 2.3.20. Let F be a stable game, and let Y be a finite subset of X. Then there exists a state x∗ ∈ conv(Y) such that (y − x∗)′F(y) ≤ 0 for all y ∈ Y.

In words: if F is a stable game, then given any finite set of states Y, we can always find a state in the convex hull of Y that can weakly invade every element of Y. The proof of this result uses the Minmax Theorem.
Proof. Suppose that Y has m elements. Define a two player zero-sum game U = (U_1, U_2) = (Z, −Z) with n¹ = n² = m as follows:

    Z_xy = (x − y)′F(y).

In this game, player 2 chooses a "status quo" state y ∈ Y, player 1 chooses an "invader" x ∈ Y, and the payoff Z_xy is the invader's "relative payoff" in F. Split Z into its symmetric and skew-symmetric parts:

    Z^S = (1/2)(Z + Z′) and Z^SS = (1/2)(Z − Z′).

Since F is stable, equation (2.19) from the previous proof shows that

    Z^S_xy = (1/2)( (x − y)′F(y) + (y − x)′F(x) ) = (1/2)(x − y)′(F(y) − F(x)) ≥ 0

for all x, y ∈ Y.
The Minmax Theorem tells us that in any zero sum game, player 1 has a strategy that guarantees him the value of the game. In the skew-symmetric game U^SS = (Z^SS, −Z^SS) = (Z^SS, (Z^SS)′), the player roles are interchangeable, so the game's value must be zero. Since Z = Z^SS + Z^S and Z^S ≥ 0, the value of U = (Z, −Z) must be at least zero. In other words, if λ ∈ R^m is a maxmin strategy for player 1, then

    Σ_{x∈Y} Σ_{y∈Y} λ_x Z_xy µ_y ≥ 0

for all mixed strategies µ of player 2. If we let

    x∗ = Σ_{x∈Y} λ_x x ∈ conv(Y)

and fix an arbitrary pure strategy y ∈ Y for player 2, we find that

    0 ≤ Σ_{x∈Y} λ_x Z_xy = Σ_{x∈Y} λ_x (x − y)′F(y) = (x∗ − y)′F(y).
simple compactness argument. Theorem 2.3.16 and Observation 2.3.14 tell us that
NE(F) = GNSS(F) = {x ∈ X : ( y − x) F( y) ≤ 0}.
y∈X Proposition 2.3.20 shows that if we take the intersection above over an arbitrary ﬁnite set
78 Y ⊂ X instead of over X itself, then the intersection is nonempty. Since X is compact, the
ﬁnite intersection property allows us to conclude that GNSS(F) is nonempty itself.
Exercise 2.3.21. In Exercise 2.1.14, we deﬁned population games with entry and exit. If
F : Rn → R is C1 and deﬁnes such a game, what condition on the derivative matrices
+
DF(x) is the appropriate deﬁnition of stable games for this context? Argue that all of the
results in this section continue to hold when entry and exit are permitted. 2.4 Supermodular Games Of the classes of games we study in this chapter, supermodular games, a class that
includes models of coordination, search, and Bertrand competition, are the most familiar to economists. By definition, supermodularity requires that higher choices by one's opponents make one's own higher strategies look relatively more desirable. This complementarity condition imposes a monotone structure on the agents' best response correspondences, which in turn imposes structure on the set of Nash equilibria.

2.4.1 Definition

Each strategy set S^p = {1, . . . , n^p} is naturally endowed with a linear order. To define
supermodular games, we introduce a corresponding partial order on the set of population states X^p (and, implicitly, on the set of mixed strategies for population p). Define the matrix Σ ∈ R^{(n^p−1)×n^p}, whose ith row has ones in columns i + 1 through n^p and zeros elsewhere:

    Σ = ( 0  1  1  · · ·  1 )
        ( 0  0  1  · · ·  1 )
        ( .  .  .   . .   . )
        ( 0  · · · · · 0  1 ).

Then

    (Σx^p)_i = Σ_{j=i+1}^{n^p} x^p_j

equals the total mass on strategies greater than i at population state x^p. If we view x^p as a discrete density function on S^p with total mass m^p, then Σx^p defines the corresponding "decumulative distribution function" for x^p. In particular, Σy^p ≥ Σx^p if and only if y^p stochastically dominates x^p.
We extend this partial order to all of X using the matrix Σ ∈ R^{(n−p)×n}, which we define
as the block diagonal matrix Σ = diag(Σ, . . . , Σ). Note that Σy ≥ Σx if and only if y^p stochastically dominates x^p for all p ∈ P.
Deﬁnition. We call the population game F : X → Rn a supermodular game if it exhibits
strategic complementarities:
(2.21) p p p p If Σ y ≥ Σx, then Fi+1 ( y) − Fi ( y) ≥ Fi+1 (x) − Fi (x) for all i < np , p ∈ P , x ∈ X. In words: if y stochastically dominates x, then for any strategy i < np , the payoﬀ
advantage of i + 1 over i is greater at y than at x.
By introducing a bit more notation, we can express condition (2.21) in a more concise
p
p
˜
˜
way. Deﬁne the matrices Σ ∈ Rn ×(n −1) and Σ ∈ Rn×(n−p ) by −1 0 1 −1 ˜ Σ=0
1 . .
.
.
.. ... 0 ...
˜
˜
˜ 0 and Σ = diag(Σ, . . . , Σ). ... −1 0
1 ···
... 0
.
.
. Observation 2.4.1. F is a supermodular game if and only if the following condition holds:
(2.22) ˜
˜
Σ y ≥ Σx implies that Σ F( y) ≥ Σ F(x). As with potential games and stable games, we can characterize smooth supermodular
games in terms of conditions on the derivatives DF(x).
Theorem 2.4.2. Suppose the population game F is C1 . Then F is supermodular if and only if either
of the following equivalent conditions holds.
p (2.23)
(2.24) p q q ∂(Fi+1 − Fi )
∂(e j+1 − e j ) (x) ≥ 0 for all i < np , j < nq , p, q ∈ P , and x ∈ X. ˜
˜
Σ DF(x)Σ ≥ 0 for all x ∈ X. Condition (2.23) is the most transparent of the four conditions. It requires that if
some players in population q switch from strategy j to strategy j + 1, the performance of
strategy i + 1 ∈ Sp improves relative to that of strategy i. On the other hand, condition
80 (2.24) provides the most concise characterization of supermodular games. Moreover,
˜
˜
since the range of Σ is TX (i.e., since each column of Σ lies in TX), condition (2.24) is a
restriction of the action of DF(x) on TX × TX—just like our earlier conditions (2.10) and
(2.15) characterizing potential games and stable games.
(2.15) characterizing potential games and stable games.

Proof. The equivalence of conditions (2.23) and (2.24) is easily verified. Given Observation 2.4.1, it is enough to show that (2.21) implies (2.23) and that (2.24) implies (2.22).
So suppose condition (2.21) holds, and fix x ∈ X; since F is C¹, it is enough to consider x in the interior of X. Let y_ε = x + ε(e^q_{j+1} − e^q_j), which lies in X whenever ε is sufficiently small, and which satisfies Σy_ε ≥ Σx. By the definition of DF(x), we have that

    F^p_{i+1}(y_ε) − F^p_i(y_ε) = F^p_{i+1}(x) − F^p_i(x) + ε ∂(F^p_{i+1} − F^p_i)/∂(e^q_{j+1} − e^q_j) (x) + o(|y_ε − x|).

Thus, condition (2.21) implies that

    ε ∂(F^p_{i+1} − F^p_i)/∂(e^q_{j+1} − e^q_j) (x) + o(ε) ≥ 0,

which implies (2.23).
We now show that (2.24) implies (2.22). We consider only the single population case, leaving the general case as an exercise. The idea behind the proof is simple. If state y stochastically dominates state x, then we can transit from state x to state y by shifting mass from strategy 1 to strategy 2, from strategy 2 to strategy 3, . . . , and finally from strategy n − 1 to strategy n. Condition (2.23) ≡ (2.24) says that each such shift improves the payoff of each strategy k + 1 relative to that of strategy k. Since transiting from x to y means executing all of the shifts, this transition too must improve the performance of k + 1 relative to k, which is exactly what condition (2.21) ≡ (2.22) requires.
Our matrix notation makes it possible to formalize this argument in a streamlined way. Recall the definitions of the difference operator Σ̃ ∈ R^{n×(n−1)}, whose ith column is e_{i+1} − e_i, and of the decumulative operator Σ ∈ R^{(n−1)×n}, and define Ω ∈ R^{n×n} to be the matrix whose first row consists entirely of ones and whose remaining entries are all zero. Then it is easy to verify this next observation.

Observation 2.4.3. Σ̃Σ = I − Ω ∈ R^{n×n}.

In words, Observation 2.4.3 says that the stochastic dominance operator Σ is "inverted" by the difference operator Σ̃, except for a remainder Ω that is a null operator on TX (i.e., that satisfies Ωz = 0 for all z ∈ TX). (For completeness, we also note that ΣΣ̃ = I ∈ R^{(n−1)×(n−1)}.)
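Observation 2.4.3 and its companion identity are easy to confirm numerically. The sketch below (our own; variable names are assumptions) builds the three matrices for n = 5:

```python
import numpy as np

n = 5
Sigma = np.triu(np.ones((n - 1, n)), k=1)    # decumulative operator
SigmaTilde = np.zeros((n, n - 1))            # difference operator: column i is e_{i+1} - e_i
for i in range(n - 1):
    SigmaTilde[i, i], SigmaTilde[i + 1, i] = -1.0, 1.0
Omega = np.zeros((n, n)); Omega[0, :] = 1.0  # ones in the first row, zeros elsewhere

check1 = SigmaTilde @ Sigma                  # should equal I - Omega (Observation 2.4.3)
check2 = Sigma @ SigmaTilde                  # should equal the (n-1) x (n-1) identity

# Omega annihilates tangent vectors, whose components sum to zero.
z = np.array([2.0, -1.0, 0.5, -1.5, 0.0])
```

Since the first row of Ω sums the components of its argument, Ωz = 0 exactly when 1′z = 0, i.e. on TX.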
Now suppose that Σx ≤ Σy, and let α(t) = ty + (1 − t)x, so that α(0) = x, α(1) = y, and α′(t) = y − x ∈ TX. Then using the Fundamental Theorem of Calculus, Observation 2.4.3, condition (2.24), and the fact that Σ(y − x) ≥ 0, we find that

    Σ̃′(F(y) − F(x)) = ∫_0^1 Σ̃′DF(α(t)) (y − x) dt
                     = ∫_0^1 Σ̃′DF(α(t)) (Σ̃Σ + Ω) (y − x) dt
                     = ∫_0^1 Σ̃′DF(α(t)) Σ̃ Σ(y − x) dt ≥ 0.

2.4.2 Examples

Exercise 2.4.4. Random matching in supermodular normal form games. The normal form game
U = (U_1, . . . , U_p) is supermodular if the difference U_p(s^p + 1, s^q, s^{−{p,q}}) − U_p(s^p, s^q, s^{−{p,q}}) is nondecreasing in s^q for all s^p < n^p, s^{−{p,q}} ∈ ∏_{r∉{p,q}} S^r, and distinct p, q ∈ P. Show that random matching of p populations to play U generates a supermodular game.
games?
Example 2.4.6. Bertrand oligopoly with diﬀerentiated products. A population of ﬁrms produce
output at zero marginal cost and compete in prices S = {1, . . . , n}. Suppose that the demand
faced by a ﬁrm increases when competitors raise their prices, and that this eﬀect does not
diminish when the ﬁrm itself charges higher prices. More precisely, let qi (x), the demand
faced by a ﬁrm that charges price i when the price distribution is x, satisfy
∂(qk+1 − qk )
∂qi
(x) ≥ 0 and
(x) ≥ 0 for all i ≤ n and all j, k < n.
∂(e j+1 − e j )
∂(e j+1 − e j )
The payoﬀ to a ﬁrm that charges price i is Fi (x) = i qi (x), and so
∂qi+1
∂qi
∂(Fi+1 − Fi )
(x) = (i + 1)
(x) − i
(x)
∂(e j+1 − e j )
∂(e j+1 − e j )
∂(e j+1 − e j )
82 =i ∂(qi+1 − qi )
∂qi+1
(x) +
(x) ≥ 0.
∂(e j+1 − e j )
∂(e j+1 − e j ) Therefore, F is a supermodular game. §
Example 2.4.7. Search with positive externalities. A population of agents choose levels of
search eﬀort in S = {1, . . . , n}. The payoﬀ to choosing eﬀort i is
Fi (x) = m(i) b(a(x)) − c(i),
where a(x) = k≤n kxk is the aggregate search eﬀort, b is some increasing beneﬁt function,
m is an increasing multiplier function, and c is an arbitrary cost function. Notice that
the beneﬁts from searching are increasing in both own search eﬀort and in the aggregate
search eﬀort. Since
∂(Fi+1 − Fi )
(x) = m(i + 1) b (a(x)) ( j + 1) − j − m(i) b (a(x)) ( j + 1) − j
∂(e j+1 − e j )
= (m(i + 1) − m(i)) b (a(x)) ≥ 0,
F is a supermodular game. §
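Condition (2.23) can be verified numerically for a concrete instance of this search game. In the sketch below, the choices m(i) = i, b(a) = log(1 + a), and c(i) = 0.1 i² are our own illustrative assumptions, and the derivative in the direction e_{j+1} − e_j is approximated by a finite difference:

```python
import numpy as np

n = 4
m = lambda i: float(i)          # increasing multiplier (assumed)
b = lambda a: np.log1p(a)       # increasing, concave benefit (assumed)
c = lambda i: 0.1 * i ** 2      # arbitrary cost (assumed)

def F(x):
    a = sum((k + 1) * x[k] for k in range(n))   # a(x) = sum_k k x_k, strategies 1..n
    return np.array([m(i + 1) * b(a) - c(i + 1) for i in range(n)])

def supermodularity_violation(x, eps=1e-6):
    """Most negative finite-difference value of condition (2.23):
    d(F_{i+1} - F_i)/d(e_{j+1} - e_j) should be >= 0 for all i, j."""
    worst = 0.0
    for j in range(n - 1):
        d = np.zeros(n); d[j] = -eps; d[j + 1] = eps
        diff = (F(x + d) - F(x)) / eps
        for i in range(n - 1):
            worst = min(worst, diff[i + 1] - diff[i])
    return worst

x = np.array([0.4, 0.3, 0.2, 0.1])
```

With these choices the directional derivative of F_{i+1} − F_i equals (m(i+2) − m(i+1)) b′(a(x)) = b′(a(x)) > 0, so no violation appears.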
Example 2.4.8. Relative consumption effects/Arms races. Agents from a single population choose consumption levels (or armament levels) in S = {1, . . . , n}. Payoffs take the form

    F_i(x) = r(i − a(x)) + u(i) − c(i).

Here, r is a concave function of the difference between the agent's consumption level and the average consumption level in the population, while u and c are arbitrary functions of the consumption level. (One would typically assume that r is increasing, but this property is not needed for supermodularity.) Since

    ∂(F_{i+1} − F_i)/∂(e_{j+1} − e_j) (x) = r′((i + 1) − a(x)) (−(j + 1) + j) − r′(i − a(x)) (−(j + 1) + j)
                                          = r′(i − a(x)) − r′((i + 1) − a(x)) ≥ 0,
Exercise 2.4.9. Characterize the C1 twostrategy supermodular games using a derivative
condition. Compare them with the C1 twostrategy stable games (Exercise 2.3.2(i)). Are
all C1 twostrategy games in one class or the other?
83 2.4.3 Best Response Monotonicity in Supermodular Games Recall the deﬁnition of the pure best response correspondence for population p:
p bp (x) = argmax Fi (x).
i∈Sp Theorem 2.4.10 establishes a fundamental property of supermodular games: their pure
best response correspondences are increasing.
Theorem 2.4.10. Let F be a supermodular game with pure best response correspondences bp . If
Σx ≤ Σ y, then min bp (x) ≤ min bp ( y) and max bp (x) ≤ max bp ( y) for all p ∈ P .
This property is intuitively obvious: when opponents choose higher strategies, an
agent’s own higher strategies look relatively better, so his best strategies must be (weakly)
higher as well.
Proof. We consider the case in which p = 1, focusing on the ﬁrst inequality; we leave
the remaining cases as exercises.
Let Σx ≤ Σ y and i < j. Then condition (2.21) implies that
j −1 F j ( y) − Fi ( y) − F j (x) − Fi (x) = Fk+1 ( y) − Fk ( y) − Fk+1 (x) − Fk (x) ≥ 0. k =i Thus, if j = min b(x) > i, then F j ( y) − Fi ( y) ≥ F j (x) − Fi (x) > 0, so i is not a best response to
y. As i < min b(x) was arbitrary, we conclude that min b(x) ≤ min b( y).
To state a version of Theorem 2.4.10 for mixed best responses, we need some additional
p
p
p
notation. Let vi ∈ Rn denote the ith vertex of the simplex ∆p : that is, (vi ) j equals 1 if j = i
p
p
p
p
and equals 0 otherwise. (To summarize our notation to date: xi ∈ R, vi ∈ Rn , and ei ∈ Rn .
p
Of course, the notation vi is unnecessary in the single population case.) We can describe
population p’s mixed best response correspondence in the following equivalent ways:
p Bp (x) = xp ∈ ∆p : xi > 0 ⇒ i ∈ bp (x)
p = conv vi : i ∈ bp (x) ,
We can also deﬁne the minimal and maximal elements of Bp (x) as follows:
p p Bp (x) = vmin bp (x) and Bp (x) = vmax bp (x) . 84 To extend this notation to the multipopulation environment, deﬁne
B(x) = (B1 (x), . . . , Bp (x)) and B(x) = (B1 (x), . . . , Bp (x)).
Then the following corollary follows immediately from Theorem 2.4.10.
Corollary 2.4.11. If F is supermodular and Σx ≤ Σ y, then ΣB(x) ≤ ΣB( y) and ΣB(x) ≤ ΣB( y). 2.4.4 Nash Equilibria of Supermodular Games We now use the monotonicity of the best response correspondence to show that every
supermodular game has a minimal and a maximal Nash equilibrium. The derivation
of this result includes a ﬁnite iterative method for computing the minimal and maximal
equilibria, and so provides a simple proof of the existence of equilibrium. We focus
attention on the case where each population has mass one, so that each set of population
p
states Xp is just the simplex in Rn ; the extension to the general case is a simple but
notationally cumbersome exercise.
p
p
Let x and x be the minimal and maximal states in X : xp = v1 and xp = vnp for all p ∈ P .
Recall that Xv denotes the set of vertices of X, and let n∗ = #Xv = p∈P np . Finally, for
states y, z ∈ X, deﬁne the interval [ y, z] ⊆ X by [ y, z] = {x ∈ X : Σ y ≤ Σx ≤ Σz}.
Theorem 2.4.12. Suppose F is a supermodular game. Then
(i) The sequences {Bk (x)}k≥0 and {Bk (x)}k≥0 are monotone sequences in Xv , and so converge
within n∗ steps to their limits, x∗ and x∗ .
(ii) x∗ = B(x∗ ) and x∗ = B(x∗ ), so x∗ and x∗ are pure Nash equilibria of F.
(iii) NE(F) ⊆ [x∗ , x∗ ]. Thus, if x∗ = x∗ , then this state is the Nash equilibrium of F.
In short, iterating B and B from the minimal and maximal states in X yields Nash equilibria
of F, and all other Nash equilibria of F lie between the two so obtained.
Proof. Part (i) follows immediately from Corollary 2.4.11. To prove part (ii), note that
∗
∗
∗
since x∗ = Bn (x) and Bn +1 (x) = Bn (x) by part (i), it follows that
∗
∗
∗
B(x∗ ) = B(Bn (x)) = Bn +1 (x) = Bn (x) = x∗ . An analogous argument shows that B(x∗ ) = x∗ .
We ﬁnish with the proof of part (iii). If Y ⊆ X and min Y and max Y exist, then the
monotonicity of B implies that B(Y) ⊆ [B(min Y), B(max Y)]. Iteratively applying B to the
∗
∗
∗
set X therefore yields Bn (X) ⊆ [Bn (x), Bn (x)] = [x∗ , x∗ ]. Also, if x ∈ NE(F), then x ∈ B(x),
85 and so Bk−1 (x) ⊆ Bk−1 (B(x)) = Bk (x), implying that x ∈ Bk (x) for all k ≥ 1. We therefore
∗
∗
conclude that x ∈ Bn (x) ⊆ Bn (X) ⊆ [x∗ , x∗ ]. Appendix
2.A Multivariate Calculus 2.A.1 Univariate Calculus Before discussing multivariate calculus we review some ideas from univariate calculus.
A function f from the real line to itself is differentiable at the point x if

f'(x) = lim_{y→x} (f(y) − f(x)) / (y − x)

exists; this limit is called the derivative of f at x. Three useful facts about derivatives are

The Product Rule:  (fg)'(x) = f(x)g'(x) + g(x)f'(x);
The Chain Rule:  (g ∘ f)'(x) = g'(f(x)) f'(x);
The Fundamental Theorem of Calculus:  f(y) − f(x) = ∫_x^y f'(z) dz.

The definition of f'(x) above is equivalent to the requirement that

(2.25)  f(y) = f(x) + f'(x)(y − x) + o(y − x),

where o(z) represents a remainder function r: R → R satisfying

lim_{z→0} r(z)/z = 0.

(In words: r(z) approaches zero faster than z approaches zero.) In the approximation (2.25), f'(x) acts as a linear map from R to itself; it sends the displacement of the input, y − x, to the displacement of the output, f'(x)(y − x).

2.A.2 The Derivative as a Linear Map

Let L(R^n, R^m) denote the space of linear maps from R^n to R^m:
L(R^n, R^m) = {λ : R^n → R^m | λ(az + bẑ) = aλ(z) + bλ(ẑ) for all a, b ∈ R and z, ẑ ∈ R^n}.

Each matrix A ∈ R^{m×n} defines a linear map in L(R^n, R^m) via λ(z) = Az, and such a matrix can be found for every map λ in L(R^n, R^m) (see Appendix 2.B.1). It is common to identify a linear map with its matrix representation. But it is important to be aware of the distinction between these two objects: if we replace the domain R^n with a proper subspace of R^n, matrix representations of linear maps are no longer unique; see Appendix 2.B.
Let F be a function from R^n to R^m. (Actually, we can replace the domain R^n with any open set in R^n, or even with a closed set in R^n, as discussed in Appendix 2.A.7.) We say that F is differentiable at x if there is a linear map DF(x) ∈ L(R^n, R^m) satisfying

(2.26)  F(y) = F(x) + DF(x)(y − x) + o(y − x).

Here, o(z) represents a remainder function r: R^n → R^m that satisfies

lim_{z→0} r(z)/|z| = 0.

If the function DF : R^n → L(R^n, R^m) is continuous, we say that F is continuously differentiable
or of class C1 .
When we view DF(x) as a matrix in R^{m×n}, we call it the Jacobian matrix or derivative matrix of F at x. To express this matrix explicitly, define the partial derivatives of F at x by

∂F_i/∂x_j (x) = lim_{y_j → x_j} (F_i(y_j, x_{−j}) − F_i(x)) / (y_j − x_j).

Then the derivative matrix DF(x) can be expressed as

DF(x) = [ ∂F_1/∂x_1 (x)  ···  ∂F_1/∂x_n (x) ]
        [       ⋮                  ⋮        ]
        [ ∂F_m/∂x_1 (x)  ···  ∂F_m/∂x_n (x) ]

If f is a function from R^n to R (i.e., if m = 1), then its derivative at x can be represented by a vector. We call this vector the gradient of f at x, and define it by

∇f(x) = (∂f/∂x_1 (x), …, ∂f/∂x_n (x))'.

Our notations for derivatives are related by Df(x) = ∇f(x)', where the prime represents transposition, and also by

DF(x) = [ ∇F_1(x)' ]
        [    ⋮     ]
        [ ∇F_m(x)' ]
Suppose we are interested in how quickly the value of f changes as we move from the point x ∈ R^n in the direction z ∈ R^n − {0}. This rate is described by the directional derivative of f at x in direction z, defined by

(2.27)  ∂f/∂z (x) = lim_{ε→0} (f(x + εz) − f(x)) / ε.

It is easy to verify that ∂f/∂z (x) = ∇f(x)'z. More generally, the rate of change of the vector-valued function F at x in direction z can be expressed as DF(x)z.
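The identity ∂f/∂z(x) = ∇f(x)'z is easy to check numerically; the function, point, and direction below are hypothetical choices for illustration.

```python
def f(x):
    # a sample smooth function of two variables
    return x[0] ** 2 * x[1] + x[1] ** 3

def grad_f(x):
    # its gradient, computed by hand
    return [2.0 * x[0] * x[1], x[0] ** 2 + 3.0 * x[1] ** 2]

def directional_derivative(f, x, z, eps=1e-6):
    # definition (2.27), approximated with a small positive epsilon
    xp = [xi + eps * zi for xi, zi in zip(x, z)]
    return (f(xp) - f(x)) / eps

x, z = [1.0, 2.0], [1.0, 1.0]
num = directional_derivative(f, x, z)
exact = sum(g * zi for g, zi in zip(grad_f(x), z))   # grad f(x) . z
print(num, exact)   # both approximately 17
```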
It is worth noting that a function can admit directional derivatives at x in every direction z ≠ 0 without being differentiable at x (i.e., without satisfying definition (2.26)). Amazingly, such a function need not even be continuous at x, as the following example shows.
Example 2.A.1. Define the function f: R² → R by

f(x_1, x_2) = x_1(x_2)² / ((x_1)² + (x_2)⁴)  if x_1 ≠ 0,    f(x_1, x_2) = 0  if x_1 = 0.

Using definition (2.27), it is easy to verify that the directional derivatives of f at the origin in every direction z ≠ 0 exist:

∂f/∂z (0) = (z_2)²/z_1  if z_1 ≠ 0,    ∂f/∂z (0) = 0  if z_1 = 0.

But while f(0) = 0, f(x) = 1/2 at all other x that satisfy x_1 = (x_2)², and so f is discontinuous at 0. §
On the other hand, if all (or even all but one) of the partial derivatives ∂f/∂x_i exist and are continuous in a neighborhood of x, then f is differentiable at x.

2.A.3 Differentiation as a Linear Operation

We can view differentiation as an operation that takes functions as inputs and returns
functions as outputs. From this point of view, differentiation is a linear operation between spaces of functions. As an example, suppose that f and g are functions from R to itself, and that a and b are real numbers. Then the scalar product af is a function from R to itself, as is the linear combination af + bg. (In other words, the set of functions from R to itself is a vector space.) The fact that differentiation is linear means that the derivative of the linear combination, (af + bg)', is equal to the linear combination of the derivatives, af' + bg'.
We can express this idea in a multivariate setting using a simple formula. Suppose that F: R^n → R^m is a differentiable function and that A is a matrix in R^{l×m}. Then AF is the function from R^n to R^l defined by (AF)_k(x) = Σ_{j=1}^m A_{kj} F_j(x) for k ∈ {1, …, l}. Linearity of differentiation says that D(AF) = A(DF), or, more explicitly, that

Linearity of differentiation:  D(AF)(x) = A(DF)(x) for all x ∈ R^n.

Put differently, the differential operator D and the linear map A commute.

2.A.4 The Product Rule and the Chain Rule

Suppose f and g are differentiable functions from R to itself. Then the product rule
tells us that (fg)'(x) = f(x)g'(x) + g(x)f'(x). In other words, to find the effect of changing x on the value (fg)(x) of the product function, first find the effect of changing x on g(x), and scale this effect by f(x); then, find the effect of changing x on f(x), and scale this effect by g(x); and finally, add the two terms.
This same idea can be applied in multidimensional cases as well. Let F: R^n → R^m and G: R^n → R^m be differentiable vector-valued functions. Then F'G: R^n → R, defined by (F'G)(x) = F(x)'G(x), is a scalar-valued function. The derivative D(F'G)(x) ∈ R^{1×n} of our new function is described by the following product rule:

Product Rule 1:  D(F'G)(x) = (∇(F'G)(x))' = F(x)'DG(x) + G(x)'DF(x).

(Notice that in the previous paragraph, a prime (') denoted the derivative of a scalar-valued function, while here it denotes matrix transposition. So long as we keep these scalar and matrix usages separate, no confusion should arise.)
If a: R^n → R is a differentiable scalar-valued function, then aF: R^n → R^m, defined by (aF)(x) = a(x)F(x), is a vector-valued function. Its derivative D(aF)(x) ∈ R^{m×n} is described by our next product rule:

Product Rule 2:  D(aF)(x) = a(x)DF(x) + F(x)∇a(x)' = a(x)DF(x) + F(x)Da(x).

Finally, we can create a vector-valued function from F: R^n → R^m and G: R^n → R^m
by introducing the componentwise product F • G : Rn → Rm . This function is deﬁned by
(F•G)i (x) = Fi (x)Gi (x), or, in matrix notation, by (F•G)(x) = diag(F(x))G(x) = diag(G(x))F(x),
where diag(v) denotes the diagonal matrix whose diagonal entries are the components
of the vector v. The derivative of the componentwise product, D(F • G)(x) ∈ Rm×n , is
described by our last product rule:
Product Rule 3:  D(F • G)(x) = diag(F(x))DG(x) + diag(G(x))DF(x).

One can verify each of the formulas above by expanding them and then applying the
univariate product rule term by term. To remember the product rules, bear in mind that
the end result must be a sum of two terms of the same dimensions, and that each of the
terms must end with a derivative, so as to operate on a displacement vector z ∈ Rn to be
placed on the right hand side.
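Product Rule 1 can be checked numerically as well: compare the formula F(x)'DG(x) + G(x)'DF(x) with a finite-difference gradient of the scalar function F(x)'G(x). The maps F and G below, and their hand-coded Jacobians, are hypothetical examples.

```python
def F(x): return [x[0] ** 2, x[0] * x[1]]
def G(x): return [x[1], x[0] + x[1]]

def DF(x): return [[2.0 * x[0], 0.0], [x[1], x[0]]]   # Jacobian of F, by hand
def DG(x): return [[0.0, 1.0], [1.0, 1.0]]            # Jacobian of G, by hand

def vec_mat(v, M):
    # row vector v times matrix M
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

x = [1.0, 2.0]
# Product Rule 1: D(F'G)(x) = F(x)'DG(x) + G(x)'DF(x)
rule = [a + b for a, b in zip(vec_mat(F(x), DG(x)), vec_mat(G(x), DF(x)))]

# finite-difference gradient of h(y) = F(y).G(y) for comparison
h = lambda y: sum(a * b for a, b in zip(F(y), G(y)))
eps = 1e-6
fd = [(h([x[0] + eps, x[1]]) - h(x)) / eps,
      (h([x[0], x[1] + eps]) - h(x)) / eps]
print(rule, fd)   # both approximately [12, 6]
```

Note that, as the mnemonic above suggests, each term of `rule` ends with a derivative matrix, so the sum acts on displacement vectors placed on the right.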
In the one-dimensional setting, the chain rule tells us that (g ∘ f)'(x) = g'(f(x)) f'(x). In
words, the formula says that we can decompose the eﬀect of changing x on ( g ◦ f )(x) into
two pieces: the eﬀect of changing x on the value of f (x), and the eﬀect of this change in
f (x) on the value of g( f (x)).
This same idea carries through to multivariate functions. Let F : Rn → Rm and
G : Rm → Rl be diﬀerentiable, and let G ◦ F : Rn → Rl be their composition. The chain rule
says that the derivative of this composition at x ∈ Rn , D(G ◦ F)(x) ∈ Rl×n , is obtained as the
product of the derivative matrices DG(F(x)) ∈ Rl×m and DF(x) ∈ Rm×n .
The Chain Rule:  D(G ∘ F)(x) = DG(F(x)) DF(x).

This equation can be stated more explicitly as

∂(G ∘ F)_k/∂x_i (x) = Σ_{j=1}^m ∂G_k/∂y_j (F(x)) · ∂F_j/∂x_i (x).

The chain rule can be viewed as a generalization of the earlier formula on linearity of differentiation, with the linear map A replaced by the nonlinear function G.

2.A.5 Homogeneity and Euler's Theorem

Let f be a differentiable function from R^n to R. (We can replace the domain R^n with an
open (or even a closed) convex cone: a convex set which, if it contains x ∈ Rn , also contains
tx for all t > 0.) We say that f is homogeneous of degree k if
(2.28)  f(tx) = t^k f(x) for all x ∈ R^n and t > 0.
By deﬁnition, homogeneous functions are monomials along each ray from the origin.
Indeed, when n = 1 the homogeneous functions are precisely the monomials: if x ∈
R, g(tx) = t^k g(x), and g(1) = a, then g(x) = ax^k. But when n > 1 more complicated
homogeneous functions can be found.
Nevertheless, the basic properties of homogeneous functions are generalizations of
properties of monomials. If we take the derivative of each side of equation (2.28) with respect to x_i, applying the chain rule on the left hand side, we obtain

∇f(tx)'(t e_i) = t^k ∂f/∂x_i (x).

Dividing both sides of this equation by t and simplifying yields

∂f/∂x_i (tx) = t^{k−1} ∂f/∂x_i (x).

In other words, the partial derivatives of a homogeneous function of degree k are themselves homogeneous of degree k − 1.
If we instead take the derivative of each side of (2.28) with respect to t, again using the chain rule on the left hand side, we obtain

∇f(tx)'x = 0 if k = 0,    ∇f(tx)'x = k t^{k−1} f(x) otherwise.

Setting t = 1 yields Euler's Theorem: if f is homogeneous of degree k, then

∇f(x)'x = k f(x) for all x ∈ R^n.
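Both homogeneity and Euler's identity are easy to check numerically for a sample function (the cubic below is an assumed example, homogeneous of degree 3):

```python
# f(x1, x2) = (x1)^2 x2 + (x2)^3 is homogeneous of degree k = 3.
def f(x): return x[0] ** 2 * x[1] + x[1] ** 3
def grad_f(x): return [2.0 * x[0] * x[1], x[0] ** 2 + 3.0 * x[1] ** 2]

x, t, k = [1.5, -0.5], 2.0, 3

# homogeneity (2.28): f(tx) = t^k f(x)
print(f([t * xi for xi in x]), t ** k * f(x))

# Euler's Theorem: grad f(x) . x = k f(x)
euler = sum(g * xi for g, xi in zip(grad_f(x), x))
print(euler, k * f(x))
```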
In fact, the converse of Euler's Theorem is also true: one can show that if f satisfies the previous identity, it is homogeneous of degree k.

2.A.6 Higher Order Derivatives

As we have seen, the derivative of a function F: R^n → R^m is a new function

(2.29)  DF: R^n → L(R^n, R^m).

For each x ∈ R^n, DF(x) describes how the value of F in R^m changes as we move away from
x in any direction z ∈ Rn . Notice that in expression (2.29), the point x around which we
evaluate the function F inhabits the ﬁrst Rn , while the displacement vector z inhabits the
second Rn .
The second derivative of F at x, D²F(x) = D(DF(x)), describes how the value of the first derivative DF(x) ∈ L(R^n, R^m) changes as we move away from x in direction z ∈ R^n. Thus, D²F(x) is an element of the set of maps L(R^n, L(R^n, R^m)), which we denote by L²(R^n, R^m). Elements of L²(R^n, R^m) are called bilinear maps from R^n × R^n to R^m: they take two vectors in R^n as inputs, are linear in each of these vectors, and return elements of R^m as outputs.
If F is twice continuously differentiable (i.e., if DF and D²F are both continuous in x), then it can be shown that D²F(x) is symmetric, in the sense that D²F(x)(z, ẑ) = D²F(x)(ẑ, z) for all z, ẑ ∈ R^n. We therefore say that D²F(x) is an element of L²_s(R^n, R^m), the set of symmetric bilinear maps from R^n × R^n to R^m.
More generally, the kth derivative of F is a map D^k F: R^n → L^k_s(R^n, R^m). For each x ∈ R^n, D^k F(x) is a symmetric multilinear map; it takes k displacement vectors in R^n as inputs, is linear in each, and returns an output in R^m; this output does not depend on the order of the inputs. If F has continuous derivatives of orders zero through K, we say that it is in class C^K.
We can use higher order derivatives to write the Kth order version of Taylor’s Formula,
which provides a polynomial approximation of a CK function F around the point x.
Taylor's Formula:  F(y) = F(x) + Σ_{k=1}^K (1/k!) D^k F(x)(y − x, …, y − x) + o(|y − x|^K).

Here, D^k F(x)(y − x, …, y − x) ∈ R^m is the output generated when the multilinear map D^k F(x) ∈ L^k_s(R^n, R^m) acts on k copies of the displacement vector (y − x) ∈ R^n. (To see where the factorial terms come from, try expressing the coefficients of a Kth order polynomial in terms of the polynomial's derivatives.)
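For a scalar function with hand-computed gradient and Hessian (both assumed for this example), one can watch the second-order Taylor remainder vanish faster than |y − x|²:

```python
def f(x): return x[0] ** 3 + x[0] * x[1] ** 2
def grad(x): return [3.0 * x[0] ** 2 + x[1] ** 2, 2.0 * x[0] * x[1]]
def hess(x): return [[6.0 * x[0], 2.0 * x[1]], [2.0 * x[1], 2.0 * x[0]]]

def taylor2(x, d):
    # f(x) + grad f(x).d + (1/2) d' Hess f(x) d
    lin = sum(g * di for g, di in zip(grad(x), d))
    quad = sum(d[i] * hess(x)[i][j] * d[j] for i in range(2) for j in range(2))
    return f(x) + lin + 0.5 * quad

x = [1.0, 1.0]
for h in [1e-1, 1e-2, 1e-3]:
    err = abs(f([x[0] + h, x[1] + h]) - taylor2(x, [h, h]))
    print(h, err, err / h ** 2)   # err / h^2 shrinks: the remainder is o(|y-x|^2)
```

Because this f is a cubic polynomial, the remainder here is exactly the third-order Taylor term.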
The higher order derivative that occurs most frequently in applications is the second derivative of a scalar-valued function f: R^n → R. This second derivative, D²f, sends each x ∈ R^n to a symmetric bilinear map D²f(x) ∈ L²_s(R^n, R). We can represent this map using a Hessian matrix ∇²f(x) ∈ R^{n×n}, the elements of which are the second order partial derivatives of f:

∇²f(x) = [ ∂²f/(∂x_1)² (x)    ···  ∂²f/∂x_1∂x_n (x) ]
         [        ⋮                      ⋮          ]
         [ ∂²f/∂x_n∂x_1 (x)  ···  ∂²f/(∂x_n)² (x)   ]

When f is C², the symmetry of the map D²f(x) is reflected in the fact that the Hessian matrix is symmetric: corresponding pairs of mixed partial derivatives are equal.
The value D²f(x)(z, ẑ) is expressed in terms of the Hessian matrix in this way:

D²f(x)(z, ẑ) = z' ∇²f(x) ẑ.

Using the gradient vector and Hessian matrix, we can express the second-order Taylor approximation of a C² scalar-valued function as follows:

f(y) = f(x) + ∇f(x)'(y − x) + ½ (y − x)' ∇²f(x)(y − x) + o(|y − x|²).

2.A.7 The Whitney Extension Theorem

While we have defined our K times continuously differentiable functions to have
domain Rn , nothing we have discussed so far would change were our functions only
deﬁned on open subsets of Rn . In fact, it is also possible to deﬁne CK functions on closed
sets X ⊂ Rn . To do so, one requires F : X → Rm to be CK in the original sense on int(X),
and to admit “local uniform Taylor expansions” at each x on bd(X). The Whitney Extension
Theorem tells us that such functions F can always be extended to CK functions deﬁned
on all of R^n. In effect, the Whitney Extension Theorem provides a definition of (K times) continuous differentiability for functions defined on closed sets.

2.A.8 Vector Integration and the Fundamental Theorem of Calculus

Let α: R → R^n be a vector-valued function defined on the real line. Integrals of α are computed componentwise: in other words,

(2.30)  (∫_a^b α(t) dt)_i = ∫_a^b α_i(t) dt.

It is easy to verify that integration, like differentiation, is linear: if A ∈ R^{m×n} then

∫_a^b A α(t) dt = A ∫_a^b α(t) dt.

With definition (2.30) in hand, we can state a multivariate version of the Fundamental Theorem of Calculus. Suppose that F: R^n → R^m is a C¹ function. Let α: [0, 1] → R^n be a C¹ function satisfying α(0) = x and α(1) = y, and call its derivative α': R → R^n. Then we have

The Fundamental Theorem of Calculus:  F(y) − F(x) = ∫_0^1 DF(α(t)) α'(t) dt.

2.A.9 Potential Functions and Integrability

When can a continuous vector field F: R^n → R^n be expressed as the gradient of some scalar-valued function f? In other words, when does F = ∇f for some potential function f: R^n → R? One can characterize the vector fields that admit potential functions in terms
of their integrals over closed curves: if F : Rn → Rn is continuous, it admits a potential
function if and only if
(2.31)  ∫_0^1 F(α(t))' (d/dt)α(t) dt = 0

for every piecewise C¹ function α: [0, 1] → R^n with α(0) = α(1). If we use C to denote the closed curve through R^n traced by α, then (2.31) can be expressed more concisely as

∮_C F(x) · dx = 0.

When F is not only continuous, but also C¹, the question of the integrability of F can be
answered by examining cross-partial derivatives. Note first that if F admits a C² potential function f, then the symmetry of the Hessian matrices of f implies that

(2.32)  ∂F_i/∂x_j (x) = ∂²f/∂x_j∂x_i (x) = ∂²f/∂x_i∂x_j (x) = ∂F_j/∂x_i (x),

and hence that the derivative matrix DF(x) is symmetric for all x ∈ R^n. The converse
statement is also true, and provides the characterization of integrability we seek: if F is C¹, with DF(x) symmetric for all x ∈ R^n (i.e., whenever the integrability condition (2.32) holds), there is a function f: R^n → R such that ∇f = F. This sufficient condition for integrability remains valid whenever the domain of F is an open (or closed) convex subset of R^n. However, condition (2.32) does not ensure the existence of a potential function for vector fields defined on more general domains.

2.B Affine Calculus

The simplex in R^n, which serves as our state space in single population games, is an n − 1 dimensional set. As a consequence, derivatives of functions defined on the simplex cannot be computed in the manner described in Appendix 2.A, as partial derivatives of such functions do not exist. To understand differential calculus in this context, and in the more general context of multipopulation games, we must develop the tools of calculus for functions defined on affine spaces.

2.B.1 Linear Forms and the Riesz Representation Theorem

Let Z be a subspace of R^n, and let L(Z, R) be the set of linear maps from Z to R. L(Z, R)
is also known as the dual space of Z, and elements of L(Z, R), namely, maps λ : Z → R that
satisfy λ(az + bẑ) = aλ(z) + bλ(ẑ), are also known as linear forms.
Each vector y ∈ Z defines a linear form λ ∈ L(Z, R) via λ(z) = y'z. In fact, the converse
statement is also true: every linear form can be uniquely represented in this way.
Theorem 2.B.1 (The Riesz Representation Theorem). For each linear form λ ∈ L(Z, R), there
is a unique y ∈ Z, the Riesz representation of λ, such that λ(z) = y'z for all z ∈ Z.
Another way of describing the Riesz representation theorem is to say that Z and L(Z, R)
are linearly isomorphic: the map from Z to L(Z, R) described above is linear, one-to-one,
and onto.
It is crucial to note that when Z is a proper subspace of Rn , the linear form λ can
be represented by many vectors in Rn . What Theorem 2.B.1 tells us is that λ can be
represented by a unique vector in Z itself.
Example 2.B.2. Let Z = R²₀ = {z ∈ R² : z_1 + z_2 = 0}, and define the linear form λ ∈ L(Z, R) by λ(z) = z_1 − z_2. Then not only y = (1, −1)', but also ŷ = (3, 1)', represents λ: if z ∈ Z, then ŷ'z = 3z_1 + z_2 = 3z_1 + (−z_1) = 2z_1 = z_1 − z_2 = y'z = λ(z). But since y is an element of Z, it is the Riesz representation of λ. §
In this example, the reason that both y and ŷ can represent λ is that their difference, ŷ − y = (2, 2)', is orthogonal to Z. This suggests a simple way of recovering the Riesz representation of a linear form from an arbitrary vector representation: eliminate the portion orthogonal to Z by applying the orthogonal projection P_Z.
Theorem 2.B.3. Let λ ∈ L(Z, R) be a linear form. If ŷ ∈ R^n represents λ, in the sense that λ(z) = ŷ'z for all z ∈ Z, then y = P_Z ŷ is the Riesz representation of λ.
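Theorem 2.B.3 can be checked with a few lines of code; the script below reuses the data of Example 2.B.2, with the projection onto Z written out by hand.

```python
# Z = {z in R^2 : z1 + z2 = 0}; the orthogonal projection onto Z subtracts
# the mean of the coordinates (this is P_Z = I - (1/2)11').
def project(y):
    m = (y[0] + y[1]) / 2.0
    return [y[0] - m, y[1] - m]

lam = lambda z: z[0] - z[1]   # the linear form of Example 2.B.2
y_hat = [3.0, 1.0]            # one of many vectors representing lam

z = [0.7, -0.7]               # a vector in Z
print(lam(z), y_hat[0] * z[0] + y_hat[1] * z[1])   # y_hat represents lam on Z

print(project(y_hat))   # the Riesz representation in Z itself: [1.0, -1.0]
```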
Example 2.B.4. Recall that the orthogonal projection onto R²₀ is Φ = I − ½11'. Thus, in the previous example, we can recover y from ŷ in the following way:

y = Φŷ = (I − ½11')(3, 1)' = (3, 1)' − (2, 2)' = (1, −1)'. §

2.B.2 Dual Characterizations of Multiples of Linear Forms

Before turning our attention to calculus, we present some results that characterize
when two linear forms are scalar multiples of one another. We will use these results
when studying imitative dynamics in Chapters 4 and 7; see especially Exercise 4.4.18 and
Theorem 7.5.9.
If the vectors v ∈ Rn and w ∈ Rn are nonzero multiples of one another, then v and w
clearly are orthogonal to the same set of vectors in Rn . Conversely, if {v}⊥ = { y ∈ Rn : v y =
0} equals {w}⊥ , then v and w must be (nonzero) multiples of one another, as they are both
normal vectors of the same hyperplane.
When are v and w positive multiples of one another? This is the case if and only if
the set H (v) = { y ∈ Rn : v y ≥ 0}, the closed halfspace consisting of those vectors with
which v forms an acute or right angle, is equal to the corresponding set H (w). Clearly,
H (v) = H (w) implies that {v}⊥ = {w}⊥ , and so that v = cw; since v ∈ H (v) = H (w), it must
be that c > 0.
In summary, we have
Observation 2.B.5.
(i) {x ∈ R^n : v'x = 0} = {x ∈ R^n : w'x = 0} if and only if v = cw for some c ≠ 0.
(ii) {x ∈ R^n : v'x ≥ 0} = {x ∈ R^n : w'x ≥ 0} if and only if v = cw for some c > 0.
Proposition 2.B.6 provides analogues of the characterizations above for settings in
which one can only compare how v and w act on vectors in some subspace Z ⊆ Rn . Since
these comparisons relate v and w as linear forms on Z, Theorem 2.B.3 suggests that the
characterizations should be expressed in terms of the orthogonal projections of v and w
onto Z.
Proposition 2.B.6.
(i) {z ∈ Z : v'z = 0} = {z ∈ Z : w'z = 0} if and only if P_Z v = c P_Z w for some c ≠ 0.
(ii) {z ∈ Z : v'z ≥ 0} = {z ∈ Z : w'z ≥ 0} if and only if P_Z v = c P_Z w for some c > 0.
Proof. The "if" direction of part (i) is immediate. For the "only if" direction, observe that v'z = 0 for all z ∈ Z if and only if v'P_Z x = 0 for all x ∈ R^n. Since the matrix P_Z is symmetric, we can rewrite the equality above as (P_Z v)'x = 0; thus, the conclusion that P_Z v = c P_Z w with c ≠ 0 follows from Observation 2.B.5(i). The proof of part (ii) follows similarly from Observation 2.B.5(ii).
To cap this discussion, we note that both parts of Observation 2.B.5 are the simplest
cases of more general duality results that link a linear map A ∈ L(R^m, R^n) ≡ R^{n×m} with its transpose A' ∈ L(R^n, R^m) ≡ R^{m×n}. Part (i) is essentially the m = 1 case of the Fundamental Theorem of Linear Algebra:

(2.33)  range(A) = (nullspace(A'))⊥.

In equation (2.33), the set range(A) = {w ∈ R^n : w = Ax for some x ∈ R^m} is the span of
the columns of A. The set nullspace(A') = {y ∈ R^n : A'y = 0} consists of the vectors that A' maps to the origin; equivalently, it is the set of vectors that are orthogonal to every column of A. Viewed in this light, equation (2.33) says that w is a linear combination of the columns of A if and only if any y that is orthogonal to each column of A is also orthogonal to w. While (2.33) is of basic importance, it is quite easy to derive after taking orthogonal complements:

(range(A))⊥ = {y ∈ R^n : y'Ax = 0 for all x ∈ R^m} = {y ∈ R^n : y'A = 0'} = nullspace(A').
Part (ii) of Observation 2.B.5 is essentially the m = 1 case of Farkas’s Lemma:
(2.34)  [w = Ax for some x ∈ R^m_+] if and only if [[A'y ≥ 0 ⇒ w'y ≥ 0] for all y ∈ R^n].

In words: w is a nonnegative linear combination of the columns of A if and only if any y that forms a weakly acute angle with each column of A also forms a weakly acute angle with w. Despite their analogous interpretations, statement (2.34) is considerably more difficult to prove than statement (2.33); see the Notes.

2.B.3 Derivatives of Functions on Affine Spaces

Before considering calculus on affine spaces, let us briefly review differentiation of
scalar-valued functions on R^n. If f is a C¹ function from R^n to R, then its derivative at x,
denoted D f (x), is an element of L(Rn , R), the set of linear maps from Rn to R. For each
x ∈ Rn , the map D f (x) takes vectors z ∈ Rn as inputs and returns scalars D f (x)z ∈ R as
outputs. The latter expression appears in the ﬁrst order Taylor expansion
f(x + z) = f(x) + Df(x)z + o(z) for all z ∈ R^n.

By the Riesz Representation Theorem, there is a unique vector ∇f(x) ∈ R^n satisfying Df(x)z = ∇f(x)'z for all z ∈ R^n. We call ∇f(x) the gradient of f at x. In the present full-dimensional case, ∇f(x) is the vector of partial derivatives ∂f/∂x_i(x) of f at x.
Now, let A ⊆ Rn be an aﬃne space with tangent space TA, and consider a function
f : A → R. (As in Appendix 2.A, the ideas to follow can also be applied to functions
whose domain is a set that is open (or closed) relative to A.) We say that f is diﬀerentiable
at x ∈ A if there is a linear map D f (x) ∈ L(TA, R) satisfying
f(x + z) = f(x) + Df(x)z + o(z) for all z ∈ TA.

The gradient of f at x is the Riesz representation of Df(x). In other words, it is the unique vector ∇f(x) ∈ TA such that Df(x)z = ∇f(x)'z for all z ∈ TA. If the function ∇f: A → TA is continuous, then f is continuously differentiable, or of class C¹.
When A = R^n, this definition of the gradient is simply the one presented earlier, and ∇f(x) is the only vector in R^n that represents Df(x). But in lower dimensional cases, there are many vectors in R^n that can represent Df(x). The gradient vector ∇f(x) is the only one lying in TA; all others are obtained by summing ∇f(x) and an element of (TA)⊥.
When A = Rn , the gradient of f at x is just the vector of partial derivatives of f at x.
But in other cases, the partial derivatives of f may not even exist. How does one compute
∇f(x) then? Usually, it is easiest to extend the function f to all of R^n in some smooth way,
and then to compute the gradient by way of this extension. In some cases (e.g., when f is
a polynomial), obtaining the extension is just a matter of declaring that the domain is Rn .
But even in this situation, there is an alternative extension that is often handy.
Proposition 2.B.7. Let f: A → R be a C¹ function on the affine set A, and let Z = TA.
(i) Let f̃: R^n → R be any C¹ extension of f. Then ∇f(x) = P_Z ∇f̃(x) for all x ∈ A.
(ii) Define f̄: R^n → R by f̄(y) = f(P_Z y + z⊥_A), where z⊥_A is the unique element of A ∩ Z⊥. Then ∇f̄(x) = ∇f(x) for all x ∈ A.
In words, f̄ assigns the value f(x) to each point in R^n whose orthogonal projection onto TA = Z is the same as that of x ∈ A; the gradient of f̄ is identical to the gradient of f on the set A.
Proof. Part (i) follows immediately from the relevant definitions. To prove part (ii), suppose that x ∈ A. Then by the chain and product rules,

Df̄(x) = D(f(P_Z x + z⊥_A)) = Df(x)P_Z.

This linear form on R^n is represented by the (column) vector ∇f̄(x) = (∇f(x)'P_Z)' ∈ R^n. But since the orthogonal projection matrix P_Z is symmetric, and since ∇f(x) ∈ Z, we conclude that

∇f̄(x) = (∇f(x)'P_Z)' = P_Z'∇f(x) = P_Z ∇f(x) = ∇f(x).

The fact that P_Z is an orthogonal projection makes this proof simple: since P_Z is symmetric, we are able to transfer its action from the displacement direction z ∈ Z to the vector ∇f(x) itself.
Similar considerations arise for vector-valued functions defined on affine spaces, and
also for higher order derivatives. If F : A → Rm is C1 , its derivative at x ∈ A is a linear map
DF(x) ∈ L(Z, Rm ), where we once again write Z for TA. While there are many matrices
in Rm×n that represent this derivative, applying the logic above to each component of F
shows that there is a unique such matrix, called the Jacobian matrix or derivative matrix,
whose rows are elements of Z. As before, we abuse notation by denoting this matrix
DF(x). But unlike before, this abuse can create some confusion: if F is “automatically”
deﬁned on all of Rn , one must be careful to distinguish between the derivative matrix of
F : Rn → Rm at x and the derivative matrix of its restriction FA : A → Rm at x; they are
related by DFA (x) = DF(x)PZ .
If the function f: A → R is C², then its second derivative at x ∈ A is a symmetric bilinear map D²f(x) ∈ L²_s(Z, R). There are many symmetric matrices in R^{n×n} that represent D²f(x), but there is a unique such matrix whose rows and columns are in Z. We call this
matrix the Hessian of f at x, and denote it ∇²f(x). If f̃: R^n → R is any C² extension of f, then we can compute the Hessian of f as ∇²f(x) = P_Z ∇²f̃(x) P_Z; if f̄(y) = f(P_Z y + z⊥_A) is the constant orthogonal extension of f to R^n, then ∇²f(x) = ∇²f̄(x).

2.B.4 Affine Integrability

A necessary and sufficient condition for a C¹ vector field F: R^n → R^n to admit a potential function (that is, a scalar-valued function f satisfying ∇f(x) = F(x) for all x ∈ R^n) is that its derivative matrix DF(x) be symmetric for all x ∈ R^n. We now state
a deﬁnition of potential functions for cases in which the map F is only deﬁned on an
aﬃne space, and show that an appropriate symmetry condition on DF(x) is necessary
and sufficient for a potential function to exist. We also relate these notions to their full-dimensional analogues.
Let A ⊆ R^n be an affine space with tangent space Z = TA, and let z⊥_A be the unique element of A ∩ Z⊥. Suppose that the map F: A → R^n is continuous. We call the function f: A → R a potential function for F if

(2.35)  ∇f(x) = P_Z F(x) for all x ∈ A.

What does this definition require? Since ∇f(x) ∈ Z, the action of ∇f(x) on Z⊥ is null (that is, (z⊥)'∇f(x) = 0 whenever z⊥ ∈ Z⊥). But since F(x) ∈ R^n, the action of F(x) on Z⊥ is not restricted in this way. Condition (2.35) requires that F(x) have the same action as ∇f(x) on
Z, but places no restriction on how F(x) acts on the complementary set Z⊥ .
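As an illustrative sketch (the game below is an assumption, chosen so that its derivative matrix is symmetric): a common-interest game F(x) = Ax with A symmetric admits the potential f(x) = x'Ax/2 on the simplex, since along any tangent direction z the derivative of f equals F(x)'z = (P_Z F(x))'z.

```python
# A symmetric payoff matrix, so F(x) = Ax is integrable on the simplex.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 1.0]]

def F(x):   # payoff vector Ax
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

def f(x):   # candidate potential x'Ax / 2
    return 0.5 * sum(x[i] * A[i][j] * x[j] for i in range(3) for j in range(3))

x = [0.5, 0.3, 0.2]    # a state in the simplex
z = [1.0, -1.0, 0.0]   # a tangent direction: components sum to zero

eps = 1e-6
deriv = (f([xi + eps * zi for xi, zi in zip(x, z)]) - f(x)) / eps
print(deriv, sum(Fi * zi for Fi, zi in zip(F(x), z)))   # approximately equal
```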
Theorem 2.B.8 characterizes the smooth maps on A that admit potential functions. The
characterization is stated in terms of a symmetry condition on the derivatives DF(x).
Theorem 2.B.8. The C¹ map F: A → R^n admits a potential function if and only if DF(x) is symmetric with respect to Z × Z for all x ∈ A (i.e., if and only if z'DF(x)ẑ = ẑ'DF(x)z for all z, ẑ ∈ Z and x ∈ A).
Proof. To prove the “only if” direction, suppose that F admits a potential function f
satisfying condition (2.35). This means that for all x ∈ A, F(x) and ∇f(x) define identical linear forms in L(Z, R). By taking the derivative of each side of this identity, we find that DF(x) = ∇²f(x) as bilinear forms in L²(Z, R). But since ∇²f(x) is a symmetric bilinear form
on Z × Z (by virtue of being a second derivative), DF(x) is as well.
The “if” direction is a consequence of the following proposition.
Proposition 2.B.9. Define the map F̄: R^n → R^n by

F̄(y) = P_Z F(P_Z y + z⊥_A).

Then F̄ admits a potential function f̄: R^n → R if and only if DF(x) is symmetric with respect to Z × Z for all x ∈ A. In this case, f = f̄|_A is a potential function for F.
Proof. Define the function ξ: R^n → A by ξ(y) = P_Z y + z⊥_A. Then

(2.36)  DF̄(y) = D(P_Z F(ξ(y))) = P_Z DF(ξ(y)) P_Z.

Now F̄ admits a potential function if and only if DF̄(y) is symmetric for all y ∈ R^n. Equation
(2.36) tells us that the latter statement is true if and only if DF(x) is symmetric with respect
to Z × Z for all x ∈ A, proving the ﬁrst statement in the proposition.
To prove the second statement, suppose that f̄ is a potential function for F̄, and let f = f̄|_A. Then since ξ(x) = x for all x ∈ A, we find that

∇f(x) = P_Z ∇f̄(x) = P_Z F̄(x) = P_Z P_Z F(ξ(x)) = P_Z F(x).

This completes the proof of Theorem 2.B.8.
If the C¹ map F: A → R^n is integrable (i.e., if it admits a potential function f: A → R), can we extend F to all of R^n in such a way that the extension is integrable too? One natural way to proceed is to extend the potential function f to all of R^n. If one does so in an arbitrary way, then the projected maps P_Z F̃ and P_Z F will agree regardless of how the extended potential function f̃ is chosen (cf. Observation 2.2.3 and the subsequent discussion). But is it always possible to choose f̃ in such a way that F̃ and F are identical on A, so that the function F̃ is a genuine extension of the function F? Theorem 2.B.10 shows one way that this can be done.
Theorem 2.B.10. Suppose F: A → R^n is continuous with potential function f: A → R. Define f̃: R^n → R by

f̃(y) = f(ξ(y)) + (y − ξ(y))'F(ξ(y)), where ξ(y) = P_Z y + z⊥_A,

and define F̃: R^n → R^n by F̃(y) = ∇f̃(y). Then F̃|_A = F. Thus, any integrable map/potential function pair defined on A can be extended to a vector field/potential function pair defined on all of R^n.
Proof. We can compute F̃ from f̃ using the chain and product rules:

F̃(y)' = ∇f̃(y)'
       = ∇f(ξ(y))'P_Z + (y − ξ(y))'DF(ξ(y))P_Z + F(ξ(y))'(I − P_Z)
       = (P_Z F(ξ(y)))'P_Z + (y − ξ(y))'DF(ξ(y))P_Z + F(ξ(y))' − F(ξ(y))'P_Z
       = F(ξ(y))'P_Z P_Z + (y − ξ(y))'DF(ξ(y))P_Z + F(ξ(y))' − F(ξ(y))'P_Z
       = F(ξ(y))' + (y − ξ(y))'DF(ξ(y))P_Z.

If x ∈ A, then ξ(x) = x, allowing us to conclude that F̃(x) = F(x).
If F takes values in Z, so that F(x) = P_Z F(x) for all x ∈ A, then f̃(y) is simply f(ξ(y)), and so F̃(y) = P_Z ∇f(ξ(y)) = P_Z F(ξ(y)); in this case, the construction in Theorem 2.B.10 is identical to the one introduced in Proposition 2.B.9. The novelty in Theorem 2.B.10 is that it lets us extend the domain of F to all of R^n in an integrable fashion even when F takes values throughout R^n.

2.N Notes

Section 2.1. Sections 2.1.1 through 2.1.6 follow Sandholm (2001), while Section 2.1.7
follows Roughgarden and Tardos (2002, 2004).
Random matching in two player games with common interests deﬁnes a fundamental
model from population genetics; the common interest assumption reﬂects the shared fate
of two genes that inhabit the same organism. See Hofbauer and Sigmund (1988, 1998)
for further discussion. Congestion games ﬁrst appear in the seminal book of Beckmann
et al. (1956), who deﬁne a general model of traﬃc ﬂow with inelastic demand, and
use a potential function argument to establish the existence and uniqueness of Nash
equilibrium. The textbook of Sheﬃ (1985) treats congestion games from a transportation
science perspective at an undergraduate level; the more recent monograph of Patriksson
(1994) provides a comprehensive treatment of the topic from this point of view. Important
examples of ﬁnite player potential games are introduced by Rosenthal (1973) and Slade
(1994), and characterizations of this class of normal form games are provided by Monderer
and Shapley (1996), Ui (2000), and Sandholm (2008b). Example 2.1.6 and Exercise 2.1.12
are due to Sandholm (2005b). Braess’s paradox (Example 2.1.10) was ﬁrst reported in
Braess (1968). Exercise 2.1.11 is well known in the transportation science literature; it also
corrects a mistake (!) in Corollary 5.6 of Sandholm (2001). Versions of the eﬃciency results
in Section 2.1.6 are established by Dafermos and Sparrow (1969) for a model of traffic congestion and by Hofbauer and Sigmund (1988) for single population games. For
further discussion of constraint qualiﬁcation and of the interpretation of the KuhnTucker
ﬁrst order conditions, see Avriel (1976, Section 3.1) and Harker and Pang (1990). For a
complete treatment of eﬃciency bounds for congestion games, including more general
results than those described here, see Roughgarden (2005).
Section 2.2. This section follows Sandholm (2008b).
The general deﬁnition and basic properties of normal form potential games are established by Monderer and Shapley (1996). The triangular integrability condition from
Exercise 2.2.7 is due to Hofbauer (1985). The fact that constant games are potential games
in which potential equals aggregate payoﬀs is important in models of evolutionary implementation; see Sandholm (2002, 2005b, 2007b).
Section 2.3. This section follows Hofbauer and Sandholm (2008).
Evolutionarily stable strategies and neutrally stable strategies are introduced in the
single population random matching context by Maynard Smith and Price (1973) and
Maynard Smith (1982), respectively. The connection between interior ESS and negative
deﬁniteness of the payoﬀ matrix was ﬁrst noted by Haigh (1975). See Hines (1987) for
a survey of early work on these and related concepts. A version of the GESS concept is
used by Hamilton (1967) in his pioneering analysis of sexratio selection under the name
“unbeatable strategy”; see Hamilton (1996, p. 373–374) for an intriguing discussion of the
links between the notions of unbeatable strategy and ESS. Further discussion of ESS can
be found in the Notes to Chapter 7.
For more on Rock-Paper-Scissors, see Gaunersdorfer and Hofbauer (1995). The War
of Attrition is introduced in Bishop and Cannings (1978); for economic applications, see
Bulow and Klemperer (1999) and the references therein. Imhof (2005) derives a closed-form expression for the Nash equilibrium of the War of Attrition in terms of Chebyshev polynomials of the second kind. The dominant diagonal condition used in Example 2.3.12 is a consequence of the Geršgorin Disk Theorem; see Horn and Johnson (1985). This reference also presents the trace condition used in proving Proposition 2.3.10.
In the convex analysis literature, functions that satisfy our definition of stability (though typically with the inequality reversed) are called "monotone"—see Rockafellar (1970) or Hiriart-Urruty and Lemaréchal (2001). For more on pseudomonotonicity and pseudoconvexity, see Avriel (1976, Chapter 6) and Crouzeix (1998). The elementary proof
of existence of Nash equilibrium in stable games presented in Section 2.3.5 is a translation
to the present context of work on monotone operators on vector spaces due to Minty
(1967). Good references on the Minmax Theorem and its connection with the Separating
Hyperplane Theorem are Kuhn (2003) and Luce and Raiﬀa (1957).
Section 2.4. The deﬁnition of supermodular population games here comes from Hofbauer and Sandholm (2007). Finite player analogues of the results presented here are
established by Topkis (1979), Vives (1990), and Milgrom and Roberts (1990). Accounts
of these results can be found in Fudenberg and Tirole (1991, Sec. 12.3) and Vives (2005); Topkis (1998) and Vives (2000) are book-length studies. For macroeconomic applications, see Cooper (1999).
Appendix 2.A. For a textbook treatment of multivariate calculus that emphasizes the
notion of the derivative as a linear map, see Lang (1997, Chapter 17). For the Whitney
Extension Theorem, see Abraham and Robbin (1967) or Krantz and Parks (1999).
Appendix 2.B. The version of the Riesz Representation Theorem presented here, along
with further discussion of calculus on aﬃne spaces, can be found in Akin (1990). For
further discussion of the dual characterizations described at the end of Section 2.B.2, see
Lax (2007, Chapter 13) or Hiriart-Urruty and Lemaréchal (2001, Section A.4.3).

Part II

Deterministic Evolutionary Dynamics

CHAPTER THREE

Revision Protocols and Evolutionary Dynamics

3.0 Introduction

The theory of population games developed in the previous chapters provides a simple
framework for describing strategic interactions among large numbers of agents. Having
explored these games’ basic properties, we now turn to modeling the behavior of the
agents who play them.
Traditionally, predictions of behavior in games are based on some notion of equilibrium, typically Nash equilibrium or some reﬁnement thereof. These notions are founded
on the assumption of equilibrium knowledge, which posits that each player correctly anticipates how his opponents will act. The equilibrium knowledge assumption is diﬃcult
to justify, and in contexts with large numbers of agents it is particularly strong.
As an alternative to the equilibrium approach, we introduce an explicitly dynamic
model of choice, a model in which agents myopically alter their behavior in response to
their current strategic environment. This dynamic model does not assume the automatic
coordination of agents’ beliefs, and it can accommodate many speciﬁcations of agents’
choice procedures.
These procedures are specified formally by defining a revision protocol ρ. A revision protocol takes current payoffs and aggregate behavior as inputs; its outputs are conditional switch rates ρᵖᵢⱼ(πᵖ, xᵖ), which describe how frequently agents playing strategy i ∈ Sᵖ who are considering switching strategies switch to strategy j ∈ Sᵖ, given that the current payoff vector and population state are πᵖ and xᵖ. Revision protocols are flexible enough to accommodate a wide variety of choice paradigms, including ones based on imitation, optimization, and other approaches.

A population game F describes a strategic environment; a revision protocol ρ describes
the procedures agents follow in adapting their behavior to that environment. Together F
and ρ define a stochastic evolutionary process in which all random elements are idiosyncratic across agents. Since the number of agents is large, intuition from the law of large
numbers suggests that the idiosyncratic noise will average out, so that aggregate behavior
evolves according to an essentially deterministic process.
After formally deﬁning revision protocols, we spend Section 3.1 deriving the diﬀerential equation that describes this deterministic process. As the diﬀerential equation captures
expected motion under the original stochastic process, we call it the mean dynamic generated by F and ρ. The examples we present in Section 3.2 show how common dynamics
from the evolutionary literature can be derived through this approach.
In the story above, we began with a game and a revision protocol and derived a
diﬀerential equation on the state space X. But if our goal is to investigate the consequences
of a particular choice procedure, it is preferable to ﬁx this revision protocol and let the
game F vary. By doing so, we generate a map from population games to diﬀerential
equations that we call an evolutionary dynamic. This notion of an evolutionary dynamic is
developed in detail in Section 3.3.
Our derivation of deterministic evolutionary dynamics in this chapter is informal,
based solely on an appeal to the idea that idiosyncratic noise should be averaged away
when populations are large. We will formalize this logic in Chapter 9. There we specify
a Markov process to describe stochastic evolution in a large but ﬁnite population. We
then prove that over ﬁnite time spans, this Markov process converges to a deterministic
limit—namely, a solution trajectory of the mean dynamic—as the population size becomes
arbitrarily large.
Until then, we spend Chapters 3 through 8 working directly with the deterministic
limit. To prepare for this, we introduce the rudiments of the theory of ordinary diﬀerential
equations in Appendix 3.A and pursue this topic further in the appendices of the chapters
to come. 3.1
3.1.1 Revision Protocols and Mean Dynamics
Revision Protocols We now introduce a simple, general model of myopic individual choice in population
games.
Let F : X → Rⁿ be a population game with pure strategy sets (S¹, . . . , Sᵖ) and integer-valued population masses (m¹, . . . , mᵖ). We suppose for now that each population is large but finite: population p ∈ P has Nmᵖ members, where N is a positive integer. The set of feasible social states is therefore Xᴺ = X ∩ (1/N)Zⁿ = {x ∈ X : Nx ∈ Zⁿ}, a discrete grid embedded in the original state space X. We refer to the parameter N somewhat loosely as the population size.
The procedures agents follow in deciding when to switch strategies and which strategies to switch to are called revision protocols.
p p p p Deﬁnition. A revision protocol ρp is a map ρp : Rn × Xp → Rn ×n . The scalar ρi j (πp , xp ) is
+
p
called the conditional switch rate from strategy i ∈ S to strategy j ∈ Sp given payoﬀ vector πp
and population state xp .
We will also refer to the collection ρ = (ρ1 , . . . , ρp ) as a revision protocol when no confusion
will arise.
A population game F, a population size N, and a revision protocol ρ define a continuous-time evolutionary process on Xᴺ. A one-size-fits-all interpretation of this process is as follows. Each agent in the society is equipped with a "stochastic alarm clock". The times between rings of an agent's clock are independent, each with a rate R exponential distribution. (This modeling device is often called a "Poisson alarm clock" for reasons to be made clear below.) We assume that the rate R satisfies

R ≥ max_{p, πᵖ, xᵖ, i} Σ_{j∈Sᵖ} ρᵖᵢⱼ(πᵖ, xᵖ),

and that the ring times of different agents' clocks are independent of one another.
The ringing of a clock signals the arrival of a revision opportunity for the clock’s
owner. If an agent playing strategy i ∈ Sᵖ receives a revision opportunity, he switches to strategy j ≠ i with probability ρᵖᵢⱼ/R, and he continues to play strategy i with probability 1 − Σ_{j≠i} ρᵖᵢⱼ/R; this decision is made independently of the timing of the clocks' rings. If a switch occurs, the population state changes accordingly, from the old state x to a new state y that accounts for the agent's choice. As the evolutionary process proceeds, the alarm clocks and the revising agents are influenced by the prior history of the process only by way of the current values of payoffs and the social state.
This interpretation of the evolutionary process can be applied to any revision protocol.
Still, simpler interpretations are often available for protocols with additional structure.
To motivate one oft-satisfied structural condition, observe that in the interpretation provided above, the diagonal components ρᵖᵢᵢ of the revision protocol play no role whatsoever. But if the protocol is exact—that is, if there is a constant R > 0 such that

(3.1)  Σ_{j∈Sᵖ} ρᵖᵢⱼ(πᵖ, xᵖ) = R for all πᵖ ∈ R^{nᵖ}, xᵖ ∈ Xᵖ, i ∈ Sᵖ, and p ∈ P,

then the values of these diagonal components become meaningful: in this case, ρᵖᵢᵢ/R = 1 − Σ_{j≠i} ρᵖᵢⱼ/R is the probability that a strategy i player who receives a revision opportunity does not switch strategies.
Exact protocols are particularly easy to interpret when R = 1: in this case, agents' clocks ring at rate 1, and for every strategy j ∈ Sᵖ, ρᵖᵢⱼ itself represents the probability that an i player whose clock rings proceeds by playing strategy j. We will henceforth assume that protocols described as exact have clock rate R = 1 unless a different clock rate is specified explicitly. This focus on unit clock rates is not very restrictive: the only effect of replacing a protocol ρ with its scalar multiple (1/R)ρ is to change the speed at which the evolutionary process runs by a constant factor.

Other examples of protocols that allow alternative interpretations of the evolutionary process can be found in Section 3.2.

3.1.2 Mean Dynamics
The model above defines a stochastic process {Xᴺₜ} on the state space Xᴺ. We now derive a deterministic process that describes the expected motion of {Xᴺₜ}. In Chapter 9, we will prove that this deterministic process provides a very good approximation of the behavior of the stochastic process {Xᴺₜ} so long as the time horizon of interest is finite and the population size is sufficiently large. But having noted this result, we will focus in the intervening chapters on the deterministic process itself.
The times between rings of each agent’s stochastic alarm clock are independent and
follow a rate R exponential distribution. How many times will this agent’s clock ring
during the next t time units? A basic result from probability theory shows that the
number of rings during time interval [0, t] follows a Poisson distribution with mean Rt.
This fact is all we need to perform the analysis below; a detailed account of the exponential
and Poisson distributions can be found in Appendix 9.A.
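The ring-count fact can be illustrated with a short simulation (the rate, horizon, and trial count below are illustrative values of our choosing): summing i.i.d. rate-R exponential gaps and counting how many fall in [0, t] yields, on average, close to Rt rings, as the Poisson(Rt) law predicts.

```python
import random

random.seed(0)

R, t = 2.0, 5.0            # clock rate and time horizon (illustrative values)
trials = 20000

def rings_in(t, R):
    """Count clock rings in [0, t] when gaps are i.i.d. rate-R exponentials."""
    count, elapsed = 0, 0.0
    while True:
        elapsed += random.expovariate(R)
        if elapsed > t:
            return count
        count += 1

mean_rings = sum(rings_in(t, R) for _ in range(trials)) / trials
print(mean_rings)          # close to R * t = 10 in expectation
```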
Let us now compute the expected motion of the stochastic process {Xᴺₜ} over the next dt time units, where dt is small. To rein in our notation we focus on the single population case.

Each agent in the population receives revision opportunities according to an exponential distribution with rate R, and so each expects to receive R dt opportunities during the next dt time units. Thus, if the current state is x, the expected number of revision
opportunities received by agents currently playing strategy i is approximately
Nxi R dt.
We say “approximately” because the value of xi may change during time interval [0, dt],
but this change is very likely to be small if dt is small.
Since an i player who receives a revision opportunity switches to strategy j with
probability ρi j /R, the expected number of such switches during the next dt time units is
approximately
Nxi ρi j dt.
It follows that the expected change in the use of strategy i during the next dt time units is approximately

(3.2)  N ( Σ_{j∈S} xⱼρⱼᵢ − xᵢ Σ_{j∈S} ρᵢⱼ ) dt.

The first term in expression (3.2) captures switches to strategy i from other strategies, while the second captures switches to other strategies from strategy i. Dividing expression (3.2) by N yields the expected change in the proportion of agents choosing strategy i: that is, in component xᵢ of the social state. We obtain a differential equation for the social state by eliminating the time differential dt:

ẋᵢ = Σ_{j∈S} xⱼρⱼᵢ − xᵢ Σ_{j∈S} ρᵢⱼ.

This ordinary differential equation is the mean dynamic corresponding to revision protocol ρ.
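The single-population formula above can be sketched directly in code. The function below is our own illustration (the helper name, the Rock-Paper-Scissors-style payoff matrix, and the pairwise-comparison protocol used to exercise it are all our choices, not definitions from the text):

```python
import numpy as np

def mean_dynamic(rho, F, x):
    """Single-population mean dynamic:
    xdot_i = sum_j x_j rho_ji(pi, x) - x_i sum_j rho_ij(pi, x), with pi = F(x).
    rho returns an n-by-n matrix of conditional switch rates."""
    pi = F(x)
    r = rho(pi, x)
    inflow = x @ r                    # (x @ r)_i = sum_j x_j r[j, i]
    outflow = x * r.sum(axis=1)       # x_i * sum_j r[i, j]
    return inflow - outflow

# A hypothetical 3-strategy game F(x) = A x (Rock-Paper-Scissors-style, our choice) ...
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
F = lambda x: A @ x

# ... and a pairwise-comparison protocol rho_ij = [pi_j - pi_i]_+ (our choice).
rho = lambda pi, x: np.maximum(pi[None, :] - pi[:, None], 0.0)

x = np.array([0.2, 0.3, 0.5])
v = mean_dynamic(rho, F, x)
print(np.isclose(v.sum(), 0.0))       # True: growth rates sum to zero
```

Because inflows and outflows are the same double sum with indices swapped, the components of the mean dynamic always sum to zero, so motion stays in the simplex.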
We now describe the mean dynamic for the general multipopulation case.
Definition. Let F be a population game, and let ρ be a revision protocol. The mean dynamic corresponding to F and ρ is

(M)  ẋᵖᵢ = Σ_{j∈Sᵖ} xᵖⱼ ρᵖⱼᵢ(Fᵖ(x), xᵖ) − xᵖᵢ Σ_{j∈Sᵖ} ρᵖᵢⱼ(Fᵖ(x), xᵖ).

3.1.3 Target Protocols and Target Dynamics

To conclude this section, we introduce a condition on revision protocols that is satisfied
in many interesting examples, and that generates mean dynamics that are easy to describe
in geometric terms.
We say that ρ is a target protocol if conditional switch rates under ρ do not depend on agents' current strategies: in other words, ρᵖᵢⱼ may depend on the candidate strategy j, but not on the incumbent strategy i. We can represent target protocols using maps of the form τᵖ : R^{nᵖ} × Xᵖ → R₊^{nᵖ}, where ρᵖᵢⱼ ≡ τᵖⱼ for all i ∈ Sᵖ. This restriction yields mean dynamics of the form

(3.3)  ẋᵖᵢ = mᵖ τᵖᵢ(Fᵖ(x), xᵖ) − xᵖᵢ Σ_{j∈Sᵖ} τᵖⱼ(Fᵖ(x), xᵖ),

which we call target dynamics.
What is the geometric interpretation of these dynamics? If τᵖ(πᵖ, xᵖ) ∈ R₊^{nᵖ} is not the zero vector, we can define

λᵖ(πᵖ, xᵖ) = Σ_{i∈Sᵖ} τᵖᵢ(πᵖ, xᵖ)  and  σᵖᵢ(πᵖ, xᵖ) = τᵖᵢ(πᵖ, xᵖ) / λᵖ(πᵖ, xᵖ).

Then σᵖ(πᵖ, xᵖ) ∈ Δᵖ is a probability vector, and we can rewrite equation (3.3) as

(3.4)  ẋᵖ = λᵖ(Fᵖ(x), xᵖ) ( mᵖσᵖ(Fᵖ(x), xᵖ) − xᵖ ) if τᵖ(Fᵖ(x), xᵖ) ≠ 0, and ẋᵖ = 0 otherwise.
direction of the target state mp σp ∈ Xp , the representative of the probability vector σp ∈ ∆p
in the state space Xp = mp ∆p ; moreover, motion toward the target state proceeds at rate λp .
Figure 3.1.1(i) illustrates this idea in the single population case; since here the population’s
mass is 1, the target state is just the probability vector σp ∈ Xp = ∆p .
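The rate/target decomposition can be spot-checked numerically. In the sketch below (our own illustration, with m = 1), the target protocol τⱼ(π, x) = [πⱼ]₊ and the payoff and state vectors are hypothetical choices of ours; the point is only that the mean dynamic (3.3) and the form λ(σ − x) from (3.4) agree.

```python
import numpy as np

# A hypothetical target protocol on 3 strategies: tau_j = [pi_j]_+ (our choice).
tau = lambda pi, x: np.maximum(pi, 0.0)

pi = np.array([0.4, -0.1, 0.2])        # an arbitrary payoff vector
x = np.array([0.5, 0.25, 0.25])        # current state in the simplex

t = tau(pi, x)
lam = t.sum()                          # revision rate lambda
sigma = t / lam                        # target state, a probability vector

v_target = t - x * t.sum()             # target dynamic (3.3) with m = 1
v_decomp = lam * (sigma - x)           # the same motion written as in (3.4)
print(np.allclose(v_target, v_decomp))  # True
```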
Now suppose that protocol τ is an exact target protocol: a target protocol that is exact with clock rate R = 1 (see equation (3.1) and the subsequent discussion). In this case, we call the resulting mean dynamic an exact target dynamic. Since exactness implies that λᵖ ≡ 1, we often denote exact target protocols by σ rather than τ, emphasizing that the values of σᵖ : R^{nᵖ} × Xᵖ → Δᵖ are probability vectors. Exact target dynamics take the especially simple form

(3.5)  ẋᵖ = mᵖ σᵖ(Fᵖ(x), xᵖ) − xᵖ.

[Figure 3.1.1: Target dynamics in a single population, with target state σ = σ(F(x), x). Panel (i): a target dynamic, ẋ = λ(σ − x). Panel (ii): an exact target dynamic (λ ≡ 1), ẋ = σ − x.]

The vector of motion in (3.5) can be drawn with its tail at the current state xᵖ and its head
at the target state mᵖσᵖ, as illustrated in Figure 3.1.1(ii) in the single population case.

3.2 Examples

We now offer a number of examples of revision protocols and their mean dynamics
that we will revisit throughout the remainder of the book. Recall that
F̄ᵖ(x) = (1/mᵖ) Σ_{i∈Sᵖ} xᵖᵢ Fᵖᵢ(x)

represents the average payoff obtained by members of population p. It is useful to define the excess payoff to strategy i ∈ Sᵖ,

F̂ᵖᵢ(x) = Fᵖᵢ(x) − F̄ᵖ(x),

as the difference between strategy i's payoff and the average payoff in population p. The excess payoff vector for population p is written as

F̂ᵖ(x) = Fᵖ(x) − 1F̄ᵖ(x).
To conserve on notation, the examples to follow are stated for the single population
setting. When introducing revision protocols, we let π ∈ Rⁿ denote an arbitrary payoff vector; when the population state x ∈ X is also given, we let π̂ = π − 1x′π denote the resulting excess payoff vector.
Example 3.2.1. Pairwise proportional imitation. Revision protocols of the form

(3.6)  ρᵢⱼ(π, x) = xⱼ rᵢⱼ(π)

are called imitative protocols. The natural interpretation of these protocols differs somewhat
from the one presented in Section 3.1.1. Here, an agent who receives a revision opportunity
chooses an opponent at random and observes her strategy. If our agent is playing strategy
i and the opponent strategy j, the agent switches from i to j with probability proportional
to ri j . Note that the value of x j need not be observed; instead, this term in equation (3.6)
reﬂects the agent’s observation of a randomly chosen opponent.
Suppose that after selecting an opponent, the agent imitates the opponent only if the opponent's payoff is higher than his own, doing so with probability proportional to the payoff difference:

ρᵢⱼ(π, x) = xⱼ[πⱼ − πᵢ]₊.

The mean dynamic generated by this revision protocol is

ẋᵢ = Σ_{j∈S} xⱼρⱼᵢ(F(x), x) − xᵢ Σ_{j∈S} ρᵢⱼ(F(x), x)
= Σ_{j∈S} xⱼxᵢ[Fᵢ(x) − Fⱼ(x)]₊ − xᵢ Σ_{j∈S} xⱼ[Fⱼ(x) − Fᵢ(x)]₊
= xᵢ Σ_{j∈S} xⱼ(Fᵢ(x) − Fⱼ(x))
= xᵢ( Fᵢ(x) − F̄(x) ).

This equation, which we can rewrite as

(R)  ẋᵢ = xᵢF̂ᵢ(x),

defines the replicator dynamic, the best known dynamic in evolutionary game theory. Under
this dynamic, the percentage growth rate ẋᵢ/xᵢ of each strategy currently in use is equal to that strategy's current excess payoff; unused strategies always remain so. §
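The derivation above can be verified numerically: at any state in the simplex, the mean dynamic of the pairwise proportional imitation protocol coincides with the replicator dynamic xᵢF̂ᵢ(x). The sketch below is our own illustration (the payoff and state vectors are arbitrary choices):

```python
import numpy as np

def mean_dynamic(rho, pi, x):
    """Single-population mean dynamic: xdot_i = sum_j x_j rho_ji - x_i sum_j rho_ij."""
    r = rho(pi, x)
    return x @ r - x * r.sum(axis=1)

# Pairwise proportional imitation: rho_ij = x_j [pi_j - pi_i]_+.
rho = lambda pi, x: x[None, :] * np.maximum(pi[None, :] - pi[:, None], 0.0)

# The replicator dynamic directly: xdot_i = x_i (pi_i - average payoff).
replicator = lambda pi, x: x * (pi - x @ pi)

x = np.array([0.2, 0.3, 0.5])          # an arbitrary state in the simplex
pi = np.array([1.0, 4.0, 2.5])         # an arbitrary payoff vector
print(np.allclose(mean_dynamic(rho, pi, x), replicator(pi, x)))  # True
```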
Example 3.2.2. Pure imitation driven by dissatisfaction. Suppose that when a strategy i
player receives a revision opportunity, he opts to switch strategies with a probability that
is linearly decreasing in his current payoﬀ. (For example, agents might revise when their
payoﬀs do not meet a uniformly distributed random aspiration level.) In the event that
the agent decides to switch, he imitates a randomly selected opponent. This leads to the
revision protocol
ρi j (π, x) = (K − πi )x j ,
where the constant K is suﬃciently large that conditional switch rates are always positive.
The mean dynamic generated by this revision protocol is

ẋᵢ = Σ_{j∈S} xⱼρⱼᵢ(F(x), x) − xᵢ Σ_{j∈S} ρᵢⱼ(F(x), x)
= Σ_{j∈S} xⱼ(K − Fⱼ(x))xᵢ − xᵢ(K − Fᵢ(x))
= xᵢ( K − Σ_{j∈S} xⱼFⱼ(x) − K + Fᵢ(x) )
= xᵢF̂ᵢ(x).
Thus, this protocol’s mean dynamic is the replicator dynamic as well. §
Exercise 3.2.3. Imitation of success. Consider the revision protocol

ρᵢⱼ(π, x) = τⱼ(π, x) = xⱼ(πⱼ − K),

where the constant K is smaller than any feasible payoff.
(i) Oﬀer an interpretation of this protocol.
(ii) Show that this protocol generates the replicator dynamic as its mean dynamic.
(iii) Part (ii) implies that the replicator dynamic is a target dynamic. Compute the rate
λ(F(x), x) and target state σ(F(x), x) corresponding to population state x. Describe
how these vary as one changes the value of K.
Exercise 3.2.4. In the single population setting, we call a mean dynamic an antitarget dynamic if it can be expressed as

ẋ = λ̃(F(x), x)( x − σ̃(F(x), x) ),

where λ̃(π, x) ∈ R₊ and σ̃(π, x) ∈ Δ.
(i) Give a geometric interpretation of antitarget dynamics.
(ii) Show that the replicator dynamic is an antitarget dynamic.
Unlike the imitative protocols introduced above, the protocols to follow have agents
directly evaluate the payoﬀs of candidate strategies.
Example 3.2.5. Logit choice. Suppose that choices are made according to the logit choice
protocol, the exact target protocol defined by

ρᵢⱼ(π, x) = σⱼ(π, x) = exp(η⁻¹πⱼ) / Σ_{k∈S} exp(η⁻¹πₖ).

The parameter η > 0 is called the noise level. If η is large, choice probabilities under the logit rule are nearly uniform. But if η is near zero, choices are optimal with probability close to one, at least when the difference between the best and second best payoff is not too small. By equation (3.5), the exact target dynamic generated by protocol σ is

(L)  ẋᵢ = σᵢ(F(x), x) − xᵢ = exp(η⁻¹Fᵢ(x)) / Σ_{k∈S} exp(η⁻¹Fₖ(x)) − xᵢ.

This is the logit dynamic with noise level η. §
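A short sketch of the logit rule (our own illustration; the subtraction of the maximum is a standard numerical safeguard against overflow, not part of the definition, and the payoff vector and noise levels are arbitrary choices):

```python
import numpy as np

def logit_choice(pi, eta):
    """Logit choice probabilities at noise level eta, computed stably."""
    z = pi / eta
    z = z - z.max()                    # guard against overflow in exp
    w = np.exp(z)
    return w / w.sum()

pi = np.array([1.0, 0.5, 0.0])
print(logit_choice(pi, eta=10.0))      # large eta: nearly uniform
print(logit_choice(pi, eta=0.05))      # small eta: mass concentrates on the optimum

# Logit dynamic (L) at a state x: xdot = sigma(F(x), x) - x.
x = np.array([0.2, 0.3, 0.5])
v = logit_choice(pi, eta=0.1) - x
```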
Example 3.2.6. Comparison to the average payoﬀ. Consider the target protocol
ρᵢⱼ(π, x) = τⱼ(π, x) = [π̂ⱼ]₊.

When an agent's clock rings, he chooses a strategy at random. If that strategy's payoff is above average, the agent switches to it with probability proportional to its excess payoff. By equation (3.3), the induced target dynamic is

(BNN)  ẋᵢ = τᵢ(F(x), x) − xᵢ Σ_{j∈S} τⱼ(F(x), x) = [F̂ᵢ(x)]₊ − xᵢ Σ_{k∈S} [F̂ₖ(x)]₊.

This is the Brown–von Neumann–Nash (BNN) dynamic. §
ρi j (π, x) = [π j − πi ]+ .
When an agent’s clock rings, he selects a strategy at random. If the new strategy’s
payoﬀ is higher than his current strategy’s payoﬀ, he switches strategies with probability
proportional to the diﬀerence between the two payoﬀs. The resulting mean dynamic,
˙
xi = (S) x j ρ ji (F(x), x) − xi
j∈S = ρi j (F(x), x)
j∈S [F j (x) − Fi (x)]+ , x j [Fi (x) − F j (x)]+ − xi
j∈S j∈S is called the Smith dynamic. § 3.3 Evolutionary Dynamics With this background established, we now provide a formal deﬁnition of evolutionary
dynamics. Let P = {1, . . . , p } be a set of populations with masses mp and strategy sets Sp .
Let X be the corresponding set of social states:
X = {x ∈ Rn : x = (x1 , . . ., xp ), where
+ p i∈Sp xi = mp }. Deﬁne the sets F and T as follows: F = {F : X → Rn : F is Lipschitz continuous};
T = {x : [0, ∞) → X : x is continuous}
F is the set of population games with Lipschitz continuous payoﬀs; T is the set of
continuous forwardtime trajectories through the state space X.
Deﬁnition. An evolutionary dynamic is a setvalued map D : F ⇒ T . It assigns each
population game F ∈ F a set of trajectories D(F) ⊂ T satisfying
Existence and forward invariance: For each ξ ∈ X, there is a {xt }t≥0 ∈ D(F) with x0 = ξ.
Thus, for each game F and each initial condition ξ ∈ X, an evolutionary dynamic must
specify at least one solution trajectory that begins at ξ and then remains in X at all positive
times.
117 This deﬁnition of an evolutionary dynamic is rather general, in that it does not impose
a uniqueness requirement (i.e., since it allows multiple trajectories in D(F) to emanate
from a single initial condition). This generality allows us to handle dynamics deﬁned by
discontinuous diﬀerential equations and by diﬀerential inclusions—see Chapter 5. But for
dynamics deﬁned by Lipschitz continuous diﬀerential equations, this level of generality
is unnecessary: in this case, standard results allow us to ensure not only existence of
solutions, but also:
Uniqueness: For each ξ ∈ X, there is exactly one {xt }t≥0 ∈ D(F) with x0 = ξ.
Lipschitz continuity: For each t, xt = xt (ξ) is a Lipschitz continuous function of ξ.
The basic results on existence and uniqueness of solutions to ordinary diﬀerential
equations concern equations deﬁned on open sets. To contend with the fact that our
mean dynamics are deﬁned on the compact, convex set X, we need conditions ensuring
that solution trajectories do not leave this set. The required conditions are provided by
Theorem 3.3.1: if the vector ﬁeld VF : X → Rn is Lipschitz continuous, and if at each
state x ∈ X, the growth rate vector VF (x) is contained in the tangent cone TX(x), the set
of directions of motion from x that do not point out of X, then all of our desiderata for
solution trajectories are satisﬁed.
Theorem 3.3.1. Suppose VF : X → Rⁿ is Lipschitz continuous, and let S(VF) ⊂ T be the set of solutions {xₜ}t≥0 to ẋ = VF(x). If VF(x) ∈ TX(x) for all x ∈ X, then S(VF) satisfies existence, forward invariance, uniqueness, and Lipschitz continuity.
Theorem 3.3.1 follows directly from Theorems 3.A.2, 3.A.6, and 3.A.8 in Appendix 3.A.
Its implications for evolutionary dynamics are as follows.
Corollary 3.3.2. Let the map F → VF assign each population game F ∈ F a Lipschitz continuous
vector ﬁeld VF : X → Rn that satisﬁes VF (x) ∈ TX(x) for all x ∈ X. Deﬁne D : F ⇒ T by
D(F) = {{xₜ} ∈ T : {xₜ} solves ẋ = VF(x)}.
Then D is an evolutionary dynamic. Indeed, for each F ∈ F , the set D(F) ⊂ T satisﬁes not only
existence and forward invariance, but also uniqueness and Lipschitz continuity.
In light of Corollary 3.3.2, we can identify an evolutionary dynamic D with a map F → VF
that assigns population games to vector ﬁelds on X. We sometimes use the notation V(·)
to refer to an evolutionary dynamic as a map in this sense.

To link these results with revision protocols and mean dynamics, we characterize the
tangent cone requirement explicitly: for VF (x) to lie in TX(x), the growth rates of each
population’s strategies must sum to zero, so that population masses stay constant over
time, and the growth rates of unused strategies must be nonnegative.
Proposition 3.3.3. V(x) ∈ TX(x) if and only if these two conditions hold:

(i) Σ_{i∈Sᵖ} Vᵖᵢ(x) = 0 for all p ∈ P.

(ii) For all i ∈ Sᵖ and p ∈ P, xᵖᵢ = 0 implies that Vᵖᵢ(x) ≥ 0.
Thus, if V : X → Rn is the mean dynamic generated by a game F and a revision protocol ρ, then
V (x) ∈ TX(x) for all x ∈ X.
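Conditions (i) and (ii) can be spot-checked numerically. The sketch below is our own illustration: it evaluates the Smith dynamic of Example 3.2.7 at a boundary state where an unused strategy earns the highest payoff, and confirms that growth rates sum to zero and that the unused strategy's growth rate is nonnegative.

```python
import numpy as np

def smith(pi, x):
    """Smith dynamic (S): xdot_i = sum_j x_j [pi_i - pi_j]_+ - x_i sum_j [pi_j - pi_i]_+."""
    gain = np.maximum(pi[:, None] - pi[None, :], 0.0)   # gain[i, j] = [pi_i - pi_j]_+
    return gain @ x - x * gain.sum(axis=0)

x = np.array([0.0, 0.4, 0.6])      # strategy 1 is unused ...
pi = np.array([5.0, 1.0, 2.0])     # ... even though it earns the highest payoff
v = smith(pi, x)
print(np.isclose(v.sum(), 0.0), v[0] >= 0.0)   # conditions (i) and (ii) both hold
```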
Exercise 3.3.4. Verify these claims.

Appendix

3.A Ordinary Differential Equations

3.A.1 Basic Definitions

Every continuous vector field V : Rⁿ → Rⁿ defines an ordinary differential equation
(ODE) on Rⁿ, namely

d/dt xₜ = V(xₜ).

Often we write ẋₜ for d/dt xₜ; we also express the previous equation as

(D)  ẋ = V(x).

Equation (D) describes the evolution of a state variable xₜ over time. When the current
state is xt , the current velocity of state—in other words, the speed and direction of the
˙
change in the state—is V (xt ). The trajectory {xt }t∈I is a solution to (D) if xt = V (xt ) at all
times t in the interval I, so that at each moment, the time derivative of the trajectory is
described by the vector ﬁeld V (see Figure 3.A.1).
In many applications, one is interested in solving an initial value problem: that is, in
characterizing the behavior of solution(s) to (D) that start at a given initial condition
ξ ∈ Rn .
[Figure 3.A.1: A solution of an ordinary differential equation, with velocity vectors V(x₀), V(x₁), V(x₂) drawn along the trajectory through x₀, x₁, x₂.]

Example 3.A.1. Exponential growth and decay. The simplest differential equation is the linear equation ẋ = ax on the real line. What are the solutions to this equation starting from initial condition ξ ∈ R? It is easy to verify that xₜ = ξ exp(at) is a solution to this equation on the full time interval (−∞, ∞), since
d/dt xₜ = d/dt (ξ exp(at)) = a(ξ exp(at)) = axₜ,

as required. This solution describes a process of exponential growth or decay according to whether a is positive or negative.

In fact, xₜ = ξ exp(at) is the only solution to ẋ = ax from initial condition ξ. If {yₜ} is a solution to ẋ = ax from any initial condition, then

d/dt ( yₜ exp(−at) ) = ẏₜ exp(−at) − ayₜ exp(−at) = 0.

Hence, yₜ exp(−at) is constant, and so yₜ = ψ exp(at) for some ψ ∈ R. Since y₀ = ξ, it must be that ψ = ξ. §

3.A.2 Existence, Uniqueness, and Continuity of Solutions

Except in cases where the state variable x is one dimensional or the vector field V is
linear, explicit solutions to ODEs are usually impossible to obtain. To investigate dynamics
for which explicit solutions are unavailable, one begins by verifying that a solution exists
and is unique, and then uses various indirect methods to determine its properties.
The main tool for ensuring existence and uniqueness of solutions to ODEs is the Picard–Lindelöf Theorem. To state this result, fix an open set O ⊆ Rⁿ. We call the function f : O → Rᵐ Lipschitz continuous if there exists a scalar K such that

‖f(x) − f(y)‖ ≤ K‖x − y‖ for all x, y ∈ O.
More generally, we say that f is locally Lipschitz continuous if for all x ∈ O, there exists an
open neighborhood Ox ⊆ O containing x such that the restriction of f to Ox is Lipschitz
continuous. It is easy to verify that every C1 function is locally Lipschitz.
Theorem 3.A.2 (The Picard–Lindelöf Theorem). Let V : O → Rⁿ be locally Lipschitz continuous. Then for each ξ ∈ O, there exists a scalar T > 0 and a unique trajectory x : (−T, T) → O
such that {xt } is a solution to (D) with x0 = ξ.
The Picard–Lindelöf Theorem is proved using the method of successive approximations. Given an approximate solution xᵏ : (−T, T) → O with xᵏ₀ = ξ, one constructs a new trajectory xᵏ⁺¹ : (−T, T) → O using the map C that is defined as follows:

xᵏ⁺¹ₜ = C(xᵏ)ₜ ≡ ξ + ∫₀ᵗ V(xᵏₛ) ds.

It is easy to see that the trajectories {xₜ} that are fixed points of C are the solutions to (D) with
x0 = ξ. Thus, if C has a unique ﬁxed point, the theorem is proved. But it is possible
to show that if T is suﬃciently small, then C is a contraction in the supremum norm;
therefore, the desired conclusion follows from the Banach (or Contraction Mapping) Fixed
Point Theorem.
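The successive-approximation scheme can be sketched numerically for the linear equation ẋ = ax of Example 3.A.1. In the sketch below (our own illustration) the integral is replaced by a left Riemann sum, so the iterates converge to the explicit Euler approximation of the exact solution ξ exp(at) rather than to the exact solution itself; the remaining error shrinks with the step size.

```python
import math

# Solve x' = a*x, x(0) = xi, by iterating C(x)_t = xi + integral_0^t a*x_s ds.
a, xi, T, steps = 1.0, 1.0, 1.0, 1000
dt = T / steps
ts = [k * dt for k in range(steps + 1)]

x = [xi] * (steps + 1)                 # x^0: the constant trajectory at xi
for _ in range(30):                    # apply the operator C thirty times
    integral, new = 0.0, [xi]
    for k in range(steps):
        integral += a * x[k] * dt      # left Riemann sum for the integral
        new.append(xi + integral)
    x = new

# The iterates approach the true solution xi * exp(a t).
err = max(abs(x[k] - xi * math.exp(a * ts[k])) for k in range(steps + 1))
print(err < 1e-2)   # True
```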
If V is continuous but not Lipschitz, Peano’s Theorem tells us that solutions to (D) exist,
but in this case solutions need not be unique. The following example shows that when
V does not satisfy a Lipschitz condition, so that small changes in x can lead to arbitrarily
large changes in V (x), it is possible for solution trajectories to escape from states at which
the velocity under V is zero.
Example 3.A.3. Consider the ODE ẋ = (3/2) x^{1/3} on R. The right hand side of this equation is continuous, but it fails to be Lipschitz at x = 0. One solution to this equation from initial condition ξ = 0 is the stationary solution x_t ≡ 0. Another solution is given by x_t = t^{3/2}. In fact, for each t_0 ∈ [0, ∞), the trajectory that equals 0 until time t_0 and satisfies x_t = (t − t_0)^{3/2} thereafter is also a solution; so is the trajectory that satisfies x_t = −(t − t_0)^{3/2} after time t_0. §
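A small numerical experiment makes the fragility of the stationary solution vivid (the Euler scheme and step sizes below are illustrative, not from the text): an integrator started exactly at 0 stays on the stationary solution, while an arbitrarily small perturbation tracks the escaping solution x_t = t^{3/2} instead.

```python
import numpy as np

def euler(f, x0, t1=2.0, n=20000):
    """Forward-Euler integration of dx/dt = f(x) on [0, t1]; returns x(t1)."""
    dt, x = t1 / n, x0
    for _ in range(n):
        x += dt * f(x)
    return x

f = lambda x: 1.5 * np.sign(x) * abs(x) ** (1.0 / 3.0)  # odd real cube root

x_zero = euler(f, 0.0)     # stays on the stationary solution x_t = 0 forever
x_tiny = euler(f, 1e-12)   # escapes along x_t = t**1.5, reaching about 2**1.5 at t = 2
print(x_zero, x_tiny)
```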
The Picard–Lindelöf Theorem guarantees the existence of a solution to (D) over some open interval of times. This open interval need not be the full time interval (−∞, ∞), as the following example illustrates.
Example 3.A.4. Consider the C1 ODE ẋ = x² on R. The unique solution with initial condition ξ = 1 is x_t = 1/(1 − t). This solution exists for all negative times, but it “explodes” in forward time at t = 1. §
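The explosion is easy to see numerically (a rough forward-Euler sketch; the step size is an illustrative choice): as t approaches 1 the iterate grows without bound, and for t < 1 it stays close to the exact solution 1/(1 − t).

```python
# dx/dt = x^2 with x_0 = 1 has solution x_t = 1/(1 - t), which blows up at t = 1.
# Explicit Euler mirrors this: the iterate tracks 1/(1 - t) and diverges near t = 1.
x, t, dt = 1.0, 0.0, 1e-4
while t < 0.9:
    x += dt * x * x
    t += dt
print(x, 1.0 / (1.0 - t))   # both near 10 at t = 0.9
```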
When V is locally Lipschitz, one can always ﬁnd a maximal open time interval over
which the solution to (D) from initial condition ξ exists in the domain O. If V is deﬁned
throughout Rn and is bounded, then the speed of all solution trajectories is bounded as
well, which implies that solutions exist for all time.
Theorem 3.A.5. If V : Rn → Rn is locally Lipschitz continuous and bounded, then for each ξ ∈ Rn, the unique solution {x_t} to (D) with x_0 = ξ exists for all t ∈ (−∞, ∞).
We will often ﬁnd it convenient to discuss solutions to (D) from more than one initial
condition at the same time. To accomplish this most easily, we introduce the ﬂow of
diﬀerential equation (D).
Suppose that V : Rn → Rn is Lipschitz continuous, and let A ⊂ Rn be an invariant set
under (D): that is, solutions to (D) with initial conditions in A exist and remain in A at
all times t ∈ (−∞, ∞). Then the ﬂow φ : (−∞, ∞) × A → A generated by (D) is deﬁned by
φt (ξ) = xt , where {xt }t∈(−∞,∞) is the solution to (D) with initial condition x0 = ξ. If we ﬁx
ξ ∈ A and vary t, then {φt (ξ)}t∈(−∞,∞) is the solution orbit of (D) through initial condition ξ;
note also that φ satisﬁes the group property φt (φs (ξ)) = φs+t (ξ). If we instead ﬁx t and vary
ξ, then {φ_t(ξ)}_{ξ∈A′} describes the positions at time t of solutions to (D) with initial conditions in A′ ⊆ A.
Using this last notational device, we can describe the continuous variation of solutions
to (D) in their initial conditions.
Theorem 3.A.6. Suppose that V : Rn → Rn is Lipschitz continuous with Lipschitz constant
K, and that A ⊂ Rn is invariant under (D). Let φ be the ﬂow of (D), and ﬁx t ∈ (−∞, ∞).
Then φ_t(·) is Lipschitz continuous with Lipschitz constant e^{Kt}: for all ξ, χ ∈ A, we have that

‖φ_t(ξ) − φ_t(χ)‖ ≤ ‖ξ − χ‖ e^{Kt}.
The assumption that A is invariant is only made for notational convenience; the theorem
is valid as long as solutions to (D) from ξ and χ exist throughout the time interval from 0
to t.
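As a quick illustration (not from the text), the constant e^{Kt} is attained by the linear ODE ẋ = Kx, whose flow is φ_t(ξ) = ξ e^{Kt}; the parameter values below are arbitrary:

```python
import math

# For dx/dt = K*x the flow is phi_t(xi) = xi * exp(K*t), so the bound of
# Theorem 3.A.6, |phi_t(xi) - phi_t(chi)| <= |xi - chi| * exp(K*t),
# holds with equality: the e^{Kt} rate cannot be improved in general.
K, t = 0.7, 2.0
xi, chi = 1.3, 0.9
phi = lambda z: z * math.exp(K * t)
gap = abs(phi(xi) - phi(chi))
bound = abs(xi - chi) * math.exp(K * t)
print(gap, bound)   # equal for this linear flow
```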
The proof of Theorem 3.A.6 is a direct consequence of the following inequality, which
is of importance in its own right.
Lemma 3.A.7 (Grönwall’s Inequality). Let z : [0, T] → R+ be continuous. Suppose C ≥ 0 and K ≥ 0 are such that z_t ≤ C + ∫_0^t K z_s ds for all t ∈ [0, T]. Then z_t ≤ C e^{Kt} for all t ∈ [0, T].
If we set z_t = ‖φ_t(ξ) − φ_t(χ)‖, then the antecedent inequality in the lemma is satisfied when C = ‖ξ − χ‖ and K is the Lipschitz constant for V, so Theorem 3.A.6 immediately follows. Also note that setting ξ = χ establishes the uniqueness of solutions to (D) from each initial condition.

3.A.3 Ordinary Differential Equations on Compact Convex Sets

The Picard–Lindelöf Theorem concerns ODEs defined on open subsets of Rn. In contrast, evolutionary dynamics for population games are defined on the set of population
states X, which is compact and convex. Fortunately, existence and uniqueness of forward
time solutions can still be established in this setting.
To begin, we introduce the notion of forward invariance. The set C ⊆ Rn is forward
invariant under the Lipschitz ODE (D) if every solution to (D) that starts in C at time 0
remains in C at all positive times: if {xt } is the solution to (D) from ξ ∈ C, then xt exists and
lies in C at all t ∈ [0, ∞).
When C is forward invariant but not necessarily invariant under (D), we can speak of the semiflow φ : [0, ∞) × C → C generated by (D). While semiflows are not defined for negative times, they resemble flows in many other respects: by definition, φ_t(ξ) = x_t, where {x_t}_{t≥0} is the solution to (D) with initial condition x_0 = ξ; also, φ is continuous in t and ξ, and φ satisfies the semigroup property φ_t(φ_s(ξ)) = φ_{s+t}(ξ) for s, t ≥ 0.
Now suppose that the domain of the vector ﬁeld V is a compact, convex set C. Intuition
suggests that as long as V never points outward from C, solutions to (D) should be well
deﬁned and remain in C for all positive times.
Theorem 3.A.8 tells us that if we are given a Lipschitz continuous vector field V that is defined on a compact convex set C and that never points outward from the boundary of C, then the ODE ẋ = V(x) leaves C forward invariant. If in addition the negation of V never points outward from the boundary of C, then C is both forward and backward invariant under the ODE.
Theorem 3.A.8. Let C ⊂ Rn be a compact convex set, and let V : C → Rn be Lipschitz continuous.

(i) Suppose that V(x) ∈ TC(x) for all x ∈ C. Then for each ξ ∈ C, there exists a unique x : [0, ∞) → C with x_0 = ξ that solves (D).

(ii) Suppose that V(x) ∈ TC(x) ∩ (−TC(x)) for all x ∈ C. Then for each ξ ∈ C, there exists a unique x : (−∞, ∞) → C with x_0 = ξ that solves (D).
Proof. (i) Let V̂ : Rn → Rn be the extension of V : C → Rn defined by V̂(y) = V(Π_C(y)), where Π_C : Rn → C is the closest point projection onto C (see Section 1.B). Then V̂ is Lipschitz continuous and bounded. Thus, Theorem 3.A.5 tells us that the ODE

(3.7)  ẏ = V̂(y)

admits unique solutions from all initial conditions in Rn, and that these solutions exist for all (forward and backward) time. Now let ξ ∈ C, let {x_t}_{t∈(−∞,∞)} be the unique solution to (3.7) with x_0 = ξ, and suppose that x_t ∈ C for all positive t; then since V̂ and V agree on C, {x_t}_{t≥0} must be the unique forward solution to (D) with x_0 = ξ. Thus, to prove our result, it is enough to show that the set C is forward invariant under the dynamic (3.7).
Define the squared distance function δ_C : Rn → R by

δ_C(y) = min_{x∈C} ‖y − x‖².

One can verify that δ_C is differentiable with gradient

∇δ_C(y) = 2(y − Π_C(y)).
Hence, if { yt } is a solution to (3.7), then the chain rule tells us that
(3.8)  d/dt δ_C(y_t) = ∇δ_C(y_t)′ ẏ_t = 2(y_t − Π_C(y_t))′ V̂(y_t) = 2(y_t − Π_C(y_t))′ V(Π_C(y_t)).

Suppose we could show that this quantity is bounded above by zero (i.e., that when y_t − Π_C(y_t) and V(Π_C(y_t)) are nonzero, the angle between them is weakly obtuse). This
would imply that the distance between yt and C is nonincreasing over time—in other
words, that δC is a Lyapunov function for the set C under the dynamic (3.7)—which would
in turn imply that C is forward invariant under (3.7).
We divide the analysis into two cases. If y_t ∈ C, then y_t = Π_C(y_t), so expression (3.8) evaluates to zero. On the other hand, if y_t ∉ C, then the difference y_t − Π_C(y_t) is in the normal cone NC(Π_C(y_t)) (see Figure 3.A.2). Since V(Π_C(y_t)) ∈ TC(Π_C(y_t)), it follows that (y_t − Π_C(y_t))′ V(Π_C(y_t)) ≤ 0, so the proof is complete.
(ii) If V(x) ∈ TC(x) ∩ (−TC(x)), then a slight modification of the argument above shows that d/dt δ_C(y_t) = 2(y_t − Π_C(y_t))′ V(Π_C(y_t)) = 0, and so that the distance between y_t and C is constant over time under the dynamic (3.7). Therefore, C is both forward and backward invariant under (3.7), and hence under (D) as well.

Figure 3.A.2: The proof of Theorem 3.A.8.

3.N Notes

Section 3.1: Björnerstedt and Weibull (1996) introduce a version of the revision protocol
model and derive the mean dynamics associated with certain imitative decision rules; see
Weibull (1995, Sections 4.4 and 5.3) for a summary. The model studied here builds on
Benaïm and Weibull (2003) and Sandholm (2003, 2006a). Versions of target dynamics are considered in Sandholm (2005a) and Hofbauer and Sandholm (2008).
Section 3.2: The replicator dynamic was introduced by Taylor and Jonker (1978), but is
closely related to a number of older models from mathematical biology—see Schuster and
Sigmund (1983). The latter authors coined the term “replicator dynamic”, borrowing the
term “replicator” from Dawkins (1976, 1982). Example 3.2.1, Example 3.2.2, and Exercise
3.2.3 are due to Schlag (1998), Björnerstedt and Weibull (1996), and Hofbauer (1995a), respectively.
The logit dynamic is studied by Fudenberg and Levine (1998) and Hofbauer and
Sandholm (2002, 2007). The BNN dynamic was introduced in the context of symmetric
zero sum games by Brown and von Neumann (1950). Nash (1951) uses a discrete version
of this dynamic to devise a proof of existence of Nash equilibrium in normal form games
based on Brouwer’s ﬁxed point theorem. The Smith dynamic, also known as the pairwise
diﬀerence dynamic, was introduced by Smith (1984) to study the dynamics of traﬃc
ﬂow. Generalizations of all of the dynamics from this section are studied in the next two
chapters, where additional references can be found.
Appendix 3.A: Hirsch and Smale (1974) and Robinson (1995) are fine introductions to
ordinary diﬀerential equations at the undergraduate and graduate levels, respectively.
Theorem 3.A.8 is adapted from Smirnov (2002, Theorem 5.7).

CHAPTER FOUR

Deterministic Dynamics: Families and Properties

4.0 Introduction

In the model of evolution introduced in Chapter 3, a large society of agents recurrently
play a population game F by applying a revision protocol ρ. Through an informal appeal
to the law of large numbers, we argued that aggregate behavior in the society can be
described by a diﬀerential equation
(M)  ẋ = V_F(x)

on the state space X. Alternatively, by fixing the revision protocol ρ, we can define a map
from population games F to diﬀerential equations (M), a map that we call an evolutionary
dynamic.
In this chapter and the next, we introduce families of evolutionary dynamics, where
the dynamics within each family are deﬁned by qualitatively similar revision protocols.
We investigate the properties of the dynamics in each family. One of our goals in doing
so is to provide an evolutionary justiﬁcation of the prediction of Nash equilibrium play
(see Section 3.0).
The ﬁrst part of this chapter sets the stage for this analysis. We begin in Section 4.1
by stating general principles for evolutionary modeling in game theory. While some of
these principles are implicit in our formulation of evolutionary dynamics, others must be
imposed directly on our revision protocols.
We do so by introducing two desiderata for revision protocols in Section 4.2. Continuity
(C) asks that revision protocols depend continuously on their inputs. Scarcity of data (SD)
demands that the conditional switch rate from strategy i to strategy j only depend on the
127 payoﬀs of these two strategies. Protocols that respect these two properties do not make
unrealistic demands on the amount of information that agents in an evolutionary model
must possess.
Section 4.2 also oﬀers two conditions that relate aggregate behavior under evolutionary
dynamics to incentives in the underlying games. Nash stationarity (NS) asks that the
rest points of the mean dynamic be precisely the Nash equilibria of the game being
played. Positive correlation (PC) requires that out of equilibrium, strategies’ growth rates
be positively correlated with their payoﬀs. Evolutionary dynamics satisfying these two
properties respect payoﬀs in the underlying strategic interaction, and so agree with the
traditional, rationalistic approach to game theory in some primitive way. Section 4.3
previews the performance of each of our families of dynamics under the four desiderata,
and uses examples to highlight the properties of each.
Our study of the families themselves begins in Section 4.4, which introduces imitative
dynamics. These dynamics, whose prototype is the replicator dynamic, are the most
thoroughly studied in the evolutionary literature. While imitative dynamics have many
appealing properties, they admit rest points that are not Nash equilibria; thus, they fail
Nash stationarity (NS), and so fail to provide a full justiﬁcation of the Nash prediction.
We continue to work toward this justiﬁcation in Section 4.5, where we introduce excess
payoﬀ dynamics. These dynamics satisfy Nash stationarity, but as they require agents to
know the average payoﬀs obtained by members of their population, they fail scarcity of
data (SD), and so do not provide the justiﬁcation we seek.
We come to this justiﬁcation at last in Section 4.6, where we deﬁne pairwise comparison
dynamics. These dynamics, whose revision protocols only require agents to compare the
payoﬀs of the pair of strategies at issue, satisfy all four of our desiderata, and so provide
our justiﬁcation of the Nash prediction. This justiﬁcation is developed further in Section
4.7, which shows that any dynamic that combines imitation and pairwise comparison
satisﬁes all of our desiderata as well.
Of course, a more compelling justiﬁcation of the Nash prediction would not only link
Nash equilibrium with stationary states of an evolutionary dynamic, but would also show
that evolution leads to Nash equilibrium from disequilibrium states. This key issue will
be the focus of Part III of the book.

4.1 Principles for Evolutionary Modeling

We begin our discussion by proposing five principles for evolutionary modeling:

(i) Large populations
(ii) Inertia
(iii) Myopia
(iv) Limited information
(v) Insensitivity to modeling details

The first principle, that populations are large, is not only part of the definition of a population game; it is also a key component of the deterministic approximation theorem.
This principle buttresses the next two: inertia, that players only occasionally consider
switching strategies; and myopia, that agents condition choices on current behavior and
payoﬀs, and do not attempt to incorporate beliefs about the future course of play into their
decisions. Both of these principles are built into the deﬁnition of a revision protocol: agents
wait a random amount of time before revising, using procedures that only condition on
current payoﬀs and the current social state. All of the ﬁrst three principles are mutually
reinforcing: myopic behavior is most sensible when opponents’ behavior adjusts slowly,
and when populations are large enough that individual members become anonymous,
inhibiting repeated game eﬀects.
The fourth principle holds that agents possess limited information about opponents’
behavior. This principle ﬁts in easily with the previous three. When the number of agents
in an interaction is large, exact information about their aggregate behavior typically is
diﬃcult to obtain. If agents make costly eﬀorts to gather such information, it would be
incongruous to then assume that they act upon it in a shortsighted fashion. The principle
of limited information is expressed in our model through restrictions on allowable revision
protocols, as we discuss below.
The ﬁfth principle for evolutionary modeling, insensitivity to modeling details, is of a
diﬀerent nature than the others. According to this principle, one should be most satisﬁed
with properties of evolutionary dynamics that are not sensitive to the exact speciﬁcation of
the revision protocol. If a property holds for all revision protocols with a certain “family
resemblance”, then one can argue that the property is not a consequence of particular
choices of functional forms, but of more fundamental assumptions about how individuals
make decisions. It is because of this principle that our analyses to come focus on families
of evolutionary dynamics, and on establishing properties of dynamics that hold for all
family members.
The principle of insensitivity to modeling details provides a defense against a well-known critique of evolutionary analysis of games: that it is inherently arbitrary. According
to this critique, modelers who depart from the assumption of perfect rationality are
left with an overwhelming array of alternative assumptions; since the choice among
these assumptions is ultimately made in an ad hoc fashion, the predictions of boundedly
rational models must be viewed with suspicion. Heeding the fifth principle enables us to
dispel this critique: if all qualitatively similar models generate the same predictions, then
arbitrariness is no longer an issue.

4.2 Desiderata for Revision Protocols and Evolutionary Dynamics

We now turn from general principles for evolutionary modeling to specific desirable properties for revision protocols and their mean dynamics.

4.2.1 Limited Information

Since revision protocols can be essentially arbitrary functions of the payoff vector F^p(x)
and the population state xp , they allow substantial freedom to specify how agents respond
to current strategic conditions. But as we argued in the introduction, it is most in keeping
with the evolutionary paradigm to specify models of choice in which agents only possess
limited information about their strategic environment. Our ﬁrst two desiderata capture
this idea.
(C) Continuity: ρ is Lipschitz continuous.

(SD) Scarcity of data: For all i, j ∈ Sp and p ∈ P, ρ^p_ij only depends on π^p_i, π^p_j, and x^p_j.

It is contrary to the evolutionary paradigm to posit revision protocols that are extremely
sensitive to the exact values of payoﬀs or of the population state. When the population
size is large, exact information about these quantities can be diﬃcult to obtain; myopic
agents are unlikely to make the necessary eﬀorts. These concerns are reﬂected in condition
(C), which requires that agents’ revision protocols be Lipschitz continuous functions of
payoﬀs and the state. Put diﬀerently, condition (C) asks that small changes in aggregate
behavior not lead to large changes in players’ responses.
The information that agents in an evolutionary model possess depends on the application at hand: in some settings—for instance, if information is provided by a central
planner—agents might have every bit of information that could conceivably be of use.
But in others agents might know very little about their strategic environment.
Condition (SD), scarcity of data, imposes a stringent restriction on the nature of agents’
information, allowing agents to know only those facts that are most germane to the
decision at hand. Under this condition, an agent who receives a revision opportunity
chooses a candidate strategy j either by observing the strategy of a randomly chosen
opponent or by selecting a strategy at random from the set Sp . Then, the agent’s decision
about whether to switch is based only on the payoffs of the current strategy i and the
candidate strategy j.
Some of the protocols we consider require agents to know the payoﬀs of all strategies
in Sp . While such protocols fail condition (SD), one can imagine environments where
this payoﬀ information might be within the agents’ grasp. We therefore also propose this
weaker scarcity of data condition:
(SD*) For all i, j ∈ Sp and p ∈ P, ρ^p_ij only depends on π^p_1, π^p_2, . . ., π^p_{np}, and x^p_j.

To illustrate the use of these conditions, we recall some examples of revision protocols
from Chapter 3. As all of these examples satisfy continuity (C), we focus our attention on
scarcity of data.
Example 4.2.1. The following three revision protocols generate the replicator dynamic as
their mean dynamics:
(4.1)  ρ^p_ij(π^p, x^p) = (x^p_j / m^p) [π^p_j − π^p_i]_+ ,

(4.2)  ρ^p_ij(π^p, x^p) = (K^p − π^p_i) (x^p_j / m^p) ,

(4.3)  ρ^p_ij(π^p, x^p) = (x^p_j / m^p) (π^p_j − K^p) .

(In equations (4.2) and (4.3), we assume the constant K^p is chosen so that ρ^p_ij(π^p, x^p) ≥ 0.)
Protocol (4.1) is pairwise proportional imitation, protocol (4.2) is pure imitation driven by
dissatisfaction, and protocol (4.3) is imitation of success. All three of these protocols satisfy
condition (SD). §
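The claim that all three protocols generate the same mean dynamic can be checked directly. The sketch below (a single population with mass m^p = 1; the state, payoffs, and constants are arbitrary illustrative choices) plugs each protocol into the mean dynamic ẋ_i = Σ_k x_k ρ_ki − x_i Σ_k ρ_ik and compares the result with the replicator formula x_i (F_i − x′F):

```python
import numpy as np

def mean_dynamic(rho, pi, x):
    """x_i-dot = sum_k x_k rho(k,i) - x_i sum_k rho(i,k), with mass m = 1."""
    n = len(x)
    R = np.array([[rho(i, j, pi, x) for j in range(n)] for i in range(n)])
    return x @ R - x * R.sum(axis=1)

K = 10.0   # chosen above the largest payoff so protocol (4.2) stays nonnegative
pairwise_proportional = lambda i, j, pi, x: x[j] * max(pi[j] - pi[i], 0.0)  # (4.1)
dissatisfaction = lambda i, j, pi, x: (K - pi[i]) * x[j]                    # (4.2)
success = lambda i, j, pi, x: x[j] * pi[j]            # (4.3) with K^p = 0 (payoffs > 0)

x = np.array([0.5, 0.3, 0.2])
pi = np.array([1.0, 2.0, 0.5])                # payoffs at state x
replicator = x * (pi - x @ pi)                # x_i (F_i - average payoff)

for rho in (pairwise_proportional, dissatisfaction, success):
    print(np.allclose(mean_dynamic(rho, pi, x), replicator))   # True each time
```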
Example 4.2.2. The logit dynamic is derived from the exact target protocol

ρ^p_ij(π^p, x^p) = σ^p_j(π^p, x^p) = exp(η^{−1} π^p_j) / Σ_{k∈Sp} exp(η^{−1} π^p_k).

This protocol fails condition (SD), but it satisfies condition (SD*). §
p p p ˆ
ρi j (πp , xp ) = τ j (πp , xp ) = [π j ]+
induces the BNN dynamic as its mean dynamic. This protocol conditions on strategy j’s
p
p
1
1
ˆ
excess payoﬀ π j = π j − mp (xp ) πp , and hence on the population average payoﬀ mp (xp ) πp .
131 Since computing this average payoﬀ requires knowledge of the payoﬀs and utilization
levels of all strategies, this protocol fails both condition (SD) and condition (SD∗ ). § 4.2.2 Incentives and Aggregate Behavior Our two remaining desiderata impose restrictions on mean dynamics, linking the
evolution of aggregate behavior to the incentives in the underlying game. The ﬁrst of the
two constrains equilibrium behavior, the second disequilibrium dynamics.
(NS)
(PC) Nash stationarity:
VF (x) = 0 if and only if x ∈ NE(F).
p
p
Positive correlation: VF (x) 0 implies that VF (x) Fp (x) > 0. Nash stationarity (NS) requires that the Nash equilibria of the game F and the rest points
of the dynamic VF coincide. It can be split into two distinct restrictions. First, (NS) asks
that every Nash equilibrium of F be a rest point of VF . If state x is a Nash equilibrium,
then no agent beneﬁts from switching strategies; (NS) demands that in this situation, the
state be at rest under VF . This does not mean that the agents never switch strategies at
this state; instead, it requires that the expected aggregate impact of switches is nil.
Second, condition (NS) asks that every rest point of VF be a Nash equilibrium of F. If
the current population state is not a Nash equilibrium, then by deﬁnition there are agents
who would beneﬁt from switching strategies. Condition (NS) requires that some of these
agents eventually avail themselves of this opportunity.
Positive correlation (PC) is a mild payoﬀ monotonicity condition that has force whenever
a population is not at rest. To understand its name, view the strategy set Sp = {1, . . ., np }
as a probability space endowed with the uniform probability measure. Then the vectors
p
V^p_F(x) ∈ R^{np} and F^p(x) ∈ R^{np} can be interpreted as random variables on Sp, making it
To evaluate this quantity, we make a simple observation: if Y and Z are random
variables and the expectation of Y is zero, then the covariance of Y and Z is just the
expectation of their product:
Cov(Y, Z) = E(YZ) − E(Y)E(Z) = E(YZ).
p Since the dynamic VF keeps population masses constant (in other words, since VF (x) ∈
p
TXp ), we know that the components of VF (x) sum to zero. Thus
p
1
V (x)
np k E(V p (x)) = = 0, and so k∈Sp 132 p
p
1
V (x) Fk (x)
np k Cov(V p (x), Fp (x)) = E(V p (x) Fp (x)) = = 1
V p (x)
np Fp (x). k∈Sp
p p We can therefore restate condition (PC) as follows: if VF (x) 0, then Cov(VF (x), Fp (x)) > 0.
One can visualize condition (PC) through its geometric interpretation: whenever the
p
growth rate vector VF (x) is nonzero, it forms a strictly acute angle with the vector of
payoﬀs Fp (x) (see Examples 4.2.5 and 4.2.7 below). In rough terms, this means that the
direction of motion does not overly distort the direction of the payoﬀ vector.
In this connection, it is worth emphasizing that while the payoﬀ vector Fp (x) can be any
p
p
vector in Rn , forward invariance requires the growth rate vector VF (x) to be an element
of the tangent cone TXp (xp ): its components must sum to zero, and it must not assign
negative growth rates to unused strategies (Proposition 3.3.3). This means that in most
games, evolutionary dynamics must distort payoﬀ vectors in order to remain feasible.
The dynamic that minimizes this distortion, the projection dynamic, is studied in Chapter
5.
There is an important link between our two conditions: the outofequilibrium condition (PC) implies half of the equilibrium condition (NS). In particular, if positive correlation
holds, then every Nash equilibrium of F is a rest point under VF .
This is easiest to see in the single population setting. If x is a Nash equilibrium of F,
then F(x) is in the normal cone of X at x. Since VF (x) is a feasible direction of motion from
x, it is in the tangent cone of X at x; thus, the angle between F(x) and VF (x) cannot be acute.
Positive correlation therefore implies that x is a rest point of VF .
More generally, we have the following result.
Proposition 4.2.4. If VF satisﬁes (PC), then x ∈ NE(F) implies that VF (x) = 0.
Proof. Suppose that VF satisﬁes (PC) and that x ∈ NE(F). Recall that
x ∈ NE(F) ⇔ F(x) ∈ NX(x) ⇔ [v F(x) ≤ 0 for all v ∈ TX(x)] .
p Now ﬁx p ∈ P , and deﬁne the vector v ∈ Rn by vp = VF (x) and vq = 0 for q p. Then
p
v ∈ TX(x) by construction, and so VF (x) Fp (x) = v F(x) ≤ 0. Condition (PC) then implies
p
that VF (x) = 0. Since p was arbitrary, we conclude that VF (x) = 0.
Example 4.2.5. Consider the twostrategy coordination game F1 (x) 1 0 x1 x1 F(x) = F (x) = 0 2 x = 2x , 2
2
2 133 x1 x2
Figure 4.2.1: Condition (PC) in 12 Coordination. and the replicator dynamic for this game, ˆ
V1 (x) x1 F1 (x) x1 x1 − (x1 )2 + 2(x2 )2 V (x) = ˆ
V (x) = x F (x) = x 2x − (x )2 + 2(x )2
22
2
2
1
2
2 , both of which are graphed in Figure 4.2.1. At each state that is not a rest point, the angle
between F(x) and V (x) is acute. At each Nash equilibrium, no vector that forms an acute
angle with the payoﬀ vector is a feasible direction of motion; thus, all Nash equilibria
must be rest points under V . §
ˆ
Exercise 4.2.6. Suppose that F is a twostrategy game, and let VF and VF be Lipschitz
continuous dynamics that satisfy condition (PC). Show that if neither dynamic is at rest
ˆ
ˆ
at state x ∈ X, then VF (x) is a positive multiple of VF (x). Conclude that if VF and VF also
ˆ
satisfy condition (NS), then VF (x) = k(x)VF (x) for some positive function k : X → (0, ∞). In
ˆ
ˆ
this case, the phase diagrams of VF and VF are identical, and solutions to VF and VF diﬀer
only by a change in speed (cf Exercise 4.4.10 below). §
Example 4.2.7. Consider the threestrategy coordination game F1 (x) 1 0 0 x1 x1 F (x) = 0 2 0 x = 2x .
2 2 2
F(x) = F (x) 0 0 3 x 3x 2
2
2
134 Since payoﬀs are now vectors in R3 , they can no longer be drawn in a twodimensional
picture, so we draw the projected payoﬀ vectors x1 − 1 (x1 + 2x2 + 3x3 ) 3 1
2x − 1 (x + 2x + 3x ) 2 31
ΦF(x) = I − 3 11 F(x) = 2
3 3x − 1 (x + 2x + 3x ) 3
2
3
31
instead. Since dynamic VF also takes values in TX, drawing the growth rate vectors VF (x)
and the projected payoﬀ vectors ΦF(x) is enough to evaluate property (PC) (cf Exercise
4.2.8). In Figure 4.2.2(i), we plot the projected payoﬀs ΦF and the replicator dynamic; in
Figure 4.2.2(ii) we plot the projected payoﬀs ΦF and the BNN dynamic. In both cases,
except when VF (x) = 0, the angles between VF (x) and ΦF(x) are always acute. At each
Nash equilibrium x, all directions of motion from x that form an acute angle with ΦF(x)
are infeasible, and so both dynamics are at rest. §
Exercise 4.2.8. Let VF be an evolutionary dynamic for the single population game F. Show
that sgn(VF (x) F(x)) = sgn(VF (x) ΦF(x)). Thus, to check that (PC) holds, it is enough to
verify that it holds with respect to projected payoﬀs. 4.3 Families of Evolutionary Dynamics In the remainder of this chapter and in Chapter 5, we introduce various families and
examples of evolutionary dynamics, and we evaluate them in terms of our four desiderata:
continuity (C), scarcity of data (SD), Nash stationarity (NS), and positive correlation (PC).
Table 4.1 summarizes the results. Let us brieﬂy mention a few of the main ideas from the
analyses to come.
• Imitative dynamics, including the replicator dynamic, satisfy all of the desiderata
except for Nash stationarity (NS): these dynamics admit rest points that are not Nash
equilibria.
• Excess payoﬀ dynamics, including the BNN dynamic, satisfy all of our desiderata
except scarcity of data (SD): the revision protocols that generate these dynamics
involve comparisons between the individual strategies’ payoﬀs and the population’s
average payoﬀ.
• By introducing revision protocols that only require pairwise comparisons of payoﬀs,
we obtain a family of evolutionary dynamics that satisfy all four desiderata.
135 1 2 3 (i) The replicator dynamic
1 2 3 (ii) The BNN dynamic
Figure 4.2.2: Condition (PC) in 123 Coordination. 136 Family Leading example (C) (SD) (NS) (PC) imitation
excess payoﬀ
pairwise comparison replicator
BNN
Smith
best response
logit
projection yes
yes
yes
no
yes
no yes
no
yes
yesa
yesa
no perturbed best response a
b no
yes
yes
yesb
no
yes yes
yes
yes
yesb
no
yes These dynamics fail condition (SD), but satisfy the weaker requirement (SD*).
The best response dynamics satisfy versions of conditions (NS) and (PC) deﬁned for diﬀerential inclusions. Table 4.1: Families of evolutionary dynamics and their properties. • The best response dynamic satisﬁes versions of all of the desiderata except continuity: its revision protocol depends discontinuously on payoﬀs.
• We can eliminate the discontinuity of the best response dynamic by introducing
perturbations, but at the cost of violating the incentive conditions. In fact, choosing
the level of perturbations involves a tradeoﬀ between condition (C) and conditions
(NS) and (PC): smaller perturbations reduce the degree of smoothing, while larger
perturbations make the failures of the incentive conditions more severe.
• The projection dynamic minimizes the discrepancy at each state between the vector
of payoﬀs and the vector representing the directions of motion. It satisﬁes both of
incentive conditions, but neither of the limited information conditions. There are
a variety of close connections between the projection dynamic and the replicator
dynamic.
Figure 4.3.1 presents phase diagrams for the six basic dynamics in the standard RockPaperScissors game FR (x) 0 −1 1 xR xS − xP x = x − x . P R
F(x) = FP (x) = 1
0 −1 S F (x) −1 1 x x − x 0
S
P
R
S
1
The unique Nash equilibrium of RPS places equal mass on each strategy: x∗ = ( 1 , 1 , 3 ).
33
In the phase diagrams, colors represent speed of motion: within each diagram, motion
is fastest in the red regions and slowest in the blue ones. In this example, the maximum 137 (i) replicator (ii) projection (iii) Brownvon NeumannNash (iv) Smith (v) best response (vi) logit(.08) Figure 4.3.1: Six basic dynamics in the RockPaperScissors game. 138 √ speed under the replicator dynamic is 42 ≈ .3536, while the maximum speed under the
√
other ﬁve dynamics is 2 ≈ 1.4142. Some remarks on the phase diagrams:
• The replicator and projection dynamics exhibit closed orbits around the Nash equilibrium. Under the other four dynamics, the Nash equilibrium is globally asymptotically stable.
• The replicator dynamic has rest points at the Nash equilibrium and at each of the
pure states. Under the other dynamics, the only rest point is the Nash equilibrium.
• The phase diagram for the BNN dynamic can be divided into six regions. In the
“odd” regions, exactly one strategy has above average payoﬀs, so the dynamic
moves directly toward a pure state, just as under the best response dynamic. In
the “even” regions, two strategies have above average payoﬀs; as these regions are
traversed, the “target point” of the dynamic passes from one pure state to the next.
• Compared to those of the BNN dynamic, solutions of the Smith dynamic approach
the Nash equilibrium at closer angles and at higher speeds.
• Under the best response dynamic, solution trajectories always aim directly toward
the state representing the current best response. The trajectories are kinked whenever
best responses change.
• Unlike those of the best response dynamic, solution trajectories of the logit dynamic
are smooth. The directions of motion under the two dynamics are similar, except at
states near the boundaries of the best response regions.
• Under the replicator dynamic, the boundary consists of three rest points and three
heteroclinic orbits that connect distinct rest points. All told, the boundary forms what
is known as a heteroclinic cycle.
• Under the projection dynamic, there is a unique forward solution from each initial
condition, but backward solutions are not unique. For example, the outermost
closed orbit (the inscribed circle) is reached in ﬁnite time by every solution trajectory
that starts outside of it. In addition, there are solution trajectories that start in the
interior of the state space and reach the boundary in ﬁnite time—an impossibility
under any of the other dynamics.
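Two of the remarks above can be checked numerically. The sketch below (plain Python; the helper names are ours, not the text's) integrates the replicator dynamic ẋ_i = x_i(F_i(x) − x′F(x)) in standard RPS with a fourth-order Runge-Kutta step, and verifies that the Nash equilibrium is a rest point and that the product x_R·x_P·x_S, a known constant of motion for the replicator dynamic in this game, stays (nearly) constant along a solution, consistent with closed orbits around x*.

```python
def rps_payoffs(x):
    """Payoffs in standard Rock-Paper-Scissors: F_R = x_S - x_P, etc."""
    r, p, s = x
    return [s - p, r - s, p - r]

def replicator(x):
    """Replicator dynamic: xdot_i = x_i * (F_i(x) - average payoff)."""
    F = rps_payoffs(x)
    avg = sum(xi * Fi for xi, Fi in zip(x, F))
    return [xi * (Fi - avg) for xi, Fi in zip(x, F)]

def rk4_step(f, x, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(x)
    k2 = f([xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

# The Nash equilibrium x* = (1/3, 1/3, 1/3) is a rest point.
xstar = [1 / 3, 1 / 3, 1 / 3]
assert all(abs(v) < 1e-12 for v in replicator(xstar))

# Along an interior solution, x_R * x_P * x_S is (numerically) conserved.
x = [0.5, 0.3, 0.2]
c0 = x[0] * x[1] * x[2]
for _ in range(5000):          # integrate to t = 50
    x = rk4_step(replicator, x, 0.01)
assert abs(x[0] * x[1] * x[2] - c0) < 1e-5
```

The same loop with the Smith or BNN vector field substituted for `replicator` shows the product drifting toward its equilibrium value instead, in line with the global stability remark.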
We develop these and many other observations in the sections to come.

4.4 Imitative Dynamics

4.4.1 Definition

Imitative dynamics are based on revision protocols of the form
(4.4) ρ^p_{ij}(π^p, x^p) = x̂^p_j r^p_{ij}(π^p, x^p),

where x̂^p_j = x^p_j / m^p is the proportion of population p members playing strategy j ∈ S^p. We can interpret these protocols as follows: when an agent's clock rings, he randomly chooses an opponent from his population. If the agent is playing strategy i ∈ S^p and the opponent strategy j ∈ S^p, then the agent imitates the opponent with probability proportional to the conditional imitation rate r^p_{ij}.
The revision protocol (4.4) generates a mean dynamic of the form
(4.5) ẋ^p_i = Σ_{k∈S^p} x^p_k ρ^p_{ki}(F^p(x), x^p) − x^p_i Σ_{k∈S^p} ρ^p_{ik}(F^p(x), x^p)
            = Σ_{k∈S^p} x^p_k x̂^p_i r^p_{ki}(F^p(x), x^p) − x^p_i Σ_{k∈S^p} x̂^p_k r^p_{ik}(F^p(x), x^p)
            = x^p_i Σ_{k∈S^p} x̂^p_k ( r^p_{ki}(F^p(x), x^p) − r^p_{ik}(F^p(x), x^p) ).

If the revision protocol satisfies the requirements below, the differential equation above
deﬁnes an imitative dynamic.
Definition. Suppose that the conditional imitation rates r^p_{ij} are Lipschitz continuous, and that net conditional imitation rates are monotone:

(4.6) π^p_j ≥ π^p_i ⇔ r^p_{kj}(π^p, x^p) − r^p_{jk}(π^p, x^p) ≥ r^p_{ki}(π^p, x^p) − r^p_{ik}(π^p, x^p) for all i, j, k ∈ S^p and p ∈ P.

Then the map from population games F ∈ F to differential equations (4.5) is called an imitative dynamic.
Condition (4.6) says that whenever strategy j ∈ Sp has a higher payoﬀ than strategy
i ∈ Sp , then the net rate of imitation from any strategy k ∈ Sp to j exceeds the net rate of
imitation from k to i. We illustrate this condition in the next subsection using a variety of
examples; the condition’s implications for aggregate behavior are developed thereafter.
Example 4.4.1. The replicator dynamic. The fundamental example of an imitative dynamic is the replicator dynamic, defined by

(R) ẋ^p_i = x^p_i F̂^p_i(x).

Under the replicator dynamic, the percentage growth rate of each strategy i ∈ S^p currently in use equals its excess payoff F̂^p_i(x) = F^p_i(x) − F̄^p(x); unused strategies remain so. We provide a variety of derivations of the replicator dynamic below. §

4.4.2 Examples

The examples to follow are expressed in the setting of a single, unit-mass population, so that x̂_i = x_i. They are easily recast for multipopulation settings.
Example 4.4.2. Imitation via pairwise comparisons. Suppose that ρi j (π, x) = x j φ(π j − πi ),
where φ : R → R+ equals 0 on (−∞, 0] and is strictly increasing on [0, ∞). In this case,
an agent only imitates his randomly chosen opponent when the opponent’s payoﬀ is
higher than the agent’s own. Protocols of this form satisfy condition (4.6). If we write
ψ(d) = φ(d) − φ(−d), then we can express the corresponding mean dynamic as
ẋ_i = x_i Σ_{k∈S} x_k ( φ(F_i(x) − F_k(x)) − φ(F_k(x) − F_i(x)) ) = x_i Σ_{k∈S} x_k ψ(F_i(x) − F_k(x)).

Setting φ(d) = [d]_+ gives us the pairwise proportional imitation protocol from Example 3.2.1. In this case ψ(d) = d, and the mean dynamic is the replicator dynamic (R). §
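The conclusion of Example 4.4.2, that φ(d) = [d]_+ yields the replicator dynamic, is easy to confirm numerically. The sketch below (our own helper names, not from the text) builds the mean dynamic ẋ_i = x_i Σ_k x_k ψ(F_i − F_k) and compares it with the replicator field at an arbitrary interior state of a three-strategy game.

```python
def mean_dynamic(x, F, phi):
    """Mean dynamic of imitation via pairwise comparisons:
    xdot_i = x_i * sum_k x_k * (phi(F_i - F_k) - phi(F_k - F_i))."""
    n = len(x)
    return [x[i] * sum(x[k] * (phi(F[i] - F[k]) - phi(F[k] - F[i]))
                       for k in range(n))
            for i in range(n)]

def replicator(x, F):
    """Replicator dynamic: xdot_i = x_i * (F_i - average payoff)."""
    avg = sum(xi * Fi for xi, Fi in zip(x, F))
    return [xi * (Fi - avg) for xi, Fi in zip(x, F)]

pos = lambda d: max(d, 0.0)   # phi(d) = [d]_+, so psi(d) = d

# Any payoff vector and interior state will do for the check,
# since psi(d) = d makes the inner sum collapse to F_i - average.
x = [0.5, 0.3, 0.2]
F = [1.0, -0.4, 2.5]
v1 = mean_dynamic(x, F, pos)
v2 = replicator(x, F)
assert all(abs(a - b) < 1e-12 for a, b in zip(v1, v2))
```

Replacing `pos` with a function that is not odd around zero after the ψ transformation (for instance a different φ_{ij} for each pair, as in Exercise 4.4.3) breaks this identity.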
Exercise 4.4.3. Suppose we generalize Example 4.4.2 by letting ρi j (π, x) = x j φi j (π j − πi ),
where each function φi j equals 0 on (−∞, 0] and is strictly increasing on [0, ∞). Explain
why the resulting mean dynamic need not satisfy condition (4.6), and so need not be an
imitative dynamic. (For an interesting contrast, see Section 4.6.)
Example 4.4.4. Pure imitation driven by dissatisfaction. Suppose that ρi j (π, x) = a(πi ) x j . Then
when the clock of an i player rings, he abandons his current strategy with probability
proportional to the abandonment rate a(πi ); in such instances, he imitates a randomly chosen
opponent. In this case, condition (4.6) requires that a : R → R+ be strictly decreasing, and
the mean dynamic becomes
(4.7) ẋ_i = x_i Σ_{k∈S} x_k a(F_k(x)) − x_i a(F_i(x)) = x_i ( Σ_{k∈S} x_k a(F_k(x)) − a(F_i(x)) ).

If abandonment rates take the linear form a(π_i) = K − π_i (where K is large enough), then
(4.7) is again the replicator dynamic (R). §
Example 4.4.5. Imitation of success. Suppose ρi j (π, x) = x j c(π j ). Then when an agent’s clock
rings, he picks an opponent at random; if the opponent is playing strategy j, the player
imitates him with probability proportional to the copying rate c(π j ). In this case, condition
(4.6) requires that c : R → R+ be strictly increasing, and the mean dynamic becomes
(4.8) ẋ_i = x_i Σ_{k∈S} x_k ( c(F_i(x)) − c(F_k(x)) ) = x_i ( c(F_i(x)) − Σ_{k∈S} x_k c(F_k(x)) ).

Since ρ is a target protocol (i.e., since ρ_ij ≡ τ_j), the mean dynamic (4.8) is actually a target dynamic:

ẋ_i = Σ_{k∈S} x_k c(F_k(x)) · ( x_i c(F_i(x)) / Σ_{k∈S} x_k c(F_k(x)) − x_i )   if x_j c(F_j(x)) ≠ 0 for some j ∈ S,
ẋ_i = 0                                                                         otherwise.

If copying rates are of the linear form c(π_j) = π_j + K (for K large enough), then (4.8) is
once again the replicator dynamic (R). If in addition payoﬀs are nonnegative and average
payoﬀs are positive, we can choose c(π j ) = π j , so that (4.8) becomes
(4.9) ẋ_i = F̄(x) ( x_i F_i(x) / F̄(x) − x_i ).

Here, the target state is proportional to the vector of popularity-weighted payoffs x_i F_i(x), with the rate of motion toward this state governed by the average payoff F̄(x). §
Exercise 4.4.6. Why is the restriction on payoﬀs needed to obtain equation (4.9)?
Example 4.4.7. Imitation of success with repeated sampling. Suppose that
ρ_ij(π, x) = x_j w(π_j) / Σ_{k∈S} x_k w(π_k),

where Σ_{k∈S} x_k w(π_k) > 0. Here, when an agent’s clock rings he chooses an opponent at
random. If the opponent is playing strategy j, the agent imitates him with probability
proportional to the copying weight w(π j ). If the agent does not imitate this opponent, he
draws a new opponent at random and repeats the procedure. In this case, condition (4.6)
requires that w : ℝ → ℝ₊ be strictly increasing. Since ρ is an exact target protocol (i.e., since ρ_ij ≡ σ_j and Σ_{j∈S} σ_j ≡ 1), it induces the exact target dynamic

(4.10) ẋ_i = x_i w(F_i(x)) / Σ_{k∈S} x_k w(F_k(x)) − x_i. §

[Figure 4.4.1: Two imitative dynamics in 123 Coordination: (i) the replicator dynamic; (ii) the Maynard Smith replicator dynamic.]

We conclude with two important instances of repeated sampling.
Example 4.4.8. The Maynard Smith replicator dynamic. If payoﬀs are nonnegative and average payoﬀs are positive, we can let copying weights equal payoﬀs: w(π j ) = π j . The
resulting exact target dynamic,
(4.11) ẋ_i = x_i F_i(x) / F̄(x) − x_i,

is known as the Maynard Smith replicator dynamic.
Example 4.4.5 showed that under the same assumptions on payoﬀs, the replicator
dynamic takes the form (4.9). The Maynard Smith replicator dynamic (4.11) diﬀers from
(4.9) only in that the target state is approached at a unit rate rather than at a rate determined
by average payoﬀs; thus, motion under (4.9) is relatively fast when average payoﬀs are
relatively high. Comparing the protocol here to the one from Example 4.4.5 reveals the
source of the diﬀerence in speeds: under repeated sampling, the overall payoﬀ level has
little inﬂuence on the probability that a revising agent winds up switching strategies.
In the single population setting, the phase diagrams of (4.9) and (4.11) are identical, and
the dynamics only diﬀer in terms of the speed at which solution trajectories are traversed
(cf. Exercise 4.4.10). We illustrate this in Figure 4.4.1, which presents phase diagrams for the two dynamics in 123 Coordination.

[Figure 4.4.2: Two imitative dynamics in Matching Pennies: (i) the replicator dynamic; (ii) the Maynard Smith replicator dynamic.]
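In the single-population case, the shared phase diagram of (4.9) and (4.11) follows from the fact that the two vector fields are proportional, with factor equal to the average payoff F̄(x). A small numerical check (the helper names below are ours):

```python
def rep_success(x, F):
    """Dynamic (4.9): xdot_i = Fbar(x) * (x_i F_i(x) / Fbar(x) - x_i)."""
    Fbar = sum(xi * Fi for xi, Fi in zip(x, F))
    return [Fbar * (xi * Fi / Fbar - xi) for xi, Fi in zip(x, F)]

def maynard_smith(x, F):
    """Dynamic (4.11): xdot_i = x_i F_i(x) / Fbar(x) - x_i."""
    Fbar = sum(xi * Fi for xi, Fi in zip(x, F))
    return [xi * Fi / Fbar - xi for xi, Fi in zip(x, F)]

# With positive payoffs, (4.9) = Fbar(x) * (4.11): same direction of
# motion at every state, speed rescaled by the average payoff
# (cf. Exercise 4.4.10).
x = [0.2, 0.5, 0.3]
F = [3.0, 1.0, 2.0]
Fbar = sum(xi * Fi for xi, Fi in zip(x, F))
v9 = rep_success(x, F)
v11 = maynard_smith(x, F)
assert all(abs(a - Fbar * b) < 1e-12 for a, b in zip(v9, v11))
```

With multiple populations the rescaling factor F̄^p(x) differs across populations, so the two fields are no longer proportional, which is the source of the different phase diagrams discussed next.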
When there are multiple populations, the fact that average payoﬀs diﬀer across populations implies that the phase diagrams of (4.9) and (4.11) no longer coincide. This is
illustrated in Figure 4.4.2, which presents phase diagrams for this Matching Pennies game:

        h       t
  H    2, 1    1, 2
  T    1, 2    2, 1

While interior solutions of (4.9) form closed orbits around the unique Nash equilibrium x* = ((1/2, 1/2), (1/2, 1/2)), interior solutions of (4.11) converge to x*. §
Example 4.4.9. The i-logit dynamic. If the copying weights w(π_j) = exp(η⁻¹ π_j) are exponential functions of payoffs, the exact target dynamic (4.10) becomes the i-logit dynamic with noise level η > 0:

ẋ_i = x_i exp(η⁻¹ F_i(x)) / Σ_{k∈S} x_k exp(η⁻¹ F_k(x)) − x_i.

Here, the ith component of the target state is proportional both to the mass of agents
playing strategy i and to an exponential function of strategy i’s payoﬀ. If η is small, and x
is not too close to the boundary of X or of any best response region, then the target state
is close to e_{b(x)}, the vertex of X corresponding to the current best response. Therefore, in
most games, the i-logit dynamic with small η approximates the best response dynamic ẋ ∈ B(x) − x on much of int(X). We illustrate this in Figure 4.4.3, which presents four i-logit dynamics (with η = .5, .1, .05, and .01) and the best response dynamic for the anticoordination game

F(x) = Ax = ( −1 0 0 ; 0 −1 0 ; 0 0 −1 ) ( x_1, x_2, x_3 )′ = ( −x_1, −x_2, −x_3 )′. §
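The approximation of the best response dynamic at small η can be seen in the i-logit target state itself. The sketch below (our own names, not the text's) computes the target σ_i = x_i exp(η⁻¹F_i) / Σ_k x_k exp(η⁻¹F_k) in the Anticoordination game above and checks that as η shrinks, the target concentrates on the current best response, here the least-used strategy.

```python
import math

def ilogit_target(x, F, eta):
    """Target state of the i-logit dynamic:
    sigma_i = x_i exp(F_i/eta) / sum_k x_k exp(F_k/eta)."""
    # Subtract max(F) before exponentiating for numerical stability.
    m = max(F)
    w = [xi * math.exp((Fi - m) / eta) for xi, Fi in zip(x, F)]
    s = sum(w)
    return [wi / s for wi in w]

# Anticoordination: F(x) = -x, so the least-used strategy is the
# current best response.
x = [0.5, 0.3, 0.2]
F = [-xi for xi in x]
best = F.index(max(F))

# Mass on the best response grows monotonically as eta shrinks ...
masses = [ilogit_target(x, F, eta)[best] for eta in (0.5, 0.1, 0.01)]
assert masses[0] < masses[1] < masses[2]

# ... and for small eta the target is nearly the vertex e_{b(x)}.
assert masses[-1] > 0.99
```

Near the boundary of a best response region the exponents for two strategies nearly tie, which is why the approximation fails there, as noted in the text.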
Exercise 4.4.10. Changes of speed and reparameterizations of time. Let V : Rn → Rn be a
Lipschitz continuous vector ﬁeld and let k : Rn → (0, ∞) be a positive Lipschitz continuous
function. Let {x_t} be a solution to ẋ = V(x) with initial condition ξ, and let {y_t} be a solution to ẋ = k(x)V(x), also with initial condition ξ. Show that y_t = x_{I(t)}, where I(t) = ∫₀ᵗ k(y_s) ds.

4.4.3 Biological Derivations of the Replicator Dynamic

While we have derived the replicator dynamic from models of imitation, its origins lie
in mathematical biology, where it arises from models of intra and interspecies competition. The next two exercises, which are set in a single population, consider the replicator
dynamic from this point of view.
Exercise 4.4.11. In the basic game theoretic model of natural selection within a single animal
species, each strategy i ∈ S represents a behavioral type. The value of Fi (x) represents
the (reproductive) ﬁtness of type i when the current proportions of types are described by
x ∈ int(X). In particular, if we let y_i ∈ (0, ∞) represent the (absolute) number of animals of type i in the population, then the evolution of the population is described by

(4.12) ẏ_i = y_i F_i(x), where x_i = y_i / Σ_{j∈S} y_j.

Show that under equation (4.12), the vector x describing the proportions of animals of each type evolves according to the replicator equation (R).
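The conclusion of Exercise 4.4.11 can also be checked numerically: evolving the absolute counts y by (4.12) and then normalizing gives the same proportions as evolving x directly by (R). A sketch, using a made-up zero-sum payoff function and a crude Euler integrator (all names are ours):

```python
def F(x):
    """An arbitrary three-strategy payoff function for the check."""
    return [x[1] - x[2], x[2] - x[0], x[0] - x[1]]

def normalize(y):
    s = sum(y)
    return [yi / s for yi in y]

def euler(f, z, h, steps):
    for _ in range(steps):
        dz = f(z)
        z = [zi + h * dzi for zi, dzi in zip(z, dz)]
    return z

# (4.12): grow absolute counts, with payoffs evaluated at the proportions.
growth = lambda y: [yi * Fi for yi, Fi in zip(y, F(normalize(y)))]

# (R): replicator dynamic on the proportions directly.
def rep(x):
    avg = sum(xi * Fi for xi, Fi in zip(x, F(x)))
    return [xi * (Fi - avg) for xi, Fi in zip(x, F(x))]

y0 = [2.0, 5.0, 3.0]
x0 = normalize(y0)
x_from_counts = normalize(euler(growth, y0, 1e-4, 20000))
x_direct = euler(rep, x0, 1e-4, 20000)

# Both discretize the same continuous flow, so they agree up to the
# integration error of the Euler scheme.
assert all(abs(a - b) < 1e-3 for a, b in zip(x_from_counts, x_direct))
```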
Exercise 4.4.12. The Lotka-Volterra equation. The Lotka-Volterra equation is a fundamental
model of biological competition among members of multiple species. When there are
n − 1 species, the equation takes the form
(4.13) ẏ_k = y_k ( b_k + (My)_k ), k ∈ {1, …, n−1},

where b_k is the baseline growth rate for species k, and the interaction matrix M ∈ ℝ^{(n−1)×(n−1)} governs cross-species effects. Show that after the change of variable

x_i = y_i / ( 1 + Σ_{l=1}^{n−1} y_l ) for i ∈ {1, …, n−1} and x_n = 1 / ( 1 + Σ_{l=1}^{n−1} y_l ),

the (n−1)-dimensional Lotka-Volterra equation (4.13) is equivalent up to a change of speed (cf. Exercise 4.4.10) to the n-strategy replicator dynamic

ẋ_i = x_i ( (Ax)_i − x′Ax ), i ∈ {1, …, n},

where the payoff matrix A ∈ ℝ^{n×n} is related to M ∈ ℝ^{(n−1)×(n−1)} and b ∈ ℝ^{n−1} by the ℝ^{(n−1)×n} matrix equation

( M  b ) = ( I  −1 ) A.

[Figure 4.4.3: i-logit and best response dynamics in Anticoordination: (i) η = .5; (ii) η = .1; (iii) η = .05; (iv) η = .01; (v) the best response dynamic.]
If M and b are given, this equation determines A up to an additive constant in each column.
Thus, A can always be chosen so that either the elements of its last row or the elements of
its diagonal are all 0.

4.4.4 Extinction and Invariance

We now derive properties shared by all imitative dynamics. First of all, it follows
immediately from equation (4.5) that all imitative dynamics satisfy extinction: if a strategy
is unused, its growth rate is zero.
(4.14) If x^p_i = 0, then V^p_i(x) = 0.

Extinction implies that the growth rate vectors V(x) are always tangent to the boundaries of
X: formally, V (x) is not only in TX(x), but also in −TX(x) (cf Proposition 3.3.3). Thus, since
imitative dynamics are Lipschitz continuous, it follows from Theorem 3.A.8 in Chapter 3
that solutions to imitative dynamics exist for all positive and negative times.
Proposition 4.4.13 (Forward and backward invariance). Let ẋ = V_F(x) be an imitative dynamic. Then for each initial condition ξ ∈ X, this dynamic admits a unique solution trajectory in T_(−∞,∞) = {x : (−∞, ∞) → X : x is continuous}.
Extinction also implies a second invariance property: if {xt } is a solution trajectory of
an imitative dynamic, then the support of xt is independent of t. Uniqueness of solution 147 trajectories, which is implied by the Lipschitz continuity of the dynamic, is an essential
ingredient of the proof of this result.
Theorem 4.4.14 (Support invariance). If {xt } is a solution trajectory of an imitative dynamic,
p
then the sign of component (xt )i is independent of t ∈ (−∞, ∞).
˙
Proof. Let {xt } be a solution to the imitative dynamic x = V (x), and suppose that x0 = ξ.
p
p
Suppose that ξi = 0; we want to show that (xt )i = 0 for all t ∈ (−∞, ∞). To accomplish
ˆ
this, we deﬁne a new vector ﬁeld V : X → Rn as follows: 0 if j = i and q = p,
ˆ q (x) = Vj
q V (x) otherwise. j p ˆ
ˆ
˙
ˆ
ˆ
If {xt } ⊂ X is the unique solution to x = V (x) with x0 = ξ, then (xt )i = 0 for all t. But V
p
ˆ
ˆ
and V are identical whenever xi = 0 by extinction (4.14); therefore, {xt } is also a solution
˙
˙
ˆ
to x = V (x). Since solutions to x = V (x) are unique, it must be that {xt } = {xt }, and hence
p
that (xt )i = 0 for all t.
p
p
Now suppose that ξi > 0. If xt = χ satisﬁed χi = 0, then the preceding analysis would
˙
imply that there are two distinct solutions to x = V (x) with xt = χ, one that is contained in
the boundary of X and one that is not. As this would contradict uniqueness of solutions,
p
we conclude (xt )i > 0 at all times t.
All of the phase diagrams presented in this section illustrate the face invariance property. The next example points out one of its more subtle consequences.
Example 4.4.15. Figure 4.4.4 presents the phase diagram of the replicator dynamic for a
game with a strictly dominant strategy: for all x ∈ X, F1 (x) = 1 and F2 (x) = F3 (x) = 0.
There are two connected components of rest points: one consisting solely of the unique
Nash equilibrium e1 , and the other containing those states at which strategy 1 is unused.
Clearly, the latter component is unstable, as all nearby solution trajectories lead away from
it and toward the Nash equilibrium. But as the coloring of the ﬁgure indicates, the speed
of motion away from the unstable component is very slow: if a small behavior disturbance
pushes the state oﬀ of the component, it may take a long time before the stable equilibrium
is reached. §

[Figure 4.4.4: The replicator dynamic in a game with a strictly dominant strategy.]

4.4.5 Monotone Percentage Growth Rates and Positive Correlation

We now turn to monotonicity properties of imitative dynamics. All dynamics of form (4.5) can be expressed as

(4.15) ẋ^p_i = V^p_i(x) = x^p_i G^p_i(x), where G^p_i(x) = Σ_{k∈S^p} x̂^p_k ( r^p_{ki}(F^p(x), x^p) − r^p_{ik}(F^p(x), x^p) ).

If strategy i ∈ S^p is in use, then G^p_i(x) = V^p_i(x)/x^p_i represents the percentage growth rate of the
number of agents using this strategy.
Observation 4.4.16 notes that under every imitative dynamic (as deﬁned in Section
4.4.1), strategies’ percentage growth rates are ordered by their payoﬀs.
Observation 4.4.16. All imitative dynamics exhibit monotone percentage growth rates:
(4.16) G^p_i(x) ≥ G^p_j(x) if and only if F^p_i(x) ≥ F^p_j(x).

This observation is immediate from condition (4.6), which defines imitative dynamics.
Condition (4.16) is a strong restriction on strategies’ percentage growth rates. We now
show that it implies our basic payoﬀ monotonicity condition, which imposes a weak
restriction on strategies’ absolute growth rates.
149 Theorem 4.4.17. All imitative dynamics satisfy positive correlation (PC).
Proof. Let x be a social state at which V p (x)
To do so, we deﬁne
p p 0; we need to show that V p (x) Fp (x) > 0. p p S+ (x) = {i ∈ Sp : Vi (x) > 0} and S− (x) = { j ∈ Sp : V j (x) < 0}
to be the sets of population p strategies with positive and negative absolute growth rates,
respectively. By extinction (4.14), these sets are contained in the support of xp . It follows
that
p p p
S+ (x) = {i ∈ S :
p p
xi > 0 and Vi (x)
p xi > 0} and p
S− (x) = {j ∈ S :
p p
xj > 0 and V j (x)
p xj < 0}. Since V (x) ∈ TX, we know from Proposition 3.3.3 that
p p Vk (x) = − Vk (x),
p p k∈S+ (x) k∈S− (x) and since V p (x) 0, these expressions are positive. Therefore, condition (4.16) enables us
to conclude that
p p p Vk (x) Fk (x) + V p (x) Fp (x) = p Vk (x) Fk (x)
p p k∈S+ (x) k∈S− (x)
p p i∈S+ (x) p j∈S− (x) p k∈S+ (x)
p p j∈S− (x) Vk (x)
p k∈S− (x)
p = mpin Fi (x) − max F j (x)
p
i∈S+ (x) p Vk (x) + max F j (x)
p ≥ min Fi (x)
p Vk (x) > 0.
p k∈S+ (x) We conclude this section by considering two other monotonicity conditions that appear
in the literature.
Exercise 4.4.18. In the single population setting, an imitative dynamic (4.15) has aggregate
monotone percentage growth rates if
(4.17) ˆ
(4.17) y′G(x) ≥ ŷ′G(x) if and only if y′F(x) ≥ ŷ′F(x)

for all population states x ∈ X and mixed strategies y, ŷ ∈ ∆.
replicator dynamic up to a reparameterization of time (see Exercise 4.4.10). (Hint:
150 Use Proposition 2.B.6 to show that condition (4.17) implies that ΦG(x) = c(x) ΦF(x)
for some c(x) > 0. Then use the fact that G(x) x = 0 (why?) to conclude that
ˆ
˙
xi = k(x) xi Fi (x).)
(ii) If a multipopulation imitative dynamic satisﬁes the natural analogue of condition
(4.17), what can we conclude about the dynamic?
Exercise 4.4.19. An dynamic of form (4.15) has signpreserving percentage growth rates if
(4.18) p
ˆp
sgn(Gi (x)) = sgn(Fi (x)). Show that any such dynamic satisﬁes positive correlation (PC). (Note that dynamics
satisfying condition (4.18) need not satisfy condition (4.6), and so need not be imitative
dynamics as we have deﬁned them here. In fact, there does not appear to be an intuitive
restriction on revision protocols that leads to condition (4.18).) 4.4.6 Rest Points and Restricted Equilibria Since all imitative dynamics satisfy positive correlation (PC), Proposition 4.2.4 tells us
that their rest points include all Nash equilibria of the underlying game F. On the other
hand, face invariance tells us that nonNash rest points can exist—for instance, while pure
states in X are not always Nash equilibria of F, they are necessarily rest points of VF .
To characterize the set of rest points, we ﬁrst recall the deﬁnition of Nash equilibrium:
p p p NE(F) = {x ∈ X : xi > 0 ⇒ Fi (x) = max F j (x)}.
p
j∈S Bearing this deﬁnition in mind, we deﬁne the set of restricted equilibria of F by
p p p RE(F) = {x ∈ X : xi > 0 ⇒ Fi (x) = max F j (x)}.
p
j∈S :x j >0 In words, x is a restricted equilibrium of F if it is a Nash equilibrium of a restricted version
of F in which only strategies in the support of x can be played.
Exercise 4.4.20. Alternate deﬁnitions of restricted equilibrium.
(i) Show that x ∈ RE(F) if
and only if within each population p, all strategies in the support of xp achieve the
p
p
same payoﬀ: RE(F) = {x ∈ X : xi > 0 ⇒ Fi (x) = πp }.
(ii) We can also oﬀer a geometric deﬁnition of restricted equilibrium. Let X[x] be the
ˆ
ˆ
set of social states whose supports are contained in the support of x : X[x] = {x ∈
ˆ
p
p
ˆ
X : xi = 0 ⇒ xi = 0}. Show that x ∈ RE(F) if and only if the payoﬀ vector F(x) is
contained in the normal cone of X[x] at x : RE(F) = {x ∈ X : F(x) ∈ NX[x] (x)}.
151 Because imitative dynamics exhibit face invariance, strategies that are initially unused
are never subsequently chosen. This suggests a link between rest points of imitative
dynamics and the restricted equilibria of the underlying game that is established in the
following theorem.
˙
Theorem 4.4.21. If x = VF (x) is an imitative dynamic, then RP(VF ) = RE(F).
p Proof. x ∈ RP(V ) ⇔ Vi (x) = 0 for all i ∈ Sp , p ∈ P
p ⇔
⇔ Vi (x)
p
xi
p
Fi (x) p (by (4.14)) p (by (4.16)) = 0 whenever xi > 0, p ∈ P = πp whenever xi > 0, p ∈ P ⇔ x ∈ RE(F).
While there are rest points of imitative dynamics that are not Nash equilibria, we will
see that nonNash rest points are locally unstable—see Chapter 7. On the other hand, as
Example 4.4.15 illustrates, the speed of motion away from these unstable rest points is
initially rather slow.
Exercise 4.4.22.
(i) Suppose that the payoﬀs of one population game are the negation
of the payoﬀs of another. What is the relationship between the replicator dynamics
of the two games?
(ii) Give an example of a threestrategy game whose Nash equilibrium is unique and
whose replicator dynamic admits seven rest points.

4.5 Excess Payoff Dynamics

In the next two subsections we consider revision protocols that are not based on imitation of successful opponents, but rather on the direct evaluation of alternative strategies.
Under such protocols, good unused strategies will be discovered and chosen, raising the
possibility that the dynamics satisfy Nash stationarity (NS).

4.5.1 Definition and Interpretation

In some settings, particularly those in which information about population aggregates
is provided by a central planner, agents may know their population’s current average
payoff. (Of course, this violates scarcity of data (SD).) Suppose that each agent’s
choices are based on comparisons between the various strategies’ current payoﬀs and the
152 population’s average payoﬀ, and that these choices do not condition on the agent’s current
strategy. Then the agents’ choice procedure can be described using a target protocol of
the form
ρ^p_{ij}(π^p, x^p) = τ^p_j(π̂^p),

where π̂^p_i = π^p_i − (1/m^p) (x^p)′π^p represents the excess payoff to strategy i ∈ S^p. Such a protocol
generates the target dynamic

(4.19) ẋ^p_i = m^p τ^p_i(F̂^p(x)) − x^p_i Σ_{j∈S^p} τ^p_j(F̂^p(x))
  = Σ_{j∈S^p} τ^p_j(F̂^p(x)) · ( m^p τ^p_i(F̂^p(x)) / Σ_{j∈S^p} τ^p_j(F̂^p(x)) − x^p_i )   if τ^p(F̂^p(x)) ≠ 0,
  = 0                                                                                      otherwise.
ˆ
protocol τ. To do so, let us ﬁrst observe that the excess payoﬀ vector Fp (x) cannot lie in the
p
interior of the negative orthant Rn : for this to happen, every strategy would have to earn
−
a below average payoﬀ. Bearing this in mind, we can let the domain of the function τp be
p
p
p
p
p
p
the set Rn = Rn − int(Rn ). Note that int(Rn ) = Rn − Rn is the set of excess payoﬀ vectors
−
−
∗
∗
p
p
under which at least one strategy earns an above average payoﬀ, while bd(Rn ) = bd(Rn )
−
∗
is the set of excess payoﬀ vectors under which no strategy earns an above average payoﬀ.
With this notation in hand, we can deﬁne our family of dynamics.
p p Deﬁnition. Suppose the protocols τp : Rn → Rn are Lipschitz continuous and satisfy acuteness:
+
∗
(4.20) p ˆ
ˆˆ
If πp ∈ int(Rn ), then τp (πp ) πp > 0.
∗ Then the map from population games F ∈ F to diﬀerential equations (4.19) is called an excess
payoﬀ dynamic.
ˆ
How should one interpret condition (4.20)? If the excess payoﬀ vector πp has a positive
component, this condition implies that
ˆ
σp (πp ) =
i∈S 1
ˆ
τp (πp ) ∈ ∆p ,
p
ˆ
τi (πp ) the probability vector that deﬁnes the target state, is well deﬁned. Acuteness requires
153 ˆ
that if we pick a component of the excess payoﬀ vector πp at random according to this
probability vector, then the expected value of this randomly chosen component is strictly
positive. Put diﬀerently, acuteness asks that on average, revising agents switch to strategies
with above average payoﬀs.
Example 4.5.1. The BNN dynamic. Suppose that conditional switch rate to strategy i ∈ Sp
p
p
ˆ
ˆ
is given by the positive part of strategy i’s excess payoﬀs: τi (πp ) = [πi ]+ . The resulting
mean dynamic,
(BNN) p
p
ˆp
˙
xi = mp [Fi (x)]+ − xi p ˆ
[F j (x)]+ ,
j∈Sp is called the Brownvon NeumannNash (BNN) dynamic. §
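In a single unit-mass population the BNN dynamic reads ẋ_i = [F̂_i(x)]_+ − x_i Σ_j [F̂_j(x)]_+. The sketch below (helper names are ours) implements it for standard RPS and checks the two incentive properties at sample states: the Nash equilibrium is a rest point (NS), and elsewhere motion is positively correlated with payoffs (PC).

```python
def excess(F, x):
    """Excess payoffs: Fhat_i = F_i - average payoff."""
    avg = sum(xi * Fi for xi, Fi in zip(x, F))
    return [Fi - avg for Fi in F]

def bnn(x, F):
    """BNN dynamic: xdot_i = [Fhat_i]_+ - x_i * sum_j [Fhat_j]_+."""
    Fh = excess(F, x)
    total = sum(max(f, 0.0) for f in Fh)
    return [max(fi, 0.0) - xi * total for fi, xi in zip(Fh, x)]

def rps(x):
    r, p, s = x
    return [s - p, r - s, p - r]

# Nash stationarity: the unique Nash equilibrium is a rest point.
xstar = [1 / 3, 1 / 3, 1 / 3]
assert all(abs(v) < 1e-12 for v in bnn(xstar, rps(xstar)))

# Positive correlation: away from equilibrium, V(x).F(x) > 0,
# and the motion stays in the tangent space of the simplex.
x = [0.6, 0.3, 0.1]
v = bnn(x, rps(x))
assert sum(vi * Fi for vi, Fi in zip(v, rps(x))) > 0
assert abs(sum(v)) < 1e-12
```

Note that `excess` requires the average payoff, which is exactly the informational demand that the pairwise comparison protocols of Section 4.6 avoid.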
Exercise 4.5.2. k-BNN dynamics. The k-BNN dynamic is generated by the revision protocol τ^p_i(π̂^p) = [π̂^p_i]^k_+, where k ≥ 1. Argue informally that if k is large, then at “typical” states, the direction of motion under the k-BNN dynamic is close to that under the best response dynamic, ẋ^p ∈ m^p B^p(x) − x^p (see Chapter 5), but that the speed of motion is not.

4.5.2 Incentives and Aggregate Behavior

Our goal in this section is to show that every excess payoff dynamic satisfies our two
incentive properties.
Theorem 4.5.3. Every excess payoff dynamic ẋ = V_F(x) satisfies Nash stationarity (NS) and
positive correlation (PC).
We prove this result under the assumption that τp satisﬁes sign preservation:
(4.21) sgn(τ^p_i(π̂^p)) = sgn([π̂^p_i]_+).

A proof using only acuteness is outlined in Exercise 4.5.8 below. We also focus on the
single population case; the proof of the multipopulation case is a simple extension of the
argument below.
The proof follows immediately from the following three lemmas.
Lemma 4.5.4. F̂(x) ∈ bd(ℝⁿ_*) if and only if x ∈ NE(F).

Proof. F̂(x) ∈ bd(ℝⁿ_*) ⇔ F_i(x) ≤ Σ_{k∈S} x_k F_k(x) for all i ∈ S
  ⇔ there exists a c ∈ ℝ such that F_i(x) ≤ c for all i ∈ S, with F_j(x) = c whenever x_j > 0
  ⇔ F_j(x) = max_{k∈S} F_k(x) whenever x_j > 0
  ⇔ x ∈ NE(F).
Lemma 4.5.5. If F̂(x) ∈ bd(ℝⁿ_*), then V_F(x) = 0.

Proof. Immediate from sign preservation (4.21).
Lemma 4.5.6. If F̂(x) ∈ int(ℝⁿ_*), then V_F(x)′F(x) > 0.

Proof. Recall that F̂(x) = F(x) − 1 F̄(x) and that V_F(x) = τ(F̂(x)) − ( 1′τ(F̂(x)) ) x. The first definition implies that x and F̂(x) are always orthogonal:

x′F̂(x) = x′( F(x) − 1 F̄(x) ) = x′F(x) − F̄(x) = 0.

Combining this with the second definition, we see that if F̂(x) ∈ int(ℝⁿ_*), then

V_F(x)′F(x) = V_F(x)′( F̂(x) + 1 F̄(x) )
  = V_F(x)′F̂(x)                        since V_F(x) ∈ TX
  = ( τ(F̂(x)) − ( 1′τ(F̂(x)) ) x )′F̂(x)
  = τ(F̂(x))′F̂(x)                      since x′F̂(x) = 0
  > 0                                   by acuteness (4.20).
p p p τi (πp ) ≡ τi (πi ).
Show that τp also satisﬁes sign preservation (4.21).
Exercise 4.5.8. This exercise shows how to establish properties (NS) and (PC) using only
continuity and acuteness (4.20)—that is, without requiring sign preservation (4.21). The
proofs of Lemmas 4.5.4 and 4.5.6 go through unchanged, but Lemma 4.5.5 requires additional work. Using acuteness and continuity, show that
ˆ
(i) If π̂ ∈ bd(ℝⁿ_*) and π̂_i < 0, then τ_i(π̂) = 0. (Hint: Consider π̂^ε = π̂ + ε e_j, where π̂_j = 0.)
(ii) If π̂ ∈ bd(ℝⁿ_*) and π̂_i = π̂_j = 0, then τ_i(π̂) = 0. (Hint: To show that τ_i(π̂) = 0, consider π̂^ε = π̂ − ε e_i + ε² e_j.)
155 Exercise 4.5.9. This exercise demonstrates that in general, one cannot “normalize” a target
dynamic in order to create an exact target dynamic. This highlights a nontrivial sense in
which the former class of dynamics is more general than the latter.
Recall that in the single population setting, the BNN dynamic is deﬁned by the target
ˆ
ˆ
protocol τi (π) = [πi ]+ .
(i) It is tempting to try to deﬁne an exact target protocol by normalizing τ in an
appropriate way. Explain why such a protocol would not be welldeﬁned.
(ii) To attempt to circumvent this problem, one can construct a dynamic that is derived
from the normalized protocol whenever the latter is welldeﬁned. Show that such
a dynamic must be discontinuous in some games. (Hint: It is enough to consider
two-strategy games.)

4.6 Pairwise Comparison Dynamics

Excess payoff dynamics satisfy Nash stationarity (NS), positive correlation (PC), and
continuity (C), but they fail scarcity of data (SD). The revision protocols that underlie these
dynamics require agents to compare their current payoﬀ with the average payoﬀ obtained
in their population. Without the assistance of a central planner, the latter information is
unlikely to be known to the agents.
A natural way to reduce these informational demands is to replace the population’s
average payoﬀ with another reference payoﬀ, one whose value agents can directly access.
We accomplish this by considering revision protocols based on pairwise payoﬀ comparisons, which satisfy scarcity of data (SD). In the remainder of this section, we show that
the resulting evolutionary dynamics can be made to satisfy our other desiderata as well.

4.6.1 Definition

Suppose that the revision protocol ρ^p only directly conditions on payoffs, not the
population state. The induced mean dynamic is then of the form
(4.22) ẋ^p_i = Σ_{j∈S^p} x^p_j ρ^p_{ji}(F^p(x)) − x^p_i Σ_{j∈S^p} ρ^p_{ij}(F^p(x)).

This equation and a mild monotonicity condition on ρ define our next class of dynamics.
Deﬁnition. Suppose that the revision protocol ρ is Lipschitz continuous and sign preserving:
(4.23) sgn(ρ^p_{ij}(π^p)) = sgn([π^p_j − π^p_i]_+) for all i, j ∈ S^p and p ∈ P.
156 Then the map from population games F ∈ F to diﬀerential equations (4.22) is called a pairwise
comparison dynamic.
Sign preservation (4.23) is a particularly natural property: it says that the conditional
switch rate from i ∈ Sp to j ∈ Sp is positive if and only if the payoﬀ to j exceeds the payoﬀ
to i.
Example 4.6.1. The Smith dynamic. The simplest sign-preserving revision protocol,

ρ^p_{ij}(π^p) = [π^p_j − π^p_i]_+,

generates the Smith dynamic:

(S) ẋ^p_i = Σ_{j∈S^p} x^p_j [F^p_i(x) − F^p_j(x)]_+ − x^p_i Σ_{j∈S^p} [F^p_j(x) − F^p_i(x)]_+. §
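In a single unit-mass population the Smith dynamic is straightforward to implement, and unlike the BNN protocol it needs no average-payoff information. The sketch below (our own helper names) checks Nash stationarity and positive correlation at sample states of standard RPS, and also that a pure state is not a rest point, in contrast with imitative dynamics.

```python
def smith(x, F):
    """Smith dynamic:
    xdot_i = sum_j x_j [F_i - F_j]_+ - x_i sum_j [F_j - F_i]_+."""
    n = len(x)
    return [sum(x[j] * max(F[i] - F[j], 0.0) for j in range(n))
            - x[i] * sum(max(F[j] - F[i], 0.0) for j in range(n))
            for i in range(n)]

def rps(x):
    r, p, s = x
    return [s - p, r - s, p - r]

# Nash stationarity: the unique Nash equilibrium is a rest point.
xstar = [1 / 3, 1 / 3, 1 / 3]
assert all(abs(v) < 1e-12 for v in smith(xstar, rps(xstar)))

# A pure state is NOT a rest point (contrast with imitative dynamics,
# whose face invariance makes every pure state stationary).
e1 = [1.0, 0.0, 0.0]
assert any(abs(v) > 1e-12 for v in smith(e1, rps(e1)))

# Positive correlation away from equilibrium.
x = [0.6, 0.3, 0.1]
v = smith(x, rps(x))
assert sum(vi * Fi for vi, Fi in zip(v, rps(x))) > 0
```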
Exercise 4.6.2. The k-Smith dynamic. Consider instead the protocol ρ^p_{ij}(π^p) = [π^p_j − π^p_i]^k_+,
where k ≥ 1. Argue informally that in the single population case, when k is large, the
direction of motion from most states x is approximately parallel to an edge of the simplex.
How is this edge determined from the payoff vector F(x)?

4.6.2 Incentives and Aggregate Behavior

Our main result in this section is
Theorem 4.6.3. Every pairwise comparison dynamic satisﬁes Nash stationarity (NS) and positive
correlation (PC).
The proof of this theorem relies on three equivalences between properties of Nash
equilibria and evolutionary dynamics on the one hand, and requirements that sums of terms of the form ρ^p_{ij}, [F^p_j − F^p_i]_+, or ρ^p_{ij}[F^p_j − F^p_i]_+ equal zero on the other. Sign preservation
ensures that sums of the three types are identical, allowing us to establish the result.
In what follows, ẋ = V(x) is the pairwise comparison dynamic generated by the
population game F and revision protocol ρ.
Lemma 4.6.4. x ∈ NE(F) ⇔ For all i ∈ S^p and p ∈ P, x^p_i = 0 or Σ_{j∈S^p} [F^p_j(x) − F^p_i(x)]_+ = 0.

Proof. Both statements say that each strategy in use at x is optimal.

Lemma 4.6.5. V^p(x) = 0 ⇔ For all i ∈ S^p, x^p_i = 0 or Σ_{j∈S^p} ρ^p_{ij}(F^p(x)) = 0.

Proof. (⇐) Immediate.
(⇒) Fix a population p ∈ P , and suppose that V p (x) = 0. If j is an optimal strategy for
p
population p at x, then sign preservation implies that ρ jk (Fp (x)) = 0 for all k ∈ Sp , and so
that there is no “outﬂow” from strategy j:
x^p_j Σ_{i∈S^p} ρ^p_{ji}(F^p(x)) = 0.
p Since V j (x) = 0, there can be no “inﬂow” into strategy j either:
Σ_{i∈S^p} x^p_i ρ^p_{ij}(F^p(x)) = 0.

We can express this condition equivalently as

For all i ∈ S^p, either x^p_i = 0 or ρ^p_{ij}(F^p(x)) = 0.
If all strategies in S^p earn the same payoff at state x, the proof is complete. Otherwise,
let i be a “second best” strategy, that is, a strategy whose payoff F^p_i(x) is second highest
among the payoffs available from strategies in S^p at x. The last observation in the previous
paragraph and sign preservation tell us that there is no outflow from i. But since V^p_i(x) = 0,
there is also no inflow into i:

	For all k ∈ S^p, either x^p_k = 0 or ρ^p_ki(F^p(x)) = 0.

Iterating this argument for strategies with lower payoffs establishes the result.
Lemma 4.6.6. Fix a population p ∈ P. Then
(i) V^p(x)′F^p(x) ≥ 0.
(ii) V^p(x)′F^p(x) = 0 ⇔ For all i ∈ S^p, x^p_i = 0 or Σ_{j∈S^p} ρ^p_ij(F^p(x)) [F^p_j(x) − F^p_i(x)]_+ = 0.

Proof. We compute the inner product as follows:

	V^p(x)′F^p(x) = Σ_{j∈S^p} ( Σ_{i∈S^p} x^p_i ρ^p_ij(F^p(x)) − x^p_j Σ_{i∈S^p} ρ^p_ji(F^p(x)) ) F^p_j(x)
	             = Σ_{j∈S^p} Σ_{i∈S^p} ( x^p_i ρ^p_ij(F^p(x)) F^p_j(x) − x^p_j ρ^p_ji(F^p(x)) F^p_j(x) )
	             = Σ_{j∈S^p} Σ_{i∈S^p} x^p_i ρ^p_ij(F^p(x)) ( F^p_j(x) − F^p_i(x) )
	             = Σ_{i∈S^p} x^p_i Σ_{j∈S^p} ρ^p_ij(F^p(x)) [F^p_j(x) − F^p_i(x)]_+,

where the last equality follows from sign preservation. Both claims directly follow.
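The chain of equalities above can be spot-checked numerically. The sketch below (all names are ours) instantiates one sign-preserving protocol, the Smith protocol ρ_ij = [F_j − F_i]_+, and verifies that V(x)′F(x) equals the final sum in the computation, hence is nonnegative.

```python
# Numerical check of Lemma 4.6.6's final identity for the Smith protocol
# rho_ij = [F_j - F_i]_+ (a sign-preserving pairwise comparison protocol):
# V(x)'F(x) equals sum_i x_i sum_j rho_ij [F_j - F_i]_+, and so is >= 0.

def pos(u):
    return max(u, 0.0)

def check_identity(x, F):
    n = len(x)
    rho = [[pos(F[j] - F[i]) for j in range(n)] for i in range(n)]
    # mean dynamic: inflow minus outflow
    V = [sum(x[j] * rho[j][i] for j in range(n))
         - x[i] * sum(rho[i][j] for j in range(n)) for i in range(n)]
    lhs = sum(V[i] * F[i] for i in range(n))
    rhs = sum(x[i] * sum(rho[i][j] * pos(F[j] - F[i]) for j in range(n))
              for i in range(n))
    return lhs, rhs

lhs, rhs = check_identity([0.2, 0.5, 0.3], [3.0, -1.0, 2.0])
print(abs(lhs - rhs) < 1e-12 and lhs >= 0)  # True
```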
Theorem 4.6.3 follows immediately from these three lemmas and sign preservation
(4.23).

4.6.3 Desiderata Revisited

Pairwise comparison dynamics satisfy all four of the desiderata proposed at the beginning of the chapter: continuity (C), scarcity of data (SD), Nash stationarity (NS), and
positive correlation (PC). To provide some insight into this result, we compare revision
protocols that generate the three key dynamics from this chapter:

	replicator:	ρ^p_ij(π^p, x^p) = x^p_j [π^p_j − π^p_i]_+;
	BNN:	ρ^p_ij(π^p, x^p) = [π^p_j − π̄^p]_+, where π̄^p is population p's average payoff;
	Smith:	ρ^p_ij(π^p, x^p) = [π^p_j − π^p_i]_+.

From the point of view of our desiderata, the protocol that generates the Smith dynamic
combines the best features of the other two. Like the protocol for the BNN dynamic, the
Smith protocol is based on direct evaluations of payoﬀs rather than imitation, allowing it
to satisfy Nash stationarity (NS). Like the protocol for the replicator dynamic, the Smith
protocol is based on comparisons of individual strategies’ payoﬀs rather than comparisons
involving aggregate statistics, and so satisﬁes scarcity of data (SD). Thus, while the BNN
and replicator dynamics each satisfy three of our desiderata, the Smith dynamic satisﬁes
all four.

4.7 Multiple Revision Protocols and Combined Dynamics

The results above might seem to suggest that dynamics satisfying all four desiderata
are rather special, in that they must be derived from a very speciﬁc sort of revision protocol.
We now argue to the contrary that these desiderata are satisﬁed rather broadly.
To make this point, let us consider what happens if an agent uses multiple revision
protocols at possibly diﬀerent intensities. If an agent uses the revision protocol ρV at
intensity a and the revision protocol ρW at intensity b, then his behavior is described by
the new revision protocol ρC = aρV + bρW . Moreover, since mean dynamics are linear
in conditional switch rates, the mean dynamic for the combined protocol is a linear
combination of the two original mean dynamics: CF = aVF + bWF .
Theorem 4.7.1 links the properties of the original and combined dynamics.
Theorem 4.7.1. Suppose that the dynamic VF satisﬁes (PC), that the dynamic WF satisﬁes (NS)
and (PC), and that a, b > 0. Then the combined dynamic CF = aVF + bWF also satisﬁes (NS) and
(PC).
Proof. To show that CF satisfies (PC), suppose that C^p_F(x) ≠ 0. Then either V^p_F(x), W^p_F(x),
or both are not 0. Since VF and WF satisfy (PC), it follows that V^p_F(x)′F^p(x) ≥ 0, that
W^p_F(x)′F^p(x) ≥ 0, and that at least one of these inequalities is strict. Consequently,
C^p_F(x)′F^p(x) > 0, and so CF satisfies (PC).
Our proof that CF satisﬁes (NS) is divided into three cases. First, if x is a Nash
equilibrium of F, then it is a rest point of both VF and WF , and hence a rest point of CF as
well. Second, if x is a non-Nash rest point of VF, then it is not a rest point of WF. Since
VF(x) = 0 and WF(x) ≠ 0, it follows that CF(x) = bWF(x) ≠ 0, so x is not a rest point of
CF. Finally, suppose that x is not a rest point of VF. Then by Proposition 4.2.4, x is not
a Nash equilibrium, and so x is not a rest point of WF either. Since VF and WF satisfy
condition (PC), we know that VF(x)′F(x) = Σ_{p∈P} V^p_F(x)′F^p(x) > 0 and that WF(x)′F(x) > 0.
Consequently, CF(x)′F(x) > 0, implying that x is not a rest point of CF. Thus, CF satisfies
(NS).
A key implication of Theorem 4.7.1 is that imitation and Nash stationarity are not
incompatible. If agents usually rely on imitative protocols but occasionally follow protocols that directly evaluate strategies’ payoﬀs, then the rest points of the resulting mean
dynamics are precisely the Nash equilibria of the underlying game. Indeed, if we combine
an imitative dynamic VF with any small amount of a pairwise comparison dynamic WF ,
we obtain a combined dynamic CF that satisﬁes all four of our desiderata.
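Since mean dynamics are linear in conditional switch rates, combining protocols combines the dynamics. A minimal sketch (the function names and the 9/10 and 1/10 weights are our illustrative assumptions) confirms that the combined protocol's mean dynamic is the corresponding linear combination:

```python
# Sketch: because the mean dynamic is linear in the revision protocol,
# combining protocols rho_C = a*rho_V + b*rho_W yields C_F = a*V_F + b*W_F.
# Here V is the (imitative) replicator protocol and W the Smith protocol.

def pos(u):
    return max(u, 0.0)

def mean_dynamic(x, F, rho):
    n = len(x)
    return [sum(x[j] * rho(F, j, i, x) for j in range(n))
            - x[i] * sum(rho(F, i, j, x) for j in range(n)) for i in range(n)]

replicator = lambda F, i, j, x: x[j] * pos(F[j] - F[i])   # imitative
smith      = lambda F, i, j, x: pos(F[j] - F[i])          # direct evaluation

a, b = 0.9, 0.1
combined = lambda F, i, j, x: a * replicator(F, i, j, x) + b * smith(F, i, j, x)

x, F = [0.5, 0.3, 0.2], [1.0, 2.0, 0.0]
V = mean_dynamic(x, F, replicator)
W = mean_dynamic(x, F, smith)
C = mean_dynamic(x, F, combined)
print(all(abs(C[i] - (a * V[i] + b * W[i])) < 1e-12 for i in range(3)))  # True
```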
Example 4.7.2. Figure 4.7.1 presents a phase diagram for the (9/10 replicator + 1/10 Smith)
dynamic in standard Rock-Paper-Scissors. Comparing this diagram to those for the replicator and Smith dynamics alone (Figure 4.3.1), we see that the diagram for the combined
dynamic more closely resembles the Smith phase diagram than the replicator phase diagram, and in more than one respect: the combined dynamic has exactly one rest point,
the unique Nash equilibrium x∗ = (1/3, 1/3, 1/3), and all solutions to the combined dynamic
converge to this state. We will revisit this fragility of imitative dynamics in Chapter 8,
where it will appear in a much starker form. §

Figure 4.7.1: The (9/10 replicator + 1/10 Smith) dynamic in standard RPS.

4.N Notes

Section 4.2: This section follows Sandholm (2006a).
A wide variety of payoﬀ monotonicity conditions have been considered in the literature; for examples, see Nachbar (1990), Friedman (1991), Samuelson and Zhang (1992),
Swinkels (1993), Ritzberger and Weibull (1995), Hofbauer and Weibull (1996), and Sandholm (2001). Positive correlation is essentially the weakest condition that has been proposed. Most existing conditions are strictly stronger (see the notes to Section 4.4 below).
Friedman’s (1991) weak compatibility is positive correlation plus the additional restriction
that unused strategies are never subsequently chosen. Swinkels (1993) calls a dynamic a
myopic adjustment dynamic if it satisfies positive correlation, but he allows F^p(x)′V^p(x) = 0
even when V^p(x) ≠ 0.
Section 4.4: The approach to imitative revision protocols and dynamics in this section
builds on the work of Björnerstedt and Weibull (1996), Weibull (1995), and Hofbauer
(1995a).
Taylor and Jonker (1978) introduce the replicator dynamic to provide a dynamic analogue of Maynard Smith and Price’s (1973) equilibrium (ESS) model of animal conﬂict.
Exercise 4.4.12, which shows that the replicator dynamic is equivalent after a nonlinear
(barycentric) change of variable to the Lotka-Volterra equation (Lotka (1920), Volterra (1931)), is due to Hofbauer (1981). Schuster and Sigmund (1983) further observe that
fundamental models of population genetics (e.g., Crow and Kimura (1970)) and of biochemical evolution (e.g., Eigen and Schuster (1979)) can be viewed as special cases of the
replicator dynamic; they are also the ﬁrst to refer to the dynamic by this name. For more
on these biological models, see Hofbauer and Sigmund (2003). For a detailed analysis of
the replicator dynamic from an economic point of view, see Weibull (1995, Chapter 3). The
derivations of the replicator dynamic in Examples 4.4.2, 4.4.4, and 4.4.5 are due to Schlag
(1998), Björnerstedt and Weibull (1996), and Hofbauer (1995a), respectively.
The Maynard Smith replicator dynamic can be found in Maynard Smith (1982, Appendices D and J). For a contrast between the standard and Maynard Smith replicator
dynamics from a biological point of view, see Hofbauer and Sigmund (1988, Section 27.1).
The i-logit dynamic is due to Björnerstedt and Weibull (1996) and Weibull (1995).
Most early work by economists on deterministic evolutionary dynamics focuses on
generalizations of the replicator dynamic expressed in terms of percentage growth rates,
as in equation (4.15). The condition we call monotone percentage growth rates (4.16)
has appeared in many places under a variety of names: relative monotonicity (Nachbar
(1990)), order compatibility of predynamics (Friedman (1991)), monotonicity (Samuelson and
Zhang (1992)), and payoff monotonicity (Weibull (1995)). Aggregate monotone percentage
growth rates (4.17) and Exercise 4.4.18 are introduced by Samuelson and Zhang (1992).
Sign-preserving percentage growth rates (4.18) is a condition due to Nachbar (1990); see
also Ritzberger and Weibull (1995), who call this condition payoﬀ positivity. For surveys
of the literature referenced here, see Weibull (1995, Chapters 4 and 5) and Fudenberg and
Levine (1998, Chapter 3).
Sections 4.5, 4.6, and 4.7: These sections follow Sandholm (2005a, 2006a).
The Brown-von Neumann-Nash dynamic was introduced in the context of symmetric
zero-sum games by Brown and von Neumann (1950). Nash (1951) uses a discrete time
analogue of this dynamic as the basis for his simple proof of existence of equilibrium
based on Brouwer’s Theorem. More recently, the BNN dynamic was reintroduced by
Skyrms (1990), Swinkels (1993), and Weibull (1996), and by Hofbauer (2000), who gave
the dynamic its name. The Smith dynamic was introduced in the transportation science
literature by Smith (1984).

CHAPTER FIVE

Best Response and Projection Dynamics

5.0 Introduction

This chapter continues the parade of evolutionary dynamics commenced in Chapter
4. In the ﬁrst two sections, the step from payoﬀ vector ﬁelds to evolutionary dynamics
is traversed through a traditional game-theoretic approach, by employing best response
correspondences and perturbed versions thereof. The third section follows a geometric
approach, deﬁning an evolutionary dynamic via closest point projections of payoﬀ vectors.
The best response dynamic embodies the assumption that revising agents always switch
to their current best response. Because the best response correspondence is discontinuous and multivalued, the basic properties of solution trajectories under the best response
dynamic are quite diﬀerent from those of our earlier dynamics: multiple solution trajectories can sprout from a single initial condition, and solution trajectories can cycle in and
out of Nash equilibria. Despite these diﬃculties, we will see that analogues of incentive
properties (NS) and (PC) still hold true.
While the discontinuity of the best response protocol stands in violation of a basic
desideratum from Chapter 4, one can obtain a continuous protocol by working with perturbed payoﬀs. The resulting perturbed best response dynamics are continuous (and even
diﬀerentiable), and so have wellbehaved solution trajectories. While the payoﬀ perturbations prevent our incentive conditions from holding exactly, we show that appropriately
perturbed versions of these conditions, defined in terms of so-called “virtual payoffs”, can
be proved.
Our ﬁnal evolutionary dynamic, the projection dynamic, is motivated by geometric
considerations: we define the growth rate vector under the projection dynamic to be the closest approximation of the payoff vector by a feasible vector of motion. While the
resulting dynamic is discontinuous, its solutions still exist, are unique, and are continuous
in their initial conditions; moreover, both of our incentive conditions are easily veriﬁed.
We show that the projection dynamic can be derived from protocols that reﬂect “revision
driven by insecurity”. These protocols also reveal surprising connections between the
projection dynamic and the replicator dynamic, connections that we develop further
when studying the global behavior of evolutionary dynamics in Chapter 6.
The dynamics studied in this chapter require us to introduce new mathematical techniques. Determining the basic properties of the best response dynamic and the projection
dynamic requires ideas from the theory of differential inclusions (i.e., of set-valued differential equations), which we develop in Appendix 5.A. A key tool for analyzing perturbed
best response dynamics is the Legendre transform, whose basic properties are explained
in Appendix 5.B. These properties are central to our analysis of perturbed maximization,
which is offered in Appendix 5.C.

5.1 The Best Response Dynamic

5.1.1 Definition and Examples

Traditional game-theoretic analysis is based on the assumption of equilibrium play.
This assumption can be split into two distinct parts: that agents have correct beliefs about
their opponents’ behavior, and that they choose their strategies optimally given those
beliefs. When all agents simultaneously have correct beliefs and play optimal responses,
their joint behavior constitutes a Nash equilibrium.
It is natural to introduce an evolutionary dynamic based on similar principles. To
accomplish this, we suppose that each agent’s revision opportunities arrive at a ﬁxed rate,
and that when an agent receives such an opportunity, he chooses a best response to the
current population state. Thus, we assume that each agent responds optimally to correct
beliefs whenever he is revising, but not necessarily at other points in time.
Before introducing the best response dynamics, let us review the notions of exact target
protocols and dynamics introduced in Section 3.1.3. Under an exact target protocol, conditional switch rates ρ^p_ij(π^p, x^p) ≡ σ^p_j(π^p, x^p) are independent of an agent's current strategy.
These rates also satisfy Σ_{j∈S^p} σ^p_j(π^p, x^p) ≡ 1, so that σ^p(π^p, x^p) ∈ Δ^p is a mixed strategy. Such
a protocol induces the exact target dynamic

(5.1)	ẋ^p = m^p σ^p(F^p(x)) − x^p.

Under (5.1), the vector of motion ẋ^p for population p has its tail at the current state x^p
and its head at m^p σ^p, the representative of the mixed strategy σ^p ∈ Δ^p in the state space
X^p = m^p Δ^p.
The best response protocol is given by the multivalued map
(5.2)	σ^p(π^p, x^p) = M^p(π^p) ≡ argmax_{y^p∈Δ^p} (y^p)′π^p.

M^p : R^{n^p} ⇒ Δ^p is the maximizer correspondence for population p: the set M^p(π^p) consists
of those mixed strategies that only place mass on pure strategies optimal under payoff
vector π^p. Inserting this protocol into equation (5.1) yields the best response dynamic:

(BR)	ẋ^p ∈ m^p M^p(F^p(x)) − x^p.

We can also write (BR) as

	ẋ^p ∈ m^p B^p(x) − x^p,

where B^p = M^p ∘ F^p is the best response correspondence for population p.
Deﬁnition. The best response dynamic assigns each population game F ∈ F the set of solutions
to the diﬀerential inclusion (BR).
All of our dynamics from Chapter 4 are Lipschitz continuous, so the existence and
uniqueness of their solutions is ensured by the Picard-Lindelöf Theorem. Since the best
response dynamic (BR) is a discontinuous differential inclusion, that theorem does not
apply here. But while the map M^p is not a Lipschitz continuous function, it does exhibit
other regularity properties: in particular, it is a convex-valued, upper hemicontinuous correspondence. These properties impose enough structure on the dynamic (BR) to establish
an existence result.
To state this result, we say that the Lipschitz continuous trajectory {x_t}_{t≥0} is a Carathéodory
solution to the differential inclusion ẋ ∈ V(x) if it satisfies ẋ_t ∈ V(x_t) at all but a measure
zero set of times in [0, ∞).
Theorem 5.1.1. Fix a continuous population game F. Then for each ξ ∈ X, there exists a trajectory
{x_t}_{t≥0} with x_0 = ξ that is a Carathéodory solution to the differential inclusion (BR).
It is important to note that while solutions to the best response dynamic exist, they need
not be unique: as the examples to follow will illustrate, multiple solution trajectories can
emanate from a single initial condition. For a brief introduction to the theory of diﬀerential
inclusions, see Appendix 5.A.1.
In Chapter 3, we justified our focus on the deterministic dynamic generated by a
revision protocol through an appeal to a ﬁnite horizon approximation theorem. This
result, which we present in Chapter 9, tells us that under certain regularity conditions,
the stochastic evolutionary process {X^N_t} generated by a game F and revision protocol
ρ is well approximated by a solution to the mean dynamic (M) over any ﬁnite time
horizon, so long as the population size is large enough. But because the revision protocol
that generates the best response dynamic is discontinuous and multivalued, the ﬁnite
horizon approximation theorem from Chapter 9 does not apply here: indeed, since σ is
multivalued, the Markov process {X^N_t} is not even uniquely defined! Nevertheless, we
conjecture that it is possible to prove a version of the ﬁnite horizon approximation theorem
that applies in the present setting (see the Notes).

5.1.2 Construction and Properties of Solution Trajectories

Because solutions to the best response dynamic need not be unique, they can be distinctly more complicated than solutions to Lipschitz continuous dynamics, as we demonstrate in a series of examples below. But before doing this, we show another sense in
which solutions to the best response dynamic are rather simple.
Let {xt } be a solution to (BR), and suppose that at all times t ∈ [0, T], population p’s
unique best response to state xt is the pure strategy i ∈ Sp . Then during this time interval,
evolution in population p is described by the aﬃne diﬀerential equation
	ẋ^p = m^p e^p_i − x^p.

In other words, the population state x^p moves directly towards vertex v^p_i = m^p e^p_i of the set
X^p, proceeding more slowly as time passes. It follows that throughout the interval [0, T],
the state (x_t)^p lies on the line segment connecting (x_0)^p and v^p_i; indeed, we can solve the
previous equation to obtain an explicit formula for (x_t)^p:

	(x_t)^p = (1 − e^{−t}) v^p_i + e^{−t} (x_0)^p   for all t ∈ [0, T].
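The closed-form segment solution can be checked against a crude Euler integration of ẋ = e_i − x; this sketch assumes a single population with mass m = 1, and all parameter choices are ours.

```python
import math

# Sketch: while strategy i stays the unique best response, (BR) reduces to
# x-dot = e_i - x (single population, m = 1), whose solution is
# x_t = (1 - e^{-t}) e_i + e^{-t} x_0. We check this against Euler steps.

def euler(x0, target, t, steps=200000):
    dt = t / steps
    x = list(x0)
    for _ in range(steps):
        x = [xi + dt * (ti - xi) for xi, ti in zip(x, target)]
    return x

x0 = [0.7, 0.2, 0.1]
e1 = [0.0, 1.0, 0.0]      # head toward the vertex of strategy 2
t = 1.5
exact = [(1 - math.exp(-t)) * ei + math.exp(-t) * xi for ei, xi in zip(e1, x0)]
approx = euler(x0, e1, t)
print(max(abs(a - b) for a, b in zip(exact, approx)) < 1e-4)  # True
```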
Matters are more complicated at states that admit multiple best responses, since at such
states more than one future course of evolution is possible. Still, not every element of
B^p(x) need define a feasible direction of motion for population p: if {(x_t)^p} is to head toward
state x̂^p during a time interval of positive length, all pure strategies in the support of x̂^p
must remain optimal throughout the interval.
Figure 5.1.1: The best response dynamic in RPS.

Example 5.1.2. Standard Rock-Paper-Scissors. Suppose a population of agents is randomly
matched to play standard Rock-Paper-Scissors:

	A = ⎛  0  −l   w ⎞
	    ⎜  w   0  −l ⎟
	    ⎝ −l   w   0 ⎠   with w = l.

The phase diagram for the best response dynamic in F(x) = Ax is presented in
Figure 5.1.1. The upper, lower left, and lower right regions of the ﬁgure contain the states
at which Paper, Scissors, and Rock are the unique best responses; in each of these regions,
all solution trajectories head directly toward the appropriate vertex. When the boundary
of a best response region is reached, multiple directions of motion are possible, at least
in principle. But at all states other than the unique Nash equilibrium x∗ = (1/3, 1/3, 1/3), the
only direction of motion that can persist for a positive amount of time is the one heading
toward the new best response, and starting from x∗ , the only feasible solution trajectory
is the stationary one. Putting this all together, we conclude that in standard RPS, the
solution to the best response dynamic from each initial condition is unique.
Figure 5.1.1 appears to show that every solution trajectory converges to the unique
Nash equilibrium x∗. To verify this, we prove that along every solution trajectory {x_t}, whenever the best response to x_t is unique, we have that
(5.3)	(d/dt) max_{k∈S} F_k(x_t) = − max_{k∈S} F_k(x_t).

Since the best response is unique at almost all times t, integrating equation (5.3) shows
that

(5.4)	max_{k∈S} F_k(x_t) = e^{−t} max_{k∈S} F_k(x_0).

Now in standard RPS, the maximum payoff function max_{k∈S} F_k is nonnegative, equalling
zero only at the Nash equilibrium x∗ . This fact and equation (5.4) imply that the maximal
payoﬀ falls over time, converging as t approaches inﬁnity to its minimum value of 0; over
this same time horizon, the state xt converges to the Nash equilibrium x∗ .
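The exponential decay in (5.4) can be observed in simulation. The sketch below (all parameter choices are ours) integrates the best response dynamic in standard RPS with w = l = 1, taking small Euler steps toward the current best response, and compares the maximal payoff at time T with the prediction e^{−T} max_k F_k(x_0).

```python
import math

# Sketch: simulate the best response dynamic in standard RPS (w = l = 1)
# with Euler steps toward the current best response, then compare the maximal
# payoff max_k F_k(x_t) with the prediction e^{-t} max_k F_k(x_0) from (5.4).

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]   # rows: Rock, Paper, Scissors

def payoffs(x):
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

x = [0.6, 0.3, 0.1]
m0 = max(payoffs(x))
dt, T = 1e-4, 2.0
for step in range(int(T / dt)):
    F = payoffs(x)
    b = F.index(max(F))               # current (generically unique) best response
    x = [xi + dt * ((1.0 if i == b else 0.0) - xi) for i, xi in enumerate(x)]
print(abs(max(payoffs(x)) - math.exp(-T) * m0) < 1e-2)  # True
```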
To prove equality (5.3), fix a state x_t at which there is a unique optimal strategy, say
Paper. At this state, ẋ_t = e_P − x_t. Since F_P(x) = w(x_R − x_S), we can compute that

	(d/dt) F_P(x_t) = ∇F_P(x_t)′ ẋ_t
	               = w(e_R − e_S)′ (e_P − x_t)
	               = −w(e_R − e_S)′ x_t
	               = −F_P(x_t). §

Example 5.1.3. Two-strategy coordination. Suppose that agents are randomly matched to
play the two-strategy game with strategy set S = {U, D} and payoff matrix

	A = ⎛ 1  0 ⎞
	    ⎝ 0  2 ⎠.

The resulting random matching game F(x) = Ax has three Nash equilibria: the two pure
equilibria e_U and e_D, and the mixed equilibrium (x∗_U, x∗_D) = (2/3, 1/3).
To reduce the amount of notation, we let d = x_D represent the proportion of players
choosing strategy D, so that the mixed Nash equilibrium becomes d∗ = 1/3. The best
response dynamic for this game is described in terms of the state d as follows:

	d˙ = {−d}          if d < d∗,
	     [−1/3, 2/3]   if d = d∗,
	     {1 − d}       if d > d∗.

From every initial condition other than d∗, the dynamic admits a unique solution trajectory
that converges to a pure equilibrium:
(5.5)	d_0 < d∗ ⇒ d_t = e^{−t} d_0,
(5.6)	d_0 > d∗ ⇒ d_t = e^{−t} d_0 + (1 − e^{−t}) = 1 − e^{−t}(1 − d_0).

But there are many solution trajectories starting from d∗: one solution is stationary;
another proceeds to d = 0 according to equation (5.5), a third proceeds to d = 1 according
to equation (5.6), and yet others follow the trajectories in (5.5) and (5.6) after some initial
delay.
Notice that solutions (5.5) and (5.6) quickly leave the vicinity of d∗ . This is unlike
the behavior of Lipschitz continuous dynamics, under which solutions from all initial
conditions are unique, and solutions that start near a stationary point move very slowly.
§
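The closed forms (5.5) and (5.6) can be verified by integrating the piecewise dynamic directly; the sketch below uses our own step sizes and initial conditions.

```python
import math

# Sketch: from d_0 != d* = 1/3, the best response dynamic in the coordination
# game is d-dot = -d (if d < d*) or 1 - d (if d > d*); we verify the closed
# forms (5.5) and (5.6) by direct Euler integration.

def simulate(d0, t, steps=100000):
    d, dt = d0, t / steps
    dstar = 1.0 / 3.0
    for _ in range(steps):
        d += dt * (-d if d < dstar else 1.0 - d)
    return d

t = 1.0
low = simulate(0.2, t)    # (5.5): d_t = e^{-t} d_0
high = simulate(0.6, t)   # (5.6): d_t = 1 - e^{-t}(1 - d_0)
print(abs(low - math.exp(-t) * 0.2) < 1e-4)          # True
print(abs(high - (1 - math.exp(-t) * 0.4)) < 1e-4)   # True
```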
Exercise 5.1.4. Two-strategy anticoordination. Suppose players are randomly matched to
play the anticoordination game

	A = ⎛ −1   0 ⎞
	    ⎝  0  −1 ⎠.

Show that there is a unique solution to this dynamic from each initial condition d_0. Also,
show that each solution reaches the unique Nash equilibrium d∗ = 1/2 in finite time, and
express this time as a function of the initial condition d0 . This is unlike the behavior of
Lipschitz continuous dynamics, under which solutions can only reach rest points in the
limit as the time t approaches inﬁnity.
Example 5.1.5. Three-strategy coordination. Figure 5.1.2 presents the phase diagram for the
best response dynamic generated by random matching in the pure coordination game

	A = ⎛ 1  0  0 ⎞
	    ⎜ 0  1  0 ⎟
	    ⎝ 0  0  1 ⎠.

The speed of motion is fastest near the mixed Nash equilibrium x∗ = (1/3, 1/3, 1/3). As in
Example 5.1.3, solution trajectories are not unique: this time, whenever the state is on the
Y-shaped set of boundaries between best response regions, it can leave this set and head
into any adjoining basin of attraction. §

Figure 5.1.2: The best response dynamic in Pure Coordination.

Exercise 5.1.6. Good and bad RPS.
(i) Using a similar argument to that provided in Example 5.1.2, show that in any good RPS game, the unique Nash equilibrium
x∗ = (1/3, 1/3, 1/3) is globally stable, and that it is reached in finite time from every
initial condition.
(ii) Show that in any bad RPS game, solutions starting from almost all initial conditions
converge to a limit cycle in the interior of the state space. In addition, argue
that there are multiple solutions starting from the Nash equilibrium x∗ : one is
stationary, while others spiral outward toward the limit cycle. The latter solutions
are not diﬀerentiable at t = 0. It is therefore possible for a solution to escape a Nash
equilibrium without the solution beginning its motion in a well-defined direction.
(Hint: Consider backward solution trajectories from initial conditions in the region
bounded by the cycle.)
Example 5.1.7. Zeeman’s game. Consider the population game F(x) = Ax generated by
random matching in the symmetric normal form game

	A = ⎛  0   6  −4 ⎞
	    ⎜ −3   0   5 ⎟
	    ⎝ −1   3   0 ⎠

with strategy set S = {U, M, D}. The Nash equilibria of F are e_U, x∗ = (1/3, 1/3, 1/3), and
y∗ = (4/5, 0, 1/5). The best response dynamic for F is presented in Figure 5.1.3.

Figure 5.1.3: The best response dynamic in Zeeman's game.

Solution
trajectories from a majority of initial conditions are unique and converge to the pure
equilibrium eU . However, some initial conditions generate multiple solutions. Consider,
for example, solutions starting at the interior Nash equilibrium x∗ . There is a stationary
solution at x∗ , as well as solutions that head toward the vertex eU , possibly after some
delay. Other solutions head toward the Nash equilibrium y∗ . Some of these converge to
y∗ ; others leave segment x∗ y∗ before reaching y∗ . Of those that leave, some head to eU ,
while others head toward eD and then return to x∗ . If x∗ is revisited, any of the behaviors
just described can occur again. Therefore, there are solutions to (BR) that arrive at and
depart x∗ in perpetuity. §

5.1.3 Incentive Properties

In the previous chapter, we introduced two properties, Nash stationarity (NS) and
positive correlation (PC), that link growth rates under evolutionary dynamics with payoﬀs
in the underlying games.
(NS)	VF(x) = 0 if and only if x ∈ NE(F).
(PC)	V^p_F(x) ≠ 0 implies that V^p_F(x)′F^p(x) > 0, for all p ∈ P.

Both of these properties are designed for single-valued differential equations. We now
establish that analogues of these two properties are satisﬁed by the diﬀerential inclusion
(BR).
Theorem 5.1.8. The best response dynamic satisﬁes
(5.7)	0 ∈ VF(x) if and only if x ∈ NE(F).
(5.8)	(z^p)′F^p(x) = m^p max_{j∈S^p} F̂^p_j(x) for all z^p ∈ V^p_F(x).

Condition (5.7) requires that the differential inclusion ẋ ∈ VF(x) have a stationary
solution at every Nash equilibrium, but at no other states. As we have seen, this condition
does not rule out the existence of additional solution trajectories that leave Nash equilibria.
Condition (5.8) asks that the correspondence x → V^p_F(x)′F^p(x) be single-valued, always
equaling the product of population p’s mass and its maximal excess payoﬀ. It follows that
this map is Lipschitz continuous and nonnegative, equaling zero if and only if all agents in
population p are playing a best response (see Lemma 4.5.4). Summing over populations,
we see that VF(x)′F(x) = {0} if and only if x is a Nash equilibrium of F.

Proof. Property (5.7) is immediate. To prove property (5.8), fix x ∈ X, and let z^p ∈ V^p_F(x).
Then z^p = m^p y^p − x^p for some y^p ∈ M^p(F^p(x)). Therefore,

	(z^p)′F^p(x) = (m^p y^p − x^p)′F^p(x) = m^p max_{j∈S^p} F^p_j(x) − m^p F̄^p(x) = m^p max_{j∈S^p} F̂^p_j(x).

5.2 Perturbed Best Response Dynamics

The best response dynamic is a fundamental model of evolution in games, as it provides
j∈S 5.2 j∈S Perturbed Best Response Dynamics The best response dynamic is a fundamental model of evolution in games, as it provides
an idealized description of the behavior of agents whose decisions condition on exact
information about the current strategic environment. Of course, the ﬂip side of exact
information is discontinuity, a violation of our desideratum (C) for revision protocols (see
Section 4.2.1).
We now introduce revision protocols under which agents choose best responses to
payoﬀs that have been subjected to perturbations. While the perturbations can represent
actual payoﬀ noise, they can also represent errors in agents’ perceptions of payoﬀs, or in
the agents’ implementations of the best response rule. Regardless of their interpretation,
the perturbations lead to revision protocols that are smooth functions of payoﬀs, and so
to dynamics that can be analyzed using standard techniques.
172 The use of perturbed best response functions is not unique to evolutionary game theory. To mention one prominent example, researchers in experimental economics employ
perturbed best response functions when attempting to rationalize experimental data. Consequently, the ideas we develop in this section provide dynamic foundations for solution
concepts in common use in experimental research (see the Notes). 5.2.1 Revision Protocols and Mean Dynamics Perturbed best response protocols are exact target protocols deﬁned in terms of perp
˜
turbed maximizer functions Mp : Rn → int(∆p ):
(5.9) ˜
σp (πp , xp ) = Mp (πp ). ˜
Unlike the maximizer correspondence Mp , the function Mp is singlevalued, continuous,
˜
and even diﬀerentiable. The mixed strategy Mp (πp ) ∈ int(∆p ) places most of its mass on the
optimal pure strategies, but places positive mass on all pure strategies. Precise deﬁnitions
˜
of Mp will be stated below.
Example 5.2.1. Logit choice. When p = 1, the logit choice function with noise level η > 0 is
written as
˜
Mi (π) = exp(η−1 πi )
.
−1
j∈S exp(η π j ) ˜
For any value of η > 0, each strategy receives positive probability under M regardless of
the payoﬀ vector π. But if πi > π j for all j i, the probability with which strategy i is
chosen approaches one as η approaches zero. Notice too that adding a constant vector to
the payoﬀ vector π has no eﬀect on choice probabilities.
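The two properties just noted, invariance to constant payoff shifts and concentration on the optimal strategy as η shrinks, are easy to confirm numerically. The helper below is our own sketch (it also uses the standard max-subtraction trick for numerical stability):

```python
import math

# Sketch of the logit choice function: probabilities are proportional to
# exp(payoff / eta); choice is invariant to adding a constant to all payoffs,
# and concentrates on the best strategy as eta shrinks.

def logit(pi, eta):
    m = max(pi)                      # subtract the max for numerical stability
    w = [math.exp((p - m) / eta) for p in pi]
    s = sum(w)
    return [wi / s for wi in w]

pi = [1.0, 0.5, 0.0]
p = logit(pi, 0.25)
shifted = logit([x + 7.0 for x in pi], 0.25)
print(all(abs(a - b) < 1e-12 for a, b in zip(p, shifted)))  # True
print(logit(pi, 0.01)[0] > 0.999)                           # True
```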
When there are just two strategies, the logit choice function reduces to

	M̃_1(π) = exp(η^{−1}(π_1 − π_2)) / (exp(η^{−1}(π_1 − π_2)) + 1),   and M̃_1(π) + M̃_2(π) = 1.

In Figure 5.2.1, we fix π_2 at 0, and graph as a function of π_1 the logit(η) choice probabilities
M̃_1(π) for η = .25, .1, and .02, as well as the optimal choice probabilities M_1(π). Evidently,
M̃_1 provides a smooth approximation of the discontinuous map M_1. While the function
M̃_1 cannot converge uniformly to the correspondence M_1 as the noise level η goes to zero,
one can show that the graph of M̃_1 converges uniformly (in the Hausdorff metric—see the
Notes) to the graph of M_1 as η approaches zero. §
Figure 5.2.1: Logit choice probabilities M̃_1(π_1, 0) for noise levels η = .25 (red), η = .1 (green), and η = .02
(blue), along with optimal choice probabilities M_1(π_1, 0) (black).

The protocol (5.9) induces the perturbed best response dynamic
(5.10)	ẋ^p = m^p M̃^p(F^p(x)) − x^p

as its mean dynamic. We can also write (5.10) as

	ẋ^p = m^p B̃^p(x) − x^p,

where the function B̃^p = M̃^p ∘ F^p, which maps social states to mixed strategies, is the
perturbed best response function for population p; it is a perturbed version of the best
response correspondence B^p = M^p ∘ F^p.

5.2.2 Perturbed Optimization: A Representation Theorem

We now consider two methods of defining perturbed maximizer functions. To avoid
superscripts, we focus here on the single population case.
The traditional method of defining M̃, a method with a long history in the theory of
discrete choice, is based on stochastic perturbations of the payoﬀs to each pure strategy. In
this construction, an agent chooses the best response to the vector of payoﬀs π ∈ Rn , but
only after the payoﬀs to his alternatives have been perturbed by some random vector ε.
(5.11)	M̃_i(π) = P( i = argmax_{j∈S} (π_j + ε_j) ).

We require the random vector ε to be an admissible stochastic perturbation: it must admit
a positive density on R^n, and this density must be smooth enough that the function M̃ is
continuously differentiable. For example, if the components ε_i are independent, standard
results on convolutions imply that M̃ is C¹ whenever the densities of the components ε_i
are bounded. In the discrete choice literature, the definition of M̃ via equation (5.11) is
known as the additive random utility model (ARUM).
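A well-known special case of (5.11): when the ε_j are i.i.d. Gumbel random variables with scale η, the resulting choice probabilities are exactly the logit probabilities of Example 5.2.1. The Monte Carlo sketch below (setup and names are ours, not a construction from the text) illustrates this correspondence:

```python
import math, random

# Sketch: for i.i.d. Gumbel perturbations with scale eta, the additive random
# utility model (5.11) is known to yield logit choice probabilities. We check
# this by Monte Carlo.

random.seed(1)
eta, pi = 0.5, [1.0, 0.5, 0.0]

def gumbel():
    # standard Gumbel via inverse transform, scaled by eta
    return -eta * math.log(-math.log(random.random()))

trials = 200000
wins = [0, 0, 0]
for _ in range(trials):
    perturbed = [p + gumbel() for p in pi]
    wins[perturbed.index(max(perturbed))] += 1

w = [math.exp(p / eta) for p in pi]
logit_probs = [wi / sum(w) for wi in w]
ok = all(abs(wins[i] / trials - logit_probs[i]) < 0.01 for i in range(3))
print(ok)  # True
```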
We can also define M̃ by introducing a deterministic perturbation of the payoffs to each
mixed strategy. Call the function v : int(Δ) → R an admissible deterministic perturbation if
it is differentiably strictly convex and steep near bd(Δ). That is, v is admissible if the second
derivative at y, D²v(y) ∈ L²_s(R^n_0, R), is positive definite for all y ∈ int(Δ), and if |∇v(y)|
approaches infinity whenever y approaches bd(Δ). (Recall that R^n_0 is an alternate notation
for TΔ, the tangent space of the simplex.) With an admissible v in hand, we define the
function M̃ by

(5.12)	M̃(π) = argmax_{y∈int(Δ)} ( y′π − v(y) ).
One interpretation of the function v is that it represents a “control cost” that becomes
large whenever an agent puts too little probability on any particular pure strategy. Because
the base payoﬀs to each strategy are bounded, the steepness of v near bd(∆) implies that
it is never optimal for an agent to choose probabilities too close to zero.
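For a concrete instance, consider the log-barrier perturbation v(y) = −η ∑_j log y_j (a scaled version of the perturbation appearing in Exercise 5.2.3 below). The first order condition for (5.12) is π_i + η/y_i = λ, so y_i = η/(λ − π_i) with the multiplier λ pinned down by ∑_i y_i = 1; the sketch below (an illustrative assumption, not a construction from the text) solves this by bisection:

```python
def barrier_maximizer(pi, eta=0.1, iters=200):
    """Solve max_y y.pi + eta*sum(log y_j) over int(Delta), i.e. the perturbed
    maximizer M~(pi) for the deterministic perturbation v(y) = -eta*sum(log y_j).
    FOC: pi_i + eta/y_i = lam, so y_i = eta/(lam - pi_i); bisect on lam."""
    lo = max(pi) + 1e-12            # here sum_i y_i blows up
    hi = max(pi) + len(pi) * eta    # here each y_i <= 1/n, so the sum is <= 1
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        total = sum(eta / (lam - p) for p in pi)
        lo, hi = (lam, hi) if total > 1.0 else (lo, lam)
    lam = 0.5 * (lo + hi)
    return [eta / (lam - p) for p in pi]

y = barrier_maximizer([1.0, 0.0, -1.0])
y_shift = barrier_maximizer([2.0, 1.0, 0.0])   # payoffs shifted by a constant
```

Note that all probabilities stay strictly positive (the steepness of v at bd(∆) at work), and that a constant shift of π leaves the maximizer unchanged, anticipating the discussion below.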
Note that under either definition, choice probabilities under M̃ are unaffected by constant shifts in the payoff vector π. The projection of R^n onto R^n_0, Φ = I − (1/n)11′, employs just such a shift, so we can express this property of M̃ as follows:

M̃(π) = M̃(Φπ) for all π ∈ R^n.

With this motivation, we define M : R^n_0 → int(∆) to be the restriction of M̃ to the subspace R^n_0.
As we noted above, the stochastic construction (5.11) is the traditional way of defining perturbed maximizer functions, and this construction is more intuitively appealing than the deterministic construction (5.12). But the latter construction is clearly more convenient for analysis: while under (5.11) choice probabilities must be expressed as cumbersome multiple integrals, under (5.12) they are obtained as interior maximizers of a strictly concave function.

Happily, we need not trade off intuitive appeal for convenience: every M̃ defined via equation (5.11) can be represented in form (5.12).
Theorem 5.2.2. Let M̃ be a perturbed maximizer function defined in terms of an admissible stochastic perturbation ε via equation (5.11). Then M̃ satisfies equation (5.12) for some admissible deterministic perturbation v. In fact, M = M̃|_{R^n_0} and ∇v are invertible, and M = (∇v)⁻¹.
Taking as given the initial statements in the theorem, it is easy to verify the last one. Indeed, suppose that M̃ (and hence M) can be derived from the admissible deterministic perturbation v, that the gradient ∇v : int(∆) → R^n_0 is invertible, and that the payoff vector π is in R^n_0. Then y* ≡ M(π) satisfies

y* = argmax_{y ∈ int(∆)} (y′π − v(y)).

This is a strictly concave maximization problem with an interior solution. Taking the first order condition with respect to directions in R^n_0 yields

Φ(π − ∇v(y*)) = 0.

Since π and ∇v(y*) are already in R^n_0, the projection Φ does nothing, so rearranging allows us to conclude that

M(π) = y* = (∇v)⁻¹(π).
In light of this argument, the main task in proving Theorem 5.2.2 is to show that a
function v with the desired properties exists. Accomplishing this requires the use of the
Legendre transform, a classical tool from convex analysis. We explain the basic properties
of the Legendre transform in Appendix 5.B. This device is used to prove the representation
theorem in Appendix 5.C, where some auxiliary results can also be found.
One such result is worth mentioning now. Theorem 5.2.2 tells us that every M̃ defined in terms of stochastic perturbations can be represented in terms of deterministic perturbations. Exercise 5.2.3 shows that the converse statement is false, and thus that the deterministic definition of M̃ is strictly more general than the stochastic one.

Exercise 5.2.3. Show that when n ≥ 4, there is no stochastic perturbation of payoffs which yields the same choice probabilities as the admissible deterministic perturbation v(y) = −∑_{j∈S} log y_j. (Hint: Use Theorem 5.C.6 in the Appendix.)

5.2.3 Logit Choice and the Logit Dynamic

In Example 5.2.1, we introduced the best known example of a perturbed maximizer
function: the logit choice function with noise level η > 0:

(5.13)  M̃_i(π) = exp(η⁻¹π_i) / ∑_{j∈S} exp(η⁻¹π_j).

This function generates as its mean dynamic the logit dynamic with noise level η:

(L)  ẋ_i^p = m^p · exp(η⁻¹F_i^p(x)) / ∑_{j∈S^p} exp(η⁻¹F_j^p(x)) − x_i^p.

Rest points of logit dynamics are called logit equilibria.
Example 5.2.4. In Figure 5.2.2, we present phase diagrams for the 123 Coordination game

F(x) = Ax, where A = diag(1, 2, 3), so that F(x) = (x_1, 2x_2, 3x_3)′,
under logit dynamics with a range of noise levels. As η passes from .01 to 1, the dynamics
pass through four distinct regimes. At the lowest noise levels, the dynamics admit seven
rest points, three stable and four unstable, corresponding to the seven Nash equilibria of F.
When η reaches ≈ .22, two of the unstable rest points annihilate one another, leaving ﬁve
rest points in total. At η ≈ .28, the stable rest point corresponding to Nash equilibrium e1
and an unstable rest point eliminate one another, so that three rest points remain. Finally,
when η ≈ .68, the stable rest point corresponding to Nash equilibrium e2 and an unstable
rest point annihilate each other, leaving just a single, stable rest point. If we continue to
increase η, the last rest point ultimately converges to the central state (1/3, 1/3, 1/3).
This example provides an illustration of a deep topological result called the Poincaré-Hopf Theorem. In the present two-dimensional context, this theorem ensures that generically, the number of sinks plus the number of sources equals the number of saddles plus one. §
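The single-rest-point regime can be checked by direct integration. The sketch below Euler-integrates the logit dynamic for the 123 Coordination game at η = 1 (a level at which, per the description above, only one rest point remains) from each vertex of the simplex; all three trajectories should approach the same state. Step size and step count are illustrative assumptions:

```python
import math

def logit_choice(pi, eta):
    m = max(pi)
    w = [math.exp((p - m) / eta) for p in pi]
    z = sum(w)
    return [wi / z for wi in w]

def logit_dynamic_limit(x, eta=1.0, dt=0.1, steps=5000):
    """Euler scheme for the logit dynamic x' = M~(F(x)) - x in the
    123 Coordination game F(x) = (x1, 2*x2, 3*x3)."""
    for _ in range(steps):
        payoff = [1.0 * x[0], 2.0 * x[1], 3.0 * x[2]]
        target = logit_choice(payoff, eta)
        x = [xi + dt * (ti - xi) for xi, ti in zip(x, target)]
    return x

limits = [logit_dynamic_limit(x0) for x0 in ([1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0])]
```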
Example 5.2.5. Stochastic derivation of logit choice. We can derive the logit choice function from stochastic perturbations that are i.i.d. with the double exponential distribution: P(ε_i ≤ c) = exp(−exp(−η⁻¹c − γ)), where γ = lim_{n→∞} (∑_{k=1}^{n} 1/k − log n) ≈ 0.5772 is Euler's constant. For intuition, we mention without proof that Eε_i = 0 and Var(ε_i) = η²π²/6, so that SD(ε_i) ≈ 1.2826η.

[Figure 5.2.2: Logit dynamics in 123 Coordination. Twelve panels at noise levels (i) η = .001, (ii) η = .1, (iii) η = .2, (iv) η = .22, (v) η = .27, (vi) η = .28, (vii) η = .4, (viii) η = .6, (ix) η = .68, (x) η = .85, (xi) η = 1.2, (xii) η = 3.]

To see that these perturbations generate logit choice, note that the density of ε_i is
f(x) = η⁻¹ exp(−η⁻¹x − γ) exp(−exp(−η⁻¹x − γ)). Using the substitutions y = exp(−η⁻¹x − γ) and m_j = exp(η⁻¹π_j), we compute as follows:

P(i = argmax_{j∈S} (π_j + ε_j)) = ∫_{−∞}^{∞} f(x) ∏_{j≠i} F(π_i + x − π_j) dx
  = −∫_{∞}^{0} exp(−y) ∏_{j≠i} exp(−y m_j/m_i) dy
  = ∫_{0}^{∞} exp(−y ∑_{j∈S} m_j/m_i) dy
  = m_i / ∑_{j∈S} m_j
  = exp(η⁻¹π_i) / ∑_{j∈S} exp(η⁻¹π_j). §

Exercise 5.2.6. Deterministic derivation of logit choice. According to the representation theorem, it must also be possible to derive the logit choice function from an admissible
deterministic perturbation. Show that this is accomplished using the (negated) entropy
function v(y) = η ∑_{j∈S} y_j log y_j.
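Both derivations can be spot-checked numerically. The sketch below samples the double exponential distribution of Example 5.2.5 by inverting its cdf and compares the empirical choice frequencies with the logit formula (5.13); the payoff vector is an illustrative assumption:

```python
import math
import random

GAMMA = 0.5772156649015329  # Euler's constant

def double_exp_sample(rng, eta):
    """Inverse-cdf sample from P(eps <= c) = exp(-exp(-c/eta - gamma))."""
    u = rng.random()
    return eta * (-math.log(-math.log(u)) - GAMMA)

def empirical_choice(pi, eta, draws=100_000, seed=1):
    """Frequency with which each strategy's perturbed payoff is largest."""
    rng = random.Random(seed)
    n = len(pi)
    wins = [0] * n
    for _ in range(draws):
        vals = [pi[j] + double_exp_sample(rng, eta) for j in range(n)]
        wins[max(range(n), key=vals.__getitem__)] += 1
    return [w / draws for w in wins]

def logit(pi, eta):
    w = [math.exp(p / eta) for p in pi]
    z = sum(w)
    return [wi / z for wi in w]

pi, eta = [0.0, 0.5, 1.0], 1.0
emp, exact = empirical_choice(pi, eta), logit(pi, eta)
```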
The next exercise gives explicit formulas for various functions from the proof of the representation theorem in the case of logit choice. Included is the derivative matrix DM̃(π), a useful item in analyses of local stability (see Chapter 7). The exercise also shows how the entropy function v can be derived from the function M̃.
Exercise 5.2.7. Additional results on logit choice.
(i) Show that µ̃(π) = η log(∑_{j∈S} exp(η⁻¹π_j)) is a potential function for M̃. (For the interpretation of this function, see Observation 5.C.3 and Theorem 5.C.4 in the Appendix.)
(ii) Let µ̄ be the restriction of µ̃ to R^n_0, so that ∇µ̄(π) = ΦM̃(π) = M̃(π) − (1/n)1 = M(π) − (1/n)1. For y ∈ int(∆), let ŷ ≡ y − (1/n)1. Show that

(∇µ̄)⁻¹(ŷ) = M⁻¹(y) = η (log y_i − (1/n) ∑_{j∈S} log y_j)_{i∈S}.

(iii) Let (C*, µ̄*) be the Legendre transform of (R^n_0, µ̄), and define v : int(∆) → R by v(y) = µ̄*(ŷ). Show by direct computation that v(y) = η ∑_{j∈S} y_j log y_j.
(iv) Show that ∇v(y) = M⁻¹(y). (Hint: Let ṽ be the natural extension of v to R^n_+, and use the fact that ∇v(y) = Φ∇ṽ(y).)
(v) Show that ∇²v(y) = η Φ diag([y⁻¹]) Φ, where [y⁻¹]_j = y_j⁻¹ for all j ∈ S.
(vi) Show that if π ∈ R^n_0, then

DM̃(π) = ∇²µ̃(π) = η⁻¹ (diag(M̃(π)) − M̃(π)M̃(π)′) = ∇²µ̄(π) = DM(π).

(vii) Show that ∇²v(M(π)) = (∇²µ̄(π))⁻¹ when these matrices are viewed as linear maps from R^n_0 to R^n_0. (Hint: Since both of these maps are of full rank on R^n_0, it is enough to show that ∇²µ̄(π) ∇²v(M(π)) = Φ, the orthogonal projection onto R^n_0.)
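The derivative formula in part (vi) is easy to spot-check by finite differences (the test point π is an illustrative assumption):

```python
import math

def logit(pi, eta):
    m = max(pi)
    w = [math.exp((p - m) / eta) for p in pi]
    z = sum(w)
    return [wi / z for wi in w]

def analytic_jacobian(pi, eta):
    """DM~(pi) = (1/eta) * (diag(M~(pi)) - M~(pi) M~(pi)'), as in part (vi)."""
    p = logit(pi, eta)
    n = len(p)
    return [[(p[i] * (1.0 if i == j else 0.0) - p[i] * p[j]) / eta
             for j in range(n)] for i in range(n)]

def numeric_jacobian(pi, eta, h=1e-6):
    """Central finite differences, column by column."""
    n = len(pi)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(pi); up[j] += h
        dn = list(pi); dn[j] -= h
        pu, pd = logit(up, eta), logit(dn, eta)
        for i in range(n):
            J[i][j] = (pu[i] - pd[i]) / (2 * h)
    return J

pi, eta = [0.3, -0.1, 0.6], 1.0
A, N = analytic_jacobian(pi, eta), numeric_jacobian(pi, eta)
err = max(abs(A[i][j] - N[i][j]) for i in range(3) for j in range(3))
```

Note that the matrix is symmetric, the fact exploited in Exercise 5.2.8 below.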
Exercise 5.2.8. Suppose that M̃ is a perturbed maximizer function derived from an admissible deterministic perturbation as in equation (5.12) (or from an admissible stochastic perturbation as in equation (5.11)). Show that if M̃ can be expressed as

(5.14)  M̃_i(π) = α(π_i) / ∑_{j∈S} α(π_j)

for some increasing differentiable function α : R → (0, ∞), then M̃ is a logit choice function with some noise level η > 0. (Hint: Combine equation (5.14) with the fact that the derivative matrix DM̃(π) must be symmetric (see Corollary 5.C.5 and Theorem 5.C.6 in the Appendix).)
Exercise 5.2.9. The variable-rate logit dynamic. The variable-rate logit dynamic with noise level η > 0 is defined by

(5.15)  ẋ_i^p = m^p exp(η⁻¹F_i^p(x)) − x_i^p ∑_{j∈S^p} exp(η⁻¹F_j^p(x)).

The previous exercise shows that the logit dynamic is the only perturbed best response dynamic that admits a modification of this sort.
(i) Describe a simple revision protocol that generates this dynamic, and provide an interpretation.
(ii) Show that if p = 1, then (5.15) is equivalent to the logit dynamic (L) up to a change in the speed at which solution trajectories are traversed. Explain why this is not the case when p ≥ 2.
(iii) Compare this dynamic with the excess payoff dynamics from Chapter 4. Explain why those dynamics cannot be modified so as to resemble the logit dynamic (L).

5.2.4 Perturbed Incentive Properties via Virtual Payoffs

Because they incorporate payoff disturbances, perturbed best response dynamics cannot satisfy positive correlation (PC) or Nash stationarity (NS). We now show that these
dynamics do satisfy suitably perturbed versions of the two incentive properties. In light of the representation theorem, there is no loss of generality in focusing on dynamics generated by admissible deterministic perturbations v = (v¹, ..., v^p).

We can describe the set of Nash equilibria of F in terms of the best response correspondences B^p:

NE(F) = {x ∈ X : x^p ∈ m^p B^p(x) for all p ∈ P}.

In similar fashion, we define the set of perturbed equilibria of the pair (F, v) in terms of the perturbed best response functions B̃^p:

PE(F, v) = {x ∈ X : x^p = m^p B̃^p(x) for all p ∈ P}.

By definition, the rest points of the perturbed best response dynamic (5.10) are the perturbed equilibria of (F, v).
Observation 5.2.10. All perturbed best response dynamics satisfy perturbed stationarity:
(5.16)  V(x) = 0 if and only if x ∈ PE(F, v).

We can derive an alternate characterization of perturbed equilibrium using the notion of virtual payoffs. Define the virtual payoffs F̃ : int(X) → R^n for the pair (F, v) by

F̃^p(x) = F^p(x) − ∇v^p((1/m^p) x^p).

Thus, the virtual payoff function for population p is the difference between the population's true payoff function and the gradient of its deterministic perturbation.

For intuition, let us consider the single population case. When x is far from the boundary of the simplex X, the perturbation v is relatively flat, so the virtual payoffs F̃(x) are close to the true payoffs F(x). But near the boundary of X, true and virtual payoffs are quite different. For example, when x_i is the only component of x that is close to zero, then for each alternate strategy j ≠ i, moving “inward” in direction e_i − e_j sharply decreases the value of v; thus, the directional derivative ∂v/∂(e_i − e_j)(x) is large in absolute value and negative. It follows that the difference F̃_i(x) − F̃_j(x) between these strategies' virtual payoffs is large and positive. In other words, rare strategies are quite desirable in the “virtual game” F̃.
Individual agents do not use virtual payoffs to decide how to act: to obtain the maximized function in definition (5.12) from the virtual payoff function, we must replace the normalized population state (1/m^p)x^p with the vector of choice probabilities y^p. But at perturbed equilibria, (1/m^p)x^p and y^p agree. Therefore, perturbed equilibria of (F, v) correspond to “Nash equilibria” of the “virtual game” F̃.
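This correspondence can be illustrated numerically. The sketch below assumes the entropy perturbation v(y) = η ∑_j y_j log y_j of Exercise 5.2.6 (so (∇v(y))_i = η(log y_i + 1)) and the 123 Coordination game of Example 5.2.4; at a rest point of the logit dynamic, the virtual payoffs are equalized across strategies:

```python
import math

def logit(pi, eta):
    m = max(pi)
    w = [math.exp((p - m) / eta) for p in pi]
    z = sum(w)
    return [wi / z for wi in w]

def logit_equilibrium(eta=1.0, steps=20_000, dt=0.2):
    """Damped fixed-point iteration toward x = M~(F(x)) for the
    123 Coordination game F(x) = (x1, 2*x2, 3*x3)."""
    x = [1 / 3, 1 / 3, 1 / 3]
    for _ in range(steps):
        target = logit([1.0 * x[0], 2.0 * x[1], 3.0 * x[2]], eta)
        x = [xi + dt * (ti - xi) for xi, ti in zip(x, target)]
    return x

eta = 1.0
x = logit_equilibrium(eta)
payoff = [1.0 * x[0], 2.0 * x[1], 3.0 * x[2]]
# virtual payoffs F~ = F - grad v, with (grad v)_i = eta * (log y_i + 1)
virtual = [f - eta * (math.log(xi) + 1.0) for f, xi in zip(payoff, x)]
spread = max(virtual) - min(virtual)
```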
Theorem 5.2.11. Let x ∈ X be a social state. Then x ∈ PE(F, v) if and only if ΦF̃^p(x) = 0 for all p ∈ P.
The equality ΦF̃^p(x) = 0 means that F̃^p(x) is a constant vector. Since uncommon strategies are quite desirable in the “virtual game” F̃, no state that includes an unused strategy can be a “Nash equilibrium” of F̃; thus, equality of all virtual payoffs in each population is the right definition of “Nash equilibrium” in F̃.

Theorem 5.2.11 follows immediately from perturbed stationarity (5.16) and Lemma 5.2.12 below.

Lemma 5.2.12. Let x ∈ X be a social state. Then V^p(x) = 0 if and only if ΦF̃^p(x) = 0.
Proof. Using the facts that M̃^p(π^p) = M^p(Φπ^p), that M^p = (∇v^p)⁻¹, and that the range of ∇v^p is R^{n^p}_0 (so that ∇v^p = Φ ∘ ∇v^p), we argue as follows:

V^p(x) = 0 ⇔ m^p M̃^p(F^p(x)) = x^p
  ⇔ M^p(ΦF^p(x)) = (1/m^p) x^p
  ⇔ ΦF^p(x) = ∇v^p((1/m^p) x^p)
  ⇔ ΦF̃^p(x) = 0.

Turning now to disequilibrium behavior, recall that positive correlation is defined in
terms of inner products of growth rate vectors and payoﬀ vectors:
(PC)  V_F^p(x) ≠ 0 implies that V_F^p(x)′F^p(x) > 0 for all p ∈ P.

In light of the discussion above, the natural analogue of property (PC) for perturbed best response dynamics replaces the true payoffs F^p(x) with virtual payoffs F̃^p(x). Doing so yields virtual positive correlation:

(5.17)  V^p(x) ≠ 0 implies that V^p(x)′F̃^p(x) > 0 for all p ∈ P.

To conclude this section, we verify that all perturbed best response dynamics heed this
183 Theorem 5.2.13. All perturbed best response dynamics satisfy virtual positive correlation (5.17).
Proof. Let x ∈ X be a social state at which V p (x)
(5.18) ˜
yp ≡ Mp (Fp (x)) = Mp (ΦFp (x)) 0. Then by deﬁnition, 1p
x.
mp Since vp = (Mp )−1 , we can rewrite the equality in expression (5.18) as vp ( yp ) = ΦFp (x).
Therefore, since V p (x) ∈ TXp , we ﬁnd that
˜
˜
˜
V p (x) Fp (x) = mp Mp (Fp (x)) − xp ΦFp (x)
= mp Mp (ΦFp (x)) − xp
= mp yp − 1p
x
mp 1
ΦFp (x) − vp ( mp xp ) 1
vp ( yp ) − vp ( mp xp ) > 0, where the ﬁnal inequality follows from the fact that yp
of vp . 5.3
5.3.1 1p
x
mp and from the strict convexity The Projection Dynamic
Our main payoff monotonicity condition for evolutionary dynamics is positive correlation (PC). In geometric terms, (PC) requires that at each state where population p is not
at rest, the growth rate vector V p (x) must form an acute angle with the payoﬀ vector Fp (x).
Put diﬀerently, (PC) demands that growth rate vectors not distort payoﬀ vectors to too
great a degree. Is there an evolutionary dynamic that minimizes this distortion?
If the vector ﬁeld V is to deﬁne an evolutionary dynamic, each growth rate vector V (x)
must represent a feasible direction of motion, in the sense of lying in the tangent cone
TX(x). Thus, the most direct approach to our question is to always take V (x) to be the
closest point in TX(x) to the payoﬀ vector F(x).
Definition. The projection dynamic associates each population game F ∈ F with a differential equation

(P)  ẋ = Π_{TX(x)}(F(x)),

where Π_{TX(x)} is the closest point projection of R^n onto the tangent cone TX(x).

It is easy to provide an explicit formula for (P) at social states in the interior of X. Since
at such states TX(x) = TX, the closest point projection ΠTX(x) is simply Φ, the orthogonal
projection onto the subspace TX. In fact, whenever xp ∈ int(Xp ), we have that
ẋ_i^p = (ΦF^p(x))_i = F_i^p(x) − (1/n^p) ∑_{k∈S^p} F_k^p(x).

Thus, when x^p is an interior population state, the growth rate of strategy i ∈ S^p is the difference between its payoff and the unweighted average of the payoffs to population p's strategies.
When x is a boundary state, then the projection ΠTX(x) does not reduce to an orthogonal
projection, so providing an explicit formula for (P) becomes more complicated. Exercise
5.3.1 describes the possibilities in a three-strategy game, while Exercise 5.3.2 provides an
explicit formula for the general case.
Exercise 5.3.1. Let F be a three-strategy game. Give an explicit formula for V(x) = Π_{TX(x)}(F(x))
when
(i) x ∈ int(X);
(ii) x1 = 0 but x2 , x3 > 0;
(iii) x1 = 1.
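Answers to these exercises can be spot-checked numerically. The sketch below computes Π_{TX(x)} for the simplex by brute force, using the subset characterization developed in Exercise 5.3.2 just below (enumerating candidate sets is feasible only when n is small; the test state and payoff vector are illustrative assumptions):

```python
from itertools import combinations

def project_tangent_cone(v, x, tol=1e-12):
    """Closest-point projection of v onto TX(x) for the simplex: pick the set
    S(v,x) containing support(x) plus the subset of unused strategies that
    maximizes the average of v over S(v,x); subtract that average on S(v,x)
    and set the remaining coordinates to zero."""
    n = len(v)
    support = [i for i in range(n) if x[i] > tol]
    unused = [i for i in range(n) if x[i] <= tol]
    best, best_avg = None, -float("inf")
    for r in range(len(unused) + 1):
        for extra in combinations(unused, r):
            S = support + list(extra)
            avg = sum(v[i] for i in S) / len(S)
            if avg > best_avg:
                best, best_avg = set(S), avg
    return [v[i] - best_avg if i in best else 0.0 for i in range(n)]

x = [0.0, 0.5, 0.5]
v = [10.0, 0.0, 0.0]
z = project_tangent_cone(v, x)
residual = [vi - zi for vi, zi in zip(v, z)]   # should lie in the normal cone
```

The Moreau Decomposition Theorem used in Section 5.3.3 gives handy checks: the projection sums to zero, is nonnegative on unused strategies, and is orthogonal to the residual.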
Exercise 5.3.2. Let F be an arbitrary single population game. Show that the projection Π_{TX(x)}(v) can be expressed as follows:

(Π_{TX(x)}(v))_i = v_i − (1/#S(v,x)) ∑_{j∈S(v,x)} v_j if i ∈ S(v,x), and (Π_{TX(x)}(v))_i = 0 otherwise.

Here, the set S(v,x) ⊆ S contains all strategies in support(x), along with any subset of S − support(x) that maximizes the average (1/#S(v,x)) ∑_{j∈S(v,x)} v_j.

5.3.2 Solution Trajectories

The dynamic (P) is clearly discontinuous at the boundary of X, so the existence and
uniqueness results for Lipschitz continuous diﬀerential equations do not apply. We nevertheless have the following result, which is an immediate consequence of Theorem 5.A.4
in the Appendix.
Theorem 5.3.3. Fix a Lipschitz continuous population game F. Then for each ξ ∈ X, there exists a unique Carathéodory solution {x_t}_{t≥0} to the projection dynamic (P) with x_0 = ξ. Moreover, solutions to (P) are Lipschitz continuous in their initial conditions: if {x_t}_{t≥0} and {y_t}_{t≥0} are solutions to (P), then |y_t − x_t| ≤ |y_0 − x_0| e^{Kt} for all t ≥ 0, where K is the Lipschitz coefficient for F.
the properties of Lipschitz continuous diﬀerential equations. But there are important
diﬀerences between the two types of dynamics. One diﬀerence is easy to spot: solutions
to (P) are solutions in the Carath´ odory sense, and so can have kinks at a measure zero set
e
of times. Other diﬀerences are more subtle. For instance, while the theorem ensures the
uniqueness of the forward solution trajectory from each state ξ ∈ X, backward solutions
need not be unique. It is therefore possible for distinct solution trajectories of the projection
dynamic to merge with one another.
Example 5.3.4. Figure 5.3.1 presents phase diagrams for the projection dynamic in good
RPS (w = 2, l = 1), standard RPS (w = l = 1), and bad RPS (w = 1, l = 2). In all three games,
1
most solutions spiral around the Nash equilibrium x∗ = ( 1 , 3 , 1 ) in a counterclockwise
3
3
direction.
In good RPS (Figure 5.3.1(i)), all solutions converge to the Nash equilibrium. Solutions
that begin close to a vertex hit and then travel along an edge of the simplex before heading
into the interior of the simplex forever. Thus, there is a portion of each edge that is
traversed by solutions starting from a positive measure set of initial conditions.
In standard RPS (Figure 5.3.1(ii)), all solutions enter closed orbits at a ﬁxed distance
1
from x∗ . Solutions starting at distance √6 or greater from x∗ (i.e., all solutions at least as
11
1
far from x∗ as the state (0, 2 , 2 )) quickly enter the closed orbit at distance √6 from x∗ ; other
solutions maintain their initial distance from x∗ forever.
In bad RPS (Figure 5.3.1(iii)), all solutions other than the one starting at x∗ enter the
same closed orbit. This orbit alternates between segments through the interior of X and
segments along the boundaries.
Notice that in all three cases, solution trajectories starting in the interior of the state
space can reach the boundary in ﬁnite time. This is impossible under any of our previous
dynamics, including the best response dynamic. §
˙
Exercise 5.3.5.
(i) Under what conditions is the dynamic (P) described by x = ΦF(x) at
all states x ∈ X (i.e., not just at interior states)?
(ii) Suppose that F(x) = Ax is generated by random matching in the symmetric normal
form game A. What do the conditions from part (i) reduce to in this case? (Note
˙
that under these conditions, x = ΦAx is a linear diﬀerential equation; it is therefore
possible to write down explicit formulas for the solution trajectories (see Chapter
7).)
186 (i) good RPS (ii) standard RPS (iii) bad RPS
Figure 5.3.1: The projection dynamic in three RockPaperScissors games. 187 5.3.3 Incentive Properties That solutions to the projection dynamic exist, are unique, and are continuous in their
initial conditions is not obvious. But given this fact and the manner in which the dynamic is
deﬁned, it is not surprising that the dynamic satisﬁes both of our incentive properties. The
proofs of these properties are simple applications of the Moreau Decomposition Theorem:
given any closed convex cone K ⊆ Rn and any vector π ∈ Rn , the projections ΠK (π) and
ΠK◦ (π) are the unique vectors satisfying ΠK (π) ∈ K, ΠK◦ (π) ∈ K◦ , and ΠK (π) + ΠK◦ (π) = π
(see Appendix 1.B).
Theorem 5.3.6. The projection dynamic satisﬁes Nash stationarity (NS) and positive correlation
(PC).
Proof. Using the Moreau Decomposition Theorem and the normal cone characterization
of Nash equilibrium (see Theorem 1.3.2), we ﬁnd that
ΠTX(x) (F(x)) = 0 ⇔ F(x) ∈ NX(x) ⇔ x ∈ NE(F),
establishing (NS). To prove (PC), we again use the Moreau Decomposition Theorem:
V p (x) Fp (x) = ΠTXp (xp ) (Fp (x)) ΠTXp (xp ) (Fp (x)) + ΠNXp (xp ) (Fp (x))
= ΠTXp (xp ) (F(xp ))2
≥ 0.
The inequality binds if and only if ΠTXp (xp ) (Fp (x)) = V p (x) = 0. 5.3.4 Revision Protocols and Connections with the Replicator Dynamic To this point, we have motivated the projection dynamic entirely through geometric
considerations. Can this dynamic be derived from a model of individual choice? In this
section, we describe revision protocols that generate the projection dynamic as their mean
dynamics, and use these protocols to argue that the projection dynamic models “revision
driven by insecurity”. Our analysis reveals close connections between the projection
dynamic and the replicator dynamic, connections that we will develop further in the next
chapter.
In the remainder of this section, we focus on the single population setting; the extension
to multipopulation settings is straightforward. 188 If we focus exclusively on interior states, the connections between the replicator and
projection dynamics are especially strong. In Chapter 3, we introduced three revision
protocols that generate the replicator dynamic as their mean dynamics:
(5.19) ρi j (π, x) = x j [π j − πi ]+ ; (5.20) ρi j (π, x) = x j (K − πi ); (5.21) ρi j (π, x) = x j (π j + K). The x j term in each formula reﬂects the fact that these protocols are driven by imitation. For
instance, to implement the ﬁrst protocol, an agent whose clock rings picks an opponent
from his population at random; he then imitates this opponent only if the opponents’
payoﬀ is higher, doing so with probability proportional to the payoﬀ diﬀerence. The
x j term in these protocols endows their mean dynamic with a special functional form:
the growth rate of each strategy is proportional to its prevalence in the population. For
protocol (5.19), the derivation of the mean dynamic proceeds as follows:
˙
xi = x j ρ ji (F(x), x) − xi
j∈S = ρi j (F(x), x)
j∈S x j xi [Fi (x) − F j (x)]+ − xi
j∈S = xi x j [F j (x) − Fi (x)]+
j∈S x j (Fi (x) − F j (x))
j∈S = xi Fi (x) − j x j F j (x) . To derive the projection dynamic on int(X), we use analogues of the revision protocols
1
above, replacing x j with nxi :
(5.22)
(5.23)
(5.24) [π j − πi ]+
;
nxi
K − πi
ρi j (π, x) =
;
nxi
πj + K
ρi j (π, x) =
.
nxi
ρi j (π, x) = Thus, while in each of the imitative protocols, ρi j is proportional to the mass of agents
playing the candidate strategy j, in the protocols just above, ρi j is inversely proportional to the
mass of agents playing the current strategy i. One can therefore designate the projection
189 dynamic as capturing “revision driven by insecurity”, as it describes the behavior of
agents who are especially uncomfortable choosing strategies not used by many others.
It is easy to verify that protocols (5.22), (5.23), and (5.24) all induce the projection
dynamic on the interior of the state space. In the case of protocol (5.22), the calculation
proceeds as follows:
ẋ_i = ∑_{j∈S} x_j ρ_ji(F(x), x) − x_i ∑_{j∈S} ρ_ij(F(x), x)
  = ∑_{j∈S} x_j [F_i(x) − F_j(x)]_+ / (n x_j) − x_i ∑_{j∈S} [F_j(x) − F_i(x)]_+ / (n x_i)
  = (1/n) ∑_{j∈S} (F_i(x) − F_j(x))
  = F_i(x) − (1/n) ∑_{j∈S} F_j(x).
Because of the 1/x_i term in the revision protocol, the mean dynamic above does not depend
directly on the value of xi , allowing the disappearance rates of rare strategies to stay
bounded away from zero. In other words, it is because unpopular strategies can be
abandoned quite rapidly that solutions to the projection dynamic can travel from the
interior to the boundary of the state space in a ﬁnite amount of time.
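The finite-time extinction just described is easy to see numerically. The rough sketch below (constant payoffs, an illustrative assumption, and a crude clipping at the boundary rather than the exact re-projection performed by (P)) contrasts the projection dynamic's interior formula with the replicator dynamic, under which rare strategies decay only exponentially:

```python
def euler(step, x, dt=0.001, steps=500):
    for _ in range(steps):
        x = step(x, dt)
    return x

PI = [0.0, 1.0, 2.0]   # fixed payoffs: strategy 1 is worst

def projection_step(x, dt):
    """Interior formula of the projection dynamic, x_i' = pi_i - mean(pi),
    clipped at zero (enough to exhibit finite-time absorption)."""
    mean = sum(PI) / len(PI)
    return [max(0.0, xi + dt * (p - mean)) for xi, p in zip(x, PI)]

def replicator_step(x, dt):
    """Replicator dynamic: x_i' = x_i * (pi_i - x.pi)."""
    avg = sum(xi * p for xi, p in zip(x, PI))
    return [xi + dt * xi * (p - avg) for xi, p in zip(x, PI)]

x0 = [1 / 3, 1 / 3, 1 / 3]
proj = euler(projection_step, x0)   # by t = 0.5, strategy 1 is extinct
repl = euler(replicator_step, x0)   # still strictly positive
```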
Except in cases where the projection dynamic is defined by ẋ = ΦF(x) at all states (cf. Exercise 5.3.5), the revision protocols above do not generate the projection dynamic on the boundary of X. Exercise 5.3.7 presents a revision protocol that achieves this goal, even
while maintaining connections with the replicator dynamic.

Exercise 5.3.7. Consider the following two revision protocols:

(5.25)  ρ_ij(π, x) = [π̂_i]_− · x_j[π̂_j]_+ / ∑_{k∈S} x_k[π̂_k]_+  if ∑_{k∈S} x_k[π̂_k]_+ > 0, and ρ_ij(π, x) = 0 otherwise;

(5.26)  ρ_ij(π, x) = ([π̃_i^S]_− / x_i) · [π̃_j^S]_+ / ∑_{k∈S(π,x)} [π̃_k^S]_+  if ∑_{k∈S(π,x)} [π̃_k^S]_+ > 0, and ρ_ij(π, x) = 0 otherwise.

The set S(π, x) in equation (5.26) is defined in Exercise 5.3.2, and π̃_i^S = π_i − (1/#S(π,x)) ∑_{k∈S(π,x)} π_k.
dynamic as its mean dynamic.
(ii) Provide an interpretation of protocol (5.26), and show that it generates the projec190 tion dynamic as its mean dynamic. Appendix
5.A Diﬀerential Inclusions 5.A.1 Basic Theory A correspondence (i.e., a set valued map) V : Rn ⇒ Rn deﬁnes a diﬀerential inclusion
via
(DI) ˙
x ∈ V (x). We call (DI) a good upper hemicontinuous (or good UHC) diﬀerential inclusion if V is:
(i)
(ii)
(iii)
(iv) Nonempty: V (x) ∅ for all x ∈ Rn ;
Convex valued: V (x) is convex for all x ∈ X;
Bounded: There exists a K ∈ R such that sup{ y : y ∈ V (x)} ≤ K for all x ∈ Rn ;
Upper hemicontinuous: The graph of V , gr(V ) = {(x, y) : y ∈ V (x)}, is closed. While solutions to good UHC diﬀerential inclusions are neither as easily deﬁned nor as
well behaved as those of Lipschitz continuous diﬀerential equations, we will see that
analogues of all the main properties of solutions to the latter can be established in the
present setting.
The set of feasible directions of motion under (DI) changes abruptly at discontinuities
of the correspondence V . Our solution notion for (DI) must therefore admit trajectories
with kinks: rather than requiring the relation (DI) to hold at every instant in time, it asks
only that (DI) hold at almost all times. To formalize this notion, recall that the set Z ⊆ R
has measure zero if for every ε > 0, there is a countable collection of open intervals of total
length less than ε that covers Z. A property is said to hold for almost all t ∈ [0, T] if it holds
on subset of [0, T] whose complement has measure zero. Finally, we say that a trajectory
˙
{xt }t∈[0,T] is a (Carath´odory) solution to (DI) if it is Lipschitz continuous and if xt ∈ V (xt ) at
e
˙
almost all times t ∈ [0, T]. Since {xt } is Lipschitz continuous, its derivative xt exists for
t
˙
almost all t ∈ [0, T], and the Fundamental Theorem of Calculus holds: xt − xs = s xu du.
˙
Observe that if {xt } is a Carath´ odory solution to a continuous ODE x = V (x), it is also
e
˙
a solution to the ODE in the usual sense: xt = V (xt ) at all times t ∈ [0, T]. While our new
concept does not introduce new solutions to standard diﬀerential equations, it enables us 191 to ﬁnd solutions in settings where solutions of the old sort do not exist. In particular, we
have the following existence result.
Theorem 5.A.1. Let (DI) be a good UHC diﬀerential inclusion. Then for each ξ ∈ Rn there exists
a (Carath´odory) solution {xt }t∈[0,T] to (DI) with x0 = ξ.
e
Our forward invariance result for ODEs extends to the current setting as follows:
Theorem 5.A.2. Let C ⊆ Rn be a closed convex set, and let V : C ⇒ Rn satisfy conditions (i)(iv)
above. Suppose that V (x) ⊆ TC(x) for all x ∈ C. Extend the domain of V to all of Rn by letting
V ( y) = V (ΠC ( y)) for all y ∈ Rn − C, and let this extension deﬁne the diﬀerential inclusion (DI)
on Rn . Then
(i) (DI) is a good UHC diﬀerential inclusion.
(ii) (DI) admits a forward solution {xt }t∈[0,T] from each x0 ∈ Rn .
(iii) C is forward invariant under (DI).
Our examples of best response dynamics in Section 5.1 show that diﬀerential inclusions
can admit multiple solution trajectories from a single initial condition, and hence that
solutions need not be continuous in their initial conditions. However, the set of solutions
to a diﬀerential inclusion still possesses considerable structure. To formalize this claim, let
C[0,T] denote the space of continuous trajectories through Rn over the time interval [0, T],
equipped with the maximum norm:
C[0,T] = {x : [0, T] → Rn : x is continuous}, and
x = max xt  for x ∈ C[0,T] .
t∈[0,T] Now recall two deﬁnitions from metric space topology. A set A ⊆ C[0,T] is connected if it
cannot be partitioned into two nonempty sets, each of which is disjoint from the closure
of the other. The set A is compact if every sequence of elements of A admits a subsequence
that converges to an element of A .
Now let S[0,T] (V, ξ) be the set of solutions to (DI) with initial condition ξ:
S[0,T] (V, ξ) = {x ∈ C[0,T] : x is a solution to (DI) with x0 = ξ}.
Theorem 5.A.3. Let (DI) be a good UHC diﬀerential inclusion. Then
(i) For each ξ ∈ Rn , S[0,T] (V, ξ) is connected and compact.
(ii) The correspondence S[0,T] (V, ·) : Rn → C[0,T] is upper hemicontinuous. 192 Although an initial condition ξ may be the source of many solution trajectories of
(DI), part (i) of the theorem shows that the set S[0,T] (V, ξ) of such trajectories has a simple
structure: it is connected and compact. Given any continuous criterion f : C[0,T] → R
(where continuity is deﬁned with respect to the maximum norm on C[0,T] ) and any initial
condition ξ, connectedness implies that the set of values f (S[0,T] (V, ξ)) is an interval, while
compactness implies that this set of values is compact; thus, there is a solution which
is optimal according to criterion f among those that start at ξ. Part (ii) of the theorem
provides an analogue of continuity in initial conditions. It tells us that if a sequence of
solution trajectories {x^k}_{k=1}^∞ to (DI) (with possibly differing initial conditions) converges to some trajectory x ∈ C_{[0,T]}, then x is also a solution to (DI).
consider the diﬀerential equation
(P) ˙
x = ΠTX(x) (F(x)), where ΠTX(x) is the closest point projection onto the tangent cone TX(x). This equation
˙
provides the closest approximation to the equation x = F(x) that is consistent with the
forward invariance of X.
Since the right hand side of (P) changes discontinuously at the boundary of X, the
PicardLindelof Theorem does not apply here. Indeed, solutions to (P) have diﬀerent
¨
properties than solutions of standard ODEs: for instance, solution trajectories from different initial conditions can merge after a ﬁnite amount of time has passed. But like
solutions to standard ODEs, forward solutions to the dynamic (P) exist, are unique, and
are Lipschitz continuous in their initial conditions.
Theorem 5.A.4. Let F be Lipschitz continuous. Then for each ξ ∈ X, there exists a unique
(Carath´odory) solution {xt }t≥0 to (P) with x0 = ξ. Moreover, solutions are Lipschitz continuous
e
in their initial conditions:  yt − xt  ≤  y0 − x0  eKt for all t ≥ 0, where K is the Lipschitz coeﬃcient
for F.
We now sketch a proof of this result. Deﬁne the multivalued map V : X ⇒ Rn by
V (x) =
ε>0 cl conv y∈X: y−x≤ε ΠTX( y) (F( y)) . 193 In words, V (x) is the closed convex hull of all values of ΠTX( y) (F( y)) that obtain at points y
arbitrarily close to x. It is easy to check that V is upper hemicontinuous with closed convex
values. Moreover, V (x) ∩ TX(x), the set of feasible directions of motion from x contained
in V (x), is always equal to {ΠTX(x) (F(x))}, and so in particular is nonempty. Because
V (x) ∩ TX(x) ∅, an extension of Theorem 5.A.2 called the Viability Theorem implies that
˙
for each ξ ∈ X, a solution {xt }t≥0 to x ∈ V (x) exists. But since V (x) ∩ TX(x) = {ΠTX(x) (F(x))},
this solution must also solve the original equation (P). This establishes the existence of
solutions to (P).
To prove uniqueness and continuity, let {x_t} and {y_t} be solutions to (P). Using the chain rule, the Moreau Decomposition Theorem, and the Lipschitz continuity of F, we see that

d/dt ‖y_t − x_t‖² = 2(y_t − x_t)′(Π_{TX(y_t)}(F(y_t)) − Π_{TX(x_t)}(F(x_t)))
 = 2(y_t − x_t)′(F(y_t) − F(x_t)) − 2(y_t − x_t)′(Π_{NX(y_t)}(F(y_t)) − Π_{NX(x_t)}(F(x_t)))
 = 2(y_t − x_t)′(F(y_t) − F(x_t)) + 2(x_t − y_t)′Π_{NX(y_t)}(F(y_t)) + 2(y_t − x_t)′Π_{NX(x_t)}(F(x_t))
 ≤ 2(y_t − x_t)′(F(y_t) − F(x_t))
 ≤ 2K‖y_t − x_t‖²,

and hence that

‖y_t − x_t‖² ≤ ‖y_0 − x_0‖² + 2K ∫₀ᵗ ‖y_s − x_s‖² ds.

Gronwall's inequality then implies that

‖y_t − x_t‖² ≤ ‖y_0 − x_0‖² e^{2Kt}.

Taking square roots yields the inequality stated in the theorem.

5.B The Legendre Transform

The classical Legendre transform is the key tool for proving Theorem 5.2.2, the representation theorem for the additive random utility model. A generalization of this tool,
the so-called Legendre–Fenchel transform, underlies the large deviations techniques we will introduce in Chapter 10. In this section, we introduce Legendre transforms of convex functions defined on open intervals and, more generally, on multidimensional convex domains.
5.B.1 Legendre Transforms of Functions on Open Intervals

Let C = (a, b) ⊆ R be an open interval, and let f : C → R be a strictly convex, continuously differentiable function that becomes steep at the boundaries of C:

lim_{x↓a} f′(x) = −∞ if a > −∞,  and  lim_{x↑b} f′(x) = ∞ if b < ∞.

The Legendre transform associates with the strictly convex function f a new strictly convex function f*. Because f : C → R is strictly convex, its derivative f′ : C → R is strictly increasing, and thus invertible. We denote its inverse by (f′)⁻¹ : C* → C, where the open interval C* is the range of f′. Since (f′)⁻¹ is itself strictly increasing, its integral, which we denote f* : C* → R, is strictly convex. With the right choice of the constant of integration K, the pair (C*, f*) is the Legendre transform of the pair (C, f). In summary:

f : C → R is strictly convex → f′ : C → C* is strictly increasing → (f′)⁻¹ : C* → C is strictly increasing → f* ≡ ∫(f′)⁻¹ + K is strictly convex

The cornerstone of the construction above is this observation: the derivative of f* is the inverse of the derivative of f. That is,

(5.27)  (f*)′ = (f′)⁻¹.

Or, in other words,
(5.28)  f* has slope x at y ⇔ f has slope y at x.

Surprisingly enough, we can specify the function f* described above in a simple, direct way. We define the Legendre transform (C*, f*) of the pair (C, f) by

C* = range(f′)  and  f*(y) = max_{x∈C} (xy − f(x)).

The first order condition of the program at right is y = f′(x*(y)), or, equivalently, (f′)⁻¹(y) = x*(y). On the other hand, if we differentiate f* with respect to y, the envelope theorem yields (f*)′(y) = x*(y). Putting these equations together, we see that (f*)′(y) = (f′)⁻¹(y), which is property (5.27).
Suppose that f″ exists and is positive. Then by differentiating both sides of the identity (f*)′(y) = (f′)⁻¹(y), we find this simple relationship between the second derivatives of f and f*:

(f*)″(y) = ((f′)⁻¹)′(y) = 1 / f″(x),  where x = (f′)⁻¹(y) = x*(y).

In words: to find (f*)″(y), evaluate f″ at the point x ∈ C corresponding to y ∈ C*, and then take the reciprocal.
Our initial discussion of the Legendre transform suggests that it is a duality relation: in other words, that one can generate (C, f) from (C*, f*) using the same procedure through which (C*, f*) is generated from (C, f). To prove this, we begin with the simple observations that C* is itself an open interval, and that f* is itself strictly convex and continuously differentiable. It is also easy to check that (f*)′(y) diverges whenever y approaches bd(C*); in fact, this is just the contrapositive of the corresponding statement about f′.

It is easy to verify that (C*)* = C:

(C*)* = range((f*)′) = range((f′)⁻¹) = domain(f′) = C.
To show that (f*)* = f, we begin with the definition of (f*)*:

(f*)*(x) = max_{y∈C*} (xy − f*(y)).

Taking the first order condition yields x = (f*)′(y*(x)), and hence y*(x) = ((f*)′)⁻¹(x) = f′(x). Since (f′)⁻¹(y) = x*(y), y* and x* are inverse functions. We therefore conclude that

(f*)*(x) = xy*(x) − f*(y*(x)) = xy*(x) − (x*(y*(x)) y*(x) − f(x*(y*(x)))) = f(x).

Putting this all together, we obtain our third characterization of the Legendre transform and of the implied bijection between C and C*:

(5.29)  x maximizes xy − f(x) ⇔ y maximizes xy − f*(y).

Example 5.B.1. If C = R and f(x) = eˣ, then the Legendre transform of (C, f) is (C*, f*), where C* = (0, ∞) and f*(y) = y log y − y. §
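A quick numerical sanity check of this example (ours, not the text's; the grid bounds and tolerance are arbitrary choices): evaluate the maximization defining f* directly and compare with the closed form y log y − y.

```python
import math

def legendre(f, y, lo=-10.0, hi=10.0, n=200001):
    """Numerically evaluate f*(y) = max_x (x*y - f(x)) over a grid on [lo, hi]."""
    best = -float("inf")
    step = (hi - lo) / (n - 1)
    for k in range(n):
        x = lo + k * step
        best = max(best, x * y - f(x))
    return best

# f(x) = e^x on C = R; the claimed transform is f*(y) = y log y - y on C* = (0, inf)
for y in (0.5, 1.0, 2.0, 5.0):
    assert abs(legendre(math.exp, y) - (y * math.log(y) - y)) < 1e-4
```

The maximizer x = log y lies well inside the grid for these values of y, so the grid maximum tracks the exact transform closely.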
Example 5.B.2. Suppose that c : R → R is a strictly convex cost function. (For convenience, we allow negative levels of output; the next example shows that this is without loss of generality if c′(0) = 0.) If output can be sold at price p ∈ C* = range(c′), then maximized profit equals

π(p) = max_{x∈R} (xp − c(x)).

[Figure 5.B.1: A Legendre transform. The figure plots f on C and f* on C*, with slope functions g = f′ and g⁻¹ = (f*)′.]

Thus, by definition, (C*, π) is the Legendre transform of (R, c). The duality relation tells us that if we started instead with the maximized profit function π : C* → R, we could recover the cost function c via the dual program

c(x) = max_{p∈C*} (xp − π(p)). §
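The cost/profit duality can be checked numerically (our sketch, not the text's; the quadratic cost c(x) = x²/2 is an arbitrary choice, whose maximized profit is π(p) = p²/2 in closed form):

```python
def dual_max(h, z, lo=-10.0, hi=10.0, n=100001):
    """Evaluate the Legendre-transform program max_t (z*t - h(t)) on a grid."""
    step = (hi - lo) / (n - 1)
    return max((lo + k * step) * z - h(lo + k * step) for k in range(n))

c = lambda x: 0.5 * x * x    # hypothetical strictly convex cost
pi = lambda p: 0.5 * p * p   # its maximized profit, known in closed form

for p in (-2.0, 0.5, 3.0):   # pi is the Legendre transform of c ...
    assert abs(dual_max(c, p) - pi(p)) < 1e-3
for x in (-2.0, 1.5, 3.0):   # ... and the dual program recovers c from pi
    assert abs(dual_max(pi, x) - c(x)) < 1e-3
```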
Example 5.B.3. To obtain the class of examples that are easiest to visualize, let the function g : R → R be continuous, strictly increasing, and satisfy

lim_{x→−∞} g(x) = −∞,  lim_{x→∞} g(x) = ∞,  and  g(0) = 0.

If we define f(x) = ∫₀ˣ g(s) ds on domain R, then the Legendre transform of (R, f) is (R, f*), where f*(y) = ∫₀ʸ g⁻¹(t) dt. Evidently, (f*)′ = g⁻¹ = (f′)⁻¹. Indeed, Figure 5.B.1 illustrates that x maximizes xy − f(x) if and only if y maximizes xy − f*(y), and that f* has slope x at y if and only if f has slope y at x. §

5.B.2 Legendre Transforms of Functions on Multidimensional Domains

Analogues of all of the previous results can be established in settings with multidimensional domains. Let Z be a linear subspace of R^n. We call (C, f) a Legendre pair if C ⊆ Z
is (relatively) open and convex, and f is C¹, strictly convex, and steep near bd(C), where f is steep near bd(C) if ‖∇f(x)‖ → ∞ whenever x → bd(C).

Our goal is to define a pair (C*, f*) that satisfies properties (5.30), (5.31), and (5.32):

(5.30)  ∇f* = (∇f)⁻¹.

(5.31)  f* has slope x at y ⇔ f has slope y at x.

(5.32)  x maximizes x′y − f(x) ⇔ y maximizes x′y − f*(y).

As before, we can imagine obtaining f* from f by differentiating, inverting, and then integrating, as we illustrate in the diagram below:

f : C → R is strictly convex → ∇f : C → C* is invertible → (∇f)⁻¹ : C* → C is invertible → f* ≡ ∫(∇f)⁻¹ + K is strictly convex

Since the domain of f is C ⊆ Z, the derivative of f, Df, is a map from C into L(Z, R), the set of linear forms on Z. The gradient of f at x is the unique vector ∇f(x) ∈ Z that represents Df(x); thus, ∇f is a map from C into Z.

We define the Legendre transform (C*, f*) of the pair (C, f) by

C* = range(∇f)  and  f*(y) = max_{x∈C} (x′y − f(x)).

Theorem 5.B.4 summarizes the Legendre transform's basic properties.
Theorem 5.B.4. Suppose that (C, f) is a Legendre pair. Then:
(i) (C*, f*) is a Legendre pair.
(ii) ∇f : C → C* is bijective, and (∇f)⁻¹ = ∇f*.
(iii) f(x) = max_{y∈C*} (x′y − f*(y)).
(iv) The maximizers x* and y* satisfy x*(y) = ∇f*(y) = (∇f)⁻¹(y) and y*(x) = ∇f(x) = (∇f*)⁻¹(x).

As in the one dimensional case, we can relate the second derivatives of f* to the
second derivatives of f. The second derivative D²f is a map from C to L²ₛ(Z, R), the set of symmetric bilinear forms on Z × Z. The Hessian of f at x, ∇²f(x) ∈ R^{n×n}, is the unique representation of D²f(x) by a symmetric matrix whose rows and columns are in Z. In fact, since the map z ↦ ∇²f(x)z has range Z, we can view the matrix ∇²f(x) as a linear map from Z to Z. We rely on this observation in the following result.

Corollary 5.B.5. If D²f(x) exists and is positive definite for all x ∈ C, then D²f*(y) exists and is positive definite for all y ∈ C*. In fact, ∇²f*(y) = (∇²f(x))⁻¹ as linear maps from Z to Z, where x = (∇f)⁻¹(y).
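A small numerical check of Corollary 5.B.5 (our illustration; the subspace Z = R² and the quadratic f are our own choices): for f(x) = ½x′Qx with Q positive definite we have ∇f(x) = Qx, so the maximizer in f*(y) = max_x (x′y − f(x)) is x = Q⁻¹y, giving f*(y) = ½y′Q⁻¹y with Hessian ∇²f* = Q⁻¹ = (∇²f)⁻¹.

```python
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])   # positive definite Hessian of f
Qinv = np.linalg.inv(Q)

def f_star(y):
    x = Qinv @ y                          # first order condition y = Qx
    return x @ y - 0.5 * x @ Q @ x        # value of the maximization

y = np.array([0.7, -1.2])
assert abs(f_star(y) - 0.5 * y @ Qinv @ y) < 1e-12   # closed form of f*

# central-difference Hessian of f* matches Q^{-1}
h, H, I = 1e-5, np.empty((2, 2)), np.eye(2)
for i in range(2):
    for j in range(2):
        H[i, j] = (f_star(y + h*I[i] + h*I[j]) - f_star(y + h*I[i] - h*I[j])
                   - f_star(y - h*I[i] + h*I[j]) + f_star(y - h*I[i] - h*I[j])) / (4*h*h)
assert np.allclose(H, Qinv, atol=1e-4)
```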
In the one dimensional setting, the derivative f′ is invertible because it is strictly increasing. Both of these properties also follow from the stronger assumption that f″(x) > 0 for all x ∈ C. In the multidimensional setting, it makes no sense to ask whether ∇f is strictly increasing. But there is an analogue of the second derivative condition: namely, that the Hessian ∇²f(x) is positive definite on Z × Z for all x ∈ C. According to the Global Inverse Function Theorem, any function on a convex domain that is proper (i.e., preimages of compact sets are compact) and whose Jacobian determinant is everywhere nonvanishing is invertible; thus, the fact that ∇²f(x) is always positive definite implies that (∇f)⁻¹ exists. However, this deep result is not needed to prove Theorem 5.B.4 or Corollary 5.B.5.

5.C Perturbed Optimization

5.C.1 Proof of the Representation Theorem

We now use the results on Legendre transforms from Appendix 5.B to prove Theorem 5.2.2. We defined the perturbed maximizer function M̃ using stochastic perturbations via

(5.11)  M̃_i(π) = P(i = argmax_{j∈S} (π_j + ε_j)).

Here, the random vector ε is an admissible stochastic perturbation if it has a positive density on R^n, and if this density is sufficiently smooth that M̃ is C¹. We defined M̃ using deterministic perturbations via

(5.12)  M̃(π) = argmax_{y∈int(∆)} (y′π − v(y)).

Here, the function v : int(∆) → R is an admissible deterministic perturbation if the Hessian matrix ∇²v(y) is positive definite with respect to R^n_0 × R^n_0 for all y ∈ int(∆), and if ‖∇v(y)‖ approaches infinity whenever y approaches bd(∆).
Theorem 5.2.2. Let M̃ be a perturbed maximizer function defined in terms of an admissible stochastic perturbation ε via equation (5.11). Then M̃ satisfies equation (5.12) for some admissible deterministic perturbation v. In fact, M = M̃|_{R^n_0} and ∇v are invertible, and M = (∇v)⁻¹.

Proof. The probability that alternative i is chosen when the payoff vector is π is

M̃_i(π) = P(π_i + ε_i ≥ π_j + ε_j for all j ∈ S)
 = P(ε_j ≤ π_i + ε_i − π_j for all j)
 = ∫_{−∞}^{∞} ∫_{−∞}^{π_i+x_i−π_1} ⋯ ∫_{−∞}^{π_i+x_i−π_{i−1}} ∫_{−∞}^{π_i+x_i−π_{i+1}} ⋯ ∫_{−∞}^{π_i+x_i−π_n} f(x) dx_n … dx_{i+1} dx_{i−1} … dx_1 dx_i,

where f is the joint density function of the random perturbations ε. The following lemma lists some properties of the derivative of M̃.
˜
(i) DM(π) 1 = 0.
˜
(ii) DM(π) is symmetric.
˜
(iii) DM(π) has strictly negative oﬀdiagonal elements.
˜
(iv) DM(π) is positive deﬁnite with respect to Rn × Rn .
0 0 ˜
˜
Proof. Part (i) follows from diﬀerentiating the identity M(π) = M(Φπ). To establish
parts (ii) and (iii), let i and j > i be two distinct strategies. Then using the change of
ˆ
variable x j = πi + xi − π j , we ﬁnd that
˜
∂Mi
(π) = −
∂π j πi +xi −π1 ∞ πi +xi −πi−1 πi +xi −πi+1 ···
−∞ −∞ πi +xi −π j−1 πi +xi −π j+1 ···
−∞ −∞ πi +xi −πn f (x1 , . . . , x j−1 , ···
−∞ −∞ −∞ πi + xi − π j , x j+1 , . . . , xn ) dxn . . . dx j+1 dx j−1 . . . dxi+1 dxi−1 . . . dx1 dxi
ˆ
π j +x j −π 1 ∞ =− ˆ
π j +x j −πi−1 ˆ
π j +x j −πi+1 ···
−∞ −∞ ˆ
π j +x j −π j−1 ˆ
π j +x j −π j+1 ···
−∞ −∞ ˆ
π j +x j −πn f (x1 , . . . , xi−1 , ···
−∞ −∞ −∞ ˆ
ˆ
π j + x j − πi , xi+1 , . . . , xn ) dxn . . . dx j+1 dx j−1 . . . dxi+1 dxi−1 . . . dx1 dx j
= ˜
∂M j
(π),
∂πi which implies claims (ii) and (iii). To establish claim (iv), let z ∈ Rn . Then using claims (i),
0
(ii), and (iii) in succession yields
˜
z DM(π)z =
i∈S j∈S ˜
∂Mi
(π) zi z j =
∂π j i∈S 200 ji ˜
∂Mi
(π) zi z j +
∂π j i∈S − ji ˜i 2
∂M (π) zi ∂π j =
i ji ˜
∂Mi
(π) zi z j − z2 =
i
∂π j =−
i j <i ˜
∂M i
(π) zi − z j
∂π j 2 i j<i ˜
∂Mi
(π) 2zi z j − z2 − z2
i
j
∂π j > 0. ˜
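As an illustration (ours, not the text's), the four properties in Lemma 5.C.1 can be spot-checked numerically for the logit choice function M̃_i(π) = e^{π_i/η} / Σ_j e^{π_j/η}, whose derivative matrix has the closed form DM̃(π) = (diag(M̃) − M̃M̃′)/η:

```python
import numpy as np

def logit(pi, eta=0.5):
    w = np.exp((pi - pi.max()) / eta)   # shift for numerical stability
    return w / w.sum()

def D_logit(pi, eta=0.5):
    """Derivative matrix of the logit choice function: (diag(M) - M M') / eta."""
    M = logit(pi, eta)
    return (np.diag(M) - np.outer(M, M)) / eta

pi = np.array([1.0, -0.3, 0.4])
D = D_logit(pi)
assert np.allclose(D, D.T)                      # (ii) symmetric
assert np.allclose(D @ np.ones(3), 0.0)         # (i)  DM(pi) 1 = 0
assert (D[~np.eye(3, dtype=bool)] < 0).all()    # (iii) negative off-diagonals
z = np.array([1.0, -2.0, 1.0])                  # nonzero vector with components summing to 0
assert z @ D @ z > 0                            # (iv) positive definite on R^n_0
```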
Since the derivative matrix DM̃(π) is symmetric, the vector field M̃ admits a potential function μ̃ : R^n → R (that is, a function satisfying ∇μ̃(π) = M̃(π) for all π ∈ R^n). Let μ̄ = μ̃|_{R^n_0} be the restriction of μ̃ to R^n_0. Then for all π ∈ R^n_0, ∇μ̄(π) ∈ R^n_0 is given by

∇μ̄(π) = Φ∇μ̃(π) = ΦM̃(π) = M̃(π) − (1/n)1 = M(π) − (1/n)1,

where the third equality uses the fact that M̃(π) ∈ ∆.

Since ∇²μ̄(π) = DM(π) is positive definite with respect to R^n_0 × R^n_0, μ̄ is strictly convex; thus, since bd(R^n_0) is empty, (R^n_0, μ̄) is a Legendre pair. Let the pair (C*, μ̄*) be the Legendre transform of (R^n_0, μ̄), and define the function v : (C* + (1/n)1) → R by v(y) = μ̄*(y − (1/n)1). Theorem 5.2.2 then follows immediately from Lemma 5.C.2.

Lemma 5.C.2. (i) C* + (1/n)1 = int(∆).
(ii) ∇v : int(∆) → R^n_0 is the inverse of M : R^n_0 → int(∆).
(iii) v is an admissible deterministic perturbation.
(iv) M̃(π) = argmax_{y∈int(∆)} (y′π − v(y)) for all π ∈ R^n.
Proof. (i) The set C* = range(∇μ̄) = range(M) − (1/n)1 is convex by Theorem 5.B.4(i). Moreover, if the components π_j, j ∈ J ⊂ S stay bounded while the remaining components approach infinity, then M̃_j(π) → 0 for all j ∈ J: that is, M̃(π) converges to a subface of the simplex ∆. Thus, range(M) = range(M̃) ⊆ int(∆) contains points arbitrarily close to each corner of the simplex. Since range(M) is convex, it must equal int(∆).
(ii) Let y ∈ int(∆). Using Theorem 5.B.4(ii), we find that

∇v(y) = ∇μ̄*(y − (1/n)1) = (∇μ̄)⁻¹(y − (1/n)1) = M⁻¹(y).

(iii) (C*, μ̄*) is a Legendre pair by Theorem 5.B.4(i); thus, if y → bd(∆) = bd(C*) + (1/n)1, then ‖∇v(y)‖ = ‖∇μ̄*(y − (1/n)1)‖ diverges. In addition, since ∇²μ̄(π) = DM(π) is positive definite with respect to R^n_0 × R^n_0 for all π ∈ R^n_0, Corollary 5.B.5 implies that ∇²v(y) = ∇²μ̄*(y − (1/n)1) is positive definite with respect to R^n_0 × R^n_0 for all y ∈ int(∆).

(iv) Since M̃(·) = M̃(Φ(·)), it is enough to consider π ∈ R^n_0. For such π,

argmax_{y∈int(∆)} (y′π − v(y)) = argmax_{ŷ∈int(∆)−(1/n)1} (ŷ′π − μ̄*(ŷ)) + (1/n)1
 = ∇μ̄(π) + (1/n)1
 = M̃(π),

where the second equality follows from Theorem 5.B.4(iv).
˜
optimization. The ﬁrst two of these concern the construction of the potential function µ
˜
of the perturbed maximizer function M. In fact, two constructions are available, one for
each sort of perturbation.
˜
If we deﬁne M in terms of an admissible deterministic perturbation v, then one can
verify (using the Envelope Theorem or a direct calculation) that the perturbed maximum
˜
function associated with v is a potential function for M.
˜
Observation 5.C.3. The function µ : Rn → R deﬁned by
˜
µ(π) = max y π − v( y)
y∈int(∆) ˜
is a potential function for M as deﬁned in (5.12).
˜
Alternatively, suppose we deﬁne M in terms of an admissible stochastic perturbation
ε. In this case, the expectation of the maximal perturbed payoﬀ is a potential function for
˜
M.
˜
Theorem 5.C.4. The function µ : Rn → R deﬁned by
˜
µ(π) = E max(π j + ε j )
j∈S ˜
is a potential function for M as deﬁned in (5.11).
The intuition behind this result is simple. If we marginally increase the value of πi , the
value of the maximum function max j π j + ε j goes up at a unit rate at those values of ε
202 where strategy i is optimal. The set of ε at which strategy i is optimal also changes, but the
contribution of these points to the value of the maximum function is negligible. Building
on these observations, one can show that
˜
∂µ
˜
(π) = E1{i=argmax j π j +ε j } = P i = argmax j π j + ε j = Mi (π).
∂πi
Which functions are perturbed maximizer functions? The following characterization
of the perturbed maximizer functions that can be derived from admissible deterministic
perturbations follows easily from the proof of Theorem 5.2.2.
˜
Corollary 5.C.5. A bijective function M : Rn → int(∆) can be derived from an admissible
˜
deterministic perturbation if and only if DM(π) is symmetric, positive deﬁnite on Rn , and satisﬁes
0
˜ (π)1 = 0.
DM
The counterpart of this result for stochastic perturbations is known as the WilliamsDalyZachary Theorem.
˜
Theorem 5.C.6. A bijective function M : Rn → int(∆) can be derived from an admissible
˜
stochastic perturbation if and only if DM(π) is symmetric, positive deﬁnite on Rn , and satisﬁes
0
˜
˜
DM(π)1 = 0, as well as the additional requirement that the partial derivatives of M satisfy
˜
∂k M i 0
>0
(−1)
∂πi1 · · · ∂πik
k for each k = 1, . . . , n − 1 and each set of k + 1 distinct indices {i0 , i1 , . . . , ik } ⊆ S.
To establish the necessity of the kth order derivative conditions, one repeatedly diﬀerenti˜
ates the deﬁnition of M. The ﬁrst order derivative condition is derived in this way in the
proof of Theorem 5.2.2. These two results show that deterministic perturbations generate
a strictly larger class of perturbed maximizer functions than stochastic perturbations; see
Exercise 5.2.3 for an explicit example. 5.N Notes Section 5.1: The best response dynamic was introduced by Gilboa and Matsui (1991)
and further studied by Matsui (1992), Hofbauer (1995b), and Gaunersdorfer and Hofbauer
(1995). Hofbauer (1995b) introduced the interpretation of the best response dynamic as a
differential inclusion.
Example 5.1.7 is introduced by Zeeman (1980), who shows that the interior Nash
equilibrium of this game is not an ESS but is nevertheless asymptotically stable under the
replicator dynamic. The properties of the best response dynamic described in the example
are pointed out by Hofbauer (1995b). A complete analysis of best response dynamics in
Rock–Paper–Scissors games can be found in Gaunersdorfer and Hofbauer (1995).
An approximation theorem for collections of Markov processes whose mean dynamics are differential inclusions is proved by Benaïm et al. (2005). They work in a setting in which the step size of the increments of the Markov processes shrinks over time; we conjecture
that their result is also true in the present constant step size setting. Such a result would
provide a foundation not just for the best response dynamic, but for the projection dynamic
as well.
Section 5.2: This section is based on Hofbauer and Sandholm (2002, 2007).
The perturbed best response dynamic ﬁrst appears in the work of Fudenberg and Kreps
(1993) on stochastic ﬁctitious play, while the logit dynamic ﬁrst appears in Fudenberg and
Levine (1998). For the Hausdorﬀ metric mentioned in Example 5.2.1, see Ok (2007). For
further references on logit models in game theory, see the Notes to Chapter 10.
In the experimental economics literature, perturbed equilibrium goes by the name of
quantal response equilibrium, a term introduced by McKelvey and Palfrey (1995). Some
authors use this term more narrowly to refer to logit equilibrium. For more on the use
of these concepts in the experimental literature, see Camerer (2003) and the references
therein.
The properties of the derivative matrix DM̃(π) have long been known in the discrete choice literature—see McFadden (1981) or Anderson et al. (1992). The control cost interpretation of deterministic perturbations is suggested by van Damme (1991, Chapter 4). That independent ε_i with bounded densities generate a continuously differentiable M̃ follows from standard results on convolutions; see Hewitt and Stromberg (1965, Theorem 21.33).
An intuitive discussion of the Poincaré–Hopf Theorem can be found in Hofbauer and Sigmund (1988, Section 19); see Milnor (1965) for a formal treatment. See Ritzberger
(1994), Demichelis and Germano (2000, 2002), and Demichelis and Ritzberger (2003) for
intriguing uses of topological ideas to study the global properties of evolutionary game
dynamics.
Section 5.3: Nagurney and Zhang (1996, 1997), building on work of Dupuis and Nagurney (1993), introduce the projection dynamic in the context of congestion games. Earlier,
Friedman (1991) introduced an evolutionary dynamic that is equivalent to the projection
dynamic on int(X), but that is different at states in bd(X). The presentation in this section follows Lahkar and Sandholm (2008) and Sandholm et al. (2008).
Appendix 5.A: Smirnov (2002) provides a readable introduction to the theory of diﬀerential inclusions. A more comprehensive but less readable reference is Aubin and Cellina
(1984).
The existence of solutions to diﬀerential inclusions deﬁned by projections of multivalued maps was proved by Henry (1973); the approach described here follows Aubin and
Cellina (1984, Section 5.6). Restricting attention to diﬀerential equations deﬁned by projections of Lipschitz continuous functions allows one to establish uniqueness and continuity
results, a point noted, e.g., by Dupuis and Nagurney (1993).
Appendix 5.B: Formal treatments of the Legendre transform can be found in Rockafellar
(1970) and Hiriart-Urruty and Lemaréchal (2001). Example 5.B.3 is borrowed from Roberts and Varberg (1973, Section 15). For the Global Inverse Function Theorem, see Gordon
(1972).
Appendix 5.C: Theorem 5.2.2 is due to Hofbauer and Sandholm (2002). For proofs of
Theorem 5.C.4 and 5.C.6, see McFadden (1981) or Anderson et al. (1992). The latter source
is a good general reference on discrete choice theory.

Part III
Convergence and Nonconvergence

CHAPTER SIX
Global Convergence of Evolutionary Dynamics

6.0 Introduction

In the preceding chapters, we introduced a variety of classes of evolutionary dynamics
and exhibited their basic properties. Most conspicuously, we established links between
the rest points of each dynamic and the Nash equilibria of the underlying game, links that
are valid regardless of the nature of the game at hand. This connection is expressed in
its strongest form by dynamics satisfying Nash stationarity (NS), under which rest points
and Nash equilibria coincide.
Still, once one speciﬁes an explicitly dynamic model of behavior, the most natural
approach to prediction is not to focus immediately on equilibrium points, but to determine
where the dynamic leads when set in motion from various initial conditions. If equilibrium
occurs as the limiting state of this adjustment process, we can feel some conﬁdence in
predicting equilibrium play. If instead our dynamics lead to limit cycles or other more
complicated limit sets, then these sets rather than the unstable rest points provide superior
predictions of behavior.
In this chapter, we seek conditions on games and dynamics under which behavior
converges to equilibrium from all or nearly all initial population states. We therefore reconsider the three classes of population games introduced in Chapter 2—potential games,
stable games, and supermodular games—and derive conditions on evolutionary dynamics that ensure convergence in each class of games. We also establish convergence results
for dominance solvable games, but we shall see in Chapter 8 that these results are not
robust to small changes in the dynamics for which they hold.
The most common method for proving global convergence in a dynamical system is by constructing a strict Lyapunov function: a scalar-valued function that the dynamic ascends
whenever it is not at rest. When the underlying game is a potential game, the game’s
potential function provides a natural candidate Lyapunov function for evolutionary dynamics. We verify in Section 6.1 that potential functions serve as Lyapunov functions
under any evolutionary dynamic that satisﬁes our basic monotonicity condition, positive
correlation (PC). We then use this fact to prove global convergence in potential games
under all of the evolutionary dynamics studied in Chapters 4 and 5.
Unlike potential games, stable games do not come equipped with a scalarvalued
function that is an obvious candidate Lyapunov function for evolutionary dynamics. But
the structure of payoﬀs in these games—already reﬂected in the fact that their sets of Nash
equilibria are convex—makes it natural to expect convergence results to hold.
We develop this intuition in Section 6.2, where we present approaches to constructing Lyapunov functions for stable games. We find that distance-like functions serve as
Lyapunov functions for the replicator and projection dynamics, allowing us to establish
global convergence results for these dynamics in strictly stable games. For target dynamics, including excess payoﬀ, best response, and perturbed best response dynamics,
we ﬁnd that integrability of the revision protocol is the key to establishing convergence
results. We argue in Section 6.2.2 that in the presence of payoﬀ monotonicity, integrability
of the protocol ensures that on average, the vector of motion deviates from the vector of
payoﬀs in the direction of the equilibrium; given the geometry of equilibrium in stable
games, this is enough to ensure convergence to equilibrium. All told, we prove global
convergence results for all six of our fundamental dynamics.
In Section 6.3, we turn our attention to supermodular games. As these games' essential
property is the monotonicity of their best response correspondences, it is not surprising
that our convergence results address dynamics that respect this monotone structure. We
begin by considering the best response dynamic, using elementary methods to prove a
convergence result for supermodular games generated by twoplayer normal form games
that satisfy a “diminishing returns” condition. To obtain convergence results that demand
less structure of the game, we appeal to methods from the theory of cooperative diﬀerential
equations: these are smooth diﬀerential equations under which increasing the value of one
component of the state variable increases the growth rates of all other components. The
smoothness requirement precludes applying these methods to the best response dynamic,
but we are able to use them to study perturbed best response dynamics. We prove that
after a natural change of coordinates, perturbed best response functions generated by
stochastic perturbations of payoﬀs are monotone. Ultimately, this allows us to show that
the corresponding perturbed best response dynamics converge to perturbed equilibrium from almost all initial conditions.
In Section 6.4, we study evolution in games with strictly dominated strategies. We
ﬁnd that under the best response dynamic and under imitative dynamics, strictly dominated strategies are eliminated; so are strategies ruled out by iterative removal of strictly
dominated strategies. It follows that in games that are dominance solvable—that is, in
games where this iterative procedure leaves only one strategy for each population—the
best response dynamic and all imitative dynamics converge to the dominance solution.
We should emphasize, however, that these elimination results are not robust: we will
see in Chapter 8 that under many small modiﬁcations of the dynamics covered by our
elimination results, strictly dominated strategies can survive.
The deﬁnitions and tools from dynamical systems theory needed for our analyses are
treated in the Appendix. Appendix 6.A introduces notions of stability, limit behavior, and
recurrence for deterministic dynamics. Appendix 6.B presents stability and convergence
results for dynamics that admit Lyapunov functions. Finally, Appendix 6.C introduces
the theory of cooperative differential equations and monotone dynamical systems.

6.1 Potential Games

6.1.1 Potential Functions as Lyapunov Functions

In a potential game F : X → R^n, all information about incentives is captured by the potential function f : X → R, in that

(6.1)  ∇f(x) = ΦF(x) for all x ∈ X.

In Chapter 2, we characterized Nash equilibria of F as those states that satisfy the Kuhn–Tucker first order conditions for maximizing f on X. We now take a further step, using
the potential function to describe disequilibrium adjustment. In Lemma 6.1.1, we show that any evolutionary dynamic that satisfies positive correlation,

(PC)  V_F^p(x) ≠ 0 implies that V_F^p(x)′F^p(x) > 0,

must ascend the potential function f.

To state this result, we introduce the notion of a Lyapunov function. The C¹ function L : X → R is an (increasing) strict Lyapunov function for the differential equation ẋ = V_F(x) if L̇(x) ≡ ∇L(x)′V_F(x) ≥ 0 for all x ∈ X, with equality only at rest points of V_F.

Lemma 6.1.1. Let F be a potential game with potential function f. Suppose the evolutionary dynamic ẋ = V_F(x) satisfies positive correlation (PC). Then f is a strict Lyapunov function for V_F.

Proof. Follows immediately from condition (PC) and the fact that

ḟ(x) = ∇f(x)′ẋ = (ΦF(x))′V_F(x) = Σ_{p∈P} F^p(x)′V_F^p(x).

The initial equality in the expression above follows from an application of the chain rule
(Section 2.A.4) to the composition t → xt → f (xt ). Versions of this argument will be used
often in the proofs of the results to come.
If a dynamic admits a strict Lyapunov function, all solution trajectories of the dynamic
converge to equilibrium. Combining this fact with Lemma 6.1.1 allows us to prove a
global convergence result for potential games. To state this result, we brieﬂy present
some deﬁnitions concerning limit behavior of deterministic trajectories; for more on these
notions and on Lyapunov functions, see Appendices 6.A and 6.B.
The ω-limit of trajectory {x_t}_{t≥0} is the set of all points that the trajectory approaches arbitrarily closely infinitely often:

ω({x_t}) = { y ∈ X : there exists {t_k}_{k=1}^∞ with lim_{k→∞} t_k = ∞ such that lim_{k→∞} x_{t_k} = y }.

For dynamics ẋ = V_F(x) that admit a unique solution trajectory from each initial condition, we write ω(ξ) for the ω-limit set of the trajectory starting from state ξ, and we let

Ω(V_F) = ⋃_{ξ∈X} ω(ξ)

denote the set of all ω-limit points of all solution trajectories. The set Ω(V_F) (or its closure, when Ω(V_F) is not closed) provides a basic notion of recurrence for deterministic dynamics.
Theorem 6.1.2. Let F be a potential game, and let ẋ = V_F(x) be an evolutionary dynamic for F that admits a unique solution from each initial condition and that satisfies positive correlation (PC). Then Ω(V_F) = RP(V_F). In particular,
(i) If V_F is an imitative dynamic, then Ω(V_F) = RE(F), the set of restricted equilibria of F.
(ii) If V_F is an excess payoff dynamic, a pairwise comparison dynamic, or the projection dynamic, then Ω(V_F) = NE(F).

Proof. Immediate from Lemma 6.1.1, Theorem 6.B.4, and the characterizations of rest points from Chapters 4 and 5.

Example 6.1.3. 123 Coordination. Figure 6.1.1 presents phase diagrams for the six fundamental dynamics in 123 Coordination:

(6.2)  F(x) = Ax, where A = diag(1, 2, 3), so that F(x) = (x_1, 2x_2, 3x_3)′.

In the first five cases, the phase diagram is plotted atop the potential function

(6.3)  f(x) = ½((x_1)² + 2(x_2)² + 3(x_3)²).

Of these, the first four cases (replicator, projection, BNN, Smith) are covered by Theorem 6.1.2; evidently, every solution trajectory in diagrams (i)–(iv) ascends the potential
function, ultimately converging to one of the seven Nash equilibria of F.
It is worth noting that these equilibria are not all locally stable. The interior equilibrium
is a source, with all nearby solution trajectories moving away from the equilibrium. The
three equilibria with twostrategy supports are saddles: for each of these, there is one
solution trajectory that converges to the equilibrium, while all other nearby trajectories
eventually move away from the equilibrium. Only the three remaining equilibria—the
three pure equilibria—are locally stable. We defer further discussion of local stability to
Chapter 7, which is devoted to this topic. §
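A quick numerical companion to this example (our sketch; the step size, horizon, and initial state are arbitrary choices): Euler-discretizing the replicator dynamic in 123 Coordination shows the potential (6.3) ascending along the trajectory, as Theorem 6.1.2 predicts, with convergence to a pure equilibrium from a generic interior state.

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0])                  # 123 Coordination, payoffs (6.2)
F = lambda x: A @ x
f = lambda x: 0.5 * (x[0]**2 + 2 * x[1]**2 + 3 * x[2]**2)   # potential (6.3)

def replicator_step(x, h=1e-3):
    """One Euler step of the replicator dynamic x_i' = x_i (F_i(x) - x'F(x))."""
    pay = F(x)
    return x + h * x * (pay - x @ pay)

x = np.array([0.20, 0.50, 0.30])              # arbitrary interior initial state
values = [f(x)]
for _ in range(50_000):
    x = replicator_step(x)
    values.append(f(x))

assert all(b >= a - 1e-12 for a, b in zip(values, values[1:]))  # potential ascends
assert abs(x.sum() - 1.0) < 1e-8              # the simplex is forward invariant
assert x.max() > 0.99                         # convergence to a pure equilibrium
```

Here the ascent is exact even in discrete time because f is quadratic with positive semidefinite Hessian, so both terms of the Euler expansion of f are nonnegative.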
Our convergence results for best response and perturbed best response dynamics require additional work. In the case of the best response dynamic

(BR)  ẋ^p ∈ m^p M^p(F^p(x)) − x^p,  where M^p(π^p) = argmax_{i∈S^p} π_i,

we must account for the fact that the dynamic is multivalued.
Theorem 6.1.4. Let F be a potential game with potential function f, and let ẋ ∈ V_F(x) be the best response dynamic for F. Then

∂f/∂z (x) = Σ_{p∈P} m^p max_{j∈S^p} F̂_j^p(x)  for all z ∈ V_F(x) and x ∈ X.

Therefore, every solution trajectory {x_t} of V_F satisfies ω({x_t}) ⊆ NE(F).
[Figure 6.1.1: Six basic dynamics in 123 Coordination. Panels: (i) replicator, (ii) projection, (iii) BNN, (iv) Smith, (v) best response, (vi) logit(.5). The contour plots are the potential function in (i)–(v), and the logit potential function in (vi).]

Proof. Recall from Theorem 5.1.8 that the best response dynamic satisfies the following refinement of condition (PC):
p p ˆ
(zp ) Fp (x) = mp max F j (x) for all zp ∈ VF (x).
p
This condition immediately implies that

\frac{\partial f}{\partial z}(x) \equiv \nabla f(x)'z = (\Phi F(x))'z = \sum_{p \in \mathcal{P}} F^p(x)'z^p = \sum_{p \in \mathcal{P}} m^p \max_{j \in S^p} \hat{F}^p_j(x).

Thus \partial f/\partial z\,(x) \ge 0 for all x \in X and z \in V_F(x), and Lemma 4.5.4 implies that equality holds if and only if x \in NE(F). The convergence result now follows from Theorem 6.B.5.
Example 6.1.5. 123 Coordination revisited. Figure 6.1.1(v) presents the phase diagram of the
best response dynamic in 123 Coordination (6.2), again atop the potential function (6.3).
As in Example 5.1.5, there are multiple solutions starting from each initial condition on
the Y-shaped set of boundaries between the best response regions. It is not hard to verify
that each of these solutions converges to a Nash equilibrium. §
Finally, we turn to perturbed best response dynamics, considering the (more general)
deﬁnition of these dynamics via admissible deterministic perturbations vp : int(∆p ) → R.
\dot{x}^p = m^p \tilde{M}^p(F^p(x)) - x^p, \quad\text{where } \tilde{M}^p(\pi^p) = \operatorname*{argmax}_{y^p \in \operatorname{int}(\Delta^p)} \left( (y^p)'\pi^p - v^p(y^p) \right).

While these dynamics do not satisfy positive correlation (PC), Theorem 5.2.13 showed
that these dynamics do satisfy a perturbed analogue called virtual positive correlation:
V^p(x) \ne 0 \text{ implies that } V^p(x)'\tilde{F}^p(x) > 0 \text{ for all } p \in \mathcal{P},

where the virtual payoffs \tilde{F} : \operatorname{int}(X) \to \mathbb{R}^n for the pair (F, v) are defined by

\tilde{F}^p(x) = F^p(x) - \nabla v^p\!\left(\tfrac{1}{m^p} x^p\right).

Accordingly, the Lyapunov function for a perturbed best response dynamic is not the
potential function f , but a perturbed version thereof.
Theorem 6.1.6. Let F be a potential game with potential function f, and let \dot{x} = V_{F,v}(x) be the perturbed best response dynamic for F generated by the admissible deterministic perturbations v = (v^1, \ldots, v^p). Define the perturbed potential function \tilde{f} : \operatorname{int}(X) \to \mathbb{R} by

(6.4)  \tilde{f}(x) = f(x) - \sum_{p \in \mathcal{P}} m^p v^p\!\left(\tfrac{1}{m^p} x^p\right).

Then \tilde{f} is a strict Lyapunov function for V_{F,v}, and so \Omega(V_{F,v}) = PE(F, v).
Proof. That f˜ is a strict Lyapunov function for VF,v follows immediately from virtual
positive correlation and the fact that
\dot{\tilde{f}}(x) \equiv \nabla\tilde{f}(x)'\dot{x} = \sum_{p \in \mathcal{P}} \left( F^p(x) - \nabla v^p\!\left(\tfrac{1}{m^p} x^p\right) \right)' V^p_{F,v}(x) = \sum_{p \in \mathcal{P}} \tilde{F}^p(x)' V^p_{F,v}(x).

Since PE(F, v) \equiv RP(V_{F,v}), that \Omega(V_{F,v}) = PE(F, v) follows from Theorem 6.B.4.
Example 6.1.7. 123 Coordination re-revisited. Figure 6.1.1(vi) presents the phase diagram for the logit(.5) dynamic in 123 Coordination (6.2). Here the contour plot is the logit potential function

\tilde{f}(x) = \tfrac{1}{2}\left((x_1)^2 + 2(x_2)^2 + 3(x_3)^2\right) - .5 \sum_{i=1}^{3} x_i \log x_i.

Because the noise level is rather high, this phase diagram looks very different from the others—in particular, it includes only three rest points (two stable and one unstable) rather than seven. Nevertheless, every solution trajectory ascends the relevant Lyapunov function \tilde{f}, ultimately converging to a perturbed equilibrium. §

6.1.2 Gradient Systems for Potential Games

Lemma 6.1.1 tells us that in potential games, any dynamic that satisfies condition (PC)
must ascend the potential function f . We now turn to a more reﬁned question: is there an
evolutionary dynamic that ascends f in the fastest possible way?
A ﬁrst answer to this question is suggested by Figure 6.1.1(ii): in 123 Coordination,
solution trajectories of the projection dynamic,
(P)  \dot{x} = \Pi_{TX(x)}(F(x)),

cross the level sets of the potential function orthogonally. In fact, we have

Observation 6.1.8. Let F : X \to \mathbb{R}^n be a potential game with potential function f : X \to \mathbb{R}. On int(X), the projection dynamic (P) is the gradient system for f:

(6.5)  \dot{x} = \nabla f(x) \quad\text{on } \operatorname{int}(X).

Surprisingly, there is an alternative answer to our question: it turns out that the
replicator dynamic,
(R)  \dot{x}^p_i = x^p_i \hat{F}^p_i(x)

also defines a gradient system for the potential function f; however, this is only true
after we apply a clever change of variable. In addition to its inherent interest, this fact
demonstrates a close connection between the replicator and projection dynamics; another
such connection will be made in Section 6.2.1 below.
We restrict our analysis to the single population case. Define the set \mathcal{X} = \{x' \in \mathbb{R}^n_+ : \sum_{i \in S} (x'_i)^2 = 4\} to be the portion of the radius 2 sphere lying in the positive orthant. Our change of variable is given by the Akin transformation H : \operatorname{int}(\mathbb{R}^n_+) \to \operatorname{int}(\mathbb{R}^n_+), where H_i(x) = 2\sqrt{x_i}. Evidently, H is a diffeomorphism that maps the simplex X onto the set \mathcal{X}. The transformation makes changes in component x_i look large when x_i itself is small.
Theorem 6.1.9 tells us that the replicator dynamic is a gradient dynamic on int(X)
after a change of variable that makes changes in the use of rare strategies look important
relative to changes in the use of common ones. Intuitively, this reweighting accounts for
the fact that under imitative dynamics, changes in the use of rare strategies are necessarily
slow.
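Theorem 6.1.9 below can be checked numerically at a single state. The sketch (assuming numpy) uses 123 Coordination, for which the natural C^1 extension of f satisfies \nabla\tilde{f} = F exactly, so the term c(x)\mathbf{1} vanishes:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])          # 123 Coordination: F(x) = a * x

x = np.array([0.5, 0.3, 0.2])          # an interior simplex state
Fx = a * x

# replicator direction R(x) = (diag(x) - x x')F(x)
R = x * Fx - x * (x @ Fx)

# transport to the sphere via DH(x) = diag(x^(-1/2))
transported = R / np.sqrt(x)

# gradient system on the sphere: phi(x') = (1/32) * sum_i a_i (x'_i)^4,
# so grad phi_i = (1/8) a_i (x'_i)^3, projected onto the tangent space
xp = 2 * np.sqrt(x)                    # x' = H(x), |x'| = 2
grad_phi = a * xp**3 / 8
tangent_proj = grad_phi - xp * (xp @ grad_phi) / 4   # (I - x'(x')'/4) grad phi

print(np.allclose(transported, tangent_proj))        # the two directions coincide
```

The agreement of the two vectors is exactly the content of the computation closing the proof below.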
Theorem 6.1.9. Let F : X \to \mathbb{R}^n be a potential game with potential function f : X \to \mathbb{R}. Suppose we transport the replicator dynamic for F from int(X) to int(\mathcal{X}) using the Akin transformation H. Then the resulting dynamic is the gradient dynamic for the transported potential function \varphi = f \circ H^{-1}.
Proof. We prove Theorem 6.1.9 in two steps: ﬁrst, we derive the transported version of
the replicator dynamic; then we derive the gradient system for the transported version of
the potential function, and show that it is the same dynamic on \mathcal{X}. The following notation will simplify our calculations: when y \in \mathbb{R}^n_+ and a \in \mathbb{R}, we let [y^a] \in \mathbb{R}^n_+ be the vector whose ith component is (y_i)^a.

We can express the replicator dynamic on X as

\dot{x} = R(x) = \operatorname{diag}(x)\,(F(x) - \mathbf{1}\,x'F(x)) = \left(\operatorname{diag}(x) - xx'\right)F(x).

The transported version of this dynamic can be computed as
\dot{x}' = R'(x') = DH(H^{-1}(x'))\,R(H^{-1}(x')).

In words: given a state x' \in \mathcal{X}, we first find the corresponding state x = H^{-1}(x') \in X and direction of motion R(x). Since R(x) represents a displacement from state x, we transport it to \mathcal{X} by premultiplying it by DH(x), the derivative of H evaluated at x.

Since x' = H(x) = 2[x^{1/2}], the derivative of H at x is given by DH(x) = \operatorname{diag}([x^{-1/2}]). Using this fact, we derive a primitive expression for R'(x') in terms of x = H^{-1}(x') = \tfrac{1}{4}[(x')^2]:

(6.6)  \dot{x}' = R'(x') = DH(x)R(x) = \operatorname{diag}([x^{-1/2}])\left(\operatorname{diag}(x) - xx'\right)F(x) = \left(\operatorname{diag}([x^{1/2}]) - [x^{1/2}]x'\right)F(x).

Now, we derive the gradient system on \mathcal{X} generated by \varphi = f \circ H^{-1}. To compute \nabla\varphi(x'), we need to define an extension of \varphi to all of \mathbb{R}^n_+, compute its gradient, and then project the result onto the tangent space of \mathcal{X} at x'. The easiest way to proceed is to let \tilde{f} : \operatorname{int}(\mathbb{R}^n_+) \to \mathbb{R} be an arbitrary C^1 extension of f, and to define the extension \tilde\varphi : \operatorname{int}(\mathbb{R}^n_+) \to \mathbb{R} by \tilde\varphi = \tilde{f} \circ H^{-1}.

Since \mathcal{X} is a portion of a sphere centered at the origin, the tangent space of \mathcal{X} at x' is the subspace T\mathcal{X}(x') = \{z \in \mathbb{R}^n : (x')'z = 0\}. The orthogonal projection onto this set is represented by the n \times n matrix

P_{T\mathcal{X}(x')} = I - \frac{1}{(x')'x'}\,x'(x')' = I - \tfrac{1}{4}\,x'(x')' = I - [x^{1/2}][x^{1/2}]'.

Also, since \Phi\nabla\tilde{f}(x) = \nabla f(x) = \Phi F(x) by construction, it follows that \nabla\tilde{f}(x) = F(x) + c(x)\mathbf{1} for some scalar-valued function c : X \to \mathbb{R}.

Using the chain rule (Section 2.A.4), we compute that

\nabla\tilde\varphi(x')' = D(\tilde{f} \circ H^{-1})(x') = D\tilde{f}(H^{-1}(x'))\,DH^{-1}(x'), \quad\text{so that}\quad \nabla\tilde\varphi(x') = DH^{-1}(x')'\,\nabla\tilde{f}(x),

while applying the chain rule to the identity H^{-1}(H(x)) \equiv x and then rearranging yields DH^{-1}(x') = DH(x)^{-1}.

Marshaling these observations, we find that the gradient system on \mathcal{X} generated by \varphi is

\dot{x}' = \nabla\varphi(x')
  = P_{T\mathcal{X}(x')}\,\nabla\tilde\varphi(x')
  = P_{T\mathcal{X}(x')}\,DH^{-1}(x')'\,\nabla\tilde{f}(x)
  = P_{T\mathcal{X}(x')}\,(DH(x)^{-1})'\,(F(x) + c(x)\mathbf{1})
  = \left(I - [x^{1/2}][x^{1/2}]'\right)\operatorname{diag}([x^{1/2}])\,(F(x) + c(x)\mathbf{1})
  = \left(\operatorname{diag}([x^{1/2}]) - [x^{1/2}]x'\right)(F(x) + c(x)\mathbf{1})
  = \left(\operatorname{diag}([x^{1/2}]) - [x^{1/2}]x'\right)F(x).

This agrees with equation (6.6), completing the proof of the theorem.
Example 6.1.10. 123 Coordination one last time. Figure 6.1.2 illustrates Theorem 6.1.9 by presenting phase diagrams of the transported replicator dynamic \dot{x}' = R'(x') for 123 Coordination (cf. Example 6.1.3). These phase diagrams on \mathcal{X} are drawn atop contour plots of the transported potential function

\varphi(x') = (f \circ H^{-1})(x') = \tfrac{1}{32}\left((x'_1)^4 + 2(x'_2)^4 + 3(x'_3)^4\right).

According to Theorem 6.1.9, the solution trajectories of R' should cross the level sets of \varphi orthogonally.
Looking at Figure 6.1.2, we ﬁnd that the crossings look orthogonal at the center of the
ﬁgure, but not by the boundaries. This is an artifact of our drawing a portion of the sphere
in R3 by projecting it orthogonally onto a sheet of paper. (For exactly the same reason,
latitude and longitude lines in an orthographic projection of the Earth only appear to cross
at right angles in the center of the projection, not on the left and right sides.) To check
whether the crossings near a given state x ∈ X are truly orthogonal, we must minimize
the distortion of angles near x by making x the origin of the projection—that is, the point
where the sphere touches the sheet of paper. In the phase diagrams in Figure 6.1.2, we
mark the projection origins with pink dots; evidently, the crossings are orthogonal near
these points. §

[Figure 6.1.2: The phase diagram of the transported replicator dynamic \dot{x}' = R'(x') for a coordination game, with (i) origin = H(1/3, 1/3, 1/3) and (ii) origin = H(1/7, 1/7, 5/7). The pink dots represent the positions of the projection origins.]

6.2 Stable Games

Recall that the population game F is stable if it satisfies

(6.7)  (y - x)'(F(y) - F(x)) \le 0 \quad\text{for all } x, y \in X.

When F is C^1, this condition is equivalent to self-defeating externalities:

(6.8)  z'\,DF(x)\,z \le 0 \quad\text{for all } z \in TX \text{ and } x \in X.

The set of Nash equilibria of a stable game is convex, and most often a singleton.
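For a matrix game F(x) = Ax, condition (6.8) reduces to negative semidefiniteness of A's symmetric part on the subspace TX of zero-sum vectors. A minimal numerical check (a sketch assuming numpy; the good RPS and standard RPS matrices appear in Examples 6.2.6 and 6.2.7 below, the coordination matrix in (6.2)):

```python
import numpy as np

def is_stable_matrix_game(A, tol=1e-10):
    """Check (6.8) for F(x) = Ax: z'Az <= 0 for all z with sum(z) = 0."""
    n = A.shape[0]
    # orthonormal basis of TX = {z : 1'z = 0}: first n-1 columns of Q
    Q, _ = np.linalg.qr(np.eye(n) - np.ones((n, n)) / n)
    B = Q[:, :n - 1]
    S = B.T @ ((A + A.T) / 2) @ B      # symmetric part restricted to TX
    return bool(np.all(np.linalg.eigvalsh(S) <= tol))

good_rps = np.array([[0., -2., 3.], [3., 0., -2.], [-2., 3., 0.]])      # w=3, l=2
standard_rps = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
coordination = np.diag([1., 2., 3.])

print(is_stable_matrix_game(good_rps))       # strictly stable
print(is_stable_matrix_game(standard_rps))   # null stable
print(is_stable_matrix_game(coordination))   # not stable
```

The restriction to TX is essential: the coordination game fails the test, while both RPS variants pass it.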
In general, uniqueness of equilibrium is not enough to ensure convergence of evolutionary dynamics. As we shall see in Chapter 8, there are many simple examples of
games with a unique Nash equilibrium in which dynamics fail to converge. Nevertheless,
we show in this section that under many evolutionary dynamics, the structure provided
by self-defeating externalities is enough to ensure convergence. While fewer dynamics
converge here than in potential games, convergence does obtain under all six fundamental
dynamics.
Our convergence proofs for stable games again rely on the construction of Lyapunov
functions, but here we will need to construct a distinct Lyapunov function for each dynamic we consider. It will be natural to write these Lyapunov functions so that their
values fall over time: thus, we say that a C^1 function L is a (decreasing) strict Lyapunov function for the dynamic \dot{x} = V_F(x) if \dot{L}(x) \le 0 for all x \in X, with equality only at rest points
of VF . Apart from those for perturbed best response dynamics, the Lyapunov functions
introduced below are also gap functions: they are continuous and nonnegative, with zeros
precisely at the Nash equilibria of the underlying game F.

6.2.1 The Projection and Replicator Dynamics in Strictly Stable Games

To obtain convergence results for the projection and replicator dynamics, we must
restrict attention to strictly stable games: that is, games in which condition (6.7) holds
strictly for all x, y ∈ X. The Lyapunov functions for these dynamics are based on explicit
notions of “distance” from the game’s unique Nash equilibrium x∗.
Theorem 6.2.1 shows that under the projection dynamic, x∗ is globally asymptotically
stable: all solution trajectories converge to x∗ , and solutions that start near x∗ never move
too far away from x∗ (see Appendix 6.A.2).
Theorem 6.2.1. Let F be a strictly stable game with unique Nash equilibrium x^*, and let \dot{x} = V_F(x) be the projection dynamic for F. Let the function E_{x^*} : X \to \mathbb{R}_+, defined by

E_{x^*}(x) = |x - x^*|^2,

represent squared Euclidean distance from x^*. Then E_{x^*} is a strict Lyapunov function for V_F, and so x^* is globally asymptotically stable under V_F.
Proof. Since F is a strictly stable game, its unique Nash equilibrium x^* is also its unique GESS:

(x - x^*)'F(x) < 0 \quad\text{for all } x \in X - \{x^*\}.

This fact and the Moreau Decomposition Theorem imply that

\dot{E}_{x^*}(x) = \nabla E_{x^*}(x)'\dot{x} = 2(x - x^*)'\,\Pi_{TX(x)}(F(x)) = 2(x - x^*)'F(x) + 2(x^* - x)'\,\Pi_{NX(x)}(F(x)) \le 2(x^* - x)'\,\Pi_{NX(x)}(F(x)) \le 0,

where the penultimate inequality is strict whenever x \ne x^*. Global asymptotic stability of x^* then follows from Corollary 6.B.7.

Exercise 6.2.2. Let F be a stable game, and let x^* be a Nash equilibrium of F.
(i) Show that x∗ is Lyapunov stable under (P).
(ii) Suppose that F is a null stable game (i.e., that ( y − x) (F( y) − F(x)) = 0 for all x, y ∈ X).
Show that if x∗ ∈ int(X), then Ex∗ deﬁnes a constant of motion for (P) on int(X): the
value of Ex∗ is constant along interior portions of solution trajectories of (P).
Exercise 6.2.3. Show that if F is a C1 stable game, then the squared speed of motion
L(x) = |\Phi F(x)|^2 is a Lyapunov function for (P) on int(X). Show that if F is null stable,
then L deﬁnes a constant of motion for (P) on int(X). (Notice that unlike that of Ex∗ , the
deﬁnition of L does not directly incorporate the Nash equilibrium x∗ .)
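On int(X) the projection dynamic is \dot{x} = \Phi F(x) with \Phi = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}', so Theorem 6.2.1 can be checked by direct simulation (a sketch assuming numpy; the game is good RPS, which is strictly stable with x^* = (1/3, 1/3, 1/3), and the step size is an arbitrary choice):

```python
import numpy as np

A = np.array([[0., -2., 3.], [3., 0., -2.], [-2., 3., 0.]])  # good RPS
x_star = np.ones(3) / 3
Phi = np.eye(3) - np.ones((3, 3)) / 3    # orthogonal projection onto TX

x = np.array([0.6, 0.25, 0.15])
dists = [np.sum((x - x_star)**2)]
dt = 1e-3
for _ in range(30000):
    x = x + dt * Phi @ (A @ x)           # interior projection dynamic (P)
    dists.append(np.sum((x - x_star)**2))

print(dists[0], dists[-1])               # squared distance falls toward 0
```

Here E_{x^*} decreases at every step and the trajectory spirals into the equilibrium, matching the global asymptotic stability asserted by the theorem.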
Under the replicator dynamic (R), as under any imitative dynamic, strategies that are
initially unused remain unused for all time. Therefore, if state x places no mass on a
strategy in the support of the Nash equilibrium x∗ , the solution to (R) starting from x
cannot converge to x∗ . Thus, in stating our convergence result for the replicator dynamic,
we need to be careful to specify the set of states from which convergence to equilibrium
occurs.
With this motivation, let S^p(x^p) = \{i \in S^p : x^p_i > 0\} denote the support of x^p. Then X^p_{y^p} = \{x^p \in X^p : S^p(y^p) \subseteq S^p(x^p)\} is the set of states in X^p whose supports contain the support of y^p, and X_y = \prod_{p \in \mathcal{P}} X^p_{y^p} is the set of states in X whose supports contain the support of y.

To construct our Lyapunov function, we introduce the function h^p_{y^p} : X^p_{y^p} \to \mathbb{R}, defined by

h^p_{y^p}(x^p) = \sum_{i \in S^p(y^p)} y^p_i \log \frac{y^p_i}{x^p_i}.

If population p is of unit mass, so that y^p and x^p are probability distributions, h^p_{y^p}(x^p) is
known as the relative entropy of yp given xp .
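Numerically, this relative entropy behaves like a distance to equilibrium: it falls along interior replicator trajectories of a strictly stable game (a sketch assuming numpy; the game is good RPS with w = 3, l = 2, whose unique Nash equilibrium is (1/3, 1/3, 1/3)):

```python
import numpy as np

A = np.array([[0., -2., 3.], [3., 0., -2.], [-2., 3., 0.]])  # good RPS
x_star = np.ones(3) / 3

def rel_entropy(y, x):
    # h_y(x) = sum over the support of y of y_i log(y_i / x_i)
    s = y > 0
    return np.sum(y[s] * np.log(y[s] / x[s]))

x = np.array([0.45, 0.3, 0.25])          # hypothetical interior initial state
H = [rel_entropy(x_star, x)]
dt = 1e-3
for _ in range(60000):
    Fx = A @ x
    x = x + dt * x * (Fx - x @ Fx)       # replicator dynamic (R)
    H.append(rel_entropy(x_star, x))

print(H[0], H[-1])                       # relative entropy falls toward 0
```

This is exactly the behavior that Theorem 6.2.4 establishes analytically.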
Theorem 6.2.4. Let F be a strictly stable game with unique Nash equilibrium x^*, and let \dot{x} = V_F(x) be the replicator dynamic for F. Define the function H_{x^*} : X_{x^*} \to \mathbb{R}_+ by

H_{x^*}(x) = \sum_{p \in \mathcal{P}} h^p_{(x^*)^p}(x^p).

Then H_{x^*}^{-1}(0) = \{x^*\}, and H_{x^*}(x) approaches infinity whenever x approaches X - X_{x^*}. Moreover, \dot{H}_{x^*}(x) \le 0, with equality only when x = x^*. Therefore, x^* is globally asymptotically stable with respect to X_{x^*}.

Proof. (p = 1) To see that H_{x^*} is a gap function, observe that by Jensen's inequality,

-H_{x^*}(x) = \sum_{i \in S(x^*)} x^*_i \log\frac{x_i}{x^*_i} \le \log \sum_{i \in S(x^*)} x^*_i \cdot \frac{x_i}{x^*_i} = \log \sum_{i \in S(x^*)} x_i \le \log 1 = 0,
with equality if and only if x = x∗ . The second claim is immediate. For the third claim,
note that since F is strictly stable, x∗ is a GESS, so for all x ∈ Xx∗ we have that
\dot{H}_{x^*}(x) = \nabla H_{x^*}(x)'\dot{x} = -\sum_{i \in S(x^*)} \frac{x^*_i}{x_i} \cdot x_i \hat{F}_i(x) = -\sum_{i \in S} x^*_i \hat{F}_i(x) = -(x^*)'(F(x) - \mathbf{1}\,x'F(x)) = -(x^* - x)'F(x) \le 0,

where the inequality binds precisely when x = x^*. The conclusions about stability then follow from Theorems 6.B.2 and 6.B.4.

Exercise 6.2.5. Let F be a stable game, and let x^* be a Nash equilibrium of F.
(i) Show that x∗ is Lyapunov stable under (R).
(ii) Show that if F is a null stable game and x∗ ∈ int(X), then Hx∗ deﬁnes a constant of
motion for (R) on int(X).

6.2.2 Integrable Target Dynamics

Of our six fundamental dynamics, three of them—the BNN, best response, and logit dynamics—can be expressed as target dynamics of the form

\tau^p(\pi^p, x^p) = \hat\tau^p(\hat\pi^p),

under which conditional switch rates only depend on the vector of excess payoffs \hat\pi^p = \pi^p - \tfrac{1}{m^p}\mathbf{1}(x^p)'\pi^p. This is obviously true of the BNN dynamic. For the other two cases,
note that shifting all components of the payoﬀ vector by the same constant has no eﬀect
on either exact or perturbed best responses: in particular, the deﬁnitions (5.2) and (5.12)
of the maximizer correspondence M^p : \mathbb{R}^{n^p} \rightrightarrows \Delta^p and the perturbed maximizer function \tilde{M}^p : \mathbb{R}^{n^p} \to \Delta^p satisfy M^p(\pi^p) = M^p(\hat\pi^p) and \tilde{M}^p(\pi^p) = \tilde{M}^p(\hat\pi^p).
In this section, we show that these three dynamics converge to equilibrium in all
stable games, as do all close enough relatives of these dynamics. Unlike in the context of
potential games, monotonicity properties alone are not enough to ensure that a dynamic
converges: in addition, integrability of the revision protocol plays a key role in establishing
convergence results.
To begin, we provide an example to illustrate that monotonicity properties alone do
not ensure convergence of target dynamics in stable games.
Example 6.2.6. Cycling in good RPS. Fix ε > 0, and let gε : R → R be a continuous decreasing
function that equals 1 on (−∞, 0], equals ε² on [ε, ∞), and is linear on [0, ε]. Then define the revision protocol τ for Rock-Paper-Scissors games by

(6.9)  \tau_R(\hat\pi) = [\hat\pi_R]_+\, g_\varepsilon(\hat\pi_S), \quad \tau_P(\hat\pi) = [\hat\pi_P]_+\, g_\varepsilon(\hat\pi_R), \quad \tau_S(\hat\pi) = [\hat\pi_S]_+\, g_\varepsilon(\hat\pi_P).

Under this protocol, the weight placed on a strategy is proportional to the positive part of the strategy's excess payoff, as in the protocol for the BNN dynamic; however, this weight is only of order ε² if the strategy it beats in RPS has an excess payoff greater than ε.

[Figure 6.2.1: An excess payoff dynamic in good RPS (w = 3, l = 2).]

It is easy to verify that protocol (6.9) satisfies acuteness (4.20):
\tau(\hat\pi)'\hat\pi = [\hat\pi_R]_+^2\, g_\varepsilon(\hat\pi_S) + [\hat\pi_P]_+^2\, g_\varepsilon(\hat\pi_R) + [\hat\pi_S]_+^2\, g_\varepsilon(\hat\pi_P),

which is positive when \hat\pi \in \operatorname{int}(\mathbb{R}^n_*). Thus, the target dynamic induced by τ is an excess payoff dynamic. In Figure 6.2.1 we present a phase diagram for this dynamic in the good RPS game

F(x) = Ax = \begin{pmatrix} 0 & -2 & 3 \\ 3 & 0 & -2 \\ -2 & 3 & 0 \end{pmatrix} \begin{pmatrix} x_R \\ x_P \\ x_S \end{pmatrix}.

Evidently, solutions from many initial conditions lead to a limit cycle. §
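Protocol (6.9) and its acuteness property can be sketched in a few lines (assuming numpy; ε = 0.1 is an arbitrary choice):

```python
import numpy as np

eps = 0.1
A = np.array([[0., -2., 3.], [3., 0., -2.], [-2., 3., 0.]])   # good RPS

def g(u):
    # continuous decreasing: 1 on (-inf, 0], eps^2 on [eps, inf), linear between
    return np.interp(u, [0.0, eps], [1.0, eps**2])

def tau(pi_hat):
    # protocol (6.9): BNN-like weights, damped when the beaten strategy does well
    r, p, s = pi_hat
    return np.array([max(r, 0) * g(s), max(p, 0) * g(r), max(s, 0) * g(p)])

rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.dirichlet(np.ones(3))
    pi = A @ x
    pi_hat = pi - x @ pi                  # excess payoffs
    # acuteness (4.20): tau(pi_hat)'pi_hat > 0 off the equilibrium
    if np.any(pi_hat > 1e-12):
        assert tau(pi_hat) @ pi_hat > 0
```

Simulating the induced target dynamic from random initial states reproduces the limit cycle of Figure 6.2.1; acuteness alone does not rule it out.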
To explain why cycling occurs in the example above, we review some ideas about the
geometry of stable games and target dynamics. By Theorem 2.3.16, every Nash equilibrium x∗ of a stable game is a GNSS. Geometrically, this means that at every nonequilibrium
state x, the projected payoﬀ vector ΦF(x) forms an acute or right angle with the line segment from x back to x∗ (Figures 2.3.3, 2.3.5, and 2.3.6). Meanwhile, our monotonicity
condition for dynamics, positive correlation (PC), requires that away from equilibrium,
each vector of motion VF (x) forms an acute angle with the projected payoﬀ vector ΦF(x)
(Figures 4.2.1 and 4.2.2). Combining these observations, we conclude that if the law of
motion \dot{x} = V_F(x) tends to deviate from the projected payoffs \Phi F in “outward” directions—
that is, in directions heading away from equilibrium—then cycling will occur (compare
Figure 2.3.6 with Figure 6.2.1). On the other hand, if the deviations of VF from ΦF tend to
be “inward”, then solutions should converge to equilibrium.
By this logic, we should be able to guarantee convergence of target dynamics in stable
games by ensuring that the deviations of VF from ΦF are toward the equilibrium, at least in
some average sense. To accomplish this, we introduce an additional condition for revision
protocols: integrability.
(6.10)  There exists a C^1 function \gamma^p : \mathbb{R}^{n^p} \to \mathbb{R} such that \tau^p \equiv \nabla\gamma^p.

We call the functions \gamma^p introduced in this condition revision potentials.

To give this condition a behavioral interpretation, it is useful to compare it to separability:

(6.11)  \tau^p_i(\hat\pi^p) \text{ is independent of } \hat\pi^p_{-i}.

The latter condition is stronger than the former: if \tau^p satisfies (6.11), then it satisfies (6.10) with

(6.12)  \gamma^p(\hat\pi^p) = \sum_{i \in S^p} \int_0^{\hat\pi^p_i} \tau^p_i(s)\,ds.

In Example 6.2.6, the protocol (6.9) that generated cycling has the following noteworthy
feature: the weights agents place on each strategy depend systematically on the payoﬀs
of the next strategy in the best response cycle. Building on this motivation, one can obtain
a game-theoretic interpretation of integrability. Roughly speaking, integrability (6.10) is
equivalent to a requirement that in expectation, learning the weight placed on strategy
j does not convey information about other strategies’ excess payoﬀs. It thus generalizes
separability (6.11), which requires that learning the weight placed on strategy j conveys
no information at all about other strategies’ excess payoﬀs (see the Notes).
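For the separable BNN protocol \tau_i(\hat\pi) = [\hat\pi_i]_+, formula (6.12) gives \gamma(\hat\pi) = \tfrac{1}{2}\sum_i [\hat\pi_i]_+^2, and the identity \tau \equiv \nabla\gamma can be confirmed by finite differences (a sketch assuming numpy):

```python
import numpy as np

def tau_bnn(pi_hat):
    # BNN protocol: tau_i = [pi_hat_i]_+  (separable, hence integrable)
    return np.maximum(pi_hat, 0.0)

def gamma_bnn(pi_hat):
    # revision potential from (6.12): (1/2) * sum_i [pi_hat_i]_+^2
    return 0.5 * np.sum(np.maximum(pi_hat, 0.0)**2)

pi_hat = np.array([0.4, -0.7, 0.3])      # an arbitrary excess payoff vector
h = 1e-6
grad = np.zeros(3)
for i in range(3):
    e = np.zeros(3); e[i] = h
    grad[i] = (gamma_bnn(pi_hat + e) - gamma_bnn(pi_hat - e)) / (2 * h)

print(np.allclose(grad, tau_bnn(pi_hat), atol=1e-6))   # tau is the gradient of gamma
```

By contrast, protocol (6.9) fails this gradient test: its Jacobian is not symmetric, which is the analytic signature of the informational dependence just described.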
Before turning to our convergence theorems, we address a missing step in the motivating argument above: how does integrability ensure that the law of motion VF tends to
deviate from the projected payoﬀs ΦF in the direction of equilibrium? To make this link,
let us recall a characterization of integrability from Section 2.A.9: the map \tau : \mathbb{R}^n \to \mathbb{R}^n is integrable if and only if its line integral over any piecewise smooth closed curve C \subset \mathbb{R}^n evaluates to zero:

(6.13)  \oint_C \tau(\hat\pi) \cdot d\hat\pi = 0.

Example 6.2.7. Let the population game F be generated by random matching in standard RPS:

F(x) = Ax = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_R \\ x_P \\ x_S \end{pmatrix}.

The unique Nash equilibrium of F is the GNSS x^* = (\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}). Game F has the convenient
property that at each state x ∈ X, the payoﬀ vector F(x), the projected payoﬀ vector ΦF(x),
ˆ
and the excess payoﬀ vector F(x) are all the same, a fact that will simplify the notation in
the argument to follow.
Since F is null stable, we know that at each state x \ne x^*, the payoff vector F(x) is
orthogonal to the vector x∗ − x. In Figure 2.3.6, these payoﬀ vectors point counterclockwise
relative to x∗ . Since positive correlation (PC) requires that the direction of motion VF (x)
form an acute angle with F(x), dynamics satisfying (PC) also travel counterclockwise
around the equilibrium.
To address whether the deviations of VF from F tend to be inward or outward, let C ⊂ X
be a circle of radius c \in (0, \tfrac{1}{\sqrt{6}}] centered at the equilibrium x^*. This circle is parameterized by the function \xi : [0, 2\pi] \to X, where

(6.14)  \xi_\alpha = \frac{c}{\sqrt{6}} \begin{pmatrix} -2\sin\alpha \\ \sqrt{3}\cos\alpha + \sin\alpha \\ -\sqrt{3}\cos\alpha + \sin\alpha \end{pmatrix} + x^*.

Here \alpha is the counterclockwise angle between the vector \xi_\alpha - x^* and a rightward horizontal
vector (see Figure 6.2.2).
Since state ξα lies on the circle C, the vector x∗ − ξα can be drawn as a radius of C; thus,
the payoﬀ vector πα ≡ F(ξα ), which is orthogonal to x∗ − ξα , must be tangent to C at ξα , as
shown in Figure 6.2.2. This observation is easy to verify analytically:

(6.15)  \pi_\alpha = F(\xi_\alpha) = \frac{c}{\sqrt{6}} \begin{pmatrix} -2\sqrt{3}\cos\alpha \\ -3\sin\alpha + \sqrt{3}\cos\alpha \\ 3\sin\alpha + \sqrt{3}\cos\alpha \end{pmatrix} = \sqrt{3}\,\frac{d}{d\alpha}\xi_\alpha.

[Figure 6.2.2: Integrability and inward motion of target dynamics in standard RPS.]

If we differentiate both sides of identity (6.15) with respect to the angle \alpha, and note that
\frac{d^2}{(d\alpha)^2}\xi_\alpha = -(\xi_\alpha - x^*), we can link the rate of change of the payoff vector \pi_\alpha = F(\xi_\alpha) to the displacement of state \xi_\alpha from x^*:

(6.16)  \frac{d}{d\alpha}\pi_\alpha = \sqrt{3}\,\frac{d^2}{(d\alpha)^2}\xi_\alpha = -\sqrt{3}\,(\xi_\alpha - x^*).

Now introduce an acute, integrable revision protocol \tau. By combining integrability
condition (6.13) with equation (6.16), we obtain
(6.17)  0 = \oint_C \tau(\hat\pi) \cdot d\hat\pi \equiv \int_0^{2\pi} \tau(\pi_\alpha)'\,\frac{d}{d\alpha}\pi_\alpha\,d\alpha = -\sqrt{3} \int_0^{2\pi} \tau(\pi_\alpha)'(\xi_\alpha - x^*)\,d\alpha.

Let us write \lambda(\pi) = \sum_{i \in S} \tau_i(\pi) and \sigma_i(\pi) = \tau_i(\pi)/\lambda(\pi) to express the dynamic in target form. Then because \xi_\alpha - x^* \in TX is orthogonal to x^* = \tfrac{1}{3}\mathbf{1}, we can conclude from equation (6.17) that

(6.18)  \int_0^{2\pi} \lambda(F(\xi_\alpha)) \left(\sigma(F(\xi_\alpha)) - x^*\right)'(\xi_\alpha - x^*)\,d\alpha = 0.

Equation (6.18) is a form of the requirement described at the start of this section: it asks that at states on the circle C, the vector of motion under the target dynamic
(6.19)  \dot{x} = V_F(x) = \lambda(F(x))\left(\sigma(F(x)) - x\right)

typically deviates from the payoff vector F(x) in an inward direction—that is, in the
direction of the equilibrium x∗ .
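Identities (6.15)–(6.16) and the average condition (6.18) can be verified numerically for an integrable protocol such as BNN (a sketch assuming numpy; the grid quadrature only approximates the integral):

```python
import numpy as np

A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])   # standard RPS
x_star = np.ones(3) / 3
c = 1 / np.sqrt(6)

def xi(a):
    # circle (6.14) around x*
    return c / np.sqrt(6) * np.array([-2 * np.sin(a),
                                      np.sqrt(3) * np.cos(a) + np.sin(a),
                                      -np.sqrt(3) * np.cos(a) + np.sin(a)]) + x_star

# (6.15): F(xi_a) = sqrt(3) * d(xi)/d(alpha), checked by finite differences
for a in (0.3, 1.1, 2.5):
    d_xi = (xi(a + 1e-6) - xi(a - 1e-6)) / 2e-6
    assert np.allclose(A @ xi(a), np.sqrt(3) * d_xi, atol=1e-5)

# (6.18) with the integrable BNN protocol tau_i = [pi_i]_+
def integrand(a):
    pi = A @ xi(a)
    tau = np.maximum(pi, 0.0)
    lam = tau.sum()
    return lam * (tau / lam - x_star) @ (xi(a) - x_star)

alphas = np.linspace(0, 2 * np.pi, 20001)
da = alphas[1] - alphas[0]
vals = np.array([integrand(a) for a in alphas[:-1]])
print(abs(np.sum(vals) * da))   # approximately zero, as (6.18) requires
```

For the non-integrable protocol (6.9) of Example 6.2.6, the same quadrature yields a strictly positive value, the "outward on average" deviation behind the cycling there.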
To reach this interpretation of equation (6.18), note ﬁrst that if the target state σ(F(ξα ))
lies on or even near line L⊥ (ξα ), then motion from ξα toward σ(F(ξα )) is initially inward,
as shown in Figure 6.2.2. (Of course, target state σ(F(ξα )) lies above L(ξα ) by virtue of
positive correlation (PC).) Now, the integrand in (6.18) contains the inner product of the
vectors σ(F(ξα)) − x∗ and ξα − x∗. This inner product is zero precisely when the two
vectors are orthogonal, or, equivalently, when target state σ(F(ξα )) lies on L⊥ (ξα ). While
equation (6.18) does not require the two vectors to be orthogonal, it asks that this be true
on average, where the average is taken over states ξα ∈ C, and weighted by the rates
λ(F(ξα )) at which ξα approaches σ(F(ξα )). Thus, in the presence of acuteness, integrability
implies that on average, the dynamic (6.19) tends to point inward, toward the equilibrium
x∗ . §
The foregoing arguments suggest that together, monotonicity and integrability are
enough to ensure global convergence of target dynamics in stable games. We now develop
this intuition into formal results by constructing suitable Lyapunov functions.
As a point of comparison, recall from Section 6.1.1 that in the case of dynamics for
potential games, monotonicity conditions alone are suﬃcient to prove global convergence
results, as the game’s potential function serves as a Lyapunov function for any dynamic
satisfying positive correlation (PC). Unlike potential games, stable games do not come
equipped with candidate Lyapunov functions. But if the revision protocol agents follow
is integrable, then the revision potential of this protocol provides a building block for
constructing a suitable Lyapunov function. Evidently, this Lyapunov function will vary
with the dynamic under study, even when the game under consideration is ﬁxed.
Our ﬁrst result concerns integrable excess payoﬀ dynamics: that is, target dynamics whose
protocols τp are Lipschitz continuous, acute (4.20), and integrable (6.10). The prototype
for this class is the BNN dynamic: its protocol \tau^p_i(\hat\pi^p) = [\hat\pi^p_i]_+ is not only acute and integrable, but also separable (6.11), and so admits potential function \gamma^p(\hat\pi^p) = \tfrac{1}{2}\sum_{i \in S^p} [\hat\pi^p_i]_+^2 (cf. equation (6.12)).
Theorem 6.2.8. Let F be a C^1 stable game, and let \dot{x} = V_F(x) be the integrable excess payoff dynamic for F based on revision protocols \tau^p with revision potentials \gamma^p. Define the C^1 function \Gamma : X \to \mathbb{R} by

\Gamma(x) = \sum_{p \in \mathcal{P}} m^p \gamma^p(\hat{F}^p(x)).

Then \Gamma is a strict Lyapunov function for V_F, and NE(F) is globally attracting. In addition, if F
admits a unique Nash equilibrium, or if the protocols τp also satisfy separability (6.11), then we can
choose Γ to be nonnegative with Γ−1 (0) = NE(F), and so NE(F) is globally asymptotically stable.
For future reference, observe that the value of the Lyapunov function Γ at state x is the
(m^p-weighted) sum of the values of the revision potentials \gamma^p evaluated at the excess payoff vectors \hat{F}^p(x).
The conditions introduced in the last sentence of the theorem are needed to ensure that
the Lyapunov function Γ is constant on the set NE(F) of Nash equilibria. Were this not
the case, the set NE(F) could be globally attracting without being Lyapunov stable—see
Example 6.B.3.
The proof of Theorem 6.2.8 and those to come make heavy use of multivariate product
and chain rules, which we review in Section 2.A.4.
Proof of Theorem 6.2.8. (p = 1) Recall that the excess payoff vector \hat{F}(x) is equal to F(x) - \mathbf{1}\bar{F}(x), where \bar{F}(x) = x'F(x) is the population's average payoff. By the product rule, the derivative of \bar{F} is

D\bar{F}(x) = x'DF(x) + F(x)'.

Therefore, the derivative matrix for the excess payoff function \hat{F}(x) = F(x) - \mathbf{1}\bar{F}(x) is

(6.20)  D\hat{F}(x) = D(F(x) - \mathbf{1}\bar{F}(x)) = DF(x) - \mathbf{1}\,D\bar{F}(x) = DF(x) - \mathbf{1}(x'DF(x) + F(x)').
˙
\dot\Gamma(x) = \nabla\Gamma(x)'\dot{x}
  = \nabla\gamma(\hat{F}(x))'\,D\hat{F}(x)\,\dot{x}
  = \tau(\hat{F}(x))'\left(DF(x) - \mathbf{1}(x'DF(x) + F(x)')\right)\dot{x}
  = \left(\tau(\hat{F}(x)) - (\tau(\hat{F}(x))'\mathbf{1})\,x\right)'DF(x)\,\dot{x} - (\tau(\hat{F}(x))'\mathbf{1})(F(x)'\dot{x})
  = \dot{x}'DF(x)\,\dot{x} - (\tau(\hat{F}(x))'\mathbf{1})(F(x)'\dot{x})
  \le 0,

where the penultimate equality uses the form \dot{x} = \tau(\hat{F}(x)) - (\tau(\hat{F}(x))'\mathbf{1})\,x of the excess payoff dynamic,
We now show that this inequality binds precisely on the set NE(F). To begin, note that if
ˆ
˙
˙
x ∈ RP(VF ) (i.e., if x = 0), then Γ(x) = 0. On the other hand, if x RP(VF ), then τ(F(x)) 1 > 0
˙
˙
and F(x) x > 0 (by condition (PC)), implying that Γ(x) < 0. Since NE(F) = RP(VF ), the
claim is proved. That NE(F) is globally attracting then follows from Theorem 6.B.4.
If F admits a unique Nash equilibrium x∗ , then the foregoing argument implies that x∗
is the unique minimizer of Γ: since the value of Γ is nonincreasing over time, a solution
starting from a state x with Γ(x) < Γ(x∗ ) could not converge to x∗ , contradicting that x∗
is globally attracting. Thus, after normalizing by an additive constant, we ﬁnd that Γ
is nonnegative with Γ−1 (0) = {x∗ }, so the global asymptotic stability of x∗ follows from
Corollary 6.B.7.
If instead τ satisﬁes separability (6.11), we can deﬁne the revision potential γ as in
equation (6.12). It then follows from Exercise 4.5.7 that Γ is nonnegative, with Γ(x) = 0 if
equation (6.12). It then follows from Exercise 4.5.7 that \Gamma is nonnegative, with \Gamma(x) = 0 if and only if \hat{F}(x) \in \operatorname{bd}(\mathbb{R}^n_*). Thus, Lemma 4.5.4 implies that \Gamma(x) = 0 if and only if x \in NE(F), and so the global asymptotic stability of NE(F) again follows from Corollary 6.B.7.
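For the BNN dynamic in good RPS, the Lyapunov function of Theorem 6.2.8 is \Gamma(x) = \tfrac{1}{2}\sum_i [\hat{F}_i(x)]_+^2, and its decay can be observed along an Euler discretization (a sketch assuming numpy; step size and initial state are arbitrary choices):

```python
import numpy as np

A = np.array([[0., -2., 3.], [3., 0., -2.], [-2., 3., 0.]])   # good RPS (stable)

def excess(x):
    pi = A @ x
    return pi - x @ pi

def Gamma(x):
    # Lyapunov function of Theorem 6.2.8 for the BNN protocol
    return 0.5 * np.sum(np.maximum(excess(x), 0.0)**2)

x = np.array([0.6, 0.25, 0.15])
vals = [Gamma(x)]
dt = 1e-3
for _ in range(40000):
    tau = np.maximum(excess(x), 0.0)
    x = x + dt * (tau - tau.sum() * x)    # BNN dynamic
    vals.append(Gamma(x))

print(vals[0], vals[-1])   # Gamma falls to 0 at the equilibrium (1/3, 1/3, 1/3)
```

Contrast this with Example 6.2.6: replacing the integrable BNN protocol by protocol (6.9) destroys the Lyapunov property, and the same simulation cycles instead.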
Next we consider the best response dynamic, which we here express by applying the
maximizer correspondence
M^p(\hat\pi^p) = \operatorname*{argmax}_{y^p \in \Delta^p} (y^p)'\hat\pi^p

to the vector of excess payoffs, yielding the exact target dynamic

(BR)  \dot{x}^p \in m^p M^p(\hat{F}^p(x)) - x^p.

Following the previous logic, we can assess the possibilities for convergence in stable
games by checking monotonicity and integrability. Monotonicity was established in
Theorem 5.1.8, which showed that (BR) satisﬁes an analogue of positive correlation (PC)
appropriate for diﬀerential inclusions. For integrability, one can argue that the protocol
Mp , despite being multivalued, is integrable in a suitably deﬁned sense, with its “potential
function” being given by the maximum function
\mu^p(\pi^p) = \max_{y^p \in \Delta^p} (y^p)'\pi^p = \max_{i \in S^p} \pi^p_i.

Note that if the payoff vector \pi^p, and hence the excess payoff vector \hat\pi^p, have a unique
maximizing component i \in S^p, then the gradient of \mu^p at \hat\pi^p is the standard basis vector e^p_i. But this vector corresponds to the unique mixed best response to \hat\pi^p, and so

\nabla\mu^p(\hat\pi^p) = e^p_i = M^p(\hat\pi^p).
One can account for multiple optimal components using a broader notion of differentiation: for all \hat\pi^p \in \mathbb{R}^{n^p}, M^p(\hat\pi^p) is the subdifferential of the convex function \mu^p at \hat\pi^p (see the
Notes).
Having veriﬁed monotonicity and integrability, we again construct our candidate
Lyapunov function by plugging the excess payoﬀ vectors into the revision potentials µp .
The resulting function G is very simple: it measures the diﬀerence between the payoﬀs
agents could obtain by choosing optimal strategies and their actual aggregate payoﬀs.
Theorem 6.2.9. Let F be a C^1 stable game, and let \dot{x} \in V_F(x) be the best response dynamic for F. Define the Lipschitz continuous function G : X \to \mathbb{R}_+ by

G(x) = \max_{y \in X} (y - x)'F(x) = \max_{i \in S} \hat{F}_i(x).

Then G^{-1}(0) = NE(F). Moreover, if \{x_t\}_{t \ge 0} is a solution to V_F, then for almost all t \ge 0 we have that \dot{G}(x_t) \le -G(x_t), and so NE(F) is globally asymptotically stable under V_F.
Proof. (p = 1) That G−1 (0) = NE(F) follows from Lemma 4.5.4. To prove the second
claim, let {xt }t≥0 be a solution to VF , and let S∗ (t) ⊆ S be the set of pure best responses to state
x_t. Since \{x_t\}_{t \ge 0} is Lipschitz continuous, the map t \mapsto \hat{F}_i(x_t) is also Lipschitz continuous for each strategy i \in S. Thus, since G(x) = \max_{y \in X}(y - x)'F(x) = \max_{i \in S} \hat{F}_i(x), it follows from Danskin's Envelope Theorem (see the Notes) that the map t \mapsto G(x_t) is Lipschitz continuous, and that at almost all t \in [0, \infty),

(6.21)  \dot{G}(x_t) \equiv \frac{d}{dt}\max_{i \in S} \hat{F}_i(x_t) = \frac{d}{dt}\hat{F}_{i^*}(x_t) \quad\text{for all } i^* \in S^*(t).
Applying equation (6.20), we find that for t satisfying equation (6.21) and at which \dot{x}_t exists, we have that

\dot{G}(x_t) = \frac{d}{dt}\hat{F}_{i^*}(x_t)
  = e_{i^*}'DF(x_t)\,\dot{x}_t - x_t'DF(x_t)\,\dot{x}_t - F(x_t)'\dot{x}_t \quad\text{for all } i^* \in S^*(t)
  = (y^* - x_t)'DF(x_t)\,\dot{x}_t - F(x_t)'\dot{x}_t \quad\text{for all } y^* \in \operatorname*{argmax}_{y \in \Delta} y'\hat{F}(x_t)
  = \dot{x}_t'DF(x_t)\,\dot{x}_t - F(x_t)'\dot{x}_t
  \le -F(x_t)'\dot{x}_t = -\max_{y \in X} F(x_t)'(y - x_t) = -G(x_t),
where the inequality follows from the fact that F is a stable game. (Note that the the
equality of the third to last and last expressions is also implied by Theorem 5.1.8.) The
global asymptotic stability of NE(F) then follows from Theorems 6.B.2 and 6.B.6.
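The decay estimate of Theorem 6.2.9 is easy to probe numerically. The sketch below is an illustration only (the payoff matrix, step size, and horizon are assumptions of the sketch, not taken from the text): it runs an Euler discretization of the best response dynamic in good Rock-Paper-Scissors, a strictly stable game, and tracks G(x) = max_i F̂_i(x).

```python
# Good Rock-Paper-Scissors (win 2, lose 1): z'Az = -(1/2)|z|^2 < 0 for tangent
# vectors z, so the game is strictly stable.
A = [[0, -1, 2], [2, 0, -1], [-1, 2, 0]]

def payoffs(x):
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

def G(x):
    """G(x) = max_i F_i(x) - x'F(x), the maximal excess payoff."""
    F = payoffs(x)
    return max(F) - sum(xi * Fi for xi, Fi in zip(x, F))

x = [0.7, 0.2, 0.1]
dt = 0.005
vals = [G(x)]
for _ in range(6000):                       # Euler path of xdot = BR(x) - x to t = 30
    F = payoffs(x)
    b = max(range(3), key=lambda i: F[i])   # a pure best response vertex
    x = [xi + dt * ((1.0 if i == b else 0.0) - xi) for i, xi in enumerate(x)]
    vals.append(G(x))
```

Because G serves as a Lyapunov function, the recorded values fall from their initial level toward zero, up to the O(dt) chattering of the discretized best response.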
Finally, we consider convergence under perturbed best response dynamics. These are exact target dynamics of the form

ẋ^p = m^p M̃^p(F̂^p(x)) − x^p;

here, the target protocol is the perturbed maximizer function

M̃^p(π^p) = argmax_{y^p∈int(∆^p)} ((y^p)′π^p − v^p(y^p)),

where v^p : int(∆^p) → R is an admissible deterministic perturbation (see Section 5.2.2).
Once again, we verify the two conditions that underlie convergence. Theorem 5.2.13 showed that all perturbed best response dynamics satisfy virtual positive correlation (5.17), establishing the required monotonicity. As for integrability, Observation 5.C.3 showed that the protocol M̃^p is integrable; its revision potential,

(6.22)   µ̃^p(π^p) = max_{y^p∈int(∆^p)} ((y^p)′π^p − v^p(y^p)),

is the perturbed maximum function induced by v^p. Now, mimicking Theorem 6.2.8, we construct our Lyapunov function by composing the revision potentials µ̃^p with the excess payoff functions F̂^p.
Theorem 6.2.10. Let F be a C¹ stable game, and let ẋ = V_{F,v}(x) be the perturbed best response dynamic for F generated by the admissible deterministic perturbations v. Define the function G̃ : int(X) → R₊ by

G̃(x) = Σ_{p∈P} m^p (µ̃^p(F̂^p(x)) + v^p(x^p/m^p)).

Then G̃⁻¹(0) = PE(F, v), and this set is a singleton. Moreover, G̃ is a strict Lyapunov function for V_{F,v}, and so PE(F, v) is globally asymptotically stable under V_{F,v}.
Proof. (p = 1) As in Section 5.2, let F̃(x) = F(x) − ∇v(x) be the virtual payoff function generated by (F, v). Then

x ∈ PE(F, v) ⇔ ΦF̃(x) = 0 ⇔ x = argmax_{y∈int(∆)} (y′F(x) − v(y)) ⇔ G̃(x) = 0.

To prove that G̃ is a strict Lyapunov function, recall from Observation 5.C.3 that the perturbed maximum function µ̃ defined in equation (6.22) is a potential function for the perturbed maximizer function M̃: that is, ∇µ̃ ≡ M̃. Therefore, since F is stable, virtual positive correlation (5.17) implies that

d/dt G̃(x) = d/dt (µ̃(F̂(x)) + v(x)) = d/dt (µ̃(F(x)) − (x′F(x) − v(x)))
  = M̃(F(x))′DF(x)ẋ − (x′DF(x)ẋ + ẋ′F(x) − ẋ′∇v(x))
  = (M̃(F(x)) − x)′DF(x)ẋ − ẋ′(F(x) − ∇v(x))
  = ẋ′DF(x)ẋ − ẋ′F̃(x)
  ≤ 0,

with equality if and only if x is a rest point. But RP(V_{F,v}) = PE(F, v) by definition, so Corollary 6.B.7 implies that PE(F, v) is globally asymptotically stable.
Finally, we prove that PE(F, v) is a singleton. Let

φ_{x,h}(t) = h′F̃(x + th)

for all x ∈ X, h ∈ TX − {0}, and t ∈ R such that x + th ∈ int(X). Since F is stable and D²v(x + th) is positive definite with respect to TX × TX, we have that

(6.23)   φ′_{x,h}(t) = h′DF̃(x + th)h = h′DF(x + th)h − h′D²v(x + th)h < 0,

and so φ_{x,h}(t) is decreasing in t. Moreover,

(6.24)   x ∈ PE(F, v) ⇔ F̃(x) is a constant vector ⇔ φ_{x,h}(0) = 0 for all h ∈ TX − {0}.

Now let x ∈ PE(F, v) and y ∈ X − {x}. Then y = x + t_y h_y for some t_y > 0 and h_y ∈ TX − {0}. Statements (6.23) and (6.24) imply that

φ_{y,h_y}(0) = h_y′F̃(y) = h_y′F̃(x + t_y h_y) = φ_{x,h_y}(t_y) < φ_{x,h_y}(0) = 0.

Therefore, statement (6.24) implies that y ∉ PE(F, v), and hence that PE(F, v) = {x}.
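For the entropy perturbation v(x) = η Σ_i x_i log x_i, the objects in Theorem 6.2.10 are explicit: M̃ is the logit choice function and µ̃ is the scaled log-sum-exp function, so for p = 1 the Lyapunov function can be written G̃(x) = µ̃(F(x)) − x′F(x) + v(x). The sketch below is an illustration only (the good RPS payoffs, η = 0.2, and the Euler scheme are assumptions of the sketch): it checks that G̃ decays to zero along a discretized solution of the logit dynamic.

```python
import math

A = [[0, -1, 2], [2, 0, -1], [-1, 2, 0]]    # good RPS: a strictly stable game
eta = 0.2                                   # noise level of the entropy perturbation

def payoffs(x):
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

def logit(pi):
    m = max(pi)
    w = [math.exp((p - m) / eta) for p in pi]
    s = sum(w)
    return [wi / s for wi in w]

def G_tilde(x):
    """G~(x) = mu~(F(x)) - x'F(x) + v(x), with v(x) = eta * sum x_i log x_i."""
    F = payoffs(x)
    m = max(F)
    mu = eta * math.log(sum(math.exp((f - m) / eta) for f in F)) + m
    v = eta * sum(xi * math.log(xi) for xi in x)
    return mu - sum(xi * fi for xi, fi in zip(x, F)) + v

x = [0.7, 0.2, 0.1]
dt = 0.01
vals = [G_tilde(x)]
for _ in range(3000):                       # Euler path of the logit dynamic to t = 30
    target = logit(payoffs(x))
    x = [xi + dt * (ti - xi) for xi, ti in zip(x, target)]
    vals.append(G_tilde(x))
```

Since the logit dynamic is smooth, the discretized path converges to the unique perturbed equilibrium, and G̃ falls to numerical zero rather than to a chattering floor.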
6.2.3 Impartial Pairwise Comparison Dynamics

In Section 4.6, we defined pairwise comparison dynamics by considering Lipschitz continuous revision protocols ρ^p that only condition on payoffs and that are sign-preserving:

sgn(ρ^p_{ij}(π^p)) = sgn([π^p_j − π^p_i]₊)   for all i, j ∈ S^p and p ∈ P.

To obtain a general convergence result for stable games, we require an additional condition called impartiality:

(6.25)   ρ^p_{ij}(π^p) = φ^p_j(π^p_j − π^p_i)   for some functions φ^p_j : R → R₊.

Combining this restriction with the mean dynamic equation (M), we see that impartial pairwise comparison dynamics take the form

ẋ^p_i = Σ_{j∈S^p} x^p_j φ^p_i(F^p_i(x) − F^p_j(x)) − x^p_i Σ_{j∈S^p} φ^p_j(F^p_j(x) − F^p_i(x)).

Under impartiality (6.25), the function of the payoff difference π^p_j − π^p_i that describes the
conditional switch rate from i to j does not depend on an agent’s current strategy i. This
restriction introduces at least a superﬁcial connection with the target dynamics studied in
Section 6.2.2, as both restrict the dependence of agents’ decisions on their current choices
of strategy.
Theorem 6.2.11 shows that together, sign preservation and impartiality ensure global
convergence to Nash equilibrium in stable games.
Theorem 6.2.11. Let F be a C¹ stable game, and let ẋ = V_F(x) be an impartial pairwise comparison dynamic for F. Define the Lipschitz continuous function Ψ : X → R₊ by

Ψ(x) = Σ_{p∈P} Σ_{i∈S^p} Σ_{j∈S^p} x^p_i ψ^p_j(F^p_j(x) − F^p_i(x)),   where ψ^p_k(d) = ∫₀^d φ^p_k(s) ds

is the definite integral of φ^p_k. Then Ψ⁻¹(0) = NE(F). Moreover, Ψ̇(x) ≤ 0 for all x ∈ X, with equality if and only if x ∈ NE(F), and so NE(F) is globally asymptotically stable.

To understand the role played by impartiality (6.25), recall the general formula for the mean dynamic:

(M)   ẋ^p_i = Σ_{j∈S^p} x^p_j ρ^p_{ji}(F^p(x), x^p) − x^p_i Σ_{j∈S^p} ρ^p_{ij}(F^p(x), x^p).

According to the second term of this expression, the rate of outflow from strategy i is
x^p_i Σ_{k∈S^p} ρ^p_{ik}; thus, the percentage rate of outflow from i, Σ_{k∈S^p} ρ^p_{ik}, varies with i. It follows that strategies with high payoffs can nevertheless have high percentage outflow rates: even if π^p_i > π^p_j, one can still have ρ^p_{ik} > ρ^p_{jk} for k ≠ i, j. Having good strategies lose players more quickly than bad strategies is an obvious impediment to convergence to Nash equilibrium.

Impartiality (6.25) places controls on these percentage outflow rates. If the conditional switch rates φ^p_j are monotone in payoffs, then condition (6.25) ensures that better strategies have lower percentage outflow rates. If the conditional switch rates are not monotone, but merely sign-preserving, condition (6.25) still implies that the integrated conditional switch rates ψ^p_k are ordered by payoffs. According to the analysis below, this control is enough to ensure convergence of pairwise comparison dynamics to Nash equilibrium in stable games.
Proof. (p = 1) The first claim is proved as follows:

Ψ(x) = 0 ⇔ [x_i = 0 or ψ_j(F_j(x) − F_i(x)) = 0] for all i, j ∈ S
  ⇔ [x_i = 0 or F_i(x) ≥ F_j(x)] for all i, j ∈ S
  ⇔ [x_i = 0 or F_i(x) ≥ max_{j∈S} F_j(x)] for all i ∈ S
  ⇔ x ∈ NE(F).
To begin the proof of the second claim, we compute the partial derivatives of Ψ (here ρ_{ij} denotes ρ_{ij}(F(x)) = φ_j(F_j(x) − F_i(x)) = ψ′_j(F_j(x) − F_i(x))):

∂Ψ/∂x_l (x) = Σ_{i∈S} Σ_{j∈S} x_i ρ_{ij} (∂F_j/∂x_l (x) − ∂F_i/∂x_l (x)) + Σ_{k∈S} ψ_k(F_k(x) − F_l(x))
  = Σ_{j∈S} (Σ_{i∈S} (x_i ρ_{ij} − x_j ρ_{ji})) ∂F_j/∂x_l (x) + Σ_{k∈S} ψ_k(F_k(x) − F_l(x))
  = Σ_{j∈S} ẋ_j ∂F_j/∂x_l (x) + Σ_{k∈S} ψ_k(F_k(x) − F_l(x)).

Using this expression, we find the rate of change of Ψ over time along solutions to (M):

Ψ̇(x) = ∇Ψ(x)′ẋ
  = ẋ′DF(x)ẋ + Σ_{i∈S} (Σ_{k∈S} ψ_k(F_k − F_i)) ẋ_i
  = ẋ′DF(x)ẋ + Σ_{i∈S} Σ_{j∈S} (x_j ρ_{ji} − x_i ρ_{ij}) Σ_{k∈S} ψ_k(F_k − F_i)
  = ẋ′DF(x)ẋ + Σ_{i∈S} Σ_{j∈S} x_j ρ_{ji} Σ_{k∈S} (ψ_k(F_k − F_i) − ψ_k(F_k − F_j)).

To evaluate the summation, first observe that if F_i(x) > F_j(x), then ρ_{ji}(F(x)) ≡ φ_i(F_i(x) − F_j(x)) > 0 and F_k(x) − F_i(x) < F_k(x) − F_j(x); since each ψ_k is nondecreasing, it follows that ψ_k(F_k − F_i) − ψ_k(F_k − F_j) ≤ 0. In fact, when k = i, the comparison between payoff differences becomes 0 < F_i(x) − F_j(x); since each ψ_i is increasing on [0, ∞), it follows that ψ_i(0) − ψ_i(F_i − F_j) < 0. We therefore conclude that if F_i(x) > F_j(x), then ρ_{ji}(F(x)) > 0 and Σ_{k∈S} (ψ_k(F_k − F_i) − ψ_k(F_k − F_j)) < 0. On the other hand, if F_j(x) ≥ F_i(x), we have immediately that ρ_{ji}(F(x)) = 0. And of course, ẋ′DF(x)ẋ ≤ 0 since F is stable.
Marshaling these facts, we find that Ψ̇(x) ≤ 0, and that

(6.26)   Ψ̇(x) = 0 if and only if x_j ρ_{ji}(F(x)) = 0 for all i, j ∈ S.

Lemma 4.6.5 shows that the second condition in (6.26) defines the set RP(V_F), which
is equal to NE(F) by Theorem 4.6.3; this proves the second claim. Finally, the global
asymptotic stability of NE(F) follows from Corollary 6.B.7.
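Theorem 6.2.11 can likewise be probed numerically for the Smith dynamic, the impartial pairwise comparison dynamic with φ_j(d) = [d]₊ and ψ_j(d) = [d]₊²/2. The sketch below is an illustration only (good RPS payoffs and the Euler step are assumptions of the sketch): it tracks Ψ along a discretized solution.

```python
# Good RPS (win 2, lose 1): strictly stable. Smith protocol: phi_j(d) = max(d, 0),
# so psi_j(d) = max(d, 0)**2 / 2 and Psi(x) = (1/2) sum_{i,j} x_i [F_j - F_i]_+^2.
A = [[0, -1, 2], [2, 0, -1], [-1, 2, 0]]

def payoffs(x):
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

def Psi(x):
    F = payoffs(x)
    return 0.5 * sum(x[i] * max(F[j] - F[i], 0.0) ** 2
                     for i in range(3) for j in range(3))

x = [0.7, 0.2, 0.1]
dt = 0.01
vals = [Psi(x)]
for _ in range(5000):                       # Euler path of the Smith dynamic to t = 50
    F = payoffs(x)
    inflow  = [sum(x[j] * max(F[i] - F[j], 0.0) for j in range(3)) for i in range(3)]
    outflow = [sum(max(F[j] - F[i], 0.0) for j in range(3)) for i in range(3)]
    x = [x[i] + dt * (inflow[i] - x[i] * outflow[i]) for i in range(3)]
    vals.append(Psi(x))
```

As the theorem predicts, Ψ starts positive away from equilibrium and decays toward zero as the path approaches the unique Nash equilibrium (1/3, 1/3, 1/3).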
Exercise 6.2.12. Construct a pairwise comparison dynamic that generates cycling in the
good RPS game from Example 6.2.6.

6.2.4 Summary

In Table 6.1, we summarize the results of this section by presenting the Lyapunov functions for single-population stable games for the six fundamental evolutionary dynamics.
The Lyapunov functions divide into three classes: those based on an explicit notion of
“distance” from equilibrium, those based on revision potentials for target protocols, and
the Lyapunov function for the Smith dynamic, which stands alone.
Example 6.2.13. Matching Pennies. In Figure 6.2.3, we present phase diagrams of the six fundamental dynamics in two-population Matching Pennies:

⎛F¹_H(x)⎞   ⎛ 0  0  1 −1⎞ ⎛x¹_H⎞   ⎛x²_h − x²_t⎞
⎜F¹_T(x)⎟ = ⎜ 0  0 −1  1⎟ ⎜x¹_T⎟ = ⎜x²_t − x²_h⎟
⎜F²_h(x)⎟   ⎜−1  1  0  0⎟ ⎜x²_h⎟   ⎜x¹_T − x¹_H⎟
⎝F²_t(x)⎠   ⎝ 1 −1  0  0⎠ ⎝x²_t⎠   ⎝x¹_H − x¹_T⎠
Each phase diagram is drawn atop a contour plot of the relevant Lyapunov function.
Since Matching Pennies is a zerosum game, F is null stable; thus, the Lyapunov functions
[Figure 6.2.3 here: panels (i) replicator, (ii) projection, (iii) Brown-von Neumann-Nash, (iv) Smith, (v) best response, (vi) logit(.2).]

Figure 6.2.3: Six basic dynamics in Matching Pennies. The contour plots are the corresponding Lyapunov
functions.

Dynamic         Formula                                                                  Lyapunov function
projection      ẋ = Π_{TX(x)}(F(x))                                                     E_{x*}(x) = |x − x*|²
replicator      ẋ_i = x_i F̂_i(x)                                                        H_{x*}(x) = Σ_{i∈S(x*)} x*_i log(x*_i / x_i)
best response   ẋ ∈ M(F̂(x)) − x                                                        G(x) = µ(F̂(x))
logit           ẋ = M̃(F̂(x)) − x                                                        G̃(x) = µ̃(F̂(x)) + v(x)
BNN             ẋ_i = [F̂_i(x)]₊ − x_i Σ_{j∈S} [F̂_j(x)]₊                                Γ(x) = ½ Σ_{i∈S} [F̂_i(x)]₊²
Smith           ẋ_i = Σ_{j∈S} x_j [F_i(x) − F_j(x)]₊ − x_i Σ_{j∈S} [F_j(x) − F_i(x)]₊   Ψ(x) = ½ Σ_{i∈S} Σ_{j∈S} x_i [F_j(x) − F_i(x)]₊²

Table 6.1: Lyapunov functions for the six fundamental dynamics in stable games.

for the replicator and projection dynamics define constants of motion for these dynamics,
with solution trajectories cycling along level curves. In the remaining cases, all solutions
converge to the unique Nash equilibrium, x* = ((½, ½), (½, ½)). §
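The claim that Matching Pennies is null stable can be checked mechanically: with the linear payoff function above, DF(x) = A is antisymmetric, so z′Az vanishes for every tangent vector z. A minimal sketch (the random sampling is an assumption of the illustration):

```python
import random

# Two-population Matching Pennies: DF(x) = A, with the state ordered
# (x1_H, x1_T, x2_h, x2_t).
A = [[0, 0, 1, -1],
     [0, 0, -1, 1],
     [-1, 1, 0, 0],
     [1, -1, 0, 0]]

def quad(z):
    return sum(z[i] * A[i][j] * z[j] for i in range(4) for j in range(4))

random.seed(0)
vals = []
for _ in range(100):
    a, b = random.uniform(-1, 1), random.uniform(-1, 1)
    z = [a, -a, b, -b]          # tangent vector: each population's block sums to 0
    vals.append(quad(z))
```

Since z′DF(x)z = 0 identically, solutions neither ascend nor descend the Lyapunov functions, which is why the replicator and projection trajectories cycle along level curves.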
6.3 Supermodular Games

In a supermodular game, higher choices by one's opponents make one's own higher
strategies look relatively more desirable. In Section 2.4, we used this property to show that
the best response correspondences of supermodular games are monotone in the stochastic
dominance order; this implies in turn that these games admit minimal and maximal Nash
equilibria.
Given this monotone structure on best response correspondences, it is natural to look
for convergence results for supermodular games under the best response dynamic (BR).
In Section 6.3.1, we use elementary methods to establish a global convergence result for
(BR) under some strong additional assumptions on the underlying game: in particular, it
must be derived from a twoplayer normal form game that satisﬁes both supermodularity
and “diminishing returns” conditions.
To prove more general convergence results, we appeal to the theory of cooperative
diﬀerential equations. These are smooth diﬀerential equations under which increasing
the value of any component of the state variable increases the growth rates of all other components. Under some mild regularity conditions, almost all solutions of these equations converge to rest points.

Because of the smoothness requirement, these techniques cannot be applied to the best
response dynamic itself. Happily, the needed monotonicity carries over from exact best
responses to perturbed best responses, although only those that can be generated from
stochastic perturbations of payoﬀs. In Section 6.3.2, we use this idea to prove almost global
convergence of stochastically perturbed best response dynamics in supermodular games.

6.3.1 The Best Response Dynamic in Two-Player Normal Form Games

Let U = (U¹, U²) be a two-player normal form game, and let F be the population game
obtained when members of two populations are randomly matched to play U (cf. Example
1.2.2). Then the best response dynamic (BR) for F takes the form
(BR)   ẋ¹ ∈ B¹(x) − x¹ = M¹(F¹(x)) − x¹ = M¹(U¹x²) − x¹,
       ẋ² ∈ B²(x) − x² = M²(F²(x)) − x² = M²((U²)′x¹) − x².

Our convergence result for supermodular games concerns simple solutions of this
dynamic. A solution {xt }t≥0 of (BR) is simple if the set of times at which it is not diﬀerentiable
has no accumulation point, and if at other times, the target states Bp (xt ) are pure (i.e.,
vertices of Xp ).
Exercise 6.3.1.
(i) Given an example of a Nash equilibrium x∗ of a 2 × 2 game such that
no solution to (BR) starting from x∗ is simple.
(ii) Show that there exists a simple solution to (BR) from every initial condition in game
ˆ
ˆ
U = (U1 , U2 ) if for all nonempty sets S1 ⊆ S1 and S2 ⊆ S2 , the game in which players
ˆ
ˆ
are restricted to strategies in S1 and S2 admits a pure Nash equilibrium. (Theorem
2.4.12 implies that U has this property if it is supermodular.)
If {xt }t≥0 is a simple solution trajectory of (BR), we can list the sequence of times {tk }
at which the solution is not diﬀerentiable (i.e., at which the target state for at least one
population changes). During each open interval of times Ik = (tk−1 , tk ), the pure strategies
ik ∈ S1 and jk ∈ S2 selected by revising agents are ﬁxed. We call ik and jk the interval k
selections for populations 1 and 2.
The following lemma shows that i_{k+1} must perform at least as well as i_k against both j_k and j_{k+1}, and that the analogous comparisons between the payoffs of j_k and j_{k+1} also hold.
Lemma 6.3.2. Suppose that revising agents select strategies i = i_k and j = j_k during interval I_k, and strategies i′ = i_{k+1} and j′ = j_{k+1} during interval I_{k+1}. Then
(i) U¹_{i′j} ≥ U¹_{ij} and U²_{ij′} ≥ U²_{ij}, and
(ii) U¹_{i′j′} ≥ U¹_{ij′} and U²_{i′j′} ≥ U²_{i′j}.
Exercise 6.3.3. Prove Lemma 6.3.2. (Hint: Start by verifying that x²_{t_k} is a convex combination of x²_{t_{k−1}} and the vertex v²_j, and that x²_{t_{k+1}} is a convex combination of x²_{t_k} and v²_{j′}.)
Now, recall from Exercise 2.4.4 that U = (U1 , U2 ) is supermodular if
(6.27)   U¹_{i+1,j+1} − U¹_{i,j+1} ≥ U¹_{i+1,j} − U¹_{i,j} and U²_{i+1,j+1} − U²_{i+1,j} ≥ U²_{i,j+1} − U²_{i,j} for all i < n¹, j < n².

(When (6.27) holds, the population game F induced by U is supermodular as well.) If the
inequalities in (6.27) always hold strictly, we say that U is strictly supermodular.
Our convergence result requires two additional conditions on U. We say that U exhibits
strictly diminishing returns if for each ﬁxed strategy of the opponent, the beneﬁt a player
obtains by increasing his strategy is decreasing—in other words, if payoﬀs are “concave
in own strategy”:
U¹_{i+2,j} − U¹_{i+1,j} < U¹_{i+1,j} − U¹_{i,j} for all i ≤ n¹ − 2 and j ∈ S², and
U²_{i,j+2} − U²_{i,j+1} < U²_{i,j+1} − U²_{i,j} for all i ∈ S¹ and j ≤ n² − 2.
Finally, we say that U is nondegenerate if for each ﬁxed pure strategy of the opponent, a
player is not indiﬀerent among any of his pure strategies.
Theorem 6.3.4. Suppose that F is generated by random matching in a twoplayer normal form
game U that is strictly supermodular, exhibits strictly diminishing returns, and is nondegenerate.
Then every simple solution trajectory of the best response dynamic (BR) converges to a pure Nash
equilibrium.
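Theorem 6.3.4 is easy to visualize numerically. The sketch below uses a hypothetical 3 × 3 game chosen for the illustration (U¹_{ij} = 2ij − 1.1i² and U²_{ij} = 2ij − 1.1j², which are strictly supermodular, exhibit strictly diminishing returns, and are nondegenerate) and integrates (BR) by Euler steps; the trajectory settles on the pure Nash equilibrium (2, 2).

```python
# Hypothetical payoffs (an assumption of this sketch, not from the text):
# U1[i][j] = 2*i*j - 1.1*i**2 for player 1, U2[i][j] = 2*i*j - 1.1*j**2 for
# player 2, with strategies i, j in {1, 2, 3}.
n = 3
U1 = [[2*i*j - 1.1*i*i for j in range(1, n + 1)] for i in range(1, n + 1)]
U2 = [[2*i*j - 1.1*j*j for j in range(1, n + 1)] for i in range(1, n + 1)]

def best_response(payoffs):
    b = max(range(n), key=lambda k: payoffs[k])
    return [1.0 if k == b else 0.0 for k in range(n)]

x1 = [0.2, 0.3, 0.5]
x2 = [0.2, 0.3, 0.5]
dt = 0.01
for _ in range(1500):                       # Euler path of (BR) to t = 15
    F1 = [sum(U1[i][j] * x2[j] for j in range(n)) for i in range(n)]   # F1 = U1 x2
    F2 = [sum(U2[i][j] * x1[i] for i in range(n)) for j in range(n)]   # F2 = (U2)' x1
    b1, b2 = best_response(F1), best_response(F2)
    x1 = [xi + dt * (bi - xi) for xi, bi in zip(x1, b1)]
    x2 = [xi + dt * (bi - xi) for xi, bi in zip(x2, b2)]
```

From this initial condition the target pair stays fixed at (2, 2), so both populations converge exponentially to that pure profile, exactly as the simple-solution argument in the proof describes.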
Proof. To begin, suppose that the sequence of times {tk } is ﬁnite, with ﬁnal element tK .
Let i∗ and j∗ be the selections made by revising agents after time tK . Then the pure state
x* = (v¹_{i*}, v²_{j*}) is in B(x_t) for all t ≥ t_K, and {x_t} converges to x*. Since payoffs are continuous, it follows that x* ∈ B(x*), and so that x* is a Nash equilibrium. To complete the proof of the
theorem, we establish by contradiction that the sequence of times {tk } cannot be inﬁnite.
To begin, note that at time tk , agents in each population p are indiﬀerent between their
interval k and interval k + 1 selections. Moreover, since U exhibits strictly diminishing
returns, it is easy to verify that whenever such an indiﬀerence occurs, it must be between
two consecutive strategies in Sp . Putting these observations together, we ﬁnd that each
transition in the sequence {(ik , jk )} is of length 1, in the sense that
|i_{k+1} − i_k| ≤ 1 and |j_{k+1} − j_k| ≤ 1 for all k.
[Figure 6.3.1 here.]

Figure 6.3.1: The proof of Theorem 6.3.4.

Next, we say that there is an improvement step from (i, j) ∈ S to (i′, j′) ∈ S, denoted
(i, j) → (i′, j′), if either (i) U¹_{i′j} > U¹_{ij} and j′ = j, or (ii) i′ = i and U²_{ij′} > U²_{ij}. Lemma 6.3.2(i) and the fact that U is nondegenerate imply that (i_k, j_k) → (i_{k+1}, j_{k+1}) if either i_k = i_{k+1} or j_k = j_{k+1}. Moreover, applying both parts of the lemma, we find that if i_k ≠ i_{k+1} and j_k ≠ j_{k+1}, we have that (i_k, j_k) → (i_{k+1}, j_k) → (i_{k+1}, j_{k+1}), and also that (i_k, j_k) → (i_k, j_{k+1}) → (i_{k+1}, j_{k+1}).
Now suppose that the sequence {tk } is inﬁnite. Then since S is ﬁnite, there must be
a strategy proﬁle that is the interval k selection for more than one k. In this case, the
arguments in the previous two paragraphs imply that there is a length 1 improvement
cycle: that is, a sequence of length 1 improvement steps beginning and ending with the
same strategy proﬁle.
Evidently, this cycle must contain an improvement step of the form (ĩ, j̃) → (ĩ, j̃ + 1) for some (ĩ, j̃) ∈ S. Strict supermodularity of U then implies that

(6.28)   (i, j̃) → (i, j̃ + 1) for all i ≥ ĩ.

It follows that for the sequence of length 1 improvement steps to return to (ĩ, j̃), there must be an improvement step of the form (ĩ, ĵ) → (ĩ − 1, ĵ) for some ĵ > j̃ (see Figure 6.3.1). This time, strict supermodularity of U implies that

(6.29)   (ĩ, j) → (ĩ − 1, j) for all j ≤ ĵ.

From (6.28) and (6.29), it follows that no cycle of length 1 improvement steps containing (ĩ, j̃) → (ĩ, j̃ + 1) can reach any strategy profile (i, j) with i ≥ ĩ and j ≤ j̃. In particular, the cycle cannot return to (ĩ, j̃), which is a contradiction. This completes the proof of the theorem.

6.3.2 Stochastically Perturbed Best Response Dynamics

While Theorem 6.3.4 was proved using elementary techniques, it was not as general as
one might hope: it restricted attention to two-player normal form games, and required not only the assumption of supermodularity, but also that of diminishing returns. In order to
obtain a more general convergence result, we turn from exact best response dynamics to
perturbed best response dynamics. Doing so allows us to avail ourselves of a powerful set
of techniques for smooth dynamics with a monotone structure: the theory of cooperative
diﬀerential equations.
To begin, let us recall the transformations used to discuss the stochastic dominance
p
p
p
p
˜
order. In Section 2.4, we deﬁned the matrices Σ ∈ R(n −1)×n , Σ ∈ Rn ×(n −1) , and Ω ∈ Rn×n by 0 1 · · · 1 . .
. . .. .
.
. ,
Σ= . .
. 0 · · · 0 1 −1 0 1 −1 ˜ =0 Σ
1 . .
.
.
.. .. .
0 ··· 0 1 . ..
. 0
. . .. , and Ω = 0 . 0 . .
.. . . −1 0
0
1 1 · · · · · · 1 0 · · · · · · 0 .
..
. . .
0
. .
.. .
.
. .
.
. 0 ··· ··· 0 We saw that yp ∈ Xp stochastically dominates xp ∈ Xp if and only if Σ yp ≥ Σxp . We also
veriﬁed that
(6.30)   Σ̃Σ = I − Ω.

Since Ω is the null operator on TX^p, equation (6.30) describes a sense in which the stochastic dominance operator Σ is “inverted” by the difference operator Σ̃.
Applying the change of coordinates Σ to the set X^p yields the set of transformed population states

X′^p ≡ ΣX^p = {x′^p ∈ R^{n^p−1} : m^p ≥ x′^p_1 ≥ ⋯ ≥ x′^p_{n^p−1} ≥ 0}.

By postmultiplying both sides of (6.30) by x^p and letting x̲^p = (m^p, 0, …, 0) denote the minimal state in X^p, we find that the inverse of the map Σ : X^p → X′^p is described as follows:

(6.31)   x′^p = Σx^p ⇔ x^p = Σ̃x′^p + x̲^p.
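The identities above can be verified mechanically. The sketch below (with n^p = 4, unit mass, and the entrywise definitions of Σ, Σ̃, and Ω as assumed here) checks (6.30) and the inverse map (6.31):

```python
n = 4
# Sigma: (n-1) x n, (Sigma x)_i = sum_{j > i} x_j  (decumulative sums)
Sigma  = [[1 if j > i else 0 for j in range(n)] for i in range(n - 1)]
# SigmaT ("Sigma-tilde"): n x (n-1) difference operator, -1 on the diagonal, +1 below it
SigmaT = [[-1 if i == j else (1 if i == j + 1 else 0) for j in range(n - 1)]
          for i in range(n)]
# Omega: first row of ones, zeros elsewhere
Omega  = [[1 if i == 0 else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

I_minus_Omega = [[(1 if i == j else 0) - Omega[i][j] for j in range(n)] for i in range(n)]
prod = matmul(SigmaT, Sigma)        # should equal I - Omega, as in (6.30)

# Inverse map (6.31): x = SigmaT x' + x_min recovers x from x' = Sigma x.
x = [0.1, 0.2, 0.3, 0.4]            # a state with mass m = 1
x_prime = [sum(Sigma[i][j] * x[j] for j in range(n)) for i in range(n - 1)]
x_min = [1.0, 0.0, 0.0, 0.0]        # minimal state (m, 0, ..., 0)
x_back = [sum(SigmaT[i][j] * x_prime[j] for j in range(n - 1)) + x_min[i] for i in range(n)]
```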
˜
˜
˜
diag(Σ, . . . , Σ) and Σ = diag(Σ, . . . , Σ), and let X ≡ ΣX = p∈P X p . If we let x = (x1 , . . . , xp )
243 be the minimal state in X, then the inverse of the map Σ : X → X is described by
(6.32) ˜
x = Σx ⇔ x = Σx + x. To simplify the discussion to follow, let us assume for convenience that each population
is of mass 1. Then our stochastically perturbed best response dynamics take the form
(6.33) ˜
˙
xp = Mp (Fp (x)) − xp , where
p
p
˜p
Mi (πp ) = P i = argmax j∈Sp π j + ε j for some admissible stochastic perturbations ε = (ε1 , . . . , εp ). Rather than study this
dynamic directly, we apply the change of variable (6.32) to obtain a new dynamic on the
set X :
(6.34) ˜
˜
˙
x p = ΣMp (Fp (Σx + x)) − x p . ˜
Given the current state x ∈ X , we use the inverse transformation x → x ≡ Σx + x
to obtain the input for the payoﬀ function Fp , and we use the original transformation
˜
˜
Mp (Fp (x)) → ΣMp (Fp (x)) to convert the perturbed best response into an element of X p . The
next observation veriﬁes the relationship between solutions to the transformed dynamic
(6.34) and solutions to the original dynamic (6.33).
˜
Observation 6.3.5. (6.33) and (6.34) are aﬃnely conjugate: {xt } = {Σx t + x} solves (6.33) if and
only if {x t } = {Σxt } solves (6.34).
Our next task is to show that if F is a supermodular game, then (6.34) is a cooperative
˙
diﬀerential equation: writing this dynamic as x = V (x ), we want to show that
∂Vi p q ∂x j (x ) ≥ 0 for all x ∈ X whenever (i, p) ( j, q). If this inequality is always satisﬁed strictly, we say that (6.34) is strongly cooperative. As
we explain in Section 6.C, strongly cooperative diﬀerential equations converge to rest
points from almost all initial conditions. Thus, if we can prove that equation (6.34) is
strongly cooperative, we can conclude that almost all solutions of our original dynamic
(6.33) converge to perturbed equilibria.
244 To prove that (6.34) is strongly cooperative, we marshal our facts about supermodular
games and stochastically perturbed best responses. Recall from Chapter 2 that if the
population game F is C1 , then F is a supermodular if and only if
˜
˜
Σ DF(x)Σ ≥ 0 for all x ∈ X.
Our result requires an additional nondegeneracy condition: we say that F is irreducible if
˜
˜
each column of Σ DF(x)Σ contains a strictly positive element.
˜
Next, we recall from Lemma 5.C.1 the basic properties of DM(π), the derivative matrix
˜
of the stochastically perturbed best response function M.
˜
Lemma 6.3.6. Fix π ∈ Rn , and suppose that the perturbed best response function M is derived
˜
from admissible stochastic payoﬀ perturbations. Then the derivative matrix DM(π) is symmetric,
˜
has negative oﬀdiagonal elements, and satisﬁes DM(π)1 = 0.
Combining these facts yields the desired result:
Theorem 6.3.7. Let F be a C1 irreducible supermodular game, and let (6.33) be a stochastically perturbed best response dynamic for F. Then the transformed dynamic (6.34) is strongly
cooperative.
˙
Proof. (p = 1) Write the dynamic (6.34) as x = V (x ). Then
(6.35) ˜
˜
DV (x ) = D(ΣM(F(Σx + x))) − I. Since all oﬀdiagonal elements of I equal zero, it is enough to show that the ﬁrst term on
the right hand side of (6.35) has all positive components.
˜
˜
˜
Let x = Σx + x and π = F(x). Using the facts that ΣΣ = I − Ω and DM(π)1 = 0, we
express the ﬁrst term of the right hand side of (6.35) as follows:
˜
˜
˜
˜
D(ΣM(F(Σx + x))) = Σ DM(π) DF(x) Σ
˜
˜
˜
= Σ DM(π) (Σ Σ + Ω ) DF(x) Σ
˜
˜
˜
= (ΣDM(π)Σ )(Σ DF(x)Σ).
Lemma 6.3.6 and the fact that
˜
(ΣDM(π)Σ )i j = ˜
DM(π)kl
k>i l> j ˜
imply that every component of ΣDM(π)Σ is positive (see Exercise 6.3.8). Since F is
˜
˜
supermodular and irreducible, Σ DF(x)Σ is nonnegative, with each column containing a
245 positive element. Thus, the product of these two matrices has all positive elements. This
completes the proof of the theorem.
˜
Exercise 6.3.8.
(i) Prove that every component of Σ DM̃(π) Σ′ is positive.
(ii) Explain why Theorem 6.3.7 need not hold when M̃ is generated by deterministic
perturbations.
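For part (i) of the exercise, the logit case gives a concrete check: with M̃(π) = softmax(π/η), the derivative DM̃(π) = (1/η)(diag(m) − mm′) has the properties listed in Lemma 6.3.6, and the tail sums (ΣDM̃(π)Σ′)_{ij} = Σ_{k>i} Σ_{l>j} DM̃(π)_{kl} are all positive. The sketch below verifies this numerically (η and π are arbitrary choices of the illustration):

```python
import math

# Logit choice with noise level eta: Mtilde(pi) = softmax(pi / eta). Its derivative
# DM = (1/eta) * (diag(m) - m m') is symmetric, has negative off-diagonal entries,
# and has rows summing to zero, as in Lemma 6.3.6.
eta, n = 0.5, 4
pi = [1.0, -0.3, 0.7, 0.2]
mmax = max(pi)
w = [math.exp((p - mmax) / eta) for p in pi]
s = sum(w)
m = [wi / s for wi in w]
DM = [[(1 / eta) * ((m[i] if i == j else 0.0) - m[i] * m[j]) for j in range(n)]
      for i in range(n)]

# (Sigma DM Sigma')_{ij} = sum over k > i and l > j of DM_{kl}
P = [[sum(DM[k][l] for k in range(i + 1, n) for l in range(j + 1, n))
      for j in range(n - 1)] for i in range(n - 1)]
```

Analytically, P_{ij} = T_{max(i,j)}(1 − T_{min(i,j)})/η with T_i = Σ_{k>i} m_k, which is strictly positive whenever every m_k is; the code confirms this for the chosen π.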
Observation 6.3.5, Theorem 6.3.7, and Theorems 6.C.1, 6.C.2, and 6.C.3 immediately imply the following “almost global” convergence result. In part (i) of the theorem, x̲ = (x̲¹, …, x̲^p) is the minimal state in X introduced above; similarly, x̄^p = (0, …, 0, m^p) is the maximal state in X^p, and x̄ = (x̄¹, …, x̄^p) is the maximal state in X.

Theorem 6.3.9. Let F be a C¹ irreducible supermodular game, and let ẋ = V_{F,ε}(x) be a stochastically perturbed best response dynamic for F. Then
(i) States x̲* ≡ ω(x̲) and x̄* ≡ ω(x̄) exist and are the minimal and maximal elements of PE(F, ε). Moreover, [x̲*, x̄*] contains all ω-limit points of V_{F,ε} and is globally asymptotically stable.
(ii) Solutions to ẋ = V_{F,ε}(x) from an open, dense, full measure set of initial conditions in X converge to states in PE(F, ε).
Our ﬁnal example shows that the conclusion of Theorem 6.3.9 cannot be extended from
convergence from almost all initial conditions to convergence from all initial conditions.
Example 6.3.10. Let U be a normal form game with p ≥ 5 players and two strategies per player. Each player p in U obtains a payoff of 1 if he chooses the same strategy as player p + 1 (with the convention that p + 1 = 1) and obtains a payoff of 0 otherwise. U has three Nash equilibria: two strict equilibria in which all players coordinate on the same strategy, and the mixed equilibrium x* = ((½, ½), …, (½, ½)). If F is the p-population game generated by random matching in U, it can be shown that F is supermodular and irreducible (see Exercise 6.3.11(i)).
We now introduce random perturbations ε^p = (ε^p_1, ε^p_2) to each player's payoffs. These perturbations are such that the differences ε^p_2 − ε^p_1 admit a common density g that is symmetric about 0, is decreasing on R₊, and satisfies g(0) > ½. It can be shown that the
resulting perturbed best response dynamic (6.33) possesses exactly three rest points: the
mixed equilibrium x∗ , and two stable symmetric rest points that approximate the two
pure Nash equilibria (see Exercise 6.3.11(ii)).
One can show that the rest point x∗ is unstable under (6.33). It then follows from
Theorem 6.3.9 that the two stable rest points of (6.33) attract almost all initial conditions in
X, and that the basins of attraction for these rest points are separated by a (p − 1)-dimensional invariant manifold M that contains x*. Furthermore, one can show that when p ≥ 5, the
rest point x∗ is unstable with respect to the manifold M . Thus, solutions from all states in
M − {x∗ } fail to converge to a rest point. §
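The rest-point count in this example can be illustrated concretely. At a symmetric state in which every population plays its first strategy with probability x, rest points of (6.33) solve x = G(2x − 1), where G is the common cdf of ε^p_2 − ε^p_1. The sketch below assumes a logistic G with scale η = 0.1 (so g(0) = 1/(4η) = 2.5 > ½, satisfying the example's condition) and counts the sign changes of G(2x − 1) − x on [0, 1]:

```python
import math

eta = 0.1                                   # logistic noise scale (an assumed choice)
def G(d):                                   # cdf of eps2 - eps1; g(0) = 1/(4*eta) = 2.5
    return 1.0 / (1.0 + math.exp(-d / eta))

def f(x):                                   # symmetric rest points solve f(x) = 0
    return G(2.0 * x - 1.0) - x

xs = [k / 1999 for k in range(2000)]        # grid on [0, 1] that avoids x = 1/2 exactly
signs = [1 if f(x) > 0 else -1 for x in xs]
changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
```

The three sign changes correspond to the three rest points: one near each pure equilibrium and the unstable mixed equilibrium at x = ½.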
The details of these last arguments require techniques for determining the local stability
of rest points. This is the topic of the next chapter.
Exercise 6.3.11.
(i) Prove that the game F introduced in Example 6.3.10 is supermodular and irreducible.
(ii) Prove that under the assumption on payoﬀ perturbations stated in the example,
there are exactly three perturbed equilibria, all of which are symmetric.

6.4 Dominance Solvable Games

The elimination of strictly dominated strategies is the mildest requirement employed in standard game-theoretic analyses, and so it seems natural to expect evolutionary dynamics to obey this dictum. In this section, we provide some positive results on the elimination of
dominated strategies: under the best response dynamic, any strictly dominated strategy
must vanish in the limit; the same is true under any imitative dynamic so long as we focus
on interior initial conditions. Arguing inductively, we show next that any strategy that
does not survive iterated elimination of strictly dominated strategies vanishes as well. In
particular, if a game is dominance solvable (that is, if removing iteratively dominated strategies leaves only one strategy for each population), then best response and imitative
dynamics select this strategy.
These results may seem unsurprising. However, we argue in Chapter 8 that they
are actually borderline cases: under “typical” evolutionary dynamics, strictly dominated
strategies can survive in perpetuity.

6.4.1 Dominated and Iteratively Dominated Strategies

Let F be a population game. We say that strategy i ∈ S^p is strictly dominated if there exists a strategy j ∈ S^p such that F^p_j(x) > F^p_i(x) for all x ∈ X: that is, if there is a strategy j that outperforms strategy i regardless of the population state. Similarly, if Ŝ^p is a nonempty subset of S^p and Ŝ = Π_{p∈P} Ŝ^p, we say that i ∈ Ŝ^p is strictly dominated relative to Ŝ, denoted i ∈ D^p(Ŝ), if there exists a strategy j ∈ Ŝ^p such that F^p_j(x) > F^p_i(x) for all x ∈ X that satisfy support(x^p) ⊆ Ŝ^p for all p ∈ P.
We can use these deﬁnitions to introduce the notion of iterative dominance. Set
S0 = S. Then D p (S0 ) is the set of strictly dominated strategies for population p, and
so S^p_1 = S^p_0 − D^p(S_0) is the set of strategies that are not strictly dominated. Proceeding
inductively, we deﬁne D p (Sk ) to be the set of strategies that are eliminated during the
(k + 1)st round of removal of iteratively dominated strategies, and we let S^p_{k+1} = S^p_k − D^p(S_k)
be the set of strategies that survive k + 1 rounds of removal of such strategies.
Since the number of strategies is ﬁnite, this iterative procedure must converge, leaving
us with nonempty sets S¹_∗, …, S^p_∗. Strategies in these sets are said to survive iterative removal
of strictly dominated strategies. If each of these sets is a singleton, then the game F is said
to be dominance solvable. In this case, the pure social state at which each agent plays his
population’s sole surviving strategy is the game’s unique Nash equilibrium; we call this
state the dominance solution of F.

6.4.2 The Best Response Dynamic

Under the best response dynamic, revising agents always switch to optimal strategies.
Since strictly dominated strategies are never optimal, such strategies cannot persist:
Observation 6.4.1. Let {xt } be a solution trajectory of (BR) for population game F, in which
p
strategy i ∈ Sp is strictly dominated. Then limt→∞ (xt )i = 0.
p p ˙
Indeed, since i is never a best response, we have that (xt )i ≡ −(xt )i , and hence that
p
p −t
(xt )i = (x0 )i e : the mass playing the dominated strategy converges to zero exponentially
quickly.
An inductive argument takes us from the observation above to the following result.
Theorem 6.4.2. Let {xt } be a solution trajectory of (BR) for population game F, in which strategy
p
i ∈ Sp does not survive iterative elimination of strictly dominated strategies. Then limt→∞ (xt )i = 0.
In particular, if F is dominance solvable, then all solutions of (BR) converge to the dominance
solution.
p p Proof. Observation 6.4.1 provides the basis for this induction: if i S1 , then limt→∞ (xt )i =
p
0. As the inductive hypothesis, suppose that this same equality holds for all i Sk . Now
p
p
p
p
p
let j ∈ Sk − Sk+1 . Then by deﬁnition, there exists a j ∈ Sk+1 such that F j (x) > F j (x) whenever
p
p
x ∈ Xk , where Xk = {x ∈ X : xi > 0 ⇒ i ∈ Sk } is the set of social states in which all agents
p
in each population p choose a strategies in Sk . Since Xk is compact and F is continuous, it
p
p
follows that for some c > 0, we have that F j (x) > F j (x) + c whenever x ∈ Xk , and so that
p
p
p
p
for some ε > 0, we have that F j (x) > F j (x) whenever x ∈ Xk,ε = {x ∈ X : xi > ε ⇒ i ∈ Sk }.
By the inductive hypothesis, there exists a T > 0 such that xt ∈ Xk,ε for all t ≥ T. Thus, for
p
p
˙
such t, j is not a best response to xt . This implies that (xt ) j = −(xt ) j for t ≥ T, and hence
p
p
that (xt ) j = (xT ) j eT−t , which converges to 0 as t approaches inﬁnity. 248 Exercise 6.4.3. Show that under (BR), the time until convergence to the set X∗,ε = {x ∈ X :
p
p
xi > ε ⇒ i ∈ S∗ } is uniform over initial conditions in X. 6.4.3 Imitative Dynamics We now establish analogous results for imitative dynamics. Since these dynamics
leave the boundary of the state space invariant, the elimination results can only hold for
solutions starting from interior initial conditions.
Theorem 6.4.4. Let {xt } be an interior solution trajectory of an imitative dynamic for population
game F, in which strategy i ∈ S^p is strictly dominated. Then lim_{t→∞} (x_t)^p_i = 0.
˙
Proof. (p = 1) Observation 4.4.16 tells us that all imitative dynamics x = VF (x) exhibit
monotone percentage growth rates (4.16): we can write the dynamic as
(6.36) ˙
xi = xi Gi (x), where the continuous function G : X → Rn satisﬁes
(6.37) Gk (x) ≤ Gl (x) if and only if Fk (x) ≤ Fl (x) for all x ∈ int(X). Now suppose strategy i is strictly dominated by strategy j ∈ S. Since X is compact and
F is continuous, we can ﬁnd a c > 0 such that F j (x) − Fi (x) > c for all x ∈ X. Since G is
continuous as well, equation (6.37) implies that for some C > 0, we have that G j (x) − Gi (x) >
C for all x ∈ X.
Now write r = x_i/x_j. Equation (6.36) and the quotient rule imply that

(6.38)  (d/dt) r = (d/dt)(x_i/x_j) = (ẋ_i x_j − ẋ_j x_i)/(x_j)² = (x_i G_i(x) x_j − x_j G_j(x) x_i)/(x_j)² = r (G_i(x) − G_j(x)).

Thus, along every interior solution trajectory {x_t} of ẋ = V^F(x) we have that

r_t = r_0 + ∫_0^t r_s (G_i(x_s) − G_j(x_s)) ds ≤ r_0 − C ∫_0^t r_s ds.

Grönwall's Inequality (Lemma 3.A.7) then tells us that r_t ≤ r_0 exp(−Ct), and hence that r_t vanishes as t approaches infinity. Since (x_t)_j is bounded above by 1, (x_t)_i must approach 0 as t approaches infinity. ∎
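The decay just derived can be checked numerically. The sketch below simulates the replicator dynamic (a canonical imitative dynamic) in an illustrative three-strategy game; the payoff matrix is an assumption of this sketch, not an example from the text. Strategy 0 is strictly dominated by strategy 1 with F_1(x) − F_0(x) ≡ 1 on the simplex, so the ratio r_t = x_0/x_1 should decay like e^{−t}:

```python
import numpy as np

# Illustrative payoff matrix (not from the text): strategy 0 is strictly
# dominated by strategy 1, with F_1(x) - F_0(x) = 1 at every state x.
A = np.array([[0.0, 0.0, 0.0],
              [1.0, 1.0, 1.0],
              [3.0, 0.0, 1.0]])

def replicator(x):
    F = A @ x
    return x * (F - x @ F)          # percentage growth rate G_i = F_i - mean payoff

def rk4_step(x, dt):
    k1 = replicator(x)
    k2 = replicator(x + 0.5 * dt * k1)
    k3 = replicator(x + 0.5 * dt * k2)
    k4 = replicator(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

x = np.array([0.3, 0.3, 0.4])       # interior initial condition
r0 = x[0] / x[1]
dt, T = 0.01, 10.0
for _ in range(int(T / dt)):
    x = rk4_step(x, dt)

# For this game G_0 - G_1 = F_0 - F_1 = -1 everywhere, so (6.38) gives
# r_t = r_0 e^{-t} exactly in continuous time.
r_T = x[0] / x[1]
decay_error = abs(r_T - r0 * np.exp(-T))
```

Since the replicator dynamic's percentage growth rates are excess payoffs, r here satisfies ṙ = −r exactly; the variable `decay_error` measures only the integration error.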
An argument similar to the one used to prove Theorem 6.4.2 can be used to prove that
iteratively dominated strategies are eliminated by imitative dynamics.
Theorem 6.4.5. Let {x_t} be an interior solution trajectory of an imitative dynamic for population game F, in which strategy i ∈ S^p does not survive iterative elimination of strictly dominated strategies. Then lim_{t→∞} (x_t)_i^p = 0. In particular, if F is dominance solvable, then all interior solutions of any imitative dynamic converge to the dominance solution.

Exercise 6.4.6.
(i) Prove Theorem 6.4.5.
(ii) Is the time until convergence to X_{*,ε} = {x ∈ X : x_i^p > ε ⇒ i ∈ S_*^p} uniform over initial conditions in int(X)? Explain.

Appendix

6.A Limit and Stability Notions for Deterministic Dynamics

We consider differential equations and differential inclusions that are forward invariant
on the compact set X ⊂ Rn .
(D)  ẋ = V(x), where a unique forward solution exists from each ξ ∈ X.

(DI)  ẋ ∈ V(x), where V is nonempty, convex-valued, bounded, and upper hemicontinuous.

When V is discontinuous, we allow solutions to be of the Carathéodory type—that is, to satisfy ẋ_t = V(x_t) (or ẋ_t ∈ V(x_t)) at almost all t ∈ [0, ∞).

6.A.1 ω-Limits and Notions of Recurrence

Let {x_t} = {x_t}_{t≥0} be a solution trajectory to (D) or (DI). The ω-limit of {x_t} is the set of all points that the trajectory approaches arbitrarily closely infinitely often:

ω({x_t}) = {y ∈ X : there exists {t_k}_{k=1}^∞ with lim_{k→∞} t_k = ∞ such that lim_{k→∞} x_{t_k} = y}.

The following proposition lists some basic properties of ω-limit sets.
Proposition 6.A.1. Let {x_t} be a solution to (D) (or (DI)). Then
(i) ω({x_t}) is nonempty and connected.
(ii) ω({x_t}) is closed. In fact, ω({x_t}) = ∩_{t≥0} cl({x_s : s ≥ t}).
(iii) ω({x_t}) is invariant under (D) (or (DI)).

If {x_t} is the unique solution to (D) with initial condition x_0 = ξ, we write ω(ξ) in place of ω({x_t}). In this case, the set

Ω = ∪_{ξ∈X} ω(ξ)

contains all points that are approached arbitrarily closely infinitely often by some solution of (D). Among other things, Ω contains all rest points, periodic orbits, and chaotic attractors of (D). Since Ω need not be closed, its closure Ω̄ = cl(Ω) is used to define a standard notion of recurrence for differential equations.
Example 6.A.2. To see that Ω need not be closed, consider the replicator dynamic in standard Rock–Paper–Scissors (Figure 4.3.1(i)). The unique Nash equilibrium x* = (1/3, 1/3, 1/3) is a rest point, and solution trajectories from all other interior initial conditions form closed orbits around x*. The vertices e_R, e_P, and e_S are also rest points, and each trajectory starting from a boundary point that is not a vertex converges to a vertex. Thus, Ω = int(X) ∪ {e_R, e_P, e_S}, but Ω̄ = X. §
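The two claims of this example can be checked numerically. The sketch below (an illustration, not code from the text) uses the standard RPS payoffs with win 1, loss −1, tie 0; for this game the product x_R x_P x_S is a constant of motion of the replicator dynamic (its log-derivative is Σ_i (F_i − F̄) = 0), so interior orbits stay on closed level curves, while non-vertex boundary orbits converge to a vertex:

```python
import numpy as np

# Standard Rock-Paper-Scissors payoffs (win = 1, loss = -1, tie = 0).
A = np.array([[ 0.0, -1.0,  1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0]])

def replicator(x):
    F = A @ x
    return x * (F - x @ F)      # x @ F = x'Ax = 0 since A is antisymmetric

def solve(x, dt, steps):
    for _ in range(steps):
        k1 = replicator(x)
        k2 = replicator(x + 0.5 * dt * k1)
        k3 = replicator(x + 0.5 * dt * k2)
        k4 = replicator(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

# Interior orbit: the invariant x_R * x_P * x_S should be (numerically) conserved.
x0 = np.array([0.5, 0.3, 0.2])
xT = solve(x0.copy(), 0.005, 4000)       # integrate to t = 20
product_drift = abs(np.prod(xT) / np.prod(x0) - 1.0)

# Boundary orbit: a non-vertex boundary state converges to a vertex (here e_S).
b0 = np.array([0.0, 0.6, 0.4])
bT = solve(b0.copy(), 0.005, 10000)      # integrate to t = 50
```

The boundary run also illustrates that the replicator dynamic leaves faces of the simplex invariant: the unused strategy's share stays exactly zero.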
While we will not make much use of them here, many other notions of recurrence besides Ω̄ are available. To obtain a more demanding notion of recurrence for (D), call the state ξ recurrent if the solution from (D) returns arbitrarily close to ξ infinitely often—in other words, if ξ ∈ ω(ξ). The Birkhoff center of (D) is the closure of the set of all recurrent points of (D).
More inclusive notions of recurrence—most importantly, the notion of chain recurrence—can be obtained by allowing occasional short jumps between nearby solution trajectories. In addition to its uses in the theory of learning in games, chain recurrence is
the key idea needed to state the Fundamental Theorem of Dynamical Systems: the domain
of any smooth ﬂow can be decomposed into two sets: a set on which the ﬂow admits a
Lyapunov function, and the set of chain recurrent points.

6.A.2 Stability of Sets of States

Let A ⊆ X be a closed set, and call O ⊆ X a neighborhood of A if it is open relative to X and contains A. We say that A is Lyapunov stable under (D) (or (DI)) if for every neighborhood O of A there exists a neighborhood O′ of A such that every solution {x_t} that starts in O′ is contained in O: that is, x_0 ∈ O′ implies that x_t ∈ O for all t ≥ 0. A is attracting if there is a neighborhood Y of A such that every solution that starts in Y converges to A: that is, x_0 ∈ Y implies that ω({x_t}) ⊆ A. A is globally attracting if it is attracting with Y = X. Finally, the set A is asymptotically stable if it is Lyapunov stable and attracting, and it is globally asymptotically stable if it is Lyapunov stable and globally attracting.
Example 6.A.3. Attracting sets need not be asymptotically stable. A counterexample is
provided by a ﬂow on the unit circle that moves clockwise except at a single point. The
fact that the domain is the unit circle is unimportant, since one can embed this ﬂow as a
limit cycle in a ﬂow on the plane. §
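This circle flow can be simulated directly. The angular speed chosen below is an illustrative assumption (any clockwise speed that vanishes only at the single rest point behaves the same way): the trajectory starts arbitrarily close to the rest point p, travels all the way around the circle, and only then converges back to p, so {p} is attracting but not Lyapunov stable:

```python
import numpy as np

# Flow on the unit circle, parametrized by the angle theta: clockwise motion
# (theta decreasing) everywhere except the single rest point at theta = 0.
# The speed function is an illustrative choice, not taken from the text.
def theta_dot(theta):
    return -np.sqrt(2.0) * abs(np.sin(theta / 2.0))   # zero only at theta = 0 mod 2*pi

p = np.array([1.0, 0.0])          # the unique rest point
theta = -0.01                     # start just clockwise of the rest point
dt, steps = 0.01, 4000            # integrate to t = 40
dists = []
for _ in range(steps):
    theta += dt * theta_dot(theta)
    x = np.array([np.cos(theta), np.sin(theta)])
    dists.append(np.linalg.norm(x - p))

# The trajectory begins near p, leaves every small neighborhood of p
# (passing the antipode, at distance 2), and finally converges back to p.
start_dist, max_dist, final_dist = dists[0], max(dists), dists[-1]
```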
Example 6.A.4. Invariance is not included in the definition of asymptotic stability. Thus, under the dynamic ẋ = −x on R, any closed interval containing the origin is asymptotically stable. §

6.B Stability Analysis via Lyapunov Functions

Let Y ⊆ X. The function L : Y → R is a Lyapunov function for (D) or (DI) if its value
changes monotonically along every solution trajectory. We state the results to follow for
the case in which the value of L decreases along solution trajectories; of course, the obvious
analogues of these results hold for the opposite case.
The following lemma will prove useful in a number of the analyses to come.
Lemma 6.B.1. Suppose that the function L : Y → R and the trajectory {x_t}_{t≥0} are Lipschitz continuous.
(i) If L̇(x_t) ≤ 0 for almost all t ≥ 0, then the map t ↦ L(x_t) is nonincreasing.
(ii) If in addition L̇(x_s) < 0, then L(x_t) < L(x_s) for all t > s.

Proof. The composition t ↦ L(x_t) is Lipschitz continuous. Thus, the Fundamental Theorem of Calculus tells us that when t > s, we have that

L(x_t) − L(x_s) = ∫_s^t L̇(x_u) du ≤ 0,

where the inequality is strict if L̇(x_s) < 0. ∎

6.B.1 Lyapunov Stable Sets

The basic theorem on Lyapunov stability applies both to differential equations (D) and
differential inclusions (DI).

Theorem 6.B.2 (Lyapunov stability). Let A ⊆ X be closed, and let Y ⊆ X be a neighborhood of A. Let L : Y → R_+ be Lipschitz continuous with L^{−1}(0) = A. If each solution {x_t} of (D) (or (DI)) satisfies L̇(x_t) ≤ 0 for almost all t ≥ 0, then A is Lyapunov stable under (D) (or (DI)).

Proof. Let O be a neighborhood of A such that cl(O) ⊂ Y. Let c = min_{x∈bd(O)} L(x), so that c > 0. Finally, let O′ = {x ∈ O : L(x) < c}. Lemma 6.B.1 implies that solution trajectories that start in O′ do not leave O, and hence that A is Lyapunov stable. ∎
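The trapping construction in this proof can be seen numerically. The vector field and Lyapunov function below are illustrative assumptions, not from the text: L is positive definite with L̇ negative definite, O is the open unit ball, c = min_{bd(O)} L = 1, and O′ = {x ∈ O : L(x) < c}:

```python
import numpy as np

# Illustrative linear field with globally stable rest point at the origin
# (so A = {0}), and a smooth L with L^{-1}(0) = A. Both are assumptions
# for this sketch, not examples taken from the text.
def V(x):
    return np.array([-x[0] + x[1], -x[0] - x[1]])

def L(x):
    # grad L . V = -2x1^2 - 6x1x2 - 8x2^2, a negative definite form
    return x[0]**2 + 4.0 * x[1]**2

def rk4(x, dt, steps):
    traj = [x.copy()]
    for _ in range(steps):
        k1 = V(x)
        k2 = V(x + 0.5 * dt * k1)
        k3 = V(x + 0.5 * dt * k2)
        k4 = V(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj.append(x.copy())
    return traj

# Proof construction: O = open unit ball, c = min L on bd(O) = 1,
# O' = {x in O : L(x) < c}. Start inside O' and verify the trajectory
# never leaves O, with L nonincreasing along the way.
x0 = np.array([0.9, 0.1])                  # L(x0) = 0.85 < c = 1
traj = rk4(x0, 0.01, 3000)                 # integrate to t = 30
L_vals = [L(x) for x in traj]
max_norm = max(np.linalg.norm(x) for x in traj)
L_monotone = all(b <= a + 1e-9 for a, b in zip(L_vals, L_vals[1:]))
```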
Example 6.B.3. The requirement that the function L be constant on A cannot be dispensed with. Consider a flow on the unit circle C = {x ∈ R² : (x_1)² + (x_2)² = 1} that moves clockwise at states x with x_1 > 0 and is at rest at states in the semicircle A = {x ∈ C : x_1 ≤ 0}. If we let L(x) = x_2, then L̇(x) ≤ 0 for all x ∈ C, and A is attracting (see Theorem 6.B.4 below), but A is not Lyapunov stable.

We can extend this example so that the flow is defined on the unit disk D = {x ∈ R² : (x_1)² + (x_2)² ≤ 1}. Suppose that when x_1 > 0, the flow travels clockwise along the circles centered at the origin, and that the half disk A = {x ∈ D : x_1 ≤ 0} consists entirely of rest points. Then L(x) = x_2 satisfies L̇(x) ≤ 0 for all x ∈ D, and A is attracting, but A is not Lyapunov stable. §

6.B.2 ω-Limits and Attracting Sets
We now provide some results that use Lyapunov functions to characterize ωlimits
of solution trajectories that begin in the Lyapunov function’s domain. These results
immediately yield suﬃcient conditions for a set to be attracting. To state our results, we
call the (relatively) open set Y ⊂ X inescapable if for each solution trajectory {xt }t≥0 with
x0 ∈ Y, we have that cl ({xt }) ∩ bd(Y) = ∅.
Our ﬁrst result focuses on the diﬀerential equation (D).
Theorem 6.B.4. Let Y ⊂ X be relatively open and inescapable under (D). Let L : Y → R be C¹, and suppose that L̇(x) ≡ ∇L(x)′V(x) ≤ 0 for all x ∈ Y. Then ω(x_0) ⊆ {x ∈ Y : L̇(x) = 0} for all x_0 ∈ Y. Thus, if L̇(x) = 0 implies that V(x) = 0, then ω(x_0) ⊆ RP(V) ∩ Y.
Proof. Let {x_t} be the solution to (D) with initial condition x_0 = ξ ∈ Y, let χ ∈ ω(ξ), and let {y_t} be the solution to (D) with y_0 = χ. Since Y is inescapable, the closures of trajectories {x_t} and {y_t} are contained in Y.

Suppose by way of contradiction that L̇(χ) ≠ 0, so that L̇(χ) < 0. Since χ ∈ ω(ξ), we can find a divergent sequence of times {t_k}_{k=1}^∞ such that lim_{k→∞} x_{t_k} = χ = y_0. Since solutions to (D) are unique, and hence continuous in their initial conditions, we have that

(6.39)  lim_{k→∞} x_{t_k+1} = y_1, and hence that lim_{k→∞} L(x_{t_k+1}) = L(y_1).

But since y_0 = χ ∈ ω(ξ) and L̇(χ) < 0, applying Lemma 6.B.1 to both {x_t} and {y_t} yields

L(x_t) ≥ L(χ) > L(y_1)

for all t ≥ 0, contradicting the second limit in (6.39). This proves the first claim of the theorem, and the second claim follows immediately from the first. ∎
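A sketch of the theorem's first claim on an illustrative system (an assumption of this sketch, not from the text): for the damped oscillator V(x) = (x_2, −x_1 − x_2) with L(x) = ((x_1)² + (x_2)²)/2, we get L̇(x) = ∇L(x)′V(x) = −(x_2)² ≤ 0, so ω-limits lie in {x_2 = 0}; invariance of the ω-limit set then forces convergence to the origin:

```python
import numpy as np

# Damped oscillator: an illustrative example for this sketch, not one
# from the text. Ldot = x1*x2 + x2*(-x1 - x2) = -x2^2 <= 0.
def V(x):
    return np.array([x[1], -x[0] - x[1]])

x = np.array([1.0, 0.0])
dt = 0.01
late_x2 = []
for step in range(4000):             # integrate to t = 40
    k1 = V(x)
    k2 = V(x + 0.5 * dt * k1)
    k3 = V(x + 0.5 * dt * k2)
    k4 = V(x + dt * k3)
    x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    if step >= 3000:                 # sample the tail of the trajectory
        late_x2.append(abs(x[1]))
```

Note that L̇ vanishes on the whole axis {x_2 = 0}, so the theorem's second claim does not apply here; it is the invariance of the ω-limit that singles out the origin.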
Theorem 6.B.5 is an analogue of Theorem 6.B.4 for upper hemicontinuous differential inclusions. Where the proof of Theorem 6.B.4 relied on the continuity of solutions to (D) in their initial conditions, the proof of Theorem 6.B.5 takes advantage of the upper hemicontinuity of the map from initial conditions ξ to solutions of (DI) starting from ξ.

Theorem 6.B.5. Let Y ⊂ X be relatively open and inescapable under (DI). Let L : Y → R be C¹ and satisfy (i) ∂L/∂v(x) ≡ ∇L(x)′v ≤ 0 for all v ∈ V(x) and x ∈ Y, and (ii) [0 ∉ V(x) implies that ∂L/∂v(x) < 0] for all v ∈ V(x) and x ∈ Y. Then for all solutions {x_t} of (DI) with x_0 ∈ Y, we have that ω({x_t}) ⊆ {x ∈ Y : 0 ∈ V(x)}.
Proof. Suppose that χ ∈ ω({x_t}), but that 0 ∉ V(χ). Then ∂L/∂v(χ) < 0 for all v ∈ V(χ). Thus, since V(χ) is compact by assumption, there exists a b > 0 such that ∂L/∂v(χ) < −b for all v ∈ V(χ). Because V is upper hemicontinuous and L is C¹, it follows that ∂L/∂v̂(χ̂) < −b/2 for all v̂ ∈ V(χ̂) and all χ̂ sufficiently close to χ. So since V is bounded, there is a time u ∈ (0, 1] such that all solutions {y_t} of (DI) with y_0 = χ satisfy

(6.40)  L(y_t) ≤ L(y_s) ≤ L(χ) − bs/2  for all s ∈ [0, u] and t > s.

Now let {t_k}_{k=1}^∞ be a divergent sequence of times such that lim_{k→∞} x_{t_k} = χ, and for each k, define the trajectory {x_t^k}_{t≥0} by x_t^k = x_{t+t_k}. Since the set of continuous trajectories C_{[0,T]}(X) is compact in the sup norm topology, the sequence of trajectories {{x_t^k}}_{k=1}^∞ has a convergent subsequence, which we take without loss of generality to be {{x_t^k}}_{k=1}^∞ itself. We call the limit of this subsequence {ŷ_t}. Evidently, ŷ_0 = χ.

Given our conditions on the correspondence V, the set-valued map

χ̂ ↦ {{x̂_t} : {x̂_t} is a solution to (DI) with x̂_0 = χ̂}

is upper hemicontinuous with respect to the sup norm topology on C_{[0,T]}(X) (see Appendix 5.A). It follows that {ŷ_t} is a solution to (DI). Consequently,

(6.41)  lim_{k→∞} x_{t_k+1} = ŷ_1, and so lim_{k→∞} L(x_{t_k+1}) = L(ŷ_1).

But Lemma 6.B.1 and inequality (6.40) imply that

L(x_t) ≥ L(χ) > L(ŷ_1)

for all t ≥ 0, contradicting the second limit in (6.41). ∎
Theorem 6.B.6 is a simple convergence result for diﬀerential inclusions. Here the
Lyapunov function need only be Lipschitz continuous (rather than C1 ), but the condition
on the rate of decrease of this function is stronger than in the previous results.
Theorem 6.B.6. Let Y ⊂ X be relatively open and inescapable under (DI), and let L : Y → R_+ be Lipschitz continuous. Suppose that along each solution {x_t} of (DI) with x_0 ∈ Y, we have that L̇(x_t) ≤ −L(x_t) for almost all t ≥ 0. Then ω({x_t}) ⊂ {x ∈ Y : L(x) = 0}.

Proof. Observe that

L(x_t) = L(x_0) + ∫_0^t L̇(x_u) du ≤ L(x_0) + ∫_0^t −L(x_u) du ≤ L(x_0) e^{−t},

where the final inequality follows from the fact that α_0 + ∫_0^t −α_u du is the value at time t of the solution to the linear ODE α̇_t = −α_t with initial condition α_0 ∈ R, together with a comparison of L with this solution. It follows immediately that lim_{t→∞} L(x_t) = 0. ∎

6.B.3 Asymptotically Stable and Globally Asymptotically Stable Sets

Combining Theorem 6.B.2 with Theorem 6.B.4, 6.B.5, or 6.B.6 yields asymptotic stability
and global asymptotic stability results for deterministic dynamics. Corollary 6.B.7 oﬀers
such a result for the diﬀerential equation (D).
Corollary 6.B.7. Let A ⊆ X be closed, and let Y ⊆ X be a neighborhood of A. Let L : Y → R_+ be C¹ with L^{−1}(0) = A. If L̇(x) ≡ ∇L(x)′V(x) < 0 for all x ∈ Y − A, then A is asymptotically stable under (D). If in addition Y = X, then A is globally asymptotically stable under (D).

6.C Cooperative Differential Equations

Cooperative differential equations are defined by the property that increases in the
value of one component of the state variable increase the growth rates of all other components. Their solutions have appealing monotonicity and convergence properties.
Let ≤ denote the standard partial order on R^n: that is, x ≤ y if and only if x_i ≤ y_i for all i ∈ {1, . . . , n}. We write x < y when x ≤ y and x ≠ y, so that x_j < y_j for some j. Finally, we write x ≪ y when x_i < y_i for all i ∈ {1, . . . , n}. We call a vector or a matrix strongly positive if all of its components are positive; thus, x ∈ R^n is strongly positive if x ≫ 0.

Let X ⊂ R^n be a compact convex set that possesses a minimal and a maximal element with respect to the partial order ≤. Let V : X → R^n be a C¹ vector field with V(x) ∈ TX(x) for all x ∈ X, so that the differential equation

(6.42)  ẋ = V(x)

is forward invariant on X. We call the differential equation (6.42) cooperative if

(6.43)  ∂V_i/∂x_j(x) ≥ 0  for all i ≠ j and x ∈ X.

Equation (6.42) is irreducible if for every x ∈ X and every nonempty proper subset I of the index set {1, . . . , n}, there exist indices i ∈ I and j ∈ {1, . . . , n} − I such that ∂V_i/∂x_j(x) ≠ 0. An obvious sufficient condition for (6.42) to be irreducible is that it be strongly cooperative, meaning that the inequality in condition (6.43) is strict for all i ≠ j and x ∈ X.
In Appendix 3.A.3, we saw how to represent all solutions to the dynamic (6.42) simultaneously via the semiflow φ : R_+ × X → X, defined by φ_t(ξ) = x_t, where {x_t}_{t≥0} is the solution to (6.42) with initial condition x_0 = ξ. We say that the semiflow φ is monotone if x ≤ y implies that φ_t(x) ≤ φ_t(y) for all t ≥ 0: that is, weakly ordered initial conditions induce weakly ordered solution trajectories. If in addition x < y implies that φ_t(x) ≪ φ_t(y) for all t > 0, we say that φ is strongly monotone.

Theorem 6.C.1 tells us that cooperative irreducible differential equations generate strongly monotone semiflows.

Theorem 6.C.1. Suppose that ẋ = V(x) is cooperative and irreducible. Then
(i) For all t > 0, the derivative matrix of its semiflow φ is strongly positive: Dφ_t(x) ≫ 0.
(ii) The semiflow φ is strongly monotone.

For the intuition behind Theorem 6.C.1, let {x_t} and {y_t} be solutions to (6.42) with
x0 < y0 . Suppose that at some time t > 0, we have that xt ≤ yt and (xt )i = ( yt )i . If we could
show that V_i(x_t) ≤ V_i(y_t), then it seems reasonable to expect that (x_{t+ε})_i will not be able
to surpass ( yt+ε )i . But since xt and yt only diﬀer in components other than i, the vector
z = yt − xt ≥ 0 has zi = 0, and so
V_i(y_t) − V_i(x_t) = ∫_0^1 ∇V_i(x_t + αz)′z dα = ∫_0^1 Σ_{j≠i} ∂V_i/∂x_j(x_t + αz) z_j dα.

The final expression is nonnegative as long as ∂V_i/∂x_j ≥ 0 whenever j ≠ i.
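Strong monotonicity can be observed directly on an illustrative strongly cooperative system (an assumption of this sketch, not an example from the text); its off-diagonal partials equal 0.5 > 0, so (6.43) holds strictly and the system is irreducible:

```python
import numpy as np

# Illustrative strongly cooperative linear system (not from the text):
# dV1/dx2 = dV2/dx1 = 0.5 > 0, so (6.43) holds strictly.
def V(x):
    return np.array([-x[0] + 0.5 * x[1], 0.5 * x[0] - x[1]])

def flow(x, t, dt=0.001):
    for _ in range(int(round(t / dt))):
        k1 = V(x)
        k2 = V(x + 0.5 * dt * k1)
        k3 = V(x + 0.5 * dt * k2)
        k4 = V(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

# Ordered initial conditions: 0 < z0 < y0 (the origin is a rest point).
y0 = np.array([0.1, 0.2])
z0 = np.array([0.05, 0.1])
ordered = True
for t in (0.5, 1.0, 2.0, 4.0):
    yt = flow(y0.copy(), t)
    zt = flow(z0.copy(), t)
    # Strict componentwise ordering is preserved for all t > 0,
    # and both trajectories stay strictly above the rest point 0.
    ordered = ordered and bool(np.all(zt < yt)) and bool(np.all(yt > 0.0))
```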
The next theorem sets out the basic properties of strongly monotone semiﬂows on X.
To state this result, we let C(φ) = {x ∈ X : ω(x) = {x*} for some x* ∈ RP(φ)} denote the set of initial conditions from which the semiflow φ converges to a rest point. Also, let Ω(φ) = ∪_{x∈X} ω(x) be the set of ω-limit points under φ.

Theorem 6.C.2. Suppose that the semiflow φ on X is strongly monotone. Then
(i) (Convergence criteria) If φ_T(x) ≥ x for some T > 0, then ω(x) is periodic with period T. If φ_t(x) ≥ x over some nonempty open interval of times, then x ∈ C(φ).
(ii) (Unordered ω-limit sets) If x, y ∈ ω(z), then neither x < y nor y < x.
(iii) (Minimal and maximal rest points) Let x̲ = min X and x̄ = max X. Then x_* = min RP(φ) and x^* = max RP(φ) exist; in fact, ω(x̲) = {x_*} and ω(x̄) = {x^*}. Moreover, [x_*, x^*] contains Ω(φ) and is globally asymptotically stable.
Proof. (i) If φ_T(x) ≥ x, then φ_{(n+1)T}(x) ≥ φ_{nT}(x) for all positive integers n, so monotonicity and the compactness of X imply that lim_{n→∞} φ_{nT}(x) = y for some y ∈ X. By the continuity and group properties of the flow,

φ_{t+T}(y) = φ_{t+T}(lim_{n→∞} φ_{nT}(x)) = lim_{n→∞} φ_{t+(n+1)T}(x) = lim_{n→∞} φ_t(φ_{(n+1)T}(x)) = φ_t(y),

so the flow from y is T-periodic. A continuity argument shows that the orbit from y is none other than ω(x). The proof of the second claim is omitted.

(ii) Suppose that x, y ∈ ω(z) and that x < y. Since φ is strongly monotone, and by the continuity of φ_t(ξ) in ξ, there are neighborhoods N_x, N_y ⊂ X of x and y and a time T > 0 such that φ_T(N_x) ≪ φ_T(N_y). Choose τ_y > τ_x > 0 such that φ_{τ_x}(z) ∈ N_x and φ_{τ_y}(z) ∈ N_y. Then for all t close enough to τ_y,

φ_{τ_x+T}(z) ≪ φ_{t+T}(z) = φ_{t−τ_x}(φ_{τ_x+T}(z)).

Therefore, part (i) implies that ω(z) is a singleton, contradicting the assumption that x and y are distinct points of ω(z).
(iii) Since x̲ and x̄ are the minimal and maximal points in X, part (i) implies that ω(x̲) = {x_*} and ω(x̄) = {x^*} for some x_*, x^* ∈ RP(φ). Hence, if x ∈ X ⊆ [x̲, x̄], then φ_t(x̲) ≤ φ_t(x) ≤ φ_t(x̄) for all t ≥ 0, so taking limits yields x_* ≤ ω(x) ≤ x^*; thus, Ω(φ) ⊆ [x_*, x^*]. Finally, if [x_*, x^*] ⊆ [y, z] ⊆ X, then x ∈ [y, z] implies that φ_t(x) ∈ [φ_t(y), φ_t(z)] ⊆ [y, z], so [x_*, x^*] is Lyapunov stable, and hence globally asymptotically stable by the previous argument. ∎
If the derivative matrices of the semiﬂow are strongly positive, one can obtain even
stronger results, including the convergence of solution trajectories from generic initial
conditions to rest points.
Theorem 6.C.3. Suppose that the semiﬂow φ on X is strongly monotone, and that its derivative
matrices Dφt (x) are strongly positive for all t > 0. Then
(i) (Limit set dichotomy) If x < y, then either ω(x) < ω( y), or ω(x) = ω( y) = {x∗ } for some
x∗ ∈ RP(φ).
(ii) (Generic convergence to equilibrium) C(φ) is an open, dense, full measure subset of X.

6.N Notes

Section 6.1. The results in Section 6.1.1 are proved for symmetric random matching
games in Hofbauer (2000), the seminal reference on Lyapunov functions for evolutionary
dynamics. Global convergence in all potential games of dynamics satisfying positive
correlation is proved in Sandholm (2001), building on earlier work of Hofbauer and Sigmund (1988) and Monderer and Shapley (1996). Convergence of perturbed best response
dynamics in potential games is proved by Hofbauer and Sandholm (2007).
Shahshahani (1979), building on the early work of Kimura (1958), showed that the replicator dynamic for a potential game is a gradient dynamic after a “change in geometry”—
that is, after the introduction of an appropriate Riemannian metric on int(X). Subsequently,
Akin (1979, 1990) proved that Shahshahani’s (1979) result can also be represented using
the change of variable presented in Theorem 6.1.9. The direct proof oﬀered in the text is
from Sandholm et al. (2008).
Section 6.2. Theorem 6.2.1 is due to Nagurney and Zhang (1997); the proof in the text
is from Sandholm et al. (2008). Theorem 6.2.4 was ﬁrst proved for normal form games
with an interior ESS by Hofbauer et al. (1979) and Zeeman (1980). Akin (1990, Theorem
6.4) and Aubin (1991, Section 1.4) extend this result to nonlinear single population games,
while Cressman et al. (2001) extend it to linear multipopulation games.
Section 6.2.2 follows Hofbauer and Sandholm (2007, 2008). These papers take inspiration from Hart and MasColell (2001), which points out the role of integrability in models
of regretbased learning in repeated normal form games. Hofbauer (2000) proves the
convergence of the BNN, best response, and perturbed best response dynamics in normal
258 form games with an interior ESS. A proof of the existence of a cycle in Example 6.2.6 can
be found in Hofbauer and Sandholm (2008); this reference also contains a statement and
proof of the version of Danskin's Envelope Theorem cited in the text. The probabilistic characterization of integrability alluded to in the text is presented in Sandholm (2006b). For subdifferentials of convex functions, see Hiriart-Urruty and Lemaréchal (2001); their Example D.3.4 is especially relevant to our discussion in the text.
Smith (1984) proves Theorem 6.2.11 for his dynamic; the general result presented here
is due to Hofbauer and Sandholm (2008).
Kojima and Takahashi (2007) consider a class of single population random matching
games called anticoordination games, in which at each state x, the worst response to x
is always in the support of x. They prove (see also Hofbauer (1995b)) that such games
must have a unique equilibrium, that this equilibrium is interior, and that it is globally
asymptotically stable under the best response dynamic. However, they also present an
example (due to Hofbauer) showing that neither the replicator dynamic nor the logit
dynamic need converge in these games, the latter even at arbitrarily low noise levels.
Section 6.3. Section 6.3.1 follows Berger (2007). Exercise 6.3.1(ii) is due to Hofbauer
(1995b), and Lemma 6.3.2(i) is due to Monderer and Sela (1997). It is worth noting
that Theorem 6.3.4 extends immediately to ordinal supermodular games (also known as
quasisupermodular games; see Milgrom and Shannon (1994)). Moreover, since ordinal
potential games (Monderer and Shapley (1996)) are deﬁned by the absence of cycles of
improvement steps, a portion of the proof of Theorem 6.3.4 establishes the convergence
of simple solutions of (BR) in nondegenerate ordinal potential games.
Section 6.3.2 follows Hofbauer and Sandholm (2002, 2007).
Section 6.4. Akin (1980) shows that starting from any interior population state, the
replicator dynamic eliminates strategies that are strictly dominated by a pure strategy.
Versions of Theorems 6.4.4 and 6.4.5 can be found in Nachbar (1990) and Samuelson and
Zhang (1992); see also Hofbauer and Weibull (1996).
Section 6.A. For properties of ωlimit sets of diﬀerential equations, see Robinson (1995);
for ω-limit sets of differential inclusions, see Benaïm et al. (2005). For applications of chain recurrence in the theory of learning in games, see Benaïm and Hirsch (1999), Hofbauer and Sandholm (2002), and Benaïm et al. (2005, 2006b). The Fundamental Theorem of Dynamical Systems is due to Conley (1978); see Robinson (1995) for a textbook treatment. Other good general references on notions of recurrence for differential equations include Nemytskii and Stepanov (1960), Akin (1993), and Benaïm (1998, 1999).
Section 6.B. The standard reference on Lyapunov functions for flows is Bhatia and Szegő (1970).

Section 6.C. The standard reference on cooperative differential equations and monotone
dynamical systems is Smith (1995). Theorems 6.C.1, 6.C.2(i), 6.C.2(ii), and 6.C.3(i) in the
text are Smith’s (1995) Theorems 4.1.1, 1.2.1, 1.2.3, and 2.4.5, respectively. Theorem 6.C.3(ii)
combines Theorem 2.4.7 of Smith (1995) with Theorem 1.1 of Hirsch (1988), the latter after
a reversal of time.

CHAPTER SEVEN

Local Stability under Evolutionary Dynamics

7.0 Introduction

In Chapter 6, we analyzed classes of games in which many evolutionary dynamics
converge to equilibrium from all or most initial conditions. While we argued in Chapter
2 that games from many applications lie in these classes, it is certain that at least as many
interesting games do not.
In cases where global convergence results are not available, one can turn instead to
analyses of local stability. If a society somehow ﬁnds itself playing a particular equilibrium,
how can we tell whether this equilibrium will persist in the face of occasional, small
disturbances in behavior? This chapter introduces a reﬁnement of Nash equilibrium—
that of an evolutionarily stable state (or ESS)—and establishes that an ESS is locally stable
under many evolutionary dynamics.
Our deﬁnition of ESS is one of many related deﬁnitions considered in the literature.
We present our deﬁnition, its alternatives, and the connections among them in Section 7.3.
We will see that games with an ESS share some structural properties with stable games,
at least in the neighborhood of the ESS. Taking advantage of this connection, we show in
Section 7.4 how to establish local stability of ESS under some dynamics through the use
of local Lyapunov functions. Our results here build on our analyses in Section 6.2, where
we constructed local Lyapunov functions for many dynamics for use in stable games.
The other leading approach to local stability analysis is linearization. Given a rest point
of a nonlinear (but smooth) dynamic, one can approximate the behavior of the dynamic
in a neighborhood of the rest point by studying an appropriate linear dynamic: namely,
the one deﬁned by the derivative matrix of the nonlinear dynamic, evaluated at the rest 261 point in question. In Sections 7.5 and 7.6, we use linearization to study the two families
of smooth dynamics introduced in Chapters 4 and 5: the imitative dynamics, and the
perturbed best response dynamics. Surprisingly, this analysis will lead us to a deep and
powerful connection between the replicator and logit dynamics, one that seems diﬃcult
to reach by other means.
It is worth noting now that linearization is also very useful for establishing instability
results. For this reason, the techniques we develop in this chapter will be a very important
ingredient of our analyses in Chapter 8, where we study nonconvergent dynamics.
The ﬁrst two sections of the chapter formally establish some results that were hinted
at in earlier chapters. In Section 7.1, we indicate two senses in which a nonNash rest
point of an imitative dynamic cannot be stable. In Section 7.2, we show that under most
dynamics, a Nash equilibrium of a potential game is locally stable if and only if it is a local
maximizer of potential.
The linearization techniques used in Sections 7.5 and 7.6 and in Chapter 8 require a
working knowledge of matrix analysis and linear diﬀerential equations; we present these
topics in detail in Appendices 7.A and 7.B. The main theorems of linearization theory are
themselves presented in Appendix 7.C.

7.1 Non-Nash Rest Points of Imitative Dynamics

We saw in Chapters 4 and 5 that under five of our six classes of evolutionary dynamics,
rest points are identical to Nash equilibria (or to perturbed versions thereof). The lone
exception is the imitative dynamics: Theorem 4.4.21 shows that the rest points of these
dynamics are the restricted equilibria, a set that includes not only the Nash equilibria,
but also any state that would be a Nash equilibrium were the strategies unused at that state
removed from the game. Theorem 4.7.1 established one sense in which these extra rest
points are fragile: by combining a small amount of a “better behaved” dynamic with an
imitative dynamic, one obtains a new dynamic that satisﬁes Nash stationarity. But we
mentioned in Section 4.4.6 that this fragility can be expressed more directly: there we
claimed that nonNash rest points of imitative dynamics cannot be locally stable, and so
are not plausible predictions of play.
We are now in a position to formally establish this last claim. Recall from Observation 4.4.16 that imitative dynamics exhibit monotone percentage growth rates: they can be
expressed in the form
(7.1)  ẋ_i^p = x_i^p G_i^p(x),

with the percentage growth rates G_i^p(x) ordered by payoffs F_i^p(x) as in equation (4.16). This fact drives our instability result.
Theorem 7.1.1. Let V^F be an imitative dynamic for population game F, and let x̂ be a non-Nash rest point of V^F. Then x̂ is not Lyapunov stable under V^F, and no interior solution trajectory of V^F converges to x̂.

Proof. (p = 1) Since x̂ is a restricted equilibrium that is not a Nash equilibrium, each strategy j in the support of x̂ satisfies F_j(x̂) = F̄(x̂), and any best response i to x̂ is an unused strategy that satisfies F_i(x̂) > F̄(x̂). Also, since x̂ is a rest point of V^F, equation (7.1) implies that each j in the support of x̂ has G_j(x̂) = 0. Thus, monotonicity of percentage growth rates implies that G_i(x̂) > G_j(x̂) = 0, and so the continuity of G_i implies G_i(x) ≥ k > 0 on some small neighborhood O of x̂.

Now let {x_t} be an interior solution trajectory of V^F (see Theorem 4.4.14). Then if x_s ∈ O for all s ∈ (t, u), it follows that

log((x_u)_i) − log((x_t)_i) = ∫_t^u (d/ds) log((x_s)_i) ds = ∫_t^u (ẋ_s)_i/(x_s)_i ds = ∫_t^u G_i(x_s) ds ≥ k(u − t).

Rearranging and exponentiating yields

(x_u)_i ≥ (x_t)_i exp(k(u − t)).

Thus, during intervals that x_s is in O, (x_s)_i is strictly increasing. This immediately implies that there is no neighborhood O′ of x̂ such that solutions starting in O′ stay in O, and so x̂ is not Lyapunov stable. Also, since (x_t)_i cannot decrease inside O ∩ int(X), no interior solution trajectory can converge to x̂. ∎

7.2 Local Stability in Potential Games

We saw in Section 6.1 that in potential games, the potential function serves as a strict
Lyapunov function for any evolutionary dynamic satisfying positive correlation (PC);
solution trajectories of such dynamics ascend the potential function and converge to
connected sets of rest points. For dynamics that also satisfy Nash stationarity (NS), these
sets consist entirely of Nash equilibria.
That the potential function is a strict Lyapunov function has important implications
for local stability of sets of rest points. Call A ⊆ X a local maximizer set of the function
f : X → R if it is connected, if f is constant on A, and if there exists a neighborhood O of
A such that f (x) > f ( y) for all x ∈ A and all y ∈ O − A. Theorem 2.1.7 implies that such
a set consists entirely of Nash equilibria. We call the set A ⊆ NE(F) isolated if there is a
neighborhood of A that does not contain any Nash equilibria other than those in A.
If the value of f is nondecreasing along solutions of a dynamic, then they cannot escape
a neighborhood of a local maximizer set. If the value of f is actually increasing in this
neighborhood, then solutions in the neighborhood should converge to the set. This is the
content of the following theorem.
Theorem 7.2.1. Let F be a potential game with potential function f , let VF be an evolutionary
dynamic for F, and suppose that A ⊆ NE(F) is a local maximizer set of f .
(i) If VF satisﬁes positive correlation (PC), then A is Lyapunov stable under VF .
(ii) If in addition VF satisﬁes Nash stationarity (NS) and A is isolated, then A is an asymptotically stable set under VF .
Proof. Part (i) of the theorem follows immediately from Lemma 6.1.1 and Theorem
6.B.2. To prove part (ii), note that (NS), (PC), and the fact that A is isolated imply that there
is a neighborhood O of A such that f˙(x) = f (x) VF (x) > 0 for all x ∈ O − A. Corollary
6.B.7 then implies that A is asymptotically stable.
For dynamics satisfying (PC) and (NS), being an isolated local maximizer set is not
only a suﬃcient condition for being asymptotically stable; it is also necessary.
Theorem 7.2.2. Let F be a potential game with potential function f , let VF be an evolutionary
dynamic for F that satisﬁes (PC) and (NS). Suppose that A ⊆ NE(F) is a smoothly connected
asymptotically stable set under VF . Then A is an isolated local maximizer set of f .
Proof. Since A is a smoothly connected set of Nash equilibria, Exercise 2.1.15 implies
that f takes some ﬁxed value c throughout A. Now let ξ be an initial condition in O − A,
where O is the basin of attraction of A. Then ω(ξ) ⊆ A. But since f is a strict Lyapunov
function for VF , it follows that f (ξ) < c. Since ξ ∈ O − A was arbitrary, we conclude
that A is an isolated local maximizer set.
Theorems 7.2.1 and 7.2.2 allow us to characterize locally stable rest points for dynamics
satisfying positive correlation (PC). Since the best response and perturbed best response
dynamics do not satisfy this condition, the former because of lack of smoothness and the
latter because of the perturbations, Theorems 7.2.1 and 7.2.2 do not apply.
In the case of the best response dynamic, Theorem 5.1.8 establishes analogues of (NS)
and (PC), which in turn imply that solution trajectories ascend the potential function and
converge to Nash equilibrium (Theorem 6.1.4). By using these results along with the
arguments above, we obtain
Theorem 7.2.3. Let F be a potential game with potential function f , let VF be the best response
dynamic for F, and let A ⊆ NE(F) be smoothly connected. Then A is an isolated local maximizer
set of f if and only if A is asymptotically stable under VF .
In the case of perturbed best response dynamics, the roles of conditions (PC) and
(NS) are played by virtual positive correlation and perturbed stationarity (Theorem 5.2.13
and Observation 5.2.10). These in turn ensure that the dynamics ascend the perturbed
potential function
f̃(x) = f(x) − ∑_{p∈P} m^p v^p(x^p / m^p)

(Theorem 6.1.6). Substituting these results into the arguments above yields
Theorem 7.2.4. Let F be a potential game with potential function f , let VF,v be the perturbed best
response dynamic for F generated by the admissible deterministic perturbations v = (v1 , . . . , vp ),
and let A ⊆ PE(F, v) be smoothly connected. Then A is an isolated local maximizer set of f˜ if and
only if A is asymptotically stable under VF,v .

7.3 Evolutionarily Stable States

We now turn to the main ideas of this chapter by introducing a refinement of Nash
equilibrium that is of basic importance in evolutionary modeling.

7.3.1 Definition

Let F be a game played by p ≥ 1 populations. We call x ∈ X an evolutionarily stable state
(ESS) of F if there is a neighborhood O of x such that
(7.2) (y − x)′F(y) < 0 for all y ∈ O − {x}.

The notion of ESS was first introduced in the context of single population random
matching. The following exercise shows that under multipopulation random matching,
the ESS condition is very restrictive:
Exercise 7.3.1. Suppose that F has no own-population interactions: F^p(x) is independent
of x^p for all p ∈ P. Show that if x∗ is an ESS of F, then it is a pure social state: (x∗)^p = m^p e_i^p
for some i ∈ S^p and p ∈ P. (Hint: If x^p is not pure, consider an invasion by y = (y^p, x^{−p}),
where y^p is an alternate best response to x.)
The conclusion of this exercise is similar in spirit and in content to that of Proposition 2.3.10,
which showed that a stable multipopulation game without ownpopulation interactions
must be null stable. But just as strictly stable multipopulation games are quite common if
own-population interactions are allowed, so too are interior ESSs.

7.3.2 Variations

Our definition (7.2) of ESS is one of many that can be found in the literature. Some
alternatives are equivalent to definition (7.2) in a single population setting but differ
substantially in multipopulation settings.
Exercise 7.3.2. Consider the following condition on a social state x ∈ X:
(7.3) For all y ∈ O − {x}, there is a p ∈ P such that (y^p − x^p)′F^p(y) < 0.

(i) Show that every ESS satisfies condition (7.3), but that the converse statement is
false if p ≥ 2.
(ii) Show that if F has no own-population interactions, then any state satisfying condition (7.3) is a pure social state (cf. Exercise 7.3.1).

While definition (7.2) is the most useful one for studying the evolutionary dynamics
considered in this book, condition (7.3) is more in the spirit of the original motivation for
ESS. See the Notes for an extended discussion of this point.
There are also alternative requirements that are equivalent to deﬁnition (7.2) under
single population random matching, but that are distinct from condition (7.2) in nonlinear
games. The following exercises explore some of these alternatives; see the Notes for
additional discussion.
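Before turning to these variations, note that condition (7.2) is easy to check numerically in small matching games. The sketch below uses an assumed Hawk–Dove style payoff matrix, whose mixed state x∗ = (1/2, 1/2) is an ESS, and verifies (y − x∗)′F(y) < 0 on a grid of states y ≠ x∗.

```python
# Sketch: checking the ESS condition (7.2) for single-population random
# matching, where F(y) = Ay.  The Hawk-Dove style matrix below is an
# assumed example with interior ESS x* = (1/2, 1/2).

A = [[-1.0, 2.0], [0.0, 1.0]]
x_star = [0.5, 0.5]

def F(y):
    return [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]

def ess_gap(y):
    # (y - x*)' F(y); negative for all y != x* when x* is an ESS
    Fy = F(y)
    return sum((y[i] - x_star[i]) * Fy[i] for i in range(2))

gaps = []
for k in range(101):
    y = [k / 100.0, 1.0 - k / 100.0]
    if abs(y[0] - 0.5) > 1e-9:
        gaps.append(ess_gap(y))

assert all(g < 0 for g in gaps)
```

For this matrix the gap equals −2(y₁ − ½)² exactly, so the check succeeds on the whole simplex, not just a neighborhood.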
Exercise 7.3.3. Consider the following requirement on social state x ∈ X:
(7.4) For each y ∈ O − {x}, there exists an ε̄ > 0 such that
(y − x)′F(εy + (1 − ε)x) < 0 for all ε ∈ (0, ε̄).

(i) Explain requirement (7.4) in words.
(ii) Show that any ESS as deﬁned in (7.2) satisﬁes condition (7.4).
(iii) Show that if F(x) = Ax is a single population random matching game, then conditions (7.2) and (7.4) are equivalent.
(iv) Construct a three-strategy game with a state x∗ that satisfies condition (7.4) but that
is not an ESS. (Hint: Let x∗ = (0, 1/2, 1/2), and let D1 and D2 be closed disks in X ⊂ R³
that are tangent to bd(X) at x∗ and whose radii are r1 and r2 > r1 . Introduce a payoff
function of the form F(x) = −c(x)(x − x∗ ), where c(x) is positive on int(D1 ) ∪ (X − D2 )
and negative on int(D2 ) − D1 . Then apply Exercise 7.3.5 below.)
Exercise 7.3.4. Consider this pair of conditions on social state x ∈ X:
(7.5) x is a Nash equilibrium: (y − x)′F(x) ≤ 0 for all y ∈ X.
(7.6) For all y ∈ O − {x}, (y − x)′F(x) = 0 implies that (y − x)′F(y) < 0.

Show that this pair of conditions is equivalent to condition (7.4), and so is satisfied if x
is an ESS.
Exercise 7.3.5. The previous exercises imply that every ESS is a Nash equilibrium. Show
further that every ESS is isolated in the set of Nash equilibria.
Exercise 7.3.6. Consider this pair of conditions on social state x ∈ X:
(7.5) x is a Nash equilibrium.
(7.7) For all y ∈ O − {x}, (y − x)′F(x) = 0 implies that (y − x)′DF(x)(y − x) < 0.

(i) Show that conditions (7.5) and (7.7) imply that x is an ESS.
(ii) Show that if F(x) = Ax is a single population random matching game, then conditions (7.5) and (7.7) hold if and only if x is an ESS.
(iii) Give an example of a two-strategy game with an ESS x∗ that fails condition (7.7).

7.3.3 Regular ESS

To prove some of our local stability results, we need a version of the ESS concept that
is slightly stronger than any of those proposed above. To introduce this condition, we ﬁrst
recall that a strict equilibrium is a pure Nash equilibrium at which in each population, the
strategy in use earns a strictly higher payoﬀ than all strategies not in use. In a similar vein,
we call state x a quasistrict equilibrium if within each population, all strategies in use earn
the same payoﬀ, which is higher than the payoﬀ of each unused strategy. Put diﬀerently,
x is a quasistrict equilibrium if
F^p_i(x) = F̄^p(x) > F^p_j(x) whenever x^p_i > 0 and x^p_j = 0.
With this deﬁnition in hand, we can introduce our reﬁnement of ESS: we call state x
a regular ESS if it is a quasistrict equilibrium that satisﬁes condition (7.7). It is clear from
Exercise 7.3.6 that every regular ESS is in fact an ESS.
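For random matching games F(x) = Ax, where DF(x) = A, both parts of the regular ESS definition can be tested mechanically. The sketch below is a minimal illustration under assumed two-strategy examples; for supports of size two the tangent subspace in condition (7.9) is one-dimensional, so checking the single difference direction is exact.

```python
# Sketch: testing the regular ESS definition for matching games
# F(x) = Ax (so DF(x) = A).  The matrices below are assumed examples:
# a Hawk-Dove matrix with interior ESS, and a coordination matrix whose
# strict equilibrium e1 is regular with condition (7.9) vacuous.

def is_regular_ess(A, x, tol=1e-9):
    n = len(A)
    F = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    support = [i for i in range(n) if x[i] > tol]
    in_pay = F[support[0]]
    # quasistrictness: equal payoffs on the support, lower payoffs off it
    if any(abs(F[i] - in_pay) > tol for i in support):
        return False
    if any(F[j] >= in_pay - tol for j in range(n) if j not in support):
        return False
    # condition (7.9): z'Az < 0 for nonzero z in TX vanishing off the
    # support; for a support of size <= 2 the difference direction
    # e_a - e_b spans that subspace, so this check is exact here.
    for a in support:
        for b in support:
            if a < b and A[a][a] - A[a][b] - A[b][a] + A[b][b] >= 0:
                return False
    return True

hawk_dove = [[-1.0, 2.0], [0.0, 1.0]]
coordination = [[2.0, 0.0], [0.0, 1.0]]

assert is_regular_ess(hawk_dove, [0.5, 0.5])
assert is_regular_ess(coordination, [1.0, 0.0])
assert not is_regular_ess(coordination, [1.0 / 3.0, 2.0 / 3.0])
```

The last assertion shows the mixed equilibrium of the coordination game passing quasistrictness but failing the definiteness condition (7.9), so it is not a regular ESS.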
Let us point out an alternate characterization of regular ESS that will be useful later on.
For any set of strategies I ⊂ ⋃_{p∈P} S^p, let R^n_I = {y ∈ R^n : y_j = 0 whenever j ∉ I} denote the
set of vectors in R^n whose components corresponding to strategies outside I equal zero.
Also, let S(x) ⊆ ⋃_{p∈P} S^p denote the support of state x.
Observation 7.3.7. State x is a regular ESS if and only if
(7.8) x is a quasistrict equilibrium;
(7.9) z′DF(x)z < 0 for all nonzero z ∈ TX ∩ R^n_{S(x)}.

Condition (7.9) resembles the derivative condition we associate with strictly stable
games. However, the condition need only hold at the equilibrium, and negative deﬁniteness is only required to hold in directions that move along the face of X on which the
equilibrium lies. For instance, if x is pure, condition (7.9) is vacuous, so the deﬁnition of
regular ESS reduces to that of strict equilibrium.

7.4 Local Stability via Lyapunov Functions

In the remainder of this chapter, we show that any (regular) ESS x∗ is locally stable
under many evolutionary dynamics. In this section, our approach is to construct a strict
local Lyapunov function for each dynamic in question: that is, a nonnegative function
deﬁned in a neighborhood of x∗ that vanishes precisely at x∗ and whose value decreases
along solutions of the dynamic other than the stationary one at x∗ . The results presented
in Appendix 6.B show that the existence of such a function ensures the asymptotic
stability of x∗ .
The similarity between the deﬁnitions of ESS and of stable games—in particular, the
negative semideﬁniteness conditions that play a central role in both contexts—suggests
the Lyapunov functions for stable games from Section 6.2 as the natural starting points
for our stability analyses of ESSs. In some cases—under the projection and replicator
dynamics, and whenever the ESS is interior—we will be able to use the Lyapunov functions
from Section 6.2 without amendment. But more generally, these functions will require
modifications to become local Lyapunov functions for ESSs.

7.4.1 The Replicator and Projection Dynamics

The analysis is simplest in the cases of the replicator and projection dynamics. In
Section 6.2, we proved global convergence of these dynamics in every strictly stable game
by showing that measures of “distance” from the game’s unique Nash equilibrium served
as global Lyapunov functions. The proofs of these convergence results relied on nothing
about the payoﬀ structure of the game apart from the fact that the game’s unique Nash
equilibrium is also a GESS.
This observation suggests that if state x∗ is an ESS of an arbitrary population game,
the same “distance” functions will serve as local Lyapunov functions for x∗ under the two
dynamics. We conﬁrm this logic in the following theorem.
Theorem 7.4.1. Let x∗ be an ESS of F. Then x∗ is asymptotically stable under
(i) the replicator dynamic for F;
(ii) the projection dynamic for F.
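The “distance” idea behind this theorem can be seen directly in simulation. The sketch below uses an assumed Hawk–Dove matrix with interior ESS x∗ = (1/2, 1/2) and checks that the relative entropy H(x) = ∑ x∗_i log(x∗_i / x_i) — the replicator dynamic's Lyapunov function — decreases along an Euler approximation of the solution.

```python
import math

# Sketch: relative entropy H(x) = sum_i x*_i log(x*_i / x_i) decreasing
# along the replicator dynamic near an ESS.  Hawk-Dove style matrix with
# interior ESS x* = (1/2, 1/2); the matrix is an assumed example.

A = [[-1.0, 2.0], [0.0, 1.0]]
x_star = [0.5, 0.5]

def F(x):
    return [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

def replicator(x):
    Fx = F(x)
    Fbar = sum(x[i] * Fx[i] for i in range(2))
    return [x[i] * (Fx[i] - Fbar) for i in range(2)]

def H(x):
    return sum(x_star[i] * math.log(x_star[i] / x[i]) for i in range(2))

x = [0.9, 0.1]
h = 0.005
values = [H(x)]
for _ in range(4000):
    v = replicator(x)
    x = [x[i] + h * v[i] for i in range(2)]
    values.append(H(x))

assert all(b <= a + 1e-10 for a, b in zip(values, values[1:]))
assert values[-1] < 1e-3
```

Along solutions, Ḣ(x) = (x − x∗)′F(x), which for this game equals −2(x₁ − ½)² < 0 away from x∗, so H plays exactly the role the proof assigns to it.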
Exercise 7.4.2. Prove Theorem 7.4.1 by showing that the functions Hx∗ and Ex∗ from Theorems 6.2.4 and 6.2.1 define strict local Lyapunov functions for the two dynamics.

7.4.2 Target and Pairwise Comparison Dynamics: Interior ESS

In proving convergence results for other classes of dynamics in Section 6.2, we relied
directly on the negative semideﬁniteness condition (2.15) that characterizes stable games.
If a game admits an interior ESS that satisﬁes the strict inequalities in (7.9), then condition
(2.15) holds in a neighborhood of the ESS. This allows us again to use the Lyapunov
functions from Section 6.2 without amendment to prove local stability results.
Theorem 7.4.3. Let x∗ be a regular interior ESS of F. Then x∗ is asymptotically stable under
(i) any separable excess payoﬀ dynamic for F;
(ii) the best response dynamic for F;
(iii) any impartial pairwise comparison dynamic for F.
Exercise 7.4.4. Prove Theorem 7.4.3 by showing that the functions Γ, G, and Ψ from
Theorems 6.2.8, 6.2.9, and 6.2.11 define strict local Lyapunov functions for an ESS x∗
under the three dynamics.
Rest points of perturbed best response dynamics generally do not coincide with Nash
equilibria, and hence with ESSs. Nevertheless, the next exercise indicates that an appropriate negative deﬁniteness condition is still enough to ensure local stability.
Exercise 7.4.5. Let x̃ be a perturbed equilibrium of (F, v) for some admissible deterministic
perturbations v = (v¹, . . . , vᵖ), and suppose that z′DF(x̃)z < 0 for all nonzero z ∈ TX. Show
that x̃ is isolated in the set of perturbed equilibria, and that the function G̃ from Theorem
6.2.10 defines a strict local Lyapunov function for x̃. (Hint: To show that x̃ is isolated, use
the argument at the end of the proof of Theorem 6.2.10.)
For consistency with our previous results, it is natural to try to prove stability results for
games with an interior ESS. To do so, we need to assume that the size of the perturbations
is “small”, in the hopes that there will be a perturbed equilibrium that is “close” to the
ESS. Since the logit dynamic is parameterized by a noise level η, it provides a natural
setting for the result we seek.
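The approximation of an ESS by logit equilibria can be illustrated numerically. The sketch below uses an assumed two-strategy anti-coordination game with interior ESS x∗ = (2/3, 1/3); a logit(η) equilibrium solves x₁ = σ((F₁(x) − F₂(x))/η) for the logistic function σ, and since the defining residual is monotone in x₁, bisection finds it reliably even for small η.

```python
import math

# Sketch: logit(eta) equilibria approaching an interior ESS as eta -> 0.
# Anti-coordination matching game (assumed example) with ESS
# x* = (2/3, 1/3).  A logit equilibrium solves
#   x1 = sigma((F1(x) - F2(x)) / eta),
# found by bisection: x1 minus the right-hand side is increasing in x1.

A = [[0.0, 2.0], [1.0, 0.0]]

def payoff_gap(x1):
    # F1(x) - F2(x) for x = (x1, 1 - x1):  2*(1 - x1) - x1
    return 2.0 * (1.0 - x1) - x1

def logit_equilibrium(eta):
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        rhs = 1.0 / (1.0 + math.exp(-payoff_gap(mid) / eta))
        if mid < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

errors = [abs(logit_equilibrium(eta) - 2.0 / 3.0) for eta in (0.5, 0.1, 0.02)]
assert errors[0] > errors[1] > errors[2]
assert errors[2] < 0.01
```

The equilibria drift away from x∗ for large η and return to it as η → 0, exactly the continuity behavior asserted in the theorem below.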
Theorem 7.4.6. Let x∗ be a regular interior ESS of F. Then for some neighborhood O of x∗ and each
η > 0 less than some η̂ > 0, there is a unique logit(η) equilibrium x̃^η in O, and this equilibrium
is asymptotically stable under the logit(η) dynamic. Finally, x̃^η varies continuously in η, and
lim_{η→0} x̃^η = x∗ .
Proof. (p = 1) Theorem 6.2.10 and Exercises 5.2.6 and 5.2.7 show that for η > 0, the
function

G̃^η(x) = η log ∑_{j∈S} exp(η⁻¹ F̂_j(x)) + η ∑_{j∈S} x_j log x_j

(with 0 log 0 ≡ 0) is a Lyapunov function for the logit(η) dynamic when F is a stable game.
If we define

G̃⁰(x) ≡ G(x) = max_{j∈S} F̂_j(x)

to be the Lyapunov function for the best response dynamic in stable games, then G̃^η(x) is
continuous in (x, η) on X × [0, ∞).
By Exercise 7.4.4, G defines a strict local Lyapunov function for the best response
dynamic at the interior ESS x∗ . In particular, x∗ is a local minimizer of G: there is an open,
convex neighborhood O ⊂ X of x∗ such that G(x) > G(x∗ ) for all x ∈ O − {x∗ }. Moreover,
since F is C¹ and satisfies z′DF(x∗ )z < 0 for all nonzero z ∈ TX, we can choose O in such a
way that z′DF(x)z < 0 for all nonzero z ∈ TX and x ∈ O.
Because G̃^η(x) is continuous in (x, η), the Theorem of the Maximum (see the Notes)
implies that the map

η ↦ β̃(η) ≡ argmin_{x∈cl(O)} G̃^η(x)

is upper hemicontinuous on [0, ∞). Thus, since β̃(0) = {x∗ } ⊂ O (in particular, since
x∗ ∉ bd(O)), there is an η̂ > 0 such that β̃(η) ⊂ O for all η < η̂. This implies that each
x̃^η ∈ β̃(η) is a local minimizer of G̃^η not only with respect to cl(O), but also with respect to
the full state space X.
Exercise 7.4.5 implies that the value of G̃^η is decreasing along solutions to the logit(η)
dynamic in the set O, implying that each local minimizer x̃^η is a rest point of this dynamic—
indeed, x̃^η must be an asymptotically stable rest point. Finally, since O is convex, the last
paragraph of the proof of Theorem 6.2.10 shows that when η < η̂, β̃(η) ⊂ O is a singleton.
This completes the proof of the theorem.

7.4.3 Target and Pairwise Comparison Dynamics: Boundary ESS

It remains for us to prove local stability results for boundary ESSs for the dynamics
considered in Theorem 7.4.3.
Theorem 7.4.7. Let x∗ be a regular ESS of F. Then x∗ is asymptotically stable under
(i) any separable excess payoﬀ dynamic for F;
(ii) the best response dynamic for F;
(iii) any impartial pairwise comparison dynamic for F.
To prove Theorem 7.4.7, we show that suitably modiﬁed versions of the Lyapunov
functions for stable games serve as local Lyapunov functions here. Letting S^p(x∗ ) =
support((x∗ )^p) and C > 0, we augment the functions Γ, G, and Ψ from Section 6.2 by the
function

(7.10) Υx∗ (x) = C ∑_{p∈P} ∑_{j∉S^p(x∗)} x^p_j ,

which is proportional to the number of agents using strategies outside the support of x∗ .
We provide a detailed proof of the theorem for the case of impartial pairwise comparison
dynamics, and leave the proofs of the other two cases as exercises.
Proof of Theorem 7.4.7(iii). (p = 1) Let ẋ = VF(x) be an impartial pairwise comparison
dynamic for F. Define the C¹ function Ψx∗ : X → R by

Ψx∗ (x) = Ψ(x) + Υx∗ (x) = Ψ(x) + C ∑_{j∉S(x∗)} x_j .

Here Ψ is the Lyapunov function defined in Theorem 6.2.11, and Υx∗ is as defined in
equation (7.10); the constant C > 0 will be determined later.
Since VF is an impartial pairwise comparison dynamic, Theorem 6.2.11 shows that
the function Ψ is nonnegative, with Ψ(x) = 0 if and only if x ∈ NE(F). It follows that
Ψx∗ too is nonnegative, with Ψx∗ (x) = 0 if and only if x is a Nash equilibrium of F with
support(x) ⊆ support(x∗ ). Thus, since x∗ is a regular ESS, it is isolated in the set of Nash
equilibria (see Exercise 7.3.5), so there is a neighborhood O of x∗ on which x∗ is the unique
zero of Ψx∗ . If we can show that there is also a neighborhood O′ of x∗ such that Ψ̇x∗ (x) < 0
for all x ∈ O′ − {x∗ }, then Ψx∗ is a strict local Lyapunov function for x∗ , so the conclusion
of the theorem will follow from Corollary 6.B.7.
To reduce the amount of notation in the analysis to come, let 1_0 ∈ R^n be the vector
whose jth component equals 0 if j ∈ support(x∗ ) and equals 1 otherwise, so that (1_0)′x is
the mass of agents who use strategies outside the support of x∗ at state x. Then we can
write Ψx∗ (x) = Ψ(x) + C (1_0)′x, and so can express the time derivative of Ψx∗ as

Ψ̇x∗ (x) = Ψ̇(x) + C (1_0)′ẋ.
Now the proof of Theorem 6.2.11 shows that the time derivative of Ψ satisfies

Ψ̇(x) ≤ ẋ′DF(x)ẋ,

with equality holding precisely at the Nash equilibria of F. Thus, to finish the proof, it is
enough to show that

ẋ′DF(x)ẋ + C (1_0)′ẋ ≤ 0

for all x ∈ O′ − {x∗ }. This follows directly from the following lemma, choosing C ≥ M/N.
Lemma 7.4.8. Let ẋ = VF(x) be a pairwise comparison dynamic for F, and let x∗ be a regular ESS
of F. Then there is a neighborhood O′ of x∗ and constants M, N > 0 such that for all x ∈ O′,

(i) ẋ′DF(x)ẋ ≤ M (1_0)′x;
(ii) (1_0)′ẋ ≤ −N (1_0)′x.
Proof. Suppose without loss of generality that S(x∗ ) = support(x∗ ) is given by {1, . . . , n∗ }.
Then to complement 1_0 ∈ R^n, let 1∗ ∈ R^n be the vector whose first n∗ components equal
1 and whose remaining components equal 0, so that 1 = 1∗ + 1_0. Next, decompose the
identity matrix I as I∗ + I_0, where I∗ = diag(1∗ ) and I_0 = diag(1_0), and finally, decompose I∗
as Φ∗ + Ξ∗ , where Ξ∗ = (1/n∗ ) 1∗ (1∗ )′ and Φ∗ = I∗ − Ξ∗ . Notice that Φ∗ is the orthogonal projection
of R^n onto R^n_0 ∩ R^n_{S(x∗)} = {z ∈ R^n_0 : z_j = 0 whenever j ∉ S(x∗ )}, and that I = Φ∗ + Ξ∗ + I_0.
Using this decomposition of the identity matrix, we can write

(7.11) ẋ′DF(x)ẋ = ((Φ∗ + Ξ∗ + I_0)ẋ)′DF(x)((Φ∗ + Ξ∗ + I_0)ẋ)
= (Φ∗ ẋ)′DF(x)(Φ∗ ẋ) + ((Ξ∗ + I_0)ẋ)′DF(x)ẋ + (Φ∗ ẋ)′DF(x)((Ξ∗ + I_0)ẋ).

Since x∗ is a regular ESS, we know that z′DF(x∗ )z < 0 for all nonzero z ∈ TX ∩ R^n_{S(x∗)}. Thus,
since DF(x) is continuous in x, there is a neighborhood O′ of x∗ on which the first term of
(7.11) is nonpositive.
Turning to the second term, note that since 1′ẋ = 0 and (1_0)′ = 1′I_0, we have that

(Ξ∗ + I_0)ẋ = ((1/n∗ ) 1∗ (1∗ )′ + I_0)ẋ = (−(1/n∗ ) 1∗ (1_0)′ + I_0)ẋ = ((I − (1/n∗ ) 1∗ 1′) I_0)ẋ.

Let ‖A‖ denote the spectral norm of the matrix A (see Appendix 7.A.6). Then applying
spectral norm inequalities and the Cauchy–Schwarz inequality, we find that

(7.12) ((Ξ∗ + I_0)ẋ)′DF(x)ẋ = ((I − (1/n∗ ) 1∗ 1′) I_0 ẋ)′DF(x)ẋ ≤ ‖I_0 ẋ‖ ‖I − (1/n∗ ) 1 (1∗ )′‖ ‖DF(x)‖ ‖ẋ‖.

Since DF(x), VF(x), and ρij(F(x)) are continuous in x on the compact set X, we can find
constants K and R such that

‖I − (1/n∗ ) 1 (1∗ )′‖ ‖DF(x)‖ ‖ẋ‖ ≤ K and max_{i,j∈S} ρij(F(x), x) ≤ R for all x ∈ X.
From the bound on ρij , it follows that

‖I_0 ẋ‖ = ( ∑_{j>n∗} ẋ_j² )^{1/2} ≤ ∑_{j>n∗} |ẋ_j|
= ∑_{j>n∗} | ∑_{k∈S} x_k ρkj(F(x), x) − x_j ∑_{k∈S} ρjk(F(x), x) |
≤ ∑_{j>n∗} ( ∑_{k∈S} x_k ρkj(F(x), x) + x_j ∑_{k∈S} ρjk(F(x), x) )
≤ 2Rn ∑_{j>n∗} x_j = 2Rn (1_0)′x.
We therefore conclude that at all x ∈ O′,

((Ξ∗ + I_0)ẋ)′DF(x)ẋ ≤ 2KRn (1_0)′x.

Essentially the same argument provides a similar bound on the third term of (7.11),
completing the proof of part (i) of the lemma.
We proceed with the proof of part (ii). Since x∗ is quasistrict, we have that Fi(x∗ ) =
F̄(x∗ ) > Fj(x∗ ) for all i ∈ support(x∗ ) = {1, . . . , n∗ } and all j ∉ support(x∗ ) = {n∗ + 1, . . . , n}.
Therefore, since the pairwise comparison dynamic satisfies sign preservation (4.23), we
have for such i and j that ρji(F(x∗ )) > 0 and ρij(F(x∗ )) = 0. So, since F and ρ are continuous,
sign preservation implies that there is a neighborhood O′ of x∗ and an r > 0 such that
ρji(F(x)) > r and ρij(F(x)) = 0 for all i ≤ n∗ , j > n∗ , and x ∈ O′. Applying this observation
and then canceling like terms, we find that for all x ∈ O′,

(1_0)′ẋ = ∑_{j>n∗} ẋ_j
= ∑_{j>n∗} ( ∑_{k∈S} x_k ρkj(F(x)) − x_j ∑_{k∈S} ρjk(F(x)) )
= ∑_{j>n∗} ( ∑_{k>n∗} x_k ρkj(F(x)) − x_j ∑_{k∈S} ρjk(F(x)) )
= − ∑_{j>n∗} x_j ∑_{i≤n∗} ρji(F(x))
≤ −r n∗ (1_0)′x.

This completes the proof of the lemma, and thus the proof of Theorem 7.4.7.
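Part (ii) of the lemma — the exponential decay of the mass on unused strategies — can be observed directly for the Smith dynamic, an impartial pairwise comparison dynamic with ρij = [Fj − Fi]₊. The game below is an assumed example whose strict (hence regular ESS) equilibrium is e₁.

```python
# Sketch: Lemma 7.4.8(ii) for the Smith dynamic (an impartial pairwise
# comparison dynamic with rho_ij = [F_j - F_i]_+).  The matching game
# below is an assumed example with strict equilibrium e1; the mass
# (1_0)'x on the unused strategies 2 and 3 should decay monotonically.

A = [[3.0, 0.0, 0.0],
     [1.0, 2.0, 0.0],
     [1.0, 0.0, 2.0]]
n = 3

def payoffs(x):
    return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

def smith(x):
    F = payoffs(x)
    rho = [[max(F[j] - F[i], 0.0) for j in range(n)] for i in range(n)]
    return [sum(x[j] * rho[j][i] for j in range(n))
            - x[i] * sum(rho[i][j] for j in range(n)) for i in range(n)]

x = [0.8, 0.1, 0.1]
h = 0.01
mass = [x[1] + x[2]]
for _ in range(3000):
    v = smith(x)
    x = [x[i] + h * v[i] for i in range(n)]
    mass.append(x[1] + x[2])

assert all(b <= a + 1e-12 for a, b in zip(mass, mass[1:]))
assert mass[-1] < 0.01
```

Near e₁ strategy 1 earns the highest payoff, so sign preservation shuts off all switches into strategies 2 and 3 from strategy 1; the remaining terms give ṁ = −x₂ρ₂₁ − x₃ρ₃₁ ≤ 0, which is exactly the monotone decay the assertions check.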
Exercise 7.4.9. Prove Theorem 7.4.7(ii) (for p = 1) by showing that under the best response
dynamic, the function
Gx∗ (x) = G(x) + Υx∗ (x) = max_{y∈X} (y − x)′F(x) + C ∑_{j∉S(x∗)} x_j

is a strict local Lyapunov function for any regular ESS x∗ . (Hint: The proof is nearly the
same as the one above, but building on the proof of Theorem 6.2.9 instead of the proof of
Theorem 6.2.11, and using Theorems 6.B.2 and 6.B.6 in place of Corollary 6.B.7.)
Exercise 7.4.10. Prove Theorem 7.4.7(i) (for p = 1) by showing that under the separable
excess payoﬀ dynamic with revision protocol τ, the function
Γx∗ (x) = Γ(x) + Υx∗ (x) = ∑_{i∈S} ∫_0^{F̂_i(x)} τi(s) ds + C ∑_{j∉S(x∗)} x_j

is a strict local Lyapunov function for any regular ESS x∗ . (Hint: Establish this variant of
Lemma 7.4.8: under the excess payoff dynamic generated by τ, there is a neighborhood
O′ of x∗ such that

(i) ẋ′DF(x)ẋ ≤ K T(x) (1_0)′x and
(ii) (1_0)′ẋ = −T(x) (1_0)′x

for all x ∈ O′, where T(x) = ∑_{i∈S} τi(F̂_i(x)).)

7.5 Linearization of Imitative Dynamics

In this section and the next, we study the stability of rest points of evolutionary
dynamics using linearization. This technique requires the dynamic in question to be
smooth, at least near the rest point in question, and it can be inconclusive in borderline
cases. But, more optimistically, it does not require the guesswork needed to ﬁnd Lyapunov
functions. Furthermore, instead of establishing just asymptotic stability, a rest point found
stable via linearization (that is, one that is linearly stable) must attract solutions from all
nearby initial conditions at an exponential rate. Linearization is also very useful for
proving that a rest point is unstable, a fact we will avail ourselves of repeatedly when
studying nonconvergence in Chapter 8. Finally, linearization techniques allow us to prove
local stability results for imitative dynamics other than the replicator dynamic, for which
no Lyapunov functions have been proposed.
The appendix to this chapter explains the techniques from matrix analysis (Appendix
7.A), linear diﬀerential equation theory (Appendix 7.B), and linearization theory (Appendix 7.C) used in this chapter and the next. We assume in the remainder of this chapter
and in the next chapter that payoﬀs are deﬁned on the positive orthant (see Appendix
2.A.7), as doing so will allow us to avoid using affine calculus. Reviewing multivariate
product and chain rules from Appendix 2.A.4 may be helpful for following the arguments
to come.
We begin the analysis with some general background on linearization of evolutionary
dynamics. Recall that a single population dynamic
(D) ẋ = V(x)

describes the evolution of the population state through the simplex X. In evaluating the
stability of the rest point x∗ using linearization, we are relying on the fact that near x∗ , the
dynamic (D) can typically be well approximated by the linear dynamic
(L) ẏ = DV(x∗ )y.

Because we are only interested in how (D) behaves on the simplex, we only care
about how (L) behaves on the tangent space TX. Indeed, it is only because (D) deﬁnes a
dynamic on X that it makes sense to think of (L) as a dynamic on TX. At each state x ∈ X,
V (x) ∈ TX describes the current direction of motion through the simplex. It follows that
the derivative DV (x) must map any tangent vector z into TX, as one can verify by writing
V(x + z) = V(x) + DV(x)z + o(|z|)

and noting that V(x) and V(x + z) are both in TX. Thus, in (L), ẏ lies in TX whenever y lies
in TX, implying that TX is invariant under (L).
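This invariance claim is easy to confirm numerically. The sketch below (with an assumed payoff matrix) approximates DV(x)z for the replicator dynamic by central differences along tangent directions z and checks that the image again sums to zero, i.e. lies in TX.

```python
# Sketch: DV(x) maps tangent vectors into TX.  For the replicator
# dynamic V_i(x) = x_i (F_i(x) - x'F(x)) under random matching (A is an
# assumed example), approximate DV(x)z along z in TX by central
# differences and check that the result sums to zero.

A = [[0.0, 2.0, -1.0],
     [1.0, 0.0, 3.0],
     [2.0, 1.0, 0.0]]
n = 3

def V(x):
    F = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    Fbar = sum(x[i] * F[i] for i in range(n))
    return [x[i] * (F[i] - Fbar) for i in range(n)]

def DV_times(x, z, h=1e-6):
    xp = [x[i] + h * z[i] for i in range(n)]
    xm = [x[i] - h * z[i] for i in range(n)]
    Vp, Vm = V(xp), V(xm)
    return [(Vp[i] - Vm[i]) / (2.0 * h) for i in range(n)]

x = [0.5, 0.3, 0.2]
images = [DV_times(x, z) for z in ([1.0, -1.0, 0.0],
                                   [0.0, 1.0, -1.0],
                                   [0.3, 0.2, -0.5])]

assert all(abs(sum(image)) < 1e-6 for image in images)
```

The check works because x ± hz stays on the affine hyperplane ∑ xᵢ = 1 whenever z sums to zero, and V sums to zero everywhere on that hyperplane.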
Keeping this argument in mind is important when using linearization to study stability
under the dynamic (D): rather than looking at all the eigenvalues of DV (x∗ ), we should
only consider those associated with the restricted linear map DV (x∗ ) : TX → TX, which
sends each tangent vector z ∈ TX to a new tangent vector DV (x∗ )z ∈ TX. The scalar
λ = a + ib is an eigenvalue of this restricted map if DV (x∗ )z = λz for some vector z
whose real and imaginary parts are both in TX. If all eigenvalues of this restricted map have
negative real part, then the rest point x∗ is linearly stable under (D) (cf Corollary 7.C.2).
Hines’s Lemma, stated next and proved in Appendix 7.A.7, is often the key to making
these determinations. In stating this result, we let R^n_0 = {z ∈ R^n : z′1 = 0} denote the
tangent space of the simplex. In the single population case, TX and R^n_0 are the same, but it
is useful to separate these two notations in multipopulation cases, where TX = ∏_{p∈P} R^{n^p}_0.
Lemma 7.5.1. Suppose that Q ∈ R^{n×n} is symmetric, satisfies Q1 = 0, and is positive definite with
respect to R^n_0, and that A ∈ R^{n×n} is negative definite with respect to R^n_0. Then each eigenvalue of
the linear map QA : R^n_0 → R^n_0 has negative real part.

7.5.1 The Replicator Dynamic

In this section, we show that any regular ESS x∗ is linearly stable under the replicator
dynamic. To begin, we focus on the case in which x∗ is interior.
Theorem 7.5.2. Let x∗ be a regular interior ESS of F. Then x∗ is linearly stable under the
replicator dynamic.
Proof. (p = 1) The single population replicator dynamic is given by
(R) ẋ_i = V_i(x) = x_i F̂_i(x).

To compute DV(x), recall from equation (6.20) that the derivative of the excess payoff
function F̂(x) = F(x) − 1F̄(x) is given by

DF̂(x) = DF(x) − 1(x′DF(x) + F(x)′) = (I − 1x′)DF(x) − 1F(x)′.
Then applying the product rule for componentwise products (see Appendix 2.A.4), we
ﬁnd that
(7.13) DV(x) = D(diag(x)F̂(x))
= diag(x)DF̂(x) + diag(F̂(x))
= diag(x)((I − 1x′)DF(x) − 1F(x)′) + diag(F̂(x))
= Q(x)DF(x) − x F(x)′ + diag(F̂(x)),

where we write Q(x) = diag(x) − xx′.
Since x∗ is an interior Nash equilibrium, F(x∗ ) is a constant vector, implying that
F(x∗ )′Φ = 0′ and that F̂(x∗ ) = 0. Thus, equation (7.13) becomes

(7.14) DV(x∗ )Φ = Q(x∗ )DF(x∗ )Φ.

Since the matrices Q(x∗ ) and DF(x∗ )Φ satisfy the conditions of Hines’s Lemma, the eigenvalues of DV(x∗ )Φ (and hence of DV(x∗ )) corresponding to directions in R^n_0 have negative
real part. This completes the proof of the theorem.
Exercise 7.5.3. Let x∗ be an interior Nash equilibrium of F that satisfies z′DF(x∗ )z > 0 for
all nonzero z ∈ TX. Show that x∗ is a source under the replicator dynamic: all relevant
eigenvalues of DV (x∗ ) have positive real part, implying that all solutions of the replicator
dynamic that start near x∗ are repelled. (Hint: See the discussion in Appendix 7.A.7.)
Also, construct a game with an equilibrium that satisﬁes the conditions of this result.
Exercise 7.5.4. Show that any regular interior ESS is linearly stable under the projection
dynamic.
The next example highlights the fact that being a regular ESS is only a suﬃcient
condition for an interior equilibrium to be locally stable under the replicator dynamic, not
a necessary condition.
Example 7.5.5. Zeeman’s game revisited. In Example 5.1.7, we introduced the single population game F(x) = Ax generated by random matching in

     0   6  −4
A = −3   0   5 .
    −1   3   0

This game admits Nash equilibria at states x∗ = (1/3, 1/3, 1/3), (4/5, 0, 1/5), and e1 ; the replicator
dynamic has rest points at these states, as well as at the restricted equilibria (0, 5/8, 3/8), e2 ,
and e3 . Examining the phase diagram in Figure 7.5.1, we see that the behavior of the
dynamic near the non-Nash rest points is consistent with Theorem 7.1.1.
Since F is not a stable game (why not?), Theorem 7.5.2 does not tell us whether x∗ is
stable. But we can check this directly: following the proof of Theorem 7.5.2, we compute

                                             4   9  −13
DV(x∗ )Φ = Q(x∗ )DF(x∗ )Φ = Q(x∗ )AΦ = (1/9) · −5  −9   14 .
                                             1   0   −1

In addition to the irrelevant eigenvalue of 0 corresponding to eigenvector 1, this matrix
has a pair of complex eigenvalues, −1/3 ± i(√2/3), corresponding to eigenvectors
(−2 ± i(3√2), 1 ∓ i(3√2), 1) whose real and imaginary parts lie in R^n_0. Since the real parts of the relevant
eigenvalues are both −1/3, the Nash equilibrium x∗ is linearly stable under the replicator
dynamic. §
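The eigenvalue computation in this example can be verified in exact arithmetic. The sketch below rebuilds Q(x∗)AΦ with rational entries and recovers the complex pair from the trace and the sum of principal 2×2 minors, using the fact that the third eigenvalue is 0 (with eigenvector 1).

```python
from fractions import Fraction as Fr

# Exact check of Example 7.5.5: for Zeeman's game, compute
# Q(x*) A Phi at x* = (1/3, 1/3, 1/3).  The nonzero eigenvalues a +- ib
# satisfy 2a = trace and a^2 + b^2 = sum of principal 2x2 minors,
# since the remaining eigenvalue is 0.

A = [[0, 6, -4], [-3, 0, 5], [-1, 3, 0]]
A = [[Fr(v) for v in row] for row in A]
third = Fr(1, 3)

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

Qmat = [[(third if i == j else Fr(0)) - third * third for j in range(3)]
        for i in range(3)]                      # diag(x*) - x* x*'
Phi = [[(Fr(1) if i == j else Fr(0)) - third for j in range(3)]
       for i in range(3)]                       # I - (1/3) 11'

M = matmul(matmul(Qmat, A), Phi)

assert M == [[Fr(v, 9) for v in row]
             for row in ([4, 9, -13], [-5, -9, 14], [1, 0, -1])]

trace = sum(M[i][i] for i in range(3))
minors = sum(M[i][i] * M[j][j] - M[i][j] * M[j][i]
             for i in range(3) for j in range(3) if i < j)

assert trace == Fr(-2, 3)   # so a = -1/3
assert minors == Fr(1, 3)   # so b^2 = 1/3 - 1/9 = 2/9, i.e. b = sqrt(2)/3
```

Exact rational arithmetic sidesteps any floating-point ambiguity in confirming Re λ = −1/3 and Im λ = ±√2/3.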
We now establish the stability of all regular ESSs.
Theorem 7.5.6. Let x∗ be a regular ESS of F. Then x∗ is linearly stable under the replicator
dynamic.
Proof. (p = 1) Suppose without loss of generality that the support of x∗ is {1, . . . , n∗ },
so that the number of unused strategies at x∗ is n0 = n − n∗ . For any matrix M ∈ Rn×n ,
∗∗
we let M++ ∈ Rn ×n denote the upper left n∗ × n∗ block of M, and we deﬁne the blocks
∗0
0
∗
0
0
M+0 ∈ Rn ×n , M0+ ∈ Rn ×n , and M00 ∈ Rn ×n similarly. Also, for each vector v ∈ Rn , we let
∗
0
v+ ∈ Rn and v0 ∈ Rn denote the upper and lower “blocks” of v.
Recall our expression (7.13) for the derivative matrix of the replicator dynamic:
ˆ
DV (x) = Q(x)DF(x) − x F(x) + diag(F(x)),
ˆ
where Q(x) = diag(x) − xx . Now observe that x∗ = 0 for all j > n∗ , that Fi (x∗ ) = 0 for all
j
ˆ
i ≤ n∗ , and, since x∗ is quasistrict, that F j (x∗ ) < 0 for all j > n∗ (see the proof of Lemma
278 1 2 3 Figure 7.5.1: The replicator dynamic in Zeeman’s game. ˆ
ˆ
4.5.4). Therefore, by writing Q = Q(x∗ ), D = DF(x∗ ), π = F(x∗ ), and π = F(x∗ ), we can
express DV (x∗ ) in the block diagonal form
(7.15) Q++ D++ − (x∗ )+ (π+ ) DV (x∗ ) = 0 Q++ D+0 − x∗ (π0 ) . 0
ˆ
diag(π ) To complete the proof of the theorem, we need to show that if v + iw with v, w ∈ Rn is an
0
∗ ) with eigenvalue a + ib, then a < 0.
eigenvector of DV (x
We split the analysis into two cases. Suppose ﬁrst that (v + iw)0 = 0 (i.e., that v j = w j = 0
whenever j > n∗ ). Then it is easy to see that (v + iw)+ must be an eigenvector of DV (x∗ )++ =
Q++ D++ − (x∗ )+ (π+ ) . Now because x∗ is a Nash equilibrium with support {1, . . . , n∗ }, π+ is
a constant vector, and since v, w ∈ Rn and (v + iw)0 = 0, the components of (v + iw)+ sum to
0
∗∗
zero. Together, these observations imply that (x∗ )+ (π+ ) (v + iw)+ = 0. Finally, Q++ ∈ Rn ×n
∗∗
and D++ ∈ Rn ×n satisfy the conditions of Hines’s Lemma, the latter by requirement (7.9)
for regular ESSs, and so this lemma enables us to conclude that a < 0.
Now suppose that (v + iw)0
0, so that v j + iw j
0 for some j > n∗ . Then since
ˆ
the lower right block of DV (x∗ ) is the diagonal matrix diag(π0 ), the jth component of the
∗ ) is π (v + iw ) = (a + ib)(v + iw ), implying that a = π (and
ˆj j
ˆj
eigenvector equation for DV (x
j
j
j
279 also that b = w j = 0). But as we noted above, the fact that x∗ is a quasistrict equilibrium
ˆ
implies that π j < 0, and so that a < 0. This completes the proof of the theorem.
Exercise 7.5.7. Suppose that x∗ = ei is a strict equilibrium of F. Show that for each j
the vector e j − ei is an eigenvector of DV (x∗ ) with eigenvalue F j (x∗ ) − Fi (x∗ ). i, Exercise 7.5.8. Suppose that x∗ is a quasistrict Nash equilibrium of F. We saw in Theorem
ˆ
7.5.6 that for each unused strategy j, the excess payoﬀ F j (x∗ ) is an eigenvalue of DV (x∗ )
ˆ
corresponding to an eigenvector in TX. Assume that F j (x∗ ) is not an eigenvalue of DV (x∗ )
corresponding to an eigenvector in TX ∩ Rn(x∗ ) . Show that
S 1
ζ + n∗ 1 −ι ∈ TX
j
ˆ
is an eigenvector of DV (x∗ ) corresponding to eigenvalue F j (x∗ ), where ι j is the appropriate
0
∗
standard basis vector in Rn , and where ζ is the unique vector in Rn satisfying 1 ζ = 0 and
ˆ
ˆ1
(Q++ D++ − π j I) ζ = π j ( n∗ 1 − (x∗ )+ ) + Q++ (D+0 ι j − 1
D++ 1).
n∗ Why is there exactly one vector that satisﬁes these conditions? What goes wrong if the
ˆ
restriction on F j (x∗ ) does not hold? 7.5.2 General Imitative Dynamics Theorem 7.5.6 established the local stability of all regular ESSs under the replicator
dynamic. Theorem 7.5.9 parlays the previous analysis into a local stability for all imitative
dynamics.
Theorem 7.5.9. Assume that x∗ is a hyperbolic rest point of both the replicator dynamic (R) and
a given imitative dynamic (4.5). Then x∗ is linearly stable under (R) if and only if it is linearly
stable under (4.5). Thus, if x∗ is a regular ESS that satisﬁes the hyperbolicity assumptions, it is
linearly stable under (4.5).
Proof. (p = 1) We only consider the case in which x∗ is interior; for boundary cases, see
Exercise 7.5.12.
Recall from Observation 4.4.16 that any imitative dynamic (4.5) has monotone percentage growth rates: we can express the dynamic as

(7.16)    ẋ_i = x_i G_i(x),

where

(7.17)    G_i(x) ≥ G_j(x) if and only if F_i(x) ≥ F_j(x).

Lemma 7.5.10 shows that property (7.17) imposes a remarkable amount of structure on the derivative matrix of the percentage growth rate function G at the equilibrium x*.
Lemma 7.5.10. Let x∗ be an interior Nash equilibrium, and suppose that ΦDF(x∗ ) and ΦDG(x∗ )
deﬁne invertible maps from TX to itself. Then ΦDG(x∗ )Φ = c ΦDF(x∗ )Φ for some c > 0.
Proof. Since x* is a Nash equilibrium, and hence a rest point of (7.16), we have that ΦF(x*) = ΦG(x*) = 0. It follows that

(7.18)    ΦF(x* + εz) = εΦDF(x*)z + o(ε)  and  ΦG(x* + εz) = εΦDG(x*)z + o(ε)

for all z ∈ TX. Since we can rewrite condition (7.17) as

    (e_i − e_j)′G(x) ≥ 0 if and only if (e_i − e_j)′F(x) ≥ 0,

and since e_i − e_j ∈ TX, equation (7.18) implies that for all i, j ∈ S and z ∈ TX,

(7.19)    (e_i − e_j)′ΦDG(x*)z ≥ 0 if and only if (e_i − e_j)′ΦDF(x*)z ≥ 0.

(This observation is trivial when z = 0, and when z ≠ 0 it follows from the fact that the linear terms dominate in (7.18) when ε is small.) By Proposition 2.B.6, condition (7.19) is equivalent to the requirement that for all i, j ∈ S, there is a c_ij > 0 such that

(7.20)    (e_i − e_j)′ΦDG(x*)Φ = c_ij (e_i − e_j)′ΦDF(x*)Φ.

Now write g_ij = (e_i − e_j)′ΦDG(x*)Φ and f_ij = (e_i − e_j)′ΦDF(x*)Φ. Since by assumption ΦDF(x*)Φ is an invertible map from TX to itself, so is its transpose (see Exercise 7.5.11 below). Therefore, when i, j, and k are distinct, the unique decomposition of f_ik as a linear combination of f_ij and f_jk is as f_ij + f_jk. But equation (7.20) reveals that

    c_ij f_ij + c_jk f_jk = g_ij + g_jk = g_ik = c_ik f_ik,

and so c_ij = c_jk = c_ik. This and the fact that c_ij = c_ji imply that c_ij is independent of i and j. So, since vectors of the form e_i − e_j span TX, we conclude from equation (7.20) that ΦDG(x*)Φ = c ΦDF(x*)Φ, where c is the common value of the constants c_ij. This completes the proof of the lemma.
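As a numerical sanity check of the lemma, the sketch below (our own illustration, not the book's code) takes G_i(x) = g(F_i(x)) for a smooth increasing transform g, which satisfies the monotonicity condition (7.17); at an interior equilibrium, where all payoffs share a common value F̄, the lemma's constant works out to c = g′(F̄) > 0. The game and transform are assumptions made for the example.

```python
import numpy as np

# "Good RPS" payoffs: F(x) = Ax has an interior Nash equilibrium at the
# barycenter x* = (1/3)1, where every strategy earns the common payoff 1/3.
A = np.array([[0., -1., 2.],
              [2.,  0., -1.],
              [-1., 2.,  0.]])
n = A.shape[0]
xstar = np.ones(n) / n
Phi = np.eye(n) - np.ones((n, n)) / n        # orthogonal projection onto TX

g = np.tanh                                  # increasing transform
gprime = lambda s: 1.0 - np.tanh(s) ** 2

def jacobian(f, x, h=1e-6):
    """Central finite-difference Jacobian of f at x."""
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

DF = jacobian(lambda x: A @ x, xstar)        # equals A for this linear game
DG = jacobian(lambda x: g(A @ x), xstar)     # G_i(x) = g(F_i(x)) obeys (7.17)

c = gprime(1 / 3)                            # common payoff at x* is 1/3
assert c > 0
assert np.allclose(Phi @ DG @ Phi, c * (Phi @ DF @ Phi), atol=1e-6)
```

Here the proportionality is exact because DG(x*) = g′(F̄)DF(x*) row by row; the lemma shows that the same conclusion holds in the tangent directions for any dynamic satisfying (7.17).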
We proceed with the proof of Theorem 7.5.9. Let V(x) = diag(x)F̂(x) and W(x) = diag(x)G(x) denote the replicator dynamic (R) and the dynamic (7.16), respectively. Since W(x) ∈ TX, we have that 1′W(x) = x′G(x) = 0, and hence that Ĝ(x) ≡ G(x) − 1x′G(x) = G(x). Thus (7.16) can be rewritten as W(x) = diag(x)Ĝ(x).

Now, repeating calculation (7.13) reveals that

    DW(x) = Q(x)DG(x) − xG(x)′ + diag(Ĝ(x)).

Since x* is an interior rest point of W, G(x*) is a constant vector and Ĝ(x*) = 0, and so

    DW(x*)Φ = Q(x*)DG(x*)Φ = Q(x*)ΦDG(x*)Φ,

where the second equality follows from the fact that Q(x*)1 = 0. Similar reasoning for the replicator dynamic V shows that

    DV(x*)Φ = Q(x*)ΦDF(x*)Φ.

Lemma 7.5.10 tells us that ΦDG(x*)Φ = cΦDF(x*)Φ for some c > 0. We therefore conclude from the previous two equations that if x* is a hyperbolic rest point under V and W, its stability properties under the two dynamics are the same.
Exercise 7.5.11. Suppose that A ∈ R^{n×n} defines an invertible map from R^n_0 to itself and maps the vector 1 to the origin. Show that A′ must also have these properties. (Hint: Use the Fundamental Theorem of Linear Algebra (7.27).)

Exercise 7.5.12. Extend the proof of Theorem 7.5.9 above to the case of boundary equilibria. (Hint: Combine Lemma 7.5.10 with the proof of Theorem 7.5.6.)

7.6 Linearization of Perturbed Best Response Dynamics

Linearization is also a useful tool for studying perturbed best response dynamics, our other main class of differentiable evolutionary dynamics.

7.6.1 Deterministically Perturbed Best Response Dynamics

In Chapter 5, we saw that perturbed best response dynamics can be defined in terms of either stochastic or deterministic payoff perturbations. But Theorem 5.2.2 showed that there is no loss of generality in focusing on the latter case, and so we will do so here.

Our first result shows that a negative definiteness condition on the payoff derivative
is a suﬃcient condition for stability. The conclusion here is similar to that from Exercise
7.4.5, but the analysis is much simpler, and establishes not only asymptotic stability, but
also linear stability.
Theorem 7.6.1. Consider the perturbed best response dynamic for the pair (F, v), and let x̃ be a perturbed equilibrium of this pair. If DF(x̃) is negative definite with respect to TX, then x̃ is linearly stable.

Proof. (p = 1) In the single population case, the perturbed best response dynamic takes the form

(7.21)    ẋ = M̃(F(x)) − x,

where the perturbed maximizer function M̃ is defined in equation (5.12). By the chain rule, the derivative of the law of motion (7.21) is

(7.22)    DV(x) = DM̃(F(x))DF(x) − I.

To determine the eigenvalues of the product DM̃(F(x̃))DF(x̃), let us recall the properties of the derivative matrix DM̃(π) from Corollary 5.C.5: it is symmetric, positive definite on R^n_0, and satisfies DM̃(π)1 = 0. Since we have assumed that DF(x̃) is negative definite with respect to R^n_0, Hines’s Lemma implies that the eigenvalues of DM̃(F(x̃))DF(x̃) (as a map from R^n_0 to itself) have negative real part. Subtracting the identity matrix I from the matrix product reduces each of these eigenvalues by 1, so the theorem is proved.
Exercise 7.6.2. Show that the conclusion of the theorem continues to hold if DF(x̃) is only negative semidefinite with respect to TX. (Hint: See the discussion in Appendix 7.A.7.)

Exercise 7.6.3. Let x̃ be a perturbed equilibrium for (F, v). Let λ̄ be the largest eigenvalue of DM̃(F(x̃)), and let s̄ be the largest singular value of ΦDF(x̃)Φ (see Section 7.A.6). Show that if λ̄s̄ < 1, then x̃ is linearly stable: that is, x̃ is stable whenever choice probabilities are not too sensitive to changes in payoffs, or payoffs are not too sensitive to changes in the state.

7.6.2 The Logit Dynamic

Imposing the additional structure provided by logit choice allows us to carry our local stability analysis further. First, building on Theorem 7.4.6, we argue that any regular interior ESS must have a linearly stable logit(η) equilibrium nearby whenever the noise level η is sufficiently small.
Corollary 7.6.4. Let x* be a regular interior ESS of F. Then for some neighborhood O of x* and all η > 0 less than some η̂ > 0, there is a unique and linearly stable logit(η) equilibrium x_η in O.

Proof. (p = 1) Theorem 7.4.6 tells us that for η small enough, the equilibrium x_η exists and is unique, and that lim_{η→0} x_η = x*. Since x* is a regular interior ESS, DF(x*) is negative definite with respect to TX, so by continuity, DF(x_η) is negative definite with respect to TX for all η close enough to 0. The result therefore follows from Theorem 7.6.1.
The derivative matrix for the logit dynamic takes an especially appealing form. Recall from Exercise 5.2.7 that the derivative matrix of the logit(η) choice function is

(7.23)    DM̃^η(π) = η^{−1}(diag(M̃^η(π)) − M̃^η(π)M̃^η(π)′) = η^{−1}Q(M̃^η(π)).

Now by definition, the logit equilibrium x_η satisfies M̃^η(F(x_η)) = x_η. Substituting this fact into equations (7.22) and (7.23) yields

(7.24)    DV^η(x_η) = η^{−1}Q(x_η)DF(x_η) − I.

To see the importance of this equation, recall from equation (7.14) that at interior rest points, the derivative matrix for the replicator dynamic satisfies

(7.25)    DV(x*)Φ = Q(x*)DF(x*)Φ.

Together, equations (7.24) and (7.25) show that when evaluated at their respective rest points and in the relevant tangent directions, the linearizations of the replicator and logit dynamics at their interior rest points differ only by a positive affine transformation!

Example 7.6.5. To obtain the cleanest connections between the two dynamics, consider a game that admits a Nash equilibrium x* = (1/n)1 at the barycenter of the simplex. Then by symmetry, x_η = x* is also a logit(η) equilibrium for every η > 0. By the logic above, λ is a relevant eigenvalue of (7.25) if and only if η^{−1}λ − 1 is a relevant eigenvalue of (7.24). It follows that if x* is linearly stable under the replicator dynamic, then it is also linearly stable under the logit(η) dynamic for any η > 0. §
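The affine relation between the two linearizations is easy to see numerically. The sketch below is our own illustration (the game and noise level are assumptions, not from the text); it uses a "good Rock–Paper–Scissors" game whose Nash equilibrium sits at the barycenter:

```python
import numpy as np

# Good RPS (wins pay 2, losses cost 1): F(x) = Ax, Nash equilibrium x* = (1/3)1.
A = np.array([[0., -1., 2.],
              [2.,  0., -1.],
              [-1., 2.,  0.]])
n = A.shape[0]
xstar = np.ones(n) / n
Q = np.diag(xstar) - np.outer(xstar, xstar)   # Q(x*)

M = Q @ A                   # replicator linearization, cf. (7.25); DF(x*) = A
eta = 0.1
L = M / eta - np.eye(n)     # logit linearization at x_eta = x*, cf. (7.24)

# The spectra are related by the positive affine map lambda -> lambda/eta - 1.
eig_M = np.sort_complex(np.linalg.eigvals(M))
eig_L = np.sort_complex(np.linalg.eigvals(L))
assert np.allclose(np.sort_complex(eig_M / eta - 1.0), eig_L)

# The relevant (nonzero) replicator eigenvalues have negative real part, so the
# barycenter is linearly stable under both dynamics, for every eta > 0.
relevant = eig_M[np.abs(eig_M) > 1e-9]
assert relevant.real.max() < 0
```

For this game the relevant replicator eigenvalues form a complex pair with negative real part, so the logit eigenvalues η^{−1}λ − 1 also have negative real part for every η > 0, just as Example 7.6.5 predicts.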
The foregoing discussion shows how analyses of local stability under the replicator
and logit dynamics can be closely linked. Pushing these arguments further, one can use
equations (7.24) and (7.25) to connect the long run behaviors of the replicator and best
response dynamics starting from arbitrary initial conditions—see the Notes for further discussion.

Appendix

7.A Matrix Analysis

In this section we review some basic ideas from matrix analysis. In doing so, we lay
the groundwork for our introduction to linear diﬀerential equations in Appendix 7.B; this
in turn underlies our introduction to local linearization of nonlinear diﬀerential equations
in Appendix 7.C. The techniques presented here are also used to perform the explicit
calculations that arise when using linearization to analyze evolutionary dynamics.

7.A.1 Rank and Invertibility

While in most of this section we focus on square matrices, we start by considering
matrices A ∈ Rm×n of arbitrary dimensions. The rank of A is the number of linearly
independent columns of A, or, equivalently, the dimension of its range. The nullspace (or
kernel) of A is the set of vectors that the matrix maps to the origin, and the dimension of
this set is called the nullity of A. The rank and nullity of a matrix must sum to its number
of columns:
(7.26)    dim(nullspace(A)) + dim(range(A)) = n;
          dim(nullspace(A′)) + dim(range(A′)) = m.

In Appendix 2.B.2, we introduced the Fundamental Theorem of Linear Algebra:

(7.27)    range(A) = (nullspace(A′))⊥.

To derive a key implication of (7.27) for the ranks of matrices, first recall that any subspace V ⊆ R^m satisfies dim(V) + dim(V⊥) = m. Letting V = nullspace(A′) and then combining the result with equation (7.26), we obtain

    dim(range(A′)) = dim((nullspace(A′))⊥).

Therefore, (7.27) yields

    dim(range(A′)) = dim(range(A)).

In words: every matrix has the same rank as its transpose.
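This fact is easy to confirm numerically; a quick sketch (the random matrix is our own illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
# A 5x7 matrix built as a product of generic random factors, so its rank is 3.
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 7))

assert np.linalg.matrix_rank(A) == 3
# A and its transpose have the same rank, as (7.26)-(7.27) imply.
assert np.linalg.matrix_rank(A.T) == np.linalg.matrix_rank(A)
```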
From this point forward, we suppose that A ∈ Rn×n is a square matrix. We say that A
is invertible if it admits an inverse matrix A−1 : that is, a matrix satisfying A−1 A = I. Such a
matrix also satisﬁes AA−1 = I, and when an inverse matrix exists, it is unique. Invertible
matrices can be characterized in a variety of ways: for instance, a matrix is invertible if
and only if it has full rank (i.e., if A ∈ Rn×n has rank n); alternatively, a matrix is invertible
if and only if its determinant is nonzero. 7.A.2 Eigenvectors and Eigenvalues Let A ∈ Rn×n , and suppose that
(7.28) Ax = λx for some complex scalar λ ∈ C and some nonzero complex vector x ∈ Cn . Then we call λ
an eigenvalue of A, and x an eigenvector of A associated with λ; sometimes, the pair (λ, x) is
referred to as an eigenpair.
The eigenvector equation (7.28) can be rewritten as (λI − A)x = 0. This equation
can only be satisﬁed by a nonzero vector if (λI − A) is not invertible, or, equivalently, if
det(λI − A) = 0. It follows that λ is an eigenvalue of A if and only if λ is a root of the
characteristic polynomial det(tI − A).
Since det(tI − A) is a polynomial of degree n in t, the Fundamental Theorem of Algebra
ensures that it has n complex roots:
(7.29)    det(tI − A) = (t − λ_1)(t − λ_2) ⋯ (t − λ_n).

To be sure to obtain n roots, we must “count multiplicities”: if the values of λ_i in the above
expression are not all distinct, the repeated values must be tallied each time they appear.
Evidently, each λi in (7.29) is an eigenvalue of A; if the value λ is repeated k times in (7.29),
we say that λ is an eigenvalue of A of (algebraic) multiplicity k.
We note in passing that the sum and the product of the eigenvalues of A can be described very simply:

    λ_1 + λ_2 + ⋯ + λ_n = tr(A);    λ_1 λ_2 ⋯ λ_n = det(A).

(Here, the trace tr(A) of the matrix A is the sum of its diagonal elements.) To remember
these formulas, notice that they are trivially true if A is a diagonal matrix, since in this
case the eigenvalues of A are its diagonal entries.
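These identities are easy to verify numerically; a quick sketch with an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
eigs = np.linalg.eigvals(A)          # complex in general

# Sum of eigenvalues equals the trace; product equals the determinant.
assert np.isclose(eigs.sum(), np.trace(A))
assert np.isclose(eigs.prod(), np.linalg.det(A))
```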
Each eigenvalue of A corresponds to at least one eigenvector of A, and if an eigenvalue
λ is of algebraic multiplicity k, then there can be as many as k linearly independent
eigenvectors of A corresponding to this eigenvalue. This number of linearly independent
eigenvectors is called the geometric multiplicity of λ. The collection of all eigenvectors
corresponding to λ, the eigenspace of λ, is a subspace of Cn of dimension equal to the
geometric multiplicity of λ.
Example 7.A.1. Let a, b ∈ R be nonzero, and consider these three 2 × 2 matrices (written row by row):

    A = [[a, 0], [0, a]];    B = [[a, b], [0, a]];    C = [[a, b], [−b, a]].
The matrix A has just one eigenvalue, a, which therefore has algebraic multiplicity 2.
It also has geometric multiplicity 2, as its eigenspace is all of C2 = span({e1 , e2 }). (This
description of C2 relies on our allowing complex scalars when taking linear combinations
of e1 and e2 .)
The matrix B also has a lone eigenvalue of a of algebraic multiplicity 2. But here the
geometric multiplicity of a is just 1, since its eigenspace is span({e1 }).
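The contrast between A and B can be checked numerically: the geometric multiplicity of an eigenvalue λ is the nullity of the matrix minus λI. A quick sketch (the values a = 2, b = 3 are illustrative):

```python
import numpy as np

a, b = 2.0, 3.0
A = np.array([[a, 0.], [0., a]])     # eigenvalue a, geometric multiplicity 2
B = np.array([[a, b], [0., a]])      # eigenvalue a, geometric multiplicity 1

def geometric_multiplicity(M, lam):
    # dimension of the eigenspace = nullity of (M - lam*I)
    n = M.shape[0]
    return n - np.linalg.matrix_rank(M - lam * np.eye(n))

assert geometric_multiplicity(A, a) == 2
assert geometric_multiplicity(B, a) == 1
```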
The matrix C has no real eigenvalues or eigenvectors; however, it has complex eigenvalues a ± ib ∈ C corresponding to the complex eigenvectors e_1 ± ie_2 ∈ C^2.

Let us explain for future reference the geometry of the linear map x → Cx. By writing r = √(a² + b²) and θ = cos^{−1}(a/r), we can express the matrix C as

    C = r [[cos θ, sin θ], [−sin θ, cos θ]].

Computing Cx for various values of x (try x = e_1 and x = e_2) reveals that the map x → Cx first rotates the vector x around the origin clockwise by an angle of θ, and then rescales the result by a factor of r. §

7.A.3 Similarity, (Block) Diagonalization, and the Spectral Theorem

The matrix A ∈ R^{n×n} is similar to matrix B ∈ R^{n×n} if there exists an invertible matrix
S ∈ Cn×n , called a similarity matrix, such that
B = S^{−1}AS.

When A is similar to B, the linear transformations x → Ax and y → By are equivalent up
to a linear change of variable. Similarity deﬁnes an equivalence relation on the set of n × n
matrices, and matrices that are similar have the same characteristic polynomial and the
same eigenvalues, counting either algebraic or geometric multiplicities.
If A is similar to a diagonal matrix D—that is, if A is diagonalizable—then the eigenvalues
of A are simply the diagonal elements of D. In this deﬁnition the similarity matrix
is allowed to be complex; if the similarity can be achieved via a real similarity matrix
S ∈ Rn×n , then the diagonal matrix D is also real, and we call A real diagonalizable.
It follows easily from our deﬁnitions that a matrix A is diagonalizable if and only if
the sum of the geometric multiplicities of the eigenvalues of A is n. Equivalently, A is
diagonalizable if and only if each of its eigenvalues has equal algebraic and geometric
multiplicities. It is simple to verify that in this case, a similarity matrix S can be constructed
by choosing n linearly independent eigenvectors of A to be its columns.
It is especially convenient when similarity can be achieved using a similarity matrix that
is itself of a simple form. The most important instance occurs when this matrix is an
orthogonal matrix, meaning that its columns form an orthonormal basis for Rn : each column
is of length 1, and distinct columns are orthogonal. (It would make more sense to call
such a matrix an “orthonormal matrix”, but the term “orthogonal matrix” is traditional.)
Orthogonal matrices can be characterized in a variety of ways:
Theorem 7.A.2. The following are equivalent:
(i) R is an orthogonal matrix.
(ii) R′R = I.
(iii) R′ = R^{−1}.
(iv) The map x → Rx preserves lengths: ‖Rx‖ = ‖x‖ for all x ∈ R^n.
(v) The map x → Rx preserves inner products: (Rx)′(Ry) = x′y for all x, y ∈ R^n.
(vi) The map x → Rx is a composition of rotations and reﬂections.
The last three items are summarized by saying that the linear transformation x → Rx
deﬁned by an orthogonal matrix R is a Euclidean isometry.
While showing that a matrix is similar to a diagonal matrix is quite useful, showing
similarity to a block diagonal matrix often serves just as well. We focus on block diagonal
matrices with diagonal blocks of these two types:

    J_1 = (λ);    J_2 = [[a, b], [−b, a]].

For reasons that will become clear in Section 7.A.5, we call block diagonal matrices of this form simple Jordan matrices. Calculations with simple Jordan matrices are often little more
diﬃcult than those with diagonal matrices: for instance, multiplying such a matrix by
itself retains its block diagonal structure.
To apply these ideas, let us call the matrix A ∈ R^{n×n} normal if it commutes with its transpose: that is, if A′A = AA′.
Theorem 7.A.3 (The Spectral Theorem for Real Normal Matrices). The matrix A ∈ Rn×n is
normal if and only if it is similar via an orthogonal matrix R to a simple Jordan matrix B = R−1 AR.
The matrix B is unique up to the ordering of the diagonal blocks.
The spectral decomposition of A provides a full account of the eigenvalues and eigenvectors of A. Each J1 block ( λ ) contains a real eigenvalue of A, and the pair of complex
numbers a ± i b derived from each J2 block are complex eigenvalues of A. Moreover,
columns of the orthogonal similarity matrix R either are real eigenvectors of A, or are real
and imaginary parts of complex eigenvectors of A.
The spectral theorem tells us that if A is normal, the behavior of the linear map
x → Ax = RBR−1 x can be decomposed into three simple steps. First, one applies the
orthogonal transformation R^{−1} = R′ to x, obtaining y = R′x. Second, one applies the block diagonal matrix B to y: each J_1 block rescales a component of y, while each J_2 block rotates and rescales a pair of components of y (cf. Example 7.A.1). Third, one applies R to BR′x to
undo the initial orthogonal transformation.
Additional restrictions on the J1 and J2 blocks yield characterizations of important
subclasses of the normal matrices.
Corollary 7.A.4. (i) The matrix A ∈ R^{n×n} is symmetric (A′ = A) if and only if it is similar via an orthogonal matrix R to a simple Jordan matrix containing only J_1 blocks. Thus, the symmetric matrices are the normal matrices with real eigenvalues.
(ii) The matrix A ∈ R^{n×n} is skew-symmetric (A′ = −A) if and only if it is similar via an orthogonal matrix R to a simple Jordan matrix whose J_1 blocks all have λ = 0 and whose J_2 blocks all have a = 0. Thus, the skew-symmetric matrices are the normal matrices with purely imaginary eigenvalues.
(iii) The matrix A ∈ R^{n×n} is orthogonal (A′ = A^{−1}) if and only if it is similar via an orthogonal matrix R to a simple Jordan matrix whose J_1 blocks all have λ² = 1 and whose J_2 blocks all have a² + b² = 1. Thus, the orthogonal matrices are the normal matrices whose eigenvalues have modulus 1.

7.A.4 Symmetric Matrices

Which matrices are real diagonalizable by an orthogonal matrix? The spectral theorem
for symmetric matrices tells us that A is real diagonalizable by an orthogonal matrix if and
only if it is symmetric. (This is just a restatement of Corollary 7.A.4.) Among other things,
the spectral theorem implies that the eigenvalues of a symmetric matrix are real.
While we often associate a matrix A with the linear transformation x → Ax, a symmetric
matrix is naturally associated with a quadratic form, x → x′Ax. In fact, the eigenvalues of a symmetric matrix can be characterized in terms of its quadratic form. The Rayleigh–Ritz Theorem provides simple descriptions of λ̄ and λ, the maximal and minimal eigenvalues of A:

    λ̄ = max_{x ∈ R^n: ‖x‖ = 1} x′Ax;    λ = min_{x ∈ R^n: ‖x‖ = 1} x′Ax.

The Courant–Fischer Theorem shows how the remaining eigenvalues of A can be expressed in terms of a related sequence of min-max problems.

We say that the matrices A, B ∈ R^{n×n} are congruent if there is an invertible matrix
We say that the matrices A, B ∈ Rn×n are congruent if there is an invertible matrix
Q ∈ Rn×n such that
B = QAQ′.
Congruence plays the same role for quadratic forms as similarity does for linear transformations: if two symmetric matrices are congruent, they deﬁne the same quadratic form up
to a linear change of variable. Like similarity, congruence deﬁnes an equivalence relation
on the set of n × n matrices. Lastly, note that two symmetric matrices that are similar by
an orthogonal matrix Q are also congruent, since in this case Q′ = Q^{−1}.
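The Rayleigh–Ritz bounds and the effect of congruence are easy to probe numerically; the sketch below (random matrices chosen purely for illustration) checks both, including the fact — Sylvester's Law of Inertia, discussed next — that congruence preserves the signs of the eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.standard_normal((4, 4))
A = S + S.T                                  # a symmetric matrix
eigs = np.linalg.eigvalsh(A)                 # sorted ascending, all real

# Rayleigh-Ritz: x'Ax over unit vectors stays between the extreme eigenvalues,
# and the maximum is attained at the corresponding eigenvector.
x = rng.standard_normal(4); x /= np.linalg.norm(x)
assert eigs[0] - 1e-9 <= x @ A @ x <= eigs[-1] + 1e-9
vmax = np.linalg.eigh(A)[1][:, -1]
assert np.isclose(vmax @ A @ vmax, eigs[-1])

# Congruence B = QAQ' preserves the signs of the eigenvalues (the inertia),
# though generally not the eigenvalues themselves.
Qm = rng.standard_normal((4, 4))             # invertible with probability 1
B = Qm @ A @ Qm.T
assert np.array_equal(np.sign(np.linalg.eigvalsh(B)), np.sign(eigs))
```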
The eigenvalues of congruent symmetric matrices are closely linked. Deﬁne the inertia
of a symmetric matrix to be the ordered triple consisting of the numbers of positive, negative, and zero eigenvalues of the matrix. Sylvester’s Law of Inertia tells us that congruent
symmetric matrices have the same inertia. Ostrowski’s Theorem provides a quantitative
extension of this result: if we list the eigenvalues of A and the eigenvalues of B in increasing order, then the ratios between pairs of corresponding eigenvalues are bounded by the
minimal and maximal eigenvalues of Q′Q.

7.A.5 The Real Jordan Canonical Form

How can we tell if two matrices are similar? If the matrices are diagonalizable, then one
can check for similarity by diagonalizing the two matrices and seeing whether the same
diagonal matrix is obtained in each case. To apply this logic beyond the diagonalizable
case, we would need to ﬁnd a simple class of matrices with the property that every matrix
is similar to a unique representative from this class. Such a class of matrices would also
provide a powerful computational aid, since calculations involving arbitrary matrices
could be reduced by similarity to calculations with these simple matrices.
With this motivation, we deﬁne a real Jordan matrix to be a block diagonal matrix whose
diagonal blocks, known as Jordan blocks, are of these four types: J_1 = (λ) and J_2 = [[a, b], [−b, a]] as before; J_3, a square matrix with λ in every diagonal position, 1 in every superdiagonal position, and 0 elsewhere; and J_4, a block matrix with J_2 blocks in every diagonal position, 2 × 2 identity matrices I in every superdiagonal block position, and 0 elsewhere.

Theorem 7.A.5. Every matrix A ∈ R^{n×n} is similar via a real similarity matrix S to a real Jordan
matrix J = S−1 AS. The latter matrix is unique up to the ordering of the Jordan blocks.
The real Jordan matrix in the statement of the theorem is called the real Jordan canonical
form of A.
The blocks in the real Jordan form of A provide detailed information about the eigenvalues of A: each J1 block corresponds to a real eigenvalue λ; each J2 block corresponds to
a pair of complex eigenvalues a ± i b; each J3 block corresponds to a real eigenvalue with
less than full geometric multiplicity; and each J4 block corresponds to a pair of complex
eigenvalues with less than full geometric multiplicities. (We can say more if each Jordan
block represents a distinct eigenvalue: then each eigenvalue has geometric multiplicity 1;
the J1 and J2 blocks correspond to eigenvalues whose algebraic multiplicities are also 1;
and the J3 and J4 blocks correspond to eigenvalues with higher algebraic multiplicities,
with these multiplicities being given by the number of appearances of λ (in a J3 block) or
of J2 blocks (in a J4 block).)
Example 7.A.6. Suppose that A ∈ R2×2 has complex eigenvalues a ± i b with complex
eigenvectors v ± i w. Then A(v + i w) = (a + i b)(v + i w). Equating the real and imaginary
parts of this equation yields

    A [v w] = [v w] [[a, b], [−b, a]],

where [v w] denotes the matrix with columns v and w. Premultiplying by [v w]^{−1} reveals that the real Jordan form of A is a single J_2 block. §
Example 7.A.7. Suppose that A ∈ R^{2×2} has a lone eigenvalue, λ ∈ R, which is of algebraic
multiplicity 2 but geometric multiplicity 1. Let x ∈ R2 be an eigenvector of A, so that
(A − λI)x = 0. It can be shown that there exists a vector y that is linearly independent of
x and that satisﬁes (A − λI) y = x. (Such a vector (and, more generally, vectors that satisfy
higher iterates of this equation) is called a generalized eigenvector of A.) Rewriting the two
equations above, we obtain

    A [x y] = [λx  x + λy] = [x y] [[λ, 1], [0, λ]].

Premultiplying the first and last expressions by [x y]^{−1} shows that A has a real Jordan form consisting of a single J_3 block. §

7.A.6 The Spectral Norm and Singular Values

It is often useful to be able to place bounds on the amount of “expansion” generated
by a linear map x → Ax, or by a composite linear map x → Bx → ABx. One can obtain
such bounds by introducing the spectral norm of a matrix A ∈ Rn×n , deﬁned by
    ‖A‖ = max_{x: ‖x‖ = 1} ‖Ax‖.

(As always in this book, ‖x‖ denotes the Euclidean norm of the vector x.) It is not difficult to check that the spectral norm is submultiplicative, in the following two senses:

    ‖Ax‖ ≤ ‖A‖ ‖x‖;  and  ‖AB‖ ≤ ‖A‖ ‖B‖.

These inequalities often work hand in hand with the Cauchy–Schwarz inequality, which expresses the submultiplicativity of inner products of vectors:

    |x′y| ≤ ‖x‖ ‖y‖.
To compute the spectral norm of a matrix, it is best to describe it in a diﬀerent way. The
product A′A generated by any matrix A is symmetric. It therefore has n real eigenvalues (see Section 7.A.4), and it can be shown that these eigenvalues are nonnegative. The square roots of the eigenvalues of A′A are called the singular values of A.

One can show that the spectral norm of A equals the largest singular value of A:

    ‖A‖ = max{√λ : λ is an eigenvalue of A′A}.

It makes no difference here if we replace A′A with AA′, since for any A, B ∈ R^{n×n}, AB and
BA have the same eigenvalues.
The notion of a singular value also underpins the singular value decomposition:

Theorem 7.A.8. Every matrix A ∈ R^{n×n} can be expressed as A = VΣW′, where V and W are
orthogonal matrices, and where Σ is a diagonal matrix whose diagonal entries are the singular
values of A.
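A short numerical sketch (the random matrices are illustrative) ties together the singular values, the spectral norm, and its submultiplicativity:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Singular values = square roots of the eigenvalues of A'A (all nonnegative).
svals = np.linalg.svd(A, compute_uv=False)
assert np.allclose(np.sort(svals**2), np.sort(np.linalg.eigvalsh(A.T @ A)))

# The spectral norm ||A|| equals the largest singular value.
spec = np.linalg.norm(A, ord=2)
assert np.isclose(spec, svals.max())

# Submultiplicativity: ||AB|| <= ||A|| ||B||.
assert np.linalg.norm(A @ B, ord=2) <= spec * np.linalg.norm(B, ord=2) + 1e-12
```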
In this decomposition, the columns of V are eigenvectors of AA′, and the columns of W are eigenvectors of A′A.

7.A.7 Hines’s Lemma

In Section 7.5, we introduced Hines’s Lemma:
Lemma 7.5.1. Suppose that Q ∈ R^{n×n} is symmetric, satisfies Q1 = 0, and is positive definite with respect to R^n_0, and that A ∈ R^{n×n} is negative definite with respect to R^n_0. Then each eigenvalue of the linear map QA: R^n_0 → R^n_0 has negative real part.
If we ignored the complications caused by the fact that our dynamics are restricted to the
simplex, Lemma 7.5.1 would reduce to
Lemma 7.A.9. If Q is symmetric positive deﬁnite and A is negative deﬁnite, then the eigenvalues
of QA have negative real parts.
The proof of Lemma 7.A.9 is a simpler version of the proof below.
The argument below can also be used when other deﬁniteness conditions are imposed
on A. In particular, if A is only negative semidefinite with respect to R^n_0, then the relevant eigenvalues of QA have nonpositive real parts, and if A is positive definite with respect to R^n_0, the relevant eigenvalues of QA have positive real parts.
Proof of Lemma 7.5.1. Since Q is positive definite with respect to R^n_0, since Q1 = 0, and since R^n = R^n_0 ⊕ span({1}), we have that nullspace(Q) = span({1}). Thus, because Q is symmetric, the Fundamental Theorem of Linear Algebra (7.27) tells us that

    range(Q) = (nullspace(Q′))⊥ = (nullspace(Q))⊥ = (span({1}))⊥ = R^n_0.

In other words, Q maps R^n_0 onto itself, and so is invertible on this space.
Now suppose that

(7.30)    QA(v + iw) = (a + ib)(v + iw)

for some v, w ∈ R^n_0 with v + iw ≠ 0 and some a, b ∈ R. Since Q is invertible on R^n_0, there exist y, z ∈ R^n_0, at least one of which is not 0, such that Qy = v and Qz = w. We can thus rewrite equation (7.30) as

    QA(v + iw) = (a + ib)Q(y + iz).

Since Q is invertible on R^n_0, this implies that

    A(v + iw) = (a + ib)(y + iz).

Premultiplying by (v − iw)′ = (Q(y − iz))′ yields

    (v − iw)′A(v + iw) = (a + ib)(y − iz)′Q(y + iz).

Equating the real parts of each side yields

    v′Av + w′Aw = a(y′Qy + z′Qz).

Since Q is positive definite with respect to R^n_0 and A is negative definite with respect to R^n_0, we conclude that a < 0.
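Hines's Lemma can be checked numerically. The sketch below is our own construction (the matrices are chosen to satisfy the hypotheses); it restricts QA to an orthonormal basis of R^n_0 and inspects the eigenvalues of the restricted map:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
one = np.ones(n)

# Q: symmetric, Q1 = 0, positive definite with respect to R^n_0.
x = one / n
Q = np.diag(x) - np.outer(x, x)

# A: negative definite on all of R^n (hence on R^n_0); adding a skew-symmetric
# part K leaves z'Az unchanged, so A need not be symmetric.
C = rng.standard_normal((n, n))
K = rng.standard_normal((n, n)); K = K - K.T
A = -(C @ C.T + np.eye(n)) + K

# Orthonormal basis E of R^n_0 from a QR factorization whose first column spans 1.
basis, _ = np.linalg.qr(np.column_stack([one, np.eye(n)]))
E = basis[:, 1:]                              # n x (n-1), columns orthogonal to 1

M = E.T @ (Q @ A) @ E                         # QA as a map from R^n_0 to itself
assert np.linalg.eigvals(M).real.max() < 0    # every eigenvalue has negative real part
```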
7.B Linear Differential Equations

The simplest ordinary differential equations on R^n are linear differential equations:

(L)    ẋ = Ax,

where A ∈ R^{n×n}. Although our main interest in this book is in nonlinear differential
equations, linear diﬀerential equations are still very important to us: as we explain in
Section 7.C, the behavior of a nonlinear equation in the neighborhood of a rest point is
often well approximated by the behavior of a linear equation in a neighborhood of the origin.

7.B.1 Examples

Example 7.B.1. Linear dynamics on the line. In the one-dimensional case, equation (L)
becomes ẋ = ax. We described the solutions to this equation from initial condition x_0 = ξ in Example 3.A.1: they are of the form x_t = ξ exp(at). Thus, if a ≠ 0, the equation has its
unique rest point at the origin. If a > 0, all solutions other than the stationary one move
away from the origin, while if a < 0, all solutions converge to the origin. §
One can always apply a linear change of variable to (L) to reduce it to a simpler form.
In particular, if B = SAS^{−1} is similar to A, let y = Sx; then since ẏ = Sẋ, we can rewrite (L) as S^{−1}ẏ = AS^{−1}y, and hence as ẏ = By. It follows from this observation and from Theorem
7.A.5 that to understand linear diﬀerential equations, it is enough to understand linear
diﬀerential equations deﬁned by real Jordan matrices.
Example 7.B.2. Linear dynamics on the plane. There are three generic types of 2 × 2 matrices:
diagonalizable matrices with two real eigenvalues, diagonalizable matrices with two
complex eigenvalues, and nondiagonalizable matrices with a single real eigenvalue. The
corresponding real Jordan forms are a diagonal matrix (which contains two J1 blocks), a
J2 matrix, and a J3 matrix, respectively. We therefore consider linear diﬀerential equations
based on these three types of real Jordan matrices.
When A is diagonal, the linear equation (L) and its solution from initial condition x_0 = ξ are of the following form:

    ẋ = Ax = [[λ, 0], [0, μ]] x;    x_t = (ξ_1 e^{λt}, ξ_2 e^{μt}).

The phase diagrams in Figure 7.B.1 show that the behavior of this dynamic depends on
the values of the eigenvalues λ and µ: if both are negative, the origin is a stable node, if
their signs diﬀer, the origin is a saddle, and if both are positive, the origin is an unstable
node.
Now suppose that A is the real Jordan form of a matrix with complex eigenvalues a ± ib. Then we have

    ẋ = Ax = [[a, b], [−b, a]] x;    x_t = (ξ_1 e^{at} cos bt + ξ_2 e^{at} sin bt, −ξ_1 e^{at} sin bt + ξ_2 e^{at} cos bt).
Phase diagrams for this equation are presented in Figure 7.B.2. Evidently, the stability of
the origin is determined by the real part of the eigenvalues: if a < 0, the origin is a stable
spiral, while if a > 0, the origin is an unstable spiral. In the nongeneric case where a = 0,
[Figure 7.B.1: Linear dynamics on the plane: two real eigenvalues λ, μ. Panels: (i) stable node (μ < λ < 0); (ii) saddle (μ > 0 > λ); (iii) unstable node (μ > λ > 0).]

[Figure 7.B.2: Linear dynamics on the plane: complex eigenvalues a ± ib, b < 0. Panels: (i) stable spiral (a < 0); (ii) center (a = 0); (iii) unstable spiral (a > 0).]

the origin is a center, with each solution following a closed orbit around the origin. The
value of b determines the orientation of the cycles. The diagrams in Figure 7.B.2 use b < 0,
which causes solutions to cycle counterclockwise; had we chosen b > 0, these orientations
would have been reversed.
Finally, suppose that A is the real Jordan form of a nondiagonalizable matrix with lone
eigenvalue λ. Then we obtain

    ẋ = Ax = [[λ, 1], [0, λ]] x;    x_t = (ξ_1 e^{λt} + ξ_2 te^{λt}, ξ_2 e^{λt}).

The phase diagrams in Figure 7.B.3 reveal the origin to be an improper (or degenerate) node.
It is stable if the eigenvalue λ is negative and unstable if λ is positive. §
[Figure 7.B.3: Linear dynamics on the plane: A not diagonalizable, one real eigenvalue λ. Panels: (i) stable improper node (λ < 0); (ii) unstable improper node (λ > 0).]

7.B.2 Solutions

The Picard–Lindelöf Theorem (Theorem 3.A.2) implies that for any matrix A ∈ R^{n×n}
there is a unique solution to the linear equation (L) starting from each initial condition
ξ ∈ Rn . While solutions of nonlinear diﬀerential equations generally cannot be expressed
in closed form, the solutions to linear equations can always be described explicitly. In the
planar case, Example 7.B.2 provided explicit formulas when A is a Jordan matrix, and the
solutions for other matrices can be obtained through a change of variable. Similar logic
can be employed in the general case, yielding the following result:
Theorem 7.B.3. Let {xt }t∈(−∞,∞) be the solution to (L) from initial condition x0 . Then each
coordinate of xt is a linear combination of terms of the form tk eat cos(bt) and tk eat sin(bt), where
a + i b ∈ C is an eigenvalue of A and k ∈ Z+ is less than the algebraic multiplicity of this eigenvalue.
For analytic purposes, it is often convenient to express solutions of the linear equation
(L) in terms of matrix exponentials. Given a matrix A ∈ Rn×n , we deﬁne eA ∈ Rn×n by
applying the series deﬁnition of the exponential function to the matrix A: that is,
\[
e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!},
\]
where A^k denotes the kth power of A and A^0 ≡ I is the identity matrix.
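The series definition can be made concrete with a short sketch (our own, not from the text): sum the truncated series for a 2×2 matrix, check it against the closed form for the real Jordan block with complex eigenvalues from Example 7.B.2, and check that a skew-symmetric exponent yields an orthogonal (rotation) matrix.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=80):
    """e^A via the truncated power series sum_k A^k / k! (fine for small A)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    power = [row[:] for row in result]
    fact = 1.0
    for k in range(1, terms):
        power = mat_mul(power, A)
        fact *= k
        for i in range(n):
            for j in range(n):
                result[i][j] += power[i][j] / fact
    return result

# e^{At} for A = [[a, b], [-b, a]] equals e^{at} [[cos bt, sin bt], [-sin bt, cos bt]].
a, b, t = 0.3, -1.2, 2.0
E = mat_exp([[a * t, b * t], [-b * t, a * t]])
expected = [[math.exp(a * t) * math.cos(b * t), math.exp(a * t) * math.sin(b * t)],
            [-math.exp(a * t) * math.sin(b * t), math.exp(a * t) * math.cos(b * t)]]

# For a skew-symmetric exponent, e^A is orthogonal: S'S = I.
S = mat_exp([[0.0, 0.7], [-0.7, 0.0]])
StS = mat_mul([[S[0][0], S[1][0]], [S[0][1], S[1][1]]], S)
```

The truncated series is adequate here only because the matrices are small in norm; a production implementation would use scaling-and-squaring instead.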
Recall that the ﬂow φ : (−∞, ∞) × Rn → Rn generated by (L) is deﬁned by φt (ξ) = xt ,
where {xt }t∈(−∞,∞) is the solution to (L) with initial condition x0 = ξ. Theorem 7.B.4 provides
a concise expression for solutions to (L) in terms of matrix exponentials.
Theorem 7.B.4. The ﬂow of (L) is φt (ξ) = eAt ξ.
297 A beneﬁt of representing solutions to (L) in this way is that properties established for
matrix exponentials can be given immediate interpretations in terms of solutions to (L).
For examples, consider these properties:
Proposition 7.B.5.
(i) If A and B commute, then e^{A+B} = e^B e^A .
(ii) If B = S^{−1}AS, then e^B = S^{−1} e^A S.
(iii) e^{(A′)} = (e^A)′.
Applying part (i) of the proposition to matrices As and At yields the group property of
the ﬂow of (L): φs+t (ξ) = φt (φs (ξ)). Part (ii) shows that linear ﬂows generated by similar
matrices are linearly conjugate (i.e., that they are equivalent up to a linear change of
variables), as we discussed before Example 7.B.2. Applying parts (iii) and (i) to At when
A is skew-symmetric shows that in this case, e^{At} is an orthogonal matrix: thus, for each
ﬁxed time t, the map ξ → φt (ξ) is a Euclidean isometry (cf. Figure 7.B.2(ii)).

7.B.3 Stability and Hyperbolicity

Theorem 7.B.3 shows that in generic cases, the stability of the origin under the linear
equation (L) is determined by the eigenvalues {a1 + i b1 , . . . , an + i bn } of A: more precisely,
by the real parts ai of these eigenvalues. If each ai is negative, then all solutions to (L)
converge to the origin; in this case, the origin is called a sink, and the ﬂow φt (x) = eAt x is
called a contraction. If instead each ai is positive, then all solutions besides the stationary
solution at the origin move away from the origin; in this case, the origin is called a source,
and the ﬂow of (L) is called an expansion.
When the origin is a sink, solutions to (L) converge to the origin at an exponential
rate. Deﬁne a norm on Rn by ‖x‖∗ = ‖S⁻¹x‖, where S is the similarity matrix from the Jordan
decomposition J = S⁻¹AS of A. Then for any a > 0 satisfying a < |aᵢ| for all i ∈ {1, . . . , n}, the
ﬂow φ of (L) satisﬁes
\[
0 \text{ is a sink} \;\Leftrightarrow\; \|\varphi_t(\xi)\|_* \le e^{-at}\,\|\xi\|_* \text{ for all } t \ge 0 \text{ and all } \xi \in \mathbb{R}^n.
\]
A similar statement in terms of the Euclidean norm holds if one introduces an appropriate
multiplicative constant C = C(a) ≥ 1:
\[
(7.31) \qquad 0 \text{ is a sink} \;\Leftrightarrow\; \|\varphi_t(\xi)\| \le C e^{-at}\,\|\xi\| \text{ for all } t \ge 0 \text{ and all } \xi \in \mathbb{R}^n.
\]
If the origin is a source, analogous statements hold if time is run backward: for instance,
\[
(7.32) \qquad 0 \text{ is a source} \;\Leftrightarrow\; \|\varphi_t(\xi)\| \le C e^{at}\,\|\xi\| \text{ for all } t \le 0 \text{ and all } \xi \in \mathbb{R}^n.
\]
More generally, the ﬂow of (L) may be contracting in some directions and expanding
in others. In the generic case in which each real part ai of an eigenvalue of A is nonzero,
the diﬀerential equation ẋ = Ax, its rest point at the origin, and its ﬂow φt (x) = eAt x are all
said to be hyperbolic. Hyperbolic linear ﬂows come in three varieties: contractions (if all
ai are negative), expansions (if all ai are positive), and saddles (if there is at least one ai of
each sign). If a ﬂow is hyperbolic, then the origin is globally asymptotically stable if it is
a sink, and it is unstable otherwise.
If (L) is hyperbolic, then A has k eigenvalues with negative real part (counting algebraic
multiplicities) and n − k eigenvalues with positive real part. In this case, we can view
Rn = Es ⊕ Eu as the direct sum of subspaces of dimensions dim(Es ) = k and dim(Eu ) = n − k,
where the stable subspace Es contains all solutions of (L) that converge to the origin at an
exponential rate (as in (7.31)), while the unstable subspace Eu contains all solutions of (L)
that converge to the origin at an exponential rate if time is run backward (as in (7.32)).
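A minimal sketch of this decomposition (our own illustration; A is chosen diagonal so that the stable and unstable subspaces are simply the coordinate axes):

```python
import math

# A hyperbolic saddle in Jordan form: eigenvalues -1 and 2, so the stable
# subspace E^s is span{e1} and the unstable subspace E^u is span{e2}.
A = [[-1.0, 0.0], [0.0, 2.0]]

def flow(A, xi, t):
    """phi_t(xi) = e^{At} xi for a diagonal A (componentwise exponentials)."""
    return [math.exp(A[i][i] * t) * xi[i] for i in range(len(xi))]

stable_pt = flow(A, [1.0, 0.0], 5.0)     # starts in E^s: decays forward in time
unstable_pt = flow(A, [0.0, 1.0], -5.0)  # starts in E^u: decays backward in time
mixed_pt = flow(A, [1.0, 1.0], 5.0)      # generic point: the unstable part dominates
```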
If A is real diagonalizable, then it follows easily from Theorem 7.B.3 that Es and Eu are
the spans of the eigenvectors of A corresponding to the negative and positive eigenvalues
of A, respectively. More generally, Es and Eu can be computed by way of the real Jordan
form J = S−1 AS of A. Arrange S and J so that the Jordan blocks of J corresponding to
eigenvalues of A with negative real parts appear in the ﬁrst k rows and columns, while
the blocks corresponding to eigenvalues with positive real parts appear in the remaining
n − k rows and columns. Then Es is the span of the ﬁrst k columns of the similarity matrix
S, and Eu is the span of the remaining n − k columns of S. (The columns of S are the real
and imaginary parts of the so-called generalized eigenvectors of A—see Example 7.A.7.)

7.C Linearization of Nonlinear Diﬀerential Equations

Virtually all of the diﬀerential equations we study in this book are nonlinear. Nevertheless, when studying the behavior of nonlinear equations in the neighborhood of a rest
point, the theory of linear equations takes on a central role.
Consider the C1 diﬀerential equation
(D)    ẋ = V(x)

with rest point x∗ . By the deﬁnition of the derivative, we can approximate the value of V in the neighborhood of x∗ via
\[
V(y) = 0 + DV(x^*)(y - x^*) + o(|y - x^*|).
\]
This suggests that the behavior of the dynamic (D) near x∗ can be approximated by the
behavior near the origin of the linear equation
(L)    ẏ = DV(x∗) y.

To make this idea precise, we must introduce the notion of topological conjugacy of
ﬂows. To begin, let X and Y be subsets of Rn . Then the function h : X → Y is a homeomorphism
if it is bijective (i.e., one-to-one and onto) and continuous with a continuous inverse.
Now let I be an interval containing 0, and let φ : I × X → X and ψ : I × Y → Y
be two ﬂows. We say that φ and ψ are topologically conjugate on X and Y if there is a
homeomorphism h : X → Y such that φt (x0 ) = h−1 ◦ ψt ◦ h (x0 ) for all times t ∈ I. In other
words, φ and ψ are topologically conjugate if there is a continuous map with continuous
inverse that sends trajectories of φ to trajectories of ψ (and vice versa), preserving the rate
of passage of time. Therefore, to ﬁnd φt (x0 ), the position at time t under ﬂow φ when
the initial state is x0 ∈ X, one can apply h : X → Y to x0 to obtain the transformed initial
condition y0 = h(x0 ) ∈ Y, then run the ﬂow ψ from y0 for t time units, and ﬁnally apply h−1
to the result. We summarize this construction in the diagram below:
      x0     --h-->     h(x0)
      |                   |
      φt                  ψt
      v                   v
    φt(x0)  <--h⁻¹--  ψt(h(x0))
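As a concrete illustration of this diagram (our own example, not the book's), the homeomorphism h(x) = x³ of the real line conjugates the flow of ẋ = −x to that of ẏ = −3y, since y = x³ gives ẏ = 3x²ẋ = −3y. The sketch traces both routes around the diagram; note that h⁻¹ fails to be differentiable at 0, so this conjugacy is topological but not differentiable there.

```python
import math

def phi(t, x):          # flow of xdot = -x:  phi_t(x) = x e^{-t}
    return x * math.exp(-t)

def psi(t, y):          # flow of ydot = -3y:  psi_t(y) = y e^{-3t}
    return y * math.exp(-3.0 * t)

def h(x):               # conjugating homeomorphism
    return x ** 3

def h_inv(y):           # its inverse (sign-preserving cube root)
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

x0, t = 2.0, 1.7
direct = phi(t, x0)                 # run phi directly
via = h_inv(psi(t, h(x0)))          # transport with h, run psi, transport back
```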
ﬁxed points is justiﬁed by the HartmanGrobman Theorem.
Theorem 7.C.1 (The HartmanGrobman Theorem). Let φ and ψ be the ﬂows of the C1 equation
(D) and the linear equation (L), where x∗ is a hyperbolic rest point of (D). Then there exist
neighborhoods Ox∗ of x∗ and O0 of the origin 0 on which φ and ψ are topologically conjugate.
Combining the HartmanGrobman Theorem with our analysis in Section 7.B.3 provides a
simple characterization of the stability of hyperbolic rest points of (D).
Corollary 7.C.2. Let x∗ be a hyperbolic rest point of (D). Then x∗ is asymptotically stable if all
eigenvalues of DV (x∗ ) have strictly negative real parts, and x∗ is unstable otherwise.
300 By virtue of these results, we say that x∗ is linearly stable if the eigenvalues of DV (x∗ ) all
have negative real part. While the HartmanGrobman Theorem implies that a linearly
stable rest point is asymptotically stable, it can be shown further that solutions starting
near a linearly stable rest point converge to it at an exponential rate, as in equation (7.31).
We say that x∗ is linearly unstable if DV (x∗ ) has at least one eigenvalue with positive
real part. (We do not require x∗ to be hyperbolic.) It can be shown that as long as one
eigenvalue of DV (x∗ ) has positive real part, most solutions of (D) will move away from x∗
at an exponential rate.
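A small numerical sketch (ours, not the book's) of how the sign of the linearization decides stability, using the scalar equation ẋ = x(1 − x) with rest points 0 and 1:

```python
def V(x):
    """A nonlinear vector field on R with rest points at 0 and 1."""
    return x * (1.0 - x)

def DV(x):
    """Derivative of V: the 1x1 'Jacobian' used for linearization."""
    return 1.0 - 2.0 * x

def simulate(x0, t_end, h=1e-3):
    """Explicit Euler integration of xdot = V(x)."""
    x = x0
    for _ in range(int(t_end / h)):
        x += h * V(x)
    return x

# DV(1) = -1 < 0: linearly stable, so nearby solutions converge to 1.
near_stable = simulate(0.8, 20.0)
# DV(0) = +1 > 0: linearly unstable, so nearby solutions flee (here toward 1).
near_unstable = simulate(0.01, 20.0)
```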
While the topological conjugacy established in Theorem 7.C.1 is suﬃcient for local
stability analysis, one should understand that topological conjugacy need not preserve
the geometry of a ﬂow. The following result for linear equations makes this point clear.
Theorem 7.C.3. Let ẋ = Ax and ẏ = By be hyperbolic linear diﬀerential equations on Rn with
ﬂows φ and ψ. If A and B have the same numbers of eigenvalues with negative real part (counting
algebraic multiplicities), then φ and ψ are topologically conjugate throughout Rn .
Looking back at Example 7.B.2, we see that the phase diagrams of stable nodes (Figure
7.B.1(i)), stable spirals (Figure 7.B.2(i)), and stable improper nodes (Figure 7.B.3(i)) have
very diﬀerent appearances. Nevertheless, Theorem 7.C.3 reveals that the ﬂows described
in these ﬁgures are topologically conjugate—that is, they can be continuously transformed
into one another! To ensure that the geometry of phase diagrams is preserved, one needs
not only topological conjugacy, but rather diﬀerentiable conjugacy: that is, conjugacy under
a diﬀeomorphism (a diﬀerentiable transformation with diﬀerentiable inverse). As it turns
out, it is possible to establish a local diﬀerentiable conjugacy between (D) near x∗ and (L)
near 0 if V is suﬃciently smooth, and if the eigenvalues of DV (x∗ ) are distinct and satisfy
a mild nonresonance condition (see the Notes).
Much additional information about the ﬂow of (D) can be surmised from the derivative
matrix DV (x∗ ) at a hyperbolic rest point x∗ . Suppose that DV (x∗ ) has k eigenvalues
with negative real part and n − k eigenvalues with positive real part, counting algebraic
multiplicities. The Stable Manifold Theorem tells us that within some neighborhood of x∗ ,
there is a k-dimensional local stable manifold M^s_loc on which solutions converge to x∗ at an
exponential rate (as in (7.31)), and an (n − k)-dimensional local unstable manifold M^u_loc on which
solutions converge to x∗ at an exponential rate if time is run backward (as in (7.32)).
Moreover, both of these manifolds can be extended globally: the k-dimensional (global)
stable manifold Ms includes all solutions of (D) that converge to x∗ , while the (n − k)-dimensional (global) unstable manifold Mu includes all solutions that converge to x∗ as time runs
backward. Among other implications of the existence of these manifolds, it follows that if x∗ is hyperbolic and unstable, then the set Ms of states from which solutions converge to
x∗ is of measure zero, while the complement of this set is open, dense, and of full measure.

7.N Notes

Section 7.1: Theorem 7.1.1 is established by Bomze (1986) for the replicator dynamic
and by Nachbar (1990) for general imitative dynamics; see also Weibull (1995).
Section 7.2: This section follows Sandholm (2001). Bomze (2002) provides an exhaustive treatment of local stability under the replicator dynamic for single-population linear
potential games (which are generated by random matching in common interest games),
and the connections between this stability analysis and quadratic programming.
Section 7.3: The notion of an evolutionarily stable strategy was ﬁrst deﬁned in a single
population random matching setting by Maynard Smith and Price (1973) via conditions
(7.5) and (7.6). The equivalent deﬁnition (7.4) is due to Taylor and Jonker (1978). Taylor and
Jonker (1978) also introduce the notion of a regular ESS for single-population nonlinear
games; in this paper, they also introduce the replicator dynamic, and prove that a regular
ESS is asymptotically stable under this dynamic. Basic references on ESS theory include the
survey of Hines (1987) and the monographs of Bomze and Pötscher (1989) and Cressman
(1992).
Our deﬁnition (7.2) of ESS generalizes the deﬁnition of two-population ESS introduced
by Taylor (1979) (also see Schuster et al. (1981a)) and the deﬁnition of “local ESS” for single
population nonlinear games of Pohley and Thomas (1983). Exercise 7.3.1 is essentially
due to Selten (1980); see also van Damme (1991) and Swinkels (1992). Exercise 7.3.3(iii) is
Example 18 of Bomze and Pötscher (1989).
When Maynard Smith and Price (1973) introduced the notion of ESS, the situation they
aimed to capture was rather diﬀerent from the one studied in this book. They envisioned
a population of animals, each member of which plays the same mixed strategy x ∈ X as
they are randomly matched to play a symmetric normal form game A. Occasionally,
this population is invaded by a small group of mutants, each member of which plays
the same mixed strategy y ≠ x. Maynard Smith and Price (1973) call x an evolutionarily
stable strategy if regardless of the strategy y ≠ x played by the mutants, the payoﬀ of the
incumbents exceeds that of the mutants in the postentry population. They captured
this notion using conditions (7.5) and (7.6), which as we have noted are equivalent to
condition (7.4).
To preserve the sense of this deﬁnition in multipopulation settings, Cressman (1992)
(see also Cressman (1995, 1996) and Cressman et al. (2001)) calls a strategy proﬁle x =
302 (x1 , . . . , xp ) satisfying condition (7.3) a monomorphic ESS. (The later papers call such a proﬁle
a p-species ESS.) To justify this deﬁnition, Cressman (1992) introduces a collection of p-dimensional replicator systems, one for each alternative strategy proﬁle y = ( y1 , . . . , yp ).
The pth component of the state variable of this system describes the fraction of the pth
species using mixed strategy yp ; the remainder of the species uses the incumbent mixed
strategy xp . In games with linear payoﬀs, asymptotic stability of the origin (i.e., of the state
at which all members each species p choose mixed strategy xp ) in this system is equivalent
to condition (7.3). Notice that condition (7.3) is less restrictive than our deﬁnition (7.2):
(7.3) requires just one of the incumbent species to outperform its mutant counterpart,
whereas (7.2) requires that the incumbent species do at least as well as the mutant species
on average. To understand why condition (7.3) ensures stability under Cressman’s (1992)
dynamics, note ﬁrst that a single successful incumbent species p̂ will drive its mutant
counterpart to extinction. This eﬀectively changes the invading strategy proﬁle to ŷ =
( y1 , . . . , xp̂ , . . . , yp ), which in turn must have a species that is outperformed by its incumbent
counterpart. Iteration of this logic shows that the origin is locally stable.
Interestingly, Cressman (1992) shows that in two-population linear games, a monomorphic ESS is asymptotically stable under the replicator dynamic, but that once there are
three populations, this implication no longer need hold. Given that Maynard Smith and
Price’s (1973) motivation for single population ESS (and Cressman’s (1992) extension to
multiple populations) have little to do with the usual pure strategy replicator dynamic,
the fact that monomorphic ESS implies stability under this dynamic in any cases at all is
“only good fortune” (Cressman et al. (2001, p. 10)).
Other interesting extensions of the ESS concept to multipopulation random matching
settings allow setvalued solutions, which are particularly useful in the context of random
matching in extensive form games; see Thomas (1985), Swinkels (1992), Balkenborg and
Schlag (2001), and Cressman (2003).
The alternative deﬁnitions of ESS for games with nonlinear payoﬀs that we describe
in Section 7.3.2 are studied by Vickers and Cannings (1988), Bomze and Pötscher (1989),
and Bomze (1990, 1991). One alternative we did not mention is that of an uninvadable
strategy, which is based on the requirement of a uniform invasion barrier: namely, that the
threshold ε̄ > 0 in deﬁnition (7.4) be independent of the mutant y. It can be shown that an
uninvadable strategy must satisfy our deﬁnition (7.2) of ESS, and that uninvadability is
strictly weaker than the pair of conditions (7.5) and (7.7): see Theorem 35 and Corollaries
39 and 43 of Bomze and Pötscher (1989). As we noted in the text, our deﬁnition (7.2)
of ESS, the alternative deﬁnitions of ESS presented in equation (7.4), equations (7.5) and
(7.6), and equations (7.5) and (7.7), and therefore the notion of uninvadability as well, are 303 equivalent in singlepopulation linear games. See Bomze and Potscher (1989) and Weibull
¨
(1995) for further discussion.
To sum up, the motivations for the alternative deﬁnitions of ESS for multipopulation settings and for nonlinear games come from the monomorphic-population, mixed-strategist environment studied by Maynard Smith and Price (1973). Since our focus in
this book is on behavior of agents who choose among diﬀerent pure strategies, our aim
is to employ simple deﬁnitions that support general asymptotic stability results in this
context. This goal motivates the deﬁnitions of ESS and regular ESS put forward here.
Section 7.4: Theorem 7.4.1(i) on the local stability of ESS under the replicator dynamic
is one of the earliest results on evolutionary game dynamics; see Taylor and Jonker (1978),
Taylor (1979), Hofbauer et al. (1979), Zeeman (1980), and Schuster et al. (1981a). Theorem
7.4.1(ii) follows easily from results of Nagurney and Zhang (1997); see also Sandholm
et al. (2008). The results in Section 7.4.2 are extensions of ones from Hofbauer and
Sandholm (2008). For the Theorem of the Maximum, see Ok (2007). Theorem 7.4.7 is due
to Sandholm (2008a). Hofbauer (1995b) establishes the asymptotic stability of ESS under
the best response dynamic in a single population random matching using a diﬀerent
construction than the one presented here.
Section 7.5: Lemma 7.5.1 is due to Hines (1980); see also Hofbauer and Sigmund (1988),
Hopkins (1999), and Sandholm (2007a). Versions of Theorems 7.5.2 and 7.5.6 can be
found in Taylor and Jonker (1978), Taylor (1979), Hines (1980), and Cressman (1992, 1997).
Example 7.5.5 is taken from Zeeman (1980). Theorem 7.5.9 is due to Cressman (1997).
Section 7.6: Linearization of perturbed best response dynamics is studied by Hopkins (1999, 2002), Hofbauer (2000), Hofbauer and Sandholm (2002, 2007), Hofbauer and
Hopkins (2005), and Sandholm (2007a). Exercise 7.6.3 is used in Sandholm (2007a) to
show that Nash equilibria of normal form games can always be puriﬁed (in the sense
of Harsanyi (1973)) in an evolutionarily stable fashion through an appropriate choice of
payoﬀ noise. See Ellison and Fudenberg (2000) and Ely and Sandholm (2005) for related
results. Example 7.6.5 is due to Hopkins (1999). Hopkins (2002) uses this result to show
that the replicator dynamic closely approximates the evolution of choice probabilities under stochastic ﬁctitious play. Hofbauer et al. (2007) use similar ideas to establish an exact
relationship between the long run time averaged behavior of the replicator dynamic and
the long run behavior of the best response dynamic.
Appendix 7.A: Horn and Johnson (1985) is an outstanding general reference on matrix
analysis. Many of the results we described are also presented in Hirsch and Smale (1974).
Appendix 7.B: Both Hirsch and Smale (1974) and Robinson (1995) provide thorough
treatments of linear diﬀerential equations at the undergraduate and graduate levels, re 304 spectively.
Appendix 7.C: Robinson (1995) is an excellent reference on dynamical systems in general
and on linearization in particular. For more on diﬀerentiable conjugacy around rest points,
see Hartman (1964). 305 306 CHAPTER EIGHT
Nonconvergence of Evolutionary Dynamics 8.0 Introduction We began our study of the global behavior of evolutionary dynamics in Chapter 6,
focusing on combinations of games and dynamics generating global or almost global convergence to equilibrium. The analysis there demonstrated that global payoﬀ structure—in
particular, the structure captured in the deﬁnitions of potential, stable, and supermodular
games—makes compelling evolutionary justiﬁcations of the Nash prediction possible.
On the other hand, once we move beyond these classes of wellbehaved games, it is not
clear how often convergence will occur. The present chapter counterbalances Chapter 6 by
investigating nonconvergence of evolutionary dynamics for games, describing a variety
of environments in which cycling or chaos oﬀer the best predictions of long run behavior.
Section 8.1 leads with a study of conservative properties of evolutionary dynamics,
focusing on the existence of constants of motion and on the preservation of volume under
the replicator and projection dynamics. Section 8.2 continues with a panoply of examples of nonconvergence. Among other things, this section oﬀers games in which no
reasonable evolutionary dynamic converges to equilibrium, demonstrating that no evolutionary dynamic can provide a blanket justiﬁcation for the prediction of Nash equilibrium
play. Section 8.3 proceeds by oﬀering examples of chaotic evolutionary dynamics—that
is, dynamics exhibiting complicated attracting sets and sensitive dependence on initial
conditions.
The possibility of nonconvergence has surprising implications for evolutionary support of traditional solution concepts. Under dynamics that satisfy Nash stationarity (NS),
solution trajectories that converge necessarily converge to Nash equilibria. But since no 307 reasonable evolutionary dynamic converges in all games, general support for standard
solution concepts is not assured.
Since the Nash prediction is not always supported by an evolutionary analysis, it is
natural to turn to a less demanding notion—namely, the elimination of strategies that
are strictly dominated by a pure strategy. As this requirement is the mildest employed in
standard game theoretic analyses, it is natural to expect to ﬁnd support for this requirement
via an evolutionary approach.
In Section 8.4, we present the striking ﬁnding that evolutionary dynamics satisfying
four mild conditions—continuity, Nash stationarity, positive correlation, and innovation—
do not eliminate strictly dominated strategies in all games. Moreover, while we saw
in Chapter 6 that imitative dynamics and the best response dynamic eliminate strictly
dominated strategies, we show here that small perturbations of these dynamics do not.
This analysis demonstrates that evolutionary dynamics provide surprisingly little support
for a basic rationality criterion.
As always, the appendices provide the mathematical background necessary for our
analysis. Appendix 8.A describes some classical theorems on nonconvergence used
throughout the chapter. Appendix 8.B introduces the notion of an attractor of a dynamic, and establishes the continuity properties of attractors that underlie our analysis of
dominated strategies. 8.1 Conservative Properties of Evolutionary Dynamics It is often impossible to provide precise descriptions of long run behavior under
nonconvergent dynamics. An important exception occurs in cases where the dynamics
lead certain quantities to be preserved. We explore this idea in the current section, where
we argue that in certain strategic environments, the replicator and projection dynamics
exhibit noteworthy conservative properties. 8.1.1 Constants of Motion in Null Stable Games In Section 6.2.1, we introduced null stable population games. These games are deﬁned
by the requirement that
( y − x) (F( y) − F(x)) = 0 for all x, y ∈ X,
and include zerosum games (Example 2.3.7) and multizerosum games (Exercise 2.3.9)
as special cases.
308 In Exercise 6.2.2, we saw if x∗ is an interior Nash equilibrium of a null stable game
F : X → Rn , then the value of the function
2
Ex∗ (x) = x − x∗ is preserved along interior segments of solution trajectories of the projection dynamic:
thus, as these segments are traversed, Euclidean distance from the equilibrium x∗ is ﬁxed.
Similar conclusions hold for interior solutions of the replicator dynamic: Exercise 6.2.5
shows that such solutions preserve the value of the function
p Hx∗ (x) = p p h(x∗ )p (xp ), where h yp (xp ) =
i∈Sp ( yp ) p∈P p y i
yi log xp
i is a relative entropy function.
When x∗ is interior, the level sets of Ex∗ and Hx∗ foliate from x∗ like the layers of an onion.
Each solution trajectory is limited to one of these layers, a manifold whose dimension is
one less than that of X.
Example 8.1.1. In Figure 4.3.1, we presented phase diagrams of the six basic evolutionary
dynamics for standard RockPaperScissors, FR (x) 0 −1 1 xR xS − xP F (x) = 1 x = x − x , P P R
F(x) = 0 −1 S F (x) −1 1 x x − x 0
S
S
P
R
1
a zerosum game with unique Nash equilibrium x∗ = ( 1 , 1 , 3 ). Figures 4.3.1(i) and 4.3.1(ii)
33
show that interior solutions of the replicator and projection dynamics form closed orbits
around x∗ . These orbits describe the level sets of the functions Ex∗ and Hx∗ . Note that an
aﬃne transformation of Hx∗ yields a simpler constant of motion for the replicator dynamic,
H (x) = − i∈S log xi . § When dim(X) > 2, the level sets of Ex∗ and Hx∗ need not pin down the locations of
interior solutions of (P) and (R). But if the null stable game F has multiple Nash equilibria,
then there are multiple collections of level sets, and intersections of these sets do determine
the positions of interior solutions.
Example 8.1.2. Consider the population game F generated by random matching in the 309 1 3
4 Figure 8.1.1: Solutions of the projection dynamic on level set Ex∗ (x) = √
3
12 , 1
1
x∗ = ( 4 , 1 , 1 , 4 ).
44 symmetric zerosum game A: (8.1) 0 −1 0
1 x1 x4 − x2 1 0 −1 0 x2 x1 − x3 = F(x) = Ax = x x − x . 3 2
0
1
0 −1 4 x3 − x1
−1 0
1
0 x4 The Nash equilibria of F are the points on line segment NE connecting states ( 1 , 0, 1 , 0)
2
2
and (0, 1 , 0, 1 ).
2
2
The arguments above show that interior solutions to the projection dynamic maintain
a constant distance from every Nash equilibrium of F. This is illustrated in Figure 8.1.1,
which presents solutions on the sphere inscribed in the pyramid X; this is the level set on
√
11
which Ex∗ takes the value 123 , where x∗ = ( 1 , 4 , 4 , 1 ). Each solution drawn in the ﬁgure is a
4
4
circular closed orbit orthogonal to line segment NE.
Figure 8.1.2 presents solution trajectories of the replicator dynamic for game F. Dia1
grams (i) and (ii) show solutions on level sets of Hx∗ where x∗ = ( 1 , 4 , 1 , 1 ); the ﬁrst (smaller)
4
44
level set is nearly spherical, while the second approximates the shape of the pyramid X.
3
Diagrams (iii) and (iv) present solutions on level sets of Hx∗ with x∗ = ( 3 , 1 , 8 , 1 ) and
88
8
1
3
x∗ = ( 8 , 3 , 1 , 8 ) . By our previous discussion, the intersection of the two level sets is a
88
closed curve describing a single orbit of the dynamic. §
Example 8.3.2 will show that even in zerosum games, very complicated dynamics can
arise within the level sets of Hx∗ .
Exercise 8.1.3.
(i) Suppose that A ∈ Rn×n is skewsymmetric. Show that the eigenvalues
of A all have zero real part, and so that the number of nonzero eigenvalues is even.
310 1 1 2 3 3
4 4 1
(i) x∗ = ( 4 , 1 , 1 , 1 ), Hx∗ (x) = .02
444 1
(ii) x∗ = ( 1 , 1 , 1 , 4 ), Hx∗ (x) = .58
444
1 1 3 3 4 4 (iii) x∗ = ( 3 , 1 , 3 , 1 ), Hx∗ (x) = .35
8888 3
(iv) x∗ = ( 1 , 8 , 1 , 3 ), Hx∗ (x) = .35
8
88 Figure 8.1.2: Solutions of the replicator dynamic on level sets of Hx∗ . 311 (ii) Suppose that A ∈ Rn×n is a symmetric zerosum game that admits an interior Nash
equilibrium x∗ . Show that if n is even, then x∗ is contained in a line segment
consisting entirely of Nash equilibria. (Hint: Consider the matrix ΦAΦ.)
The previous analysis shows that in zerosum games, typical solutions of the replicator
dynamic do not converge. The next exercise shows that the time averages of these solutions
do converge, and that the limits of the time averages are Nash equilibria.
Exercise 8.1.4. Convergence of time averages under the replicator dynamic. Let F(x) = Ax be
the population game generated by the symmetric normal form game A ∈ Rn×n , and let
˙
x = VF (x) be the replicator dynamic for this game. Suppose that {xt }t≥0 is a solution to VF
that is bounded away from bd(X) (i.e., that there is an ε > 0 such that (xt )i ≥ ε for all t ≥ 0
and i ∈ S). Let
1
¯
xt =
t t xs ds
0 be the average value of the state over the time interval [0, t]. Following the steps below,
¯
prove that {xt }t≥0 converges to the set of (interior) Nash equilibria of F as t approaches
inﬁnity:
(8.2) ¯
lim min xt − x∗ = 0. t→∞ x∗ ∈NE(F) ¯
In particular, if F has a unique interior Nash equilibrium x∗ , then {xt } converges to x∗ .
d
(i) Deﬁne yt ∈ Rn by ( yt )i = log (xt )i . Compute dt yt .
(ii) Show that
1
1
yt − y0 =
t
t t Axs − 1xs Axs ds.
0 ¯
¯
¯
(iii) Let x∗ be an ωlimit point of the trajectory {xt }. Show that Ax∗ is a constant vector,
¯
and hence that x∗ is a Nash equilibrium. (Hint: Use the fact that the trajectory { yt }
is constrained to a compact set.)
¯
(iv) Conclude that (8.2) holds. (Hint: Use the fact that the trajectory {xt } is constrained
to a compact set.)
Exercise 8.1.5. Prove that the conclusion of Exercise 8.1.4 continues to hold in a twopopulation random matching setting.
Exercise 8.1.6. Explain why the argument in Exercise 8.1.4 does not allow its conclusion to
be extended to random matching in p ≥ 3 populations.
312 8.1.2 Preservation of Volume ˙
Let x = V (x) be diﬀerential equation on X with ﬂow φ : R × X → X, and let µ
denote Lebesgue measure on X. The diﬀerential equation is said to volume preserving (or
incompressible) on Y ⊆ X if for any measurable set A ⊆ Y, we have µ(φt (A)) = µ(A) for
all t ∈ R. Preservation of volume has strong implications for local stability of rest points:
since an asymptotically stable rest point must draw in all nearby initial condition, no such
rest points can exist in regions where volume is preserved (see Theorem 8.A.4).
We now show that in single population zerosum games, the replicator dynamic is
volume preserving after a well-chosen change in speed. Compared to the standard
replicator dynamic, the speed-adjusted replicator dynamic on int(X),
\[
(8.3) \qquad \dot{x}_i^p = q(x)\, x_i^p \hat{F}_i^p(x), \quad \text{where} \quad
q(x) = \prod_{r \in \mathcal{P}} \prod_{j \in S^r} \frac{1}{x_j^r},
\]
moves relatively faster at states closer to the boundary of the simplex, with speeds approaching inﬁnity as the boundary is approached. The solution trajectories of (8.3) have
the same locations as those of the standard replicator dynamic (see Exercise 4.4.10), so the
implications of volume preservation for stability of rest points extend immediately to the
latter dynamic.
Theorem 8.1.7. Let F(x) = Ax be generated by random matching in the symmetric zerosum
game A = −A′ ∈ Rn×n . Then the dynamic (8.3) for F is volume preserving on int(X). Therefore,
no interior Nash equilibrium of F is asymptotically stable under the replicator dynamic.
The proof of Theorem 8.1.7 is based on Liouville’s Theorem, which tells us that the rate
at which the dynamic ẋ = V(x) expands or contracts volume near state x is given by the
divergence div V(x) ≡ tr(DV(x)). More precisely, Liouville’s Theorem tells us that
\[
\frac{d}{dt}\, \mu(\varphi_t(A)) = \int_{\varphi_t(A)} \mathrm{div}\, V(x)\; d\mu(x)
\]
for each Lebesgue measurable set A. Thus, if div V ≡ 0, so that V is divergence free, then
the ﬂow φ is volume preserving. See Section 8.A.1 for a proof and further discussion of
this result.
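Before the proof, a quick sanity check of Liouville's Theorem in the linear case (our own sketch): for V(x) = Ax the divergence is the constant tr(A), so volumes scale by e^{t·tr(A)} under the flow; equivalently det(e^{At}) = e^{t·tr(A)}, and tr(A) = 0 gives exact volume preservation.

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=80):
    """Truncated power series for e^A (adequate for matrices of small norm)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    fact = 1.0
    for k in range(1, terms):
        power = mat_mul(power, A)
        fact *= k
        for i in range(n):
            for j in range(n):
                result[i][j] += power[i][j] / fact
    return result

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

t = 1.5
A = [[0.4, 1.3], [-0.7, -0.9]]    # tr(A) = -0.5: volume shrinks under the flow
E = mat_exp([[A[i][j] * t for j in range(2)] for i in range(2)])

B = [[0.0, 2.0], [-3.0, 0.0]]     # tr(B) = 0: divergence free, volume preserved
EB = mat_exp([[B[i][j] * t for j in range(2)] for i in range(2)])
```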
Proof. The replicator dynamic is described by the vector ﬁeld R : X → TX, where
\[
R(x) = \mathrm{diag}(x)\bigl(F(x) - \mathbf{1}\, x' F(x)\bigr).
\]
Since F(x) = Ax, and since x′Ax ≡ 0 (because A is symmetric zerosum), we can simplify
the previous expression to
\[
(8.4) \qquad R(x) = \mathrm{diag}(x)\, A x.
\]
The dynamic (8.3) can be written as
V (x) = q(x)R(x),
where q is the function from int(X) to R₊ defined in equation (8.3). If we can show that V
is divergence free on int(X), then our result will follow from Liouville's Theorem.

To compute DV(x), let q̂ : int(Rⁿ₊) → R₊ and R̂ : Rⁿ → Rⁿ be the natural extensions of q
and R, so that ∇q(x) = Φ∇q̂(x) and DR(x) = DR̂(x)Φ. Then the chain rule implies that

(8.5)    DV(x) = q(x)DR(x) + R(x)∇q(x)' = \big( \hat q(x)\, D\hat R(x) + \hat R(x)\, \nabla \hat q(x)' \big)\Phi.

To evaluate this expression, write [x^{-1}] = (1/x_1, \ldots, 1/x_n)', and compute from equations (8.3)
and (8.4) that

\nabla \hat q(x) = -q(x)[x^{-1}] \quad \text{and} \quad D\hat R(x) = \mathrm{diag}(x)A + \mathrm{diag}(Ax).

Substituting into equation (8.5) yields
DV(x) = q(x)\big( \mathrm{diag}(x)A + \mathrm{diag}(Ax) - \mathrm{diag}(x)Ax[x^{-1}]' \big)\Phi
       = q(x)\Big( \mathrm{diag}(x)A + \mathrm{diag}(Ax) - \mathrm{diag}(x)Ax[x^{-1}]' - \tfrac{1}{n}\big( \mathrm{diag}(x)A + \mathrm{diag}(Ax) - \mathrm{diag}(x)Ax[x^{-1}]' \big)\mathbf{1}\mathbf{1}' \Big).

Therefore,

\mathrm{div}\, V(x) = q(x)\Big[ \sum_{i\in S} x_i A_{ii} + \sum_{i\in S} (Ax)_i - \sum_{i\in S} x_i (Ax)_i \frac{1}{x_i} - \frac{1}{n} \sum_{i\in S}\sum_{j\in S} x_i A_{ij} - \frac{1}{n} \sum_{i\in S}\sum_{j\in S} A_{ij} x_j + \frac{1}{n} \Big( \sum_{i\in S}\sum_{j\in S} x_i A_{ij} x_j \Big) \sum_{k\in S} \frac{1}{x_k} \Big].

The first term in the brackets equals 0 since A_{ii} = 0; the second and third terms cancel; the fourth and fifth terms cancel since A_{ij} = −A_{ji}; and the sixth term is 0 since x′Ax = 0. We therefore conclude that div V(x) = 0 on int(X), and hence that the flow of (8.3) is volume preserving. The conclusion about asymptotic stability follows from Theorem 8.A.4.
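The divergence computation above can be spot-checked numerically. The sketch below is ours, not part of the text; it builds a random zero-sum game, forms the speed-adjusted field V(x) = q(x) diag(x)Ax from (8.3), and approximates div V at an interior state by central finite differences. The helper names are invented for illustration.

```python
import numpy as np

# A numerical sanity check (ours, not from the text): for a zero-sum game
# with A' = -A, the speed-adjusted replicator field V(x) = q(x) diag(x) Ax
# of equation (8.3), with q(x) = 1 / prod_j x_j, should be divergence free.

def make_zero_sum(n, seed=0):
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(n, n))
    return B - B.T                      # antisymmetric: zero diagonal, x'Ax = 0

def V(x, A):
    q = 1.0 / np.prod(x)                # the speed adjustment q(x)
    return q * x * (A @ x)              # diag(x) A x

def divergence(x, A, h=1e-5):
    # central finite-difference approximation of sum_i dV_i/dx_i
    div = 0.0
    for i in range(len(x)):
        e = np.zeros(len(x)); e[i] = h
        div += (V(x + e, A)[i] - V(x - e, A)[i]) / (2 * h)
    return div

A = make_zero_sum(4)
x = np.array([0.1, 0.2, 0.3, 0.4])      # an interior state of the simplex
print(abs(divergence(x, A)) < 1e-2)     # True: the divergence vanishes
```

The individual partial derivatives here are large (of order q(x)), so the vanishing of their sum is a genuine cancellation, mirroring the six-term cancellation in the proof.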
Under single population random matching, volume preservation under the replicator
dynamic is only assured in zero-sum games. Remarkably, moving to multipopulation
random matching ensures volume preservation regardless of the payoffs in the underlying
normal form game.
Suppose the population game F is generated by random matching of members of
p ≥ 2 populations to play a p-player normal form game. Since each agent's opponents in
a match will be members of the other populations, the agent's payoffs do not depend on
his own population's state: F^p(x) ≡ F^p(x^{-p}). Theorem 8.1.8 shows that this last condition
is sufficient to prove that the flow of the replicator dynamic for F is volume preserving.

Theorem 8.1.8. Let F be a game played by p ≥ 2 populations that satisfies F^p(x) ≡ F^p(x^{-p}). Then
the dynamic (8.3) for F is volume preserving on int(X). Therefore, no interior Nash equilibrium of
F is asymptotically stable under the replicator dynamic.
Exercise 8.1.9. Prove Theorem 8.1.8. To simplify the notation, assume that each population
is of unit mass. (Hint: To prove that the vector field V from equation (8.3) is divergence
free, start by showing that the derivative matrix of V^p at x with respect to directions in
TX^p is the n^p × n^p matrix

D_{TX^p}V^p(x) = q(x)\big( \mathrm{diag}(\pi^p) - \bar\pi^p I - x^p(\pi^p)' - \mathrm{diag}(x^p)\pi^p[(x^p)^{-1}]' + \bar\pi^p x^p[(x^p)^{-1}]' \big)\Phi,

where π^p = F^p(x^{-p}) and \bar\pi^p = \bar F^p(x^{-p}) = (x^p)'\pi^p.)
Analogues of Theorems 8.1.7 and 8.1.8 can be established for the projection dynamic
via much simpler calculations, and without introducing a change in speed.

Exercise 8.1.10. Let F(x) = Ax be generated by random matching in the symmetric zero-sum game A = −A′ ∈ R^{n×n}. Show that the projection dynamic for F is volume preserving
on int(X).

Exercise 8.1.11. Let F be a game played by p ≥ 2 unit-mass populations that satisfies
F^p(x) ≡ F^p(x^{-p}). Show that the projection dynamic for F is volume preserving on int(X).

8.2 Games with Nonconvergent Evolutionary Dynamics

In this section, we introduce examples of games for which many evolutionary dynamics
fail to converge to equilibrium.
8.2.1 Circulant Games

The matrix A ∈ R^{n×n} is called a circulant matrix if it is of the form

A = \begin{pmatrix} a_0 & a_1 & \cdots & a_{n-2} & a_{n-1} \\ a_{n-1} & a_0 & a_1 & \cdots & a_{n-2} \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ a_2 & \cdots & a_{n-1} & a_0 & a_1 \\ a_1 & a_2 & \cdots & a_{n-1} & a_0 \end{pmatrix}.
When we view A as the payoff matrix for a symmetric normal form game, we refer to A
as a circulant game. Such games always include the central state x* = (1/n)1 among their Nash
equilibria. Note that Rock-Paper-Scissors games are circulant games with n = 3, a₀ = 0,
a₁ = −l, and a₂ = w. Most of the specific games considered below will also have diagonal
payoffs equal to 0.

Their symmetric structure makes circulant games simple to analyze. In doing so, we
will ﬁnd it convenient to refer to strategies modulo n.
Exercise 8.2.1. Verify that the eigenvalue/eigenvector pairs of the circulant matrix A are

(8.6)    (\lambda_k, v_k) = \Big( \sum_{j=0}^{n-1} a_j\, \iota_n^{jk},\; (1, \iota_n^k, \ldots, \iota_n^{(n-1)k})' \Big), \quad k = 0, \ldots, n-1,

where ι_n = exp(2πi/n) = cos(2π/n) + i sin(2π/n) is the nth root of unity.

Exercise 8.2.2. Let F(x) = Ax be generated by random matching in the circulant game A,
and let ẋ = R(x) = diag(x)(Ax − 1x′Ax) be the replicator dynamic for F. Show that the
derivative matrix of R at the Nash equilibrium x* = (1/n)1 is the circulant matrix

DR(x^*) = \frac{1}{n}\big( A - 2\bar a\, \mathbf{1}\mathbf{1}' \big),

where ā = (1/n)1′a is the average of the components of the vector a = (a₀, a₁, …, a_{n−1})′. It then
follows from the previous exercise that the eigenvalue/eigenvector pairs (λ_k, v_k) of DR(x*)
are given by

(8.7)    (\lambda_k, v_k) = \Big( \frac{1}{n} \sum_{j=0}^{n-1} (a_j - 2\bar a)\, \iota_n^{jk},\; (1, \iota_n^k, \ldots, \iota_n^{(n-1)k})' \Big), \quad k = 0, \ldots, n-1.

Example 8.2.3. The hypercycle system. Suppose that a₀ = … = a_{n−2} = 0 and that a_{n−1} = 1,
so that each strategy yields a positive payoﬀ only against the strategy that precedes it
[Figure 8.2.1: Eigenvalues of the hypercycle system. Panels: (i) n = 3, (ii) n = 4, (iii) n = 5.]

(modulo n). In this case, x* = (1/n)1 is the unique Nash equilibrium of F, and the replicator
dynamic for A is known as the hypercycle system.
We determine the local stability of the rest point x* by considering the eigenvalues of
DR(x*). Substituting into equations (8.6) and (8.7) shows that the eigenvector/eigenvalue
pairs are of the form

(\lambda_k, v_k) = \Big( \frac{1}{n}\, \iota_n^{(n-1)k} - \frac{2}{n^2} \sum_{j=0}^{n-1} \iota_n^{jk},\; (1, \iota_n^k, \ldots, \iota_n^{(n-1)k})' \Big), \quad k = 0, \ldots, n-1.

Eigenvalue λ₀ = 1/n − 2/n = −1/n corresponds to eigenvector v₀ = 1 and so has no bearing on
the stability analysis. For k ≥ 1, the sum in the formula for λ_k vanishes (why?), leaving
us with λ_k = (1/n)ι_n^{(n−1)k} = (1/n)ι_n^{−k}. The stability of x* therefore depends on whether any λ_k with
k > 0 has positive real part. As Figure 8.2.1 illustrates, this largest real part is negative
when n ≤ 3, zero when n = 4, and positive when n ≥ 5. It follows that x* is asymptotically
stable when n ≤ 3, but unstable when n ≥ 5. Exercise 8.2.4 shows that the local stability
results can be extended to global stability results, and that global stability can also be
proved when n = 4. When n ≥ 5, it is possible to show that the boundary of X is repelling,
as it is in the lower dimensional cases, and that the dynamic admits a stable periodic orbit
(see the Notes). §

Exercise 8.2.4. Consider the function H : int(X) → R defined by H(x) = −∑_{i∈S} log x_i (cf. Example 8.1.1).
(i) Show that under the hypercycle equation with n = 2 or 3, H is a strict Lyapunov
function on int(X), and hence that x∗ is globally asymptotically stable with respect
to int(X).
(ii) Show that under the hypercycle equation with n = 4 we have Ḣ(x) ≤ 0 on int(X),
with equality if and only if x lies in Y = { y ∈ int(X) : y1 + y3 = y2 + y4 }. Show that the
sole invariant subset of Y is {x∗ }. Then use Theorems 6.B.2 and 6.B.4 and Proposition
6.A.1(iii) to conclude that x∗ is globally asymptotically stable with respect to int(X).
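The hypercycle eigenvalue formula and the resulting stability pattern are easy to verify numerically. The sketch below is ours, not part of the text; the helper names are invented for illustration. It builds DR(x*) = (1/n)(A − 2ā11′) directly and compares its spectrum with the predicted values λ₀ = −1/n and λ_k = (1/n)ι_n^{−k} for k ≥ 1.

```python
import numpy as np

def hypercycle_jacobian(n):
    # circulant payoff matrix with a_{n-1} = 1 and all other a_j = 0:
    # A[i][j] = 1 iff j = i - 1 (mod n), so each strategy beats its predecessor
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i - 1) % n] = 1.0
    abar = 1.0 / n                        # average of (a_0, ..., a_{n-1})
    return (A - 2 * abar * np.ones((n, n))) / n   # DR(x*) = (1/n)(A - 2*abar*11')

def predicted_eigs(n):
    # lambda_0 = -1/n; lambda_k = (1/n) * iota_n^{-k} for k = 1, ..., n-1
    return [-1.0 / n] + [np.exp(-2j * np.pi * k / n) / n for k in range(1, n)]

def key(z):
    # stable ordering for comparing complex spectra
    return (round(z.real, 8), round(z.imag, 8))

for n in (3, 4, 5):
    computed = sorted(np.linalg.eigvals(hypercycle_jacobian(n)), key=key)
    expected = sorted(np.array(predicted_eigs(n)), key=key)
    assert np.allclose(computed, expected, atol=1e-8)
    # the largest real part over k >= 1 determines the stability of x*
    m = max(z.real for z in predicted_eigs(n)[1:])
    print(n, round(m, 4))   # negative for n = 3, zero for n = 4, positive for n = 5
```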
Example 8.2.5. Monocyclic games. A circulant game A is monocyclic if a₀ = 0, a₁, …, a_{n−2} ≤ 0,
and a_{n−1} > 0. Let ā = (1/n)∑_i a_i. If we assume that ā < 0, then the Nash equilibrium x* = (1/n)1,
which yields a payoff of ā for each strategy, is the unique interior Nash equilibrium of
F(x) = Ax. More importantly, there is an open, dense, full measure set of initial conditions
from which the best response dynamic for F(x) = Ax converges to a limit cycle; this limit
cycle is contained in the set where M(x) = maxi∈S Fi (x) equals 0.
Here is a sketch of the proof. Consider a solution trajectory {xt } of the best response
dynamic that lies in set B1 = {x ∈ X : argmaxi∈S Fi (x) = {1}} during time interval [0, T). For
any t ∈ [0, T), we have that
xt = e−t x0 + (1 − e−t ) e1 .
Since the diagonal elements of A all equal zero, it follows that
(8.8)    M(x_t) = F_1(x_t) = e^{-t} F_1(x_0) = e^{-t} M(x_0).

For j ∉ {1, 2} we have that

(8.9)    F_j(x_t) = e^{-t} F_j(x_0) + (1 - e^{-t}) A_{j1} \le e^{-t} F_j(x_0) < e^{-t} F_1(x_0) = F_1(x_t).

Equations (8.8) and (8.9) and the fact that
F1 (e1 ) = 0 < an−1 = F2 (e1 )
imply that a solution starting in region B1 must hit the set B12 = {x ∈ X : argmaxi∈S Fi (x) =
{1, 2}}, and then immediately enter region B2 = {x ∈ X : argmaxi∈S Fi (x) = {2}}.
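The explicit solution x_t = e^{−t}x₀ + (1 − e^{−t})e₁ and identity (8.8) can be checked directly. The sketch below is ours; the particular monocyclic payoff vector a = (0, −1, −1, 1) is an illustrative assumption, chosen so that ā < 0.

```python
import numpy as np

# A sketch (ours; the payoff values are illustrative): while the best response
# dynamic's solution remains in B1, it moves straight toward e1 along
# x_t = e^{-t} x_0 + (1 - e^{-t}) e1, and since diagonal payoffs are zero,
# equation (8.8) gives M(x_t) = e^{-t} M(x_0).

a = np.array([0.0, -1.0, -1.0, 1.0])    # a_0 = 0; a_1, a_2 <= 0; a_3 > 0
n = len(a)
# circulant payoff matrix: A[i][j] = a_{(j - i) mod n}
A = np.array([[a[(j - i) % n] for j in range(n)] for i in range(n)])

e1 = np.zeros(n); e1[0] = 1.0
x0 = np.array([0.10, 0.05, 0.05, 0.80])  # a state in B1: strategy 1 is optimal
assert np.argmax(A @ x0) == 0

for t in (0.0, 0.2, 0.5):
    xt = np.exp(-t) * x0 + (1 - np.exp(-t)) * e1
    Ft = A @ xt
    if np.argmax(Ft) == 0:               # still inside region B1
        # M(x_t) = F_1(x_t) = e^{-t} M(x_0), as in (8.8)
        assert abs(Ft.max() - np.exp(-t) * (A @ x0).max()) < 1e-12
print("(8.8) verified along the sampled times")
```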
Repeating the foregoing argument shows that the trajectory next enters best response
regions B3 , B4 , . . . , B0 in succession before returning to region B1 . Therefore, if we denote
by B the set of states at which there are at most two best responses, then B is forward
invariant under the best response dynamic. Moreover, equation (8.8) implies that the
maximal payoﬀ M(xt ) approaches 0 along all solution trajectories in B.
In light of this discussion, we can deﬁne the return map r : B12 → B12 , where r(x) is the
position at which a solution starting at x ∈ B12 ﬁrst returns to B12 . All ﬁxed points of r lie in
M^{-1}(0). In fact, it can be shown that r is a contraction on M^{-1}(0) for an appropriate choice
of metric, and hence that r has a unique fixed point (see the Notes). We therefore conclude
that any solution trajectory starting in the open, dense, full measure set B converges to the
closed orbit that passes through the unique fixed point of the return map r. §

8.2.2 Continuation of Attractors for Parameterized Games

The games we construct in the examples to come will generate nonconvergent behavior
for large classes of evolutionary dynamics. Recall our general formulation of evolutionary
dynamics from Chapter 3: each revision protocol ρ defines a map from population games
F to differential equations ẋ = V_F(x) via

(8.10)    \dot x_i^p = (V_F)_i^p(x) = \sum_{j \in S^p} x_j^p\, \rho_{ji}^p(F^p(x), x^p) - x_i^p \sum_{j \in S^p} \rho_{ij}^p(F^p(x), x^p).

In Chapter 4, we introduced the following three desiderata for ρ and V.

(C)  Continuity: ρ^p is Lipschitz continuous.
(NS) Nash stationarity: V_F(x) = 0 if and only if x ∈ NE(F).
(PC) Positive correlation: V_F^p(x) ≠ 0 implies that V_F^p(x)′F^p(x) > 0.

We have seen that under continuity condition (C), any Lipschitz continuous population game F will generate a Lipschitz continuous differential equation (8.10), an equation
that admits unique solutions from every initial condition in X. But a distinct consequence
of condition (C)—one involving comparisons of dynamics across games—is equally important for the analyses to come.
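The scheme (8.10) is simple to implement. As a single-population sketch (ours, not from the text; the 3×3 game is an arbitrary illustration), feeding the pairwise proportional imitation protocol ρ_ij(π, x) = x_j [π_j − π_i]₊ through (8.10) recovers the replicator dynamic ẋ_i = x_i(F_i(x) − F̄(x)).

```python
import numpy as np

# A single-population sketch of the mean dynamic (8.10). The protocol below is
# pairwise proportional imitation; the payoff matrix is an arbitrary example.

def rho(pi, x):
    # rho[i, j]: rate at which strategy-i players switch to strategy j
    return x[None, :] * np.clip(pi[None, :] - pi[:, None], 0.0, None)

def mean_dynamic(F, x):
    r = rho(F(x), x)
    inflow = x @ r                   # component j: sum_i x_i * rho[i, j]
    outflow = x * r.sum(axis=1)      # component i: x_i * sum_j rho[i, j]
    return inflow - outflow

A = np.array([[0.0, 2.0, 1.0],
              [3.0, 0.0, 1.0],
              [1.0, 2.0, 0.0]])
F = lambda x: A @ x

x = np.array([0.5, 0.3, 0.2])
replicator = x * (F(x) - x @ F(x))   # x_i (F_i - Fbar)
print(np.allclose(mean_dynamic(F, x), replicator))   # True
```

The cancellation behind this identity is the same one used in Chapter 5's derivation of the replicator dynamic from imitative protocols.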
Suppose we have a collection of population games {F^ε}_{ε ∈ (−\bar ε, \bar ε)} that have identical strategy
sets and whose payoffs vary continuously in ε. Then under condition (C), the law of
motion ẋ = V_{F^ε}(x) varies continuously in ε. Moreover, if we let φ^ε : R₊ × X → X denote
the semiflow under V_{F^ε}, then the map (ε, t, x) ↦ φ_t^ε(x) is continuous as well. This fact is
important for understanding how evolution under V(·) changes as we vary the underlying
game. To capture the eﬀects on long run behavior under V(·) , we must introduce the notion
of an attractor. We keep the introduction here brief; additional details can be found in
Appendix 8.B.
A set A ⊆ X is an attractor of the ﬂow φ if it is nonempty, compact, and invariant under
φ, and if there is a neighborhood U of A such that
(8.11)    \lim_{t \to \infty} \sup_{x \in U} \mathrm{dist}(\phi_t(x), A) = 0.

The set B(A) = {x ∈ X : ω(x) ⊆ A} is called the basin of A. Put differently, attractors are
asymptotically stable sets that are also invariant under the ﬂow.
A key property of attractors for the current context is known as continuation. Fix an
attractor A = A^0 of the flow φ^0. Then as ε varies continuously from 0, there exist attractors
A^ε of the flows φ^ε that vary upper hemicontinuously from A; their basins B(A^ε) vary lower
hemicontinuously from B(A). Thus, if we slightly change the parameter ε, the attractors
that exist under φ^0 continue to exist, and they do not explode.
Exercise 8.2.6. In deﬁning an attractor via equation (8.11), we require that it attract solutions
from all nearby states uniformly in time. To understand the role of uniformity in this
deﬁnition, let φ be a ﬂow on the unit circle that moves clockwise except at the topmost
point x* (cf. Example 6.A.3). Explain why {x*} is not an attractor under this flow.
As a first application of these ideas, consider the 4 × 4 circulant game

(8.12)    F^\varepsilon(x) = A^\varepsilon x = \begin{pmatrix} 0 & 0 & -1 & \varepsilon \\ \varepsilon & 0 & 0 & -1 \\ -1 & \varepsilon & 0 & 0 \\ 0 & -1 & \varepsilon & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}.

When ε = 0, the payoff matrix A^ε = A^0 is symmetric, so F^0 is a potential game with
potential function f(x) = ½x′A^0x = −x₁x₃ − x₂x₄. The function f attains its minimum of −1/4
at states v = (1/2, 0, 1/2, 0) and w = (0, 1/2, 0, 1/2), has a saddle point with value −1/8 at the Nash
equilibrium x* = (1/4, 1/4, 1/4, 1/4), and attains its maximum of 0 along the closed path of Nash
equilibria γ consisting of edges e₁e₂, e₂e₃, e₃e₄, and e₄e₁. It follows from results in Section
6.1 that if ẋ = V_{F^0}(x) satisfies (NS) and (PC), then all solutions whose initial conditions ξ
satisfy f(ξ) > −1/8 converge to γ. (In fact, if x* is a hyperbolic rest point of V_{F^ε}, then the
Stable Manifold Theorem (see Appendix 7.C) tells us that the set of initial conditions from
which solutions converge to x∗ is a manifold of dimension no greater than 2, and hence
has measure zero.) The phase diagram for the Smith dynamic in game F^0 is presented in
Figure 8.2.2(i).
Now suppose that ε > 0. If our revision protocol satisﬁes continuity (C), then the
attractor γ of VF0 continues to an attractor γε of VFε ; γε is contained in a neighborhood of
γ, and its basin approximates that of γ (see Figure 8.2.2(ii)). At the same time, the unique
Nash equilibrium of Fε is the central state x∗ . We have therefore proved
Proposition 8.2.7. Let V(·) be an evolutionary dynamic that satisfies (C), (PC), and (NS), let F^ε
be given by (8.12), and let δ > 0. Then for ε > 0 sufficiently small, solutions to ẋ = V_{F^ε}(x) from
[Figure 8.2.2: The Smith dynamic in game F^ε — (i) ε = 0, (ii) ε = 1/10.]

all initial conditions x with f(x) > −1/8 + δ converge to an attractor γ^ε on which f exceeds −δ; in
particular, γ^ε contains neither Nash equilibria nor rest points.

[Figure 8.2.3: The replicator dynamic in Mismatching Pennies.]

8.2.3 Mismatching Pennies

Mismatching Pennies is a three-player normal form game in which each player has two
strategies, Heads and Tails. Player p receives a payoﬀ of 1 for choosing a diﬀerent strategy
than player p + 1 and a payoﬀ of 0 otherwise, where players are indexed modulo 3.
If we let F be the population game generated by random matching in Mismatching
Pennies, then for each population p ∈ 𝒫 = {1, 2, 3} we have that

F^p(x) = \begin{pmatrix} F_H^p(x) \\ F_T^p(x) \end{pmatrix} = \begin{pmatrix} x_T^{p+1} \\ x_H^{p+1} \end{pmatrix}.

The unique Nash equilibrium of F is the central state x* = ((1/2, 1/2), (1/2, 1/2), (1/2, 1/2)). Since
there are two strategies per player, it will simplify our analysis to let y^p = x_H^p be the
proportion of population p players choosing Heads, and to focus on the new state variable
y = (y¹, y², y³) ∈ Y = [0, 1]³ (see Exercise 8.2.12 for details).

Example 8.2.8. The replicator dynamic for Mismatching Pennies. After our change of variable,
the replicator dynamic ẏ = V̂_F(y) for Mismatching Pennies takes the form

\dot y = \begin{pmatrix} \dot y_1 \\ \dot y_2 \\ \dot y_3 \end{pmatrix} = \begin{pmatrix} y_1(1-y_1)(1-2y_2) \\ y_2(1-y_2)(1-2y_3) \\ y_3(1-y_3)(1-2y_1) \end{pmatrix}.

The derivative matrices at an arbitrary state y and at the equilibrium state y* = (1/2, 1/2, 1/2) are

DV̂(y) = \begin{pmatrix} (1-2y_1)(1-2y_2) & -2y_1(1-y_1) & 0 \\ 0 & (1-2y_2)(1-2y_3) & -2y_2(1-y_2) \\ -2y_3(1-y_3) & 0 & (1-2y_3)(1-2y_1) \end{pmatrix}
\quad \text{and} \quad
DV̂(y^*) = \begin{pmatrix} 0 & -\tfrac{1}{2} & 0 \\ 0 & 0 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & 0 & 0 \end{pmatrix}.

DV̂(y*) is a circulant matrix with an eigenvalue of −1/2 corresponding to eigenvector 1,
and eigenvalues of 1/4 ± (√3/4)i corresponding to eigenvectors (−1, −1, 2)′ ± i(−√3, √3, 0)′; note
that 1, (−1, −1, 2)′, and (−√3, √3, 0)′ are mutually orthogonal. The phase diagram for
the replicator dynamic is a spiral saddle: interior solutions on the diagonal where y₁ =
y₂ = y₃ head directly toward y*, while all other orbits are attracted to a two-dimensional
manifold containing an unstable spiral. This is depicted in Figure 8.2.3, where behavior
in populations 1, 2, and 3 is measured on the left-right, front-back, and top-bottom axes,
respectively. Solutions on the manifold containing the unstable spiral converge to a six-segment heteroclinic cycle; this cycle agrees with the best response cycle of the underlying
normal form game. §
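The linearization in this example can be confirmed with a few lines of code. The sketch below is ours, not part of the text: a finite-difference Jacobian of the displayed vector field at y* = (1/2, 1/2, 1/2) reproduces the eigenvalues −1/2 and 1/4 ± (√3/4)i.

```python
import numpy as np

# Numerical companion (ours) to the replicator dynamic for Mismatching Pennies
# after the change of variable y^p = x_H^p.

def Vhat(y):
    y1, y2, y3 = y
    return np.array([y1 * (1 - y1) * (1 - 2 * y2),
                     y2 * (1 - y2) * (1 - 2 * y3),
                     y3 * (1 - y3) * (1 - 2 * y1)])

def jacobian(y, h=1e-6):
    # central finite-difference Jacobian, column by column
    n = len(y)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (Vhat(y + e) - Vhat(y - e)) / (2 * h)
    return J

ystar = np.array([0.5, 0.5, 0.5])
eigs = np.linalg.eigvals(jacobian(ystar))
expected = [-0.5, 0.25 + (np.sqrt(3) / 4) * 1j, 0.25 - (np.sqrt(3) / 4) * 1j]
key = lambda z: (round(z.real, 6), round(z.imag, 6))
print(np.allclose(sorted(eigs, key=key),
                  sorted(np.array(expected), key=key), atol=1e-5))   # True
```

The complex pair with positive real part is what makes y* an unstable spiral on the attracting manifold.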
Example 8.2.9. The best response dynamic in Mismatching Pennies. The analysis of the best
response dynamic in Mismatching Pennies is very similar to the corresponding analysis
in monocyclic games (Example 8.2.5). Divide the state space Y = [0, 1]3 into eight octants
in the natural way. Then the two octants corresponding to vertices HHH and TTT are
backward invariant, while solutions starting in any of the remaining six octants proceed
through those octants according to the best response cycle of the underlying game (see
Exercise 8.2.10). As Figure 8.2.4 illustrates, almost all solutions to the best response
dynamic converge to a six-sided closed orbit in the interior of Y. §
Exercise 8.2.10.
(i) Give an explicit formula for the best response dynamic for Mismatching Pennies in terms of the state variable y ∈ Y = [0, 1]3 .
(ii) Prove that octants HHH and TTT described in the previous example are backward
invariant.
(iii) Prove that solutions starting in any of the remaining octants proceed through those
octants according to the best response cycle of the underlying game.
The following proposition shows that the previous two examples are not exceptional.
[Figure 8.2.4: The best response dynamic in Mismatching Pennies (two viewpoints).]

Proposition 8.2.11. Let V(·) be an evolutionary dynamic that is generated by a C¹ revision protocol
ρ and that satisﬁes Nash stationarity (NS). Let F be Mismatching Pennies, and suppose that the
unique Nash equilibrium x* of F is a hyperbolic rest point of ẋ = V_F(x). Then x* is unstable under
VF , and there is an open, dense, full measure set of initial conditions from which solutions to VF
do not converge.
Proposition 8.2.11 is remarkable in that it does not require the dynamic to satisfy a
payoﬀ monotonicity condition. Instead, it takes advantage of the fact that by deﬁnition,
the revision protocol for population p does not condition on the payoffs of other populations. In fact, the specific payoffs of Mismatching Pennies are not important to obtain
the instability result; any three-player game whose unique Nash equilibrium is interior
works equally well. The proof of the theorem makes these points clear.
Proof. For ε close to 0, let F^ε be generated by a perturbed version of Mismatching
Pennies in which player 3's payoff for playing H when player 1 plays T is not 1, but (1+2ε)/(1−2ε).
Then like Mismatching Pennies itself, F^ε has a unique Nash equilibrium, here given by
((1/2 + ε, 1/2 − ε), (1/2, 1/2), (1/2, 1/2)).
For convenience, let us argue in terms of the state variable y = (x¹_H, x²_H, x³_H) ∈ Y = [0, 1]³
(see Exercise 8.2.12). If ẏ = V̂_{F^ε}(y) is the dynamic ẋ = V_{F^ε}(x) expressed in terms of y, then
Nash stationarity (NS) tells us that

(8.13)    V̂^1_{F^ε}(1/2 + ε, 1/2, 1/2) = 0

whenever ε is small. Now by definition, the law of motion for population 1 does not
depend directly on payoffs in the other populations, regardless of the game at hand (cf.
equation (8.10)). Therefore, since changing the game from F^ε to F^0 does not alter population
1's payoff function, equation (8.13) implies that

V̂^1_{F^0}(1/2 + ε, 1/2, 1/2) = 0

whenever ε is small. This observation and the fact that the dynamic is differentiable at
y* = (1/2, 1/2, 1/2) imply that

\frac{\partial V̂^1_{F^0}}{\partial y_1}(y^*) = 0.
Repeating this argument for the other populations shows that the trace of DV̂_{F^0}(y*),
and hence the sum of the eigenvalues of DV̂_{F^0}(y*), is 0. Since y* is a hyperbolic rest point
of V̂_{F^0}, it follows that some eigenvalue of DV̂_{F^0}(y*) has positive real part, and thus that y*
is unstable under V̂_{F^0}. Thus, the Stable Manifold Theorem (see Appendix 7.C) tells us that
the set of initial conditions from which solutions converge to y* is of dimension at most 2,
and that its complement is open, dense, and of full measure in Y.
Exercise 8.2.12. Let X be the state space for a p-population game with two strategies per
population, and let Y = [0, 1]^p, so that TY = R^p.

(i) Show that the change of variable h : X → Y has inverse h^{-1} : Y → X, where

h(x) = \begin{pmatrix} x_1^1 \\ \vdots \\ x_1^p \end{pmatrix} \quad \text{and} \quad h^{-1}(y) = \begin{pmatrix} y^1 \\ 1 - y^1 \\ \vdots \\ y^p \\ 1 - y^p \end{pmatrix}.

(ii) Show that the derivative of h at x, Dh(x) : TX → TY, and the derivative of h^{-1} at y,
Dh^{-1}(y) : TY → TX, can be written as Dh(x)z = Mz and Dh^{-1}(y)ζ = M̃ζ for some
matrices M ∈ R^{p×2p} and M̃ ∈ R^{2p×p}. Show that if M is viewed as a linear map from
TX to TY, then its inverse is M̃.
(iii) Fix a C¹ vector field V : X → TX, and define the new vector field V̂ : Y → TY by
V̂(y) = h(V(h^{-1}(y))). Show that the dynamics ẋ = V(x) and ẏ = V̂(y) are linearly
conjugate under h: that is, that {x_t} solves the former equation if and only if {h(x_t)}
solves the latter.

(iv) Let x* be a rest point of V, and let y* = h(x*) be the corresponding rest point of
V̂. Show that the eigenvalues of DV(x*) with respect to TX are identical to the
eigenvalues of DV̂(y*) with respect to TY. What is the relationship between the
corresponding pairs of eigenvectors?

8.2.4 The Hypnodisk Game

The virtue of Proposition 8.2.11 is that apart from hyperbolicity of equilibrium, virtually no assumptions about the evolutionary dynamic V(·) were needed to establish
nonconvergence. We now show that if one is willing to introduce a payoff monotonicity condition—namely, positive correlation (PC)—then one can obtain a nonconvergence
result without smoothness conditions, and using a two-dimensional state variable, rather than
a three-dimensional one as in Mismatching Pennies. This low dimensionality will turn
out to be crucial when we study survival of dominated strategies in Section 8.4.
Our construction will be based on potential games. In Figure 8.2.5, we present the
potential function and projected payoff vector field of the coordination game

F^C(x) = Cx = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.

[Figure 8.2.5: A coordination game — (i) the potential function, (ii) the projected payoff vector field.]
By our analysis in Chapter 2, solutions to any evolutionary dynamic ẋ = V_{F^C}(x) satisfying
conditions (NS) and (PC) ascend the potential function f^C(x) = ½x′Cx = ½((x₁)² + (x₂)² + (x₃)²)
drawn in diagram (i), or, equivalently, travel at acute angles to the projected payoff vectors
in diagram (ii). It follows that solutions to V_{F^C} from most initial conditions converge to
the strict Nash equilibria at the vertices of X.
As a second example, suppose that agents are randomly matched to play the anticoordination game −C. In Figure 8.2.6, we draw the resulting population game F^{−C}(x) =
−Cx = −x and its concave potential function f^{−C}(x) = −½x′Cx = −½((x₁)² + (x₂)² + (x₃)²).
Both pictures reveal that under any evolutionary dynamic satisfying conditions (NS) and
(PC), all solution trajectories converge to the unique Nash equilibrium x* = (1/3, 1/3, 1/3).
The construction of the hypnodisk game H : X → R³ is easiest to describe in geometric
terms. Begin with the coordination game F^C(x) = Cx pictured in Figure 8.2.5(ii). Then draw
two circles centered at state x* = (1/3, 1/3, 1/3) with radii 0 < r < R < 1/√6, as shown in Figure
8.2.7(i); the second inequality ensures that both circles are contained in the simplex. Twist
the portion of the vector ﬁeld lying outside of the inner circle in a clockwise direction,
[Figure 8.2.6: An anticoordination game — (i) the potential function, (ii) the projected payoff vector field.]

excluding larger and larger circles as the twisting proceeds, so that the outer circle is
reached when the total twist is 180◦ (Figure 8.2.7(ii)).
Exercise 8.2.13. Provide an explicit formula for the resulting population game H(x).
What does this construction accomplish? Examining Figure 8.2.7(ii), we see that inside
the inner circle, H is identical to the coordination game FC . Thus, solutions to dynamics
satisfying (NS) and (PC) starting at states other than x∗ in the inner circle must leave the
inner circle. At states outside the outer circle, H is identical to the anticoordination game
F−C , so solutions to dynamics satisfying (NS) and (PC) starting at states outside the outer
circle must enter the outer circle. Finally, at each state x in the annulus bounded by the two
circles, H(x) is not a componentwise constant vector. Therefore, states in the annulus are
not Nash equilibria, and so are not rest points of dynamics satisfying (NS). We assemble
these observations in the following proposition.
Proposition 8.2.14. Let V(·) be an evolutionary dynamic that satisfies (C), (NS), and (PC), and
let H be the hypnodisk game. Then every solution to ẋ = V_H(x) other than the stationary solution
at x* enters the annulus with radii r and R and never leaves, ultimately converging to a cycle
therein.
The claim of convergence to limit cycles in the ﬁnal sentence of the proposition follows
from the Poincar´ Bendixson Theorem (Theorem 8.A.5).
e
328 1 2 3 (i) Projected payoﬀ vector ﬁeld for the coordination game
1 2 3 (ii) Projected payoﬀ vector ﬁeld for the hypnodisk game
Figure 8.2.7: Construction of the hypnodisk game. 329 8.3 Chaotic Evolutionary Dynamics In all of the phase diagrams we have seen so far, ωlimit sets have taken a fairly simple
form: solution trajectories have converged to rest points, closed orbits, or chains of rest
points and connecting orbits. When we consider games with just two or three strategies,
this is unavoidable: clearly, all solution trajectories of continuous time dynamics in one
dimension converge to equilibrium, while in two-dimensional systems, the Poincaré–Bendixson Theorem (Theorem 8.A.5) tells us that the three types of ω-limit sets described
above exhaust all possibilities.
Once we move to flows in three or more dimensions, ω-limit sets can be much more
complicated sets, often referred to as chaotic (or strange) attractors. Central to most definitions of chaos is sensitive dependence on initial conditions: solution trajectories starting from
close together points on the attractor move apart at an exponential rate. Chaotic attractors
can also be recognized in phase diagrams by their rather intricate appearance. Rather
than delving deeply into these ideas, we content ourselves by presenting a few examples.
Example 8.3.1. Consider the single population game F generated by random matching in
the normal form game A below: 0 −12 20 0 F(x) = Ax = −21 −4 10 −2 0 22 x1 0 −10 x2 . 0 35 x3 x4
20 1
11
The lone interior Nash equilibrium of this game is the central state x∗ = ( 4 , 1 , 4 , 4 ).
4
˙
Let x = VF (x) be the replicator dynamic for game F. One can calculate that the eigenvalues of DVF (x∗ ) are approximately −3.18 and .34 ± 1.98i, so like the Nash equilibrium of
Mismatching Pennies (Example 8.2.8), the interior equilibrium x∗ here is a sprial saddle
with an unstable spiral.
˙
Figure 8.3.1 presents the initial portion of the solution of x = VF (x) from initial condition
x0 = (.24, .26, .25, .25). This solution spirals clockwise about x∗ . Near the rightmost point
of each circuit, where the value of x3 gets close to zero, solutions sometimes proceed along
an “outside” path on which the value of x3 surpasses .6. But they sometimes follow an
“inside” path on which x3 remains below .4, and at other times do something in between.
Which of these alternatives occurs is diﬃcult to predict from approximate information
about the previous behavior of the system.
Sensitive dependence on initial conditions is illustrated directly in Figure 8.3.2, which
tracks the solutions from two nearby initial conditions, (.47, .31, .11, .11) and (.46999, .31, 330 1 3
4
x2 x1
1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 50 100 150 200 50 t x3 100 150 200 t 50 100 150 200 t x4
1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 50 100 150 200 t Figure 8.3.1: A chaotic attractor under the replicator dynamic. 331 x2 x1
1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 20 40 60 80 100 20 40 60 80 100 t 20 t 40 60 80 100 t x4 x3
1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 20 40 60 80 100 t Figure 8.3.2: Sensitive dependence on initial conditions under the replicator dynamic. .11, .11001). Apparently, the two solutions stay close together through time t = 50 but
diverge thereafter; after time t = 60, the current position of one of the solutions provides
little hint about the current position of the other. §
The scattered payoﬀ entries in the previous example may seem to suggest that chaos
only occurs in “artiﬁcial” examples. To dispute this view, we now show that chaotic
behavior can occur in very simple games.
Example 8.3.2. Asymmetric RockPaperScissors. Suppose that two populations of agents
are randomly matched to play the twoplayer zerosum game U = (U1 , U2 ): r II
p s 1
1
, −2
2 −1, 1 1, −1 P 1, −1 1
, −1
2
2 −1, 1 S −1, 1 1, −1 1
, −1
2
2 R
I U is an asymmetric version of RockPaperScissors in which a “draw” results in a halfcredit win for player 1.
Figures 8.3.3 and 8.3.4 each present a single solution trajectory of the replicator dynamic
for FU . Since the social state x = (x1 , x2 ) is fourdimensional, we draw it in two pieces,
332 with x1 represented on the left hand side of each ﬁgure and x2 represented on the right.
1
1
Because U is a zerosum game with Nash equilibrium (( 1 , 1 , 3 ), ( 1 , 1 , 3 )), each solution of
33
33
p
the replicator dynamic lives within a level set of H (x) = − p∈P i∈Sp log xi . In Figure
8.3.3, whose initial condition is ((.5, .25, .25), (.5, .25, .25)), the solution trajectory appears to
follow a periodic orbit, much like those in our examples from Section 8.1.1. But in Figure
8.3.4, whose initial condition ((.5, .01, .49), (.5, .25, .25)) is closer to the boundary of X, the
solution trajectory travels around the level set of H in a seemingly haphazard way. Thus,
despite the regularity provided by the constant of motion, the evolution of behavior in
this simple game is complicated indeed. § 8.4 Survival of Dominated Strategies By now we have thoroughly considered whether the prediction of Nash equilibiurm
play can be justiﬁed using evolutionary arguments. On the positive side, Chapters 4
and 5 show that there are many dynamics whose rest points are always identical to the
Nash equilibria of the underlying game, and Chapter 6 shows that convergence to Nash
equilibrium can be assured under many of these dynamics in particular classes of games.
But the ﬁnal word on this question appears in Section 8.2, which demonstrates that no
evolutionary dynamic can converge to Nash equilibrium in all games.
This negative result leads us to consider a more modest question. Rather than seek
evolutionary support for equilibrium play, we instead turn our attention to a more basic
rationality requirement: namely, the avoidance of strategies that are strictly dominated.
Theorem 6.4.4 seems to bear out the intuition that evolutionary dynamics select against
dominated strategies. But upon further reﬂection, one ﬁnds that there is no a priori reason
to expect dominated strategies to be eliminated. Evolutionary dynamics are built upon
the notion that agents switch to strategies whose current payoﬀs are reasonably good. But
even if a strategy is dominated, it can have reasonably good payoﬀs at many population
states. Put diﬀerently, domination is a “global” property, depending on payoﬀs at all
states, while decision making in evolutionary models is “local”, depending only on the
payoﬀs available at present. By this logic, there is no reason to expect evolutionary
dynamics to eliminate dominated strategies as a general rule.
To turn this intuition into a formal result, we introduce one further condition on
evolutionary dynamics.
(IN) Innovation: If x ∉ NE(F), x_i = 0, and i ∈ argmax_{j∈S} F_j(x), then (V_F)_i(x) > 0.

[Figure 8.3.3: Cycling in asymmetric Rock-Paper-Scissors.]

[Figure 8.3.4: Chaos in asymmetric Rock-Paper-Scissors.]

Innovation (IN) requires that when a non-Nash population state includes an unused optimal strategy, this strategy's growth rate must be strictly positive. In other words, if
an unplayed strategy is suﬃciently rewarding, some members of the population will
discover it and select it.
We are now in a position to state our survival theorem.
Theorem 8.4.1. Suppose the evolutionary dynamic V(·) satisﬁes (C), (NS), (PC), and (IN). Then
˙
there is a game F such that under x = VF (x), along solutions from most initial conditions, there is
a strictly dominated strategy played by a fraction of the population bounded away from 0.
Proof. Let H be the hypnodisk game introduced in Section 8.2.4. Let F be the fourstrategy game obtained from H by adding a twin to strategy 3:
Fi (x1 , x2 , x3 , x4 ) = Hi (x1 , x2 , x3 + x4 ) for i ∈ {1, 2, 3};
F4 (x) = F3 (x).
Strategies 3 and 4 are identical, in that they always yield the same payoﬀ and always have
the same payoﬀ consequences for other strategies. The set of Nash equilibria of F is the
line segment
NE = x∗ ∈ X : x∗ = x∗ = x∗ + x∗ =
2
3
4
1 1
3 . Let
I = x ∈ X : (x1 − 1 )2 + (x2 − 1 )2 + (x3 + x4 − 1 )2 ≤ r2 and
3
3
3
O = x ∈ X : (x1 − 1 )2 + (x2 − 1 )2 + (x3 + x4 − 1 )2 ≤ R2
3
3
3
be concentric cylindrical regions in X surrounding NE, as pictured in Figure 8.4.1. By
construction, we have that 1 0 0 ˜ = 0 1 0
F(x) = Cx 0 0 1 001 0
0
1
1 x1 x 2 . x 3 x4 at all x ∈ I, so under any dynamic satisfying (PC) and (NS), solutions starting in I − NE
˜
ascend the potential function f C (x) = 1 ((x1 )2 + (x2 )2 + (x3 + x4 )2 ) until leaving the set I. At
2
˜
states outside the set O, we have that F(x) = −Cx, so solutions starting in X − O ascend
˜
˜
f −C (x) = − f C (x) until entering O. In summary:
336 1 2 3 4
Figure 8.4.1: Regions O, I, and D = O − I. Lemma 8.4.2. Suppose that V(·) is an evolutionary dynamic that satisﬁes conditions (C), (NC)
˙
and (PC), and let F be the “hypnodisk with a twin” game. Then every solution to x = VF (x) other
than the stationary solutions in NE enter region D = O − I and never leave.
Deﬁne the ﬂow from the set U ⊆ X under the dynamic VF by
˙
φt (U) = {ξ ∈ X : there is a solution {xs } to x = VF (x) with x0 ∈ U and xt = ξ.}
In words, φt (U) contains the time t positions of solutions to VF whose initial conditions
are in U.
˜
Since solutions to VF starting in I − NE ascend the function f C until leaving the set I,
the reverse time ﬂow is welldeﬁned from all such states, and NE is a repellor under VF .
This means that all backwardtime solutions to VF that begin in some neighborhood U of
NE converge to NE uniformly over time, or, equivalently, that NE is an attractor of the
˙
timereversed equation y = −VF ( y) (see Appendix 8.B). The dual attractor A of the repellor
NE is the forwardtime limit of the ﬂow of VF starting from the complement of cl(U): A= φt (X − cl(U)).
t≥0 337 A is nonempty, compact, and (both forward and backward) invariant under VF , and
Lemma 8.4.2 tells us that A ⊂ D.
We now show that the twin strategy is used by a positive mass of agents throughout
the attractor A . Let Z = {x ∈ X : x4 = 0} be the face of X on which the twin strategy is
unused; we prove
Lemma 8.4.3. The attractor A and the face Z are disjoint.
Proof. Since VF is Lipschitz continuous and satisﬁes (VF )i (x) ≥ 0 whenever xi = 0,
solutions to VF that start in X − Z cannot approach Z more than exponentially quickly,
and in particular cannot reach Z in ﬁnite time (see Exercise 8.4.4). Equivalently, backward
solutions to VF starting from states in Z cannot enter int(X).
Now suppose by way of contradiction that there exists a state ξ in A ∩ Z. Then by
our previous arguments, the entire backward orbit from ξ is also contained in A ∩ Z, and
hence in D ∩ Z. Since the latter set contains no rest points by condition (PC), the Poincar´ e
Bendixson Theorem (Theorem 8.A.5) implies that the backward orbit from ξ converges to
a closed orbit γ in D ∩ Z that circumnavigates I ∩ Z.
By construction, the annulus D ∩ Z can be split into three regions: one in which
strategy 1 is the best response, one in which strategy 2 is the best response, and one in
which strategy 3 (and hence strategy 4) is a best response (Figure 8.4.2). Each of these
regions is bounded by a simple closed curve that intersects the inner and outer boundaries
of the annulus. Therefore, the closed orbit γ, on which strategy 4 is unused, passes through
the region in which strategy 4 is optimal. This contradicts innovation (IN).
Exercise 8.4.4. Use Gronwall’s Inequality (Lemma 3.A.7) to check the initial claim in the
¨
proof of the lemma.
To complete the proof, we now make the twin strategy “feeble”: we uniformly reduce
its payoﬀ by ε, creating the new game
Fε (x) = F(x) − εe4 .
Observe that strategy 4 is strictly dominated by strategy 3 in game Fε .
As increasing ε from 0 continuously changes the game from F to Fε , doing so also
continuously changes the dynamic from VF to VFε . Thus, by Theorem 8.B.3 on continuation
of attractors, we have that for small ε, the attractor A of VF continues to an attractor A ε
of VFε on which x4 > 0: thus, the dominated strategy survives throughout A ε . The basin
of the attractor A ε contains all points outside of a thin tube around the set NE of Nash
equilibria of F. This completes the proof of Theorem 8.4.1.
338 1 3 2 1 2 3 Figure 8.4.2: The best response correspondence of the hypnodisk game. We conclude this chapter with some examples that illustrate and extend the analysis
above.
Example 8.4.5. We use the hypnodisk game as the basis for the proof of Theorem 8.4.1
because it generates cycling under any dynamic that satisﬁes (NS) and (PC). But the use
of this game is not essential: once we ﬁx the dynamic under consideration, we can ﬁnd a
simpler game that leads to cycling; then the argument based on the introduction of twin
strategies can proceed as above.
We illustrate this point by constructing an example of survival under the Smith dynamic. Figure 8.4.3 contains the phase diagram for the Smith dynamic in the bad RockPaperScissors game 0 −l w x1 G(x) = Ax = w 0 −l x2 , −l w 0 x 3
where w = 1 and l = 2. Evidently, the unique Nash equilibrium x∗ = ( 1 , 1 , 1 ) is unstable,
333
and most solution trajectories converge to a cycle located in int(X). 339 Figure 8.4.3: The Smith dynamic in bad RPS. R R P P S S
T (i) bad RPS with a twin T (ii) bad RPS with a feeble twin Figure 8.4.4: The Smith dynamic in two games. 340 Figure 8.4.4(i) presents the Smith dynamic in “bad RPS with a twin”, (8.14) 0 −l w w x1 w 0 −l −l x 2 ˜ =
F(x) = Ax −l w 0 0 x . 3 −l w 0 0 x4 1
The Nash equilibria of F are the states on line segment NE = {x∗ ∈ X : x∗ = ( 3 , 1 , c, 1 − c)},
3
3
which is a repellor under the Smith dynamic. Furthermore, since Scissors and Twin always
earn the same payoﬀs (F3 (x) ≡ F4 (x)), we can derive a simple expression for the rate of
change of the diﬀerence between their utilization levels: (8.15) ˙
˙
x3 − x4 = x j [F3 (x) − F j (x)]+ − x3
j∈S j∈S − [F j (x) − F3 (x)]+ x j [F4 (x) − F j (x)]+ − x4
j∈S j∈S = −(x3 − x4 ) [F j (x) − F4 (x)]+ [F j (x) − F4 (x)]+ .
j∈S Intuitively, strategies lose agents at rates proportional to their current levels of use, but
gain strategies at rates that depend on their payoﬀs; thus, when the dynamics are not at
rest, the weights x3 and x4 move closer together. It follows that except at Nash equilibrium
states, the dynamic moves toward the plane P = {x ∈ X : x3 = x4 } on which the identical
twins receive equal weight (see Exercise 8.4.6).
Figure 8.4.4(ii) presents the Smith dynamic in “bad RPS with a feeble twin”, (8.16) 0
−l
w w x1 w 0
−l −l x2 ,
˜
Fε (x) = Aε x = −l w
0 0 x3 −l − ε w − ε −ε −ε x4 1
where ε = 10 . Evidently, the attractor from the previous ﬁgure moves slightly to the left,
and the strictly dominated strategy Twin survives. Indeed, since the Nash equilibrium of
“RPS with a twin” on plane P puts mass 1 on Twin, when ε is small solutions to the Smith
6
1
dynamic in “RPS with a feeble twin” place mass greater than 6 on the strictly dominated
strategy Twin inﬁnitely often. This lower bound is driven by the fact that in the game
with an exact twin, solutions converge to plane P; thus, the bound will obtain under any 341 R R P P S S T (i) RPS with a twin T (ii) RPS with a feeble twin Figure 8.4.5: The replicator dynamic in two games. dynamic that treats diﬀerent strategies symmetrically. §
Exercise 8.4.6. Show that under the Smith dynamic in “RPS with a twin”, solutions from
states not on the line of Nash equilibria NE converge to the plane P where the weights on
Scissors and Twin are equalized. (Hint: Use equation (8.15) and the Poincar´ Bendixson
e
Theorem. You may take as given that the set NE is a repellor.)
Example 8.4.7. Theorem 6.4.4 showed that dominated strategies are eliminated along interior solutions of imitative dynamics. But Theorem 8.4.1 shows that this result is not robust
to small changes in these dynamics.
To understand why, consider evolution under the replicator dynamic in “(standard)
RPS with a twin”. In standard RockPaperScissors, interior solutions of the replicator
dynamic are closed orbits (see, e.g., Section 8.1.1). When we introduce an exact twin,
xS
equation (6.38) tells us that the ratio xT is constant along every solution trajectory. This is
xS
evident in Figure 8.4.5(i), which shows that the planes on which the ratio xT is constant
are all invariant sets. If we make the twin feeble by lowering its payoﬀ uniformly by ε, we
xS
obtain the dynamics pictured in Figure 8.4.5(ii): now the ratio xT increases monotonically,
and the dominated strategy is eliminated.
The existence of a continuum of invariant hyperplanes under imitative dynamics in
games with identical twins is crucial to this argument. At the same time, dynamics with
a continuum of invariant hyperplanes are structurally unstable. If we ﬁx the game but
slightly alter the agents’ revision protocol, these invariant sets can collapse, overturning
the elimination result.
342 R R P P S S
T T (i) bad RPS with a twin
Figure 8.4.6: The 9
10 (ii) bad RPS with a feeble twin
replicator + 1
10 Smith dynamic in two games. To make this argument concrete, suppose that instead of always following an imitative
protocol, agents occasionally use a protocol that requires direct evaluation of payoﬀs.
Such a situation is illustrated in Figure 8.4.6(i), which contains the phase diagram for
11
91
“bad RPS with a twin” (with w = 1 and l = 10 ) under a ( 10 , 10 ) convex combination of the
replicator and Smith dynamics. While Figure 8.4.5(i) displayed a continuum of invariant
hyperplanes, Figure 8.4.6(i) shows almost all solution trajectories converging to a limit
cycle on the plane where xS = xT . If we make the twin feeble, the limit cycle moves slightly
to the left, as in Figure 8.4.6(ii), and the dominated strategy survives. §
Exercise 8.4.8. Show that an analogue of equation (6.38) holds for the projection dynamic
on int(X). Explain why this does not imply that dominated strategies are eliminated along
all solutions to the projection dynamic starting from interior initial conditions. Appendix
8.A Three Classical Theorems on Nonconvergent Dynamics 8.A.1 Liouville’s Theorem ˙
Let V : Rn → Rn be a C1 vector ﬁeld, and consider the diﬀerential equation x = V (x)
with ﬂow φ : R × Rn → Rn . Let the set A ⊂ Rn be measurable with respect to Lebesgue 343 measure µ on Rn . Liouville’s Theorem concerns the time evolution of µ(φt (A)), the measure
(or volume) of the time t image of A under φ.
Theorem 8.A.1 (Liouville’s Theorem). d
µ(φt (A))
dt = φt (A) tr(DV (x)) dµ(x). The quantity tr(DV (x)) = i ∂Vii (x) ≡ divV (x) is known as the divergence of V at x.
∂x
According to Liouville’s Theorem, divV governs the local rates of change in volume
˙
under the ﬂow φ of x = V (x). In particular, if divV = 0 on an open set O ⊆ Rn —that is, if
V is divergencefree on this set—then the ﬂow φ conserves volume on O.
Before proceeding with the proof of Liouville’s Theorem, let us note that it extends
immediately to cases in which the law of motion V : X → TX deﬁned on an aﬃne set
X ⊂ Rn with tangent space TX. In this case, µ represents Lebesgue measure on (the aﬃne
hull of) X. The only cautionary note is that the derivative of V at state x ∈ X must be
represented using the derivative matrix DV (x) ∈ Rn×n , which by deﬁnition has rows in
ˆ
TX. We showed how to compute this matrix in Appendix 2.B.3: if V : Rn → Rn is a C1
ˆ
extension of V , then DV (x) = DV (x)PTX , where PTX ∈ Rn×n is the orthogonal projection of
Rn onto the subspace TX.
Proof. Using the standard multivariate change of variable, we express the measure of
the set φt (A) as
(8.17) µ(φt (A)) = φt (A) 1 dµ(xt ) = det(Dφt (x0 )) dµ(x0 ).
A The derivative matrix Dφt (x0 ) in equation (8.17) captures changes in φt (x0 ), the time t
˙
position of the solution to x = V (x) from initial condition x0 , as this initial condition is
varied. It follows from arguments below that det(Dφt (x0 )) > 0, so that the absolute value
taken in equation (8.17) is unnecessary. Taking the time derivative of this equation and
then diﬀerentiating under the integral sign thus yields
(8.18) d
µ(φt (A))
dt =
A d
dt det(Dφt (x0 )) dµ(x0 ). Evaluating the right hand side of equation (8.18) requires two lemmas. The ﬁrst of
these is stated in terms of the time inhomogeneous linear equation
(8.19) ˙
yt = DV (xt ) yt , ˙
where {xt } is the solution to x = V (x) from initial condition x0 . Equation (8.19) is known as
˙
the (ﬁrst) variation equation associated with x = V (x).
344 Lemma 8.A.2. The matrix trajectory {Dφt (x0 )}t≥0 is the matrix solution to the ﬁrst variation
equation from initial condition Dφ0 (x0 ) = I ∈ Rn×n . More explicitly,
(8.20) d
Dφt (x0 )
dt = DV (φt (x0 )) Dφt (x0 ). In words, Lemma 8.A.2 tells us that the column trajectories of {Dφt (x0 )}t≥0 are the solutions
to the ﬁrst variation equation whose initial conditions are the standard basis vectors
e1 , . . . , en ∈ Rn .
d
Proof. By deﬁnition, the time derivative of the ﬂow from x0 satisﬁes dt φt (x0 ) = V (φt (x0 )).
Diﬀerentiating with respect to x0 and then reversing the order of diﬀerentiation yields
(8.20). Lemma 8.A.3 provides two basic matrix identities, the ﬁrst of which is sometimes
called Liouville’s formula.
Lemma 8.A.3. Let M ∈ Rn×n . Then
(i) det(exp(M)) = exp(tr(M)).
d
(ii) dt det(exp(Mt)) t=0 = tr(M).
Proving part (i) of the lemma is not diﬃcult, but the intuition is clearest when M is a
diagonal matrix: λ1 det exp ... ··· 0
. . . .
.
. 0 ··· λn eλ1 . = det .. ··· 0 . .
.
. . . 0 ··· eλn = e = exp λi i i λ1 λi = exp tr ... ··· 0
. .. .
.
. 0 ··· λn . Part (ii) follows from part (i) by replacing M with Mt and diﬀerentiating.
Lemmas 8.A.2 and 8.A.3(ii) enable us to evaluate equation (8.18). First, note that
Lemma 8.A.2 and the linearity of the ﬁrst variation equation imply that
Dφt (x0 ) ≈ exp(DV (x0 )t)
when t is close to 0. Combining this observation with Lemma 8.A.3(ii) shows that
d
dt det(Dφt (x0 ))) t=0 ≈ d
dt det(exp(DV (x0 )t)) = tr(DV (x0 )). By substituting this equality into of equation (8.18) and noting that our focus on time t = 0
has been arbitrary, we obtain Liouville’s Theorem.
Liouville’s Theorem can be used to prove the nonexistence of asymptotically stable
sets. Since solutions in a neighborhood of such a set all approach the set, volume must
345 be contracted in this neighborhood. It follows that a region in which divergence is
nonnegative cannot contain an asymptotically stable set.
Theorem 8.A.4. Suppose divV ≥ 0 on the open set O ⊆ Rn , and let A ⊂ O be compact. Then A
˙
is not asymptotically stable under x = V (x).
Theorem 8.A.4 does not rule out the existence of Lyapunov stable sets. In fact, the example
of the replicator dynamic in standard RockPaperScissors shows that such sets are not
unusual when V is divergencefree. 8.A.2 The Poincar´ Bendixson and BendixsonDulac Theorems
e We now present two classical results concerning diﬀerential equations on the plane.
The celebrated Poincar´Bendixson Theorem characterizes the possible long run behave
iors of such dynamics, and provides a simple way of establishing the existence of periodic
orbits. Recall that a periodic (or closed) orbit of a diﬀerential equation is a nonconstant
solution {xt }t≥0 such that xT = x0 for some T > 0.
Theorem 8.A.5 (The Poincar´ Bendixson Theorem). Let V : R2 → R2 be Lipschitz continue
˙
ous, and consider the diﬀerential equation x = V (x).
(i) Let x ∈ R2 . If ω(x) is compact, nonempty, and contains no rest points, then it is a periodic
orbit.
(ii) Let Y ⊂ R2 . If Y is nonempty, compact, forward invariant, and contains no rest points,
then it contains a periodic orbit.
Theorem 8.A.5 tells us that in planar systems, the only possible ωlimit sets are rest
points, sequences of trajectories leading from one rest point to another (called heteroclinic
orbits where there are multiple rest points in the sequence and homoclinic orbits when there
is just one), and periodic orbits. In part (i) of the theorem, the requirement that ω(x) be
compact and nonempty are automatically satisﬁed when the dynamic is forward invariant
on a compact set—see Proposition 6.A.1.
The next result, the BendixsonDulac Theorem, provides a method of ruling out the
existence of closed orbits in planar systems. To state this theorem, we recall that a set
Y ⊂ R2 is simply connected if it contains no holes: more precisely, if every closed curve in
Y can be continuously contracted within Y to a single point.
Theorem 8.A.6 (The BendixsonDulac Theorem). Let V : R2 → R2 be C1 , and consider the
˙
diﬀerential equation x = V (x). If divV 0 throughout the simply connected set Y, then Y does
not contain a closed orbit.
346 Proof. If γ is a closed orbit in Y, then the region R bounded by γ is invariant under φ.
Thus
d
µ(φt (R))
dt = φt (R) divV (x) dµ(x) by Liouville’s Theorem. Since divV is continuous and nonzero throughout Y, its sign must
be constant throughout Y. If this sign is negative, then the volume of R contracts under
φ; if it is positive, then the volume of R expands under φ. Either conclusion contradicts
the invariance of R under φ.
Both of the results above extend to dynamics deﬁned on twodimensional aﬃne spaces
in the obvious way. 8.B Attractors and Continuation 8.B.1 Attractors and Repellors Let φ be a semiﬂow on the compact set X ⊂ Rn : that is, φ : [0, ∞) × X → X is a continuous
map with φ0 (x) = x that satisﬁes the group property φt (φs (x)) = φt+s (x) for all s, t ≥ 0 and
x ∈ X. We call the set A ⊆ X forward invariant under φ if φt (A ) = A for all t ≥ 0. Note
that in this case, the sets {φt (A )}t≥0 are nested. We call A invariant under φ if φt (A ) = A
for all t ∈ R. It is implicit in this deﬁnition that on the set A we have not only a semiﬂow,
but also a ﬂow: on A , we can extend the map φ to be welldeﬁned and satisfy the group
property not just for times in [0, ∞), but also for times in (−∞, ∞).
A set A ⊆ X is an attractor of φ if it is nonempty, compact, and invariant under φ, and
if there is a neighborhood U of A such that
lim sup dist(φt (x), A ) = 0. t→∞ x∈U The set B(A ) = {x ∈ X : ω(x) ⊆ A } is called the basin of A . Attractors only diﬀer from
asymptotically stable sets (as deﬁned in Chapter 6) only in that the latter need not be
invariant.
Attractors can be deﬁned in a number of equivalent ways. In the following lemma,
the ωlimit of the set U ⊆ X is deﬁned as ω(U) =
cl φs (U) . t≥0 s≥t 347 Proposition 8.B.1. The following statements are equivalent:
(i) A is an attractor of φ.
(ii) There is a neighborhood U of A such that A = ω(U).
(iii) A = t≥0 φt (O), where O is open and satisﬁes φT (cl(O)) ⊂ O for some T > 0.
(iv) A = t≥0 φt (O), where O is open, forward invariant, and satisﬁes φT (cl(O)) ⊂ O for some
T > 0.
(v) A = t≥0 φt (O), where O is open, forward invariant, and satisﬁes φt (cl(O)) ⊂ O for all
t > 0.
In parts (iii), (iv), and (v), the set O is known as a weak trapping region, a trapping region,
and a strongly forward invariant trapping region, respectively.
Now suppose that φ : (−∞, ∞) × X → X is a ﬂow on X. In this case, the set A ∗ = B(A ) − A
is known as the dual repellor of A . A ∗ is the αlimit of of a neighborhood of itself (i.e., it is the
ωlimit of a neighborhood of itself under the timereversed version of φ), it is nonempty,
compact, and invariant under φ.
Theorem 8.B.2 shows that the behavior of the ﬂow on B(A ) = X − (A ∪ A ∗ ) is very
simple: it admits a strict Lyapunov function.
Theorem 8.B.2. Let (A , A ∗ ) be an attractorrepellor pair of the ﬂow φ on X. Then there exists
a continuous function L : X → [0, 1] with L−1 (0) = A ∗ and L−1 (1) = A such that L is strictly
increasing on B(A ) under φ.
If φ is only a semiﬂow, one can still ﬁnd a continuous Lyapunov function L : (B(A ) ∪ A ) →
[0, 1] with L−1 (1) = A that is strictly increasing on B(A ). 8.B.2 Continuation of Attractors ˙
Consider now a oneparameter family of diﬀerential equations x = V ε (x) in Rn with
ε
ε
unique solutions xt = φt (x0 ) such that (ε, x) → V (x) is continuous. Then (ε, t, x) → φε (x)
t
n
is continuous as well. Suppose that X ⊂ R is compact and forward invariant under the
semiﬂows φε . For ε = 0 we omit the superscript in φ.
The following continuation theorem for attractors is part of the folklore of dynamical
systems.
Theorem 8.B.3. Let A be an attractor for φ with basin B(A ). Then for each small enough
ε > 0 there exists an attractor A ε of φε with basin B(A ε ), such that the map ε → A ε is upper
hemicontinuous and the map ε → B(A ε ) is lower hemicontinuous.
Upper hemicontinuity cannot be replaced by continuity in this result. Consider the
˙
family of diﬀerential equations x = (ε + x2 )(1 − x) on the real line. The semiﬂow φ
348 corresponding to ε = 0 admits A = [0, 1] as an attractor, but when ε > 0 the unique
attractor of φε is A ε = {1}. This example shows that perturbations can cause attractors to
implode; the theorem shows that perturbations cannot cause attractors to explode.
Theorem 8.B.3 is a direct consequence of the following lemma.
Lemma 8.B.4. Let A be an attractor for φ with basin B(A ), and let U1 and U2 be open sets
satisfying A ⊆ U1 ⊆ U2 ⊆ cl(U2 ) ⊆ B(A ). Then for each small enough ε > 0 there exists an
attractor A ε of φε with basin B(A ε ), such that A ε ⊆ U1 and U2 ⊆ B(A ε ).
In this lemma, one can always set U1 = {x : dist(x, A ) < δ} and U2 = {x ∈ B(A ) :
dist(x, X − B(A )) > δ} for some small enough δ > 0.
Proof of Lemma 8.B.4. Since A is an attractor and ω(cl(U2 )) = A , there is a T > 0 such
that φt (cl(U2 )) ⊂ U1 for t ≥ T. By the continuous dependence of the ﬂow on the parameter
ε and the compactness of φT (cl(U2 )), we have that φε (cl(U2 )) ⊂ U1 ⊆ U2 for all small
T
enough ε. Thus, U2 is a weak trapping region for the semiﬂow φε , and so A ε ≡ ω(U2 )
is an attractor for φε . In addition, A ε ⊂ U1 (since A ε = φε (A ε ) ⊆ φε (cl(U2 )) ⊂ U1 ) and
T
T
U2 ⊂ B(A ε ). 8.N Notes Section 8.1. The conservative properties of dynamics studied in this chapter—the
existence of a constant of motion and the preservation of volume—are basic properties
of Hamiltonian systems. For more on this connection, see Akin and Losert (1984) and
Hofbauer (1995a, 1996); for a general introduction to Hamiltonian systems, see Marsden
and Ratiu (2002). Exercises 8.1.4 and 8.1.5 are due to Schuster et al. (1981b,c). Theorem
8.1.7 is due to Akin and Losert (1984) and Hofbauer (1995a), while Theorem 8.1.8 is due to
Hofbauer and Sigmund (1988) and Ritzberger and Weibull (1995) (also see Weibull (1995)).
Section 8.2. Circulant games were introduced by Hofbauer and Sigmund (1988), who
call them “cyclically symmetric games”; also see Schuster et al. (1981c). The hypercycle
system was proposed by Eigen and Schuster (1979) to model of cyclical catalysis in a
collection of polynucleotides during prebiotic evolution. That the boundary of the simplex
is repelling under the hypercycle system when n ≥ 5, a property known as permanence,
was established by Hofbauer et al. (1981); the existence of stable limit cycles in this context
was proved by Hofbauer et al. (1991).
Monocyclic games are studied in the context of the replicator dynamic by Hofbauer
and Sigmund (1988), who call them “essentially hypercyclic” games. The uniqueness of
349 the interior Nash equilibrium in Example 8.2.5 follows from the fact that the replicator dynamic is permanent in this game: see Theorems 19.5.1 and 20.5 of Hofbauer and Sigmund
(1988) (or Theorems 13.5.1 and 14.5.1 of Hofbauer and Sigmund (1998)). The analysis
of the best response dynamic in this example is due to Hofbauer (1995b), Gaunersdorfer and Hofbauer (1995), and Bena¨m et al. (2006a). Lahkar (2007), building on work of
ı
Hopkins and Seymour (2002), employs these results to establish the dynamic instability
of dispersed price equilibria in Burdett and Judd’s (1983) model of equilibrium price dispersion. Proposition 8.2.7 is due to Hofbauer and Swinkels (1996); also see Hofbauer and
Sigmund (1998, Section 8.6).
The Mismatching Pennies game was introduced by Jordan (1993), and was inspired
by a 3 × 3 example due to Shapley (1964); see Sparrow et al. (2008) for a recent analysis of
Shapley’s (1964) example. The analyses of the replicator and best response dynamics in
Mismatching Pennies are due to Gaunersdorfer and Hofbauer (1995). Proposition 8.2.11
is due to Hart and MasColell (2003). The hypnodisk game is introduced in Hofbauer and
Sandholm (2006).
Section 8.3. For introductions to chaotic diﬀerential equations, see Hirsch et al. (2004)
at the undergraduate level or Guckenheimer and Holmes (1983) at the graduate level.
Example 8.3.1 is due to Arneodo et al. (1980), who introduce it in the context of the LotkaVolterra equations; also see Skyrms (1992). The attractor in this example is known as a
Shilnikov attractor; see Hirsch et al. (2004, Chapter 16). Example 8.3.2 is due to Sato et al.
(2002).
Section 8.4. This section follows Hofbauer and Sandholm (2006). That paper builds on
the work of Berger and Hofbauer (2006), who establish that strictly dominated strategies
can survive under the BNN dynamic. For a survival result for the projection dynamic, see
Sandholm et al. (2008).
Section 8.A. For further details on Liouville’s Theorem, see Sections 4.1 and 5.3 of
Hartman (1964). Theorem 8.A.4 in the text is Proposition 6.6 of Weibull (1995). For
treatments of the Poincar´ Bendixson Theorem, see Hirsch and Smale (1974) and Robinson
e
(1995).
Section 8.B. The deﬁnition of attractor we use is from Bena¨m (1999). Deﬁnition (ii)
ı
in Proposition 8.B.1 is from Conley (1978), and deﬁnitions (iii), (iv), and (v) are from
Robinson (1995). Theorem 8.B.2 is due to Conley (1978).
Theorem 8.B.3 is part of the folklore of dynamical systems theory; compare Proposition
8.1 of Smale (1967). The analysis presented here is from Hofbauer and Sandholm (2006). 350 Part IV
Stochastic Evolutionary Models 351 CHAPTER NINE
Stochastic Evolution and Deterministic Approximation 9.0 Introduction In Parts II and III of this book, we investigated the evolution of aggregate behavior
under deterministic dynamics. We provided foundations for these dynamics in Chapter
3: there we showed that given any revision protocol ρ and population game F, we can
˙
derive a mean dynamic x = VF (x). This diﬀerential equation describes expected motion
under the stochastic process that ρ and F implicitly deﬁne. We justiﬁed our focus on this
deterministic equation through an informal appeal to a law of large numbers: since all of
the randomness in our evolutionary model is idiosyncratic, it should be averaged away
in the aggregate so long as the population size is suﬃciently large.
Our goal in this chapter is to make this argument rigorous. To do so, we explicitly
derive a stochastic evolutionary process—a Markov process—from a given population
game F, revision protocol ρ, and ﬁnite population size N. Our main result in this chapter,
Theorem 9.2.3, is a ﬁnite horizon deterministic approximation theorem. Building on our
earlier intuition, the theorem shows that over any ﬁnite time span, the behavior of the
stochastic evolutionary process is indeed nearly deterministic: if the population size is
large enough, the stochastic process closely follows a solution trajectory of the mean
dynamic with probability close to one.
The Markov process we introduce in this chapter provides a precise description of the
stochastic evolution of aggregate behavior. Theorem 9.2.3 tells us that over time horizons
of moderate length, we can do without studying this Markov process directly, as the
deterministic approximation is adequate to address most questions of interest. But if we
want to understand behavior in a society over very long time spans, then the deterministic 353 approximation theorem no longer applies, and we must study the Markov process directly.
This inﬁnite horizon analysis is the subject of our ﬁnal chapter.
This chapter and the next employ variety of techniques from the theory of probability
and stochastic processes. These techniques are reviewed in Appendices 9.A and 9.B and
in the appendices to Chapter 10. Appendix 9.C describes an extension of the deterministic
approximation theorem for discretetime models. 9.1 The Markov Process Let us review and slightly modify the model of evolution from in Chapter 3. We take
a population game F : X → Rn with pure strategy sets (S1 , . . ., Sp ) as given. It will be
convenient to assume that the population masses (m1 , . . ., mp ) are integervalued. Agents’
p
p
p
choice procedures are described by a revision protocol ρ, where ρp : Rn × Xp → Rn ×n is a
+
map that takes current payoﬀs and population states as inputs and returns collections of
p
conditional switch rate ρi j (Fp (x), xp ) as outputs.
To set the stage for our limiting analysis, we suppose that population p ∈ P has Nmp
members, where the integer parameter N is called the population size. The feasible social
1
states therefore lie in the discrete grid X N = X ∩ N Zn = {x ∈ X : Nx ∈ Zn }.
N
The stochastic process {Xt } generated by F, ρ, N is described as follows. Each agent in
the society is equipped with a rate R Poisson alarm clock. The ringing of a clock signals
the arrival of a revision opportunity for the clock’s owner: if the owner is currently
p
playing strategy i ∈ Sp , he switches to strategy j i with probability ρi j /R. Finally, the
model respects independence assumptions that ensure that “the future is independent of
the past except through the present”: diﬀerent agents’ clocks ring independently of one
another, strategy choices are made independently of the timing of the clocks’ rings, and
as evolution proceeds, the clocks and the agents are only inﬂuenced by the history of the
process by way of the current value of the social state.
In Chapter 3, we argued informally that the stochastic process described above is well
approximated by solutions to the mean dynamic
(M) p pp ˙
xi = p p x j ρ ji (Fp (x), xp ) − xi
j∈Sp ρi j (Fp (x), xp ).
j∈Sp The rest of this chapter provides a formal defense of this approximation result.
N
We begin by giving a more formal account of the stochastic evolutionary process {X^N_t}. The independence assumptions above ensure that {X^N_t} is a continuous-time Markov process on the finite state space X^N. To describe this process explicitly, it is enough to specify its jump rates {λ^N_x}_{x∈X^N} and transition probabilities {p^N_{xy}}_{x,y∈X^N} (see Appendix 9.B).
Suppose that the current social state is x ∈ X^N. Then there are Nx^p_i agents playing strategy i ∈ S^p, Nm^p agents in population p ∈ P, and NM agents in total, where M = Σ_{q∈P} m^q is the total mass of the populations. Since agents receive revision opportunities independently at exponential rate R, the basic properties of the exponential distribution (see Proposition 9.A.1) imply that revision opportunities arrive in the society as a whole at exponential rate NMR.
When an agent playing strategy i ∈ S^p receives a revision opportunity, he switches to strategy j ≠ i with probability ρ^p_{ij}/R. Since this choice is independent of the arrivals of revision opportunities, the probability that the next revision opportunity goes to an agent playing strategy i who then switches to strategy j is

(Nx^p_i / NM) × (ρ^p_{ij} / R) = x^p_i ρ^p_{ij} / MR.

This switch decreases the number of agents playing strategy i by one and increases the number playing j by one, shifting the state by (1/N)(e^p_j − e^p_i).
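The description above translates directly into a simulation routine. The sketch below is illustrative rather than part of the text: it assumes a single population of mass M = 1, a protocol supplied as a function x ↦ (ρ_ij(x)) whose off-diagonal row sums are bounded by R, and it generates one sample path of {X^N_t} up to time T.

```python
import numpy as np

def simulate_process(x0, rho, R, N, T, rng):
    """Simulate {X_t^N} for one population of mass M = 1 on the grid X^N.

    rho(x) returns the matrix of conditional switch rates; we require
    sum over j != i of rho_ij(x) <= R so that switch probabilities are valid.
    """
    x = np.array(x0, dtype=float)
    t = 0.0
    while True:
        # revision opportunities arrive society-wide at exponential rate N*M*R
        t += rng.exponential(1.0 / (N * R))
        if t > T:
            return x
        # the revising agent plays strategy i with probability x_i
        i = rng.choice(len(x), p=x)
        # conditional on revising, switch i -> j with probability rho_ij / R
        p = rho(x)[i] / R
        p[i] = 0.0
        p[i] = 1.0 - p.sum()          # probability of keeping strategy i
        j = rng.choice(len(x), p=p)
        # the jump shifts the state by (e_j - e_i)/N (zero if j == i)
        x[i] -= 1.0 / N
        x[j] += 1.0 / N
```

With the toss-and-switch protocol of Example 9.2.4 below (ρ ≡ 1/2, R = 1), sample paths of this routine track the solution of the mean dynamic when N is large.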
Summarizing this analysis yields the following observation, which specifies the parameters of the Markov process {X^N_t}.

Observation 9.1.1. A population game F, a revision protocol ρ, and a population size N define a Markov process {X^N_t} on the state space X^N. This process is described by some initial state X^N_0 = x^N_0, the jump rates λ^N_x = NMR, and the transition probabilities

p^N_{x,x+z} =
    x^p_i ρ^p_{ij}(F^p(x), x^p)/MR                                     if z = (1/N)(e^p_j − e^p_i), i, j ∈ S^p, i ≠ j, p ∈ P,
    1 − Σ_{p∈P} Σ_{i∈S^p} Σ_{j≠i} x^p_i ρ^p_{ij}(F^p(x), x^p)/MR       if z = 0,
    0                                                                   otherwise.

9.2 Finite Horizon Deterministic Approximation
In the previous section, we formally defined the Markov process {X^N_t} generated by a population game F, a revision protocol ρ, and a population size N. Earlier, we introduced the mean dynamic (M), an ordinary differential equation that captures the expected motion of this process; solutions {x_t} of (M) are continuous paths through the set of social states. Can we say more precisely how the stochastic and deterministic processes are linked?
The main result in this chapter, Theorem 9.2.3, shows that when the population size N is sufficiently large, the Markov process {X^N_t} is well approximated over finite time spans by the deterministic trajectory {x_t}.

9.2.1 Kurtz's Theorem
To begin, we state a general result on the convergence of a sequence {{X^N_t}}_{N=N_0}^∞ of Markov processes with decreasing step sizes. We suppose that the process indexed by N takes values in the state space X^N = {x ∈ X : Nx ∈ Z^n}, and we let λ^N ∈ R^{X^N}_+ and p^N ∈ R^{X^N × X^N}_+ denote the jump rate vector and transition matrix of this process.
To simplify the definitions to follow, we let ζ^N_x be a random variable (defined on an arbitrary probability space) whose distribution describes the stochastic increment of {X^N_t} from state x:

(9.1)   P(ζ^N_x = z) = P(X^N_{τ_{k+1}} = x + z | X^N_{τ_k} = x),
where τ_k is the time of the process's kth jump. We then define the functions V^N : X^N → TX, A^N : X^N → R, and A^N_δ : X^N → R by

V^N(x) = λ^N_x E ζ^N_x,
A^N(x) = λ^N_x E|ζ^N_x|,
A^N_δ(x) = λ^N_x E(|ζ^N_x| 1{|ζ^N_x| > δ}).

V^N(x), the product of the jump rate at state x and the expected increment per jump at x, represents the expected increment per time unit from x under {X^N_t}. V^N is thus an alternate definition of the mean dynamic of {X^N_t}. In a similar vein, A^N(x) is the expected absolute displacement per time unit, while A^N_δ(x) is the expected absolute displacement per time unit due to jumps traveling further than δ.
With these deﬁnitions in hand, we can state the basic approximation result.
Theorem 9.2.1 (Kurtz's Theorem). Let V : X → TX be a Lipschitz continuous vector field. Suppose that for some sequence {δ_N}_{N=N_0}^∞ converging to 0, we have

(9.2)   lim_{N→∞} sup_{x∈X^N} |V^N(x) − V(x)| = 0,
(9.3)   sup_N sup_{x∈X^N} A^N(x) < ∞, and
(9.4)   lim_{N→∞} sup_{x∈X^N} A^N_{δ_N}(x) = 0,

and that the initial conditions X^N_0 = x^N_0 converge to x_0 ∈ X. Let {x_t}_{t≥0} be the solution to the mean dynamic (M), ẋ = V(x), starting from x_0. Then for each T < ∞ and ε > 0, we have

lim_{N→∞} P( sup_{t∈[0,T]} |X^N_t − x_t| < ε ) = 1.

[Figure 9.2.1: Kurtz's Theorem.]

Fix a finite time horizon T < ∞ and an error bound ε > 0. Kurtz's Theorem tells us that
when the index N is large, nearly all sample paths of the Markov process {X^N_t} stay within ε of a solution of the mean dynamic (M) through time T. By making N large enough, we can ensure that with probability close to one, X^N_t and x_t differ by no more than ε for all t between 0 and T (Figure 9.2.1).
What conditions do we need to reach this conclusion? Condition (9.2) demands that as N grows large, the expected displacements per time unit V^N converge uniformly to a Lipschitz continuous vector field V. Lipschitz continuity of V ensures the existence and uniqueness of solutions of the mean dynamic ẋ = V(x). Condition (9.3) requires that the expected absolute displacement per time unit is bounded. Finally, condition (9.4) demands that jumps larger than δ_N make vanishing contributions to the motion of the processes, where {δ_N}_{N=N_0}^∞ is a sequence of constants that approaches zero.
The intuition behind Kurtz's Theorem can be explained as follows. At each revision opportunity, the increment in the process {X^N_t} is stochastic. However, the expected number of revision opportunities that arrive during the brief time interval I = [t, t + dt] is of order λ^N_x dt. Whenever it does not vanish, this quantity grows without bound as the population size N becomes large. Conditions (9.3) and (9.4) ensure that when N is large, each increment in the state is likely to be small. This ensures that the total change in the state during time interval I is small, so that jump rates and transition probabilities vary little during this interval. Since during I there are a very large number of revision opportunities, each generating nearly the same expected increment, intuition from the law of large numbers suggests that the change in {X^N_t} during the interval should be almost completely determined by the expected motion of {X^N_t}. This expected motion is captured by the limiting mean dynamic V, whose solutions approximate the stochastic process {X^N_t} over finite time spans with probability close to one.
Exercise 9.2.2. Suppose that {X^N_t} is a Markov process on X^N = {0, 1/N, 2/N, . . . , 1} with λ^N_x ≡ N. To ensure that 1/2 is always a state, restrict attention to even N. Give examples of sequences of transition probabilities from state 1/2 that
(i) satisfy condition (9.2) of Kurtz's Theorem, but not conditions (9.3) or (9.4);
(ii) satisfy conditions (9.2) and (9.4), but not condition (9.3);
(iii) satisfy conditions (9.2) and (9.3), but not condition (9.4).
(Hint: it is enough to consider transition probabilities under which V^N(1/2) = 0.)
Guided by your answers to parts (ii) and (iii), explain intuitively what conditions (9.3) and (9.4) require.

9.2.2 Deterministic Approximation of the Stochastic Evolutionary Process

Returning to our model of evolution, we now use Kurtz's Theorem to show that the Markov processes {{X^N_t}}_{N=N_0}^∞ defined in Section 9.1 can be approximated by solutions to the mean dynamic (M) derived in Section 3.1.2.
To begin, we compute the expected increment per time unit V^N(x) of the process {X^N_t}. Defining the random variable ζ^N_x as in equation (9.1) above, we find that

V^N(x) = λ^N_x E ζ^N_x
  = NMR Σ_{p∈P} Σ_{i∈S^p} Σ_{j≠i} (1/N)(e^p_j − e^p_i) P(ζ^N_x = (1/N)(e^p_j − e^p_i))
  = NMR Σ_{p∈P} Σ_{i∈S^p} Σ_{j≠i} (1/N)(e^p_j − e^p_i) · x^p_i ρ^p_{ij} / MR
  = Σ_{p∈P} Σ_{i∈S^p} Σ_{j∈S^p} (e^p_j − e^p_i) x^p_i ρ^p_{ij}
  = Σ_{p∈P} ( Σ_{j∈S^p} e^p_j Σ_{i∈S^p} x^p_i ρ^p_{ij} − Σ_{i∈S^p} e^p_i x^p_i Σ_{j∈S^p} ρ^p_{ij} ).

Thus, the vector field V^N = V is independent of N, and is expressed more concisely as

(M)   V^p_i(x) = Σ_{j∈S^p} x^p_j ρ^p_{ji} − x^p_i Σ_{j∈S^p} ρ^p_{ij},

as we established using a different calculation in Section 3.1.2.
√
p
p
N
{Xt } not be too abrupt. To verify these conditions, observe that since e j − ei  = 2 for any
√ N
distinct i, j ∈ Sp , the increments of {Xt } are always either of length
√ we choose δN = 2
,
N 2
N or of length zero. If this observation immediately implies condition (9.4): √
AN 2 (x) = λN E ζN 1
x
x
N √ ζN > N2
x = 0. The observation also helps us verify condition (9.3):
√
A (x) =
N λN
x E ζN
x ≤ RNM × 2√
= 2RM.
N With these calculations in hand, we can present the deterministic approximation theorem.
N
N
Theorem 9.2.3 (Deterministic Approximation of {X^N_t}). Let {{X^N_t}}_{N=N_0}^∞ be the sequence of stochastic evolutionary processes defined in Observation 9.1.1. Suppose that V = V^N is Lipschitz continuous. Let the initial conditions X^N_0 = x^N_0 converge to state x_0 ∈ X, and let {x_t}_{t≥0} be the solution to the mean dynamic (M) starting from x_0. Then for all T < ∞ and ε > 0,

lim_{N→∞} P( sup_{t∈[0,T]} |X^N_t − x_t| < ε ) = 1.

Choose a finite time span T and two small constants δ and ε. Then for all large enough population sizes N, the probability that the process {X^N_t} stays within ε of the deterministic trajectory {x_t} through time T is at least 1 − δ.
A key requirement of Theorem 9.2.3 is that V must be Lipschitz continuous, ensuring that the mean dynamic (M) admits a unique solution from every initial condition in
X. This requirement is satisﬁed by members of the families of dynamics (imitative,
excess payoﬀ, pairwise comparison) studied in Chapter 4, as well as the perturbed best
response dynamics from Chapter 5. The best response and projection dynamics, being
discontinuous, are not covered by Theorem 9.2.3, but it seems likely that deterministic
approximation results that apply to these dynamics can be proved (see the Notes).
It is well worth emphasizing that Theorem 9.2.3 is a finite horizon approximation result, and that it cannot be extended to an infinite horizon result. To see why not, consider the logit choice protocol (Example 3.2.5). Under this protocol, switches between all pairs of strategies occur with positive probability regardless of the current state. It follows that the induced Markov process {X^N_t} is irreducible: there is a positive probability path between each ordered pair of states in X^N. As we shall see in Chapter 10, irreducibility implies that every state in X^N is visited infinitely often with probability one. This fact clearly precludes an infinite horizon analogue of Theorem 9.2.3. Indeed, infinite horizon analysis of {X^N_t} requires a different set of tools, which we present in Chapter 10.
Example 9.2.4. Toss and switch. Suppose that agents play a game with strategy set S = {L, R} using the constant revision protocol ρ, where ρ_LL = ρ_LR = ρ_RL = ρ_RR = 1/2. Under the simplest interpretation of this protocol, each agent receives revision opportunities at rate 1; upon receiving an opportunity, an agent flips a fair coin, switching strategies if the coin comes up Heads.
For each population size N, the protocol generates a Markov process {X^N_t} with common jump rate λ^N_x ≡ N and transition probabilities

p^N_{x,x+z} =
    (1/2) x_R    if z = (1/N)(e_L − e_R),
    (1/2) x_L    if z = (1/N)(e_R − e_L),
    1/2          if z = 0.

We can simplify the notation by replacing the vector state variable x = (x_L, x_R) ∈ X with the scalar state variable r = x_R ∈ [0, 1]. The resulting Markov process {R^N_t} has common jump rate λ^N_r ≡ N and transition probabilities

p^N_{r,r+z} =
    (1/2) r          if z = −1/N,
    (1/2)(1 − r)     if z = 1/N,
    1/2              if z = 0.

Its mean dynamic is thus
V^N(r) = λ^N_r E ζ^N_r = N( (−1/N) · (1/2) r + (1/N) · (1/2)(1 − r) ) = 1/2 − r,

regardless of the population size N. To solve this differential equation, we move the rest point r = 1/2 to the origin using the change of variable s = r − 1/2. The equation

ṡ = ṙ = 1/2 − r = 1/2 − (s + 1/2) = −s

has the general solution s_t = s_0 e^{−t}, implying that

r_t = 1/2 + (r_0 − 1/2) e^{−t}.

Fix a time horizon T < ∞. Theorem 9.2.3 tells us that when N is sufficiently large,
the process {R^N_t} is very likely to stay very close to this almost deterministic trajectory; the trajectory converges to state r = 1/2, with convergence occurring at exponential rate 1.
If we instead fix the population size N and look at behavior over the infinite time horizon (T = ∞), the process {R^N_t} eventually splits off from the deterministic trajectory, visiting all states in {0, 1/N, . . . , 1} infinitely often. §
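The finite-horizon approximation in this example is easy to see in simulation. The following sketch (illustrative, not from the text) runs the scalar process {R^N_t} and compares its time-T value with the deterministic solution r_t = 1/2 + (r_0 − 1/2)e^{−t}:

```python
import numpy as np

def toss_and_switch(r0, N, T, rng):
    """One sample path of the scalar process {R_t^N} (jump rate N)."""
    r, t = r0, 0.0
    while True:
        t += rng.exponential(1.0 / N)        # jumps arrive at rate N
        if t > T:
            return r
        u = rng.random()
        if u < 0.5 * r:                       # an R player flips Heads: R -> L
            r -= 1.0 / N
        elif u < 0.5:                         # an L player flips Heads: L -> R
            r += 1.0 / N
        # with probability 1/2 the coin comes up Tails and nothing changes

rng = np.random.default_rng(1)
N, T, r0 = 10_000, 3.0, 1.0
r_T = toss_and_switch(r0, N, T, rng)
r_det = 0.5 + (r0 - 0.5) * np.exp(-T)        # deterministic trajectory at T
```

For N of this size the fluctuations around the deterministic path are of order 1/√N, so r_T lands very near r_det.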
Exercise 9.2.5. Consider a single population playing population game F using revision protocol ρ.
(i) Show that the resulting mean dynamic can be expressed as

ẋ = (x′R(x))′,

where R(x) ∈ R^{n×n} is given by

R_ij(x) = ρ_ij(F(x), x) if i ≠ j,    R_ii(x) = −Σ_{k≠i} ρ_ik(F(x), x).

Note that when ρ is independent of F(x) and x as in the previous example, the matrix R is independent of x as well. In this case we obtain the linear dynamic ẋ = (x′R)′, whose solutions can be expressed in closed form (see Appendix 7.B).
(ii) Suppose that ρ_ij = 1 for all i and j. Describe the parameters of the resulting Markov process {X^N_t}, and write down the corresponding mean dynamic. Show that solutions to the mean dynamic take the form x_t = x* + (x_0 − x*) e^{−nt}, where x* = (1/n)1.
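For part (ii), the claimed closed form can at least be verified numerically: with ρ ≡ 1, the matrix R has ones off the diagonal and −(n − 1) on it, and the stated solution satisfies ẋ_t = R′x_t at every t. An illustrative check (not a solution to the exercise):

```python
import numpy as np

n = 4
R = np.ones((n, n)) - n * np.eye(n)       # rho_ij = 1: R_ij = 1, R_ii = -(n - 1)
x0 = np.array([0.7, 0.1, 0.1, 0.1])
x_star = np.ones(n) / n                   # x* = (1/n) * 1

def x_closed(t):
    # claimed solution x_t = x* + (x0 - x*) e^{-n t}
    return x_star + (x0 - x_star) * np.exp(-n * t)

def x_dot_closed(t):
    # time derivative of the closed form
    return -n * (x_closed(t) - x_star)
```

On the simplex, R′x = 1 − nx, which equals −n(x − x*); this is exactly the derivative of the closed form.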
9.3 Extensions

9.3.1 Finite Population Effects

In our model of individual choice, the revision protocol ρ was defined independently of
the population size. In some cases, it is more appropriate to allow the revision protocol to
depend on N in some vanishing way—for example, to account for the eﬀects of sampling
from a ﬁnite population, or for the fact that an agent whose choices are based on imitation
will not imitate himself. If we include these eﬀects, then ρ varies with N, so the normalized
expected increments V N vary with N as well. Fortunately, Kurtz’s Theorem allows for
these sorts of eﬀects so long as they are vanishing in size: examining condition (9.2), we
see that as long as the functions V N converge uniformly to a limiting mean dynamic V ,
the finite horizon approximation continues to hold.

9.3.2 Discrete Time Models

It is also possible to prove deterministic approximation results for discrete time models
of stochastic evolution. To do so, we assume that the number of discrete periods that pass per unit of clock time grows with the population size N. In this situation, one can employ a discrete time version of Kurtz's Theorem (Theorem 9.C.1 in Appendix 9.C), the requirements of which are direct analogues of those from Theorem 9.2.1 above.
So, let us suppose that when the population size is N, each discrete time period is of duration d_N = 1/NMR, so that periods begin at times in the set T_N = {0, d_N, 2d_N, . . .}. We consider two specifications of the discrete time evolutionary process {X^N_t}_{t∈T_N}.
Exercise 9.3.1. Discrete time model I: One revision opportunity per period. Suppose that during each period, exactly one agent is selected at random and granted a revision opportunity, with each agent being equally likely to be chosen. The chosen agent's choices are then governed by the conditional switch probabilities ρ^p_{ij}/R. Using Theorem 9.C.1, show that Theorem 9.2.3 extends to this discrete time model.
Discrete time models can allow a possibility that our continuous time model cannot: they permit many agents to switch strategies simultaneously. The next exercise shows that deterministic approximation is still possible even when simultaneous revisions by many agents are possible, so long as they are sufficiently unlikely.

Exercise 9.3.2. Discrete time model II: Random numbers of revision opportunities in each period.
Suppose that during each period, each agent tosses a coin that comes up heads with probability 1/NM. Every agent who tosses a head receives a revision opportunity; choices for such agents are again governed by the conditional switch probabilities ρ^p_{ij}/R. Use the Poisson Limit Theorem (Propositions 9.A.4(ii) and 9.A.5) and Theorem 9.C.1 to show that Theorem 9.2.3 extends to this model. (Hint: In any given period, the number of agents whose tosses come up heads is binomially distributed with parameters NM and 1/NM.)

Appendix

9.A The Exponential and Poisson Distributions

9.A.1 Basic Properties

The random variable T with support [0, ∞) has an exponential distribution with rate λ,
denoted T ∼ exponential(λ), if its decumulative distribution is P(T ≥ t) = e^{−λt}, so that its density function is f(t) = λe^{−λt}. A Taylor approximation shows that for small dt > 0,

(9.5)   P(T ≤ dt) = 1 − e^{−λ dt} = 0 + λe^{−λ·0} dt + O((dt)²) ≈ λ dt.
passes before a certain occurrence: the arrival of a customer at a queue, the decay of
a particle, and so on. We often describe the behavior of exponential random variables
using the rhetorical device of a “stochastic alarm clock” that rings after an exponentially
distributed amount of time has passed.
Some basic properties of the exponential distribution are listed next.
Proposition 9.A.1. Let T1 , . . . , Tn be independent with Ti ∼ exponential(λi ). Then
(i) ETi = λ−1 ;
i
(ii) P (Ti ≥ u + t Ti ≥ u ) = P (Ti ≥ t) = e−λi t ;
(iii) If Mn = min{T1 , . . . , Tn } and In = argmin j T j , then Mn ∼ exponential( n=1 λi ),
i
n
P(In = i) = λi / j=1 λ j , and Mn and In are independent.
Property (ii), memorylessness, says that if the time before one’s alarm clock rings is
exponentially distributed, then one’s beliefs about how long from now the clock will
ring do not depend on how long one has already been waiting. Together, this property
and equation (9.5) above tell us that until the time when the clock rings, the conditional
probability that it rings during the next dt time units is proportional to dt:

P(T_i ≤ t + dt | T_i ≥ t) = P(T_i ≤ dt) ≈ λ_i dt.
The exponential distributions are the only continuous distributions with these properties.
Property (iii) says that given a collection of independent exponential alarm clocks, then
the time until the ﬁrst clock rings is itself exponentially distributed, the probability that a
particular clock rings ﬁrst is proportional to its rate, and the time until the ﬁrst ring and
the ringing clock’s identity are independent random variables. These facts are essential
to the workings of our stochastic evolutionary model.
Proof. Parts (i) and (ii) are easily veriﬁed. To establish part (iii), set λ =
compute the distribution of Mn as follows:
P(Mn ≥ t) = P n
i =1 n {Ti ≥ t} = i =1 n P(Ti ≥ t) = i =1 n
i=1 λi , and e−λi t = e−λt . To prove the remaining claims from part (iii), observe that
∞ ∞ (9.6) P ji {Ti ≤ T j } ∩ {Ti ≥ t} = ji t λ j e−λ j t j dt j λi e−λi ti dti
ti ∞ =
t ji e−λ j ti λi e−λi ti dti ∞ λi e−λti dti =
t = (λi /λ) e−λt .
Setting t = 0 in equation (9.6) yields P(In = i) = λi /λ, and an arbitrary choice of t shows
that P(Mn ≥ t, In = i) = P(Mn ≥ t)P(In = i).
A random variable R has a Poisson distribution with rate λ, denoted R ∼ Poisson(λ), if
P(R = r) = e−λ λr /r! for all r ∈ {0, 1, 2, . . .}. Poisson random variables are used to model
the number of occurrences of rare events (see Propositions 9.A.3 and 9.A.4). Two of their
basic properties are listed below.
Proposition 9.A.2. If R1 , . . . , Rn are independent with Ri ∼ Poisson(λi ), then
(i) E(Ri ) = λi ;
n
n
(ii)
j=1 R j ∼ Poisson( j=1 λ j ).
∞ Proof. (i) E(Ri ) = re
r =1 r
−λi (λi ) r! ∞
−λ i = λi e
r =1 (λi )r−1
= λi
(r − 1)! 364 ∞ s
−λi (λi ) e
s=0 s! = λi . r Rt = 3 3
Rt = 2 2
Rt = 1 1
Rt = 0 0 S 1 = T1 S 2 = T1 + T2 S 3 = T1+ T2+ T3 t Figure 9.A.1: Ring times Sn and numbers of rings Rt of an exponential alarm clock. (ii) When n = 2, we can compute that
r P(R1 + R2 = r) = r r1 =0
r −(λ1 +λ2 ) =e e−λ1 P(R1 = r1 ) P(R2 = r − r1 ) =
r1 =0 (λ1 )r1 −λ2 (λ2 )r−r1
e
r1 !
(r − r1 )! r
(λ1 ) (λ2 )
−(λ1 +λ2 ) (λ1 + λ2 )
=e
,
r1 ! (r − r1 )!
r!
r =0
r1 r−r1 1 where the ﬁnal equality follows from the binomial expansion
r (λ1 + λ2 )r = r!
(λ1 )r1 (λ2 )r−r1 .
r ! (r − r1 )!
r =0 1
1 Iterating yields the general result.
The exponential and Poisson distributions are fundamentally linked. Let {Ti }∞ 1 be a
i=
sequence of i.i.d. exponential(λ) random variables. We can interpret T1 as the ﬁrst time that
an exponential alarm clock rings, T2 as the interval between the ﬁrst and second rings,
and Tk as the interval between the (k − 1)st and kth rings. In this interpretation, the sum
Sn = n=1 Tk represents the time of the nth ring, while Rt = max{n : Sn ≤ t} represents the
k
number of rings through time t. Figure 9.A.1 presents a single realization of the ring time
sequence {Sn }∞ 1 and the numberofrings process {Rt }t≥0 .
n=
Proposition 9.A.3 derives the distribution of Rt , establishing a key connection between
the exponential and Poisson distributions.
Proposition 9.A.3. Rt ∼ Poisson(λt).
365 Proof. To begin, we prove that Sn has density
(9.7) fn (t) = λe−λt (λt)n−1
.
(n − 1)! This formula is obviously correct when n = 1. Suppose it is true for some arbitrary n.
∞
Then using the convolution formula fX+Y (z) = −∞ fY (z − x) fX (x) dx, we ﬁnd that
t fn+1 (t) = t fn (t − s) f1 (s)ds =
0 0 n
(λ(t − s))n−1 −λ(t−s)
−λs
−λt (λt)
e
× λe ds = λe
.
λ
(n − 1)!
n! Next, we show that this equation implies that Sn has cumulative distribution
∞ e−λt P(Sn ≤ t) =
m =n Since (λt)m
∞
m =0 m ! (λt)m
.
m! = eλt , this statement is equivalent to
n−1 e−λt P(Sn ≤ t) = 1 −
m =0 (λt)m
.
m! Diﬀerentiating shows that this expression is in turn equivalent to the density of Sn taking
form (9.7), as established above.
To complete the proof, we express the event that at least n rings have occurred by time
t in two equivalent ways: {Rt ≥ n} = {Sn ≤ t}. This observation and the expression for
P(Sn ≤ t) above imply that
P(Rt = n) = P(Rt ≥ n) − P(Rt ≥ n + 1) = P(Sn ≤ t) − P(Sn+1 ≤ t) = e−λt 9.A.2 (λt)n
.
n! The Poisson Limit Theorem Proposition 9.A.3 shows that the Poisson distribution describes the number of rings of
an exponential alarm clock during a ﬁxed time span. We now establish a discrete analogue
of this result.
The random variable Xp has a Bernoulli distribution with parameter p ∈ [0, 1], denoted
p
Xp ∼ Bernoulli(p), if P(Xp = 1) = p and P(Xp = 0) = 1 − p. Let {Xi }n=1 be a sequence of
i
p
p
i.i.d. Bernoulli(p) random variables (e.g. coin tosses), and let Sn = n=1 Xi denote their sum
i
p
(the number of heads in n tosses). Then Sn has a binomial distribution with parameters n 366 p and p (Sn ∼ binomial(n, p)): n
p P(Sn = s) = ps (1 − p)n−s for all s ∈ {0, 1, . . . , n}.
s
Finally, the random variable Z has a standard normal distribution (Z ∼ N(0, 1)) if its density
2
function is f (z) = √1 π exp(− z2 ).
2
p
Proposition 9.A.4 considers the behavior of the binomial random variables Sn when
the number of tosses n becomes large. Recall that the sequence of random variables
{Yn }∞ 1 with distribution functions {Fn }∞ 1 converges in distribution (or converges weakly)
n=
n=
to the random variable Y with distribution function F (denoted Yn ⇒ Y, or Fn ⇒ F) if
limn→∞ Fn (x) = F(x) at all points x ∈ R at which F is continuous.
p Proposition 9.A.4. Let Sn ∼ binomial(n, p). Then as n → ∞,
p (i)
(ii) S
√ n −np ⇒ Z, where Z ∼ N(0, 1). np(1−p)
λ/n
Sn ⇒ Rλ where Rλ ∼ Poisson(λ). If we increase the number of tosses n of a coin whose bias p is ﬁxed, the Central Limit
p
Theorem tells us that the distribution of the number of heads Sn approaches a normal
p
p
distribution. (In statement (i), we subtract the mean ESn = np oﬀ of Sn and then divide by
p
the standard deviation SD(Sn ) = np(1 − p) to obtain convergence to a ﬁxed distribution.)
Suppose instead that as we increase the number of tosses n, we decrease the probability
of heads p in such a way that the expected number of heads np = λ remains ﬁxed. Then
p
statement (ii), the Poisson Limit Theorem, tells us that the distribution of Sn approaches a
Poisson distribution. The basic calculation needed to prove this is as follows:
P(Sλ/n
n n!
λ
= s) =
s!(n − s)! n
λ = P(R = s) × s λ
1−
n (1 − λ )n
n
e −λ s−1 ×
r=0 n−s λ
= 1−
n n λ
λs
× 1−
s!
n −s n!
(n − s)! ns n−r
→ P(Rλ = s).
n−λ The second term of the penultimate expression above is independent of s and is less than
1 (because (1 − λ )n increases to e−λ ), while the ﬁnal term attains its maximum over s when
n
s = λ + 1 and decreases to 1 as n grows large. Together, these observations yield the
following upper bound, which is needed in Exercise 9.3.2.
Proposition 9.A.5. P(Sλ/n = s) ≤ Cλ P(Rλ = s) for some Cλ ∈ R independent of n and s.
9.B Countable State Markov Processes

9.B.1 Countable Probability Models

We begin our review of probability theory by discussing probability models with a
countable sample space. A countable probability model is a pair (Ω, P), where the sample
space Ω is a ﬁnite or countable set, 2Ω is the set of all subsets of Ω, and P : 2Ω → [0, 1]
is a probability measure: that is, a function satisfying P(∅) = 0, P(Ω) = 1, and countable
additivity: if {Ak } is a ﬁnite or countable collection of disjoint events (i.e., subsets of Ω), then
P( k Ak ) = k P(Ak ).
A random variable X is a function whose domain is Ω. The distribution of X is deﬁned
by P(X ∈ A) = P(ω ∈ Ω : X(ω) ∈ A) for all subsets A of the range of X. To deﬁne a ﬁnite
collection of discrete random variables {Xk }n=1 , we specify a probability model (Ω, P) and
k
then deﬁne the random variables as functions on Ω. To interpret this construction, imagine
picking an ω at random from the sample space Ω according to the probability distribution
P. The value of ω so selected determines the realizations X1 (ω), X2 (ω), . . . , Xn (ω) of the
entire sequence of random variables X1 , X2 , . . ., Xn .
Example 9.B.1. Repeated rolls of a fair die. Suppose we would like to construct a sequence of
random variables {Xk }n=1 , where Xk is to represent the kth roll of a fair die. To accomplish
k
this, we let R = {1, 2, 3, 4, 5, 6} be the set of possible results of an individual roll, and let
the sample space be the set of nvectors Ω = Rn , with typical element ω = (ω1 , . . ., ωn ). To
deﬁne the probability measure P, it is enough to let P({ω}) = ( 1 )n for all ω ∈ Ω; additivity
6
then determines the probabilities of all other events in 2Ω .
The random variables Xk can then be deﬁned as coordinate functions: Xk (ω) = ωk for all
ω ∈ Ω and k ∈ {1, . . ., n}. Observe once again that by randomly selecting an ω ∈ Ω, we
determine the realizations of all n random variables. Since
P(Xk = xk ) = P(ω ∈ Ω : Xk (ω) = xk ) = P(ω ∈ Ω : ωk = xk ) = 1
6 for all xk ∈ R, the random variables Xk have the correct marginal distributions. Moreover,
if Ak ⊆ R for k ∈ {1, . . ., n}, it is easy to conﬁrm that P n
k =1 {Xk ∈ Ak } = n P(Xk ∈ Ak ),
k =1 so the Xk are independent, as desired. §
The expected value of a random variable is its integral with respect to the probability
368 measure P. In the case of the kth die roll,
EXk = Ω Xk (ω) dP(ω) = ωk P(ω) =
ωk ∈R ω∈Ω ωk ω−k P(ωk , ω−k ) = 6 i× 1
6 = 31.
2 i=1 We can create new random variables out of old ones using functional operations. For
instance, the total of the results of the n die rolls is a new random variable Sn deﬁned by
Sn = n=1 Xk , or, more explicitly, by Sn (ω) = n=1 Xk (ω) for all ω ∈ Ω.
k
k 9.B.2 Uncountable Probability Models and Measure Theory While the constructions above are suﬃcient for ﬁnite collections of discrete random
variables, they do not suﬃce when individual random variables take an uncountable
number of values, or when we are interested in inﬁnite numbers of random variables.
To handle these situations, we need the sample space Ω to be uncountable: that is, not
expressible as a sequence of elements.
Unfortunately, uncountable sample spaces introduce a serious new technical diﬃculty.
As an illustration, suppose we want to construct a random variable representing a uniform
draw from the unit interval. It is natural to choose Ω = [0, 1] as our sample space and to
deﬁne our random variable as the identity function on Ω: that is, X(ω) = ω. But then we
encounter a major diﬃculty: it is impossible to deﬁne a countably additive probability
measure P that speciﬁes the probability of every subset of Ω.
To resolve this problem, one chooses a set of subsets F ⊆ 2Ω whose probabilities will
be speciﬁed, and then introduces corresponding restrictions on the deﬁnition of a random
variable. A random variable satisfying these restrictions is said to be measurable, and this
general approach to studying functions deﬁned on uncountable domains is known as
measure theory.
To summarize some of the foregoing discussion: an uncountable probability model consists of a triple (Ω, F , P), where Ω is a sample space, F ⊆ 2Ω is a collection (more specifically, a σalgebra) of subsets of Ω, and P : F → [0, 1] is a countably additive probability
measure.
Suppose we would like to study a collection of random variables described by some
prespeciﬁed joint distributions. How do we know whether it is possible to construct
these random variables on some wellchosen probability space? Happily, as long as
the marginal and joint distributions satisfy certain obviously necessary consistency conditions, existence of the probability space and the random variables is ensured by the
Carath´odory Extension Theorem and the Kolmogorov Extension Theorem.
e
369 9.B.3 Distributional Properties and Sample Path Properties The reader may wonder why we bother with the explicit construction of random
variables. After all, once we specify the joint distributions of the basic random variables
of interest, we also determine the joint distributions of any random variables that can be
derived from our original collection. Why not work entirely in terms of these distributions
and avoid the explicit construction of the random variables altogether?
If we are only interested in distributional properties of our random variables, explicit
construction of the random variables is not essential. However, many key results in
probability theory concern not the distributional properties of random variables, but rather
their sample path properties. These are properties of realization sequences: i.e., the sequences
of values X1 (ω), X2 (ω), X3 (ω), . . . that arise for each choice of ω ∈ Ω. The diﬀerences
between the two sorts of properties can be illustrated through a simple example.
Example 9.B.2. Consider the probability model (Ω, P) with sample space Ω = {−1, 1} and
1
probability measure P({−1}) = P({1}) = 2 . Deﬁne the sequences of random variables {Xi }∞ 1
i=
ˆ
and {Xi }∞ 1 as follows:
i=
Xi (ω) = ω; −ω if i is odd, ˆ
Xi (ω) = ω if i is even.
ˆ
If we look only at marginal distributions, {Xi }∞ 1 and {Xi }∞ 1 are identical, as both sequences
i=
i=
consist of random variables equally likely to have realizations –1 and 1. But from the
sample path point of view, the two sequences are diﬀerent: for either choice of ω, the
ˆ
sequence {Xi (ω)}∞ 1 is constant, while the sequence {Xi (ω)}∞ 1 alternates between 1 and 1
i=
i=
forever.
We illustrate these ideas in Figures 9.B.1 and 9.B.2, which provide graphical representations of our two sequences of random variables. In these pictures, the vertical axis
represents the sample space Ω, the horizontal axis represents indices (or “times”) of the
ˆ
trials, and the interiors of the ﬁgures contain the realizations Xi (ω) and Xi (ω). To focus on
distributional properties of a sequence of random variables, we focus on the collection of
outcomes in each vertical section of the picture (Figure 9.B.1). In this respect, each Xi is
ˆ
identical to its partner Xi , and in fact all of the random variables in both sequences share
the same distribution. To focus on sample path properties, we look instead at the sequences
of outcomes in each horizontal slice of each picture (Figure 9.B.2). By doing so, we see that
ˆ
for each ω, the sample path {Xi (ω)}∞ 1 is quite diﬀerent from the sample path {Xi (ω)}∞ 1 . §
i=
i=
370 ˆ
X X
–1 –1
1 –1 –1 –1 –1 1 1 1 1 1 1 Ω 2 3
4
time 5 –1 1 –1 1 –1 1 1 –1 1 –1 1 –1 1 Ω 2 3
4
time 5 ... ... ˆ
Figure 9.B.1: Distributional properties of X and X. ˆ
X X
–1 –1
1 –1 –1 –1 –1 1 1 1 1 1 1 Ω 2 3
4
time 5 –1 1 –1 1 –1 1 1 –1 1 –1 1 –1 1 Ω 2 3
4
time 5 ... ... ˆ
Figure 9.B.2: Sample path properties of X and X. Example 9.B.3. Properties of i.i.d. random variables. The distinction between distributional
properties and sample path properties can be used to classify the fundamental theorems
about sequences of i.i.d. random variables. Let {X_i}_{i=1}^∞ be a sequence of i.i.d. random
variables, each of which is a function on the (uncountable) probability space (Ω, F, P).
For simplicity, assume that each X_i has mean zero and variance one. Then the sum
S_n = Σ_{i=1}^n X_i has mean zero and variance n, while the sample average X̄_n = S_n/n has mean
zero and variance 1/n.
The laws of large numbers concern the convergence of the sample averages X̄_n as the
number of trials n grows large. The Weak Law of Large Numbers is a distributional result:
as n goes to infinity, the distributions of the random variables X̄_n approach a point mass
at 0.

The Weak Law of Large Numbers: For all ε > 0,  lim_{n→∞} P(X̄_n ∈ [−ε, ε]) = 1.

In contrast, the Strong Law of Large Numbers is a sample path result: for almost every choice
of ω ∈ Ω, the sequence of realizations {X̄_n(ω)}_{n=1}^∞ converges to zero.

The Strong Law of Large Numbers:  P(ω ∈ Ω : lim_{n→∞} X̄_n(ω) = 0) = 1.

Note that while the WLLN can be stated directly in terms of distributions, the SLLN only
makes sense if our random variables are defined as functions on a probability space.
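Both flavors of the law of large numbers are easy to see in a Monte Carlo sketch, with ±1 coin flips standing in for the i.i.d. sequence (the sample sizes and tolerance below are arbitrary choices, not part of the text):

```python
import random

random.seed(0)

# Sample average X̄_n of n i.i.d. ±1 random variables (mean 0, variance 1).
def sample_average(n):
    return sum(random.choice((-1, 1)) for _ in range(n)) / n

# SLLN flavor: along one sample path, X̄_n is close to 0 for large n.
xbar = abs(sample_average(10_000))

# WLLN flavor: across many independent paths, the fraction of paths with
# |X̄_n| ≤ ε approaches 1 as n grows.
eps = 0.05
paths = 200
frac_inside = sum(abs(sample_average(10_000)) <= eps for _ in range(paths)) / paths
```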
A second pair of results focuses on variation. The Central Limit Theorem concerns
distributions: as n goes to infinity, the distributions of the normalized sums S_n/√n converge
to the standard normal distribution.

The Central Limit Theorem:  lim_{n→∞} P(S_n/√n ∈ [a, b]) = ∫_a^b (1/√(2π)) e^{−x²/2} dx.

The Law of the Iterated Logarithm looks at variation within individual sample paths:
for almost every choice of ω ∈ Ω, the sequence of realizations {S_n(ω)}_{n=1}^∞ exceeds
(1 − ε)√(2n log log n) infinitely often, but exceeds (1 + ε)√(2n log log n) only finitely often.

The Law of the Iterated Logarithm:  P(ω ∈ Ω : lim sup_{n→∞} S_n(ω)/√(2n log log n) = 1) = 1. §
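The CLT statement can likewise be checked by simulation. This sketch (with arbitrary n and trial counts of our choosing) compares the empirical probability that S_n/√n lands in [−1, 1] with Φ(1) − Φ(−1):

```python
import math
import random

random.seed(1)

n, trials = 1_000, 2_000

# Normalized sum S_n/√n of n i.i.d. ±1 random variables.
def normalized_sum():
    return sum(random.choice((-1, 1)) for _ in range(n)) / math.sqrt(n)

# Empirical P(S_n/√n ∈ [−1, 1]) versus the standard normal probability.
inside = sum(-1.0 <= normalized_sum() <= 1.0 for _ in range(trials)) / trials
target = math.erf(1 / math.sqrt(2))   # Φ(1) − Φ(−1) ≈ 0.6827
```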
In Chapter 10, we present distributional and sample path convergence theorems for
Markov processes; these results are the key to describing the evolution of behavior over
infinite time horizons.

9.B.4 Countable State Markov Chains

Markov chains and Markov processes are collections of random variables {X_t}_{t∈T} with
the property that “the future only depends on the past through the present”. We focus on
settings where these random variables take values in some ﬁnite or countable state space
X . (Of course, even if the state space X is countable, the random variables Xt : Ω → X must
be deﬁned on a probability model with an uncountable sample space Ω if the set of times T is
inﬁnite.) We use the terms “Markov chain” and “Markov process” to distinguish between
the discrete time (T = {0, 1, . . .}) and continuous time (T = [0, ∞)) frameworks. (Some
authors use these terms to distinguish between discrete and continuous state spaces.)
The sequence of random variables {X_t} = {X_t}_{t=0}^∞ is a Markov chain if it satisfies the
Markov property:

    P(X_{t+1} = x_{t+1} | X_0 = x_0, . . . , X_t = x_t) = P(X_{t+1} = x_{t+1} | X_t = x_t)

for all times t ∈ {0, 1, . . .} and all collections of states x_0, . . . , x_{t+1} ∈ X for which the
conditional expectations are well defined. We only consider temporally homogeneous Markov
chains, which are Markov chains whose one-step transition probabilities are independent
of time:

    P(X_{t+1} = y | X_t = x) = p_{xy}.
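As a concrete illustration (the two-state chain and its transition matrix below are our own choices, not from the text), temporal homogeneity means a single matrix p governs every step, so p can be recovered from transition counts along one long sample path:

```python
import random

random.seed(2)

# A temporally homogeneous chain on X = {0, 1} with an arbitrary
# illustrative transition matrix p.
p = [[0.9, 0.1],
     [0.4, 0.6]]

def step(x):
    return 0 if random.random() < p[x][0] else 1

# Estimate p_01 = P(X_{t+1} = 1 | X_t = 0) from transition counts.
x, visits_0, moves_01 = 0, 0, 0
for _ in range(200_000):
    nxt = step(x)
    if x == 0:
        visits_0 += 1
        moves_01 += (nxt == 1)
    x = nxt

p01_hat = moves_01 / visits_0
```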
We call the matrix p ∈ R_+^{X×X} the transition matrix for the Markov chain {X_t}. The vector
π ∈ R_+^X defined by P(X_0 = x) = π_x is the initial distribution of {X_t}; when π puts all of its
mass on a single state x_0, we call x_0 the initial condition or the initial state. The vector π and
the matrix p fully determine the joint distributions of {X_t} via

    P(X_0 = x_0, . . . , X_t = x_t) = π_{x_0} Π_{s=1}^t p_{x_{s−1} x_s}.

Since certain properties of Markov chains do not depend on the initial distribution π, it is
sometimes left unspecified.

9.B.5 Countable State Markov Processes

A (temporally homogeneous) Markov process on the countable state space X is a collection
of random variables {Xt } = {Xt }t≥0 with continuous time index t. This collection must
satisfy the following three properties:
(MP) The (continuous time) Markov property:
     P(X_{t_{k+1}} = x_{t_{k+1}} | X_{t_0} = x_{t_0}, . . . , X_{t_k} = x_{t_k}) = P(X_{t_{k+1}} = x_{t_{k+1}} | X_{t_k} = x_{t_k})
     for all 0 ≤ t_0 < . . . < t_{k+1} and x_{t_0}, . . . , x_{t_{k+1}} ∈ X with P(X_{t_0} = x_{t_0}, . . . , X_{t_k} = x_{t_k}) > 0.

(TH) Temporal homogeneity:
     P(X_{t+u} = y | X_t = x) = p_{xy}(u) for all t, u ≥ 0.

(RCLL) Right continuity and left limits:
     For every ω ∈ Ω, the sample path {X_t(ω)}_{t≥0} is continuous from the right
     and has left limits. That is, lim_{s↓t} X_s(ω) = X_t(ω) for all t ∈ [0, ∞), and
     lim_{s↑t} X_s(ω) exists for all t ∈ (0, ∞).

While conditions (MP) and (TH) are restrictions on the (joint) distributions of {X_t}, condition
(RCLL) is a restriction on the sample paths of {X_t}.
Processes satisfying the distributional requirements (MP) and (TH) must take this form:
there must be an initial distribution π ∈ R_+^X, a jump rate vector λ ∈ R_+^X, and a transition
matrix p ∈ R_+^{X×X} such that
(i) The initial distribution of the process is given by P(X0 = x) = πx .
(ii) When the process is in state x, the random time before the next jump is exponentially
distributed with rate λx .
(iii) The state at which a jump from x lands follows the distribution {pxy } y∈X . (Note that
the landing state can be x itself if pxx > 0.)
(iv) Times between and landing states of jumps are independent of each other, and are
also independent of the past conditional on the current state.
The objects π, λ, and p implicitly deﬁne the joint distributions of the random variables
{Xt }, so the Kolmogorov Extension Theorem (Section 9.B.2) tells us that a collection of
random variables with these joint distributions exists (i.e., can be deﬁned as functions on
some well chosen probability space). However, Kolmogorov’s Theorem does not ensure
that the random variables so constructed satisfy the sample path continuity property
(RCLL).
Fortunately, it is not too difficult to construct the process {X_t} explicitly. Let {Y_k}_{k=0}^∞ be
a discrete time Markov chain with initial distribution π and transition matrix p, and let
{T_k}_{k=1}^∞ be a sequence of i.i.d. exponential(1) random variables that are independent of the
Markov chain {Y_k}. (Since both of these collections are countable, questions of sample
path continuity do not arise; the existence of these random variables as functions defined
on a common probability space is ensured by Kolmogorov’s Theorem.)
Define the random jump times {τ_n}_{n=0}^∞ by τ_0 = 0 and

    τ_n = Σ_{k=1}^n T_k/λ_{Y_{k−1}},    so that    τ_n − τ_{n−1} = T_n/λ_{Y_{n−1}}.

Finally, define the process {X_t}_{t≥0} by

    X_t = Y_n when t ∈ [τ_n, τ_{n+1}).
duration τ1 ∼ exponential(λ y0 ), at which point a transition to some new state Xτ1 = Y1 = y1
occurs; the process then remains at y1 for the random duration τ2 − τ1 ∼ exponential(λ y1 ),
at which point a transition to Xτ2 = Y2 = y2 occurs; and so on. By construction, the sample
paths of {Xt } are right continuous with left limits, and it is easy to check that the joint
distributions of {Xt } are the ones we desire.
Example 9.B.4. The Poisson Process. Consider a Markov process {Xt } with state space
X = Z+ , initial condition X0 = 0, jump rates λx = λ > 0 for all x ∈ X , and transition matrix
pxy = 1{ y=x+1} for all x, y ∈ X . Under this process, jumps arrive randomly at the ﬁxed rate
λ, and every jump increases the state by exactly one unit. A Markov process ﬁtting this
description is called a Poisson process.
By the deﬁnition of this process,
(P1) The waiting times τn – τn−1 are i.i.d. with τn – τn−1 ∼ exponential(λ)
374 (n ∈ {1, 2, . . . }).
In fact, it can be shown that under the sample path continuity condition (RCLL), condition
(P1) is equivalent to
(P2) The increments Xtn − Xtn−1 are independent random variables,
and (Xtn − Xtn−1 ) ∼ Poisson(λ(tn − tn−1 )) (0 < t1 < . . . < tn ). Proposition 9.A.3 established part of this result: it showed that if condition (P1) holds,
then Xt ∼ Poisson(λt) for all t > 0. But the present result says much more: a “pure birth
process” whose waiting times are i.i.d. exponentials is not only Poisson distributed at each
time t; in fact, all increments of the process are Poisson distributed, and nonoverlapping
increments are stochastically independent. Conversely, if one begins with the assumption
that the increments of the process are independent and Poisson, then the waits between
jumps must be i.i.d. and exponential. § 9.C Kurtz’s Theorem in Discrete Time To obtain a deterministic approximation theorem for discrete time Markov chains, we
must assume that the length of a period with respect to clock time becomes vanishingly
small as the population size N increases. Let dN be the duration of a period under
N
the Markov chain {Xt }, so that this chain is initialized at time 0 and has transitions at
N
N
N
times dN , 2dN , . . . . We can deﬁne {Xt } at all times in [0, ∞) by letting Xt = XkdN when
N
N
t ∈ [kdN , (k + 1)dN ), making each sample path {Xt (ω)} = {Xt (ω)}t≥0 a step function whose
jumps occur at multiples of dN .
Theorem 9.C.1 (Kurtz’s Theorem in Discrete Time). Suppose that lim_{N→∞} d^N = 0. Define the
distributions of the random variables ζ_x^N by

    P(ζ_x^N = z) = P(X_{(k+1)d^N}^N = x + z | X_{kd^N}^N = x),

and define the functions V^N, A^N, and A_δ^N by

    V^N(x) = (1/d^N) E[ζ_x^N],    A^N(x) = (1/d^N) E|ζ_x^N|,    and    A_δ^N(x) = (1/d^N) E[|ζ_x^N| 1_{{|ζ_x^N| > δ}}].

Then the conclusions of Theorem 9.2.1 hold for the sequence of Markov chains {{X_t^N}}_{N=N_0}^∞.
9.N Notes

Section 9.2. Kurtz’s Theorem first appeared in Kurtz (1970). For an advanced textbook
treatment and further references, see Ethier and Kurtz (1986, Chapter 11).
The ﬁrst formal results in the game theory literature akin to Theorem 9.2.3 focus on
speciﬁc revision protocols. Boylan (1995) shows how evolutionary processes based on
random matching schemes converge to deterministic trajectories when the population
size grows large. Binmore et al. (1995), Börgers and Sarin (1997), and Schlag (1998)
consider particular models of evolution that converge to the replicator dynamic. Binmore
and Samuelson (1999) prove a general deterministic approximation result for discrete time
models of evolution under a somewhat restrictive timing assumption. Sandholm (2003)
uses Kurtz’s Theorem to prove a general finite horizon convergence result. This paper also
shows that after spatial normalization, the behavior of {X_t^N} near rest points of the mean
dynamic can be approximated by a diffusion. The strongest deterministic approximation
results can be found in Benaïm and Weibull (2003). These authors establish an exponential
bound on the probability of deviations of {X_t^N} from solutions of the mean dynamic. They
also establish results relating the infinite horizon behavior of {X_t^N} to the mean dynamic;
we introduce these results in Chapter 10.
While the results described above rely on the assumption that the mean dynamic is
Lipschitz continuous, we conjecture that analogous results can be established in more
general settings—in particular, when the mean dynamic is not a diﬀerential equation at
all, but rather a diﬀerential inclusion. For related results in a somewhat diﬀerent context,
see Benaïm et al. (2005).
While we have focused here on the evolution of the distribution of behavior, Tanabe
(2006), building on work of Tanaka (1983) and Shiga and Tanaka (1985), proves results
about the evolution of the strategy proﬁle: i.e., about the joint distribution of individual
agents’ choice trajectories. Suppose that at time 0, the N agents’ choices of strategies from
S are i.i.d. Then as N grows large, each agent’s random choice trajectory converges in
distribution to ν, the distribution of a certain time-inhomogeneous Markov process—a
so-called McKean process—taking values in S. Furthermore, the joint distribution of any
k individuals’ choice trajectories converges to the k-fold product of the measure ν. This
means that the independence of the k individuals’ choices at time 0 persists over any
ﬁnite time span, a phenomenon sometimes called propagation of chaos. One can further
show that the empirical distribution of the N agents’ choice trajectories also converges
to the measure ν. (Since ν is the (limiting) distribution of each individual’s stochastic
choice trajectory, this result is a generalization of the Glivenko-Cantelli Theorem.) Now
the time t marginal of this empirical distribution is none other than our state variable X_t^N,
so Theorem 9.2.3 tells us that the collection of time t marginals of ν is none other than
the solution to our mean dynamic (M). For an overview of the mathematical literature
relevant to this discussion, see Sznitman (1991).
Appendices 9.A and 9.B: Billingsley (1995) and Durrett (2005) are excellent graduate
level probability texts. The former book provides more thorough coverage of the topics
considered in this chapter, and contains an especially clear treatment of the Poisson
process. Norris (1997), Brémaud (1999), and Stroock (2005) are all excellent books on
Markov chains and Markov processes. The ﬁrst of these is at an undergraduate level, the
last at a graduate level, and the middle one somewhere in between.
Appendix 9.C: This section follows Kurtz (1970).

CHAPTER TEN

Infinite Horizon Behavior and Equilibrium Selection

10.0 Introduction

To be added.

BIBLIOGRAPHY

Abraham, R. and Robbin, J. (1967). Transversal Mappings and Flows. W. A. Benjamin, New
York.
Akin, E. (1979). The Geometry of Population Genetics. Springer, Berlin.
Akin, E. (1980). Domination or equilibrium. Mathematical Biosciences, 50:239–250.
Akin, E. (1990). The diﬀerential geometry of population genetics and evolutionary games.
In Lessard, S., editor, Mathematical and Statistical Developments of Evolutionary Theory,
pages 1–93. Kluwer, Dordrecht.
Akin, E. (1993). The General Topology of Dynamical Systems. American Mathematical Society,
Providence, RI.
Akin, E. and Losert, V. (1984). Evolutionary dynamics of zerosum games. Journal of
Mathematical Biology, 20:231–258.
Anderson, S. P., de Palma, A., and Thisse, J.F. (1992). Discrete Choice Theory of Product
Diﬀerentiation. MIT Press, Cambridge.
Arneodo, A., Coullet, P., and Tresser, C. (1980). Occurrence of strange attractors in three-dimensional Volterra equations. Physics Letters, 79A:259–263.
Aubin, J.-P. (1991). Viability Theory. Birkhäuser, Boston.
Aubin, J.P. and Cellina, A. (1984). Diﬀerential Inclusions. Springer, Berlin.
Avriel, M. (1976). Nonlinear Programming: Analysis and Methods. PrenticeHall, Englewood
Cliﬀs, NJ.
Balkenborg, D. and Schlag, K. H. (2001). Evolutionarily stable sets. International Journal of
Game Theory, 29:571–595.
Beckmann, M., McGuire, C. B., and Winsten, C. B. (1956). Studies in the Economics of
Transportation. Yale University Press, New Haven.
Benaïm, M. (1998). Recursive algorithms, urn processes, and the chaining number of chain
recurrent sets. Ergodic Theory and Dynamical Systems, 18:53–87.
Benaïm, M. (1999). Dynamics of stochastic approximation algorithms. In Azéma, J. et al.,
editors, Séminaire de Probabilités XXXIII, pages 1–68. Springer, Berlin.
Benaïm, M. and Hirsch, M. W. (1999). Mixed equilibria and dynamical systems arising
from fictitious play in perturbed games. Games and Economic Behavior, 29:36–72.
Benaïm, M., Hofbauer, J., and Hopkins, E. (2006a). Learning in games with unstable
equilibria. Unpublished manuscript, Université de Neuchâtel, University of Vienna,
and University of Edinburgh.
Benaïm, M., Hofbauer, J., and Sorin, S. (2005). Stochastic approximation and differential
inclusions. SIAM Journal on Control and Optimization, 44:328–348.
Benaïm, M., Hofbauer, J., and Sorin, S. (2006b). Stochastic approximation and differential
inclusions II: Applications. Mathematics of Operations Research, 31:673–695.
Benaïm, M. and Weibull, J. W. (2003). Deterministic approximation of stochastic evolution
in games. Econometrica, 71:873–903.
Berger, U. (2007). Two more classes of games with the continuoustime ﬁctitious play
property. Games and Economic Behavior, 60:247–261.
Berger, U. and Hofbauer, J. (2006). Irrational behavior in the Brownvon NeumannNash
dynamics. Games and Economic Behavior, 56:1–6.
Bhatia, N. P. and Szegő, G. P. (1970). Stability Theory of Dynamical Systems. Springer, Berlin.
Billingsley, P. (1995). Probability and Measure. Wiley, New York, third edition.
Binmore, K. and Samuelson, L. (1999). Evolutionary drift and equilibrium selection.
Review of Economic Studies, 66:363–393.
Binmore, K., Samuelson, L., and Vaughan, R. (1995). Musical chairs: Modeling noisy
evolution. Games and Economic Behavior, 11:1–35.
Bishop, D. T. and Cannings, C. (1978). A generalised war of attrition. Journal of Theoretical
Biology, 70:85–124.
Björnerstedt, J. and Weibull, J. W. (1996). Nash equilibrium and evolution by imitation. In
Arrow, K. J. et al., editors, The Rational Foundations of Economic Behavior, pages 155–181.
St. Martin’s Press, New York.
Bomze, I. M. (1986). Non-cooperative two-person games in biology: A classification.
International Journal of Game Theory, 15:31–57.
Bomze, I. M. (1990). Dynamical aspects of evolutionary stability. Monatshefte für Mathematik, 110:189–206.
Bomze, I. M. (1991). Cross entropy minimization in uninvadable states of complex populations. Journal of Mathematical Biology, 30:73–87.
Bomze, I. M. (2002). Regularity versus degeneracy in dynamics, games, and optimization:
A uniﬁed approach to diﬀerent aspects. SIAM Review, 44:394–414.
Bomze, I. M. and Pötscher, B. M. (1989). Game Theoretical Foundations of Evolutionary
Stability. Springer, Berlin.
Börgers, T. and Sarin, R. (1997). Learning through reinforcement and the replicator dynamics. Journal of Economic Theory, 77:1–14.
Boylan, R. T. (1995). Continuous approximation of dynamical systems with randomly
matched individuals. Journal of Economic Theory, 66:615–625.
Braess, D. (1968). Über ein Paradoxon der Verkehrsplanung. Unternehmensforschung,
12:258–268.
Brémaud, P. (1999). Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues.
Springer, New York.
Brown, G. W. and von Neumann, J. (1950). Solutions of games by diﬀerential equations. In
Kuhn, H. W. and Tucker, A. W., editors, Contributions to the Theory of Games I, volume 24
of Annals of Mathematics Studies, pages 73–79. Princeton University Press, Princeton.
Bulow, J. and Klemperer, P. (1999). The generalized war of attrition. American Economic
Review, 89:175–189.
Burdett, K. and Judd, K. L. (1983). Equilibrium price dispersion. Econometrica, 51:955–969.
Camerer, C. (2003). Behavioral Game Theory. Princeton University Press, Princeton.
Conley, C. (1978). Isolated Invariant Sets and the Morse Index. American Mathematical
Society, Providence, RI.
Cooper, R. W. (1999). Coordination Games: Complementarities and Macroeconomics. Cambridge University Press, Cambridge.
Cressman, R. (1992). The Stability Concept of Evolutionary Game Theory: A Dynamic Approach.
Springer, Berlin.
Cressman, R. (1995). Evolutionary game theory with two groups of individuals. Games
and Economic Behavior, 11:237–253.
Cressman, R. (1996). Frequencydependent stability for twospecies interactions. Theoretical Population Biology, 49:189–210.
383 Cressman, R. (1997). Local stability of smooth selection dynamics for normal form games.
Mathematical Social Sciences, 34:1–19.
Cressman, R. (2003). Evolutionary Dynamics and Extensive Form Games. MIT Press, Cambridge.
Cressman, R., Garay, J., and Hofbauer, J. (2001). Evolutionary stability concepts for nspecies frequencydependent interactions. Journal of Theoretical Biology, 211:1–10.
Crouzeix, J.P. (1998). Characterizations of generalized convexity and generalized monotonicity: A survey. In Crouzeix, J.P. et al., editors, Generalized Convexity, Generalized
Monotonicity: Recent Results, pages 237–256. Kluwer, Dordrecht.
Crow, J. F. and Kimura, M. (1970). An Introduction to Population Genetics Theory. Harper
and Row, New York.
Dafermos, S. and Sparrow, F. T. (1969). The traﬃc assignment problem for a general
network. Journal of Research of the National Bureau of Standards B, 73:91–118.
Dawkins, R. (1976). The Selﬁsh Gene. Oxford University Press, Oxford.
Dawkins, R. (1982). The Extended Phenotye. Oxford University Press, Oxford.
Demichelis, S. and Germano, F. (2000). On the indices of zeros of Nash ﬁelds. Journal of
Economic Theory, 94:192–217.
Demichelis, S. and Germano, F. (2002). On (un)knots and dynamics in games. Games and
Economic Behavior, 41:46–60.
Demichelis, S. and Ritzberger, K. (2003). From evolutionary to strategic stability. Journal
of Economic Theory, 113:51–75.
Dupuis, P. and Nagurney, A. (1993). Dynamical systems and variational inequalities.
Annals of Operations Research, 44:9–42.
Durrett, R. (2005). Probability: Theory and Examples. BrooksCole, Belmont, CA, third
edition.
Eigen, M. and Schuster, P. (1979). The Hypercycle: A Principle of Natural SelfOrganization.
Springer, Berlin.
Ellison, G. and Fudenberg, D. (2000). Learning puriﬁed mixed equilibria. Journal of
Economic Theory, 90:84–115.
Ely, J. C. and Sandholm, W. H. (2005). Evolution in Bayesian games I: Theory. Games and
Economic Behavior, 53:83–109.
Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence.
Wiley, New York.
384 Friedberg, S. H., Insel, A. J., and Spence, L. E. (1989). Linear Algebra. PrenticeHall,
Englewood Cliﬀs, NJ, second edition.
Friedman, D. (1991). Evolutionary games in economics. Econometrica, 59:637–666.
Fudenberg, D. and Kreps, D. M. (1993). Learning mixed equilibria. Games and Economic
Behavior, 5:320–367.
Fudenberg, D. and Levine, D. K. (1998). Theory of Learning in Games. MIT Press, Cambridge.
Fudenberg, D. and Tirole, J. (1991). Game Theory. MIT Press, Cambridge.
Gaunersdorfer, A. and Hofbauer, J. (1995). Fictitious play, Shapley polygons, and the
replicator equation. Games and Economic Behavior, 11:279–303.
Gilboa, I. and Matsui, A. (1991). Social stability and equilibrium. Econometrica, 59:859–867.
Gordon, W. B. (1972). On the diﬀeomorphisms of euclidean space. American Mathematical
Monthly, 79:755–759.
Guckenheimer, J. and Holmes, P. (1983). Nonlinear Oscillations, Dynamical Systems, and
Bifurcations of Vector Fields. Springer, Berlin.
Haigh, J. (1975). Game theory and evolution. Advances in Applied Probability, 7:8–11.
Hamilton, W. D. (1967). Extraordinary sex ratios. Science, 156:477–488.
Hamilton, W. D. (1996). Narrow Roads of Gene Land, volume 1. W. H. Freeman/Spektrum,
Oxford.
Harker, P. T. and Pang, J.S. (1990). Finitedimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms, and applications.
Mathematical Programming, 48:161–220.
Harsanyi, J. C. (1973). Games with randomly disturbed payoﬀs: A new rationale for
mixedstrategy equilibrium points. International Journal of Game Theory, 2:1–23.
Hart, S. and MasColell, A. (2001). A general class of adaptive strategies. Journal of
Economic Theory, 98:26–54.
Hart, S. and MasColell, A. (2003). Uncoupled dynamics do not lead to Nash equilibrium.
American Economic Review, 93:1830–1836.
Hartman, P. (1964). Ordinary Diﬀerential Equations. Wiley, New York.
Henry, C. (1973). An existence theorem for a class of diﬀerential equations with multivalued righthand side. Journal of Mathematical Analysis and Applications, 41:179–186.
Hewitt, E. and Stromberg, K. (1965). Real and Abstract Analysis. Springer, Berlin.
385 Hines, W. G. S. (1980). Three characterizations of population strategy stability. Journal of
Applied Probability, 17:333–340.
Hines, W. G. S. (1987). Evolutionary stable strategies: A review of basic theory. Theoretical
Population Biology, 31:195–272.
Hiriart-Urruty, J.-B. and Lemaréchal, C. (2001). Fundamentals of Convex Analysis. Springer,
Berlin.
Hirsch, M. W. (1988). Systems of diﬀerential equations that are competitive or cooperative
III: Competing species. Nonlinearity, 1:51–71.
Hirsch, M. W. and Smale, S. (1974). Diﬀerential Equations, Dynamical Systems, and Linear
Algebra. Academic Press, San Diego.
Hirsch, M. W., Smale, S., and Devaney, R. L. (2004). Diﬀerential Equations, Dynamical
Systems, and an Introduction to Chaos. Elsevier, Amsterdam.
Hofbauer, J. (1981). On the occurrence of limit cycles in the VolterraLotka equation.
Nonlinear Analysis, 5:1003–1007.
Hofbauer, J. (1985). The selection mutation equation. Journal of Mathematical Biology,
23:41–53.
Hofbauer, J. (1995a). Imitation dynamics for games. Unpublished manuscript, University
of Vienna.
Hofbauer, J. (1995b). Stability for the best response dynamics. Unpublished manuscript,
University of Vienna.
Hofbauer, J. (1996). Evolutionary dynamics for bimatrix games: A Hamiltonian system?
Journal of Mathematical Biology, 34:675–688.
Hofbauer, J. (2000). From Nash and Brown to Maynard Smith: Equilibria, dynamics, and
ESS. Selection, 1:81–88.
Hofbauer, J. and Hopkins, E. (2005). Learning in perturbed asymmetric games. Games and
Economic Behavior, 52:133–152.
Hofbauer, J., MalletParet, J., and Smith, H. L. (1991). Stable periodic solutions for the
hypercycle system. Journal of Dynamics and Diﬀerential Equations, 3:423–436.
Hofbauer, J. and Sandholm, W. H. (2002). On the global convergence of stochastic ﬁctitious
play. Econometrica, 70:2265–2294.
Hofbauer, J. and Sandholm, W. H. (2006). Survival of dominated strategies under evolutionary dynamics. Unpublished manuscript, University of Vienna and University of
Wisconsin.
386 Hofbauer, J. and Sandholm, W. H. (2007). Evolution in games with randomly disturbed
payoﬀs. Journal of Economic Theory, 132:47–69.
Hofbauer, J. and Sandholm, W. H. (2008). Stable games and their dynamics. Unpublished
manuscript, University of Vienna and University of Wisconsin.
Hofbauer, J., Schuster, P., and Sigmund, K. (1979). A note on evolutionarily stable strategies
and game dynamics. Journal of Theoretical Biology, 81:609–612.
Hofbauer, J., Schuster, P., and Sigmund, K. (1981). Competition and cooperation in catalytic
selfreplication. Journal of Mathematical Biology, 11:155–168.
Hofbauer, J. and Sigmund, K. (1988). Theory of Evolution and Dynamical Systems. Cambridge
University Press, Cambridge.
Hofbauer, J. and Sigmund, K. (1998). Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge.
Hofbauer, J. and Sigmund, K. (2003). Evolutionary game dynamics. Bulletin of the American
Mathematical Society (New Series), 40:479–519.
Hofbauer, J., Sorin, S., and Viossat, Y. (2007). Time average replicator and best reply
dynamics. Unpublished manuscript, University of Vienna.
Hofbauer, J. and Swinkels, J. M. (1996). A universal Shapley example. Unpublished
manuscript, University of Vienna and Northwestern University.
Hofbauer, J. and Weibull, J. W. (1996). Evolutionary selection against dominated strategies.
Journal of Economic Theory, 71:558–573.
Hopkins, E. (1999). A note on best response dynamics. Games and Economic Behavior,
29:138–150.
Hopkins, E. (2002). Two competing models of how people learn in games. Econometrica,
70:2141–2166.
Hopkins, E. and Seymour, R. M. (2002). The stability of price dispersion under seller and
consumer learning. International Economic Review, 43:1157–1190.
Horn, R. A. and Johnson, C. R. (1985). Matrix Analysis. Cambridge University Press,
Cambridge.
Imhof, L. A. (2005). The longrun behavior of the stochastic replicator dynamics. Annals
of Applied Probability, 15:1019–1045.
Jordan, J. S. (1993). Three problems in learning mixedstrategy Nash equilibria. Games and
Economic Behavior, 5:368–386. 387 Kimura, M. (1958). On the change of population ﬁtness by natural selection. Heredity,
12:145–167.
Kojima, F. and Takahashi, S. (2007). Anticoordination games and dynamic stability.
International Game Theory Review, 9:667–688.
Krantz, S. G. and Parks, H. R. (1999). The Geometry of Domains in Space. Birkh¨ user, Boston.
a
Kuhn, H. W. (2003). Lectures on the Theory of Games. Princeton University Press, Princeton.
Kurtz, T. G. (1970). Solutions of ordinary diﬀerential equations as limits of pure jump
Markov processes. Journal of Applied Probability, 7:49–58.
Lahkar, R. (2007). The dynamic instability of dispersed price equilibria. Unpublished
manuscript, University College London.
Lahkar, R. and Sandholm, W. H. (2008). The projection dynamic and the geometry of
population games. Games and Economic Behavior, forthcoming.
Lang, S. (1997). Undergraduate Analysis. Springer, New York, second edition.
Lax, P. D. (2007). Linear Algebra and Its Applications. Wiley, Hoboken, NJ, second edition.
Lotka, A. J. (1920). Undamped oscillation derived from the law of mass action. Journal of
the American Chemical Society, 42:1595–1598.
Luce, R. D. and Raiﬀa, H. (1957). Games and Decisions: Introduction and Critical Survey.
Wiley, New York.
Marsden, J. E. and Ratiu, T. S. (2002). Introduction to Mechanics and Symmetry: A Basic
Exposition of Classical Mechanical Systems. Springer, Berlin, second edition.
Matsui, A. (1992). Best response dynamics and socially stable strategies. Journal of Economic
Theory, 57:343–362.
Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge University Press,
Cambridge.
Maynard Smith, J. and Price, G. R. (1973). The logic of animal conﬂict. Nature, 246:15–18.
McFadden, D. (1981). Econometric models of probabilistic choice. In Manski, C. F. and
McFadden, D., editors, Structural Analysis of Discrete Data with Econometric Applications,
pages 198–272. MIT Press, Cambridge.
McKelvey, R. D. and Palfrey, T. R. (1995). Quantal response equilibria for normal form
games. Games and Economic Behavior, 10:6–38.
Milgrom, P. and Roberts, J. (1990). Rationalizability, learning, and equilibrium in games
with strategic complementarities. Econometrica, 58:1255–1278.
388 Milgrom, P. and Shannon, C. (1994). Monotone comparative statics. Econometrica, 62:157–
180.
Milnor, J. W. (1965). Topology from the Diﬀerentiable Viewpoint. Princeton University Press,
Princeton.
Minty, G. J. (1967). On the generalization of a direct method of the calculus of variations.
Bulletin of the American Mathematical Society, 73:315–321.
Monderer, D. and Sela, A. (1997). Fictitious play and nocycling conditions. Unpublished
manuscript, The Technion.
Monderer, D. and Shapley, L. S. (1996). Potential games. Games and Economic Behavior,
14:124–143.
Nachbar, J. H. (1990). ’Evolutionary’ selection dynamics in games: Convergence and limit
properties. International Journal of Game Theory, 19:59–89.
Nagurney, A. (1999). Network Economics: A Variational Inequality Approach. Kluwer, Dordrecht, second edition.
Nagurney, A. and Zhang, D. (1996). Projected Dynamical Systems and Variational Inequalities
with Applications. Kluwer, Dordrecht.
Nagurney, A. and Zhang, D. (1997). Projected dynamical systems in the formulation,
stability analysis, and computation of ﬁxed demand traﬃc network equilibria. Transportation Science, 31:147–158.
Nash, J. F. (1951). Noncooperative games. Annals of Mathematics, 54:287–295.
Nemytskii, V. V. and Stepanov, V. V. (1960). Qualitative Theory of Diﬀerential Equations.
Princeton University Press, Princeton.
Norris, J. R. (1997). Markov Chains. Cambridge University Press, Cambridge.
Ok, E. A. (2007). Real Analysis with Economic Applications. Princeton University Press,
Princeton.
Patriksson, M. (1994). The Traﬃc Assignment Problem: Models and Methods. VSP, Utrecht.
Pohley, H.J. and Thomas, B. (1983). Nonlinear ESS models and frequency dependent
selection. BioSystems, 16:87–100.
Ritzberger, K. (1994). The theory of normal form games from the diﬀerentiable viewpoint.
International Journal of Game Theory, 23:207–236.
Ritzberger, K. and Weibull, J. W. (1995). Evolutionary selection in normal form games.
Econometrica, 63:1371–1399. 389 Roberts, A. W. and Varberg, D. E. (1973). Convex Functions. Academic Press, New York.
Robinson, C. (1995). Dynamical Systems: Stability, Symbolic Dynamics, and Chaos. CRC
Press, Boca Raton, FL.
Rockafellar, R. T. (1970). Convex Analysis. Princeton University Press, Princeton.
Rosenthal, R. W. (1973). A class of games possessing pure strategy Nash equilibria.
International Journal of Game Theory, 2:65–67.
Roughgarden, T. (2005). Selﬁsh Routing and the Price of Anarchy. MIT Press, Cambridge.
Roughgarden, T. and Tardos, É. (2002). How bad is selfish routing? Journal of the ACM,
49:236–259.
Roughgarden, T. and Tardos, É. (2004). Bounding the inefficiency of equilibria in
nonatomic congestion games. Games and Economic Behavior, 49:389–403.
Samuelson, L. and Zhang, J. (1992). Evolutionary stability in asymmetric games. Journal
of Economic Theory, 57:363–391.
Sandholm, W. H. (2001). Potential games with continuous player sets. Journal of Economic
Theory, 97:81–108.
Sandholm, W. H. (2002). Evolutionary implementation and congestion pricing. Review of
Economic Studies, 69:81–108.
Sandholm, W. H. (2003). Evolution and equilibrium under inexact information. Games
and Economic Behavior, 44:343–378.
Sandholm, W. H. (2005a). Excess payoff dynamics and other well-behaved evolutionary
dynamics. Journal of Economic Theory, 124:149–170.
Sandholm, W. H. (2005b). Negative externalities and evolutionary implementation. Review
of Economic Studies, 72:885–915.
Sandholm, W. H. (2006a). Pairwise comparison dynamics and evolutionary foundations
for Nash equilibrium. Unpublished manuscript, University of Wisconsin.
Sandholm, W. H. (2006b). A probabilistic characterization of integrability for game dynamics. Unpublished manuscript, University of Wisconsin.
Sandholm, W. H. (2007a). Evolution in Bayesian games II: Stability of puriﬁed equilibria.
Journal of Economic Theory, 136:641–667.
Sandholm, W. H. (2007b). Pigouvian pricing and stochastic evolutionary implementation.
Journal of Economic Theory, 132:367–382.
Sandholm, W. H. (2008a). Local stability under evolutionary game dynamics. Unpublished
manuscript, University of Wisconsin.
Sandholm, W. H. (2008b). Potential functions for normal form games and for population
games. Unpublished manuscript, University of Wisconsin.
Sandholm, W. H., Dokumacı, E., and Lahkar, R. (2008). The projection dynamic and the
replicator dynamic. Games and Economic Behavior, forthcoming.
Sato, Y., Akiyama, E., and Farmer, J. D. (2002). Chaos in learning a simple two-person
game. Proceedings of the National Academy of Sciences, 99:4748–4751.
Schlag, K. H. (1998). Why imitate, and if so, how? A boundedly rational approach to
multi-armed bandits. Journal of Economic Theory, 78:130–156.
Schuster, P. and Sigmund, K. (1983). Replicator dynamics. Journal of Theoretical Biology,
100:533–538.
Schuster, P., Sigmund, K., Hofbauer, J., Gottlieb, R., and Merz, P. (1981a). Self-regulation of
behaviour in animal societies III: Games between two populations with self-interaction.
Biological Cybernetics, 40:16–25.
Schuster, P., Sigmund, K., Hofbauer, J., and Wolff, R. (1981b). Self-regulation of behaviour
in animal societies I: Symmetric contests. Biological Cybernetics, 40:1–8.
Schuster, P., Sigmund, K., Hofbauer, J., and Wolff, R. (1981c). Self-regulation of behaviour
in animal societies II: Games between two populations without self-interaction. Biological
Cybernetics, 40:9–15.
Selten, R. (1980). A note on evolutionarily stable strategies in asymmetric animal conﬂicts.
Journal of Theoretical Biology, 84:93–101.
Shahshahani, S. (1979). A new mathematical framework for the study of linkage and
selection. Memoirs of the American Mathematical Society, 211.
Shapley, L. S. (1964). Some topics in two person games. In Dresher, M., Shapley, L. S.,
and Tucker, A. W., editors, Advances in Game Theory, volume 52 of Annals of Mathematics
Studies, pages 1–28. Princeton University Press, Princeton.
Sheffi, Y. (1985). Urban Transportation Networks: Equilibrium Analysis with Mathematical
Programming Methods. Prentice-Hall, Englewood Cliffs, NJ.
Shiga, T. and Tanaka, H. (1985). Central limit theorem for a system of Markovian particles
with mean field interactions. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte
Gebiete, 69:439–459.
Skyrms, B. (1990). The Dynamics of Rational Deliberation. Harvard University Press, Cambridge.
Skyrms, B. (1992). Chaos in game dynamics. Journal of Logic, Language, and Information,
1:111–130.
Slade, M. E. (1994). What does an oligopoly maximize? Journal of Industrial Economics,
42:45–51.
Smale, S. (1967). Diﬀerentiable dynamical systems. Bulletin of the American Mathematical
Society, 73:747–817.
Smirnov, G. V. (2002). Introduction to the Theory of Diﬀerential Inclusions. American Mathematical Society, Providence, RI.
Smith, H. L. (1995). Monotone Dynamical Systems: An Introduction to the Theory of Competitive
and Cooperative Systems. American Mathematical Society, Providence, RI.
Smith, M. J. (1984). The stability of a dynamic model of traﬃc assignment—an application
of a method of Lyapunov. Transportation Science, 18:245–252.
Sparrow, C., van Strien, S., and Harris, C. (2008). Fictitious play in 3 × 3 games: The
transition between periodic and chaotic behaviour. Games and Economic Behavior, 63:259–
291.
Stroock, D. W. (2005). An Introduction to Markov Processes. Springer, Berlin.
Swinkels, J. M. (1992). Evolutionary stability with equilibrium entrants. Journal of Economic
Theory, 57:306–332.
Swinkels, J. M. (1993). Adjustment dynamics and rational play in games. Games and
Economic Behavior, 5:455–484.
Sznitman, A. (1991). Topics in propagation of chaos. In Hennequin, P. L., editor, École
d'Été de Probabilités de Saint-Flour XIX, 1989, pages 167–251. Springer, Berlin.
Tanabe, Y. (2006). The propagation of chaos for interacting individuals in a large population. Mathematical Social Sciences, 51:125–152.
Tanaka, H. (1983). Some probabilistic problems in the spatially homogeneous Boltzmann
equation. In Kallianpur, G., editor, Theory and Application of Random Fields (Bangalore,
1982), pages 258–267. Springer, Berlin.
Taylor, P. D. (1979). Evolutionarily stable strategies with two types of players. Journal of
Applied Probability, 16:76–83.
Taylor, P. D. and Jonker, L. (1978). Evolutionarily stable strategies and game dynamics.
Mathematical Biosciences, 40:145–156.
Thomas, B. (1985). On evolutionarily stable sets. Journal of Mathematical Biology, 22:105–
115.
Topkis, D. (1979). Equilibrium points in non-zero-sum n-person submodular games. SIAM
Journal on Control and Optimization, 17:773–787.
Topkis, D. (1998). Supermodularity and Complementarity. Princeton University Press, Princeton.
Ui, T. (2000). A Shapley value representation of potential games. Games and Economic
Behavior, 31:121–135.
van Damme, E. (1991). Stability and Perfection of Nash Equilibria. Springer, Berlin, second
edition.
Vickers, G. T. and Cannings, C. (1988). Patterns of ESS’s I. Journal of Theoretical Biology,
132:381–510.
Vives, X. (1990). Nash equilibrium with strategic complementarities. Journal of Mathematical Economics, 19:305–321.
Vives, X. (2000). Oligopoly Pricing: Old Ideas and New Tools. MIT Press, Cambridge.
Vives, X. (2005). Complementarities and games: New developments. Journal of Economic
Literature, 43:437–479.
Volterra, V. (1931). Leçons sur la Théorie Mathématique de la Lutte pour la Vie. Gauthier-Villars,
Paris.
Weibull, J. W. (1995). Evolutionary Game Theory. MIT Press, Cambridge.
Weibull, J. W. (1996). The mass action interpretation. Excerpt from ’The work of John
Nash in game theory: Nobel Seminar, December 8, 1994’. Journal of Economic Theory,
69:165–171.
Zeeman, E. C. (1980). Population dynamics from game theory. In Nitecki, Z. and Robinson,
C., editors, Global Theory of Dynamical Systems (Evanston, 1979), number 819 in Lecture
Notes in Mathematics, pages 472–497. Springer, Berlin.