Population Games and Evolutionary Dynamics

William H. Sandholm

April 29, 2008

[Frontispiece: A chaotic attractor of the replicator dynamic]

CONTENTS

0  Introduction

Part I  Population Games

1  Population Games
   1.0  Introduction
   1.1  Population Games
      1.1.1  Populations, Strategies, and States
      1.1.2  Payoffs
      1.1.3  Best Responses and Nash Equilibria
      1.1.4  Prelude to Evolutionary Dynamics
   1.2  Examples
      1.2.1  Random Matching in Normal Form Games
      1.2.2  Congestion Games
      1.2.3  Two Simple Externality Models
   1.3  The Geometry of Population Games and Nash Equilibria
      1.3.1  Drawing Two-Strategy Games
      1.3.2  Displacement Vectors and Tangent Spaces
      1.3.3  Orthogonal Projections
      1.3.4  Drawing Three-Strategy Games
      1.3.5  Tangent Cones and Normal Cones
      1.3.6  Normal Cones and Nash Equilibria
   1.A  Affine Spaces, Tangent Spaces, and Orthogonal Projections
      1.A.1  Affine Spaces
      1.A.2  Affine Hulls of Convex Sets
      1.A.3  Orthogonal Projections
   1.B  The Moreau Decomposition Theorem
   1.N  Notes

2  Potential Games, Stable Games, and Supermodular Games
   2.0  Introduction
   2.1  Full Potential Games
      2.1.1  Full Population Games
      2.1.2  Definition and Characterization
      2.1.3  Examples
      2.1.4  Nash Equilibria of Full Potential Games
      2.1.5  The Geometry of Nash Equilibrium in Full Potential Games
      2.1.6  Efficiency in Homogeneous Full Potential Games
      2.1.7  Inefficiency Bounds for Congestion Games
   2.2  Potential Games
      2.2.1  Motivating Examples
      2.2.2  Definition, Characterizations, and Examples
      2.2.3  Potential Games and Full Potential Games
      2.2.4  Passive Games and Constant Games
   2.3  Stable Games
      2.3.1  Definition
      2.3.2  Examples
      2.3.3  Invasion
      2.3.4  Global Neutral Stability and Global Evolutionary Stability
      2.3.5  Nash Equilibrium and Global Neutral Stability in Stable Games
   2.4  Supermodular Games
      2.4.1  Definition
      2.4.2  Examples
      2.4.3  Best Response Monotonicity in Supermodular Games
      2.4.4  Nash Equilibria of Supermodular Games
   2.A  Multivariate Calculus
      2.A.1  Univariate Calculus
      2.A.2  The Derivative as a Linear Map
      2.A.3  Differentiation as a Linear Operation
      2.A.4  The Product Rule and the Chain Rule
      2.A.5  Homogeneity and Euler's Theorem
      2.A.6  Higher Order Derivatives
      2.A.7  The Whitney Extension Theorem
      2.A.8  Vector Integration and the Fundamental Theorem of Calculus
      2.A.9  Potential Functions and Integrability
   2.B  Affine Calculus
      2.B.1  Linear Forms and the Riesz Representation Theorem
      2.B.2  Dual Characterizations of Multiples of Linear Forms
      2.B.3  Derivatives of Functions on Affine Spaces
      2.B.4  Affine Integrability
   2.N  Notes

Part II  Deterministic Evolutionary Dynamics

3  Revision Protocols and Evolutionary Dynamics
   3.0  Introduction
   3.1  Revision Protocols and Mean Dynamics
      3.1.1  Revision Protocols
      3.1.2  Mean Dynamics
      3.1.3  Target Protocols and Target Dynamics
   3.2  Examples
   3.3  Evolutionary Dynamics
   3.A  Ordinary Differential Equations
      3.A.1  Basic Definitions
      3.A.2  Existence, Uniqueness, and Continuity of Solutions
      3.A.3  Ordinary Differential Equations on Compact Convex Sets
   3.N  Notes

4  Deterministic Dynamics: Families and Properties
   4.0  Introduction
   4.1  Principles for Evolutionary Modeling
   4.2  Desiderata for Revision Protocols and Evolutionary Dynamics
      4.2.1  Limited Information
      4.2.2  Incentives and Aggregate Behavior
   4.3  Families of Evolutionary Dynamics
   4.4  Imitative Dynamics
      4.4.1  Definition
      4.4.2  Examples
      4.4.3  Biological Derivations of the Replicator Dynamic
      4.4.4  Extinction and Invariance
      4.4.5  Monotone Percentage Growth Rates and Positive Correlation
      4.4.6  Rest Points and Restricted Equilibria
   4.5  Excess Payoff Dynamics
      4.5.1  Definition and Interpretation
      4.5.2  Incentives and Aggregate Behavior
   4.6  Pairwise Comparison Dynamics
      4.6.1  Definition
      4.6.2  Incentives and Aggregate Behavior
      4.6.3  Desiderata Revisited
   4.7  Multiple Revision Protocols and Combined Dynamics
   4.N  Notes

5  Best Response and Projection Dynamics
   5.0  Introduction
   5.1  The Best Response Dynamic
      5.1.1  Definition and Examples
      5.1.2  Construction and Properties of Solution Trajectories
      5.1.3  Incentive Properties
   5.2  Perturbed Best Response Dynamics
      5.2.1  Revision Protocols and Mean Dynamics
      5.2.2  Perturbed Optimization: A Representation Theorem
      5.2.3  Logit Choice and the Logit Dynamic
      5.2.4  Perturbed Incentive Properties via Virtual Payoffs
   5.3  The Projection Dynamic
      5.3.1  Definition
      5.3.2  Solution Trajectories
      5.3.3  Incentive Properties
      5.3.4  Revision Protocols and Connections with the Replicator Dynamic
   5.A  Differential Inclusions
      5.A.1  Basic Theory
      5.A.2  Differential Equations Defined by Projections
   5.B  The Legendre Transform
      5.B.1  Legendre Transforms of Functions on Open Intervals
      5.B.2  Legendre Transforms of Functions on Multidimensional Domains
   5.C  Perturbed Optimization
      5.C.1  Proof of the Representation Theorem
      5.C.2  Additional Results
   5.N  Notes

Part III  Convergence and Nonconvergence

6  Global Convergence of Evolutionary Dynamics
   6.0  Introduction
   6.1  Potential Games
      6.1.1  Potential Functions as Lyapunov Functions
      6.1.2  Gradient Systems for Potential Games
   6.2  Stable Games
      6.2.1  The Projection and Replicator Dynamics in Strictly Stable Games
      6.2.2  Integrable Target Dynamics
      6.2.3  Impartial Pairwise Comparison Dynamics
      6.2.4  Summary
   6.3  Supermodular Games
      6.3.1  The Best Response Dynamic in Two-Player Normal Form Games
      6.3.2  Stochastically Perturbed Best Response Dynamics
   6.4  Dominance Solvable Games
      6.4.1  Dominated and Iteratively Dominated Strategies
      6.4.2  The Best Response Dynamic
      6.4.3  Imitative Dynamics
   6.A  Limit and Stability Notions for Deterministic Dynamics
      6.A.1  ω-Limits and Notions of Recurrence
      6.A.2  Stability of Sets of States
   6.B  Stability Analysis via Lyapunov Functions
      6.B.1  Lyapunov Stable Sets
      6.B.2  ω-Limits and Attracting Sets
      6.B.3  Asymptotically Stable and Globally Asymptotically Stable Sets
   6.C  Cooperative Differential Equations
   6.N  Notes

7  Local Stability under Evolutionary Dynamics
   7.0  Introduction
   7.1  Non-Nash Rest Points of Imitative Dynamics
   7.2  Local Stability in Potential Games
   7.3  Evolutionarily Stable States
      7.3.1  Definition
      7.3.2  Variations
      7.3.3  Regular ESS
   7.4  Local Stability via Lyapunov Functions
      7.4.1  The Replicator and Projection Dynamics
      7.4.2  Target and Pairwise Comparison Dynamics: Interior ESS
      7.4.3  Target and Pairwise Comparison Dynamics: Boundary ESS
   7.5  Linearization of Imitative Dynamics
      7.5.1  The Replicator Dynamic
      7.5.2  General Imitative Dynamics
   7.6  Linearization of Perturbed Best Response Dynamics
      7.6.1  Deterministically Perturbed Best Response Dynamics
      7.6.2  The Logit Dynamic
   7.A  Matrix Analysis
      7.A.1  Rank and Invertibility
      7.A.2  Eigenvectors and Eigenvalues
      7.A.3  Similarity, (Block) Diagonalization, and the Spectral Theorem
      7.A.4  Symmetric Matrices
      7.A.5  The Real Jordan Canonical Form
      7.A.6  The Spectral Norm and Singular Values
      7.A.7  Hines's Lemma
   7.B  Linear Differential Equations
      7.B.1  Examples
      7.B.2  Solutions
      7.B.3  Stability and Hyperbolicity
   7.C  Linearization of Nonlinear Differential Equations
   7.N  Notes

8  Nonconvergence of Evolutionary Dynamics
   8.0  Introduction
   8.1  Conservative Properties of Evolutionary Dynamics
      8.1.1  Constants of Motion in Null Stable Games
      8.1.2  Preservation of Volume
   8.2  Games with Nonconvergent Evolutionary Dynamics
      8.2.1  Circulant Games
      8.2.2  Continuation of Attractors for Parameterized Games
      8.2.3  Mismatching Pennies
      8.2.4  The Hypnodisk Game
   8.3  Chaotic Evolutionary Dynamics
   8.4  Survival of Dominated Strategies
   8.A  Three Classical Theorems on Nonconvergent Dynamics
      8.A.1  Liouville's Theorem
      8.A.2  The Poincaré–Bendixson and Bendixson–Dulac Theorems
   8.B  Attractors and Continuation
      8.B.1  Attractors and Repellors
      8.B.2  Continuation of Attractors
   8.N  Notes

Part IV  Stochastic Evolutionary Models

9  Stochastic Evolution and Deterministic Approximation
   9.0  Introduction
   9.1  The Markov Process
   9.2  Finite Horizon Deterministic Approximation
      9.2.1  Kurtz's Theorem
      9.2.2  Deterministic Approximation of the Stochastic Evolutionary Process
   9.3  Extensions
      9.3.1  Finite Population Effects
      9.3.2  Discrete Time Models
   9.A  The Exponential and Poisson Distributions
      9.A.1  Basic Properties
      9.A.2  The Poisson Limit Theorem
   9.B  Countable State Markov Processes
      9.B.1  Countable Probability Models
      9.B.2  Uncountable Probability Models and Measure Theory
      9.B.3  Distributional Properties and Sample Path Properties
      9.B.4  Countable State Markov Chains
      9.B.5  Countable State Markov Processes
   9.C  Kurtz's Theorem in Discrete Time
   9.N  Notes

10  Infinite Horizon Behavior and Equilibrium Selection
   10.0  Introduction

Bibliography

Frequently Used Definitions

Classes of games:
(2.8)  potential game:  $\Phi F(x) = \nabla f(x)$  ($\equiv \Phi \nabla f(x)$)
(2.14)  stable game:  $(y - x)'(F(y) - F(x)) \le 0$
(2.22)  supermodular game:  $\tilde{\Sigma} y \ge \tilde{\Sigma} x$ implies that $\Sigma F(y) \ge \Sigma F(x)$

General equations for mean dynamics:
(M)  mean dynamic:  $\dot{x}^p_i = \sum_{j \in S^p} x^p_j \rho^p_{ji}(F^p(x), x^p) - x^p_i \sum_{j \in S^p} \rho^p_{ij}(F^p(x), x^p)$
(3.3)  target dynamic:  $\dot{x}^p_i = m^p \tau^p_i(F^p(x), x^p) - x^p_i \sum_{j \in S^p} \tau^p_j(F^p(x), x^p)$
(3.5)  exact target dynamic:  $\dot{x}^p = m^p \sigma^p(F^p(x), x^p) - x^p$

Properties of evolutionary dynamics:
(NS)  Nash stationarity:  $V_F(x) = 0$ if and only if $x \in NE(F)$
(PC)  positive correlation:  $V_F(x)^p \ne 0$ implies that $V_F(x)^p{}' F^p(x) > 0$

Six fundamental evolutionary dynamics:
(R)  replicator dynamic:  $\dot{x}^p_i = x^p_i \hat{F}^p_i(x)$
(BNN)  BNN dynamic:  $\dot{x}^p_i = m^p [\hat{F}^p_i(x)]_+ - x^p_i \sum_{j \in S^p} [\hat{F}^p_j(x)]_+$
(S)  Smith dynamic:  $\dot{x}^p_i = \sum_{j \in S^p} x^p_j [F^p_i(x) - F^p_j(x)]_+ - x^p_i \sum_{j \in S^p} [F^p_j(x) - F^p_i(x)]_+$
(BR)  best response dynamic:  $\dot{x}^p \in m^p M^p(F^p(x)) - x^p$
(L)  logit(η) dynamic:  $\dot{x}^p_i = m^p \dfrac{\exp(\eta^{-1} F^p_i(x))}{\sum_{j \in S^p} \exp(\eta^{-1} F^p_j(x))} - x^p_i$
(P)  projection dynamic:  $\dot{x} = \Pi_{TX(x)}(F(x))$

CHAPTER ZERO
Introduction

PART I
Population Games

CHAPTER ONE
Population Games

1.0 Introduction

Population games are used to model strategic interactions with these five traits:

(i) The number of agents is large.
(ii) Individual agents are small: Any one agent's behavior has little or no effect on other agents' payoffs.
(iii) The number of roles is finite: Each agent is a member of one of a finite number of populations. Members of a population choose from the same set of strategies, and their payoffs are identical functions of own behavior and opponents' behavior.
(iv) Agents interact anonymously: Each agent's payoffs only depend on opponents' behavior through the distribution of opponents' choices.
(v) Payoffs are continuous: The dependence of each agent's payoffs on the distribution of opponents' choices is continuous.

Applications fitting this description can be found in a variety of disciplines, including economics (externalities, macroeconomic spillovers, centralized markets), biology (animal conflict, genetic natural selection), transportation science (highway network congestion, mode choice), and computer science (selfish routing of Internet traffic). Population games provide a unified framework for studying these and other topics, helping us to identify the forces that drive parallel conclusions in seemingly disparate fields.

The most convenient way to define population games is to assume that the set of agents forms a continuum, as doing so enables us to study these games using tools from analysis. Of course, real populations are finite. Still, the continuum assumption is appropriate when the effects of individuals' choices on opponents' payoffs are small, or, more generally, when individuals ignore these effects when deciding how to act. In subsequent chapters we will draw explicit links between the finite and continuous models.

1.1 Population Games

1.1.1 Populations, Strategies, and States

Let $\mathcal{P} = \{1, \ldots, \mathsf{p}\}$ be a society consisting of $\mathsf{p} \ge 1$ populations of agents. Agents in population $p$ form a continuum of mass $m^p > 0$. (Thus, $\mathsf{p}$ is the number of populations, while $p$ denotes an arbitrary population.) The set of strategies available to agents in population $p$ is denoted $S^p = \{1, \ldots, n^p\}$, and has typical elements $i$, $j$, and (in the context of normal form games) $s^p$. We let $n = \sum_{p \in \mathcal{P}} n^p$ equal the total number of pure strategies in all populations.

During game play, each agent in population $p$ selects a (pure) strategy from $S^p$. The set of population states (or strategy distributions) for population $p$ is $X^p = \{x^p \in \mathbb{R}^{n^p}_+ : \sum_{i \in S^p} x^p_i = m^p\}$. The scalar $x^p_i \in \mathbb{R}_+$ represents the mass of players in population $p$ choosing strategy $i \in S^p$. Elements of $X^p_v$, the set of vertices of $X^p$, are called pure population states, since at these states all agents choose the same strategy.

Elements of $X = \prod_{p \in \mathcal{P}} X^p = \{x = (x^1, \ldots, x^{\mathsf{p}}) \in \mathbb{R}^n_+ : x^p \in X^p\}$, the set of social states, describe behavior in all $\mathsf{p}$ populations at once. The elements of $X_v = \prod_{p \in \mathcal{P}} X^p_v$ are the vertices of $X$, and are called the pure social states.

When there is just one population ($\mathsf{p} = 1$), we assume that its mass is 1, and we omit the superscript $p$ from all of our notation: thus, the strategy set is $S = \{1, \ldots, n\}$, the state space is $X = \{x \in \mathbb{R}^n_+ : \sum_{i \in S} x_i = 1\}$, the simplex in $\mathbb{R}^n$, and the set of pure states $X_v = \{e_i : i \in S\}$ is the set of standard basis vectors in $\mathbb{R}^n$.

1.1.2 Payoffs

We generally take the sets of populations and strategies as fixed and identify a game with its payoff function. A payoff function $F : X \to \mathbb{R}^n$ is a continuous map that assigns each social state a vector of payoffs, one for each strategy in each population. $F^p_i : X \to \mathbb{R}$ denotes the payoff function for strategy $i \in S^p$, while $F^p : X \to \mathbb{R}^{n^p}$ denotes the payoff functions for all strategies in $S^p$. While our standing assumption is that $F$ is continuous, we often impose the stronger requirements that $F$ be Lipschitz continuous or continuously differentiable ($C^1$). These additional assumptions will be made explicit whenever we use them.
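These definitions translate directly into a few lines of code. The sketch below is not from the book: the function names are ours, and the game used for illustration is the two-strategy coordination game $F^{C2}$ that appears in Section 1.3.1. It represents a single unit-mass population by its payoff function and checks that a candidate state lies in the simplex $X$.

```python
import numpy as np

def is_state(x, mass=1.0, tol=1e-9):
    """Membership test for X = {x in R^n_+ : sum_i x_i = mass}."""
    x = np.asarray(x, dtype=float)
    return bool(np.all(x >= -tol) and abs(x.sum() - mass) < tol)

def F_C2(x):
    """Payoffs F(x) = Ax for the two-strategy coordination game of Section 1.3.1."""
    A = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    return A @ x

x = np.array([2/3, 1/3])
print(is_state(x))   # True: x is a population state
print(F_C2(x))       # [0.667, 0.667]: both strategies earn 2/3 at this state
```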
We define
$$\bar{F}^p(x) = \frac{1}{m^p} \sum_{i \in S^p} x^p_i F^p_i(x)$$
to be the (weighted) average payoff obtained by members of population $p$ at social state $x$. Similarly, we let
$$\bar{F}(x) = \sum_{p \in \mathcal{P}} \sum_{i \in S^p} x^p_i F^p_i(x) = \sum_{p \in \mathcal{P}} m^p \bar{F}^p(x)$$
denote the aggregate payoff achieved by the society as a whole.

1.1.3 Best Responses and Nash Equilibria

To describe optimal behavior, we define population $p$'s pure best response correspondence, $b^p : X \rightrightarrows S^p$, which specifies the strategies in $S^p$ that are optimal at each social state $x$:
$$b^p(x) = \operatorname*{argmax}_{i \in S^p} F^p_i(x).$$
Let $\Delta^p = \{y^p \in \mathbb{R}^{n^p}_+ : \sum_{i \in S^p} y^p_i = 1\}$ denote the simplex in $\mathbb{R}^{n^p}$. The mixed best response correspondence for population $p$, $B^p : X \rightrightarrows \Delta^p$, is given by
$$B^p(x) = \{y^p \in \Delta^p : y^p_i > 0 \Rightarrow i \in b^p(x)\}.$$
In words, $B^p(x)$ is the set of probability distributions in $\Delta^p$ whose supports only contain pure strategies that are optimal at $x$. Geometrically, $B^p(x)$ is the convex hull of the vertices of $\Delta^p$ corresponding to elements of $b^p(x)$.

Social state $x \in X$ is a Nash equilibrium of the game $F$ if each agent in every population chooses a best response to $x$:
$$NE(F) = \{x \in X : x^p \in m^p B^p(x) \text{ for all } p \in \mathcal{P}\}.$$
We will see in Section 1.3.6 that the Nash equilibria of a population game can also be characterized in a purely geometric way.

Nash equilibria always exist:

Theorem 1.1.1. Every population game admits at least one Nash equilibrium.

Theorem 1.1.1 can be proved by applying Kakutani's Theorem to the profile of best response correspondences. But we will see that in each of the three classes of games we focus on in Chapter 2—potential games (Sections 2.1 and 2.2), stable games (Section 2.3), and supermodular games (Section 2.4)—existence of Nash equilibrium can be established without recourse to fixed point theorems.

1.1.4 Prelude to Evolutionary Dynamics

In traditional game-theoretic analyses, it is usual to assume that players follow some Nash equilibrium of the game at hand. But because population games involve large numbers of agents, the equilibrium assumption is quite strong, making it more appealing to rely on less demanding assumptions. Therefore, rather than assume equilibrium play, we suppose that individual agents gradually adjust their choices to their current strategic environment. We then ask whether or not the induced behavior trajectories converge to Nash equilibrium. When they do, the Nash prediction can be justified; when they do not, the Nash prediction may be unwarranted.

The question of convergence to equilibrium is a central issue in this book. We will see later on that in the three classes of games studied in Chapter 2, convergence results can be established in some generality—that is, without being overly specific about the exact nature of the agents' revision protocols. But in this chapter and the next, we confine ourselves to introducing population games and studying their equilibria.
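For a single unit-mass population, the Nash equilibrium condition above reduces to: every strategy in the support of $x$ is a pure best response at $x$. The following sketch checks this condition numerically; it is our own illustrative code (the tolerance values are arbitrary), again using the coordination game $F^{C2}$.

```python
import numpy as np

def pure_best_responses(F, x, tol=1e-9):
    """b(x): strategies whose payoff is within tol of the maximum payoff at x."""
    payoffs = F(x)
    return np.flatnonzero(payoffs >= payoffs.max() - tol)

def is_nash(F, x, tol=1e-9):
    """x is a Nash equilibrium iff supp(x) is contained in b(x)."""
    support = np.flatnonzero(np.asarray(x) > tol)
    return bool(np.all(np.isin(support, pure_best_responses(F, x, tol))))

F = lambda x: np.array([[1.0, 0.0], [0.0, 2.0]]) @ x   # 12 Coordination
print(is_nash(F, np.array([1.0, 0.0])))    # True: pure equilibrium e_1
print(is_nash(F, np.array([2/3, 1/3])))    # True: interior equilibrium
print(is_nash(F, np.array([0.5, 0.5])))    # False
```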
1.2 Examples

To fix ideas, we offer four examples of population games. These examples and many more will be developed and analyzed through the remainder of the book.

1.2.1 Random Matching in Normal Form Games

Let us begin with the canonical example of evolutionary game theory.

Example 1.2.1. Random matching in a single population to play a symmetric game. A symmetric two-player normal form game is defined by a strategy set $S = \{1, \ldots, n\}$ and a payoff matrix $A \in \mathbb{R}^{n \times n}$. $A_{ij}$ is the payoff a player obtains when he chooses strategy $i$ and his opponent chooses strategy $j$; this payoff does not depend on whether the player in question is called player I or player II. Below is the bimatrix corresponding to $A$ when $n = 3$.

                          Player II
                   1            2            3
   Player I   1    A11, A11     A12, A21     A13, A31
              2    A21, A12     A22, A22     A23, A32
              3    A31, A13     A32, A23     A33, A33

To obtain a population game from this normal form game, we suppose that agents in a single (unit mass) population are randomly matched to play $A$. Assuming that agents evaluate probability distributions over payoffs by taking expectations (i.e., that the entries of the matrix $A$ are von Neumann–Morgenstern utilities), the payoff to strategy $i$ when the population state is $x$ is $F_i(x) = \sum_{j \in S} A_{ij} x_j$. It follows that the population game associated with $A$ is described by the linear map $F(x) = Ax$. §

Example 1.2.2. Random matching in two populations. A (possibly asymmetric) two-player game is defined by two strategy sets, $S^1 = \{1, \ldots, n^1\}$ and $S^2 = \{1, \ldots, n^2\}$, and two payoff matrices, $U^1 \in \mathbb{R}^{n^1 \times n^2}$ and $U^2 \in \mathbb{R}^{n^1 \times n^2}$. The corresponding bimatrix when $n^1 = 2$ and $n^2 = 3$ is as follows.

                          Player II
                   1               2               3
   Player I   1    U1_11, U2_11    U1_12, U2_12    U1_13, U2_13
              2    U1_21, U2_21    U1_22, U2_22    U1_23, U2_23

To define the corresponding population game, we suppose that there are two unit mass populations, one corresponding to each player role. One agent from each population is drawn at random and matched to play the game $(U^1, U^2)$. The payoff functions for populations 1 and 2 are given by $F^1(x) = U^1 x^2$ and $F^2(x) = (U^2)' x^1$, so the entire population game is described by the linear map
$$F(x) = \begin{pmatrix} F^1(x) \\ F^2(x) \end{pmatrix} = \begin{pmatrix} 0 & U^1 \\ (U^2)' & 0 \end{pmatrix}\begin{pmatrix} x^1 \\ x^2 \end{pmatrix} = \begin{pmatrix} U^1 x^2 \\ (U^2)' x^1 \end{pmatrix}. \quad\S$$

Example 1.2.3. Random matching in $\mathsf{p}$ populations. To generalize the previous example, we define a $\mathsf{p}$ player normal form game. Let $S^p = \{1, \ldots, n^p\}$ denote player $p$'s strategy set and $S = \prod_{q \in \mathcal{P}} S^q$ the set of pure strategy profiles; player $p$'s payoff function $U^p$ is a map from $S$ to $\mathbb{R}$. In the population game, agents in $\mathsf{p}$ unit mass populations are randomly matched to play the normal form game $U = (U^1, \ldots, U^{\mathsf{p}})$, with one agent from each population being drawn to serve in player role $p$. This procedure yields a population game with the multilinear (i.e., linear in each $x^p$) payoff function
$$F^p_{s^p}(x) = \sum_{s^{-p} \in S^{-p}} U^p(s^1, \ldots, s^{\mathsf{p}}) \prod_{r \ne p} x^r_{s^r}, \qquad \text{where } S^{-p} = \prod_{q \ne p} S^q. \quad\S$$

We conclude with an observation relating the Nash equilibria of population games generated by random matching to those of the underlying normal form games.

Observation 1.2.4.
(i) In the single population case (Example 1.2.1), the Nash equilibria of $F$ are the symmetric Nash equilibria of the symmetric normal form game $U = (A, A')$.
(ii) In the multipopulation cases (Examples 1.2.2 and 1.2.3), the Nash equilibria of $F$ are the Nash equilibria of the normal form game $U = (U^1, \ldots, U^{\mathsf{p}})$.
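Random matching payoffs are matrix products, so Examples 1.2.1 and 1.2.2 are easy to put on a computer. The sketch below is illustrative only; the Matching Pennies matrices are our own example, not one discussed in this section.

```python
import numpy as np

def single_population_payoffs(A, x):
    """Example 1.2.1: F(x) = Ax, random matching in a symmetric game."""
    return A @ x

def two_population_payoffs(U1, U2, x1, x2):
    """Example 1.2.2: F^1(x) = U^1 x^2 and F^2(x) = (U^2)' x^1."""
    return U1 @ x2, U2.T @ x1

U1 = np.array([[ 1.0, -1.0],
               [-1.0,  1.0]])     # Matching Pennies payoffs for population 1
U2 = -U1                          # ...and for population 2
x1 = x2 = np.array([0.5, 0.5])
print(two_population_payoffs(U1, U2, x1, x2))   # both populations earn (0, 0)
```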
1.2.2 Congestion Games

Because of the linearity of the expectation operator, random matching in normal form games generates population games with linear or multilinear payoffs. Moreover, when $\mathsf{p} \ge 2$, each agent's payoffs are independent of the behavior of other members of his population. Outside of the random matching context, neither of these properties need hold. Our next class of examples provides a case in point.

Example 1.2.5. Congestion games. Consider the following model of highway congestion. A collection of towns is connected by a network of links (Figure 1.2.1). For each ordered pair of towns there is a population of agents, each of whom needs to commute from the first town in the pair (where he lives) to the second (where he works). To accomplish this, the agent must choose a path connecting the two towns. The payoff the agent obtains is the negation of the delay on the path he takes. The delay on the path is the sum of the delays on its constituent links, while the delay on a link is a function of the number of agents who use that link.

[Figure 1.2.1: A highway network.]

Congestion games are used to study not only highway congestion, but also more general settings involving "symmetric" externalities. To define a congestion game, we begin with a finite collection of facilities (e.g., links in a highway network), denoted $\Phi$. Every strategy $i \in S^p$ requires the use of some collection of facilities $\Phi^p_i \subseteq \Phi$ (e.g., the links in route $i$). The set $\rho^p(\phi) = \{i \in S^p : \phi \in \Phi^p_i\}$ contains those strategies in $S^p$ that require facility $\phi$. Each facility $\phi$ has a cost function $c_\phi : \mathbb{R}_+ \to \mathbb{R}$ whose argument is the facility's utilization level $u_\phi$, the total mass of agents using the facility:
$$u_\phi(x) = \sum_{p \in \mathcal{P}} \sum_{i \in \rho^p(\phi)} x^p_i.$$
Payoffs in the congestion game are obtained by summing the appropriate facility costs and multiplying by $-1$:
$$F^p_i(x) = -\sum_{\phi \in \Phi^p_i} c_\phi(u_\phi(x)).$$
Since driving on a link increases the delays experienced by other drivers on that link, cost functions in models of highway congestion are increasing; they are typically convex as well. On the other hand, when congestion games are used to model settings with positive externalities (e.g., consumer technology choice), cost functions are decreasing. Evidently, payoffs in congestion games depend on own-population behavior, and need only be linear if the underlying cost functions are linear themselves. Congestion games are the leading examples of potential games (Sections 2.1 and 2.2); congestion games with increasing cost functions are also stable games (Section 2.3). §
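To make the facility/route bookkeeping of Example 1.2.5 concrete, here is a minimal single-population sketch. The three-route, four-link network and the common linear link cost are our own illustrative choices, not the network shown in Figure 1.2.1.

```python
import numpy as np

routes = {0: ["a", "b"], 1: ["a", "c"], 2: ["d"]}   # Phi_i: the links used by route i

def link_cost(u):
    """An increasing link cost (delay) function; every link uses the same one here."""
    return 1.0 + 2.0 * u

def utilization(x):
    """u_phi(x): total mass of agents whose chosen route uses link phi."""
    u = {f: 0.0 for f in "abcd"}
    for i, mass in enumerate(x):
        for f in routes[i]:
            u[f] += mass
    return u

def congestion_payoffs(x):
    """F_i(x) = -(sum of link costs along route i)."""
    u = utilization(x)
    return np.array([-sum(link_cost(u[f]) for f in routes[i]) for i in routes])

print(congestion_payoffs([0.5, 0.3, 0.2]))   # [-4.6, -4.2, -1.4]
```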
1.2.3 Two Simple Externality Models

We conclude this section with two simpler models of externalities.

Example 1.2.6. Asymmetric negative externalities. Agents from a single population choose from a set of $n$ activities. There are externalities both within and across activities; the increasing $C^1$ function $c_{ij} : [0, 1] \to \mathbb{R}$ represents the cost imposed on agents who choose activity $i$ by agents who choose activity $j$. Payoffs in this game are described by
$$F_i(x) = -\sum_{j \in S} c_{ij}(x_j).$$
If "own activity" externalities are strong, in the sense that the derivatives of the cost functions satisfy
$$2\,c'_{ii}(x_i) \ge \sum_{j \ne i} \left( c'_{ij}(x_j) + c'_{ji}(x_i) \right),$$
then $F$ is a stable game (Section 2.3 of Chapter 2). §

Example 1.2.7. Search with positive externalities. Consider this simple model of macroeconomic spillovers. Members of a single population choose levels of search effort from the set $S = \{1, \ldots, n\}$. Stronger efforts increase the likelihood of finding trading partners, so that payoffs are increasing both in own search effort and in aggregate search effort. In particular, payoffs are given by
$$F_i(x) = m(i)\, b(a(x)) - c(i),$$
where $a(x) = \sum_{k=1}^n k x_k$ represents aggregate search effort, the increasing function $b : \mathbb{R}_+ \to \mathbb{R}$ represents the benefits of search as a function of aggregate effort, the increasing function $m : S \to \mathbb{R}$ is the benefit multiplier, and the arbitrary function $c : S \to \mathbb{R}$ captures search costs. In Section 2.4, we will show that $F$ is a supermodular game. §

1.3 The Geometry of Population Games and Nash Equilibria

In low-dimensional cases, we can present the payoff vectors generated by a population game in pictures. Doing so provides a way of visualizing the strategic forces at work; moreover, the geometric insights we obtain can be extended to games that we cannot draw.

1.3.1 Drawing Two-Strategy Games

The population games that are easiest to draw are two-strategy games: i.e., games played by a single population of agents who choose between a pair of strategies. When drawing a two-strategy game, we represent the simplex as a subset of $\mathbb{R}^2$. We synchronize the drawing with the layout of the payoff matrix by using the vertical coordinate to represent the mass on the first strategy and the horizontal coordinate to represent the mass on the second strategy. We then select a group of states spaced evenly through the simplex; from each state $x$, we draw an arrow representing the payoff vector $F(x)$ that corresponds to $x$. (Actually, we draw scaled-down versions of the payoff vectors in order to make the diagrams easier to read.)

In Figures 1.3.1 and 1.3.2, we present the payoff vectors generated by the two-strategy coordination game $F^{C2}$ and the Hawk-Dove game $F^{HD}$:
$$F^{C2}(x) = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ 2x_2 \end{pmatrix}; \qquad F^{HD}(x) = \begin{pmatrix} -1 & 2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x_H \\ x_D \end{pmatrix} = \begin{pmatrix} 2x_D - x_H \\ x_D \end{pmatrix}.$$

[Figure 1.3.1: Payoffs in 12 Coordination.]
[Figure 1.3.2: Payoffs in the Hawk-Dove game.]

Let us focus on the coordination game $F^{C2}$. At the pure state $e_1 = (1, 0)$ at which all agents play strategy 1, the payoffs to the two strategies are $F^{C2}_1(e_1) = 1$ and $F^{C2}_2(e_1) = 0$; hence, the arrow representing $F^{C2}(e_1)$ points directly upward from state $e_1$. At the interior Nash equilibrium $x^* = (x^*_1, x^*_2) = (\frac{2}{3}, \frac{1}{3})$, each strategy earns a payoff of $\frac{2}{3}$, so the arrow representing payoff vector $F^{C2}(x^*) = (\frac{2}{3}, \frac{2}{3})$ is drawn at a right angle to the simplex at $x^*$. Similar logic explains how the payoff vectors are drawn at other states, and how the Hawk-Dove figure is constructed as well.

The diagrams of $F^{C2}$ and $F^{HD}$ help us visualize the incentives faced by agents playing these games. In the coordination game, the payoff vectors "push outward" toward the two axes, reflecting an incentive structure that drives the population toward the two pure Nash equilibria. In contrast, payoff vectors in the Hawk-Dove game "push inward", away from the axes, reflecting forces leading the population toward the interior Nash equilibrium $x^* = (\frac{1}{2}, \frac{1}{2})$.
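The drawing procedure just described—evaluate $F$ at evenly spaced states and record the resulting vectors—has an obvious numerical counterpart. The sketch below is illustrative code of ours (not from the book) that tabulates the payoff vectors plotted in Figures 1.3.1 and 1.3.2.

```python
import numpy as np

F_C2 = lambda x: np.array([[1.0, 0.0], [0.0, 2.0]]) @ x    # 12 Coordination
F_HD = lambda x: np.array([[-1.0, 2.0], [0.0, 1.0]]) @ x   # Hawk-Dove

# Payoff vectors at evenly spaced simplex states, as in Figures 1.3.1 and 1.3.2.
for x1 in np.linspace(0.0, 1.0, 5):
    x = np.array([x1, 1.0 - x1])
    print(f"x = {x}, F_C2(x) = {F_C2(x)}, F_HD(x) = {F_HD(x)}")
```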
1.3.2 Displacement Vectors and Tangent Spaces

To draw games with more than two strategies we need to introduce two new objects: $TX$, the tangent space of the state space $X$, and $\Phi$, the orthogonal projection of $\mathbb{R}^n$ onto $TX$. We summarize the relevant concepts in this subsection and the next; for a fuller treatment, see the Appendix.

To start, let us focus on a single-population game $F$. Imagine that the population is initially at state $x$, and that a group of agents of mass $\varepsilon$ switches from strategy $i$ to strategy $j$. These revisions move the state from $x$ to $x + \varepsilon(e_j - e_i)$: the mass of agents playing strategy $i$ goes down by $\varepsilon$, while the mass of agents playing strategy $j$ goes up by $\varepsilon$. Vectors like $\varepsilon(e_j - e_i)$, which represent the effects of such strategy revisions on the population state, are called displacement vectors. (Since these vectors are tangent to the state space $X$, we also call them tangent vectors—more on this below.)

In Figure 1.3.3, we illustrate displacement vectors for two-strategy games. In this setting, displacement vectors can only point in two directions: when agents switch from strategy 1 to strategy 2, the state moves in direction $e_2 - e_1$, represented by an arrow pointing southeast; when agents switch from strategy 2 to strategy 1, the state moves in direction $e_1 - e_2$, represented by an arrow pointing northwest. Both of these vectors are tangent to the state space $X$.

[Figure 1.3.3: Displacement vectors for two-strategy games.]

(Two clarifications are in order here. First, remember that a vector is characterized by its direction and its length, not by where we position its base. When we draw an arrow representing the vector $z$, we use the context to determine an appropriate position $x$ for the arrow's base; the arrow takes the form of a directed line segment from $x$ to $x + z$. Second, since we are mainly interested in displacement vectors' relative sizes, we rescale them before drawing them, just as we did with payoff vectors in Figures 1.3.1 and 1.3.2.)

Now consider a three-strategy game: a game with one population and three strategies, whose state space $X$ is thus the simplex in $\mathbb{R}^3$. A "three-dimensional" picture of $X$ is provided in Figure 1.3.4, where $X$ is situated within the plane in $\mathbb{R}^3$ that contains it. This plane is called the affine hull of $X$, and is denoted by $\mathrm{aff}(X)$ (see Appendix 1.A.2). For future reference, note that displacement vectors drawn from states in $X$ are situated in the plane $\mathrm{aff}(X)$.

[Figure 1.3.4: The simplex in R³.]

Instead of representing the state space $X$ explicitly in $\mathbb{R}^3$, it is more common to present it as a two-dimensional equilateral triangle (Figure 1.3.5). When we follow this approach, our sheet of paper itself represents the affine hull $\mathrm{aff}(X)$, and so arrows drawn on the paper represent displacement vectors. Figure 1.3.5 presents arrows describing the $3 \times 2 = 6$ displacement vectors of the form $e_j - e_i$, which correspond to switches between distinct ordered pairs of strategies. Each of these arrows is parallel to some edge of the simplex.

[Figure 1.3.5: Displacement vectors for three-strategy games.]

For purposes of orientation, note that if we resituate the simplex from Figure 1.3.5 in three-dimensional space (i.e., in Figure 1.3.4), then each of these six arrows is obtained by subtracting one standard basis vector from another.

Switches between pairs of strategies are not the only ways of generating displacement vectors—they can also come from switches involving three or more strategies, and, in multipopulation settings, from switches occurring within more than one population. The set of all displacement vectors from states in $X$ forms a subspace of $\mathbb{R}^n$; this subspace is called the tangent space $TX$.

To formally define $TX$, let us first consider population $p \in \mathcal{P}$ in isolation. The state space for population $p$ is $X^p = \{x^p \in \mathbb{R}^{n^p}_+ : \sum_{i \in S^p} x^p_i = m^p\}$. The tangent space of $X^p$, denoted $TX^p$, is the smallest subspace of $\mathbb{R}^{n^p}$ that contains all vectors describing motions between points in $X^p$. In other words, if $x^p, y^p \in X^p$, then $y^p - x^p \in TX^p$, and $TX^p$ is the span of all vectors of this form. It is not hard to see that $TX^p = \mathbb{R}^{n^p}_0 \equiv \{z^p \in \mathbb{R}^{n^p} : \sum_{i \in S^p} z^p_i = 0\}$: that is, $TX^p$ contains exactly those vectors in $\mathbb{R}^{n^p}$ whose components sum to zero. The restriction on the sum embodies the fact that changes in the population state leave the population's mass constant.
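A quick numerical check of the last claim (illustrative code only): displacement vectors have components summing to zero, so small moves along them preserve the population's mass.

```python
import numpy as np

e = np.eye(3)                      # standard basis vectors e_1, e_2, e_3
x = np.array([0.5, 0.3, 0.2])      # a state in the simplex
z = 0.1 * (e[2] - e[0])            # mass 0.1 switches from strategy 1 to strategy 3

print(abs(z.sum()) < 1e-12)        # True: z lies in TX (components sum to zero)
print(x + z, (x + z).sum())        # [0.4 0.3 0.3] 1.0 -- the new state is still in X
```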
The definition above is sufficient for studying single population games. What if there are multiple populations? In this case, any change in the social state $x \in X = \prod_{p \in \mathcal{P}} X^p$ is a combination of changes occurring within the individual populations. Therefore, the grand tangent space $TX$ is just the product of the tangent spaces for each set $X^p$: in other words, $TX = \prod_{p \in \mathcal{P}} TX^p$.

1.3.3 Orthogonal Projections

Suppose we would like to draw a diagram representing a three-strategy game $F$. One possibility is to draw a "three-dimensional" representation of $F$ in the fashion of Figure 1.3.4. We would place more modest demands on our drafting skills if we instead represented $F$ in just two dimensions. But this simplification comes at a cost: since three-dimensional payoff vectors $F(x) \in \mathbb{R}^3$ will be presented as two-dimensional objects, some of the information contained in these vectors will be lost.

From a geometric point of view, the most natural way to proceed is pictured in Figure 1.3.6: instead of drawing an arrow from state $x$ corresponding to the vector $F(x)$ itself, we instead draw the arrow closest to $F(x)$ among those that lie in the plane $\mathrm{aff}(X)$. This arrow represents a vector in the tangent space $TX$: namely, the orthogonal projection of $F(x)$ onto $TX$.

[Figure 1.3.6: Projected payoff vectors for three-strategy games.]

Let $Z$ be a linear subspace of $\mathbb{R}^n$. The orthogonal projection of $\mathbb{R}^n$ onto $Z$ is a linear map that sends each $\pi \in \mathbb{R}^n$ to the closest point to $\pi$ in $Z$. Each orthogonal projection can be represented by a matrix $P_Z \in \mathbb{R}^{n \times n}$ via the map $\pi \mapsto P_Z \pi$, and it is common to identify the projection with its matrix representation. We treat orthogonal projections in some detail in Appendix 1.A.3; here we focus only on the orthogonal projections we need.

First consider population $p \in \mathcal{P}$ in isolation. The orthogonal projection of $\mathbb{R}^{n^p}$ onto the tangent space $TX^p$, denoted $\Phi \in \mathbb{R}^{n^p \times n^p}$, is defined by $\Phi = I - \frac{1}{n^p}\mathbf{1}\mathbf{1}'$, where $\mathbf{1} = (1, \ldots, 1)'$ is the vector of ones; thus $\frac{1}{n^p}\mathbf{1}\mathbf{1}'$ is the matrix whose entries are all $\frac{1}{n^p}$. If $\pi^p$ is a payoff vector in $\mathbb{R}^{n^p}$, the projection of $\pi^p$ onto $TX^p$ is
$$\Phi \pi^p = \pi^p - \tfrac{1}{n^p}\mathbf{1}\mathbf{1}'\pi^p = \pi^p - \tfrac{1}{n^p}\Big(\sum_{k \in S^p} \pi^p_k\Big)\mathbf{1}.$$
The $i$th component of $\Phi\pi^p$ is the difference between the actual payoff to strategy $i$ and the unweighted average payoff of all strategies in $S^p$. Thus, $\Phi\pi^p$ discards information about average payoffs while retaining information about relative payoffs of different strategies in $S^p$. This interpretation is important from a game-theoretic point of view, since incentives, and hence Nash equilibria, only depend on payoff differences. Therefore, when incentives (as opposed to, e.g., efficiency) are our main concern, we do not need to know the actual payoff vectors $\pi^p$; looking at the projected payoff vectors $\Phi\pi^p$ is enough.

In multipopulation settings, the tangent space $TX = \prod_{p \in \mathcal{P}} TX^p$ has a product structure; hence, the orthogonal projection onto $TX$, denoted $\Phi \in \mathbb{R}^{n \times n}$, has a block diagonal structure: $\Phi = \mathrm{diag}(\Phi, \ldots, \Phi)$. (Note that the blocks on the diagonal of $\Phi$ are generally not identical: the $p$th block is an element of $\mathbb{R}^{n^p \times n^p}$.) If we apply $\Phi$ to the society's payoff vector $\pi = (\pi^1, \ldots, \pi^{\mathsf{p}})$, the resulting vector $\Phi\pi = (\Phi\pi^1, \ldots, \Phi\pi^{\mathsf{p}})$ lists the relative payoffs in each population.
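The projection $\Phi = I - \frac{1}{n}\mathbf{1}\mathbf{1}'$ is easy to verify numerically: applying it to a payoff vector subtracts the unweighted average payoff from every component. A minimal sketch (our code, not the book's):

```python
import numpy as np

def projection_matrix(n):
    """Phi = I - (1/n) 1 1': the orthogonal projection of R^n onto its zero-sum subspace."""
    return np.eye(n) - np.ones((n, n)) / n

pi = np.array([3.0, 1.0, 2.0])   # a payoff vector
Phi = projection_matrix(3)

print(Phi @ pi)                  # [ 1. -1.  0.]: payoffs relative to the average
print(pi - pi.mean())            # the same vector, computed directly
print(np.allclose(Phi @ Phi, Phi), np.allclose(Phi, Phi.T))  # idempotent and symmetric
```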
1.3.4 Drawing Three-Strategy Games

Before using orthogonal projections to draw three-strategy games, let us see how this device affects our pictures of two-strategy games. Applying the projection $\Phi = I - \frac{1}{2}\mathbf{1}\mathbf{1}'$ to the payoff vectors from the coordination game $F^{C2}$ and the Hawk-Dove game $F^{HD}$ yields
$$\Phi F^{C2}(x) = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} x_1 \\ 2x_2 \end{pmatrix} = \begin{pmatrix} \tfrac{1}{2}x_1 - x_2 \\ -\tfrac{1}{2}x_1 + x_2 \end{pmatrix}
\quad\text{and}\quad
\Phi F^{HD}(x) = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} 2x_D - x_H \\ x_D \end{pmatrix} = \begin{pmatrix} \tfrac{1}{2}(x_D - x_H) \\ -\tfrac{1}{2}(x_D - x_H) \end{pmatrix}.$$

We draw the projected payoffs along with the original payoffs in Figures 1.3.7 and 1.3.8. The projected payoff vectors $\Phi F(x)$ lie in the tangent space $TX$, and so are represented by arrows running parallel to the simplex $X$. Projecting away the orthogonal component of payoffs makes the "outward force" in the coordination game and the "inward force" in the Hawk-Dove game more transparent. Indeed, Figures 1.3.7 and 1.3.8 are suggestive of evolutionary dynamics for these two games—a topic we take up starting in Chapter 3.

[Figure 1.3.7: Payoffs and projected payoffs in 12 Coordination.]
[Figure 1.3.8: Payoffs and projected payoffs in the Hawk-Dove game.]

Now, let us consider the three-strategy coordination game $F^{C3}$ and the Rock-Paper-Scissors game $F^{RPS}$:
$$F^{C3}(x) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 \\ 2x_2 \\ 3x_3 \end{pmatrix}; \qquad
F^{RPS}(x) = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix}\begin{pmatrix} x_R \\ x_P \\ x_S \end{pmatrix} = \begin{pmatrix} x_S - x_P \\ x_R - x_S \\ x_P - x_R \end{pmatrix}.$$

These games are pictured in Figures 1.3.9 and 1.3.10. The arrows in Figure 1.3.9 represent the projected payoff vectors $\Phi F^{C3}(x)$, defined by
$$\Phi F^{C3}(x) = \left(I - \tfrac{1}{3}\mathbf{1}\mathbf{1}'\right)\begin{pmatrix} x_1 \\ 2x_2 \\ 3x_3 \end{pmatrix} = \begin{pmatrix} \tfrac{1}{3}(2x_1 - 2x_2 - 3x_3) \\ \tfrac{1}{3}(-x_1 + 4x_2 - 3x_3) \\ \tfrac{1}{3}(-x_1 - 2x_2 + 6x_3) \end{pmatrix}.$$
But in the Rock-Paper-Scissors game, the column sums of the payoff matrix all equal 0, implying that the maps $F^{RPS}$ and $\Phi F^{RPS}$ are identical; by drawing one, we draw the other.

[Figure 1.3.9: Projected payoffs in 123 Coordination.]
[Figure 1.3.10: Payoffs (= projected payoffs) in Rock-Paper-Scissors.]

As with that of $F^{C2}$, the diagram of the coordination game $F^{C3}$ shows forces pushing outward toward the extreme points of the simplex. In contrast, Figure 1.3.10 displays a property that cannot occur with just two strategies: instead of driving toward some Nash equilibrium, the arrows in Figure 1.3.10 cycle around the simplex. Thus, the figure suggests that in the Rock-Paper-Scissors game, evolutionary dynamics need not converge to Nash equilibrium, but instead may avoid equilibrium in perpetuity. We return to questions of convergence and nonconvergence of evolutionary dynamics beginning in later chapters.
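The cycling suggested by Figure 1.3.10 can be previewed numerically. The sketch below simulates the replicator dynamic—one of the six fundamental dynamics listed in the front matter and studied from Chapter 3 onward—on Rock-Paper-Scissors using a crude Euler discretization; the step size and initial state are arbitrary choices of ours, not the book's.

```python
import numpy as np

A = np.array([[ 0.0, -1.0,  1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0]])      # Rock-Paper-Scissors payoff matrix

def replicator_step(x, dt=0.01):
    """One Euler step of the replicator dynamic: dx_i/dt = x_i (F_i(x) - x'F(x))."""
    F = A @ x
    return x + dt * x * (F - x @ F)

x = np.array([0.5, 0.3, 0.2])
for _ in range(5000):
    x = replicator_step(x)
print(x, x.sum())   # the state orbits the equilibrium (1/3, 1/3, 1/3) rather than converging to it
```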
1.3.5 Tangent Cones and Normal Cones

To complete our introduction to the geometric approach to population games, we explain how one can find a game's Nash equilibria by examining a picture of the game. To begin this discussion, note that the constraint that defines vectors $z$ as lying in the tangent space $TX$—the constraint that keeps population masses constant—is not always enough to ensure that motion in direction $z$ is feasible. Motions in every direction in $TX$ are feasible if we begin at a state $x$ in the interior of the state space $X$. But if $x^p_i = 0$ for some strategy $i \in S^p$, then motion in any direction $z$ with $z^p_i < 0$ would cause the mass of agents playing strategy $i$ to become negative, taking the state out of $X$.

To describe the feasible displacement directions from an arbitrary state $x \in X$, we introduce the notion of a tangent cone. To begin, recall that the set $K \subseteq \mathbb{R}^n$ is a cone if whenever it contains the vector $z$, it also contains the vector $\alpha z$ for every $\alpha > 0$. Most often one is interested in convex cones (i.e., cones that are convex sets). The polar of the convex cone $K$ is a new convex cone
$$K^\circ = \{y \in \mathbb{R}^n : y'z \le 0 \text{ for all } z \in K\}.$$
In words, the polar cone of $K$ contains all vectors that form a weakly obtuse angle with each vector in $K$ (Figure 1.3.11).

[Figure 1.3.11: A convex cone and its polar cone.]

Exercise 1.3.1. Let $K$ be a convex cone. Show that
(i) $K^\circ$ is a closed convex cone, and $K^\circ = (\mathrm{cl}(K))^\circ$. (Hence, $K^\circ$ contains the origin.)
(ii) $K$ is a subspace of $\mathbb{R}^n$ if and only if $K$ is symmetric, in the sense that $K = -K$. Moreover, in this case, $K^\circ = K^\perp$.
(iii) $(K^\circ)^\circ = \mathrm{cl}(K)$. (Hint: To show that $(K^\circ)^\circ \subseteq \mathrm{cl}(K)$, use the separating hyperplane theorem.)

The last result above tells us that $(K^\circ)^\circ = K$ for any closed convex cone $K$; thus, polarity defines an involution on the set of closed convex cones. Another fundamental result about closed convex cones and their polar cones, the Moreau Decomposition Theorem, is not needed until later chapters. But as the preceding discussion provides the proper context to present this result, we do so in Appendix 1.B.

If $C \subset \mathbb{R}^n$ is a closed convex set, then the tangent cone of $C$ at state $x \in C$, denoted $TC(x)$, is the closed convex cone
$$TC(x) = \mathrm{cl}\left(\{z \in \mathbb{R}^n : z = \alpha(y - x) \text{ for some } y \in C \text{ and some } \alpha \ge 0\}\right).$$
If $C \subset \mathbb{R}^n$ is a polytope (i.e., the convex hull of a finite number of points), then the closure operation is redundant. In this case, $TC(x)$ is the set of directions of motion from $x$ that initially remain in $C$; more generally, $TC(x)$ also contains the limits of such directions. (To see the difference, construct $TC(x)$ for $x \in \mathrm{bd}(C)$ when $C$ is a square and when $C$ is a circle.) If $x$ is in the relative interior of $C$ (i.e., the interior of $C$ relative to $\mathrm{aff}(C)$), then $TC(x)$ is just $TC$, the tangent space of $C$; otherwise, $TC(x)$ is a strict subset of $TC$.

Finally, define the normal cone of $C$ at $x$ to be the polar of the tangent cone of $C$ at $x$: that is, $NC(x) = (TC(x))^\circ$. By definition, $NC(x)$ is a closed convex cone, and it contains every vector that forms a weakly obtuse angle with every feasible displacement vector at $x$.

In Figures 1.3.12 and 1.3.13, we sketch examples of tangent cones and normal cones when $X$ is the state space for a two-strategy game (i.e., the simplex in $\mathbb{R}^2$) and for a three-strategy game (the simplex in $\mathbb{R}^3$). Since the latter figure is two-dimensional, with the sheet of paper representing the affine hull of $X$, the figure actually displays the projected normal cones $\Phi(NX(x))$.

[Figure 1.3.12: Tangent cones and normal cones for two-strategy games.]
[Figure 1.3.13: Tangent cones and normal cones for three-strategy games.]

1.3.6 Normal Cones and Nash Equilibria

At first glance, normal cones might appear to be less relevant to game theory than tangent cones. Theorem 1.3.2 shows that this impression is false: normal cones and Nash equilibria are intimately linked.

Theorem 1.3.2. Let $F$ be a population game. Then $x \in NE(F)$ if and only if $F(x) \in NX(x)$.

Proof.
$x \in NE(F)$
⇔ $[x^p_i > 0 \Rightarrow F^p_i(x) \ge F^p_j(x)]$ for all $i, j \in S^p$, $p \in \mathcal{P}$
⇔ $(x^p)'F^p(x) \ge (y^p)'F^p(x)$ for all $y^p \in X^p$, $p \in \mathcal{P}$
⇔ $(y^p - x^p)'F^p(x) \le 0$ for all $y^p \in X^p$, $p \in \mathcal{P}$
⇔ $(z^p)'F^p(x) \le 0$ for all $z^p \in TX^p(x^p)$, $p \in \mathcal{P}$
⇔ $F^p(x) \in NX^p(x^p)$ for all $p \in \mathcal{P}$
⇔ $F(x) \in NX(x)$. ∎

Exercise 1.3.3. Justify the last equivalence above.

Theorem 1.3.2 tells us that state $x$ is a Nash equilibrium if and only if the payoff vector $F(x)$ lies in the normal cone of the state space $X$ at $x$. This result provides us with a simple, purely geometric description of the Nash equilibria of population games. Its proof is very simple: some algebra shows that $x$ is a Nash equilibrium if and only if it solves a variational inequality problem—that is, if it satisfies

(1.1)  $(y - x)'F(x) \le 0$ for all $y \in X$.

Applying the definitions of tangent and normal cones then yields the result.
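Theorem 1.3.2 and the variational inequality (1.1) give an easily automated equilibrium test: since the left-hand side of (1.1) is linear in $y$, it suffices to check the inequality at the vertices of the simplex. A sketch for the single-population case (our own code; the tolerance is arbitrary):

```python
import numpy as np

def is_nash_vi(F, x, tol=1e-9):
    """Check the variational inequality (1.1): (y - x)'F(x) <= 0 for all y in X.

    The left-hand side is linear in y, so checking the vertices e_1, ..., e_n
    of the simplex is enough."""
    x = np.asarray(x, dtype=float)
    payoffs = F(x)
    vertices = np.eye(len(x))
    return bool(np.all((vertices - x) @ payoffs <= tol))

F_RPS = lambda x: np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float) @ x
print(is_nash_vi(F_RPS, np.array([1/3, 1/3, 1/3])))  # True
print(is_nash_vi(F_RPS, np.array([1.0, 0.0, 0.0])))  # False
```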
In many cases, it is more convenient to speak in terms of projected payoff vectors and projected normal cones. Corollary 1.3.4 restates Theorem 1.3.2 in these terms.

Corollary 1.3.4. $x \in NE(F)$ if and only if $\Phi F(x) \in \Phi(NX(x))$.

Proof. Clearly, $F(x) \in NX(x)$ implies that $\Phi F(x) \in \Phi(NX(x))$. The reverse implication follows from the facts that $NX(x) = \Phi(NX(x)) + (TX)^\perp$ (see Exercise 1.3.5) and that $\Phi((TX)^\perp) = \{0\}$ (which is the equation that defines $\Phi$ as the orthogonal projection of $\mathbb{R}^n$ onto $TX$). ∎

Exercise 1.3.5.
(i) Using the notions of relative and average payoffs discussed in Section 1.3.3, explain the intuition behind Corollary 1.3.4 in the single population case.
(ii) Prove that $NX(x) = \Phi(NX(x)) + (TX)^\perp$.
(iii) Only one of the two statements to follow is equivalent to $x \in NE(F)$: $F(x) \in \Phi(NX(x))$, or $\Phi F(x) \in NX(x)$. Which is it?

In Figures 1.3.7 through 1.3.10, we mark the Nash equilibria of our four population games with dots. In the two-strategy games $F^{C2}$ and $F^{HD}$, the Nash equilibria are those states $x$ at which the payoff vector $F(x)$ lies in the normal cone $NX(x)$, as Theorem 1.3.2 requires. In both these games and in the three-strategy games $F^{C3}$ and $F^{RPS}$, the Nash equilibria are those states $x$ at which the projected payoff vector $\Phi F(x)$ lies in the projected normal cone $\Phi(NX(x))$, as Corollary 1.3.4 demands. Even if the dots were not drawn, we could locate the Nash equilibria of all four games by examining the arrows alone.

Exercise 1.3.6. Compute the Nash equilibria of the four games studied above, and verify that the equilibria appear in the correct positions in Figures 1.3.7 through 1.3.10.

Exercise 1.3.7. Two-population two-strategy games. Let $F$ be a game played by two unit mass populations ($\mathsf{p} = 2$) with two strategies for each ($n^1 = n^2 = 2$).
(i) Describe the state space $X$, tangent space $TX$, and orthogonal projection $\Phi$ for this setting.
(ii) Show that the state space $X$ can be represented on a sheet of paper by a unit square, with the upper left vertex representing the state at which all agents in both populations play strategy 1, and with the upper right vertex representing the state at which all agents in population 1 play strategy 1 and all agents in population 2 play strategy 2. Explain how the projected payoff vectors $\Phi F(x)$ can be represented as arrows in this diagram.
(iii) At (a) a point in the interior of the square, (b) a non-vertex boundary point, and (c) a vertex, draw the tangent cone $TX(x)$ and the projected normal cone $\Phi(NX(x))$, and give algebraic descriptions of each.
(iv) Suppose we draw projected payoff vectors $\Phi F(x)$ in the manner you described in part (ii) and projected normal cones in the manner you described in part (iii). Verify that in each of the cases considered in part (iii), the arrow representing $\Phi F(x)$ is contained in the sketch of $\Phi(NX(x))$ if and only if $x$ is a Nash equilibrium of $F$.

Appendix

1.A Affine Spaces, Tangent Spaces, and Orthogonal Projections

The simplex in $\mathbb{R}^n$, the state space for single population games, is an $(n-1)$-dimensional subset of $\mathbb{R}^n$; state spaces for multipopulation games are Cartesian products of scalar multiples of simplices. For this reason, linear subspaces, affine spaces, and orthogonal projections all play important roles in the study of population games.

1.A.1 Affine Spaces

The set $Z \subseteq \mathbb{R}^n$ is a (linear) subspace of $\mathbb{R}^n$ if it is closed under linear combination: if $z, \hat{z} \in Z$ and $a, b \in \mathbb{R}$, then $az + b\hat{z} \in Z$ as well. Suppose that $Z$ is a subspace of $\mathbb{R}^n$ of dimension $\dim(Z) < n$, and that the set $A$ is a translation of $Z$ by some vector $v \in \mathbb{R}^n$:
$$A = Z + \{v\} = \{x \in \mathbb{R}^n : x = z + v \text{ for some } z \in Z\}.$$
Then we say that A is an affine space of dimension dim(A) = dim(Z). Observe that any vector representing a direction of motion through A is itself an element of Z: if x, y ∈ A, then y − x = (z y + v) − (zx + v) = z y − zx for some zx and z y in Z; since Z is closed under linear combinations, z y − zx ∈ Z. For this reason, the set Z is called the tangent space of A, and we often write TA in place of Z. Since the origin is an element of Z, the translation vector v in the definition A = Z + {v} can be any element of A. But is there a “natural” choice of v? Recall that the orthogonal complement of Z, denoted by Z⊥ , contains the vectors in Rn orthogonal to all elements of Z: that is, Z⊥ = {v ∈ Rn : v z = 0 for all z ∈ Z}. It is easy to show that the set A ∩ Z⊥ contains a single element, which we denote by z⊥ , and that this orthogonal translation vector is the A closest point in Z⊥ to every point in A (in the language of Section 1.A.3 below, PZ⊥ x = z⊥ A for all x ∈ A). We will see that for many purposes, this translation vector is the most convenient choice. Example 1.A.1. Consider the subspace Rn = {z ∈ Rn : 1 z = 0} and the affine space 0 A = Rn + {e1 } = {z ∈ Rn : 1 z = 1}, where 1 = (1, . . . , 1) . Since (Rn )⊥ = span({1}) and 0 0 1 1 A ∩ span({1}) = { n 1}, the vector n 1 is the orthogonal translation vector that generates A. 1 1 In particular, A = Rn + { n 1}, and n 1 is the closest point in span({1}) to every x ∈ A. We 0 illustrate the case in which n = 2 in Figure 1.A.1; note again our convention of using the vertical axis to represent the first component of x = (x1 , x2 ). § 28 ⊥ n (R 0 ) = span({1}) 1 n 1 e1 A n R0 Figure 1.A.1: The state space and its affine hull for two-strategy games. 1.A.2 Affine Hulls of Convex Sets Let Y ⊆ Rn . The affine hull of Y, denoted aff (Y), is the smallest affine space that contains Y. This set can be described as k k ii ik ik i (Y) = x ∈ Rn : x = (1.2) aff λ y for some { y }i=1 ⊂ Y and {λ }i=1 ⊂ R with λ = 1 . i =1 i =1 The vector x is called an affine combination of the vectors yi . If we also required the λi to be nonnegative, x would instead be an convex combination of the yi , and (1.2) would become conv(Y), the convex hull of Y. Now suppose that Y is itself convex, let A = aff (Y) be its affine hull, and let Z = TA be the tangent space of A; then we also call Z = TY the tangent space of Y, as Z contains directions of motion from points in the (relative) interior of Y that stay in Y. We also call dim(Y) = dim(Z) the dimension of Y. In constructing the affine hull of a convex set as in (1.2), it is enough to take affine combinations of a fixed set of dim(Y) + 1 points in Y. To accomplish this, let d = dim(Y), fix y0 ∈ Y arbitrarily, and choose y1 , . . . , yd so that { y1 − y0 , . . . , yd − y0 } is a basis for Z. Then letting λ0 = 1 − d=1 λi , we see that i Z + { y0 } = span({ y1 − y0 , . . . , yd − y0 }) + { y0 } 29 = x ∈ Rn : x = = x ∈ Rn : x = d i=1 d i=0 λi ( yi − y0 ) + y0 for some {λi }d=1 ⊂ R . i λi yi for some {λi }d=0 ⊂ R with i d i=0 λi = 1 = aff (Y). p Example 1.A.2. Population states. Let Xp = {xp ∈ Rn : 1 xp = mp } be the set of population + p p states for a population of mass m . This convex set has affine hull aff (Xp ) = {xp ∈ Rn : p p 1 xp = mp } and tangent space TXp = {zp ∈ Rn : 1 zp = 0} = Rn (cf Example 1.A.1). § 0 Example 1.A.3. Social states. Let X = p∈P Xp be the set of social states for a collection of populations P = {1, . . . , p } with masses m1 , . . . mp . 
This convex set has affine hull p aff (X) = p∈P aff (Xp ) and tangent space TX = p∈P Rn . Thus, if z = (z1 , . . . , zp ) ∈ TX, then 0 each zp has components that sum to zero. § 1.A.3 Orthogonal Projections If V and W are subspaces of Rn , their sum is V + W = span (V ∪ W ), the set of linear combinations of elements of V and W . If V ∩ W = {0}, every x ∈ V + W has a unique decomposition x = v + w with v ∈ V and w ∈ W . In this case, we write V + W as V ⊕ W , and call it the direct sum of V and W . For instance, V ⊕ V ⊥ = Rn for any subspace V ⊆ Rn . Every matrix A ∈ Rn×n defines a linear operator from Rn to itself via x → Ax. To understand the action of this operator, remember that the ith column of A is the image of the standard basis vector ei , and, more generally, that Ax is a linear combination of the columns of A. We call the linear operator P ∈ Rn×n a projection onto the subspace V ⊆ Rn if there is a second subspace W ⊆ Rn satisfying V ∩ W = {0} and V ⊕ W = Rn such that (i) Px = x for all x ∈ V , and (ii) Py = 0 for all y ∈ W . If W = V ⊥ , we call P the orthogonal projection onto V , and write PV in place of P. Every projection onto V maps all points in Rn to points in V . While for any given subspace V there are many projections onto V , the orthogonal projection onto V is unique. For example, 0 0 0 0 P1 = 1 1 and P2 = 0 1 both define projections of R2 onto the horizontal axis {x ∈ R2 : x1 = 0}. (Recall again our convention of representing x1 on the vertical axis.) However, since P2 maps the vertical 30 Figure 1.A.2: A projection. Figure 1.A.3: An orthogonal projection. axis {x ∈ R2 : x2 = 0} to the origin, it is the orthogonal projection. The action of the two projections is illustrated in Figures 1.A.2 and 1.A.3 below. The latter figure illustrates a geometrically obvious property of orthogonal projections: the orthogonal projection of Rn onto V maps each point y ∈ Rn to the closest point to y in V : 2 PV y = argmin y − v . v∈V Projections admit simple algebraic characterizations. Recall that the matrix A ∈ Rn×n is idempotent if A2 = A. It is easy to see that projections are represented by idempotent matrices: once the first application of P projects Rn onto the subspace V , the second application of P does nothing more. In fact, we have Theorem 1.A.4. (i) P is a projection if and only if P is idempotent. 31 Figure 1.A.4: The orthogonal projection Φ in R2 . (ii) P is an orthogonal projection if and only if P is symmetric idempotent. p Example 1.A.5. The orthogonal projection onto Rn . In Example 1.A.2, we saw that the set 0 p p of population states Xp = {xp ∈ Rn : 1 xp = mp } has tangent space TXp = Rn = {zp ∈ + 0 p p p p Rn : 1 zp = 0}. We can decompose the space Rn into the direct sum Rn ⊕ Rn , where 0 1 p p p p 1 Rn = (Rn )⊥ = span({1}). The orthogonal projection of Rn onto Rn is Ξ = np 11 , the matrix 0 1 1 p 1 whose entries all equal np ; to verify this, note that Ξzp = 0 for zp ∈ Rn and Ξ1 = 1. The 0 p np np p p orthogonal projection of R onto R0 is Φ = I − Ξ, since Φz = z for zp ∈ Rn and Φ1 = 0 0 (see Figure 1.A.4 for the case of np = 2). Both Ξ and Φ are clearly symmetric, and since 1 1 1 Ξ2 = ( np 11 )( np 11 ) = np 11 = Ξ and Φ2 = (I − Ξ)(I − Ξ) = I − 2Ξ + Ξ2 = I − 2Ξ + Ξ = I − Ξ = Φ, both are idempotent as well. § More generally, it is easy to show that if P is the orthogonal projection of Rn onto V , then I − P is the orthogonal projection of Rn onto V ⊥ . Or, in the notation introduced above, PV⊥ = I − PV . Example 1.A.6. The orthogonal projection onto TX. 
Recall from Example 1.A.3 that the set of p social states X = p∈P Xp has tangent space TX = p∈P Rn . We can decompose Rn into the 0 p p direct sum p∈P Rn ⊕ p∈P Rn = TX ⊕ p∈P span({1}). The orthogonal projection of Rn 0 1 onto p∈P span({1}) is the block diagonal matrix Ξ = diag(Ξ, . . . , Ξ), while the orthogonal projection of Rn onto TX is Φ = I − Ξ = diag(Φ, . . . , Φ). Of course, Ξ and Φ are both symmetric idempotent. § 32 Example 1.A.7. Ordinary least squares. Suppose we have a collection of n > k data points, {(xi , yi )}n=1 , where each xi ∈ Rk contains k components of “explanatory” data and each i yi ∈ R is the corresponding component of “explainable” data. We write these data as y1 (x1 ) . . . . ∈ Rn×k and y = . ∈ Rn X= . n n y (x ) and assume that X is of full rank. We seek the best linear predictor: the map x → x β that minimizes the sum of squared prediction errors n=1 ( yi − (xi ) β)2 = | y − Xβ|2 . i (The prediction function xi → (xi ) β is a truly linear function of x, in the sense that the input vector 0 generates a prediction of 0. Typically, one seeks an affine prediction function—that is, one that allows for a nonzero constant term. To accomplish this, one sets xi1 = 1 for all i, leaving only k − 1 components of true explanatory data. In this case, the component β1 serves as a constant term in the affine prediction function (x2 , . . . , xk ) → β1 + k=2 xi βi .) i Let span(X) = {Xb : b ∈ Rk } be the column span of X. That β ∈ Rk minimizes | y − Xβ|2 is equivalent to the requirement that Xβ be the closest point to y in the column span of X: 2 Xβ = argmin y − v . v∈span(X) Both calculus and geometry tell us that for this to be true, the vector of prediction errors y − Xβ must be orthogonal to span(X), and hence to each column of X. X ( y − Xβ) = 0. One can verify that X ∈ Rn×k and X X ∈ Rk×k have the same null space, and hence the same (full) rank. Therefore, (X X)−1 exists, and we can solve the previous equation for β: β = (X X)−1 X y. To this point, we have taken X ∈ Rn×k and y ∈ Rn as given and used them to find the vector β ∈ Rk , which we have viewed as defining a map from vectors of explanatory data x ∈ Rk to predictions x β ∈ R. Now, let us take X alone as given and consider the map from vectors of “explainable” data y ∈ Rn to vectors of predictions Xβ = X(X X)−1 X y ∈ Rn . By construction, this linear map P = X(X X)−1 X ∈ Rn×n is the orthogonal projection of Rn onto span(X). P is clearly symmetric (since the inverse of a symmetric matrix is symmetric), 33 and since P2 = X(X X)−1 X X(X X)−1 X = X(X X)−1 X = P it is idempotent as well. § 1.B The Moreau Decomposition Theorem A basic fact about projection onto subspaces holds that for any vector v ∈ Rn and any subspace Z ⊆ Rn , the sum v = PZ (v) + PZ⊥ (v) is the unique decomposition of v into the sum of elements of Z and Z⊥ . The Moreau Decomposition Theorem is a generalization of this result that replaces the subspace Z and its orthogonal complement with a closed convex cone and its polar cone. We use this theorem repeatedly in Chapter 5 in our analysis of the projection dynamic. To state this result, we need an appropriate analogue of orthogonal projection for the context of closed, convex sets. To this end, we define ΠC : Rn → C, the (closest point) projection of Rn onto the closed convex set C by ΠC ( y) = argmin y − x . x∈C This definition generalizes that of the projection PZ onto the subspace Z ⊆ Rn to cases in which the target set is not linear, but merely closed and convex. 
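To make the closest-point projection ΠC concrete before moving on, here is a minimal numerical sketch. The text itself contains no code; Python, the helper name project_to_simplex, and the sort-and-threshold routine (a standard algorithm for this particular target set, not a construction from the text) are our choices. The check at the end simply verifies the defining property of the projection: y − ΠC(y) lies in the normal cone of C at ΠC(y).

```python
import numpy as np

def project_to_simplex(y, mass=1.0):
    """Closest-point projection Pi_C(y) onto C = {x >= 0 : sum(x) = mass},
    computed with the standard sort-and-threshold routine."""
    y = np.asarray(y, dtype=float)
    u = np.sort(y)[::-1]                                   # entries of y, sorted descending
    css = np.cumsum(u) - mass                              # partial sums, shifted by the target mass
    rho = np.nonzero(u - css / np.arange(1, len(y) + 1) > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)                         # the threshold to subtract
    return np.maximum(y - theta, 0.0)

y = np.array([0.8, 0.7, -0.2])
x = project_to_simplex(y)
print(x, x.sum())                                          # a state in X, components summing to 1

# Optimality check: y - x should lie in the normal cone N_X(x),
# i.e. (z - x)'(y - x) <= 0 for every z in the simplex.
rng = np.random.default_rng(0)
zs = rng.dirichlet(np.ones(3), size=1000)
print(np.max((zs - x) @ (y - x)))                          # nonpositive up to rounding
```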
With this definition in hand, we can state our new decomposition theorem; an illustration is provided in Figure 1.B.1.

Theorem 1.B.1 (The Moreau Decomposition Theorem). Let K ⊆ Rn and K° ⊆ Rn be a closed convex cone and its polar cone, and let v ∈ Rn. Then the following are equivalent:

(i) vK = ΠK(v) and vK° = ΠK°(v).
(ii) vK ∈ K, vK° ∈ K°, v = vK + vK°, and vK′vK° = 0.

Figure 1.B.1: The Moreau Decomposition Theorem.

1.N Notes

Congestion games are introduced in Beckmann et al. (1956); see the notes to Chapter 2 for further references. For the biological motivation for the Hawk-Dove game, see Maynard Smith (1982, Chapter 2). Portions of Section 1.3 follow Lahkar and Sandholm (2008). The link between normal cones and Nash equilibria is known from the literature on variational inequalities; see Harker and Pang (1990) and Nagurney (1999). For more on affine spaces, tangent cones, normal cones, the Moreau Decomposition Theorem, and related notions, see Hiriart-Urruty and Lemaréchal (2001). The algebra of orthogonal projections is explained, e.g., in Friedberg et al. (1989, Section 6.6).

CHAPTER TWO

Potential Games, Stable Games, and Supermodular Games

2.0 Introduction

In the previous chapter, we offered a general definition of population games and characterized their Nash equilibria geometrically. Still, since any continuous map F from the state space X to Rn defines a population game, population games with even a moderate number of strategies can be difficult to analyze. In this chapter, we define three important classes of population games: potential games, stable games, and supermodular games. From an economic point of view, each definition places constraints on the sorts of externalities agents impose on one another through their choices in the game. From a mathematical point of view, each definition imposes a structure on payoff functions that renders their analysis relatively simple.

We show through examples that potential games, stable games, and supermodular games each encompass a variety of interesting applications. We also establish the basic properties of each class of games. Among other things, we show that for games in each class, existence of Nash equilibrium can be proved using elementary methods. Beginning in Chapter 6, we investigate the behavior of evolutionary dynamics in the three classes of games; there, our assumptions on the structure of externalities will allow us to establish a range of global convergence results.

The definitions of our three classes of games only require continuity of the payoff functions. If we instead make the stronger assumption that payoffs are smooth (in particular, continuously differentiable), we can avail ourselves of the tools of calculus. Doing so not only simplifies computations, but also allows us to express our definitions and results in simple, useful, and intuitively appealing ways. The techniques from calculus that we require are reviewed in Appendices 2.A and 2.B.

2.1 Full Potential Games

In potential games, all information about payoffs that is relevant to agents' incentives can be captured in a single scalar-valued function. The existence of this function, the game's potential function, underlies potential games' many attractive properties. In this section, we consider full potential games, which can be analyzed using standard multivariate calculus techniques (Appendix 2.A), but at the expense of requiring an extension of the payoff functions' domain.
In Section 2.2, we introduce a definition of potential games that does not use this device, but that instead requires analyses that rely on affine calculus (Appendix 2.B). 2.1.1 Full Population Games To understand the issues alluded to above, consider a game F played by a single population of agents. Since population states for this game are elements of X = {x ∈ Rn : + n k∈S xk = 1}, the simplex in R , the payoff Fi to strategy i is a real-valued function with domain X. In looking for useful properties of population games, a seemingly natural characteristic to consider is the marginal effect of adding new agents playing strategy j on the payoffs ∂F of agents currently choosing strategy i. This effect is captured by the partial derivative ∂x ij . But herein lies the difficulty: if Fi is only defined on the simplex, then even if the function ∂F F is differentiable, the partial derivative ∂x ij does not exist. To ensure that partial derivatives exist, we extend the domain of the game F from the state space X = {x ∈ Rn : k∈S xk = 1} to the entire positive orthant Rn . In mul+ tipopulation settings, the analogous extension is from the original set of social states p X = {x = (x1 , . . . , xp ) ∈ Rn : i∈Sp xi = mp } to Rn . In either setting, we call the game with + + payoffs defined on the positive orthant a full population game. In many interesting cases, one can interpret the extensions of payoffs as specifying the values that payoffs would take were the population sizes to change—see Section 2.1.3. 2.1.2 Definition and Characterization With these preliminaries addressed, we are now prepared to define full potential games. 38 Definition. Let F : Rn → Rn be a full population game. We call F a full potential game if there + exists a continuously differentiable function f : Rn → R satisfying + (2.1) f (x) = F(x) for all x ∈ Rn . + Property (2.1) can be stated more explicitly as ∂f p (x) ∂xi p = Fi (x) for all p ∈ P , i ∈ Sp and x ∈ Rn . + The function f , which is unique up to the addition of a constant, is called the full potential function for the game F. It represents the game’s payoffs in an integrated form. To explain the potential function’s role, suppose that x ∈ X is a population state at p p which F j (x) > Fi (x), so that an agent choosing strategy i ∈ Sp would be better off choosing strategy j ∈ Sp . Now suppose some small group of agents switch from strategy i to p p strategy j. These switches are represented by the displacement vector z = e j − ei , where p ei is the (i, p)th standard basis vector in Rn . The marginal impact that these switches have on the value of potential is therefore ∂f (x) = ∂z f (x) z = ∂f p (x) ∂x j − ∂f p (x) ∂x i p p = F j (x) − Fi (x) > 0. In other words, profitable strategy revisions increase potential. More generally, we will see in later chapters that the “uphill” directions of the potential function include all directions in which reasonable adjustment processes might lead. This fact underlies the many attractive properties that potential games possess. If the map F : Rn → Rn is C1 (continuously differentiable), it is well known that F + admits a potential function if and only if its derivative matrices DF(x) are symmetric (see Appendix 2.A.9). In the current game-theoretic context, we call this condition full externality symmetry. Observation 2.1.1. Suppose the population game F is C1 . 
Then F is a full potential game if and only if it satisfies full externality symmetry: (2.2) DF(x) is symmetric for all x ∈ Rn + 39 More explicitly, F is a potential game if and only if q p (2.3) ∂Fi q (x) ∂x j = ∂F j p (x) ∂x i for all i ∈ Sp , j ∈ Sq , p, q ∈ P , and x ∈ Rn . + Observation 2.1.1 characterizes smooth full potential games in terms of a simple, economically meaningful property: condition (2.2) requires that the effect on the payoffs to strategy i ∈ Sp of introducing new agents choosing strategy j ∈ Sq always equals the effect on the payoffs to strategy j of introducing new agents choosing strategy i. 2.1.3 Examples Our first two examples build on ones studied in Chapter 1. Example 2.1.2. Random matching in normal form games with common interests. Suppose a single population is randomly matched to play symmetric two player normal form game A ∈ Rn×n , generating the population game F(x) = Ax. While earlier we used this formula to define F on the state space X, here we will use it to define F on all of Rn . (While this + choice works very well in the present example, it is not always innocuous, as will see in Section 2.2.) The symmetric normal form game A has common interests if both players always receive the same payoff. This means that Ai j = A ji for all i and j, or, equivalently, that the matrix A is symmetric. Since DF(x) = A, this is precisely what we need for F to be a full potential game. The full potential function for F is 1 f (x) = 2 x Ax, which is one-half of x Ax = i∈S xi Fi (x) = F(x), the aggregate payoff function for F. To cover the multipopulation case, call the normal form game U = (U1 , . . . , Up ) a common interest game if there is a function V : S → R such that Up (s) = V (s) for all s ∈ S and p ∈ P . As before, this means that under any pure strategy profile, all p players earn the same payoff. This normal form game generates the full population game p Fsp (x) = xrr s V (s) s−p ∈S−p rp 40 on Rn . Aggregate payoffs in F are given by + p p xsp Fsp (x) = p F(x) = p∈P sp ∈Sp ∂f p (x) ∂xsp r∈P s∈S Hence, if we let f (x) = s∈S = V (s) 1 xrr = p F(x), we obtain s p xrr = Fsp (x). s V (s) s−p ∈S−p r∈P xrr . s V (s) rp So once again, random matching in a common interest game generates a full potential game in which potential is proportional to aggregate payoffs. § Exercise 2.1.3. In the multipopulation case, check directly that condition (2.2) holds. Example 2.1.4. Congestion games. For ease of exposition, suppose that the congestion game F models behavior in a traffic network. In this environment, an agent taking path j ∈ Sq affects the payoffs of agents choosing path i ∈ Sp through the marginal increases in p q congestion on the links φ ∈ Φi ∩ Φ j that the two paths have in common. But the marginal effect of an agent taking path i on the payoffs of agents choosing path j is identical: q p ∂Fi q (x) ∂x j =− cφ (uφ (x)) = p q φ∈Φi ∩Φ j ∂F j p (x). ∂x i In other words, congestion games satisfy condition (2.2), and so are full potential games. The full potential function for the congestion game F can be written explicitly as uφ (x) f (x) = − φ∈Φ cφ (z) dz. 0 Hence, potential is typically unrelated to aggregate payoffs, which are given by pp F(x) = xi Fi (x) = − p∈P i∈Sp uφ (x)cφ (uφ (x)). φ∈Φ In Section 2.1.6, we offer conditions under which potential and aggregate payoffs are directly linked. § Example 2.1.5. Cournot competition. Consider a unit mass population of firms who choose production quantities from the set S = {1, . . . , n}. 
The firms’ aggregate production is given 41 by a(x) = i∈S i xi . Let p : R+ → R+ denote inverse demand, a decreasing function of aggregate production. Let the firms’ production cost function c : S → R be arbitrary. Then the payoff to a firm producing quantity i ∈ S at population state x ∈ X is Fi (x) = i p(a(x)) − c(i). It is easy to check that F is a full potential game with full potential function a(x) f (x) = xi c(i). p(z) dz − 0 i∈S In contrast, aggregate payoffs in F are F(x) = xi Fi (x) = a(x)p(a(x)) − i∈S xi c(i). i∈S The difference between the two is a(x) f (x) − F(x) = p(z) − p(a(x)) dz, 0 which is simply consumers’ surplus. Thus, the full potential function f = F + ( f − F) measures the total surplus received by firms and consumers. (Total surplus differs from aggregate payoffs because the latter ignores consumers, who are not modeled as active agents.) § Example 2.1.6. Games generated by variable externality pricing schemes. Population games can be viewed as models of externalities for environments with many agents. One way to force agents to internalize the externalities they impose upon others is to introduce pricing schemes. Given an arbitrary full population game F with aggregate payoff function F, ˜ define an augmented game F as follows: q ˜p Fi (x) = p Fi (x) q xj + q∈P j∈Sq ∂F j p (x). ∂xi The double sum represents the marginal effect that an agent choosing strategy i has on other agents’ payoffs. Suppose that when the game F is played, a social planner charges each agent choosing strategy i a tax equal to this double sum, and that each agent’s payoff function is separable ˜ in this tax. The population game generated by this intervention is F. 42 Now observe that (2.4) ∂F ∂ p (x) = p ∂xi ∂xi q qq x j F j (x) = p Fi (x) q xj + q∈P j∈Sq q∈P j∈Sq ∂F j p (x) ∂x i ˜p = Fi (x). ˜ Equation (2.4) tells us that the augmented game F is a full potential game, and that its full potential function is the aggregate payoff function of the original game F. Hence, changes in strategy which are profitable in the augmented game increase efficiency with respect to the payoffs of the original game. § 2.1.4 Nash Equilibria of Full Potential Games We saw in Section 2.1.2 that in full potential games, profitable strategy revisions increase potential. It is therefore natural to expect that Nash equilibria of full potential games are related to local maximizers of potential. To investigate this idea, consider the nonlinear program max f (x) p xi = mp for all p ∈ P , and subject to i∈Sp p xi ≥ 0 for all i ∈ Sp and p ∈ P . The Lagrangian for this maximization problem is L(x, µ, λ) = f (x) + p∈P µp mp − i∈Sp p xi pp + λi xi , p∈P i∈Sp so the Kuhn-Tucker first order necessary conditions for maximization are (2.5) (2.6) (2.7) ∂f p (x) ∂xi pp λi xi = p λi ≥ 0 p = µp − λi for all i ∈ Sp and p ∈ P , 0, for all i ∈ Sp and p ∈ P , and for all i ∈ Sp and p ∈ P . Let KT( f ) = x ∈ X : (x, µ, λ) satisfies (2.5)-(2.7) for some λ ∈ Rn and µ ∈ Rp . Theorem 2.1.7 shows that the Kuhn-Tucker first order conditions for maximizing f on X 43 characterize the Nash equilibria of F. Theorem 2.1.7. If F is a full potential game with full potential function f, then NE(F) = KT( f ). Proof. If x is a Nash equilibrium of F, then since F = f , the Kuhn-Tucker conditions p p p are satisfied by x, µp = max j∈Sp F j (x), and λi = µp − Fi (x). Conversely, if (x, µ, λ) satisfies the p ∂f p (x) ∂xi p p µ − λj Kuhn-Tucker conditions, then for every p ∈ P , (2.5) and (2.6) imply that Fi (x) = p = µp for all i in the support of xp . 
Furthermore, (2.5) and (2.7) imply that F j (x) = ≤ µp p for all j ∈ Sp . Hence, the support of xp is a subset of argmax j∈Sp F j (x), and so x is a Nash equilibrium of F. Note that the multiplier µp represents the equilibrium payoff in population p, and that the p multiplier λi represents the “payoff slack” of strategy i ∈ Sp . Since the set X satisfies constraint qualification, satisfaction of the Kuhn-Tucker conditions is necessary for local maximization of the full potential function. Thus, Theorem 2.1.7, along with the fact that a continuous function on a compact set achieves its maximum, gives us a simple proof of existence of Nash equilibrium in full potential games. On the other hand, the Kuhn-Tucker conditions are not sufficient for maximizing potential. Therefore, while all local maximizers of potential are Nash equilibria, not all Nash equilibria locally maximize potential. Example 2.1.8. Consider again the 123 Coordination game introduced in Chapter 1: F1 (x) 1 0 0 x1 x1 F(x) = F2 (x) = 0 2 0 x2 = 2x2 . F (x) 0 0 3 x 3x 3 3 3 3 The full potential function for this game is the convex function f (x) = 1 (x1 )2 + (x2 )2 + 2 (x3 )2 . 2 The three pure states, e1 = (1, 0, 0), e2 = (0, 1, 0), and e3 = (0, 0, 1), all locally maximize potential, and so are Nash equilibria. To focus on one instance, note that the Kuhn-Tucker conditions are satisfied at state e1 by the multipliers µ = 1, λ1 = 0, and λ2 = λ3 = 1. The 632 global minimizer of potential, ( 11 , 11 , 11 ), is a state at which payoffs to all three strategies are equal, and is therefore a Nash equilibrium as well; the Kuhn-Tucker conditions are 6 satisfied here with multipliers µ = 11 and λ1 = λ2 = λ3 = 0. Finally, at each of the boundary 2 states ( 2 , 1 , 0), ( 3 , 0, 1 ), and (0, 3 , 5 ), the strategies which are played receive equal payoffs, 33 4 4 5 which exceed the payoff accruing to the unused strategy; thus, these states are Nash equilibria as well. These states, coupled with the appropriate multipliers, also satisfy the Kuhn-Tucker conditions: for example, x = ( 2 , 1 , 0) satisfies the conditions with µ = 2 , 33 3 λ1 = λ2 = 0 and λ3 = 2 . This exhausts the set of Nash equilibria of F. 3 44 Figure 2.1.1: The potential function of 123 Coordination. Figures 2.1.1 and 2.1.2 contain a graph and a contour plot of the full potential function f , and show the connection between this function and the Nash equilibria of F. § The previous example demonstrates that in general, potential games can possess Nash equilibria that do not maximize potential. But if the full potential function f is concave, the Kuhn-Tucker conditions are not only necessary for maximizing f ; they are also sufficient. This fact gives us the following corollary to Theorem 2.1.7. Corollary 2.1.9. (i) If f is concave on X, then NE(F) is the convex set of maximizers of f on X. (ii) If f is strictly concave on X, then NE(F) is a singleton containing the unique maximizer of f on X. Example 2.1.10. A network of highways connects Home and Work. The two towns are separated by a river. Highways A and D are expressways that go around bends in the river, and that do not become congested easily: cA (u) = cD (u) = 4 + 20u. Highways B and C cross the river over two short but easily congested bridges: cB (u) = cC (u) = 2 + 30u2 . In order to create a direct path between the towns, a city planner considers building a new expressway E that includes a third bridge over the river. Delays on this new expressway are described by cE (u) = 1 + 20u. 
The highway network as a whole is pictured in Figure 2.1.3. 45 1 2 3 Figure 2.1.2: A contour plot of the potential function of 123 Coordination. Before link E is constructed, there are two paths from Home to Work: path 1 traverses links A and B, while path 2 traverses links C and D. The equilibrium driving pattern splits the drivers equally over the two paths, yielding an equilibrium driving time (= –equilibrium payoff) of 23.5 on each. After link E is constructed, drivers may also take path 3, which uses links C, E, and B. (We assume that traffic on link E only flows to the right.) The resulting population game has payoff functions F1 (x) −(6 + 20x1 + 30(x1 + x3 )2 ) F (x) = 2 F(x) = 2 −(6 + 20x2 + 30(x2 + x3 ) ) F (x) −(5 + 20x + 30(x + x )2 + 30(x + x )2 ) 3 3 1 3 2 3 and full potential function f (x) = − 6x1 + 6x2 + 5x3 + 10((x1 )2 + (x2 )2 + (x3 )2 + (x1 + x3 )3 + (x2 + x3 )3 ) . Figures 2.1.4 and 2.1.5 contain a graph and a contour plot of the full potential function. Note that the full potential function for the two-path game is the restriction of f to the states at which x3 = 0. Evidently, the full potential function f is concave. (This is no coincidence—see Exercise 2.1.11 below.) The unique maximizer of potential on X, the state x ≈ (.4616, .4616, .0768), is the unique Nash equilibrium of the game. In this equilibrium, the driving time on each 46 A B Work Home C E D Figure 2.1.3: A highway network. Figure 2.1.4: The potential function of a congestion game. 47 1 2 3 Figure 2.1.5: A contour plot of the congestion game’s potential function. path is approximately 23.93, which exceeds the original equilibrium time of 23.5. In other words, adding an additional link to the network actually increases equilibrium driving times—a phenomenon known as Braess’ paradox. The intuition behind this phenomenon is easy to see. By opening up the new link E, we make it possible for a single driver on path 3 to use both of the easily congested bridges, B and C. But while using path 3 is bad for the population as a whole, it is appealing to individual drivers, as drivers do not account for the negative externalities their use of the bridges imposes on others. § Exercise 2.1.11. Uniqueness of equilibrium in congestion games. (i) Let F be a congestion game with cost functions cφ and full potential function f . Show that if each cφ is increasing, then f is concave, which implies that NE(F) is the convex set of maximizers of f on X. (Hint: Fix y, z ∈ X, let x(t) = (1 − t) y + tz, and show that g(t) = f (x(t)) is concave.) (ii) Construct a congestion game in which each cφ is strictly increasing but in which NE(F) is not a singleton. (iii) Show that in case (ii), the equilibrium link utilization levels uφ are unique. (Hint: Since f (x) only depends on the state x through the utilization levels uφ (x), we can define a function g : U → R on U = {vφ }φ∈Φ : vφ = uφ (x) for some x ∈ Rn by + g(uφ (x)) = f (x). Show that x maximizes f on X if and only if uφ (x) maximizes g on U.) 48 Exercise 2.1.12. Example 2.1.6 shows that by adding state-dependent congestion charges to a congestion game, a planner can ensure that drivers use the network efficiently, in the sense of minimizing average travel times. Show that these congestion charges can be imposed on a link-by-link basis, and that the price on each link need only depend on the number of drivers on that link. Exercise 2.1.13. 
Show that Cournot competition (Example 2.1.5) with a strictly decreasing inverse demand function generates a potential game with a strictly concave potential function, and hence admits a unique Nash equilibrium. Exercise 2.1.14. Entry and exit. When we define a full population game F : Rn → Rn , we + specify the payoffs of each of the n strategies for all possible vectors of population masses. It is only a small additional step to allow agents to enter and leave the game. Fixing a vector of population masses (m1 , . . . , mp ), we define a population game with entry and exit by p assuming that the set of feasible social states is X = {x = (x1 , . . . , xp ) ∈ Rn : i∈Sp xi ≤ mp }, + and that an agent who exits the game receives a payoff of 0. (i) State an appropriate definition of Nash equilibrium for population games with entry and exit. (ii) A population game with entry and exit is a potential game if it satisfies full externality symmetry (2.2). Prove an analogue of Theorem 2.1.7 for such games. 2.1.5 The Geometry of Nash Equilibrium in Full Potential Games Theorem 2.1.7 shows that if F is a potential game with potential function f , then the set of states satisfying the Kuhn-Tucker first order conditions for maximizing f are precisely the Nash equilibria of F. We now offer a geometric proof of this result, and discuss its implications. The nonlinear program from Section 2.1.4 seeks to maximize the function f on the polytope X. What do the Kuhn-Tucker conditions for this program mean? The Kuhn-Tucker conditions adapt the classical approach to optimization based on linearization to settings with both equality and inequality constraints. In the current context, these conditions embody the following construction: To begin, one linearizes the objective function f at the state x ∈ X of interest, replacing it with the function l f ,x ( y) = f (x) + f (x) ( y − x). Then, one determines whether the linearized function reaches its maximum on X at state x. Of course, this method can accept states that are not maximizers: for instance, if x is an interior local minimizer of f , then the linearization l f ,x is a constant function, and so is maximized everywhere in X. But because X is a polytope 49 (in particular, since constraint qualification holds), x must maximize l f ,x on X if it is to maximize f on X. With this interpretation of the Kuhn-Tucker conditions in hand, we can offer a simple geometric proof that NE(F) = KT( f ). The analysis employs our normal cone characterization of Nash equilibrium from Chapter 1. Theorem 2.1.7: If F is a full potential game with full potential function f, then NE(F) = KT( f ). Second proof. x ∈ KT( f ) ⇔ x maximizes l f ,x on X. ⇔ z ∈ TX(x) ⇒ ⇔ f (x) z ≤ 0 f (x) ∈ NX(x) ⇔ F(x) ∈ NX(x) ⇔ x ∈ NE(F). This proof is easy to explain in words. As we have argued, satisfying the Kuhn-Tucker conditions for f on X is equivalent to maximizing the linearized version of f on X. This in turn is equivalent to the requirement that if z is in the tangent cone of X at x—that is, if z is a feasible displacement direction from x—then z forms a weakly obtuse angle with the gradient vector f (x), representing the direction in which f increases fastest. But this is precisely what it means for f (x) to lie in the normal cone NX(x). The definition of potential tells us that we can replace f (x) with F(x); and we know from Chapter 1 that F(x) ∈ NX(x) means that x is a Nash equilibrium of F. This argument sheds new light on Theorem 2.1.7. 
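As a numerical companion to this geometric reading of Theorem 2.1.7, the short sketch below checks the candidate states of the 123 Coordination game from Example 2.1.8 against the support form of the Nash condition (equivalently, F(x) ∈ NX(x)) and reports the value of potential at each. The game and the listed states come from that example; Python, the helper is_nash, and the numerical tolerance are our own additions, so treat this as an illustration rather than part of the text's argument.

```python
import numpy as np
from fractions import Fraction as Fr

# 123 Coordination (Example 2.1.8): F(x) = (x1, 2*x2, 3*x3),
# with full potential f(x) = (x1^2 + 2*x2^2 + 3*x3^2) / 2.
A = np.diag([1.0, 2.0, 3.0])
F = lambda x: A @ x
f = lambda x: 0.5 * x @ A @ x

def is_nash(x, tol=1e-9):
    """Support-based Nash test: every strategy in the support of x
    must earn the maximal payoff, i.e. F(x) lies in N_X(x)."""
    pay = F(x)
    return all(pay[i] >= pay.max() - tol for i in range(len(x)) if x[i] > tol)

states = {
    "e1": (1, 0, 0), "e2": (0, 1, 0), "e3": (0, 0, 1),
    "interior": (Fr(6, 11), Fr(3, 11), Fr(2, 11)),
    "edge 1-2": (Fr(2, 3), Fr(1, 3), 0),
    "edge 1-3": (Fr(3, 4), 0, Fr(1, 4)),
    "edge 2-3": (0, Fr(3, 5), Fr(2, 5)),
    "non-equilibrium": (Fr(1, 2), Fr(1, 2), 0),
}
for name, state in states.items():
    x = np.array([float(v) for v in state])
    print(f"{name:16s}  Nash: {is_nash(x)!s:5s}  potential: {f(x):.4f}")
```

The output reproduces Example 2.1.8: the three vertices, the interior state, and the three edge states pass the test, while the last state does not, and the interior equilibrium attains the smallest potential value among the equilibria.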
The Kuhn-Tucker conditions, which provide a way of finding the maximizers of the function f , are stated in terms of the gradient vectors f (x). At first glance, it seems rather odd to replace f (x) with some non-integrable map F: after all, what is the point of the Kuhn-Tucker conditions when there is no function to maximize? But from the geometric point of view, replacing f (x) with F makes perfect sense. When the Kuhn-Tucker conditions are viewed in geometric terms— namely, in the form f (x) ∈ NX(x)—they become a restatement of the Nash equilibrium condition; the fact that f (x) is a gradient vector plays no role. So to summarize, the Nash equilibrium condition F(x) ∈ NX(x) is identical to the Kuhn-Tucker conditions, but applies whether or not the map F is integrable. Exercise 2.1.15. Let F be a full potential game with full potential function f . Let C ⊆ NE(F) be smoothly connected, in the sense that if x, y ∈ C, then there exists a piecewise C1 path α : [0, 1] → C with α(0) = x and α(1) = y. Show that f is constant on C. (Hint: Use the Fundamental Theorem of Calculus and the fact that F(x) ∈ NX(x) for all x ∈ NE(F), along 50 with the fact that when α(t) = x and α is differentiable at x, both α (t) and −α (t) are in TX(x).) 2.1.6 Efficiency in Homogeneous Full Potential Games We saw in Section 2.1.3 that when agents are matched to play normal form games with common interests, the full potential function of the resulting population game is proportional to the game’s aggregate payoff function. How far can we push this connection? Definition. We call a full potential game F homogeneous of degree k if each of its payoff p functions Fi : Rn → R is a homogeneous function of degree k, where k −1. + Example 2.1.16. Random matching in normal form games with common interests. In the single population setting, each payoff function F(x) = Ax is linear, so the full potential game F is homogeneous of degree 1. With p ≥ 2 populations, the payoffs Fp to population p’s strategies are multilinear in (x1 , . . . , xp−1 , xp+1 , . . . , xp ), so the full potential game F is homogeneous of degree p − 1. § Example 2.1.17. Isoelastic congestion games. Let F be a congestion game with cost functions cφ . For each facility φ ∈ Φ, let ηφ (u) = ucφ (u) cφ (u) denote φ’s cost elasticity, which is well defined whenever cφ (u) 0. We call a congestion game isoelastic with elasticity η ∈ R if ηφ = η for all φ ∈ Φ. Thus, a congestion game is isoelastic if all facilities in Φ are equally sensitive to congestion at all levels of use. Isoelasticity implies that all cost functions are of the form cφ (u) = aφ uη , where the aφ are arbitrary (i.e., positive or negative) scalar constants. (Notice that η cannot be negative, as this would force facility costs to become infinite at u = 0.) Since each uφ is linear in x, each p payoff function Fi is a sum of functions that are homogeneous of degree η in x, and so is itself homogeneous of degree η. Therefore, any isoelastic congestion game with elasticity η is a homogeneous potential game of degree η. § The efficiency properties of homogeneous potential games are consequences of the following theorem. Theorem 2.1.18. The full potential game F is homogeneous of degree k −1 if and only if the 1 normalized aggregate payoff function k+1 F(x) is a full potential function for F and is homogeneous of degree k + 1 0. 51 1 Proof. If the potential game F is homogeneous of degree k −1, then k+1 F(x) = pp 1 p∈P j∈Sp x j F j (x) is clearly homogeneous of degree k + 1. 
Therefore, condition (2.2) k +1 and Euler’s law imply that ∂ p ∂xi 1 1 F(x) = k+1 k+1 q∈P 1 = k+1 q ∂F j q p x j p (x) + Fi (x) ∂x i j∈Sq p ∂Fi q p x j q (x) + Fi (x) ∂x q q∈P j∈S j 1 p p kFi (x) + Fi (x) k+1 p = Fi (x), = 1 1 so k+1 F is a full potential function for F. On the other hand, if k+1 F is homogeneous of p 1 degree k +1 0 and is a full potential function for F, then each payoff function Fi = ∂∂p ( k+1 F) xi is homogeneous of degree k, so the converse statement follows. To understand the connection between homogeneity and efficiency, consider the expression ∂∂p F(x), which represents the impact of an agent who chooses strategy i on xi aggregate payoffs. Recalling Example 2.1.6, we split this impact into two terms. The q first term, q q ∂F j j x j ∂xp (x), i represents the impact of this agent’s behavior on his opponents’ p payoffs. The second term, Fi (x), represents the agent’s own payoffs. In homogeneous potential games, these two effects are precisely balanced: the payoff an agent receives from choosing a strategy is directly proportional to the social impact of his choice. For this reason, self-interested behavior leads to desirable social outcomes. Observe that if a potential game is homogeneous of degree less than −1, its full potential function is negatively proportional to aggregate payoffs. In this case, self-interested behavior leads to undesirable social outcomes. To remove this case from consideration, we call a potential game positively homogeneous if its full potential function is homogeneous of strictly positive degree, so that the game itself is homogeneous of degree k > −1. With this definition in hand, we can present a result on the efficiency of Nash equilibria. We call the social state x locally efficient in game F (x ∈ LE(F)) if there exists an ε > 0 such that F(x) ≥ F( y) for all y ∈ X within ε of x. If this inequality holds for all y ∈ X, we call x globally efficient (x ∈ GE(F)). Corollary 2.1.19. (i) If the full potential game F is positively homogeneous, then LE(F) ⊆ NE(F). (ii) If in addition its full potential function f is concave, then GE(F) = LE(F) = NE(F). 52 Exercise 2.1.20. Establish these claims. Exercise 2.1.21. Let F be a congestion game with nondecreasing affine cost functions: cφ (u) = aφ u + bφ . Suppose that within each population, the fixed cost of each route is equal: bφ = bp for all i ∈ Sp and p ∈ P . p φ∈Φi Show that NE(F) = GE(F). 2.1.7 Inefficiency Bounds for Congestion Games The results from the previous section provide stringent conditions under which Nash equilibria of congestion games are efficient. Since exact efficiency rarely obtains, it is natural to ask just how inefficient equilibrium behavior can be. We address this question in the context of congestion games with nondecreasing cost functions—in other words, congestion games in which congestion is a bad. It will be convenient to use notation tailored to the questions at hand. Given the facilities φ and the nondecreasing cost functions cφ , we let p p Ci (x) = −Fi (x) = cφ (uφ (x)) p φ∈Φi denote the cost of strategy i ∈ Sp at state x and let p C(x) = −F(x) = p xi Ci (x) = p∈P i∈Sp uφ (x)cφ (uφ (x)) φ∈Φ denote social cost at state x. We refer to the resulting congestion game either as C or as (C, m) (to emphasize the population masses m). When we introduce alternative cost functions γφ , we replace C with Γ in the notation above. 
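Before turning to the bounds themselves, here is a small sketch of the notation just introduced. The three-path, two-link network and its cost functions below are invented purely for illustration and appear nowhere in the text; Python and the helper names are likewise our choices. The sketch computes the link utilizations uφ(x), the strategy costs Ci(x), and the social cost C(x) at one state.

```python
import numpy as np

# Hypothetical single-population congestion game: strategies are paths,
# each path uses a subset of the facilities (links).
paths = {0: [0], 1: [1], 2: [0, 1]}            # path -> facilities used (made up)
cost = [lambda u: 1.0 + u, lambda u: 3.0 * u]  # nondecreasing link cost functions (made up)

def link_loads(x):
    """u_phi(x): total mass of agents using each facility."""
    u = np.zeros(len(cost))
    for i, phis in paths.items():
        for phi in phis:
            u[phi] += x[i]
    return u

def path_costs(x):
    """C_i(x) = sum of c_phi(u_phi(x)) over the facilities in path i."""
    u = link_loads(x)
    return np.array([sum(cost[phi](u[phi]) for phi in paths[i]) for i in paths])

def social_cost(x):
    """C(x) = sum_phi u_phi(x) c_phi(u_phi(x)) = x'C(x)."""
    u = link_loads(x)
    return float(sum(u[phi] * cost[phi](u[phi]) for phi in range(len(cost))))

x = np.array([0.5, 0.3, 0.2])                  # a state for a unit-mass population
print(path_costs(x))                           # C_i(x) for each path
print(social_cost(x))                          # C(x)
```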
One approach to bounding the inefficiency of equilibria is to compare the equilibrium social cost to the minimal social cost in a game with additional agents. Proposition 2.1.22. Let C be a congestion game with nondecreasing cost functions. Let x∗ be a Nash equilibrium of (C, m), and let y be a feasible state in (C, 2m). Then C(x∗ ) ≤ C( y). Exercise 2.1.23. This exercise outlines a proof of Proposition 2.1.22. (i) Define the cost functions γφ by γφ (u) = max{cφ (uφ (x∗ )), cφ (u)}. Show that u(γφ (u) − cφ (u)) ≤ cφ (uφ (x∗ )) uφ (x∗ ). 53 (ii) Show that Γi ( y) ≥ min j∈Sp C j (x∗ ). p p (iii) Use parts (i) and (ii) to show that Γ( y) − C( y) ≤ C(x∗ ) and that Γ( y) ≥ 2C(x∗ ), and conclude that C(x∗ ) ≤ C( y). Exercise 2.1.24. This exercise applies Proposition 2.1.22 to settings with fixed population masses but varying cost functions. ˜ (i) Show that the equilibrium social cost under cost functions cφ (u) = 1 cφ ( u ) is bounded 2 2 above by the minimal social cost under cost functions cφ . (ii) Let C be a congestion game with cost functions cφ (u) = (kφ − u)−1 for some capacities kφ > 0. (We assume that population masses are small enough that no edge can reach its capacity.) Using part (i), show that the equilibrium social cost when capacities are 2k is bounded above by the minimal social cost when capacities are k. In other words, doubling the capacities of the edges reduces costs at least as much as enforcing efficient behavior under the original capacities. A more direct way of understanding inefficiency is to bound a game’s inefficiency ratio: the ratio between the game’s equilibrium social cost and its minimal feasible social cost. Example 2.1.25. A highway network consisting of two parallel links is to be traversed by a unit mass of drivers. The links’ cost functions are c1 (u) = 1 and c2 (u) = u. In the unique Nash equilibrium of this game, all drivers travel on route 2, creating a social cost of 1. The efficient state, which minimizes C(x) = x1 + (x2 )2 , is xmin = ( 1 , 1 ); it generates a social cost 22 3 of C(xmin ) = 4 . Thus, the inefficiency ratio in this game is 4 . 3 The next result describes an easily established upper bound on inefficiency ratios. Proposition 2.1.26. Suppose that the cost functions cφ are nondecreasing and satisfy ucφ (u) ≤ u α 0 cφ (z) dz for all u ≥ 0. If x∗ ∈ NE(C) and x ∈ X, then C(x∗ ) ≤ αC(x). Exercise 2.1.27. (i) Prove Proposition 2.1.26. (Hint: Use a potential function argument.) (ii) Show that if cost functions in C are polynomials of degree at most k with nonnegative coefficients, then the inefficiency ratio in C is at most k + 1. Exercise 2.1.27 tells us that the inefficiency ratio of a congestion game with affine cost functions cannot exceed 2. Is it possible to establish a smaller upper bound? We saw in 4 Example 2.1.25 that inefficiency ratios as high as 3 can arise in very simple games with affine cost functions. Amazingly, 4 is the highest possible inefficiency ratio for congestion 3 games with cost functions of this form. 54 Theorem 2.1.28. Let C be a congestion game whose cost functions cφ are nonnegative, nondecreasing, and affine: cφ (u) = aφ u + bφ with aφ , bφ ≥ 0. If x∗ ∈ NE(C) and x ∈ X, then C(x∗ ) ≤ 4 C(x). 3 To prove Theorem 2.1.28, we introduce the auxiliary cost functions γφ (u) = 2aφ u + bφ . 
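Before the formal argument, a quick numerical restatement of Example 2.1.25 may help fix ideas. The sketch is ours (Python; the grid search is only a convenience, not part of the text's analysis): it recovers the equilibrium social cost of 1, the minimal social cost of 3/4, and hence the inefficiency ratio of 4/3 that Theorem 2.1.28 identifies as the worst case under affine costs.

```python
import numpy as np

# Example 2.1.25: two parallel links, c1(u) = 1 and c2(u) = u, a unit mass of drivers.
# Writing x2 for the mass on link 2, social cost is C(x) = (1 - x2)*1 + x2*x2.
C = lambda x2: (1.0 - x2) + x2 * x2

# Nash equilibrium: link 2 costs x2 <= 1, the cost of link 1, so all drivers take link 2.
eq_cost = C(1.0)                               # equilibrium social cost = 1

# Efficient state: minimize C over the state space (a fine grid suffices here).
grid = np.linspace(0.0, 1.0, 100001)
min_cost = C(grid).min()                       # = 3/4, attained at x2 = 1/2

print(eq_cost, min_cost, eq_cost / min_cost)   # inefficiency ratio = 4/3
```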
For intuition, note that when facility φ is used by u agents, the marginal externality imposed by one of these agents on each other user of the facility is cφ (u) = aφ , so that the total externalities imposed by this agent on all other users are ucφ (u) = aφ u. Thus, the cost functions γφ are the ones generated by an externality pricing scheme under which agents are always made to pay for the externalities they currently impose on others. Since the aggregate payoff function is concave, it follows that the Nash equilibria of the new game Γ are the efficient states in the original game C (cf Example 2.1.6). Lemma 2.1.29. (i) γφ ( u ) = cφ (u) for all u ≥ 0. 2 d (ii) γφ (u) = du (ucφ (u)) for all u ≥ 0. (iii) ucφ (u) ≥ vcφ (v) + (u − v)γφ (v) for all u, v ≥ 0. (iv) NE(Γ, m ) = GE(C, m ). 2 2 ∗ (v) x∗ ∈ NE(C, m) ⇔ x2 ∈ GE(C, m ). 2 x (vi) C( 2 ) ≥ 1 C(x) for all x ∈ Rn . + 4 Proof. Parts (i) and (ii) are immediate. Part (iii) follows directly from part (ii) and the convexity of the function u → ucφ (u). Part (ii) implies that the concave function − φ uφ (x) cφ (uφ (x)) = −C(x) is a potential function for Γ, which with Corollary 2.1.9 yields part (iv). Parts (v) and (vi) are easily verified by direct calculation. Proof of Theorem 2.1.28: Let x∗ ∈ NE(C, m), and let x be an arbitrary feasible state for ∗ ∗ ∗ (C, m). Parts (iv) and (v) of the lemma imply that x2 ∈ NE(Γ, m ). Hence, −Γ( x2 ) ( y − x2 ) ≤ 2 0 for all y feasible in (Γ, m ) by the normal cone characterization of Nash equilibrium. 2 Therefore: C(x) = cφ (uφ (x)) uφ (x) φ∈Φ ∗ ∗ ∗ cφ (uφ ( x2 )) uφ ( x2 ) + ≥ φ∈Φ ∗ γφ (uφ ( x2 )) uφ (x) − uφ ( x2 ) by (iii) φ∈Φ = ∗ C( x2 ) = ∗ C( x2 ) p∗ Γi ( x2 ) + xi − 1 (x∗ )i 2 p p p∈P i∈Sp p + ∗ p p Γi ( x2 ) · 1 xi + 2 p∈P i∈Sp ∗ 1 2 xi − (x∗ )i p p p∈P i∈Sp p ≥ C( x2 ) + ∗ Γi ( x2 ) · ∗ p Γi ( x2 ) · 1 (x∗ )i 2 p∈P i∈Sp 55 ∗ since Γ( x2 ) ( x − 2 x∗ ) 2 ≥0 ∗ = C( x2 ) + p p Ci (x∗ ) · (x∗ )i 1 2 by (i) p∈P i∈Sp = C( x2 ) + 1 C(x∗ ) 2 ∗) 3 ≥ C(x ∗ by (vi). 4 That the highest inefficiency ratio for a given class of cost functions can be realized in a very simple network is true quite generally. Consider a two-link network with link cost functions c1 (u) = 1 and c2 (u) = uk , where k ≥ 1. With a unit mass population, the Nash equilibrium for this network is x∗ = (0, 1), and has social cost C(x∗ ) = 1; the efficient state is xmin = (1 − (k + 1)−1/k , (k + 1)−1/k ), and has social cost C(xmin ) = 1 − k(k + 1)−(k+1)/k . Remarkably, it is possible to show that the resulting inefficiency ratio of (1 − k(k + 1)−(k+1)/k )−1 is the highest possible in any network whose cost functions are polynomials of degree at most k. See the Notes for further details. 2.2 Potential Games To define full potential games, we first defined full population games by extending the domain of payoffs from the state space X to the positive orthant Rn . While this device for + introducing potential functions is simple, it is often artifical. By using ideas from affine calculus (Appendix 2.B), we can define potential functions for population games without recourse to changes in domain. 2.2.1 Motivating Examples We can motivate the developments to come not only by parsimony, but also by generality, as the following two examples show. Example 2.2.1. Random matching in symmetric normal form potential games. Recall that the symmetric normal form game C ∈ Rn×n is a common interest game if C is a symmetric matrix, so that both players always receive the same payoff. 
We call the symmetric normal form game A ∈ Rn×n a potential game if A = C + 1r for some common interest game C and some arbitrary vector r ∈ Rn . Thus, each player’s payoff is the sum of a common interest term and a term that only depends on his opponent’s choice of strategy. (For the latter point, note that Ai j = Ci j + r j .) Suppose a population of agents is randomly matched to play game A. Since the second payoff term has no effect on agents’ incentives, it is natural to expect our characterization of equilibrium from the previous section to carry over to the current setting. But this does 56 not follow directly from our previous definitions. Suppose we define the full population game F : Rn → Rn as in Example 2.1.2: F(x) = Ax. Then the resulting derivative matrix is + DF(x) = A = C + 1r , and so ∂F j ∂Fi (x) = Ci j + r j , but (x) = C ji + ri . ∂x j ∂xi Therefore, unless r is a constant vector (in which case A itself is symmetric), the full population game F defined above is not a full potential game. § Example 2.2.2. Two-strategy games. Recall that the population game F : X → Rn is a two-strategy game if p = 1 and n = 2. In this setting, the state space X is the simplex in R2 , which can be viewed as a relabelling of the unit interval. Because all functions defined on the unit interval are integrable, it seems natural to expect two-strategy games to admit potential functions. If we wanted to show that F defines a full potential game, we would first need to extend its domain to R2 . Once we do this, the domain is no longer + one-dimensional, so our intuition about the existence of a potential function is lost. § 2.2.2 Definition, Characterizations, and Examples Example 2.2.2 suggests that the source of our difficulties is the extension of payoffs from the original state space X to the full-dimensional set Rn . As the definition of full + potential games relied on this extension, our new notion of potential games will require some additional ideas. The key concepts are the tangent spaces and orthogonal projections introduced in Chapter 1, which we briefly review here. p p Recall that the state space for population p is given by Xp = {xp ∈ Rn : i∈Sp xi = mp }. + p The tangent space of Xp , denoted TXp , is the smallest subspace of Rn that contains all p p p directions of motion through Xp ; it is defined by TXp = Rn ≡ {zp ∈ Rn : i∈Sp zi = 0}. 0 p p p The matrix Φ ∈ Rn ×n , representing the orthogonal projection of Rn onto TXp , is defined p 1 by Φ = I − np 11 . If πp ∈ Rn is a payoff vector, then the projected payoff vector Φπp represents relative payoffs under πp : it preserves the differences between components of πp while normalizing their sum to zero. Changes in the social state x ∈ X = p∈P Xp p are represented by elements of TX = p∈P TX , the tangent space of X. The matrix Φ ∈ Rn×n , representing the orthogonal projection of Rn onto TX, is the block diagonal matrix diag(Φ, . . . , Φ). If π = (π1 , . . . , πp ) ∈ Rn is a payoff vector for the society, then Φπ = (Φπ1 , . . . , Φπp ) normalizes each of the p pieces of the vector π separately. With these preliminaries in hand, we are ready for our new definition. Definition. Let F : X → Rn be a population game. We call F a potential game if it admits a 57 potential function: a C1 function f : X → R that satisfies (2.8) f (x) = ΦF(x) for all x ∈ X. Since the potential function f has domain X, the gradient vector f (x) is by definition an element of the tangent space TX (see Appendix 2.B.3). 
Our definition of potential games requires that this gradient vector always equal ΦF(x), the projection of the payoff vector F(x) onto the subspace TX. At the cost of sacrificing parsimony, one can define potential games without affine calculus by using a function defined throughout Rn to play the role of the potential + function f . To do so, one simply includes the projection Φ on both sides of the analogue of equation (2.8). Observation 2.2.3. If F is a potential game with potential function f : X → R, then any C1 extension f˜ : Rn → R of f satisfies + (2.9) Φ f˜(x) = ΦF(x) for all x ∈ X. Conversely, if the population game F admits a function f˜ satisfying condition (2.9), then F is a potential game, and the restriction f = f˜ X is a potential function for F. This observation is immediate from the relevant definitions. In particular, if f˜ and f agree on X, then for all x ∈ X the gradient vectors f˜(x) and f (x) define identical linear operators on TX, implying that Φ f˜(x) = Φ f (x). But since Φ f (x) = f (x) by definition, it follows that Φ f˜(x) = f (x); this equality and definition (2.8) yield the result. Like full potential games, potential games can be characterized by a symmetry condition on the payoff derivatives DF(x). Since potential games generalize full potential games, the new symmetry condition is less restrictive than the old one. Theorem 2.2.4. Suppose the population game F : X → Rn is C1 . Then F is a potential game if and only if it satisfies externality symmetry: (2.10) DF(x) is symmetric with respect to TX × TX for all x ∈ X. Proof. Immediate from Theorem 2.B.8 in Appendix 2.B. Condition (2.10) demands that at each state x ∈ X, the derivative DF(x) define a symmetric bilinear form on TX × TX: ˆ z DF(x)ˆ = z DF(x)z for all z, z ∈ TX and x ∈ X. zˆ 58 Observation 2.2.5 offers a version of this condition that does not require affine calculus, just as Observation 2.2.3 did for definition (2.8). ˜ Observation 2.2.5. Suppose that the population game F : X → Rn is C1 , and let F : Rn → Rn be + 1 any C extension of F. Then F satisfies externality symmetry (and so is a potential game) if and only if ˜ ΦDF(x)Φ is symmetric for all x ∈ X. The next exercise characterizes externality symmetry in a more intuitive way. Exercise 2.2.6. Show that externality symmetry (2.10) holds if and only if the previous p p q q ˆ equality holds whenever z = e j − ei and z = el − ek . In other words, show that (2.10) is equivalent to p (2.11) p q q ∂(F j − Fi ) ∂(el − ek ) q (x) = q p p ∂(Fl − Fk ) ∂(e j − ei ) (x) for all i, j ∈ Sp , k, l ∈ Sq , p, q ∈ P , and x ∈ X. The left hand side of equation (2.11) captures the change in the payoff to strategy j ∈ Sp relative to strategy i ∈ Sp as agents switch from strategy k ∈ Sq to strategy l ∈ Sq . This effect must equal the change in the payoff of l relative to k as agents switch from i to j, as expressed on the right hand side of (2.11). This description is akin to that of full externality symmetry (2.2) (see the discussion after equation (2.3)), but it only refers to relative payoffs and to feasible changes in the social state. Exercise 2.2.7. Let F be a C1 single population game. Show that F is a potential game if and only if it satisfies triangular integrability: ∂F j ∂Fi ∂Fk (x) + (x) + (x) = 0 for all i, j, k ∈ S and x ∈ X. ∂(e j − ek ) ∂(ek − ei ) ∂(ei − e j ) We now return to the examples that led off the section. Example 2.2.8. Two-strategy games revisited. 
If F : X → R2 is a smooth two-strategy game, its state space X is the simplex in R2 , whose tangent space TX is spanned by the vector ˆ ˆˆ d = e1 − e2 . If z and z are vectors in TX, then z = kd and z = kd for some real numbers k ˆ ˆ ˆ and k; thus, however F is defined, we have that z DF(x)ˆ = kkd DF(x)d = z DF(x)z for all z x ∈ X. In other words, F is a potential game. Even if F is merely continuous, the function 59 f : X → R defined by (2.12) x1 f (x1 , 1 − x1 ) = (F1 (t, 1 − t) − F2 (t, 1 − t)) dt 0 is a potential function for F, so F is still a potential game. (If you think that a on the right hand side of equation (2.12), convince yourself that it is not.) § 1 2 is needed Exercise 2.2.9. Random matching in symmetric normal form potential games. Let A = C + 1r be a symmetric normal form potential game: C ∈ Rn×n is symmetric, and r ∈ Rn is arbitrary. Define the population game F : X → Rn by F(x) = Ax. Use one of the derivative conditions above to verify that F is a potential game, and find a potential function f : X → R for F. Exercise 2.2.10. Random matching in normal form potential games. The normal form game U = (U1 , . . . , Up ) is a potential game if there is a potential function V : S → R and auxiliary functions W p : S−p → R such that Up (s) = V (s) + W p (s−p ) for all s ∈ S and p ∈ P . In a normal form potential game, each player’s payoff is the sum of a common payoff term and a term that only depends on opponents’ behavior. It is easy to show that pure strategy profile s ∈ S is a Nash equilibrium of U if and only if it is a local maximizer of the potential function V . ˜ (i) Define the full population game F : Rn → Rn by + ˜p Fsp (x) = s−p ∈S−p V (s) + W p (s−p ) xrr = s Up (s) rp s−p ∈S−p xrr . s rp ˜ Show that F is not a full potential game. (ii) Define the population game F : X → Rn using the equation from part (i). By verifying condition (2.10), show that F is a potential game. (iii) Construct a potential function for F. 2.2.3 Potential Games and Full Potential Games What is the relationship between full potential games and potential games? In the former case, condition (2.1) requires that payoffs be completely determined by the potential function, which is defined on Rn ; in the latter, condition (2.8) asks only that relative payoffs + be determined by the potential function, now defined just on X. 60 To understand the relationship between the two definitions, take a potential game F : X → Rn with potential function f : X → R as given, and extend f to a full potential function f˜ : Rn → R. Theorem 2.2.11 shows that the link between the full potential game + ˜ F ≡ f˜ and the original game F depends on how the extension f˜ is chosen. Theorem 2.2.11. Let F : X → Rn be a potential game with potential function f : X → R. ˜ Let f˜ : Rn → R be any C1 extension of f , and define the full potential game F : Rn → Rn by + + ˜ F(x) = f˜(x). Then ˜ ˜ (i) The population games F and F X have the same relative payoffs: ΦF(x) = ΦF(x) for all x ∈ X. ˜ (ii) One can choose the extension f˜ in such a way that F and F X are identical. ˜ Part (i) of the theorem shows that the full potential game F generated from an arbitrary extension of the potential function f exhibits the same relative payoffs as F on their ˜ common domain X. It follows that F and F have the same best response correspondences and Nash equilibria, but may exhibit different average payoff levels. Part (ii) of the ˜ theorem shows that by choosing the extension f˜ appropriately, we can make F and F identical on X. 
To accomplish this, we construct the extension f˜ in such a way (equation (2.13) below) that its derivatives at states in X evaluated in directions orthogonal to TX encode information about average payoffs from the original game F. In conclusion, Theorem 2.2.11(ii) demonstrates that if population masses are fixed, so that the relevant set of social states is X, then definition (2.1), while more difficult to check, does not entail a loss of generality relative to definition (2.8). ˜ Proof of Theorem 2.2.11: Part (i) follows from the fact that ΦF(x) = Φ f˜(x) = f (x) = ΦF(x) for all x ∈ X; compare the discussion following Observation 2.2.3. To prove part (ii), we first extend f and F from the state space X to its affine hull ˆ aff (X). Let fˆ : aff (X) → R be a C1 extension of f : X → R, and let gp : aff (X) → R be 1 a continuous extension of population p’s average payoff function, np 1 Fp : X → R. (The existence of these extensions follows from the Whitney Extension Theorem.) Then define ˆ ˆ ˆ ˆ G : aff (X) → Rn by Gp (x) = 1 gp (x), so that F(x) = ΦF(x) + (I − Φ)F(x) = fˆ(x) + G(x) for all ˆ ˆ ˆ ˆ x ∈ X. If after this we define F : aff (X) → Rn by F(x) = fˆ(x) + G(x), then F is a continuous ˆ extension of F, and fˆ(x) = ΦF(x) for all x ∈ aff (X). With this groundwork complete, we can extend f to all of Rn via + (2.13) f˜( y) = f (ξ( y)) + ( y − ξ( y)) F(ξ( y)), where ξ( y) = Φ y + z is the closest point to y in aff (X). (Here, z⊥ is the orthogonal TX TX p translation vector that sends TX to aff (X): namely, (z⊥ )p = mp 1.) Theorem 2.B.10 shows TX n 61 that ˜ f˜ X = F X is identical to F. Theorem 2.2.11 implies that all of our results from Sections 2.1.4 and 2.1.5 on Nash equilibria of full potential games apply unchanged to potential games. On the other hand, the efficiency results from Section 2.1.6 do not. In particular, the proof of Theorem 2.1.18 depends on the game F being a full population game, as the application of Euler’s Theorem makes explicit use of the partial derivatives of F. In fact, to establish that a potential game F has efficiency properties of the sorts described in Section 2.1.6, one must show that F can be extended to a homogeneous full potential game. This should come as no surprise: since the potential function f : X → R only captures relative payoffs, it cannot be used to prove efficiency results, which depend on both relative and average payoffs. Exercise 2.2.12. Consider population games with entry and exit (Exercise 2.1.14). Which derivative condition is the right one for defining potential games in this context, (2.2) or (2.10)? Why? ˜ Exercise 2.2.13. Prove this simple “converse” to Theorem 2.2.11: Suppose F : Rn → Rn is + ˜ : Rn → R. Let F = F and f = f˜ . ˜ a full potential game with full potential function f + X X Then F is a potential game with potential function f . 2.2.4 Passive Games and Constant Games We conclude this section by introducing two simple classes of population games. Definition. The population game H : X → Rn is a passive game if for each state x ∈ X and each population p ∈ P , the payoffs to all of population p’s strategies are equal: p p Hi (x) = H j (x) for all i, j ∈ Sp , p ∈ P , and x ∈ X. Definition. The population game K : X → Rn is a constant game if all strategies’ payoffs are independent of the state: that is, if K(x) = π for all x ∈ X, or, more explicitly, if p p Ki (x) = πi for all i ∈ Sp , p ∈ P , and x ∈ X. 
In a passive game, an agent’s own behavior has no bearing on his payoffs; in a constant game, each agent’s behavior is the sole determinant of his payoffs. The following two propositions provide some alternate characterizations of these games. Proposition 2.2.14. The following statements are equivalent: 62 (i) (ii) (iii) (iv) (v) (vi) H is a passive game. There are functions cp : X → R such that Hp (x) = cp (x)1 for all p ∈ P and x ∈ X. H(x) ∈ (TX)⊥ for all x ∈ X. ΦH(x) = 0 for all x ∈ X. z H(x) = 0 for all z ∈ TX and x ∈ X. H is a potential game whose potential function is constant. Proposition 2.2.15. The following statements are equivalent: (i) K is a constant game. (ii) DK(x) = 0 for all x ∈ X. (iii) K is a potential game that admits a linear potential function. In particular, if K(x) = π is a constant game, then k(x) = π x is a potential function for K. One reason that passive and constant games are interesting is that adding them to a population game from a certain class (the potential games, the stable games, the supermodular games) results in a new game from the same class. For instance, suppose that F is a potential game with potential function f , let H be a passive game, and let K be a constant game with potential function k. Evidently, F + H is also a potential game with potential function f ; thus, adding H to F leaves the Nash equilibria of F unchanged. F + K is also a potential game, but its potential function is not f , but f + k; thus, NE(F) and NE(F + K) generally differ. Similar observations are true for stable games and for supermodular games: adding a passive game or a constant game to a game from either of these classes keeps us in the class, but only adding passive games leaves incentives unchanged. When payoffs are smooth, the invariances just described can be represented in terms of payoff derivatives. As an illustration, recall that the C1 population game F : X → Rn is a potential game if and only if it satisfies externality symmetry: (2.10) DF(x) is symmetric with respect to TX × TX for all x ∈ X. The first TX tells us that condition (2.10) constrains the effects of left multiplication of DF(x) by elements of TX; this restricts the purview of (2.10) to changes in relative payoffs. The second TX tells us that (2.10) constrains the effects of right multiplication of DF(x) by elements of TX; this reflects that we can only evaluate how payoffs change in response to feasible changes in the state. In summary, the action of the derivative matrices DF(x) on TX × TX captures changes in relative payoffs due to feasible changes in the state. We have seen that this action is enough to characterize potential games, and we will soon find that it is enough to characterize stable and supermodular games as well. 63 It follows from this discussion that the additions to F that do not affect the action of its derivative matrices on TX × TX are the ones that do not alter F’s class. These additions are characterized by the following proposition. Proposition 2.2.16. Let G be a C1 population game. Then DG(x) is the null bilinear form on TX × TX for all x ∈ X if and only if G = H + K, where H is a passive game and K is a constant game. Exercise 2.2.17. Prove Propositions 2.2.14, 2.2.15, and 2.2.16. (Hints: For Proposition 2.2.15, prove the equivalence of (i) and (iii) using the Fundamental Theorem of Calculus. For 2.2.16, use the previous propositions, along with the fact that DG(x) is the null bilinear form on TX × TX if and only if ΦDG(x) = 0.) Exercise 2.2.18. 
(i) Suppose H(x) = Ax is a single population passive game. Describe A.
(ii) Suppose K(x) = Ax is a single population constant game. Describe A.

2.3 Stable Games

There are a variety of well-known classes of games whose Nash equilibria lie in a single convex component: for instance, two player zero-sum games, wars of attrition, games with an interior ESS or NSS, and potential games with concave potential functions. This shared property of these seemingly disparate examples springs from a common source: all of these examples are stable games.

2.3.1 Definition

The common structure in the examples above is captured by the following definition.

Definition. The population game F : X → Rn is a stable game if

(2.14)  (y − x)′(F(y) − F(x)) ≤ 0 for all x, y ∈ X.

If the inequality in condition (2.14) holds strictly whenever x ≠ y, we call F a strictly stable game, while if this inequality always binds, we call F a null stable game.

For a first intuition, imagine for the moment that F ≡ ∇f is also a full potential game. In this case, condition (2.14) is simply the requirement that the potential function f be concave. Our definition of stable games thus extends the defining property of concave potential games to games whose payoffs are not integrable.

Stable games whose payoffs are differentiable can be characterized in terms of the action of their derivative matrices DF(x) on TX × TX.

Theorem 2.3.1. Suppose the population game F is C1. Then F is a stable game if and only if it satisfies self-defeating externalities:

(2.15)  DF(x) is negative semidefinite with respect to TX for all x ∈ X.

Before proving Theorem 2.3.1, let us provide some intuition for condition (2.15). This condition asks that z′DF(x)z ≤ 0 for all z ∈ TX and x ∈ X. This requirement is in turn equivalent to

Σ_{p∈P} Σ_{i∈S^p} z_i^p · ∂F_i^p/∂z (x) ≤ 0 for all z ∈ TX and all x ∈ X.

To interpret this expression, recall that the displacement vector z ∈ TX describes the aggregate effect on the population state of strategy revisions by a small group of agents. The derivative ∂F_i^p/∂z (x) represents the marginal effect that these revisions have on the payoffs of agents currently choosing strategy i ∈ S^p. Condition (2.15) considers a weighted sum of these effects, with weights given by the changes in the use of each strategy, and requires that this weighted sum be negative.

Intuitively, a game exhibits self-defeating externalities if the improvements in the payoffs of strategies to which revising agents are switching are always exceeded by the improvements in the payoffs of strategies which revising agents are abandoning. For example, suppose the tangent vector z takes the form z = e_j^p − e_i^p, representing switches by some members of population p from strategy i to strategy j. In this case, the requirement in condition (2.15) reduces to ∂F_j^p/∂z (x) ≤ ∂F_i^p/∂z (x): that is, any performance gains that the switches create for the newly chosen strategy j are dominated by the performance gains created for the abandoned strategy i.

Exercise 2.3.2. (i) Characterize the C1 two-strategy stable games using a derivative condition.
(ii) Recall the Hawk-Dove game introduced in Chapter 1:

F^HD(x) = [−1 2; 0 1][xH; xD] = [2xD − xH; xD].

Verify that F is a stable game. Also, fill in the numerical details of the argument from the previous paragraph for this specific choice of payoff function.

Proof of Theorem 2.3.1: To begin, suppose that F is a stable game. Fix x ∈ X and z ∈ TX; we want to show that z′DF(x)z ≤ 0.
Since F is C1 , it is enough to consider x in the interior of X. In this case, yε = x + εz lies in X whenever |ε| is sufficiently small, and so F( yε ) = F(x) + DF(x)( yε − x) + o( yε − x ). by the definition of DF(x). Premultiplying by yε − x and rearranging yields 2 ( yε − x) (F( yε ) − F(x)) = ( yε − x) DF(x)( yε − x) + o( yε − x ). Since the left hand side is nonpositive and since yε − x = εz, it follows that ε2 z DF(x)z + o(ε2 ) ≤ 0, and hence that z DF(x)z ≤ 0. Next, suppose that condition (2.15) holds. Then if we let α(t) = ty + (1 − t)x, the Fundamental Theorem of Calculus implies that 1 ( y − x) (F( y) − F(x)) = ( y − x) DF(α(t))( y − x) dt 0 1 = ( y − x) DF(α(t))( y − x) dt ≤ 0. 0 Exercise 2.3.3. The derivative condition that characterizes potential games, externality ˆ ˆ symmetry (2.10), requires that z DF(x)ˆ = z DF(x)z. That z and z are chosen separately z means that DF(x) is treated as a bilinear form. Exercise 2.2.6 shows that in order to check ˆ ˆ that (2.10) holds for all z and z in TX, it is enough to show that it holds for all z and z in a p p basis for TX—for example, the set of vectors of the form e j − ei . In contrast, self-defeating externalities (2.15), which requires that z DF(x)z ≤ 0, places the same vector z on both sides of DF(x), thus viewing DF(x) as a quadratic form. Explain why the conclusion of Exercise 2.2.6 does not extend to the present setting. Also, construct p p a 3 × 3 symmetric game A such that z Az ≤ 0 whenever z is of the form e j − ei but such that F(x) = Ax is not a stable game. 66 2.3.2 Examples Example 2.3.4. Random matching in symmetric normal form games with an interior evolutionarily or neutrally stable state. Let A be a symmetric normal form game. State x ∈ X is an evolutionarily stable state (or an evolutionarily stable strategy, or simply an ESS) of A if (2.16) x Ax ≥ y Ax for all y ∈ X; and (2.17) x Ax = y Ax implies that x A y > y A y. Condition (2.16) says that x is a symmetric Nash equilibrium of A. Condition (2.17) says that x performs better against any alternative best reply y than y performs against itself. (Alternatively, (2.16) says that no y ∈ X can strictly invade x, and (2.16) and (2.17) together say that if y can weakly invade x, then x can strictly invade y—see Section 2.3.3 below.) If we weaken condition (2.17) to (2.18) If x Ax = y Ax, then x A y ≥ y A y, then a state satisfying conditions (2.16) and (2.18) is called a neutrally stable state (NSS). Suppose that the ESS x lies in the interior of X. Then as x is an interior Nash equilibrium, all pure and mixed strategies are best responses to it: for all y ∈ X, we have that x Ax = y Ax, or, equivalently, that (x − y) Ax = 0. Next, we can rewrite the inequality in condition (2.17) as (x − y) A y > 0. Subtracting this last expression from the previous one yields (x − y) A(x − y) < 0. But since x is in the interior of X, all tangent vectors z ∈ TX are proportional to x − y for some choice of y ∈ X. Therefore, z DF(x)z = z Az < 0 for all z ∈ TX, and so F is a strictly stable game. Similar reasoning shows that if F admits an interior NSS, then F is a stable game. § Example 2.3.5. Random matching in Rock-Paper-Scissors. In Rock-Paper-Scissors, Paper covers Rock, Scissors cut Paper, and Rock smashes Scissors. If a win in a match is worth w > 0, a loss −l < 0, and a draw 0, we obtain the symmetric normal form game 0 −l w A = w 0 −l , where w, l > 0. −l w 0 When w = l, we refer to A as (standard) RPS; when w > l, we refer to A as good RPS, and when w < l, we refer to A as bad RPS. 
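To get a feel for which of these variants generates a stable population game before working this out analytically, here is a rough numerical check in Python/numpy (the payoff parameters are illustrative, and the sampling-based test is only a sketch of definition (2.14), not a proof):

import numpy as np

def rps(w, l):
    return np.array([[0.0, -l, w], [w, 0.0, -l], [-l, w, 0.0]])

def max_defect(A, trials=10000, seed=0):
    # Largest sampled value of (y - x)'(F(y) - F(x)) for F(x) = Ax;
    # definition (2.14) requires this quantity to be nonpositive.
    rng = np.random.default_rng(seed)
    worst = -np.inf
    for _ in range(trials):
        x, y = rng.dirichlet(np.ones(3)), rng.dirichlet(np.ones(3))
        worst = max(worst, (y - x) @ (A @ (y - x)))
    return worst

for name, (w, l) in [("good", (2, 1)), ("standard", (1, 1)), ("bad", (1, 2))]:
    print(name, max_defect(rps(w, l)))       # < 0, ≈ 0, and > 0 respectively (up to rounding)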
In all cases, the unique symmetric Nash equilibrium 1 of A is ( 3 , 1 , 1 ). 33 To determine the parameter values for which this game generates a stable population 67 game, define d = w − l. Since y A y = 1 y (A + A ) y, it is enough to see when the symmetric 2 matrix 0 d d ˆ = A + A = d 0 d A d d 0 ˆ is negative semidefinite with respect to TX. Now A has one eigenvalue of 2d corresponding to the eigenvector 1, and two eigenvalues of −d corresponding to the orthogonal ˆ eigenspace TX. Thus, z Az = −dz z for each z ∈ TX. Since z z > 0 whenever z 0, we conclude that F is stable if and only if d ≥ 0. Thus, good RPS is strictly stable, standard RPS is stable, and bad RPS is neither. § Exercise 2.3.6. Random matching in wars of attrition. A war of attrition is a two player symmetric normal form game. Strategies represent amounts of time committed to waiting for a scarce resource. If the two players choose times i and j > i, then the j player obtains the resource, worth v, while both players pay a cost of ci : once the first player leaves, the other seizes the resource immediately. If both players choose time i, the resource is split, so payoffs are v − ci each. Show that for any resource value v ∈ R and any cost vector 2 n c ∈ R satisfying c1 ≤ c2 ≤ . . . ≤ cn , random matching in a war of attrition generates a stable game. § Example 2.3.7. Random matching in symmetric zero-sum games. A symmetric two player normal form game A is symmetric zero-sum if A is skew-symmetric: that is, if A ji = −Ai j for all i, j ∈ S. This condition ensures that under single population random matching, the total utility generated in any match is zero. Since payoffs in the resulting single population game are F(x) = Ax, we find that z DF(x)z = z Az = 0 for all vectors z ∈ Rn , and so F is a null stable game. § Example 2.3.8. Random matching in standard zero-sum games. A two player normal form game U = (U1 , U2 ) is zero-sum if U2 = −U1 , so that the two players’ payoffs always add up to zero. Random matching of two populations to play U generates the population game 0 F(x1 , x2 ) = 2 (U ) U1 x1 0 = 0 x2 −(U1 ) U1 x1 . 0 x2 If z is a vector in Rn = Rn +n , then 1 z DF(x)z = (z1 ) 2 0 (z2 ) −(U1 ) U1 z1 = (z1 ) U1 z2 − (z2 ) (U1 ) z1 = 0, 0 z2 68 so F is a null stable game. § Exercise 2.3.9. Random matching in multi-zero-sum games. Let U be a p player normal form game in which each player p ∈ P chooses a single strategy from Sp to simultaneously play a distinct zero-sum contest with each of his p − 1 opponents. We call such a U a multi-zero-sum game. p q (i) When p < q, let Zpq ∈ Rn ×n denote player p’s payoff matrix for his zero-sum contest against player q. Define the normal form game U in terms of the Zpq matrices. (ii) Let F be the p population game generated by random matching in U. Show that z DF(x)z = 0 for all x ∈ X and z ∈ Rn , and hence that F is a null stable game. The previous example and exercise show that random matching across multiple populations can generate a null stable game. Proposition 2.3.10 reveals that null stable games are the only stable games that can be generated in this way. Proposition 2.3.10. Suppose F is a C1 stable game without own-population interactions: Fp (x) is independent of xp for all p ∈ P . Then F is a null stable game. Proof. By Theorem 2.3.1, F is stable if and only if for all x ∈ X, DF(x) is negative semidefinite with respect to TX. 
This requirement on DF(x) can be restated as (i) ΦDF(x)Φ is negative semidefinite (with respect to Rn ); or as (ii) Φ(DF(x) + DF(x) )Φ is negative semidefinite, or (since the previous matrix is symmetric) as (iii) Φ(DF(x) + DF(x) )Φ has all eigenvalues nonpositive. By similar logic, F is null stable if and only if for all x ∈ X, Φ(DF(x) + DF(x) )Φ has all eigenvalues zero (and so is the null matrix). Let Dq Fp (x) be the (p, q)th block of the derivative matrix DF(x). Since Fp is independent of xp , it follows that Dp Fp (x) = 0, and hence that Φ(Dp Fp (x) + Dp Fp (x) )Φ = 0. Since this product is the (p, p)th block of the symmetric matrix Φ(DF(x) + DF(x) )Φ, the latter has zero trace, and so its eigenvalues sum to zero. Therefore, the only way Φ(DF(x) + DF(x) )Φ can be negative semidefinite is if all of its eigenvalues are zero. In other words, if F is stable, it is null stable. Proposition 2.3.10 tells us that within-population interactions are required to obtain a strictly stable game. Thus, strictly stable games can arise when there is matching within a single population to play a symmetric normal form game, but not when there is random matching in multiple populations to play a standard normal form game. On the other hand, strictly stable games can arise in multipopulation matching settings that allow matches both across and within populations (see the Notes). Moreover, in general population games—for instance, in congestion games—within-population interactions are the norm, and strictly stable games are not uncommon. Our remaining examples illustrate this point. 69 Example 2.3.11. (Perturbed) concave potential games. We call F : X → Rn a concave potential game if it is a potential game whose potential function f : X → R is concave. Then since y − x ∈ TX, since the orthogonal projection matrix Φ is symmetric, and since f ≡ ΦF, we find that ( y − x) (F( y) − F(x)) = (Φ( y − x)) (F( y) − F(x)) = ( y − x) (ΦF( y) − ΦF(x)) = ( y − x) ( f ( y) − f (x)) ≤ 0, so F is a stable game. If the inequalities above are satisfied strictly, then they will continue to be satisfied if the payoff functions are slightly perturbed. In other words, perturbations of strictly concave potential games remain strictly stable games. § Example 2.3.12. Negative dominant diagonal games. We call the full population game F a negative dominant diagonal game if it satisfies p p ∂Fi p (x) ∂x i ≤ 0 and ∂Fi p (x) ∂x i ≥ 1 2 ( j,q) q p ∂F j ∂Fi p (x) + q (x) ∂x ∂x j i (i,p) for all i ∈ Sp , p ∈ P , and x ∈ X. The first condition says that choosing strategy i ∈ Sp imposes a negative externality on other users of this strategy. The second condition requires that this externality exceeds the average of (i) the total externalities that strategy i imposes on other strategies and (ii) the total externalities that other strategies impose on strategy i. These conditions are precisely what is required for the matrix DF(x) + DF(x) to have a negative dominant diagonal. The dominant diagonal condition implies that all of the eigenvalues of DF(x) + DF(x) are negative; since DF(x) + DF(x) is also symmetric, it is negative semidefinite. Therefore, DF(x) is negative semidefinite too, and so F is a stable game. § 2.3.3 Invasion In Section 2.3.4, we introduce new equilibrium concepts that are of basic importance for stable games: global neutral stability and global evolutionary stability. These concepts are best understood in terms of the notion of invasion to be presented now. 
Let F : X → Rn be a population game, and let x, y ∈ X be two social states. We say that y can weakly invade x ( y ∈ IF (x)) if ( y − x) F(x) ≥ 0. Similarly, y can strictly invade x ( y ∈ IF (x)) 70 if ( y − x) F(x) > 0. The intuition behind these definitions is simple. Consider a single population of agents who play the game F, and whose initial behavior is described by the state x ∈ X. Now imagine that a very small group of agents decide to switch strategies. After these agents select their new strategies, the distribution of choices within their group is described by some y ∈ X, but since the group is so small the impact of its behavior on the overall population state is negligible. Thus, the average payoff in the invading group is at least as high as that in the incumbent population if y F(x) ≥ x F(x), or equivalently, if y ∈ IF (x). Similarly, the average payoff in the invading group exceeds that in the incumbent population if y ∈ IF (x). The interpretation of invasion does not change much when there are multiple populations. If we write ( y − x) F(x) as p ( yp − xp ) Fp (x), we see that if y ∈ IF (x), there must be some population p for which the small group switching to yp outperforms the incumbent population playing xp at social state x. These stories suggest a link with evolutionary dynamics. If y is any state in X, then the vector y − x is a feasible displacement direction from state x. If in addition y ∈ IF (x), then the direction y − x is not only feasible, but also respects the incentives provided by the underlying game. The invasion conditions also have simple geometric interpretations. That y ∈ IF (x) means that the angle between the displacement vector y − x and the payoff vector F(x) is weakly acute; if y ∈ IF (x), this angle is strictly acute. Figure 2.3.1 sketches the set IF (x) at various states x in a two strategy game. Figure 2.3.2 does the same for a three-strategy game. To draw the latter case, we need the observation that y ∈ IF (x) ⇔ ( y − x) F(x) > 0 ⇔ (Φ( y − x)) F(x) > 0 ⇔ ( y − x) ΦF(x) > 0. In other words, y ∈ IF (x) if and only if the angle between the displacement vector y − x and the projected payoff vector ΦF(x) is strictly acute. 2.3.4 Global Neutral Stability and Global Evolutionary Stability Before introducing our new solution concepts, we first characterize Nash equilibrium in terms of invasion: a Nash equilibrium is a state that no other state can strictly invade. Proposition 2.3.13. x ∈ NE(F) if and only if IF (x) = ∅. 71 e1 I F (x) x F(x) x � F(x) � I F (x) � e2 Figure 2.3.1: Invasion in a two strategy game. e1 ΦF(x) I F (x) x ΦF(x) � x � � I F (x) e2 e3 Figure 2.3.2: Invasion in a three strategy game. 72 Proof. x ∈ NE(F) ⇔ ( y − x) F(x) ≤ 0 for all y ∈ X ⇔ IF (x) = ∅. With this background at hand, we call x ∈ X a globally neutrally stable state (GNSS) if ( y − x) F( y) ≤ 0 for all y ∈ X. Similarly, we call x a globally evolutionarily stable state (GESS) if ( y − x) F( y) < 0 for all y ∈ X − {x}. We let GNSS(F) and GESS(F) denote the sets of globally neutrally stable strategies and globally evolutionarily stable strategies, respectively. To see the reason for our nomenclature, note that the inequalities used to define GNSS and GESS are the same ones used to define NSS and ESS in symmetric normal form games (Example 2.3.4), but that they are now required to hold not just at those states y that are optimal against x, but at all y ∈ X. 
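Since (y − x)′F(x) is linear in y, it is maximized over X at a vertex, so Proposition 2.3.13's test for Nash equilibrium reduces to comparing the best pure-strategy payoff with the population average. The following sketch in Python/numpy makes this concrete for single-population random matching in standard Rock-Paper-Scissors (the helper name has_strict_invader is ad hoc):

import numpy as np

A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])             # standard RPS
F = lambda x: A @ x

def has_strict_invader(x, tol=1e-12):
    # Some y strictly invades x iff (y - x)'F(x) > 0 for some y in X,
    # which happens iff some pure strategy beats the average payoff x'F(x).
    payoffs = F(x)
    return payoffs.max() > x @ payoffs + tol

print(has_strict_invader(np.array([1/3, 1/3, 1/3])))   # False: this state is the Nash equilibrium
print(has_strict_invader(np.array([0.6, 0.3, 0.1])))   # True: e.g. Paper players can strictly invade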
NSS and ESS also require a state to be a Nash equilibrium, but our new solution concepts implicitly require this as well—see Proposition 2.3.15 below. It is easy to describe both of these concepts in terms of the notion of invasion. Observation 2.3.14. (i) GNSS(F) = y∈X IF ( y), and so is convex. (ii) x ∈ GESS(F) if and only if x ∈ y∈X−{x} IF ( y). In words: a GNSS is a state that can weakly invade every state (or, equivalently, every other state), while a GESS is a state that can strictly invade every other state. Our new solution concepts can also be described in geometric terms. For example, x is a GESS if a small motion from any state y x in the direction F( y) (or ΦF( y)) moves the state closer to x (see Figure 2.3.3). If we allow not only these acute motions, but also orthogonal motions, we obtain the weaker notion of GNSS. We conclude this section by relating our new solution concepts to Nash equilibrium. Proposition 2.3.15. (i) If x ∈ GNSS(F), then x ∈ NE(F). (ii) If x ∈ GESS(F), then NE(F) = {x}. Hence, if a GESS exists, it is unique. Proof. To prove part (i), let x ∈ GNSS(F) and let y x. Define xε = ε y + (1 − ε)x. Since x is a GNSS, (x − xε ) F(xε ) ≥ 0 for all ε ∈ (0, 1]. Simplifying and dividing by ε yields (x − y) F(xε ) ≥ 0 for all ε ∈ (0, 1], so taking ε to zero yields ( y − x) F(x) ≤ 0. In other words, x ∈ NE(F). 73 y ΦF(y) x y � ΦF(y) � ~ ΦF(y) ̃ y Figure 2.3.3: The geometric definition of GESS. ΦF(xε) y ΦF(x) xε x Figure 2.3.4: Why every GNSS is a Nash equilibrium. 74 To prove part (ii), it is enough to show that if x is a GESS, then no y x ∈ GESS(F), then x ∈ IF ( y); since IF ( y) is nonempty, y NE(F). x is Nash. But if Evidently, this proposition implies that every GNSS is an NSS, and that every GESS is an ESS. The proof that every GNSS is Nash is easy to explain in pictures. In Figure 2.3.4, we draw the GNSS x and an arbitrary state y, and place the state xε on the segment between y and x. Since x is a GNSS, the angle between F(xε ) and x − xε , and hence between ΦF(xε ) and x − xε , is weakly acute. Taking ε to zero, it is apparent that the angle between ΦF(x) and y − x, and hence between y − x and ΦF(x), must be weakly obtuse. Since y was arbitrary, x is a Nash equilibrium. 2.3.5 Nash Equilibrium and Global Neutral Stability in Stable Games Proposition 2.3.15 tells us that every GNSS of an arbitrary game F is a Nash equilibrium. Theorem 2.3.16 shows that much more can be said if F is stable: in these case, the sets of globally neutrally stable states and Nash equilibria coincide. Together, this fact and Observation 2.3.14 imply that the Nash equilibria of any stable game form a convex set. In fact, if we can replace certain of the weak inequalities that define stable games with strict ones, then the Nash equilibrium is actually unique. Theorem 2.3.16. (i) If F is a stable game, then NE(F) = GNSS(F), and so is convex. (ii) If in addition F is strictly stable at some x ∈ NE(F) (that is, if ( y − x) (F( y) − F(x)) < 0 for all y x), then NE(F) = GESS(F) = {x}. Proof. Suppose that F is stable, and let x ∈ NE(F). To establish part (i), it is enough to show that x ∈ GNSS(F). So fix an arbitrary y x. Since F is stable, (2.19) ( y − x) (F( y) − F(x)) ≤ 0. And since x ∈ NE(F), ( y − x) F(x) ≤ 0. Adding these inequalities yields (2.20) ( y − x) F( y) ≤ 0, As y was arbitrary, x is a GNSS. Turning to part (ii), suppose that F is strictly stable at x. Then inequality (2.19) holds strictly, so inequality (2.20) holds strictly as well. 
This means that x is a GESS of F, and hence the unique Nash equilibrium of F. 75 R P S Figure 2.3.5: The GESS of good RPS. Example 2.3.17. Rock-Paper-Scissors revisited. Recall from Example 2.3.5 that good RPS is a (strictly) stable game; standard RPS is a zero-sum game, and hence a (weakly) stable 1 game. The unique Nash equilibrium of both of games is x∗ = ( 1 , 3 , 1 ). In Figure 2.3.5, for 3 3 a selection of states x, we draw the projected payoff vectors ΦF(x) generated by good RPS (with w = 3 and l = 1), as well as the vector from x to x∗ . For each x, the angle between this pair of vectors is acute, reflecting the fact that the Nash equilibrium x∗ is a GESS. In Figure 2.3.6, we perform the same exercise for standard RPS. In this case, the vectors ΦF(x) and x∗ − x always form a right angle, so x∗ is a GNSS but not a GESS. § Exercise 2.3.18. Let F be a stable game. Show that if x∗ is a Nash equilibrium of F such that DF(x∗ ) is negative definite with respect to TX × TX, then x∗ is a GESS, and hence the unique Nash equilibrium of F. Exercise 2.3.19. Pseudostable games. We call the population game F pseudostable if for all x, y ∈ X, ( y − x) F(x) ≤ 0 implies that (x − y) F( y) ≥ 0. In other words, if y cannot strictly invade x, then x can weakly invade y. (i) Show that every stable game is pseudostable. (ii) Show that if F is pseudostable, then NE(F) = GNSS(F), and so is convex. 76 R P S Figure 2.3.6: The GNSS of standard RPS. (A smooth real-valued function f is pseudoconcave if its gradient f is pseudostable. Given facts (i) and (ii) above and the discussion in Section 2.1.5, it should be no surprise that many results from concave programming (e.g., the convexity of the set of maximizers) remain true when the objective function is only pseudoconcave.) In addition to its role in establishing that the set of Nash equilibria of a stable game is convex, the concept of global neutral stability enables us to carry out an important theoretical exercise: that of devising an elementary proof of existence of Nash equilibrium in stable games—in other words, one that does not rely on an appeal to a fixed point theorem. The heart of the proof, Proposition 2.3.20, is a finite analogue of the result we seek. Proposition 2.3.20. Let F be a stable game, and let Y be a finite subset of X. Then there exists a state x∗ ∈ conv(Y) such that ( y − x∗ ) F( y) ≤ 0 for all y ∈ Y. In words: if F is a stable game, then given any finite set of states Y, we can always find a state in the convex hull of Y that can weakly invade every element of Y. The proof of this result uses the Minmax Theorem. Proof. Suppose that Y has m elements. Define a two player zero-sum game U = 77 (U1 , U2 ) = (Z, −Z) with n1 = n2 = m as follows: Zxy = (x − y) F( y). In this game, player 2 chooses a “status quo” state y ∈ Y, player 1 chooses an “invader” x ∈ Y, and the payoff Zxy is the invader’s “relative payoff ” in F. Split Z into its symmetric and skew-symmetric parts: ZS = 1 (Z + Z ) and ZSS = 1 (Z − Z ). 2 2 Since F is stable, equation (2.19) from the previous proof shows that ZS = xy (x − y) F( y) + ( y − x) F(x) = 1 (x − y) (F( y) − F(x)) ≥ 0 2 1 2 for all x, y ∈ Y. The Minmax Theorem tells us that in any zero sum game, player 1 has a strategy that guarantees him the value of the game. In the skew-symmetric game USS = (ZSS , −ZSS ) = (ZSS , (ZSS ) ), the player roles are interchangeable, so the game’s value must be zero. Since Z = ZSS + ZS and ZS ≥ 0, the value of U = (Z, −Z) must be at least zero. 
In other words, if λ ∈ Rm is a maxmin strategy for player 1, then λx Zxy µ y ≥ 0 x∈Y y∈Y for all mixed strategies µ of player 2. If we let x∗ = λx x ∈ conv(Y) x∈Y and fix an arbitrary pure strategy y ∈ Y for player 2, we find that λx (x − y) F( y) = (x∗ − y) F( y). λx Zxy = 0≤ x∈Y x∈Y With this result in hand, existence of Nash equilibrium in stable games follows from a simple compactness argument. Theorem 2.3.16 and Observation 2.3.14 tell us that NE(F) = GNSS(F) = {x ∈ X : ( y − x) F( y) ≤ 0}. y∈X Proposition 2.3.20 shows that if we take the intersection above over an arbitrary finite set 78 Y ⊂ X instead of over X itself, then the intersection is nonempty. Since X is compact, the finite intersection property allows us to conclude that GNSS(F) is nonempty itself. Exercise 2.3.21. In Exercise 2.1.14, we defined population games with entry and exit. If F : Rn → R is C1 and defines such a game, what condition on the derivative matrices + DF(x) is the appropriate definition of stable games for this context? Argue that all of the results in this section continue to hold when entry and exit are permitted. 2.4 Supermodular Games Of the classes of games we study in this chapter, supermodular games, a class that includes models of coordination, search, and Bertrand competition, are the most familiar to economists. By definition, supermodularity requires that higher choices by one’s opponents make one’s own higher strategies look relatively more desirable. This complementarity condition imposes a monotone structure on the agents’ best response correspondences, which in turn imposes structure on the set of Nash equilibria. 2.4.1 Definition Each strategy set Sp = {1, . . . , np } is naturally endowed with a linear order. To define supermodular games, we introduce a corresponding partial order on the set of population states Xp (and, implicitly, on the set of mixed strategies for population p). Define the matrix p p Σ ∈ R(n −1)×n by 0 1 · · · 1 . . . . .. . . Σ= . . . . . 0 · · · 0 1 Then np p (Σx )i = p xj j=i+1 equals the total mass on strategies greater than i at population state xp . If we view xp as a discrete density function on Sp with total mass mp , then Σxp defines the corresponding “decumulative distribution function” for xp . In particular, Σ yp ≥ Σxp if and only if yp stochastically dominates xp . 79 We extend this partial order to all of X using the matrix Σ ∈ R(n−p )×n , which we define as the block diagonal matrix Σ = diag(Σ, . . . , Σ). Note that Σ y ≥ Σx if and only if yp stochastically dominates xp for all p ∈ P . With these preliminaries in hand, we are ready to define our class of games. Definition. We call the population game F : X → Rn a supermodular game if it exhibits strategic complementarities: (2.21) p p p p If Σ y ≥ Σx, then Fi+1 ( y) − Fi ( y) ≥ Fi+1 (x) − Fi (x) for all i < np , p ∈ P , x ∈ X. In words: if y stochastically dominates x, then for any strategy i < np , the payoff advantage of i + 1 over i is greater at y than at x. By introducing a bit more notation, we can express condition (2.21) in a more concise p p ˜ ˜ way. Define the matrices Σ ∈ Rn ×(n −1) and Σ ∈ Rn×(n−p ) by −1 0 1 −1 ˜ Σ=0 1 . . . . .. ... 0 ... ˜ ˜ ˜ 0 and Σ = diag(Σ, . . . , Σ). ... −1 0 1 ··· ... 0 . . . Observation 2.4.1. F is a supermodular game if and only if the following condition holds: (2.22) ˜ ˜ Σ y ≥ Σx implies that Σ F( y) ≥ Σ F(x). As with potential games and stable games, we can characterize smooth supermodular games in terms of conditions on the derivatives DF(x). Theorem 2.4.2. 
Suppose the population game F is C1 . Then F is supermodular if and only if either of the following equivalent conditions holds. p (2.23) (2.24) p q q ∂(Fi+1 − Fi ) ∂(e j+1 − e j ) (x) ≥ 0 for all i < np , j < nq , p, q ∈ P , and x ∈ X. ˜ ˜ Σ DF(x)Σ ≥ 0 for all x ∈ X. Condition (2.23) is the most transparent of the four conditions. It requires that if some players in population q switch from strategy j to strategy j + 1, the performance of strategy i + 1 ∈ Sp improves relative to that of strategy i. On the other hand, condition 80 (2.24) provides the most concise characterization of supermodular games. Moreover, ˜ ˜ since the range of Σ is TX (i.e., since each column of Σ lies in TX), condition (2.24) is a restriction of the action of DF(x) on TX × TX—just like our earlier conditions (2.10) and (2.15) characterizing potential games and stable games. Proof. The equivalence of conditions (2.23) and (2.24) is easily verified. Given Observation 2.4.1, it is enough to show that (2.21) implies (2.23) and that (2.24) implies (2.22). So suppose condition (2.21) holds, and fix x ∈ X; since F is C1 it is enough to consider q q x in the interior of X. Let yε = x + ε(e j+1 − e j ), which lies in X whenever |ε| is sufficiently small, and which satisfies Σ yε ≥ Σx. By the definition of DF(x), we have that p p Fi+1 ( yε ) − p Fi ( yε ) = p Fi+1 (x) − p Fi (x) +ε p q q ∂(Fi+1 − Fi ) ∂(e j+1 − e j ) (x) + o yε − x . Thus, condition (2.21) implies that p ε p q q ∂(Fi+1 − Fi ) ∂(e j+1 − e j ) (x) + o(|ε|) ≥ 0, which implies (2.23). We now show that (2.24) implies (2.22). We consider only the single population case, leaving the general case as an exercise. The idea behind the proof is simple. If state y stochastically dominates state x, then we can transit from state x to state y by shifting mass from strategy 1 to strategy 2, from strategy 2 to strategy 3, ... , and finally from strategy n − 1 to strategy n. Condition (2.23) ≡ (2.24) says that each such shift improves the payoff of each strategy k + 1 relative to that of strategy k. Since transiting from x to y means executing all of the shifts, this transition too must improve the performance of k + 1 relative to k, which is exactly what condition (2.21) ≡ (2.22) requires. Our matrix notation makes it possible to formalize this argument in a streamlined way. ˜ Recall the definitions of Σ ∈ Rn×(n−1) and Σ ∈ R(n−1)×n , and define Ω ∈ Rn×n as follows: −1 0 1 −1 ˜ =0 Σ 1 . . . . .. .. . 0 0 1 ... , Σ = 0 0 0 . . . .. . −1 0 ··· 0 1 ··· ... 0 . . . Then it is easy to verify this next observation. 81 1 · · · · · · 1 0 . . 1 . , and Ω = 0 ... ... . . . . . . ··· 0 1 0 1 · · · · · · 1 0 · · · · · · 0 . .. . . 0 . . .. . . . . . . 0 ··· ··· 0 ˜ Observation 2.4.3. ΣΣ = I − Ω ∈ Rn×n . In words, Observation 2.4.3 says that the stochastic dominance operator Σ is “inverted” by ˜ the difference operator Σ, except for a remainder Ω that is a null operator on TX (i.e., that ˜ satisfies Ωz = 0 for all z ∈ TX). (For completeness, we also note that ΣΣ = I ∈ R(n−1)×(n−1) .) Now suppose that Σx ≤ Σ y, and let α(t) = ty + (1 − t)x, so that α(0) = x, α(1) = y, and α (t) = y − x ∈ TX. Then using the Fundamental Theorem of Calculus, Observation 2.4.3, condition (2.24), and the fact that Σ( y − x) ≥ 0, we find that 1 ˜ ˜ Σ (F( y) − F(x)) = Σ DF(α(t)) ( y − x) dt 0 1 = ˜ ˜ Σ DF(α(t)) (ΣΣ + Ω) ( y − x) dt 0 1 = ˜ ˜ Σ DF(α(t))Σ Σ( y − x) dt 0 ≥ 0. 2.4.2 Examples Exercise 2.4.4. Random matching in supermodular normal form games. The normal form game U = (U1 , . . . 
, Up ) is supermodular if the difference Up (sp + 1, sq , s−{p,q} ) − Up (sp , sq , s−{p,q} ) is nondecreasing in sq for all sp < np , s−{p,q} ∈ r {p,q} Sr and distinct p, q ∈ P . Show that random matching of p populations to play U generates a supermodular game. Exercise 2.4.5. Which symmetric normal form games generate supermodular population games? Example 2.4.6. Bertrand oligopoly with differentiated products. A population of firms produce output at zero marginal cost and compete in prices S = {1, . . . , n}. Suppose that the demand faced by a firm increases when competitors raise their prices, and that this effect does not diminish when the firm itself charges higher prices. More precisely, let qi (x), the demand faced by a firm that charges price i when the price distribution is x, satisfy ∂(qk+1 − qk ) ∂qi (x) ≥ 0 and (x) ≥ 0 for all i ≤ n and all j, k < n. ∂(e j+1 − e j ) ∂(e j+1 − e j ) The payoff to a firm that charges price i is Fi (x) = i qi (x), and so ∂qi+1 ∂qi ∂(Fi+1 − Fi ) (x) = (i + 1) (x) − i (x) ∂(e j+1 − e j ) ∂(e j+1 − e j ) ∂(e j+1 − e j ) 82 =i ∂(qi+1 − qi ) ∂qi+1 (x) + (x) ≥ 0. ∂(e j+1 − e j ) ∂(e j+1 − e j ) Therefore, F is a supermodular game. § Example 2.4.7. Search with positive externalities. A population of agents choose levels of search effort in S = {1, . . . , n}. The payoff to choosing effort i is Fi (x) = m(i) b(a(x)) − c(i), where a(x) = k≤n kxk is the aggregate search effort, b is some increasing benefit function, m is an increasing multiplier function, and c is an arbitrary cost function. Notice that the benefits from searching are increasing in both own search effort and in the aggregate search effort. Since ∂(Fi+1 − Fi ) (x) = m(i + 1) b (a(x)) ( j + 1) − j − m(i) b (a(x)) ( j + 1) − j ∂(e j+1 − e j ) = (m(i + 1) − m(i)) b (a(x)) ≥ 0, F is a supermodular game. § Example 2.4.8. Relative consumption effects/Arms races. Agents from a single population choose consumption levels (or armament levels) in S = {1, . . . , n}. Payoffs take the form Fi (x) = r(i − a(x)) + u(i) − c(i). Here, r is a concave function of the difference between the agent’s consumption level and the average consumption level in the population, while u and c are arbitrary functions of the consumption level. (One would typically assume that r is increasing, but this property is not needed for supermodularity.) Since ∂(Fi+1 − Fi ) (x) = r ((i + 1) − a(x)) −( j + 1) + j − r (i − a(x)) −( j + 1) + j ∂(e j+1 − e j ) = r (i − a(x)) − r ((i + 1) − a(x)) ≥ 0, F is a supermodular game. § Exercise 2.4.9. Characterize the C1 two-strategy supermodular games using a derivative condition. Compare them with the C1 two-strategy stable games (Exercise 2.3.2(i)). Are all C1 two-strategy games in one class or the other? 83 2.4.3 Best Response Monotonicity in Supermodular Games Recall the definition of the pure best response correspondence for population p: p bp (x) = argmax Fi (x). i∈Sp Theorem 2.4.10 establishes a fundamental property of supermodular games: their pure best response correspondences are increasing. Theorem 2.4.10. Let F be a supermodular game with pure best response correspondences bp . If Σx ≤ Σ y, then min bp (x) ≤ min bp ( y) and max bp (x) ≤ max bp ( y) for all p ∈ P . This property is intuitively obvious: when opponents choose higher strategies, an agent’s own higher strategies look relatively better, so his best strategies must be (weakly) higher as well. Proof. We consider the case in which p = 1, focusing on the first inequality; we leave the remaining cases as exercises. 
Let Σx ≤ Σ y and i < j. Then condition (2.21) implies that j −1 F j ( y) − Fi ( y) − F j (x) − Fi (x) = Fk+1 ( y) − Fk ( y) − Fk+1 (x) − Fk (x) ≥ 0. k =i Thus, if j = min b(x) > i, then F j ( y) − Fi ( y) ≥ F j (x) − Fi (x) > 0, so i is not a best response to y. As i < min b(x) was arbitrary, we conclude that min b(x) ≤ min b( y). To state a version of Theorem 2.4.10 for mixed best responses, we need some additional p p p notation. Let vi ∈ Rn denote the ith vertex of the simplex ∆p : that is, (vi ) j equals 1 if j = i p p p p and equals 0 otherwise. (To summarize our notation to date: xi ∈ R, vi ∈ Rn , and ei ∈ Rn . p Of course, the notation vi is unnecessary in the single population case.) We can describe population p’s mixed best response correspondence in the following equivalent ways: p Bp (x) = xp ∈ ∆p : xi > 0 ⇒ i ∈ bp (x) p = conv vi : i ∈ bp (x) , We can also define the minimal and maximal elements of Bp (x) as follows: p p Bp (x) = vmin bp (x) and Bp (x) = vmax bp (x) . 84 To extend this notation to the multipopulation environment, define B(x) = (B1 (x), . . . , Bp (x)) and B(x) = (B1 (x), . . . , Bp (x)). Then the following corollary follows immediately from Theorem 2.4.10. Corollary 2.4.11. If F is supermodular and Σx ≤ Σ y, then ΣB(x) ≤ ΣB( y) and ΣB(x) ≤ ΣB( y). 2.4.4 Nash Equilibria of Supermodular Games We now use the monotonicity of the best response correspondence to show that every supermodular game has a minimal and a maximal Nash equilibrium. The derivation of this result includes a finite iterative method for computing the minimal and maximal equilibria, and so provides a simple proof of the existence of equilibrium. We focus attention on the case where each population has mass one, so that each set of population p states Xp is just the simplex in Rn ; the extension to the general case is a simple but notationally cumbersome exercise. p p Let x and x be the minimal and maximal states in X : xp = v1 and xp = vnp for all p ∈ P . Recall that Xv denotes the set of vertices of X, and let n∗ = #Xv = p∈P np . Finally, for states y, z ∈ X, define the interval [ y, z] ⊆ X by [ y, z] = {x ∈ X : Σ y ≤ Σx ≤ Σz}. Theorem 2.4.12. Suppose F is a supermodular game. Then (i) The sequences {Bk (x)}k≥0 and {Bk (x)}k≥0 are monotone sequences in Xv , and so converge within n∗ steps to their limits, x∗ and x∗ . (ii) x∗ = B(x∗ ) and x∗ = B(x∗ ), so x∗ and x∗ are pure Nash equilibria of F. (iii) NE(F) ⊆ [x∗ , x∗ ]. Thus, if x∗ = x∗ , then this state is the Nash equilibrium of F. In short, iterating B and B from the minimal and maximal states in X yields Nash equilibria of F, and all other Nash equilibria of F lie between the two so obtained. Proof. Part (i) follows immediately from Corollary 2.4.11. To prove part (ii), note that ∗ ∗ ∗ since x∗ = Bn (x) and Bn +1 (x) = Bn (x) by part (i), it follows that ∗ ∗ ∗ B(x∗ ) = B(Bn (x)) = Bn +1 (x) = Bn (x) = x∗ . An analogous argument shows that B(x∗ ) = x∗ . We finish with the proof of part (iii). If Y ⊆ X and min Y and max Y exist, then the monotonicity of B implies that B(Y) ⊆ [B(min Y), B(max Y)]. Iteratively applying B to the ∗ ∗ ∗ set X therefore yields Bn (X) ⊆ [Bn (x), Bn (x)] = [x∗ , x∗ ]. Also, if x ∈ NE(F), then x ∈ B(x), 85 and so Bk−1 (x) ⊆ Bk−1 (B(x)) = Bk (x), implying that x ∈ Bk (x) for all k ≥ 1. We therefore ∗ ∗ conclude that x ∈ Bn (x) ⊆ Bn (X) ⊆ [x∗ , x∗ ]. Appendix 2.A Multivariate Calculus 2.A.1 Univariate Calculus Before discussing multivariate calculus we review some ideas from univariate calculus. 
A function f from the real line to itself is differentiable at the point x if f (x) = lim y→x f ( y) − f (x) y−x exists; this limit is called the derivative of f at x. Three useful facts about derivatives are ( f g) (x) = f (x) g (x) + g(x) f (x); ( g ◦ f ) (x) = g ( f (x)) f (x); y f ( y) − f (x) = x f (z) dz. The Product Rule: The Chain Rule: The Fundamental Theorem of Calculus: The definition of f (x) above is equivalent to the requirement that (2.25) f ( y) = f (x) + f (x)( y − x) + o( y − x), where o(z) represents a remainder function r : R → R satisfying lim z→0 r(z) = 0. z (In words: r(z) approaches zero faster than z approaches zero.) In the approximation (2.25), f (x) acts as a linear map from R to itself; it sends the displacement of the input, y − x, to the displacement of the output, f (x)( y − x). 2.A.2 The Derivative as a Linear Map Let L(Rn , Rm ) denote the space of linear maps from Rn to Rm : ˆ ˆ L(Rn , Rm ) = {λ : Rn → Rm | λ(az + bz) = aλ(z) + bλ(ˆ ) for all a, b ∈ R and z, z ∈ Rn }. z 86 Each matrix A ∈ Rm×n defines a linear map in L(Rn , Rm ) via λ(z) = Az, and such a matrix can be found for every map λ in L(Rn , Rm ) (see Appendix 2.B.1). It is common to identify a linear map with its matrix representation. But it is important to be aware of the distinction between these two objects: if we replace the domain Rn with a proper subspace of Rn , matrix representations of linear maps are no longer unique—see Appendix 2.B. Let F be a function from Rn to Rm . (Actually, we can replace the domain Rn with any open set in Rn , or even with a closed set in Rn , as discussed in Appendix 2.A.7.) We say that F is differentiable at x if there is a linear map DF(x) ∈ L(Rn , Rm ) satisfying (2.26) F( y) = F(x) + DF(x)( y − x) + o( y − x) Here, o(z) represents a remainder function r : Rn → Rm that satisfies lim z→0 r(z) = 0. |z| If the function DF : Rn → L(Rn , Rm ) is continuous, we say that F is continuously differentiable or of class C1 . When we view DF(x) as a matrix in Rm×n , we call it the Jacobian matrix or derivative matrix of F at x. To express this matrix explicitly, define the partial derivatives of F at x by Fi ( y j , x− j ) − Fi (x) ∂Fi . (x) = lim y j →x j yj − xj ∂x j Then the derivative matrix DF(x) can be expressed as ∂F1 ∂x1 (x) · · · . . . DF(x) = . . . ∂Fm (x) · · · ∂x1 ∂F1 (x) ∂xn . . . . ∂Fm (x) ∂xn If f is a function from Rn to R (i.e., if m = 1), then its derivative at x can be represented by a vector. We call this vector the gradient of f at x, and define it by ∂f ∂x1 (x) . . . f (x) = . ∂f (x) ∂xn Our notations for derivatives are related by D f (x) = 87 f (x) , where the prime represents transposition, and also by F1 (x) . . . DF(x) = . F (x) m Suppose we are interested in how quickly the value of f changes as we move from the point x ∈ Rn in the direction z ∈ Rn − {0}. This rate is described by the directional derivative of f at x in direction z, defined by (2.27) ∂f f (x + εz) − f (x) (x) = lim . ε→0 ε ∂z It is easy to verify that ∂f (x) = ∂z f (x) z. More generally, the rate of change of the vector-valued function F at x in direction z can be expressed as DF(x)z. It is worth noting that a function can admit directional derivatives at x in every direction z 0 without being differentiable at x (i.e., without satisfying definition (2.26)). Amazingly, such a function need not even be continuous at x, as the following example shows. Example 2.A.1. Define the function f : R2 → R by x1 (x2 )2 f (x1 , x2 ) = (x1 )2 + (x2 )4 0 if x1 0, if x1 = 0. 
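A quick numerical look at this function, sketched below in Python (purely illustrative; the analysis follows), already hints at both phenomena: difference quotients at the origin settle down in every direction, yet the function takes a fixed nonzero value along a curve running into the origin.

def f(x1, x2):
    # The function from Example 2.A.1.
    return x1 * x2 ** 2 / (x1 ** 2 + x2 ** 4) if x1 != 0 else 0.0

# Difference quotients along the fixed direction z = (1, 2) converge,
# consistent with the directional derivative at the origin existing.
for eps in (1e-1, 1e-3, 1e-5):
    print((f(eps * 1.0, eps * 2.0) - f(0.0, 0.0)) / eps)

# Yet along the curve x1 = (x2)^2 the function is constant,
# so it cannot be continuous at the origin, where its value is 0.
for t in (1e-1, 1e-3, 1e-5):
    print(f(t ** 2, t))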
Using definition (2.27), it is easy to verify that the directional derivatives of f at the origin in every direction z 0 exist: (z2 )2 ∂f (0) = z1 ∂z 0 But while f (0) = 0, f (x) = at 0. if z1 0, if z1 = 0. 1 2 at all other x that satisfy x1 = (x2 )2 , and so f is discontinuous 88 On the other hand, if all (or even all but one) of the partial derivatives f exist and are continuous in a neighborhood of x, then f is differentiable at x. 2.A.3 Differentiation as a Linear Operation We can view differentiation as an operation that takes functions as inputs and returns functions as outputs. From this point of view, differentiation is a linear operation between spaces of functions. As an example, suppose that f and g are functions from R to itself, and that a and b are real numbers. Then the scalar product a f is a function from R to itself, as is the linear combination a f + bg. (In other words, the set of functions from R to itself is a vector space.) The fact that differentiation is linear means that the derivative of the linear combination, (a f + bg) , is equal to the linear combination of the derivatives, a f + bg . We can express this idea in a multivariate setting using a simple formula. Suppose that F : Rn → Rm is a differentiable function and that A is a matrix in Rl×m . Then AF is the function from Rn to Rl defined by (AF)k (x) = m 1 Ak j F j (x) for k ∈ {1, . . . , l}. Linearity of j= differentiation says that D(AF) = A(DF), or, more explicitly, that Linearity of differentiation: D(AF)(x) = A(DF)(x) for all x ∈ Rn . Put differently, the differential operator D and the linear map A commute. 2.A.4 The Product Rule and the Chain Rule Suppose f and g are differentiable functions from R to itself. Then the product rule tells us that ( f g) (x) = f (x) g (x) + g(x) f (x). In other words, to find the effect of changing x on the value ( f g)(x) of the product function, first find the effect of changing x on g(x), and scale this effect by f (x); then, find the effect of changing x on f (x), and scale this effect by g(x); and finally, add the two terms. This same idea can be applied in multidimensional cases as well. Let F : Rn → Rm and G : Rn → Rm be differentiable vector-valued functions. Then F G : Rn → R, defined by (F G)(x) = F(x) G(x), is a scalar-valued function. The derivative D(F G)(x) ∈ R1×n of our new function is described by the following product rule: Product Rule 1: D(F G)(x) = ( (F G)(x)) = F(x) DG(x) + G(x) DF(x). (Notice that in the previous paragraph, a prime ( ) denoted the derivative of a scalarvalued function, while here it denotes matrix transposition. So long as we keep these scalar and matrix usages separate, no confusion should arise.) 89 If a : Rn → R is a differentiable scalar-valued function, then aF : Rn → Rm , defined by (aF)(x) = a(x)F(x), is a vector-valued function. Its derivative D(aF)(x) ∈ Rm×n is described our next product rule: Product Rule 2: D(aF)(x) = a(x)DF(x) + F(x) a(x) = a(x)DF(x) + F(x)Da(x). Finally, we can create a vector-valued function from F : Rn → Rm and G : Rn → Rm by introducing the componentwise product F • G : Rn → Rm . This function is defined by (F•G)i (x) = Fi (x)Gi (x), or, in matrix notation, by (F•G)(x) = diag(F(x))G(x) = diag(G(x))F(x), where diag(v) denotes the diagonal matrix whose diagonal entries are the components of the vector v. The derivative of the componentwise product, D(F • G)(x) ∈ Rm×n , is described by our last product rule: Product Rule 3: D(F • G)(x) = diag(F(x))DG(x) + diag(G(x))DF(x). 
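Each of the three rules can also be checked numerically. A minimal sketch in Python/numpy (the maps F and G and the finite-difference helper jac are arbitrary choices for illustration):

import numpy as np

rng = np.random.default_rng(2)
B, C = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
F = lambda x: np.sin(B @ x)                  # two smooth maps from R^4 to R^3
G = lambda x: np.cos(C @ x)

def jac(H, x, eps=1e-6):
    # Numerical Jacobian of H at x: one column of partial derivatives per coordinate.
    return np.column_stack([(H(x + eps * e) - H(x - eps * e)) / (2 * eps)
                            for e in np.eye(len(x))])

x = rng.normal(size=4)

# Product Rule 1: D(F'G)(x) = F(x)'DG(x) + G(x)'DF(x).
lhs1 = jac(lambda y: np.array([F(y) @ G(y)]), x)[0]
print(np.allclose(lhs1, F(x) @ jac(G, x) + G(x) @ jac(F, x), atol=1e-5))

# Product Rule 3: D(F • G)(x) = diag(F(x)) DG(x) + diag(G(x)) DF(x).
lhs3 = jac(lambda y: F(y) * G(y), x)
print(np.allclose(lhs3, np.diag(F(x)) @ jac(G, x) + np.diag(G(x)) @ jac(F, x), atol=1e-5))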
One can verify each of the formulas above by expanding them and then applying the univariate product rule term by term. To remember the product rules, bear in mind that the end result must be a sum of two terms of the same dimensions, and that each of the terms must end with a derivative, so as to operate on a displacement vector z ∈ Rn to be placed on the right hand side. In the one dimensional setting, the chain rule tells us that ( g ◦ f ) (x) = g ( f (x)) f (x). In words, the formula says that we can decompose the effect of changing x on ( g ◦ f )(x) into two pieces: the effect of changing x on the value of f (x), and the effect of this change in f (x) on the value of g( f (x)). This same idea carries through to multivariate functions. Let F : Rn → Rm and G : Rm → Rl be differentiable, and let G ◦ F : Rn → Rl be their composition. The chain rule says that the derivative of this composition at x ∈ Rn , D(G ◦ F)(x) ∈ Rl×n , is obtained as the product of the derivative matrices DG(F(x)) ∈ Rl×m and DF(x) ∈ Rm×n . The Chain Rule: D(G ◦ F)(x) = DG(F(x)) DF(x). This equation can be stated more explicitly as ∂(G ◦ F)k (x) = ∂x i m j =1 ∂F j ∂Gk (F(x)) (x). ∂y j ∂x i The chain rule can be viewed as a generalization of the earlier formula on linearity of differentiation, with the linear map A replaced by the nonlinear function G. 90 2.A.5 Homogeneity and Euler’s Theorem Let f be a differentiable function from Rn to R. (We can replace the domain Rn with an open (or even a closed) convex cone: a convex set which, if it contains x ∈ Rn , also contains tx for all t > 0.) We say that f is homogeneous of degree k if (2.28) f (tx) = tk f (x) for all x ∈ Rn and t > 0. By definition, homogeneous functions are monomials along each ray from the origin. Indeed, when n = 1 the homogeneous functions are precisely the monomials: if x ∈ R, g(tx) = tk g(x), and g(1) = a, then g(x) = axk . But when n > 1 more complicated homogeneous functions can be found. Nevertheless, the basic properties of homogeneous functions are generalizations of properties of monomials. If we take the derivative of each side of equation (2.28) with respect to xi , applying the chain rule on the left hand side, we obtain f (tx) (tei ) = tk ∂f (x). ∂xi Dividing both sides of this equation by t and simplifying yields ∂f ∂f (tx) = tk−1 (x). ∂xi ∂xi In other words, the partial derivatives of a homogeneous function of degree k are themselves homogeneous of degree k − 1. If we instead take the derivative of each side of (2.28) with respect to t, again using the chain rule on the left hand side, we obtain 0 if k = 0, f (tx) x = k −1 kt f (x) otherwise. Setting t = 1 yields Euler’s Theorem: if f is homogeneous of degree k, then f (x) x = k f (x) for all x ∈ Rn . In fact, the converse of Euler’s Theorem is also true: one can show that if f satisfies the previous identity, it is homogeneous of degree k. 91 2.A.6 Higher Order Derivatives As we have seen, the derivative of a function F : Rn → Rm is a new function (2.29) DF : Rn → L(Rn , Rm ). For each x ∈ Rn , DF(x) describes how the value of F in Rm changes as we move away from x in any direction z ∈ Rn . Notice that in expression (2.29), the point x around which we evaluate the function F inhabits the first Rn , while the displacement vector z inhabits the second Rn . The second derivative of F at x, D2 F(x) = D(DF(x)), describes how the value of the first ˆ derivative DF(x) ∈ L(Rn , Rm ) changes as we move away from x in direction z ∈ Rn . 
Thus, D2 F(x) is an element of the set of maps L(Rn , L(Rn , Rm )), which we denote by L2 (Rn , Rm ). Elements of L2 (Rn , Rm ) are called bilinear maps from Rn × Rn to Rm : they take two vectors in Rn as inputs, are linear in each of these vectors, and return elements of Rm as outputs. If F is twice continuously differentiable (i.e., if DF and D2 F are both continuous in x), ˆ then it can be shown that D2 F(x) is symmetric, in the sense that D2 F(x)(z, z) = D2 F(x)(ˆ , z) for z n 2 2 n m ˆ all z, z ∈ R . We therefore say that D F(x) is an element of Ls (R , R ), the set of symmetric bilinear maps from Rn × Rn to Rm . More generally, the kth derivative of F is a map Dk F : Rn → Lk (Rn , Rm ). For each x ∈ Rn , s Dk F(x) is a symmetric multilinear map; it takes k displacement vectors in Rn as inputs, is linear in each, and returns an output in Rm ; this output does not depend on the order of the inputs. If F has continuous derivatives of orders zero through K, we say that it is in class CK . We can use higher order derivatives to write the Kth order version of Taylor’s Formula, which provides a polynomial approximation of a CK function F around the point x. K Taylor’s Formula: F( y) = F(x) + k =1 K 1k D F(x)( y − x, . . . , y − x) + o y − x . k! Here, Dk F(x)( y − x, . . . , y − x) ∈ Rm is the output generated when the multilinear map Dk F(x) ∈ Lk (Rn , Rm ) acts on k copies of the displacement vector ( y − x) ∈ Rn . (To see where s the factorial terms come from, try expressing the coefficients of a Kth order polynomial in terms of the polynomial’s derivatives.) The higher order derivative that occurs most frequently in applications is the second derivative of a scalar valued function f : Rn → R. This second derivative, D2 f , sends each x ∈ Rn to a symmetric bilinear map D2 f (x) ∈ L2 (Rn , R). We can represent this map s using a Hessian matrix 2 f (x) ∈ Rn×n , the elements of which are the second order partial 92 derivatives of f : 2 ∂f (x) · · · (∂x1 )2 . . 2 . . f (x) = . . 2 ∂f (x) · · · ∂xn ∂x1 ∂2 f (x) ∂x1 ∂xn . . . 2 ∂f (x) (∂xn )2 When f is C2 , the symmetry of the map D2 f (x) is reflected in the fact that the Hessian matrix is symmetric: corresponding pairs of mixed partial derivatives are equal. ˆ The value D2 f (x)(z, z) is expressed in terms of the Hessian matrix in this way: ˆ D2 f (x)(z, z) = z 2 f (x)ˆ . z Using the gradient vector and Hessian matrix, we can express the second-order Taylor approximation of a C2 scalar-valued function as follows: f ( y) = f (x) + 2.A.7 f (x) ( y − x) + 1 ( y − x) 2 2 f (x)( y − x) + o y − x 2 . The Whitney Extension Theorem While we have defined our K times continuously differentiable functions to have domain Rn , nothing we have discussed so far would change were our functions only defined on open subsets of Rn . In fact, it is also possible to define CK functions on closed sets X ⊂ Rn . To do so, one requires F : X → Rm to be CK in the original sense on int(X), and to admit “local uniform Taylor expansions” at each x on bd(X). The Whitney Extension Theorem tells us that such functions F can always be extended to CK functions defined on all of Rn . In effect, the Whitney Extension Theorem provides a definition of (K times) continuously differentiability for functions defined on closed sets. 2.A.8 Vector Integration and the Fundamental Theorem of Calculus Let α : R → Rn be a vector-valued function defined on the real line. Integrals of α are computed componentwise: in other words, b b α(t) dt = (2.30) a i αi (t) dt. 
a 93 It is easy to verify that integration, like differentiation, is linear: if A ∈ Rm×n then b b A α(t) dt = A a α(t) dt. a With definition (2.30) in hand, we can state a multivariate version of the Fundamental Theorem of Calculus. Suppose that F : Rn → Rm is a C1 function. Let α : [0, 1] → Rn be a C1 function satisfying α(0) = x and α(1) = y, and call its derivative α : R → Rn . Then we have 1 The Fundamental Theorem of Calculus: F( y) − F(x) = DF(α(t)) α (t) dt. 0 2.A.9 Potential Functions and Integrability When can a continuous vector field F : Rn → Rn be expressed as the gradient of some scalar valued function f ? In other words, when does F = f for some potential function f : Rn → R? One can characterize the vector fields that admit potential functions in terms of their integrals over closed curves: if F : Rn → Rn is continuous, it admits a potential function if and only if 1 F(α(t)) (2.31) 0 d α(t) dt dt = 0 for every piecewise C1 function α : [0, 1] → Rn with α(0) = α(1). If we use C to denote the closed curve through Rn traced by α, then (2.31) can be expressed more concisely as F(x) · dx = 0. C When F is not only continuous, but also C1 , the question of the integrability of F can be answered by examining cross-partial derivatives. Note first that if F admits a C2 potential function f , then the symmetry of the Hessian matrices of f implies that (2.32) ∂F j ∂2 f ∂2 f ∂Fi (x) = (x) = (x) = (x), ∂x j ∂x i ∂x j ∂x j ∂x i ∂xi and hence that the derivative matrix DF(x) is symmetric for all x ∈ Rn . The converse statement is also true, and provides the characterization of integrability we seek: if F is C1 , with DF(x) symmetric for all x ∈ Rn (i.e., whenever the integrability condition (2.32) holds), there is a function f : Rn → R such that f = F. This sufficient condition for 94 integrability remains valid whenever the domain of F is an open (or closed) convex subset of Rn . However, condition (2.32) does not ensure the existence of a potential function for vector fields defined on more general domains. 2.B Affine Calculus The simplex in Rn , which serves as our state space in single population games, is an n − 1 dimensional set. As a consequence, derivatives of functions defined on the simplex can not be computed in the manner described in Appendix 2.A, as partial derivatives of such functions do not exist. To understand differential calculus in this context, and in the more general context of multipopulation games, we must develop the tools of calculus for functions defined on affine spaces. 2.B.1 Linear Forms and the Riesz Representation Theorem Let Z be a subspace of Rn , and let L(Z, R) be the set of linear maps from Z to R. L(Z, R) is also known as the dual space of Z, and elements of L(Z, R), namely, maps λ : Z → R that ˆ satisfy λ(az + bz) = aλ(z) + bλ(ˆ ), are also known as linear forms. z Each vector y ∈ Z defines a linear form λ ∈ L(Z, R) via λ(z) = y z. In fact, the converse statement is also true: every linear form can be uniquely represented in this way. Theorem 2.B.1 (The Riesz Representation Theorem). For each linear form λ ∈ L(Z, R), there is a unique y ∈ Z, the Riesz representation of λ, such that λ(z) = y z for all z ∈ Z. Another way of describing the Riesz representation theorem is to say that Z and L(Z, R) are linearly isomorphic: the map from Z to L(Z, R) described above is linear, one-to-one, and onto. It is crucial to note that when Z is a proper subspace of Rn , the linear form λ can be represented by many vectors in Rn . 
What Theorem 2.B.1 tells us is that λ can be represented by a unique vector in Z itself. Example 2.B.2. Let Z = R2 = {z ∈ R2 : z1 + z2 = 0}, and define the linear form λ ∈ L(Z, R) by 0 λ(z) = z1 − z2 . Then not only 1 3 ˆ y = , but also y = −1 1 ˆ represents λ: if z ∈ Z, then y z = 3z1 + z2 = 3z1 + (−z1 ) = 2z1 = z1 − z2 = y z = λ(z). But since y is an element of Z, it is the Riesz representation of λ. § 95 ˆ In this example, the reason that both y and y can represent λ is that their difference, 2 ˆ y−y= 2 is orthogonal to Z. This suggests a simple way of recovering the Riesz representation of a linear form from an arbitrary vector representation: eliminate the portion orthogonal to Z by applying the orthogonal projection PZ . ˆ Theorem 2.B.3. Let λ ∈ L(Z, R) be a linear form. If y ∈ Rn represents λ, in the sense that ˆ ˆ λ(z) = y z for all z ∈ Z, then y = PZ y is the Riesz representation of λ. 1 Example 2.B.4. Recall that the orthogonal projection onto R2 is Φ = I − 2 11 . Thus, in the 0 ˆ previous example, we can recover y from y in the following way: ˆ y = Φ y = (I − 2.B.2 1 11 2 3 3 2 1 ) = − = .§ 1 1 2 −1 Dual Characterizations of Multiples of Linear Forms Before turning our attention to calculus, we present some results that characterize when two linear forms are scalar multiples of one another. We will use these results when studying imitative dynamics in Chapters 4 and 7; see especially Exercise 4.4.18 and Theorem 7.5.9. If the vectors v ∈ Rn and w ∈ Rn are non-zero multiples of one another, then v and w clearly are orthogonal to the same set of vectors in Rn . Conversely, if {v}⊥ = { y ∈ Rn : v y = 0} equals {w}⊥ , then v and w must be (non-zero) multiples of one another, as they are both normal vectors of the same hyperplane. When are v and w positive multiples of one another? This is the case if and only if the set H (v) = { y ∈ Rn : v y ≥ 0}, the closed half-space consisting of those vectors with which v forms an acute or right angle, is equal to the corresponding set H (w). Clearly, H (v) = H (w) implies that {v}⊥ = {w}⊥ , and so that v = cw; since v ∈ H (v) = H (w), it must be that c > 0. In summary, we have Observation 2.B.5. (i) {x ∈ Rn : v x = 0} = {x ∈ Rn : w x = 0} if and only if v = cw for some c 0. (ii) {x ∈ Rn : v x ≥ 0} = {x ∈ Rn : w x ≥ 0} if and only if v = cw for some c > 0. 96 Proposition 2.B.6 provides analogues of the characterizations above for settings in which one can only compare how v and w act on vectors in some subspace Z ⊆ Rn . Since these comparisons relate v and w as linear forms on Z, Theorem 2.B.3 suggests that the characterizations should be expressed in terms of the orthogonal projections of v and w onto Z. Proposition 2.B.6. (i) {z ∈ Z : v z = 0} = {z ∈ Z : w z = 0} if and only if PZ v = c PZ w for some c 0. (ii) {z ∈ Z : v z ≥ 0} = {z ∈ Z : w z ≥ 0} if and only if PZ v = c PZ w for some c > 0. Proof. The “if” direction of part (i) is immediate. For the “only if” direction, observe that v z = 0 for all z ∈ Z if and only if v PZ x = 0 for all x ∈ Rn . Since the matrix PZ is symmetric, we can rewrite the equality above as (PZ v) x = 0; thus, the conclusion that PZ v = c PZ w with c 0 follows from Observation 2.B.5(i). The proof of part (ii) follows similarly from Observation 2.B.5(ii). To cap this discussion, we note that both parts of Observation 2.B.5 are the simplest cases of more general duality results that link a linear map A ∈ L(Rm , Rn ) ≡ Rn×m with its transpose A ∈ L(Rn , Rm ) ≡ Rm×n . 
Part (i) is essentially the m = 1 case of the Fundamental Theorem of Linear Algebra: (2.33) range(A) = (nullspace(A ))⊥ . In equation (2.33), the set range(A) = {w ∈ Rn : w = Ax for some x ∈ Rm } is the span of the columns of A. The set nullspace(A ) = { y ∈ Rn : A y = 0} consists of the vectors that A maps to the origin; equivalently, it is the set of vectors that are orthogonal to every column of A. Viewed in this light, equation (2.33) says that w is a linear combination of the columns of A if and only if any y that is orthogonal to each column of A is also orthogonal to w. While (2.33) is of basic importance, it is quite easy to derive after taking orthogonal complements: (range(A))⊥ = { y ∈ Rn : y Ax = 0 for all x ∈ Rm } = { y ∈ Rn : y A = 0 } = nullspace(A ). Part (ii) of Observation 2.B.5 is essentially the m = 1 case of Farkas’s Lemma: (2.34) [w = Ax for some x ∈ Rm ] if and only if [[A y ≥ 0 ⇒ w y ≥ 0] for all y ∈ Rn ]. + In words: w is a nonnegative linear combination of the columns of A if and only if any y that forms a weakly acute angle with each column of A also forms a weakly acute angle with 97 w. Despite their analogous interpretations, statement (2.34) is considerably more difficult to prove than statement (2.33)—see the Notes. 2.B.3 Derivatives of Functions on Affine Spaces Before considering calculus on affine spaces, let us briefly review differentiation of scalar-valued functions on Rn . If f is a C1 function from Rn to R, then its derivative at x, denoted D f (x), is an element of L(Rn , R), the set of linear maps from Rn to R. For each x ∈ Rn , the map D f (x) takes vectors z ∈ Rn as inputs and returns scalars D f (x)z ∈ R as outputs. The latter expression appears in the first order Taylor expansion f (x + z) = f (x) + D f (x) z + o(z) for all z ∈ Rn . By the Riesz Representation Theorem, there is a unique vector f (x) ∈ Rn satisfying D f (x) z = f (x) z for all z ∈ Rn . We call f (x) the gradient of f at x. In the present ∂f full-dimensional case, f (x) is the vector of partial derivatives ∂xi (x) of f at x. Now, let A ⊆ Rn be an affine space with tangent space TA, and consider a function f : A → R. (As in Appendix 2.A, the ideas to follow can also be applied to functions whose domain is a set that is open (or closed) relative to A.) We say that f is differentiable at x ∈ A if there is a linear map D f (x) ∈ L(TA, R) satisfying f (x + z) = f (x) + D f (x) z + o(z) for all z ∈ TA. The gradient of f at x is the Riesz representation of D f (x). In other words, it is the unique vector f (x) ∈ TA such that D f (x) z = f (x) z for all z ∈ TA. If the function f : A → TA is continuous, then f is continuously differentiable, or of class C1 . When A = Rn , this definition of the gradient is simply the one presented earlier, and f (x) is the only vector in Rn that represents D f (x). But in lower dimensional cases, there are many vectors in Rn that can represent D f (x). The gradient vector f (x) is the only one lying in TA; all others are obtained by summing f (x) and an element of (TA)⊥ . When A = Rn , the gradient of f at x is just the vector of partial derivatives of f at x. But in other cases, the partial derivatives of f may not even exist. How does one compute f (x) then? Usually, it is easiest to extend the function f to all of Rn in some smooth way, and then to compute the gradient by way of this extension. In some cases (e.g., when f is a polynomial), obtaining the extension is just a matter of declaring that the domain is Rn . 
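As a concrete illustration of this extend-and-project recipe (a minimal numerical sketch, not part of the original text), the code below takes a polynomial f defined on the simplex in R3, reuses its formula as a smooth extension to all of R3, computes the ordinary gradient of the extension, and projects it onto the tangent space Z = {z in R3 : z1 + z2 + z3 = 0} to obtain the gradient of f on the affine space, as formalized in the proposition below. The particular function f and the test point are arbitrary choices.

    import numpy as np

    n = 3
    P_Z = np.eye(n) - np.ones((n, n)) / n   # orthogonal projection onto Z = {z : z1 + z2 + z3 = 0}

    def f_ext(x):
        # a polynomial originally defined on the simplex; declaring its domain to be R^3
        # provides a smooth extension "for free"
        return x[0] * x[1] + 2.0 * x[2] ** 2

    def grad_ext(x):
        # ordinary gradient of the extension (vector of partial derivatives in R^3)
        return np.array([x[1], x[0], 4.0 * x[2]])

    x = np.array([0.5, 0.3, 0.2])            # a point of the simplex
    affine_grad = P_Z @ grad_ext(x)          # project the extension's gradient onto Z

    # sanity checks: the affine gradient lies in Z and represents Df(x) on Z
    z = np.array([1.0, -2.0, 1.0])           # a displacement vector in Z (components sum to zero)
    eps = 1e-6
    directional = (f_ext(x + eps * z) - f_ext(x - eps * z)) / (2 * eps)
    print(affine_grad, affine_grad.sum())    # the components sum to (numerically) zero
    print(directional, affine_grad @ z)      # the two numbers agree up to O(eps^2)

Any other vector representing Df(x), for instance grad_ext(x) itself, differs from affine_grad by an element of the orthogonal complement of Z, here a multiple of the vector of ones.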
But even in this situation, there is an alternative extension that is often handy. Proposition 2.B.7. Let f : A → R be a C1 function on the affine set A, and let Z = TA. 98 (i) Let f˜ : Rn → R be any C1 extension of f . Then (ii) Define f : Rn → R by f (x) = PZ f˜(x) for all x ∈ A. f ( y) = f (PZ y + z⊥ ), A where z⊥ is the unique element of A ∩ Z⊥ . Then A f (x) = f (x) for all x ∈ A. In words, f assigns the value f (x) to each point in Rn whose orthogonal projection onto TA = Z is the same as that of x ∈ A; the gradient of f is identical to the gradient of f on the set A. Proof. Part (i) follows immediately from the relevant definitions. To prove part (ii), suppose that x ∈ A. Then by the chain and product rules, D f (x) = D( f (PZ x + z⊥ )) = D f (x)PZ . A This linear form on Rn is represented by the (column) vector f (x) = ( f (x) PZ ) ∈ Rn . But since the orthogonal projection matrix PZ is symmetric, and since f (x) ∈ Z, we conclude that f (x) = ( f (x) PZ ) = PZ f (x) = PZ f (x) = f (x). The fact that PZ is an orthogonal projection makes this proof simple: since PZ is symmetric, we are able to transfer its action from the displacement direction z ∈ Z to the vector f (x) itself. Similar considerations arise for vector-valued functions defined on affine spaces, and also for higher order derivatives. If F : A → Rm is C1 , its derivative at x ∈ A is a linear map DF(x) ∈ L(Z, Rm ), where we once again write Z for TA. While there are many matrices in Rm×n that represent this derivative, applying the logic above to each component of F shows that there is a unique such matrix, called the Jacobian matrix or derivative matrix, whose rows are elements of Z. As before, we abuse notation by denoting this matrix DF(x). But unlike before, this abuse can create some confusion: if F is “automatically” defined on all of Rn , one must be careful to distinguish between the derivative matrix of F : Rn → Rm at x and the derivative matrix of its restriction F|A : A → Rm at x; they are related by DF|A (x) = DF(x)PZ . If the function f : A → R is C2 , then its second derivative at x ∈ A is a symmetric bilinear map D2 f (x) ∈ L2 (Z, R). There are many symmetric matrices in Rn×n that represent s 2 D f (x), but there is a unique such matrix whose rows and columns are in Z. We call this matrix the Hessian of f at x, and denote it 2 f (x). If f˜ : Rn → R is any C2 extension of f , 99 then we can compute the Hessian of f as 2 f (x) = PZ 2 f˜(x)PZ ; if f ( y) = f (PZ y + z⊥ ) is the A constant orthogonal extension of f to Rn , then 2 f (x) = 2 f (x). 2.B.4 Affine Integrability A necessary and sufficient condition for a C1 vector field F : Rn → Rn to admit a potential function—that is, a scalar valued function f satisfying f (x) = F(x) for all x ∈ Rn —is that its derivative matrix DF(x) be symmetric for all x ∈ Rn . We now state a definition of potential functions for cases in which the map F is only defined on an affine space, and show that an appropriate symmetry condition on DF(x) is necessary and sufficient for a potential function to exist. We also relate these notions to their fulldimensional analogues. Let A ⊆ Rn be an affine space with tangent space Z = TA, and let z⊥ be the unique A element of A ∩ Z⊥ . Suppose that the map F : A → Rn is continuous. We call the function f : A → R a potential function for F if (2.35) f (x) = PZ F(x) for all x ∈ A. What does this definition require? Since f (x) ∈ Z, the action of f (x) on Z⊥ is null (that is, (z⊥ ) f (x) = 0 whenever z⊥ ∈ Z⊥ ). 
But since F(x) ∈ Rn , the action of F(x) on Z⊥ is not restricted in this way. Condition (2.35) requires that F(x) have the same action as f (x) on Z, but places no restriction on how F(x) acts on the complementary set Z⊥ . Theorem 2.B.8 characterizes the smooth maps on A that admit potential functions. The characterization is stated in terms of a symmetry condition on the derivatives DF(x). Theorem 2.B.8. The C1 map F : A → Rn admits a potential function if and only if DF(x) is ˆ symmetric with respect to Z × Z for all x ∈ A (i.e., if and only if z DF(x)ˆ = z DF(x)z for all z ˆ z, z ∈ Z and x ∈ A). Proof. To prove the “only if” direction, suppose that F admits a potential function f satisfying condition (2.35). This means that for all x ∈ A, F(x) and f (x) define identical linear forms in L(Z, R). By taking the derivative of each side of this identity, we find that DF(x) = 2 f (x) as bilinear forms in L2 (Z, R). But since 2 f (x) is a symmetric bilinear form on Z × Z (by virtue of being a second derivative), DF(x) is as well. The “if” direction is a consequence of the following proposition. Proposition 2.B.9. Define the map F : Rn → Rn by F( y) = PZ F(PZ y + z⊥ ). A 100 Then F admits a potential function f : Rn → R if and only if DF(x) is symmetric with respect to Z × Z for all x ∈ A. In this case, f = f is a potential function for F. A Proof. Define the function ξ : Rn → A by ξ( y) = PZ y + z⊥ . Then A DF( y) = D PZ F(ξ( y)) = PZ DF(ξ( y)) PZ . (2.36) Now F admits a potential function if and only if DF( y) is symmetric for all y ∈ Rn . Equation (2.36) tells us that the latter statement is true if and only if DF(x) is symmetric with respect to Z × Z for all x ∈ A, proving the first statement in the proposition. To prove the second statement, suppose that f is a potential function for F, and let f = f . Then since ξ(x) = x for all x ∈ A, we find that A f (x) = PZ f (x) = PZ F(x) = PZ PZ F(ξ(x)) = PZ F(x). This completes the proof of Theorem 2.B.8. If the C1 map F : A → Rn is integrable (i.e., if it admits a potential function f : A → R), can we extend F to all of Rn in such a way that the extension is integrable too? One natural way to proceed is to extend the potential function f to all of Rn . If one does so ˜ in an arbitrary way, then the projected maps PZ F and PZ F will agree regardless of how the extended potential function f˜ is chosen (cf Observation 2.2.3 and the subsequent ˜ discussion). But is it always possible to choose f˜ in such a way that that F and F are ˜ identical on A, so that the function F is a genuine extension of the function F? Theorem 2.B.10 shows one way that this can be done. Theorem 2.B.10. Suppose F : A → Rn is continuous with potential function f : A → R. Define f˜ : Rn → R by f˜( y) = f (ξ( y)) + ( y − ξ( y)) F(ξ( y)), where ξ( y) = PZ y + z⊥ , A ˜ ˜ ˜ and define F : Rn → Rn by F( y) = f˜( y). Then F A = F. Thus, any integrable map/potential function pair defined on A can be extended to a vector field/potential function pair defined on all of Rn . ˜ Proof. We can compute F from f˜ using the chain and product rules: ˜ F( y) = = f˜( y) f (ξ( y)) PZ + ( y − ξ( y)) DF(ξ( y))PZ + F(ξ( y)) (I − PZ ) 101 = PZ F(ξ( y)) PZ + ( y − ξ( y)) DF(ξ( y))PZ + F(ξ( y)) − F(ξ( y)) PZ = F(ξ( y)) PZ PZ + ( y − ξ( y)) DF(ξ( y))PZ + F(ξ( y)) − F(ξ( y)) PZ = F(ξ( y)) + ( y − ξ( y)) DF(ξ( y))PZ ˜ If x ∈ A, then ξ(x) = x, allowing us to conclude that F(x) = F(x). 
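As a small numerical companion to these definitions (a sketch added here, not taken from the original text), consider the linear map F(x) = Mx on the simplex in R3, where M is a symmetric matrix; the matrix and the test point below are arbitrary. Since DF(x) = M, the symmetry condition of Theorem 2.B.8 holds, and f(x) = (1/2) x'Mx is a potential function in the sense of (2.35): its gradient along the simplex equals the projection PZ F(x) of the payoff vector onto the tangent space.

    import numpy as np

    n = 3
    P_Z = np.eye(n) - np.ones((n, n)) / n             # projection onto the tangent space Z

    M = np.array([[1.0, 2.0, 0.0],
                  [2.0, 0.0, 1.0],
                  [0.0, 1.0, 3.0]])                   # symmetric, so z'M zhat = zhat'M z on Z x Z

    F = lambda x: M @ x                               # a map F : A -> R^n
    f = lambda x: 0.5 * x @ M @ x                     # candidate potential function

    x = np.array([0.2, 0.5, 0.3])                     # a point of the simplex

    # symmetry of DF(x) = M with respect to Z x Z (the condition in Theorem 2.B.8)
    S = P_Z @ M @ P_Z
    print(np.allclose(S, S.T))                        # True

    # condition (2.35): the affine gradient of f equals P_Z F(x); check via a directional derivative
    z = np.array([1.0, 0.0, -1.0])                    # a displacement vector in Z
    eps = 1e-6
    lhs = (f(x + eps * z) - f(x - eps * z)) / (2 * eps)
    print(lhs, (P_Z @ F(x)) @ z)                      # the two numbers agree

Nothing in this check constrains how F(x) acts off the tangent space, in line with the discussion of condition (2.35) above.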
If F takes values in Z, so that F(x) = PZ F(x) for all x ∈ A, then f˜( y) is simply f (ξ( y)), ˜ and so F( y) = PZ f (ξ( y)) = PZ F(ξ( y)); in this case, the construction in Theorem 2.B.10 is identical to the one introduced in Proposition 2.B.9. The novelty in Theorem 2.B.10 is that it lets us extend the domain of F to all of Rn in an integrable fashion even when F takes values throughout Rn . 2.N Notes Section 2.1. Sections 2.1.1 through 2.1.6 follow Sandholm (2001), while Section 2.1.7 follows Roughgarden and Tardos (2002, 2004). Random matching in two player games with common interests defines a fundamental model from population genetics; the common interest assumption reflects the shared fate of two genes that inhabit the same organism. See Hofbauer and Sigmund (1988, 1998) for further discussion. Congestion games first appear in the seminal book of Beckmann et al. (1956), who define a general model of traffic flow with inelastic demand, and use a potential function argument to establish the existence and uniqueness of Nash equilibrium. The textbook of Sheffi (1985) treats congestion games from a transportation science perspective at an undergraduate level; the more recent monograph of Patriksson (1994) provides a comprehensive treatment of the topic from this point of view. Important examples of finite player potential games are introduced by Rosenthal (1973) and Slade (1994), and characterizations of this class of normal form games are provided by Monderer and Shapley (1996), Ui (2000), and Sandholm (2008b). Example 2.1.6 and Exercise 2.1.12 are due to Sandholm (2005b). Braess’s paradox (Example 2.1.10) was first reported in Braess (1968). Exercise 2.1.11 is well known in the transportation science literature; it also corrects a mistake (!) in Corollary 5.6 of Sandholm (2001). Versions of the efficiency results in Section 2.1.6 are established by Dafermos and Sparrow (1969) for a model of traffic congestion model and by Hofbauer and Sigmund (1988) for single population games. For further discussion of constraint qualification and of the interpretation of the Kuhn-Tucker first order conditions, see Avriel (1976, Section 3.1) and Harker and Pang (1990). For a complete treatment of efficiency bounds for congestion games, including more general 102 results than those described here, see Roughgarden (2005). Section 2.2. This section follows Sandholm (2008b). The general definition and basic properties of normal form potential games are established by Monderer and Shapley (1996). The triangular integrability condition from Exercise 2.2.7 is due to Hofbauer (1985). The fact that constant games are potential games in which potential equals aggregate payoffs is important in models of evolutionary implementation; see Sandholm (2002, 2005b, 2007b). Section 2.3. This section follows Hofbauer and Sandholm (2008). Evolutionarily stable strategies and neutrally stable strategies are introduced in the single population random matching context by Maynard Smith and Price (1973) and Maynard Smith (1982), respectively. The connection between interior ESS and negative definiteness of the payoff matrix was first noted by Haigh (1975). See Hines (1987) for a survey of early work on these and related concepts. A version of the GESS concept is used by Hamilton (1967) in his pioneering analysis of sex-ratio selection under the name “unbeatable strategy”; see Hamilton (1996, p. 373–374) for an intriguing discussion of the links between the notions of unbeatable strategy and ESS. 
Further discussion of ESS can be found in the Notes to Chapter 7. For more on Rock-Paper-Scissors, see Gaunersdorfer and Hofbauer (1995). The War of Attrition is introduced in Bishop and Cannings (1978); for economic applications, see Bulow and Klemperer (1999) and the references therein. Imhof (2005) derives a closedform expression for the Nash equilibrium of the war of attrition in terms of Chebyshev polynomials of the second kind. The dominant diagonal condition used in Example 2.3.12 is a consequence of the Gerˇ gorin Disk Theorem; see Horn and Johnson (1985). This s reference also presents the trace condition used in proving Proposition 2.3.10. In the convex analysis literature, functions that satisfy our definition of stability (though typically with the inequality reversed) are called “monotone”—see Rockafellar (1970) or Hiriart-Urruty and Lemar´ chal (2001). For more on pseudomonotonicity and e pseudoconvexity, see Avriel (1976, Chapter 6) and Crouzeix (1998). The elementary proof of existence of Nash equilibrium in stable games presented in Section 2.3.5 is a translation to the present context of work on monotone operators on vector spaces due to Minty (1967). Good references on the Minmax Theorem and its connection with the Separating Hyperplane Theorem are Kuhn (2003) and Luce and Raiffa (1957). Section 2.4. The definition of supermodular population games here comes from Hofbauer and Sandholm (2007). Finite player analogues of the results presented here are established by Topkis (1979), Vives (1990), and Milgrom and Roberts (1990). Accounts of these results can be found in Fudenberg and Tirole (1991, Sec. 12.3) and Vives (2005); 103 Topkis (1998) and Vives (2000) are book-length studies. For macroeconomic applications, see Cooper (1999). Appendix 2.A. For a textbook treatment of multivariate calculus that emphasizes the notion of the derivative as a linear map, see Lang (1997, Chapter 17). For the Whitney Extension Theorem, see Abraham and Robbin (1967) or Krantz and Parks (1999). Appendix 2.B. The version of the Riesz Representation Theorem presented here, along with further discussion of calculus on affine spaces, can be found in Akin (1990). For further discussion of the dual characterizations described at the end of Section 2.B.2, see Lax (2007, Chapter 13) or Hiriart-Urruty and Lemar´ chal (2001, Section A.4.3). e 104 Part II Deterministic Evolutionary Dynamics 105 CHAPTER THREE Revision Protocols and Evolutionary Dynamics 3.0 Introduction The theory of population games developed in the previous chapters provides a simple framework for describing strategic interactions among large numbers of agents. Having explored these games’ basic properties, we now turn to modeling the behavior of the agents who play them. Traditionally, predictions of behavior in games are based on some notion of equilibrium, typically Nash equilibrium or some refinement thereof. These notions are founded on the assumption of equilibrium knowledge, which posits that each player correctly anticipates how his opponents will act. The equilibrium knowledge assumption is difficult to justify, and in contexts with large numbers of agents it is particularly strong. As an alternative to the equilibrium approach, we introduce an explicitly dynamic model of choice, a model in which agents myopically alter their behavior in response to their current strategic environment. 
This dynamic model does not assume the automatic coordination of agents’ beliefs, and it can accommodate many specifications of agents’ choice procedures. These procedures are specified formally by defining a revision protocol ρ. A revision protocol takes current payoffs and aggregate behavior as inputs; its outputs are conditional p switch rates ρi j (πp , xp ), which describe how frequently agents playing strategy i ∈ Sp who are considering switching strategies switch to strategy j ∈ Sp , given that the current payoff vector and population state are πp and xp . Revision protocols are flexible enough to accommodate a wide variety of choice paradigms, including ones based on imitation, optimization, and other approaches. 107 A population game F describes a strategic environment; a revision protocol ρ describes the procedures agents follow in adapting their behavior to that environment. Together F and ρ define a stochastic evolutionary process in which all random elements are idiosyncratic across agents. Since the number of agents are large, intuition from the law of large numbers suggests that the idiosyncratic noise will average out, so that aggregate behavior evolves according to an essentially deterministic process. After formally defining revision protocols, we spend Section 3.1 deriving the differential equation that describes this deterministic process. As the differential equation captures expected motion under the original stochastic process, we call it the mean dynamic generated by F and ρ. The examples we present in Section 3.2 show how common dynamics from the evolutionary literature can be derived through this approach. In the story above, we began with a game and a revision protocol and derived a differential equation on the state space X. But if our goal is to investigate the consequences of a particular choice procedure, it is preferable to fix this revision protocol and let the game F vary. By doing so, we generate a map from population games to differential equations that we call an evolutionary dynamic. This notion of an evolutionary dynamic is developed in detail in Section 3.3. Our derivation of deterministic evolutionary dynamics in this chapter is informal, based solely on an appeal to the idea that idiosyncratic noise should be averaged away when populations are large. We will formalize this logic in Chapter 9. There we specify a Markov process to describe stochastic evolution in a large but finite population. We then prove that over finite time spans, this Markov process converges to a deterministic limit—namely, a solution trajectory of the mean dynamic—as the population size becomes arbitrarily large. Until then, we spend Chapters 3 through 8 working directly with the deterministic limit. To prepare for this, we introduce the rudiments of the theory of ordinary differential equations in Appendix 3.A and pursue this topic further in the appendices of the chapters to come. 3.1 3.1.1 Revision Protocols and Mean Dynamics Revision Protocols We now introduce a simple, general model of myopic individual choice in population games. Let F : X → Rn be a population game with pure strategy sets (S1 , . . ., Sp ) and integer- 108 valued population masses (m1 , . . ., mp ). We suppose for now that each population is large but finite: population p ∈ P has Nmp members, where N is a positive integer. The set 1 of feasible social states is therefore X N = X ∩ N Zn = {x ∈ X : Nx ∈ Zn }, a discrete grid embedded in the original state space X. 
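For a single population this grid is easy to enumerate explicitly. The following sketch (added for illustration; not part of the original text) lists the states in X^N for a population of mass 1 with n = 3 strategies; the value N = 4 is an arbitrary choice.

    from itertools import product

    def feasible_states(N, n=3):
        # X^N = {x in X : N x is an integer vector}, for a single population of mass 1
        states = []
        for counts in product(range(N + 1), repeat=n):
            if sum(counts) == N:
                states.append(tuple(c / N for c in counts))
        return states

    grid = feasible_states(N=4)
    print(len(grid))        # 15 states: the ways of distributing 4 agents over 3 strategies
    print(grid[:3])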
We refer to the parameter N somewhat loosely as the population size. The procedures agents follow in deciding when to switch strategies and which strategies to switch to are called revision protocols. p p p p Definition. A revision protocol ρp is a map ρp : Rn × Xp → Rn ×n . The scalar ρi j (πp , xp ) is + p called the conditional switch rate from strategy i ∈ S to strategy j ∈ Sp given payoff vector πp and population state xp . We will also refer to the collection ρ = (ρ1 , . . . , ρp ) as a revision protocol when no confusion will arise. A population game F, a population size N, and a revision protocol ρ define a continuous time evolutionary process on X N . A one-size-fits-all interpretation of this process is as follows. Each agent in the society is equipped with a “stochastic alarm clock”. The times between rings of an agent’s clock are independent, each with a rate R exponential distribution. (This modeling device is often called a “Poisson alarm clock” for reasons to be made clear below.) We assume that the rate R satisfies p ρi j (πp , xp ), R ≥ max pp x ,π ,i, p j∈Sp and that the ring times of different agents’ clocks are independent of one another. The ringing of a clock signals the arrival of a revision opportunity for the clock’s owner. If an agent playing strategy i ∈ Sp receives a revision opportunity, he switches to p strategy j i with probability ρi j /R, and he continues to play strategy i with probability p 1 − j i ρi j /R; this decision is made independently of the timing of the clocks’ rings. If a switch occurs, the population state changes accordingly, from the old state x to a new state y that accounts for the agent’s choice. As the evolutionary process proceeds, the alarm clocks and the revising agents are only influenced by the prior history of the process by way of the current values of payoffs and the social state. This interpretation of the evolutionary process can be applied to any revision protocol. Still, simpler interpretations are often available for protocols with additional structure. To motivate one oft-satisfied structural condition, observe that in the interpretation p provided above, the diagonal components ρii of the revision protocol play no role what- 109 soever. But if the protocol is exact—that is, if there is a constant R > 0 such that p p ρi j (πp , xp ) = R for all πp ∈ Rn , xp ∈ Xp , i ∈ Sp , and p ∈ P , (3.1) j∈Sp p then the values of these diagonal components become meaningful: in this case, ρii /R = p 1 − j i ρi j /R is the probability that a strategy i player who receives a revision opportunity does not switch strategies. Exact protocols are particularly easy to interpret when R = 1: in this case, agents’ p clocks ring at rate 1, and for every strategy j ∈ Sp , ρi j itself represents the probability that an i player whose clock rings proceeds by playing strategy j. We will henceforth assume that protocols described as exact have clock rate R = 1 unless a different clock rate is specified explicitly. This focus on unit clock rates is not very restrictive: the only effect 1 of replacing a protocol ρ with its scalar multiple R ρ is to change the speed at which the evolutionary process runs by a constant factor. Other examples of protocols that allow alternative interpretations of the evolutionary process can be found in Section 3.2. 3.1.2 Mean Dynamics N The model above defines a stochastic process {Xt } on the state space X N . We now N derive a deterministic process that describes the expected motion of {Xt }. 
In Chapter 9, we will prove that this deterministic process provides a very good approximation of the N behavior of the stochastic process {Xt } so long as the time horizon of interest is finite and the population size is sufficiently large. But having noted this result, we will focus in the intervening chapters on the deterministic process itself. The times between rings of each agent’s stochastic alarm clock are independent and follow a rate R exponential distribution. How many times will this agent’s clock ring during the next t time units? A basic result from probability theory shows that the number of rings during time interval [0, t] follows a Poisson distribution with mean Rt. This fact is all we need to perform the analysis below; a detailed account of the exponential and Poisson distributions can be found in Appendix 9.A. N Let us now compute the expected motion of the stochastic process {Xt } over the next dt time units, where dt is small. To rein in our notation we focus on the single population case. Each agent in the population receives revision opportunities according to an exponential distribution with rate R, and so each expects to receive R dt opportunities during 110 the next dt time units. Thus, if the current state is x, the expected number of revision opportunities received by agents currently playing strategy i is approximately Nxi R dt. We say “approximately” because the value of xi may change during time interval [0, dt], but this change is very likely to be small if dt is small. Since an i player who receives a revision opportunity switches to strategy j with probability ρi j /R, the expected number of such switches during the next dt time units is approximately Nxi ρi j dt. It follows that the expected change in the use in strategy i during the next dt time units is approximately (3.2) N x j ρ ji − xi j∈S j∈S ρi j dt. The first term in expression (3.2) captures switches to strategy i from other strategies, while the second captures switches to other strategies from strategy i. Dividing expression (3.2) by N yields the expected change in the proportion of agents choosing strategy i: that is, in component xi of the social state. We obtain a differential equation for the social state by eliminating the time differential dt : ˙ xi = x j ρ ji − xi ρi j . j∈S j∈S This ordinary differential equation is the mean dynamic corresponding to revision protocol ρ. We now describe the mean dynamic for the general multipopulation case. Definition. Let F be a population game, and let ρ be a revision protocol. The mean dynamic corresponding to F and ρ is (M) p pp ˙ xi = p p x j ρ ji (Fp (x), xp ) − xi j∈Sp ρi j (Fp (x), xp ). j∈Sp 111 3.1.3 Target Protocols and Target Dynamics To conclude this section, we introduce a condition on revision protocols that is satisfied in many interesting examples, and that generates mean dynamics that are easy to describe in geometric terms. We say that ρ is a target protocol if conditional switch rates under ρ do not depend on p agents’ current strategies: in other words, ρi j may depend on the candidate strategy j, but not on the incumbent strategy i. We can represent target protocols using maps of the form p p p p τp : Rn × Xp → Rn , where ρi j ≡ τ j for all i ∈ Sp . This restriction yields mean dynamics of + the form (3.3) p p p p ˙ xi = mp τi (Fp (x), xp ) − xi τ j (Fp (x), xp ), j∈Sp which we call target dynamics. p What is the geometric interpretation of these dynamics? 
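To make definition (M) concrete, here is a minimal single-population sketch (added here; not part of the original text). It computes the mean dynamic generated by the pairwise-comparison protocol ρij(π, x) = [πj − πi]+, which reappears as Example 3.2.7 below, in a random-matching game with payoffs F(x) = Ax. The particular payoff matrix (a good Rock-Paper-Scissors game with winning payoff 2 and losing payoff 1), the initial state, and the Euler step size are arbitrary illustrative choices.

    import numpy as np

    A = np.array([[ 0.0, -1.0,  2.0],
                  [ 2.0,  0.0, -1.0],
                  [-1.0,  2.0,  0.0]])                 # good Rock-Paper-Scissors payoffs
    F = lambda x: A @ x                                # random-matching payoffs

    def rho(pi, x):
        # illustrative protocol: switch at a rate equal to the payoff gain, if positive
        return np.maximum(pi[np.newaxis, :] - pi[:, np.newaxis], 0.0)   # rho_ij = [pi_j - pi_i]_+

    def mean_dynamic(x):
        # equation (M) for a single population of mass 1
        pi = F(x)
        r = rho(pi, x)
        inflow = x @ r                  # sum_j x_j rho_ji, for each strategy i
        outflow = x * r.sum(axis=1)     # x_i sum_j rho_ij
        return inflow - outflow

    # a crude Euler path from an asymmetric initial state
    x = np.array([0.6, 0.3, 0.1])
    dt = 0.01
    for _ in range(5000):
        x = x + dt * mean_dynamic(x)
    print(x, x.sum())   # the state stays on the simplex and spirals in toward (1/3, 1/3, 1/3)

The Euler loop is only meant to show the vector field in action; the formal treatment of solution trajectories of mean dynamics appears in Section 3.3 and Appendix 3.A.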
If τp (πp , xp ) ∈ Rn is not the + zero vector, we can define p p τi (πp , xp ) λ (π , x ) = p p p and p σi (πp , xp ) i∈S = τi (πp , xp ) λp (πp , xp ) . Then σp (πp , xp ) ∈ ∆p is a probability vector, and we can rewrite equation (3.3) as (3.4) p p λ (F (x), xp ) mp σp (Fp (x), xp ) − xp p ˙ x = 0 if τp (Fp (x), xp ) 0, otherwise. The first case of equation (3.4) tells us that the population state xp ∈ Xp moves in the direction of the target state mp σp ∈ Xp , the representative of the probability vector σp ∈ ∆p in the state space Xp = mp ∆p ; moreover, motion toward the target state proceeds at rate λp . Figure 3.1.1(i) illustrates this idea in the single population case; since here the population’s mass is 1, the target state is just the probability vector σp ∈ Xp = ∆p . Now suppose that protocol τ is an exact target protocol: a target protocol that is exact with clock rate R = 1 (see equation (3.1) and the subsequent discussion). In this case, we call the resulting mean dynamic an exact target dynamic. Since exactness implies that λp ≡ 1, we often denote exact target protocols by σ rather than τ, emphasizing that the p p values of σp : Rn × Xp → ∆n are probability vectors. Exact target dynamics take the 112 σ = σ (F(x),x) σ = σ (F(x),x) . x =σ– x . x = λ (σ – x) x x (ii) an exact target dynamic (λ ≡ 1) (i) a target dynamic Figure 3.1.1: Target dynamics in a single population. especially simple form (3.5) ˙ xp = mp σp (Fp (x), xp ) − xp . The vector of motion in (3.5) can be drawn with its tail at the current state xp and its head at the target state mp σp , as illustrated in Figure 3.1.1(ii) in the single population case. 3.2 Examples We now offer a number of examples of revision protocols and their mean dynamics that we will revisit throughout the remainder of the book. Recall that Fp (x) = pp 1 mp xi Fi (x) i∈Sp represents the average payoff obtained by members of population p. It is useful to define the excess payoff to strategy i ∈ Sp , p ˆp Fi (x) = Fi (x) − Fp (x), 113 as the difference between strategy i’s payoff and the average payoff in population p. The excess payoff vector for population p is written as ˆ Fp (x) = Fp (x) − 1Fp (x). To conserve on notation, the examples to follow are stated for the single population setting. When introducing revision protocols, we let π ∈ Rn denote an arbitrary payoff ˆ vector; when the population state x ∈ X is also given, we let π = π − 1x π denote the resulting excess payoff vector. Example 3.2.1. Pairwise proportional imitation. Revision protocols of the form (3.6) ρi j (π, x) = x j ri j (π) are called imitative protocols. The natural interpretation of these protocols differs somewhat from the one presented in Section 3.1.1. Here, an agent who receives a revision opportunity chooses an opponent at random and observes her strategy. If our agent is playing strategy i and the opponent strategy j, the agent switches from i to j with probability proportional to ri j . Note that the value of x j need not be observed; instead, this term in equation (3.6) reflects the agent’s observation of a randomly chosen opponent. Suppose that after selecting an opponent, the agent imitates the opponent only if the opponent’s payoff is higher than his own, doing so in with probability proportional to the payoff difference: ρi j (π, x) = x j [π j − πi ]+ . 
The mean dynamic generated by this revision protocol is ˙ xi = x j ρ ji (F(x), x) − xi j∈S = ρi j (F(x), x) j∈S x j xi [Fi (x) − F j (x)]+ − xi j∈S = xi x j [F j (x) − Fi (x)]+ j∈S x j (Fi (x) − F j (x)) j∈S = xi Fi (x) − F(x) . This equation, which we can rewrite as (R) ˆ ˙ xi = xi Fi (x). 114 defines the replicator dynamic, the best known dynamic in evolutionary game theory. Under ˙ this dynamic, the percentage growth rate xi /xi of each strategy currently in use is equal to that strategy’s current excess payoff; unused strategies always remain so. § Example 3.2.2. Pure imitation driven by dissatisfaction. Suppose that when a strategy i player receives a revision opportunity, he opts to switch strategies with a probability that is linearly decreasing in his current payoff. (For example, agents might revise when their payoffs do not meet a uniformly distributed random aspiration level.) In the event that the agent decides to switch, he imitates a randomly selected opponent. This leads to the revision protocol ρi j (π, x) = (K − πi )x j , where the constant K is sufficiently large that conditional switch rates are always positive. The mean dynamic generated by this revision protocol is ˙ xi = x j ρ ji (F(x), x) − xi j∈S = ρi j (F(x), x) j∈S x j (K − F j (x))xi − xi (K − Fi (x)) j∈S = xi K − j∈S x j F j (x) − K + Fi (x) ˆ = xi Fi (x). Thus, this protocol’s mean dynamic is the replicator dynamic as well. § Exercise 3.2.3. Imitation of success. Consider the revision protocol ρi j (π, x) = τ j (π, x) = x j (π j − K), where the constant K is smaller than any feasible payoff. (i) Offer an interpretation of this protocol. (ii) Show that this protocol generates the replicator dynamic as its mean dynamic. (iii) Part (ii) implies that the replicator dynamic is a target dynamic. Compute the rate λ(F(x), x) and target state σ(F(x), x) corresponding to population state x. Describe how these vary as one changes the value of K. Exercise 3.2.4. In the single population setting, we call a mean dynamic an anti-target 115 dynamic if it can be expressed as ˙˜ ˜ x = λ(F(x), x) x − σ(F(x), x) , ˜ ˜ where λ(π, x) ∈ R+ and σ(π, x) ∈ ∆. (i) Give a geometric interpretation of anti-target dynamics. (ii) Show that the replicator dynamic is an anti-target dynamic. Unlike the imitative protocols introduced above, the protocols to follow have agents directly evaluate the payoffs of candidate strategies. Example 3.2.5. Logit choice. Suppose that choices are made according to the logit choice protocol, the exact target protocol defined by exp(η−1 π j ) ρi j (π, x) = σ j (π, x) = k∈S exp(η−1 πk ) . The parameter η > 0 is called the noise level. If η is large, choice probabilities under the logit rule are nearly uniform. But if η is near zero, choices are optimal with probability close to one, at least when the difference between the best and second best payoff is not too small. By equation (3.5), the exact target dynamic generated by protocol σ is (L) ˙ xi = σi (F(x), x) − xi = exp(η−1 Fi (x)) − xi . −1 k∈S exp(η Fk (x)) This is the logit dynamic with noise level η. § Example 3.2.6. Comparison to the average payoff. Consider the target protocol ˆ ρi j (π, x) = τ j (π, x) = [π j ]+ . When an agent’s clock rings, he chooses a strategy at random. If that strategy’s payoff is above average, the agent switches to it with probability proportional to its excess payoff. By equation (3.3), the induced target dynamic is (BNN) ˙ xi = τi (F(x), x) − xi τ j (F(x), x) j∈S ˆ = [Fi (x)]+ − xi ˆ [Fk (x)]+ . 
k ∈S 116 This is the Brown-von Neumann-Nash (BNN) dynamic. § Example 3.2.7. Pairwise comparisons. Suppose that ρi j (π, x) = [π j − πi ]+ . When an agent’s clock rings, he selects a strategy at random. If the new strategy’s payoff is higher than his current strategy’s payoff, he switches strategies with probability proportional to the difference between the two payoffs. The resulting mean dynamic, ˙ xi = (S) x j ρ ji (F(x), x) − xi j∈S = ρi j (F(x), x) j∈S [F j (x) − Fi (x)]+ , x j [Fi (x) − F j (x)]+ − xi j∈S j∈S is called the Smith dynamic. § 3.3 Evolutionary Dynamics With this background established, we now provide a formal definition of evolutionary dynamics. Let P = {1, . . . , p } be a set of populations with masses mp and strategy sets Sp . Let X be the corresponding set of social states: X = {x ∈ Rn : x = (x1 , . . ., xp ), where + p i∈Sp xi = mp }. Define the sets F and T as follows: F = {F : X → Rn : F is Lipschitz continuous}; T = {x : [0, ∞) → X : x is continuous} F is the set of population games with Lipschitz continuous payoffs; T is the set of continuous forward-time trajectories through the state space X. Definition. An evolutionary dynamic is a set-valued map D : F ⇒ T . It assigns each population game F ∈ F a set of trajectories D(F) ⊂ T satisfying Existence and forward invariance: For each ξ ∈ X, there is a {xt }t≥0 ∈ D(F) with x0 = ξ. Thus, for each game F and each initial condition ξ ∈ X, an evolutionary dynamic must specify at least one solution trajectory that begins at ξ and then remains in X at all positive times. 117 This definition of an evolutionary dynamic is rather general, in that it does not impose a uniqueness requirement (i.e., since it allows multiple trajectories in D(F) to emanate from a single initial condition). This generality allows us to handle dynamics defined by discontinuous differential equations and by differential inclusions—see Chapter 5. But for dynamics defined by Lipschitz continuous differential equations, this level of generality is unnecessary: in this case, standard results allow us to ensure not only existence of solutions, but also: Uniqueness: For each ξ ∈ X, there is exactly one {xt }t≥0 ∈ D(F) with x0 = ξ. Lipschitz continuity: For each t, xt = xt (ξ) is a Lipschitz continuous function of ξ. The basic results on existence and uniqueness of solutions to ordinary differential equations concern equations defined on open sets. To contend with the fact that our mean dynamics are defined on the compact, convex set X, we need conditions ensuring that solution trajectories do not leave this set. The required conditions are provided by Theorem 3.3.1: if the vector field VF : X → Rn is Lipschitz continuous, and if at each state x ∈ X, the growth rate vector VF (x) is contained in the tangent cone TX(x), the set of directions of motion from x that do not point out of X, then all of our desiderata for solution trajectories are satisfied. Theorem 3.3.1. Suppose VF : X → Rn is Lipschitz continuous, and let S(VF ) ⊂ T be the set ˙ of solutions {xt }t≥0 to x = VF (x). If VF (x) ∈ TX(x) for all x ∈ X, then S(VF ) satisfies existence, forward invariance, uniqueness, and Lipschitz continuity. Theorem 3.3.1 follows directly from Theorems 3.A.2, 3.A.6, and 3.A.8 in Appendix 3.A. Its implications for evolutionary dynamics are as follows. Corollary 3.3.2. Let the map F → VF assign each population game F ∈ F a Lipschitz continuous vector field VF : X → Rn that satisfies VF (x) ∈ TX(x) for all x ∈ X. 
Define D : F ⇒ T by ˙ D(F) = {{xt } ∈ T : {xt } solves x = VF (x)}. Then D is an evolutionary dynamic. Indeed, for each F ∈ F , the set D(F) ⊂ T satisfies not only existence and forward invariance, but also uniqueness and Lipschitz continuity. In light of Corollary 3.3.2, we can identify an evolutionary dynamic D with a map F → VF that assigns population games to vector fields on X. We sometimes use the notation V(·) to refer to an evolutionary dynamic as a map in this sense. 118 To link these results with revision protocols and mean dynamics, we characterize the tangent cone requirement explicitly: for VF (x) to lie in TX(x), the growth rates of each population’s strategies must sum to zero, so that population masses stay constant over time, and the growth rates of unused strategies must be nonnegative. Proposition 3.3.3. V (x) ∈ TX(x) if and only if these two conditions hold: p (i) i∈Sp Vi (x) = 0 for all p ∈ P . p p (ii) For all i ∈ Sp and p ∈ P , xi = 0 implies that Vi (x) ≥ 0. Thus, if V : X → Rn is the mean dynamic generated by a game F and a revision protocol ρ, then V (x) ∈ TX(x) for all x ∈ X. Exercise 3.3.4. Verify these claims. Appendix 3.A Ordinary Differential Equations 3.A.1 Basic Definitions Every continuous vector field V : Rn → Rn defines an ordinary differential equation (ODE) on Rn , namely d x dt t = V (xt ). ˙ Often we write xt for (D) d x; dt t we also express the previous equation as ˙ x = V (x). Equation (D) describes the evolution of a state variable xt over time. When the current state is xt , the current velocity of state—in other words, the speed and direction of the ˙ change in the state—is V (xt ). The trajectory {xt }t∈I is a solution to (D) if xt = V (xt ) at all times t in the interval I, so that at each moment, the time derivative of the trajectory is described by the vector field V (see Figure 3.A.1). In many applications, one is interested in solving an initial value problem: that is, in characterizing the behavior of solution(s) to (D) that start at a given initial condition ξ ∈ Rn . Example 3.A.1. Exponential growth and decay. The simplest differential equation is the linear ˙ equation x = ax on the real line. What are the solutions to this equation starting from 119 V(x 1 ) V(x0 ) x1 x2 V(x 2 ) x0 Figure 3.A.1: A solution of an ordinary differential equation. initial condition ξ ∈ R? It is easy to verify that xt = ξ exp(at) is a solution to this equation on the full time interval (−∞, ∞), since d x dt t = d (ξ exp(at)) dt = a(ξ exp(at)) = axt , as required. This solution describes a process of exponential growth or decay according to whether a is positive or negative. ˙ In fact, xt = ξ exp(at) is the only solution to x = ax from initial condition ξ. If { yt } is a ˙ solution to x = ax from any initial condition, then d dt ˙ yt exp(−at) = yt exp(−at) − ayt exp(−at) = 0. Hence, yt exp(−at) is constant, and so yt = ψ exp(at) for some ψ ∈ R. Since y0 = ξ, it must be that ψ = ξ. § 3.A.2 Existence, Uniqueness, and Continuity of Solutions Except in cases where the state variable x is one dimensional or the vector field V is linear, explicit solutions to ODEs are usually impossible to obtain. To investigate dynamics for which explicit solutions are unavailable, one begins by verifying that a solution exists and is unique, and then uses various indirect methods to determine its properties. The main tool for ensuring existence and uniqueness of solutions to ODEs is the 120 Picard-Lindel¨ f Theorem. To state this result, fix an open set O ⊆ Rn . 
We call the function o f : O → Rm Lipschitz continuous if there exists a scalar K such that f (x) − f ( y) ≤ K x − y for all x, y ∈ O. More generally, we say that f is locally Lipschitz continuous if for all x ∈ O, there exists an open neighborhood Ox ⊆ O containing x such that the restriction of f to Ox is Lipschitz continuous. It is easy to verify that every C1 function is locally Lipschitz. Theorem 3.A.2 (The Picard-Lindelof Theorem). Let V : O → Rn be locally Lipschitz contin¨ uous. Then for each ξ ∈ O, there exists a scalar T > 0 and a unique trajectory x : (−T, T) → O such that {xt } is a solution to (D) with x0 = ξ. The Picard-Lindelof Theorem is proved using the method of successive approxima¨ tions. Given an approximate solution xk : (−T, T) → O with xk = ξ, one constructs a new 0 k +1 trajectory x : (−T, T) → O using the map C that is defined as follows: t xk+1 = C (xk )t ≡ ξ + t V (xk ) ds. s 0 It is easy to see the trajectories {xt } that are fixed points of C are the solutions to (D) with x0 = ξ. Thus, if C has a unique fixed point, the theorem is proved. But it is possible to show that if T is sufficiently small, then C is a contraction in the supremum norm; therefore, the desired conclusion follows from the Banach (or Contraction Mapping) Fixed Point Theorem. If V is continuous but not Lipschitz, Peano’s Theorem tells us that solutions to (D) exist, but in this case solutions need not be unique. The following example shows that when V does not satisfy a Lipschitz condition, so that small changes in x can lead to arbitrarily large changes in V (x), it is possible for solution trajectories to escape from states at which the velocity under V is zero. ˙2 Example 3.A.3. Consider the ODE x = 3 x1/3 on R. The right hand side of this equation is continuous, but it fails to be Lipschitz at x = 0. One solution to this equation from initial condition ξ = 0 is the stationary solution xt ≡ 0. Another solution is given by xt = t3/2 . In fact, for each t0 ∈ [0, ∞), the trajectory that equals 0 until time t0 and satisfies xt = (t − t0 )3/2 thereafter is also a solution; so is the trajectory that satisfies xt = −(t − t0 )3/2 after time t0 . § The Picard-Lindelof Theorem guarantees the existence of a solution to (D) over some ¨ open interval of times. This open interval need not be the full time interval (−∞, ∞), as the following example illustrates. 121 ˙ Example 3.A.4. Consider the C1 ODE x = x2 on R. The unique solution with initial 1 condition ξ = 1 is xt = 1−t . This solution exists for all negative times, but it “explodes” in forward time at t = 1. § When V is locally Lipschitz, one can always find a maximal open time interval over which the solution to (D) from initial condition ξ exists in the domain O. If V is defined throughout Rn and is bounded, then the speed of all solution trajectories is bounded as well, which implies that solutions exist for all time. Theorem 3.A.5. If V : Rn → Rn is locally Lipschitz continuous and bounded, then for each ξ ∈ Rn , then {xt }, the unique solution to (D) with x0 = ξ, exists for all t ∈ (−∞, ∞). We will often find it convenient to discuss solutions to (D) from more than one initial condition at the same time. To accomplish this most easily, we introduce the flow of differential equation (D). Suppose that V : Rn → Rn is Lipschitz continuous, and let A ⊂ Rn be an invariant set under (D): that is, solutions to (D) with initial conditions in A exist and remain in A at all times t ∈ (−∞, ∞). 
Then the flow φ : (−∞, ∞) × A → A generated by (D) is defined by φt (ξ) = xt , where {xt }t∈(−∞,∞) is the solution to (D) with initial condition x0 = ξ. If we fix ξ ∈ A and vary t, then {φt (ξ)}t∈(−∞,∞) is the solution orbit of (D) through initial condition ξ; note also that φ satisfies the group property φt (φs (ξ)) = φs+t (ξ). If we instead fix t and vary ξ, then {φt (ξ)}ξ∈A describes the positions at time t of solutions to (D) with initial conditions in A ⊆ A. Using this last notational device, we can describe the continuous variation of solutions to (D) in their initial conditions. Theorem 3.A.6. Suppose that V : Rn → Rn is Lipschitz continuous with Lipschitz constant K, and that A ⊂ Rn is invariant under (D). Let φ be the flow of (D), and fix t ∈ (−∞, ∞). Then φt (·) is Lipschitz continuous with Lipschitz constant eK|t| : for all ξ, χ ∈ A, we have that φt (ξ) − φt (χ) ≤ |ξ − χ| eK|t| . The assumption that A is invariant is only made for notational convenience; the theorem is valid as long as solutions to (D) from ξ and χ exist throughout the time interval from 0 to t. The proof of Theorem 3.A.6 is a direct consequence of the following inequality, which is of importance in its own right. Lemma 3.A.7 (Gronwall’s Inequality). Let zt : [0, T] → R+ be continuous. Suppose C ≥ 0 and ¨ t K ≥ 0 are such that zt ≤ C + 0 Kzs ds for all t ∈ [0, T]. Then zt ≤ CeKt for all t ∈ [0, T]. 122 If we set zt = φt (ξ) − φt (χ) , then the antecedent inequality in the lemma is satisfied when C = |ξ − χ| and K is the Lipschitz constant for V , so Theorem 3.A.6 immediately follows. Also note that setting ξ = χ establishes the uniqueness of solutions to (D) from each initial condition. 3.A.3 Ordinary Differential Equations on Compact Convex Sets The Picard-Lindelof Theorem concerns ODEs defined on open subsets of Rn . In con¨ trast, evolutionary dynamics for population games are defined on the set of population states X, which is compact and convex. Fortunately, existence and uniqueness of forward time solutions can still be established in this setting. To begin, we introduce the notion of forward invariance. The set C ⊆ Rn is forward invariant under the Lipschitz ODE (D) if every solution to (D) that starts in C at time 0 remains in C at all positive times: if {xt } is the solution to (D) from ξ ∈ C, then xt exists and lies in C at all t ∈ [0, ∞). When C is forward invariant but not necessarily invariant under (D), we can speak of the semiflow φ : [0, ∞) × A → A generated by (D). While semiflows are not defined for negative times, they resemble flows in many other respects: by definition, φt (ξ) = xt , where {xt }t≥0 is the solution to (D) with initial condition x0 = ξ; also, φ is continuous in t and ξ, and φ satisfies the group property φt (φs (ξ)) = φs+t (ξ). Now suppose that the domain of the vector field V is a compact, convex set C. Intuition suggests that as long as V never points outward from C, solutions to (D) should be well defined and remain in C for all positive times. Theorem 3.A.8 tells us that if we are given a Lipschitz continuous vector field V that is defined on a compact convex set C and that never points outward from the boundary ˙ of C, then the ODE x = V (x) leaves C forward invariant. If in addition the negation of V never points outward from the boundary of C, then C is both forward and backward invariant under the ODE. Theorem 3.A.8. Let C ⊂ Rn be a compact convex set, and let V : C → Rn be Lipschitz continuous. ˆ ˆ ˆ (i) Suppose that V (x) ∈ TC(x) for all x ∈ C. 
Then for each ξ ∈ C, there exists a unique x : [0, ∞) → C with x0 = ξ that solves (D). ˆ ˆ ˆ ˆ (ii) Suppose that V (x) ∈ TC(x) ∩ (−TC(x)) for all x ∈ C. Then for each ξ ∈ C, there exists a unique x : (−∞, ∞) → C with x0 = ξ that solves (D). Proof. (i) Let V : Rn → Rn be the extension of V : C → Rn defined by V ( y) = V (ΠC ( y)), where ΠC : Rn → C is the closest point projection onto C (see Section 1.B). Then V is 123 Lipschitz continuous and bounded. Thus, Theorem 3.A.5 tells us that the ODE (3.7) ˙ y = V ( y) admits unique solutions from all initial conditions in Rn , and that these solutions exist for all (forward and backward) time. Now let ξ ∈ C, let {xt }t∈(−∞,∞) be the unique solution to (3.7) with x0 ∈ C, and suppose that xt ∈ C for all positive t; then since V and V agree on C, {xt }t≥0 must be the unique forward solution to (D) with x0 = ξ. Thus, to prove our result, it is enough to show that the set C is forward invariant under the dynamic (3.7). Define the squared distance function δC : Rn → R by δC ( y) = min | y − x|2 . x∈C One can verify that δC is differentiable with gradient δC ( y) = 2( y − ΠC ( y)). Hence, if { yt } is a solution to (3.7), then the chain rule tells us that (3.8) d δ (y ) dt C t ˙ = δC ( yt ) yt = 2( yt − ΠC ( yt )) V ( yt ) = 2( yt − ΠC ( yt )) V (ΠC ( yt )). Suppose we could show that this quantity is bounded above by zero (i.e., that when yt − ΠC ( yt ) and V (ΠC ( yt )) are nonzero, the angle between them is weakly obtuse.) This would imply that the distance between yt and C is nonincreasing over time—in other words, that δC is a Lyapunov function for the set C under the dynamic (3.7)—which would in turn imply that C is forward invariant under (3.7). We divide the analysis into two cases. If yt ∈ C, then yt = ΠC ( yt ), so expression (3.8) evaluates to zero. On the other hand, if yt C, then the difference yt − ΠC ( yt ) is in the normal cone NC(ΠC ( yt )) (see Figure 3.A.2). Since V (ΠC ( yt )) ∈ TC(ΠC ( yt )), it follows that ( yt − ΠC ( yt )) V (ΠC ( yt )) ≤ 0, so the proof is complete. ˆ ˆ ˆ (ii) If V (x) ∈ TC(x) ∩ (−TC(x)), then a slight modification of the argument above shows d that dt δC ( yt ) = 2( yt − ΠC ( yt )) V (ΠC ( yt )) = 0, and so that the distance between yt and C is constant over time under the dynamic (3.7). Therefore, C is both forward and backward invariant under (3.7), and hence under (D) as well. 124 yt Π C( yt ) V(Π C( yt )) Figure 3.A.2: The proof of Theorem 3.A.8. 3.N Notes Section 3.1: Bjornerstedt and Weibull (1996) introduce a version of the revision protocol ¨ model and derive the mean dynamics associated with certain imitative decision rules; see Weibull (1995, Sections 4.4 and 5.3) for a summary. The model studied here builds on Bena¨m and Weibull (2003) and Sandholm (2003, 2006a). Versions of target dynamics are ı considered in Sandholm (2005a) and Hofbauer and Sandholm (2008). Section 3.2: The replicator dynamic was introduced by Taylor and Jonker (1978), but is closely related to a number of older models from mathematical biology—see Schuster and Sigmund (1983). The latter authors coined the term “replicator dynamic”, borrowing the term “replicator” from Dawkins (1976, 1982). Example 3.2.1, Example 3.2.2, and Exercise 3.2.3 are due to Schlag (1998), Bjornerstedt and Weibull (1996), and Hofbauer (1995a), ¨ respectively. The logit dynamic is studied by Fudenberg and Levine (1998) and Hofbauer and Sandholm (2002, 2007). 
The BNN dynamic was introduced in the context of symmetric zero sum games by Brown and von Neumann (1950). Nash (1951) uses a discrete version of this dynamic to devise a proof of existence of Nash equilibrium in normal form games based on Brouwer’s fixed point theorem. The Smith dynamic, also known as the pairwise difference dynamic, was introduced by Smith (1984) to study the dynamics of traffic flow. Generalizations of all of the dynamics from this section are studied in the next two chapters, where additional references can be found. 125 Appendix 3.A: Hirsch and Smale (1974) and Robinson (1995) are fine introductions to ordinary differential equations at the undergraduate and graduate levels, respectively. Theorem 3.A.8 is adapted from Smirnov (2002, Theorem 5.7). 126 CHAPTER FOUR Deterministic Dynamics: Families and Properties 4.0 Introduction In the model of evolution introduced in Chapter 3, a large society of agents recurrently play a population game F by applying a revision protocol ρ. Through an informal appeal to the law of large numbers, we argued that aggregate behavior in the society can be described by a differential equation (M) ˙ x = VF (x) on the state space X. Alternatively, by fixing the revision protocol ρ, we can define a map from population games F to differential equations (M), a map that we call an evolutionary dynamic. In this chapter and the next, we introduce families of evolutionary dynamics, where the dynamics within each family are defined by qualitatively similar revision protocols. We investigate the properties of the dynamics in each family. One of our goals in doing so is to provide an evolutionary justification of the prediction of Nash equilibrium play (see Section 3.0). The first part of this chapter sets the stage for this analysis. We begin in Section 4.1 by stating general principles for evolutionary modeling in game theory. While some of these principles are implicit in our formulation of evolutionary dynamics, others must be imposed directly on our revision protocols. We do so by introducing two desiderata for revision protocols in Section 4.2. Continuity (C) asks that revision protocols depend continuously on their inputs. Scarcity of data (SD) demands that the conditional switch rate from strategy i to strategy j only depend on the 127 payoffs of these two strategies. Protocols that respect these two properties do not make unrealistic demands on the amount of information that agents in an evolutionary model must possess. Section 4.2 also offers two conditions that relate aggregate behavior under evolutionary dynamics to incentives in the underlying games. Nash stationarity (NS) asks that the rest points of the mean dynamic be precisely the Nash equilibria of the game being played. Positive correlation (PC) requires that out of equilibrium, strategies’ growth rates be positively correlated with their payoffs. Evolutionary dynamics satisfying these two properties respect payoffs in the underlying strategic interaction, and so agree with the traditional, rationalistic approach to game theory in some primitive way. Section 4.3 previews the performance of each of our families of dynamics under the four desiderata, and uses examples to highlight the properties of each. Our study of the families themselves begins in Section 4.4, which introduces imitative dynamics. These dynamics, whose prototype is the replicator dynamic, are the most thoroughly studied in the evolutionary literature. 
While imitative dynamics have many appealing properties, they admit rest points that are not Nash equilibria; thus, they fail Nash stationarity (NS), and so fail to provide a full justification of the Nash prediction. We continue to work toward this justification in Section 4.5, where we introduce excess payoff dynamics. These dynamics satisfy Nash stationarity, but as they require agents to know the average payoffs obtained by members of their population, they fail scarcity of data (SD), and so do not provide the justification we seek. We come to this justification at last in Section 4.6, where we define pairwise comparison dynamics. These dynamics, whose revision protocols only require agents to compare the payoffs of the pair of strategies at issue, satisfy all four of our desiderata, and so provide our justification of the Nash prediction. This justification is developed further in Section 4.7, which shows that any dynamic that combines imitation and pairwise comparison satisfies all of our desiderata as well.

Of course, a more compelling justification of the Nash prediction would not only link Nash equilibrium with stationary states of an evolutionary dynamic, but would also show that evolution leads to Nash equilibrium from disequilibrium states. This key issue will be the focus of Part III of the book.

4.1 Principles for Evolutionary Modeling

We begin our discussion by proposing five principles for evolutionary modeling:

(i) Large populations
(ii) Inertia
(iii) Myopia
(iv) Limited information
(v) Insensitivity to modeling details

The first principle, that populations are large, is not only part of the definition of a population game; it is also a key component of the deterministic approximation theorem. This principle buttresses the next two: inertia, that players only occasionally consider switching strategies; and myopia, that agents condition choices on current behavior and payoffs, and do not attempt to incorporate beliefs about the future course of play into their decisions. Both of these principles are built into the definition of a revision protocol: agents wait a random amount of time before revising, using procedures that only condition on current payoffs and the current social state. All of the first three principles are mutually reinforcing: myopic behavior is most sensible when opponents' behavior adjusts slowly, and when populations are large enough that individual members become anonymous, inhibiting repeated game effects.

The fourth principle holds that agents possess limited information about opponents' behavior. This principle fits in easily with the previous three. When the number of agents in an interaction is large, exact information about their aggregate behavior typically is difficult to obtain. If agents make costly efforts to gather such information, it would be incongruous to then assume that they act upon it in a shortsighted fashion. The principle of limited information is expressed in our model through restrictions on allowable revision protocols, as we discuss below.

The fifth principle for evolutionary modeling, insensitivity to modeling details, is of a different nature than the others. According to this principle, one should be most satisfied with properties of evolutionary dynamics that are not sensitive to the exact specification of the revision protocol.
If a property holds for all revision protocols with a certain “family resemblance”, then one can argue that the property is not a consequence of particular choices of functional forms, but of more fundamental assumptions about how individuals make decisions. It is because of this principle that our analyses to come focus on families of evolutionary dynamics, and on establishing properties of dynamics that hold for all family members. The principle of insensitivity to modeling details provides a defense against a well known critique of evolutionary analysis of games: that it is inherently arbitrary. According to this critique, modelers who depart from the assumption of perfect rationality are left with an overwhelming array of alternative assumptions; since the choice among these assumptions is ultimately made in an ad hoc fashion, the predictions of boundedly 129 rational models must be viewed with suspicion. Heeding the fifth principle enables us to dispel this critique: if all qualitatively similar models generate the same predictions, then arbitrariness is no longer an issue. 4.2 Desiderata for Revision Protocols and Evolutionary Dynamics We now turn from general principles for evolutionary modeling to specific desirable properties for revision protocols and their mean dynamics. 4.2.1 Limited Information Since revision protocols can be essentially arbitrary functions of the payoff vector Fp (x) and the population state xp , they allow substantial freedom to specify how agents respond to current strategic conditions. But as we argued in the introduction, it is most in keeping with the evolutionary paradigm to specify models of choice in which agents only possess limited information about their strategic environment. Our first two desiderata capture this idea. (C) (SD) Continuity: Scarcity of data: ρ is Lipschitz continuous. p p p p For all i, j ∈ Sp and p ∈ P , ρi j only depends on πi , π j , and x j . It is contrary to the evolutionary paradigm to posit revision protocols that are extremely sensitive to the exact values of payoffs or of the population state. When the population size is large, exact information about these quantities can be difficult to obtain; myopic agents are unlikely to make the necessary efforts. These concerns are reflected in condition (C), which requires that agents’ revision protocols be Lipschitz continuous functions of payoffs and the state. Put differently, condition (C) asks that small changes in aggregate behavior not lead to large changes in players’ responses. The information that agents in an evolutionary model possess depends on the application at hand: in some settings—for instance, if information is provided by a central planner—agents might have every bit of information that could conceivably be of use. But in others agents might know very little about their strategic environment. Condition (SD), scarcity of data, imposes a stringent restriction on the nature of agents’ information, allowing agents to know only those facts that are most germane to the decision at hand. Under this condition, an agent who receives a revision opportunity chooses a candidate strategy j either by observing the strategy of a randomly chosen opponent or by selecting a strategy at random from the set Sp . Then, the agent’s decision 130 about whether to switch is based only on the payoffs of the current strategy i and the candidate strategy j. Some of the protocols we consider require agents to know the payoffs of all strategies in Sp . 
While such protocols fail condition (SD), one can imagine environments where this payoff information might be within the agents' grasp. We therefore also propose this weaker scarcity of data condition:

(SD*) For all i, j ∈ S^p and p ∈ P, \rho^p_{ij} only depends on \pi^p_1, \pi^p_2, ..., \pi^p_{n^p}, and x^p_j.

To illustrate the use of these conditions, we recall some examples of revision protocols from Chapter 3. As all of these examples satisfy continuity (C), we focus our attention on scarcity of data.

Example 4.2.1. The following three revision protocols generate the replicator dynamic as their mean dynamics:

(4.1)  \rho^p_{ij}(\pi^p, x^p) = \frac{x^p_j}{m^p}\,[\pi^p_j - \pi^p_i]_+,

(4.2)  \rho^p_{ij}(\pi^p, x^p) = (K^p - \pi^p_i)\,\frac{x^p_j}{m^p},

(4.3)  \rho^p_{ij}(\pi^p, x^p) = \frac{x^p_j}{m^p}\,(\pi^p_j - K^p).

(In equations (4.2) and (4.3), we assume the constant K^p is chosen so that \rho^p_{ij}(\pi^p, x^p) ≥ 0.) Protocol (4.1) is pairwise proportional imitation, protocol (4.2) is pure imitation driven by dissatisfaction, and protocol (4.3) is imitation of success. All three of these protocols satisfy condition (SD). §

Example 4.2.2. The logit dynamic is derived from the exact target protocol

\rho^p_{ij}(\pi^p, x^p) = \sigma^p_j(\pi^p, x^p) = \frac{\exp(\eta^{-1}\pi^p_j)}{\sum_{k \in S^p} \exp(\eta^{-1}\pi^p_k)}.

This protocol fails condition (SD), but it satisfies condition (SD*). §

Example 4.2.3. The target protocol

\rho^p_{ij}(\pi^p, x^p) = \tau^p_j(\pi^p, x^p) = [\hat\pi^p_j]_+

induces the BNN dynamic as its mean dynamic. This protocol conditions on strategy j's excess payoff \hat\pi^p_j = \pi^p_j - \frac{1}{m^p}(x^p)'\pi^p, and hence on the population average payoff \frac{1}{m^p}(x^p)'\pi^p. Since computing this average payoff requires knowledge of the payoffs and utilization levels of all strategies, this protocol fails both condition (SD) and condition (SD*). §

4.2.2 Incentives and Aggregate Behavior

Our two remaining desiderata impose restrictions on mean dynamics, linking the evolution of aggregate behavior to the incentives in the underlying game. The first of the two constrains equilibrium behavior, the second disequilibrium dynamics.

(NS) Nash stationarity: V_F(x) = 0 if and only if x ∈ NE(F).
(PC) Positive correlation: V^p_F(x) ≠ 0 implies that V^p_F(x)' F^p(x) > 0.

Nash stationarity (NS) requires that the Nash equilibria of the game F and the rest points of the dynamic V_F coincide. It can be split into two distinct restrictions. First, (NS) asks that every Nash equilibrium of F be a rest point of V_F. If state x is a Nash equilibrium, then no agent benefits from switching strategies; (NS) demands that in this situation, the state be at rest under V_F. This does not mean that the agents never switch strategies at this state; instead, it requires that the expected aggregate impact of switches is nil. Second, condition (NS) asks that every rest point of V_F be a Nash equilibrium of F. If the current population state is not a Nash equilibrium, then by definition there are agents who would benefit from switching strategies. Condition (NS) requires that some of these agents eventually avail themselves of this opportunity.

Positive correlation (PC) is a mild payoff monotonicity condition that has force whenever a population is not at rest. To understand its name, view the strategy set S^p = {1, ..., n^p} as a probability space endowed with the uniform probability measure. Then the vectors V^p_F(x) ∈ R^{n^p} and F^p(x) ∈ R^{n^p} can be interpreted as random variables on S^p, making it meaningful to ask about their covariance.
To evaluate this quantity, we make a simple observation: if Y and Z are random variables and the expectation of Y is zero, then the covariance of Y and Z is just the expectation of their product: Cov(Y, Z) = E(YZ) − E(Y)E(Z) = E(YZ). p Since the dynamic VF keeps population masses constant (in other words, since VF (x) ∈ p TXp ), we know that the components of VF (x) sum to zero. Thus p 1 V (x) np k E(V p (x)) = = 0, and so k∈Sp 132 p p 1 V (x) Fk (x) np k Cov(V p (x), Fp (x)) = E(V p (x) Fp (x)) = = 1 V p (x) np Fp (x). k∈Sp p p We can therefore restate condition (PC) as follows: if VF (x) 0, then Cov(VF (x), Fp (x)) > 0. One can visualize condition (PC) through its geometric interpretation: whenever the p growth rate vector VF (x) is nonzero, it forms a strictly acute angle with the vector of payoffs Fp (x) (see Examples 4.2.5 and 4.2.7 below). In rough terms, this means that the direction of motion does not overly distort the direction of the payoff vector. In this connection, it is worth emphasizing that while the payoff vector Fp (x) can be any p p vector in Rn , forward invariance requires the growth rate vector VF (x) to be an element of the tangent cone TXp (xp ): its components must sum to zero, and it must not assign negative growth rates to unused strategies (Proposition 3.3.3). This means that in most games, evolutionary dynamics must distort payoff vectors in order to remain feasible. The dynamic that minimizes this distortion, the projection dynamic, is studied in Chapter 5. There is an important link between our two conditions: the out-of-equilibrium condition (PC) implies half of the equilibrium condition (NS). In particular, if positive correlation holds, then every Nash equilibrium of F is a rest point under VF . This is easiest to see in the single population setting. If x is a Nash equilibrium of F, then F(x) is in the normal cone of X at x. Since VF (x) is a feasible direction of motion from x, it is in the tangent cone of X at x; thus, the angle between F(x) and VF (x) cannot be acute. Positive correlation therefore implies that x is a rest point of VF . More generally, we have the following result. Proposition 4.2.4. If VF satisfies (PC), then x ∈ NE(F) implies that VF (x) = 0. Proof. Suppose that VF satisfies (PC) and that x ∈ NE(F). Recall that x ∈ NE(F) ⇔ F(x) ∈ NX(x) ⇔ [v F(x) ≤ 0 for all v ∈ TX(x)] . p Now fix p ∈ P , and define the vector v ∈ Rn by vp = VF (x) and vq = 0 for q p. Then p v ∈ TX(x) by construction, and so VF (x) Fp (x) = v F(x) ≤ 0. Condition (PC) then implies p that VF (x) = 0. Since p was arbitrary, we conclude that VF (x) = 0. Example 4.2.5. Consider the two-strategy coordination game F1 (x) 1 0 x1 x1 F(x) = F (x) = 0 2 x = 2x , 2 2 2 133 x1 x2 Figure 4.2.1: Condition (PC) in 12 Coordination. and the replicator dynamic for this game, ˆ V1 (x) x1 F1 (x) x1 x1 − (x1 )2 + 2(x2 )2 V (x) = ˆ V (x) = x F (x) = x 2x − (x )2 + 2(x )2 22 2 2 1 2 2 , both of which are graphed in Figure 4.2.1. At each state that is not a rest point, the angle between F(x) and V (x) is acute. At each Nash equilibrium, no vector that forms an acute angle with the payoff vector is a feasible direction of motion; thus, all Nash equilibria must be rest points under V . § ˆ Exercise 4.2.6. Suppose that F is a two-strategy game, and let VF and VF be Lipschitz continuous dynamics that satisfy condition (PC). Show that if neither dynamic is at rest ˆ ˆ at state x ∈ X, then VF (x) is a positive multiple of VF (x). 
Conclude that if VF and VF also ˆ satisfy condition (NS), then VF (x) = k(x)VF (x) for some positive function k : X → (0, ∞). In ˆ ˆ this case, the phase diagrams of VF and VF are identical, and solutions to VF and VF differ only by a change in speed (cf Exercise 4.4.10 below). § Example 4.2.7. Consider the three-strategy coordination game F1 (x) 1 0 0 x1 x1 F (x) = 0 2 0 x = 2x . 2 2 2 F(x) = F (x) 0 0 3 x 3x 2 2 2 134 Since payoffs are now vectors in R3 , they can no longer be drawn in a two-dimensional picture, so we draw the projected payoff vectors x1 − 1 (x1 + 2x2 + 3x3 ) 3 1 2x − 1 (x + 2x + 3x ) 2 31 ΦF(x) = I − 3 11 F(x) = 2 3 3x − 1 (x + 2x + 3x ) 3 2 3 31 instead. Since dynamic VF also takes values in TX, drawing the growth rate vectors VF (x) and the projected payoff vectors ΦF(x) is enough to evaluate property (PC) (cf Exercise 4.2.8). In Figure 4.2.2(i), we plot the projected payoffs ΦF and the replicator dynamic; in Figure 4.2.2(ii) we plot the projected payoffs ΦF and the BNN dynamic. In both cases, except when VF (x) = 0, the angles between VF (x) and ΦF(x) are always acute. At each Nash equilibrium x, all directions of motion from x that form an acute angle with ΦF(x) are infeasible, and so both dynamics are at rest. § Exercise 4.2.8. Let VF be an evolutionary dynamic for the single population game F. Show that sgn(VF (x) F(x)) = sgn(VF (x) ΦF(x)). Thus, to check that (PC) holds, it is enough to verify that it holds with respect to projected payoffs. 4.3 Families of Evolutionary Dynamics In the remainder of this chapter and in Chapter 5, we introduce various families and examples of evolutionary dynamics, and we evaluate them in terms of our four desiderata: continuity (C), scarcity of data (SD), Nash stationarity (NS), and positive correlation (PC). Table 4.1 summarizes the results. Let us briefly mention a few of the main ideas from the analyses to come. • Imitative dynamics, including the replicator dynamic, satisfy all of the desiderata except for Nash stationarity (NS): these dynamics admit rest points that are not Nash equilibria. • Excess payoff dynamics, including the BNN dynamic, satisfy all of our desiderata except scarcity of data (SD): the revision protocols that generate these dynamics involve comparisons between the individual strategies’ payoffs and the population’s average payoff. • By introducing revision protocols that only require pairwise comparisons of payoffs, we obtain a family of evolutionary dynamics that satisfy all four desiderata. 135 1 2 3 (i) The replicator dynamic 1 2 3 (ii) The BNN dynamic Figure 4.2.2: Condition (PC) in 123 Coordination. 136 Family Leading example (C) (SD) (NS) (PC) imitation excess payoff pairwise comparison replicator BNN Smith best response logit projection yes yes yes no yes no yes no yes yesa yesa no perturbed best response a b no yes yes yesb no yes yes yes yes yesb no yes These dynamics fail condition (SD), but satisfy the weaker requirement (SD*). The best response dynamics satisfy versions of conditions (NS) and (PC) defined for differential inclusions. Table 4.1: Families of evolutionary dynamics and their properties. • The best response dynamic satisfies versions of all of the desiderata except continuity: its revision protocol depends discontinuously on payoffs. • We can eliminate the discontinuity of the best response dynamic by introducing perturbations, but at the cost of violating the incentive conditions. 
In fact, choosing the level of perturbations involves a tradeoff between condition (C) and conditions (NS) and (PC): smaller perturbations reduce the degree of smoothing, while larger perturbations make the failures of the incentive conditions more severe.

• The projection dynamic minimizes the discrepancy at each state between the vector of payoffs and the vector representing the directions of motion. It satisfies both of the incentive conditions, but neither of the limited information conditions. There are a variety of close connections between the projection dynamic and the replicator dynamic.

Figure 4.3.1 presents phase diagrams for the six basic dynamics in the standard Rock-Paper-Scissors game

F(x) = \begin{pmatrix} F_R(x) \\ F_P(x) \\ F_S(x) \end{pmatrix} = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_R \\ x_P \\ x_S \end{pmatrix} = \begin{pmatrix} x_S - x_P \\ x_R - x_S \\ x_P - x_R \end{pmatrix}.

The unique Nash equilibrium of RPS places equal mass on each strategy: x* = (1/3, 1/3, 1/3). In the phase diagrams, colors represent speed of motion: within each diagram, motion is fastest in the red regions and slowest in the blue ones. In this example, the maximum speed under the replicator dynamic is \sqrt{2}/4 ≈ .3536, while the maximum speed under the other five dynamics is \sqrt{2} ≈ 1.4142.

[Figure 4.3.1: Six basic dynamics in the Rock-Paper-Scissors game: (i) replicator, (ii) projection, (iii) Brown-von Neumann-Nash, (iv) Smith, (v) best response, (vi) logit(.08).]

Some remarks on the phase diagrams:

• The replicator and projection dynamics exhibit closed orbits around the Nash equilibrium. Under the other four dynamics, the Nash equilibrium is globally asymptotically stable.

• The replicator dynamic has rest points at the Nash equilibrium and at each of the pure states. Under the other dynamics, the only rest point is the Nash equilibrium.

• The phase diagram for the BNN dynamic can be divided into six regions. In the "odd" regions, exactly one strategy has above average payoffs, so the dynamic moves directly toward a pure state, just as under the best response dynamic. In the "even" regions, two strategies have above average payoffs; as these regions are traversed, the "target point" of the dynamic passes from one pure state to the next.

• Compared to those of the BNN dynamic, solutions of the Smith dynamic approach the Nash equilibrium at closer angles and at higher speeds.

• Under the best response dynamic, solution trajectories always aim directly toward the state representing the current best response. The trajectories are kinked whenever best responses change.

• Unlike those of the best response dynamic, solution trajectories of the logit dynamic are smooth. The directions of motion under the two dynamics are similar, except at states near the boundaries of the best response regions.

• Under the replicator dynamic, the boundary consists of three rest points and three heteroclinic orbits that connect distinct rest points. All told, the boundary forms what is known as a heteroclinic cycle.

• Under the projection dynamic, there is a unique forward solution from each initial condition, but backward solutions are not unique. For example, the outermost closed orbit (the inscribed circle) is reached in finite time by every solution trajectory that starts outside of it. In addition, there are solution trajectories that start in the interior of the state space and reach the boundary in finite time, an impossibility under any of the other dynamics.

We develop these and many other observations in the sections to come.
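To make the remarks above concrete, the following simulation sketch (not from the text; the initial condition, horizon, and forward Euler scheme are arbitrary choices) integrates the replicator and Smith dynamics in standard Rock-Paper-Scissors and reports each trajectory's final distance from the Nash equilibrium x* = (1/3, 1/3, 1/3).

```python
import numpy as np

A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])                          # standard Rock-Paper-Scissors
F = lambda x: A @ x

def replicator(x):
    pi = F(x)
    return x * (pi - x @ pi)                              # x_i (F_i(x) - average payoff)

def smith(x):
    pi = F(x)
    gain = np.maximum(pi[None, :] - pi[:, None], 0.0)     # gain[i, j] = [F_j - F_i]_+
    return x @ gain - x * gain.sum(axis=1)

x_star = np.ones(3) / 3
for name, V in [("replicator", replicator), ("Smith", smith)]:
    x, dt = np.array([0.6, 0.25, 0.15]), 0.002
    for _ in range(50000):                                # integrate up to t = 100
        x = x + dt * V(x)
    print(name, round(float(np.linalg.norm(x - x_star)), 3))
# Consistent with the phase diagrams: the replicator trajectory keeps cycling and
# stays well away from x*, while the Smith trajectory ends up close to x*.
```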
4.4 Imitative Dynamics

4.4.1 Definition

Imitative dynamics are based on revision protocols of the form

(4.4)  \rho^p_{ij}(\pi^p, x^p) = \hat x^p_j\, r^p_{ij}(\pi^p, x^p),

where \hat x^p_j = x^p_j / m^p is the proportion of population p members playing strategy j ∈ S^p. We can interpret these protocols as follows: When an agent's clock rings, he randomly chooses an opponent from his population. If the agent is playing strategy i ∈ S^p and the opponent strategy j ∈ S^p, then the agent imitates the opponent with probability proportional to the conditional imitation rate r^p_{ij}.

The revision protocol (4.4) generates a mean dynamic of the form

(4.5)  \dot x^p_i = \sum_{k \in S^p} x^p_k\, \rho^p_{ki}(F^p(x), x^p) - x^p_i \sum_{k \in S^p} \rho^p_{ik}(F^p(x), x^p)
              = \sum_{k \in S^p} x^p_k\, \hat x^p_i\, r^p_{ki}(F^p(x), x^p) - x^p_i \sum_{k \in S^p} \hat x^p_k\, r^p_{ik}(F^p(x), x^p)
              = \hat x^p_i \sum_{k \in S^p} x^p_k \left( r^p_{ki}(F^p(x), x^p) - r^p_{ik}(F^p(x), x^p) \right).

If the revision protocol satisfies the requirements below, the differential equation above defines an imitative dynamic.

Definition. Suppose that the conditional imitation rates r^p_{ij} are Lipschitz continuous, and that net conditional imitation rates are monotone:

(4.6)  \pi^p_j ≥ \pi^p_i \;\Leftrightarrow\; r^p_{kj}(\pi^p, x^p) - r^p_{jk}(\pi^p, x^p) ≥ r^p_{ki}(\pi^p, x^p) - r^p_{ik}(\pi^p, x^p)

for all i, j, k ∈ S^p and p ∈ P. Then the map from population games F ∈ F to differential equations (4.5) is called an imitative dynamic.

Condition (4.6) says that whenever strategy j ∈ S^p has a higher payoff than strategy i ∈ S^p, then the net rate of imitation from any strategy k ∈ S^p to j exceeds the net rate of imitation from k to i. We illustrate this condition in the next subsection using a variety of examples; the condition's implications for aggregate behavior are developed thereafter.

Example 4.4.1. The replicator dynamic. The fundamental example of an imitative dynamic is the replicator dynamic, defined by

(R)  \dot x^p_i = x^p_i\, \hat F^p_i(x).

Under the replicator dynamic, the percentage growth rate of each strategy i ∈ S^p currently in use equals its excess payoff \hat F^p_i(x) = F^p_i(x) - \bar F^p(x); unused strategies remain so. We provide a variety of derivations of the replicator dynamic below. §

4.4.2 Examples

The examples to follow are expressed in the setting of a single, unit mass population, so that \hat x_i = x_i. They are easily recast for multipopulation settings.

Example 4.4.2. Imitation via pairwise comparisons. Suppose that \rho_{ij}(\pi, x) = x_j\, \phi(\pi_j - \pi_i), where \phi : R → R_+ equals 0 on (−∞, 0] and is strictly increasing on [0, ∞). In this case, an agent only imitates his randomly chosen opponent when the opponent's payoff is higher than the agent's own. Protocols of this form satisfy condition (4.6). If we write \psi(d) = \phi(d) - \phi(-d), then we can express the corresponding mean dynamic as

\dot x_i = x_i \sum_{k \in S} x_k \left( \phi(F_i(x) - F_k(x)) - \phi(F_k(x) - F_i(x)) \right) = x_i \sum_{k \in S} x_k\, \psi(F_i(x) - F_k(x)).

Setting \phi(d) = [d]_+ gives us the pairwise proportional imitation protocol from Example 3.2.1. In this case \psi(d) = d, and the mean dynamic is the replicator dynamic (R). §

Exercise 4.4.3. Suppose we generalize Example 4.4.2 by letting \rho_{ij}(\pi, x) = x_j\, \phi_{ij}(\pi_j - \pi_i), where each function \phi_{ij} equals 0 on (−∞, 0] and is strictly increasing on [0, ∞). Explain why the resulting mean dynamic need not satisfy condition (4.6), and so need not be an imitative dynamic. (For an interesting contrast, see Section 4.6.)

Example 4.4.4. Pure imitation driven by dissatisfaction. Suppose that \rho_{ij}(\pi, x) = a(\pi_i)\, x_j.
Then when the clock of an i player rings, he abandons his current strategy with probability proportional to the abandonment rate a(πi ); in such instances, he imitates a randomly chosen opponent. In this case, condition (4.6) requires that a : R → R+ be strictly decreasing, and the mean dynamic becomes (4.7) ˙ xi = xi k∈S xk a(Fk (x)) − a(Fi (x)) = xi 141 k ∈S xk a(Fk (x)) − a(Fi (x)) . If abandonment rates take the linear form a(πi ) = K − πi (where K is large enough), then (4.7) is again the replicator dynamic (R). § Example 4.4.5. Imitation of success. Suppose ρi j (π, x) = x j c(π j ). Then when an agent’s clock rings, he picks an opponent at random; if the opponent is playing strategy j, the player imitates him with probability proportional to the copying rate c(π j ). In this case, condition (4.6) requires that c : R → R+ be strictly increasing, and the mean dynamic becomes (4.8) ˙ xi = xi k∈S xk c(Fi (x)) − c(Fk (x)) = xi c(Fi (x)) − k∈S xk c(Fk (x)) . Since ρ is a target protocol (i.e., since ρi j ≡ τ j ), the mean dynamic (4.8) is actually a target dynamic: xk c(Fk (x)) ˙ x i = k ∈ S 0 xi c(Fi (x)) − xi k∈S xk c(Fk (x)) if x j c(F j (x)) 0 for some j ∈ S, otherwise. If copying rates are of the linear form c(π j ) = π j + K (for K large enough), then (4.8) is once again the replicator dynamic (R). If in addition payoffs are nonnegative and average payoffs are positive, we can choose c(π j ) = π j , so that (4.8) becomes (4.9) ˙ xi = F(x) xi Fi (x) F(x) − xi . Here, the target state is proportional to the vector of popularity-weighted payoffs xi Fi (x), with the rate of motion toward this state governed by average payoffs F(x). § Exercise 4.4.6. Why is the restriction on payoffs needed to obtain equation (4.9)? Example 4.4.7. Imitation of success with repeated sampling. Suppose that ρi j (π, x) = x j w(π j ) , k∈S xk w(πk ) where k∈S xk w(πk ) > 0. Here, when an agent’s clock rings he chooses an opponent at random. If the opponent is playing strategy j, the agent imitates him with probability proportional to the copying weight w(π j ). If the agent does not imitate this opponent, he draws a new opponent at random and repeats the procedure. In this case, condition (4.6) requires that w : R → R+ be strictly increasing. Since ρ is an exact target protocol (i.e., 142 (i) The replicator dynamic (ii) The Maynard Smith replicator dynamic Figure 4.4.1: Two imitative dynamics in 123 Coordination. since ρi j ≡ σ j and (4.10) ˙ xi = j∈S σ j ≡ 1), it induces the exact target dynamic xi w(Fi (x)) − xi . § k∈S xk w(Fk (x)) We conclude with two important instances of repeated sampling. Example 4.4.8. The Maynard Smith replicator dynamic. If payoffs are nonnegative and average payoffs are positive, we can let copying weights equal payoffs: w(π j ) = π j . The resulting exact target dynamic, (4.11) ˙ xi = xi Fi (x) F(x) − xi , is known as the Maynard Smith replicator dynamic. Example 4.4.5 showed that under the same assumptions on payoffs, the replicator dynamic takes the form (4.9). The Maynard Smith replicator dynamic (4.11) differs from (4.9) only in that the target state is approached at a unit rate rather than at a rate determined by average payoffs; thus, motion under (4.9) is relatively fast when average payoffs are relatively high. Comparing the protocol here to the one from Example 4.4.5 reveals the source of the difference in speeds: under repeated sampling, the overall payoff level has little influence on the probability that a revising agent winds up switching strategies. 
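The claimed relationship between (4.9) and (4.11) can be checked directly: at any state, the Maynard Smith vector field equals the standard replicator vector field divided by the average payoff. The sketch below is illustrative only; the positive-payoff coordination game and the randomly sampled states are my own choices, not the text's.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[3.0, 1.0, 1.0],      # an arbitrary game with positive payoffs
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 3.0]])
F = lambda x: A @ x

def replicator(x):                   # equation (4.9), equivalently (R)
    pi = F(x)
    return x * (pi - x @ pi)

def maynard_smith(x):                # equation (4.11)
    pi = F(x)
    return x * pi / (x @ pi) - x

for _ in range(5):
    x = rng.dirichlet(np.ones(3))    # a random interior state
    avg = x @ F(x)                   # average payoff
    print(np.allclose(maynard_smith(x), replicator(x) / avg))   # True each time
```

Dividing by a positive state-dependent scalar rescales speed but not direction, which is exactly the single-population equivalence discussed next.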
In the single population setting, the phase diagrams of (4.9) and (4.11) are identical, and the dynamics only differ in terms of the speed at which solution trajectories are traversed 143 H T H T H H T T (i) The replicator dynamic (ii) The Maynard Smith replicator dynamic Figure 4.4.2: Two imitative dynamics in Matching Pennies. (cf Exercise 4.4.10). We illustrate this in Figure 4.4.1, which presents phase diagrams for the two dynamics in 123 Coordination. When there are multiple populations, the fact that average payoffs differ across populations implies that the phase diagrams of (4.9) and (4.11) no longer coincide. This is illustrated in Figure 4.4.2, which presents phase diagrams for this Matching Pennies game: H T h 2, 1 1, 2 t 1, 2 2, 1 While interior solutions of (4.9) form closed orbits around the unique Nash equilibrium 1 x∗ = (( 2 , 1 ), ( 1 , 1 )), interior solutions of (4.11) converge to x∗ . § 2 22 Example 4.4.9. The i-logit dynamic. If the copying weights w(π j ) = exp(η−1 π j ) are exponential functions of payoffs, the exact target dynamic (4.10) becomes the i-logit dynamic with noise level η > 0. ˙ xi = xi exp(η−1 Fi (x)) − xi . −1 k∈S xk exp(η Fk (x)) Here, the ith component of the target state is proportional both to the mass of agents playing strategy i and to an exponential function of strategy i’s payoff. If η is small, and x is not too close to the boundary of X or of any best response region, then the target state 144 is close to eb(x) , the vertex of X corresponding to the current best response. Therefore, in most games, the i-logit dynamic with small η approximates the best response dynamic ˙ x ∈ B(x) − x on much of int(X). We illustrate this in Figure 4.4.3, which presents four i-logit dynamics (with η = .5, .1, .05, and .01) and the best response dynamic for the anticoordination game −1 0 0 x1 −x1 0 −1 0 x = −x . § 2 2 F(x) = Ax = x −x 0 0 −1 3 3 Exercise 4.4.10. Changes of speed and reparameterizations of time. Let V : Rn → Rn be a Lipschitz continuous vector field and let k : Rn → (0, ∞) be a positive Lipschitz continuous ˙ function. Let {xt } be a solution to x = V (x) with initial condition ξ, and let { yt } be a solution t ˙ to x = k(x)V (x), also with initial condition ξ. Show that yt = xI(t) , where I(t) = 0 k( ys ) ds. 4.4.3 Biological Derivations of the Replicator Dynamic While we have derived the replicator dynamic from models of imitation, its origins lie in mathematical biology, where it arises from models of intra- and inter-species competition. The next two exercises, which are set in a single population, consider the replicator dynamic from this point of view. Exercise 4.4.11. In the basic game theoretic model of natural selection within a single animal species, each strategy i ∈ S represents a behavioral type. The value of Fi (x) represents the (reproductive) fitness of type i when the current proportions of types are described by x ∈ int(X). In particular, if we let yi ∈ (0, ∞) represent the (absolute) number animals of type i in the population, then the evolution of the population is described by (4.12) ˙ yi = yi Fi (x), where xi = yi j∈S yj . Show that under equation (4.12), the vector x describing the proportions of animals of each of each type evolves according to the replicator equation (R). Exercise 4.4.12. The Lotka-Volterra equation. The Lotka-Volterra equation is a fundamental model of biological competition among members of multiple species. 
When there are n − 1 species, the equation takes the form (4.13) ˙ yk = yk bk + (My)k , k ∈ {1, . . . , n − 1}, 145 (i) The i-logit(.5) dynamic (ii) The i-logit(.1) dynamic (iii) The i-logit(.05) dynamic (iv) The i-logit(.01) dynamic (v) The best response dynamic Figure 4.4.3: i-logit and best response dynamics in Anticoordination. 146 where bk is the baseline growth rate for species k, and the interaction matrix M ∈ R(n−1)×(n−1) governs cross-species effects. Show that after the change of variable xi = yi 1+ n −1 l =1 yl and xn = 1 1+ n−1 l =1 yl , the n − 1 dimensional Lotka-Volterra equation (4.13) is equivalent up to a change of speed (cf Exercise 4.4.10) to the n strategy replicator dynamic ˙ xi = xi ((Ax)i − x Ax), i ∈ {1, . . . , n}, where the payoff matrix A ∈ Rn×n is related to M ∈ R(n−1)×(n−1) and b ∈ Rn−1 by the R(n−1)×n matrix equation M b = I (−1) A. If M and b are given, this equation determines A up to an additive constant in each column. Thus, A can always be chosen so that either the elements of its last row or the elements of its diagonal are all 0. 4.4.4 Extinction and Invariance We now derive properties shared by all imitative dynamics. First of all, it follows immediately from equation (4.5) that all imitative dynamics satisfy extinction: if a strategy is unused, its growth rate is zero. (4.14) p p If xi = 0, then Vi (x) = 0. Extinction implies that the growth rate vectors V (x) are always tangent to the boundaries of X: formally, V (x) is not only in TX(x), but also in −TX(x) (cf Proposition 3.3.3). Thus, since imitative dynamics are Lipschitz continuous, it follows from Theorem 3.A.8 in Chapter 3 that solutions to imitative dynamics exist for all positive and negative times. ˙ Proposition 4.4.13 (Forward and backward invariance). Let x = VF (x) be an imitative dynamic. Then for each initial condition ξ ∈ X, this dynamic admits a unique solution trajectory in T (−∞,∞) = {x : (−∞, ∞) → X : x is continuous}. Extinction also implies a second invariance property: if {xt } is a solution trajectory of an imitative dynamic, then the support of xt is independent of t. Uniqueness of solution 147 trajectories, which is implied by the Lipschitz continuity of the dynamic, is an essential ingredient of the proof of this result. Theorem 4.4.14 (Support invariance). If {xt } is a solution trajectory of an imitative dynamic, p then the sign of component (xt )i is independent of t ∈ (−∞, ∞). ˙ Proof. Let {xt } be a solution to the imitative dynamic x = V (x), and suppose that x0 = ξ. p p Suppose that ξi = 0; we want to show that (xt )i = 0 for all t ∈ (−∞, ∞). To accomplish ˆ this, we define a new vector field V : X → Rn as follows: 0 if j = i and q = p, ˆ q (x) = Vj q V (x) otherwise. j p ˆ ˆ ˙ ˆ ˆ If {xt } ⊂ X is the unique solution to x = V (x) with x0 = ξ, then (xt )i = 0 for all t. But V p ˆ ˆ and V are identical whenever xi = 0 by extinction (4.14); therefore, {xt } is also a solution ˙ ˙ ˆ to x = V (x). Since solutions to x = V (x) are unique, it must be that {xt } = {xt }, and hence p that (xt )i = 0 for all t. p p Now suppose that ξi > 0. If xt = χ satisfied χi = 0, then the preceding analysis would ˙ imply that there are two distinct solutions to x = V (x) with xt = χ, one that is contained in the boundary of X and one that is not. As this would contradict uniqueness of solutions, p we conclude (xt )i > 0 at all times t. All of the phase diagrams presented in this section illustrate the face invariance property. 
The next example points out one of its more subtle consequences. Example 4.4.15. Figure 4.4.4 presents the phase diagram of the replicator dynamic for a game with a strictly dominant strategy: for all x ∈ X, F1 (x) = 1 and F2 (x) = F3 (x) = 0. There are two connected components of rest points: one consisting solely of the unique Nash equilibrium e1 , and the other containing those states at which strategy 1 is unused. Clearly, the latter component is unstable, as all nearby solution trajectories lead away from it and toward the Nash equilibrium. But as the coloring of the figure indicates, the speed of motion away from the unstable component is very slow: if a small behavior disturbance pushes the state off of the component, it may take a long time before the stable equilibrium is reached. § 148 Figure 4.4.4: The replicator dynamic in a game with a strictly dominant strategy. 4.4.5 Monotone Percentage Growth Rates and Positive Correlation We now turn to monotonicity properties of imitative dynamics. All dynamics of form (4.5) can be expressed as (4.15) p p p p p p p p ˆ xk rki (Fp (x), xp ) − rik (Fp (x), xp ) . ˙ xi = Vi (x) = xi Gi (x), where Gi (x) = k∈Sp p p p If strategy i ∈ Sp is in use, then Gi (x) = Vi (x)/xi represents the percentage growth rate of the number of agents using this strategy. Observation 4.4.16 notes that under every imitative dynamic (as defined in Section 4.4.1), strategies’ percentage growth rates are ordered by their payoffs. Observation 4.4.16. All imitative dynamics exhibit monotone percentage growth rates: (4.16) p p p p Gi (x) ≥ G j (x) if and only if Fi (x) ≥ F j (x). This observation is immediate from condition (4.6), which defines imitative dynamics. Condition (4.16) is a strong restriction on strategies’ percentage growth rates. We now show that it implies our basic payoff monotonicity condition, which imposes a weak restriction on strategies’ absolute growth rates. 149 Theorem 4.4.17. All imitative dynamics satisfy positive correlation (PC). Proof. Let x be a social state at which V p (x) To do so, we define p p 0; we need to show that V p (x) Fp (x) > 0. p p S+ (x) = {i ∈ Sp : Vi (x) > 0} and S− (x) = { j ∈ Sp : V j (x) < 0} to be the sets of population p strategies with positive and negative absolute growth rates, respectively. By extinction (4.14), these sets are contained in the support of xp . It follows that p p p S+ (x) = {i ∈ S : p p xi > 0 and Vi (x) p xi > 0} and p S− (x) = {j ∈ S : p p xj > 0 and V j (x) p xj < 0}. Since V (x) ∈ TX, we know from Proposition 3.3.3 that p p Vk (x) = − Vk (x), p p k∈S+ (x) k∈S− (x) and since V p (x) 0, these expressions are positive. Therefore, condition (4.16) enables us to conclude that p p p Vk (x) Fk (x) + V p (x) Fp (x) = p Vk (x) Fk (x) p p k∈S+ (x) k∈S− (x) p p i∈S+ (x) p j∈S− (x) p k∈S+ (x) p p j∈S− (x) Vk (x) p k∈S− (x) p = mpin Fi (x) − max F j (x) p i∈S+ (x) p Vk (x) + max F j (x) p ≥ min Fi (x) p Vk (x) > 0. p k∈S+ (x) We conclude this section by considering two other monotonicity conditions that appear in the literature. Exercise 4.4.18. In the single population setting, an imitative dynamic (4.15) has aggregate monotone percentage growth rates if (4.17) ˆ ˆ y G(x) ≥ y G(x) if and only if y F(x) ≥ y F(x) ˆ for all population states x ∈ X and mixed strategies y, y ∈ ∆. (i) Show that any imitative dynamic satisfying condition (4.17) is equivalent to the replicator dynamic up to a reparameterization of time (see Exercise 4.4.10). 
(Hint: 150 Use Proposition 2.B.6 to show that condition (4.17) implies that ΦG(x) = c(x) ΦF(x) for some c(x) > 0. Then use the fact that G(x) x = 0 (why?) to conclude that ˆ ˙ xi = k(x) xi Fi (x).) (ii) If a multipopulation imitative dynamic satisfies the natural analogue of condition (4.17), what can we conclude about the dynamic? Exercise 4.4.19. An dynamic of form (4.15) has sign-preserving percentage growth rates if (4.18) p ˆp sgn(Gi (x)) = sgn(Fi (x)). Show that any such dynamic satisfies positive correlation (PC). (Note that dynamics satisfying condition (4.18) need not satisfy condition (4.6), and so need not be imitative dynamics as we have defined them here. In fact, there does not appear to be an intuitive restriction on revision protocols that leads to condition (4.18).) 4.4.6 Rest Points and Restricted Equilibria Since all imitative dynamics satisfy positive correlation (PC), Proposition 4.2.4 tells us that their rest points include all Nash equilibria of the underlying game F. On the other hand, face invariance tells us that non-Nash rest points can exist—for instance, while pure states in X are not always Nash equilibria of F, they are necessarily rest points of VF . To characterize the set of rest points, we first recall the definition of Nash equilibrium: p p p NE(F) = {x ∈ X : xi > 0 ⇒ Fi (x) = max F j (x)}. p j∈S Bearing this definition in mind, we define the set of restricted equilibria of F by p p p RE(F) = {x ∈ X : xi > 0 ⇒ Fi (x) = max F j (x)}. p j∈S :x j >0 In words, x is a restricted equilibrium of F if it is a Nash equilibrium of a restricted version of F in which only strategies in the support of x can be played. Exercise 4.4.20. Alternate definitions of restricted equilibrium. (i) Show that x ∈ RE(F) if and only if within each population p, all strategies in the support of xp achieve the p p same payoff: RE(F) = {x ∈ X : xi > 0 ⇒ Fi (x) = πp }. (ii) We can also offer a geometric definition of restricted equilibrium. Let X[x] be the ˆ ˆ set of social states whose supports are contained in the support of x : X[x] = {x ∈ ˆ p p ˆ X : xi = 0 ⇒ xi = 0}. Show that x ∈ RE(F) if and only if the payoff vector F(x) is contained in the normal cone of X[x] at x : RE(F) = {x ∈ X : F(x) ∈ NX[x] (x)}. 151 Because imitative dynamics exhibit face invariance, strategies that are initially unused are never subsequently chosen. This suggests a link between rest points of imitative dynamics and the restricted equilibria of the underlying game that is established in the following theorem. ˙ Theorem 4.4.21. If x = VF (x) is an imitative dynamic, then RP(VF ) = RE(F). p Proof. x ∈ RP(V ) ⇔ Vi (x) = 0 for all i ∈ Sp , p ∈ P p ⇔ ⇔ Vi (x) p xi p Fi (x) p (by (4.14)) p (by (4.16)) = 0 whenever xi > 0, p ∈ P = πp whenever xi > 0, p ∈ P ⇔ x ∈ RE(F). While there are rest points of imitative dynamics that are not Nash equilibria, we will see that non-Nash rest points are locally unstable—see Chapter 7. On the other hand, as Example 4.4.15 illustrates, the speed of motion away from these unstable rest points is initially rather slow. Exercise 4.4.22. (i) Suppose that the payoffs of one population game are the negation of the payoffs of another. What is the relationship between the replicator dynamics of the two games? (ii) Give an example of a three-strategy game whose Nash equilibrium is unique and whose replicator dynamic admits seven rest points. 
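Before moving on, here is a small numerical check (illustrative only; the code and the test state are not from the text) of Theorem 4.4.21 and of the failure of Nash stationarity: in the dominant-strategy game of Example 4.4.15, any state at which strategy 1 is unused is a restricted equilibrium, so the replicator dynamic is at rest there even though the state is not a Nash equilibrium. The Smith dynamic of Section 4.3 is included only for contrast.

```python
import numpy as np

# The game of Example 4.4.15: strategy 1 is strictly dominant,
# with F_1(x) = 1 and F_2(x) = F_3(x) = 0 at every state x.
F = lambda x: np.array([1.0, 0.0, 0.0])

def replicator(x):
    pi = F(x)
    return x * (pi - x @ pi)

def smith(x):
    pi = F(x)
    gain = np.maximum(pi[None, :] - pi[:, None], 0.0)   # [F_j - F_i]_+
    return x @ gain - x * gain.sum(axis=1)

x = np.array([0.0, 0.7, 0.3])   # strategy 1 unused: a restricted but not Nash equilibrium
print(replicator(x))            # [0. 0. 0.]  -- a non-Nash rest point (Theorem 4.4.21)
print(smith(x))                 # nonzero     -- the Smith dynamic moves toward strategy 1
```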
4.5 Excess Payoff Dynamics

In the next two subsections we consider revision protocols that are not based on imitation of successful opponents, but rather on the direct evaluation of alternative strategies. Under such protocols, good unused strategies will be discovered and chosen, raising the possibility that the dynamics satisfy Nash stationarity (NS).

4.5.1 Definition and Interpretation

In some settings, particularly those in which information about population aggregates is provided by a central planner, agents may know their population's current average payoff. (Of course, this violates scarcity of data (SD).) Suppose that each agent's choices are based on comparisons between the various strategies' current payoffs and the population's average payoff, and that these choices do not condition on the agent's current strategy. Then the agents' choice procedure can be described using a target protocol of the form

\rho^p_{ij}(\pi^p, x^p) = \tau^p_j(\hat\pi^p),

where \hat\pi^p_i = \pi^p_i - \frac{1}{m^p}(x^p)'\pi^p represents the excess payoff to strategy i ∈ S^p. Such a protocol generates the target dynamic

(4.19)  \dot x^p_i = m^p \tau^p_i(\hat F^p(x)) - x^p_i \sum_{j \in S^p} \tau^p_j(\hat F^p(x))
             = \begin{cases} \sum_{j \in S^p} \tau^p_j(\hat F^p(x)) \left( \dfrac{m^p\, \tau^p_i(\hat F^p(x))}{\sum_{j \in S^p} \tau^p_j(\hat F^p(x))} - x^p_i \right) & \text{if } \tau^p(\hat F^p(x)) \neq 0, \\ 0 & \text{otherwise.} \end{cases}

To obtain our new class of dynamics, we introduce a monotonicity condition for the protocol \tau. To do so, let us first observe that the excess payoff vector \hat F^p(x) cannot lie in the interior of the negative orthant R^{n^p}_-: for this to happen, every strategy would have to earn a below average payoff. Bearing this in mind, we can let the domain of the function \tau^p be the set R^{n^p}_* = R^{n^p} - int(R^{n^p}_-). Note that int(R^{n^p}_*) = R^{n^p} - R^{n^p}_- is the set of excess payoff vectors under which at least one strategy earns an above average payoff, while bd(R^{n^p}_*) = bd(R^{n^p}_-) is the set of excess payoff vectors under which no strategy earns an above average payoff. With this notation in hand, we can define our family of dynamics.

Definition. Suppose the protocols \tau^p : R^{n^p}_* → R^{n^p}_+ are Lipschitz continuous and satisfy acuteness:

(4.20)  If \hat\pi^p ∈ int(R^{n^p}_*), then \tau^p(\hat\pi^p)' \hat\pi^p > 0.

Then the map from population games F ∈ F to differential equations (4.19) is called an excess payoff dynamic.

How should one interpret condition (4.20)? If the excess payoff vector \hat\pi^p has a positive component, this condition implies that

\sigma^p(\hat\pi^p) = \frac{1}{\sum_{i \in S^p} \tau^p_i(\hat\pi^p)}\, \tau^p(\hat\pi^p) ∈ \Delta^p,

the probability vector that defines the target state, is well defined. Acuteness requires that if we pick a component of the excess payoff vector \hat\pi^p at random according to this probability vector, then the expected value of this randomly chosen component is strictly positive. Put differently, acuteness asks that on average, revising agents switch to strategies with above average payoffs.

Example 4.5.1. The BNN dynamic. Suppose that the conditional switch rate to strategy i ∈ S^p is given by the positive part of strategy i's excess payoff: \tau^p_i(\hat\pi^p) = [\hat\pi^p_i]_+. The resulting mean dynamic,

(BNN)  \dot x^p_i = m^p [\hat F^p_i(x)]_+ - x^p_i \sum_{j \in S^p} [\hat F^p_j(x)]_+,

is called the Brown-von Neumann-Nash (BNN) dynamic. §

Exercise 4.5.2. k-BNN dynamics. The k-BNN dynamic is generated by the revision protocol \tau^p_i(\hat\pi^p) = [\hat\pi^p_i]^k_+, where k ≥ 1. Argue informally that if k is large, then at "typical" states, the direction of motion under the k-BNN dynamic is close to that under the best response dynamic, \dot x^p ∈ m^p B^p(x) - x^p (see Chapter 5), but that the speed of motion is not.
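A minimal implementation sketch of the (BNN) equation for a single unit-mass population appears below; it is not from the text, and the game (standard Rock-Paper-Scissors) and test states are arbitrary choices. It also previews the incentive properties established in the next subsection: the dynamic is at rest at the Nash equilibrium, and elsewhere its direction of motion is positively correlated with payoffs.

```python
import numpy as np

A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])       # standard Rock-Paper-Scissors
F = lambda x: A @ x

def bnn(x):
    """The BNN mean dynamic for a unit-mass population (m^p = 1)."""
    pi = F(x)
    excess = pi - x @ pi               # excess payoffs: F_i(x) minus average payoff
    tau = np.maximum(excess, 0.0)      # tau_i = positive part of the excess payoff
    return tau - x * tau.sum()         # equation (BNN)

x_star = np.ones(3) / 3                # the unique Nash equilibrium
x = np.array([0.5, 0.3, 0.2])          # an arbitrary disequilibrium state
print(bnn(x_star))                     # [0. 0. 0.]: at rest at the Nash equilibrium
print(float(bnn(x) @ F(x)) > 0)        # True: positive correlation at this state
```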
4.5.2 Incentives and Aggregate Behavior Our goal in this section is to show that every excess payoff dynamic satisfies our two incentive properties. ˙ Theorem 4.5.3. Every excess payoff dynamic x = VF (x) satisfies Nash stationarity (NS) and positive correlation (PC). We prove this result under the assumption that τp satisfies sign preservation: (4.21) p p ˆ ˆ sgn(τi (πp )) = sgn([πi ]+ ). A proof using only acuteness is outlined in Exercise 4.5.8 below. We also focus on the single population case; the proof of the multipopulation case is a simple extension of the argument below. The proof follows immediately from the following three lemmas. ˆ Lemma 4.5.4. F(x) ∈ bd(Rn ) if and only if x ∈ NE(F). ∗ ˆ Proof. F(x) ∈ bd(Rn ) ⇔ Fi (x) ≤ ∗ k∈S xk Fk (x) for all i ∈ S ⇔ there exists a c ∈ R such that Fi (x) ≤ c for all i ∈ S, 154 with F j (x) = c whenever x j > 0 ⇔ F j (x) = max Fk (x) whenever x j > 0 k∈S ⇔ x ∈ NE(F). ˆ Lemma 4.5.5. If F(x) ∈ bd(Rn ), then VF (x) = 0. ∗ Proof. Immediate from sign preservation (4.21). ˆ Lemma 4.5.6. If F(x) ∈ int(Rn ), then VF (x) F(x) > 0. ∗ ˆ ˆ ˆ Proof. Recall that F(x) = F(x) − 1F(x) and that VF (x) = τ(F(x)) − 1 τ(F(x)) x. The first ˆ definition implies that x and F(x) are always orthogonal: ˆ x F(x) = x F(x) − 1F(x) = x F(x) − F(x) = 0. ˆ Combining this with the second definition, we see that if F(x) ∈ int(Rn ), then ∗ ˆ VF (x) F(x) = VF (x) (F(x) + 1F(x)) ˆ = VF (x) F(x) since VF (x) ∈ TX ˆ ˆ ˆ = (τ(F(x)) − 1 τ(F(x))x) F(x) ˆ ˆ = τ(F(x)) F(x) ˆ since x F(x) = 0 >0 by acuteness (4.20). Exercise 4.5.7. Suppose that revision protocol τp is Lipschitz continuous, acute, and separable: p p p τi (πp ) ≡ τi (πi ). Show that τp also satisfies sign preservation (4.21). Exercise 4.5.8. This exercise shows how to establish properties (NS) and (PC) using only continuity and acuteness (4.20)—that is, without requiring sign preservation (4.21). The proofs of Lemmas 4.5.4 and 4.5.6 go through unchanged, but Lemma 4.5.5 requires additional work. Using acuteness and continuity, show that ˆ ˆ ˆ ˆ ˆ ˆ (i) If π ∈ bd(Rn ) and πi < 0, then τi (π) = 0. (Hint: Consider πε = π + εe j , where π j = 0.) ∗ n ˆ ˆ ˆ ˆ ˆ (ii) If π ∈ bd(R∗ ) and πi = π j = 0, then τ(π) = 0. (Hint: To show that τi (π) = 0, consider ε 2 ˆ ˆ π = π − εei + ε e j .) Then use these two facts to prove Lemma 4.5.5. 155 Exercise 4.5.9. This exercise demonstrates that in general, one cannot “normalize” a target dynamic in order to create an exact target dynamic. This highlights a nontrivial sense in which the former class of dynamics is more general than the latter. Recall that in the single population setting, the BNN dynamic is defined by the target ˆ ˆ protocol τi (π) = [πi ]+ . (i) It is tempting to try to define an exact target protocol by normalizing τ in an appropriate way. Explain why such a protocol would not be well-defined. (ii) To attempt to circumvent this problem, one can construct a dynamic that is derived from the normalized protocol whenever the latter is well-defined. Show that such a dynamic must be discontinuous in some games. (Hint: It is enough to consider two-strategy games.) 4.6 Pairwise Comparison Dynamics Excess payoff dynamics satisfy Nash stationarity (NS), positive correlation (PC), and continuity (C), but they fail scarcity of data (SD). The revision protocols that underlie these dynamics require agents to compare their current payoff with the average payoff obtained in their population. 
Without the assistance of a central planner, the latter information is unlikely to be known to the agents. A natural way to reduce these informational demands is to replace the population’s average payoff with another reference payoff, one whose value agents can directly access. We accomplish this by considering revision protocols based on pairwise payoff comparisons, which satisfy scarcity of data (SD). In the remainder of this section, we show that the resulting evolutionary dynamics can be made to satisfy our other desiderata as well. 4.6.1 Definition Suppose that the revision protocol ρp only directly conditions on payoffs, not the population state. The induced mean dynamic is then of the form (4.22) p pp ˙ xi = p p x j ρ ji (Fp (x)) − xi j∈Sp ρi j (Fp (x)), j∈Sp This equation and a mild monotonicity condition on ρ defines our next class of dynamics. Definition. Suppose that the revision protocol ρ is Lipschitz continuous and sign preserving: (4.23) p p p sgn(ρi j (πp )) = sgn([π j − πi ]+ ) for all i, j ∈ Sp and p ∈ P . 156 Then the map from population games F ∈ F to differential equations (4.22) is called a pairwise comparison dynamic. Sign preservation (4.23) is a particularly natural property: it says that the conditional switch rate from i ∈ Sp to j ∈ Sp is positive if and only if the payoff to j exceeds the payoff to i. Example 4.6.1. The Smith dynamic. The simplest sign preserving revision protocol, p p p ρi j (πp ) = [π j − πi ]+ . generates the Smith dynamic: p p ˙ xi = (S) p p p p j∈Sp p [F j (x) − Fi (x)]+ . § x j [Fi (x) − F j (x)]+ − xi j∈Sp p p p Exercise 4.6.2. The k-Smith dynamic. Consider instead the protocol ρi j (πp ) = [π j − πi ]k , + where k ≥ 1. Argue informally that in the single population case, when k is large, the direction of motion from most states x is approximately parallel to an edge of the simplex. How is this edge determined from the payoff vector F(x)? 4.6.2 Incentives and Aggregate Behavior Our main result in this section is Theorem 4.6.3. Every pairwise comparison dynamic satisfies Nash stationarity (NS) and positive correlation (PC). The proof of this theorem relies on three equivalences between properties of Nash equilibria and evolutionary dynamics on the one hand, and requirements that sums of p p p p p p terms of the form ρi j , [F j − Fi ]+ , or ρi j [F j − Fi ]+ equal zero on the other. Sign preservation ensures that sums of the three types are identical, allowing us to establish the result. ˙ In what follows, x = V (x) is the pairwise comparison dynamic generated by the population game F and revision protocol ρ. p Lemma 4.6.4. x ∈ NE(F) ⇔ For all i ∈ Sp and p ∈ P , xi = 0 or p j∈Sp Proof. Both statements say that each strategy in use at x is optimal. p Lemma 4.6.5. V p (x) = 0 ⇔ For all i ∈ Sp , xi = 0 or 157 p j∈Sp p [F j (x) − Fi (x)]+ = 0. ρi j (Fp (x)) = 0. Proof. (⇐) Immediate. (⇒) Fix a population p ∈ P , and suppose that V p (x) = 0. If j is an optimal strategy for p population p at x, then sign preservation implies that ρ jk (Fp (x)) = 0 for all k ∈ Sp , and so that there is no “outflow” from strategy j: p p ρ ji (Fp (x)) = 0. xj i∈Sp p Since V j (x) = 0, there can be no “inflow” into strategy j either: pp xi ρi j (Fp (x)) = 0. i∈Sp We can express this condition equivalently as p p For all i ∈ Sp , either xi = 0 or ρi j (Fp (x)) = 0. If all strategies in Sp earn the same payoff at state x, the proof is complete. 
Otherwise, p let i be a “second best” strategy—that is, a strategy whose payoff Fi (x) is second highest among the payoffs available from strategies in Sp at x. The last observation in the previous p paragraph and sign preservation tell us that there is no outflow from i. But since Vi (x) = 0, there is also no inflow into i: p p For all k ∈ Sp , either xk = 0 or ρki (Fp (x)) = 0. Iterating this argument for strategies with lower payoffs establishes the result. Lemma 4.6.6. Fix a population p ∈ P . Then (i) V p (x) Fp (x) ≥ 0. p (ii) V p (x) Fp (x) = 0 ⇔ For all i ∈ Sp , xi = 0 or p j∈Sp p p ρi j (Fp (x))[F j (x) − Fi (x)]+ = 0. Proof. We compute the inner product as follows: V p (x) Fp (x) = j∈Sp pp xi ρi j (Fp (x)) − i∈Sp i∈Sp pp = p xj p p p ρ ji (Fp (x)) F j (x) pp p xi ρi j (Fp (x))F j (x) − x j ρ ji (Fp (x))F j (x) j∈Sp i∈Sp pp = p p xi ρi j (Fp (x)) F j (x) − Fi (x) j∈Sp i∈Sp 158 = p xi p i∈S p p p ρi j (Fp (x))[F j (x) − Fi (x)]+ , p j∈S where the last equality follows from sign-preservation. Both claims directly follow. Theorem 4.6.3 follows immediately from these three lemmas and sign preservation (4.23). 4.6.3 Desiderata Revisited Pairwise comparison dynamics satisfy all four of the desiderata proposed at the beginning of the chapter: continuity (C), scarcity of data (SD), Nash stationarity (NS), and positive correlation (PC). To provide some insight into this result, we compare revision protocols that generate the three key dynamics from this chapter: p p p p replicator: ˆ ρi j (πp , xp ) = x j [π j − πi ]+ ; BNN: ρi j (πp , xp ) = [π j − πp ]+ ; Smith: ρi j (πp , xp ) = [π j − πi ]+ . p p p p p From the point of view of our desiderata, the protocol that generates the Smith dynamic combines the best features of the other two. Like the protocol for the BNN dynamic, the Smith protocol is based on direct evaluations of payoffs rather than imitation, allowing it to satisfy Nash stationarity (NS). Like the protocol for the replicator dynamic, the Smith protocol is based on comparisons of individual strategies’ payoffs rather than comparisons involving aggregate statistics, and so satisfies scarcity of data (SD). Thus, while the BNN and replicator dynamics each satisfy three of our desiderata, the Smith dynamic satisfies all four. 4.7 Multiple Revision Protocols and Combined Dynamics The results above might seem to suggest that dynamics satisfying all four desiderata are rather special, in that they must be derived from a very specific sort of revision protocol. We now argue to the contrary that these desiderata are satisfied rather broadly. To make this point, let us consider what happens if an agent uses multiple revision protocols at possibly different intensities. If an agent uses the revision protocol ρV at intensity a and the revision protocol ρW at intensity b, then his behavior is described by the new revision protocol ρC = aρV + bρW . Moreover, since mean dynamics are linear 159 in conditional switch rates, the mean dynamic for the combined protocol is a linear combination of the two original mean dynamics: CF = aVF + bWF . Theorem 4.7.1 links the properties of the original and combined dynamics. Theorem 4.7.1. Suppose that the dynamic VF satisfies (PC), that the dynamic WF satisfies (NS) and (PC), and that a, b > 0. Then the combined dynamic CF = aVF + bWF also satisfies (NS) and (PC). p p p Proof. To show that CF satisfies (PC), suppose that CF (x) 0. Then either VF (x), WF (x), p or both are not 0. 
Since VF and WF satisfy (PC), it follows that VF (x) Fp (x) ≥ 0, that p WF (x) Fp (x) ≥ 0, and that at least one of these inequalities is strict. Consequently, p CF (x) Fp (x) > 0, and so CF satisfies (PC). Our proof that CF satisfies (NS) is divided into three cases. First, if x is a Nash equilibrium of F, then it is a rest point of both VF and WF , and hence a rest point of CF as well. Second, if x is a non-Nash rest point of VF , then it is not a rest point of WF . Since VF (x) = 0 and WF (x) 0, it follows that CF (x) = bWF (x) 0, so x is not a rest point of CF . Finally, suppose that x is not a rest point of VF . Then by Proposition 4.2.4, x is not a Nash equilibrium, and so x is not a rest point of WF either. Since VF and WF satisfy p condition (PC), we know that VF (x) F(x) = p∈P VF (x) Fp (x) > 0 and that WF (x) F(x) > 0. Consequently, CF (x) F(x) > 0, implying that x is not a rest point of CF . Thus, CF satisfies (NS). A key implication of Theorem 4.7.1 is that imitation and Nash stationarity are not incompatible. If agents usually rely on imitative protocols but occasionally follow protocols that directly evaluate strategies’ payoffs, then the rest points of the resulting mean dynamics are precisely the Nash equilibria of the underlying game. Indeed, if we combine an imitative dynamic VF with any small amount of a pairwise comparison dynamic WF , we obtain a combined dynamic CF that satisfies all four of our desiderata. 1 9 Example 4.7.2. Figure 4.7.1 presents a phase diagram for the 10 replicator + 10 Smith dynamic in standard Rock-Paper-Scissors. Comparing this diagram to those for the replicator and Smith dynamics alone (Figure 4.3.1), we see that the diagram for the combined dynamic more closely resembles the Smith phase diagram than the replicator phase diagram, and in more than one respect: the combined dynamic has exactly one rest point, 1 the unique Nash equilibrium x∗ = ( 3 , 1 , 1 ), and all solutions to the combined dynamic 33 converge to this state. We will revisit this fragility of imitative dynamics in Chapter 8, where it will appear in a much starker form. § 160 Figure 4.7.1: The 4.N 9 10 1 replicator + 10 Smith dynamic in RPS. Notes Section 4.2: This section follows Sandholm (2006a). A wide variety of payoff monotonicity conditions have been considered in the literature; for examples, see Nachbar (1990), Friedman (1991), Samuelson and Zhang (1992), Swinkels (1993), Ritzberger and Weibull (1995), Hofbauer and Weibull (1996), and Sandholm (2001). Positive correlation is essentially the weakest condition that has been proposed. Most existing conditions are strictly stronger (see the notes to Section 4.4 below). Friedman’s (1991) weak compatibility is positive correlation plus the additional restriction that unused strategies are never subsequently chosen. Swinkels (1993) calls a dynamic a myopic adjustment dynamic if it satisfies positive correlation, but he allows Fp (x) V p (x) = 0 even when V p (x) 0. Section 4.4: The approach to imitative revision protocols and dynamics in this section builds on the work of Bjornerstedt and Weibull (1996), Weibull (1995), and Hofbauer ¨ (1995a). Taylor and Jonker (1978) introduce the replicator dynamic to provide a dynamic analogue of Maynard Smith and Price’s (1973) equilibrium (ESS) model of animal conflict. Exercise 4.4.12, which shows that the replicator dynamic is equivalent after a nonlinear (barycentric) change of variable to the Lotka-Volterra equation (Lotka (1920), Volterra 161 (1931)), is due to Hofbauer (1981). 
Schuster and Sigmund (1983) further observe that fundamental models of population genetics (e.g., Crow and Kimura (1970)) and of biochemical evolution (e.g., Eigen and Schuster (1979)) can be viewed as special cases of the replicator dynamic; they are also the first to refer to the dynamic by this name. For more on these biological models, see Hofbauer and Sigmund (2003). For a detailed analysis of the replicator dynamic from an economic point of view, see Weibull (1995, Chapter 3). The derivations of the replicator dynamic in Examples 4.4.2, 4.4.4, and 4.4.5 are due to Schlag (1998), Bjornerstedt and Weibull (1996), and Hofbauer (1995a), respectively. ¨ The Maynard Smith replicator dynamic can be found in Maynard Smith (1982, Appendices D and J). For a contrast between the standard and Maynard Smith replicator dynamics from a biological point of view, see Hofbauer and Sigmund (1988, Section 27.1). The i-logit dynamic is due to Bjornerstedt and Weibull (1996) and Weibull (1995). ¨ Most early work by economists on deterministic evolutionary dynamics focuses on generalizations of the replicator dynamic expressed in terms of percentage growth rates, as in equation (4.15). The condition we call monotone percentage growth rates (4.16) has appeared in many places under a variety of names: relative monotonicity (Nachbar (1990)), order compatibility of predynamics (Friedman (1991)), monotonicity (Samuelson and Zhang (1992), and payoff monotonicity (Weibull (1995)). Aggregate monotone percentage growth rates (4.17) and Exercise 4.4.18 are introduced by Samuelson and Zhang (1992). Sign-preserving percentage growth rates (4.18) is a condition due to Nachbar (1990); see also Ritzberger and Weibull (1995), who call this condition payoff positivity. For surveys of the literature referenced here, see Weibull (1995, Chapters 4 and 5) and Fudenberg and Levine (1998, Chapter 3). Sections 4.5, 4.6, and 4.7: These sections follow Sandholm (2005a, 2006a). The Brown-von Neumann-Nash dynamic was introduced in the context of symmetric zero-sum games by Brown and von Neumann (1950). Nash (1951) uses a discrete time analogue of this dynamic as the basis for his simple proof of existence of equilibrium based on Brouwer’s Theorem. More recently, the BNN dynamic was reintroduced by Skyrms (1990), Swinkels (1993), and Weibull (1996), and by Hofbauer (2000), who gave the dynamic its name. The Smith dynamic was introduced in the transportation science literature by Smith (1984). 162 CHAPTER FIVE Best Response and Projection Dynamics 5.0 Introduction This chapter continues the parade of evolutionary dynamics commenced in Chapter 4. In the first two sections, the step from payoff vector fields to evolutionary dynamics is traversed through a traditional game-theoretic approach, by employing best response correspondences and perturbed versions thereof. The third section follows a geometric approach, defining an evolutionary dynamic via closest point projections of payoff vectors. The best response dynamic embodies the assumption that revising agents always switch to their current best response. Because the best response correspondence is discontinuous and multivalued, the basic properties of solution trajectories under the best response dynamic are quite different from those of our earlier dynamics: multiple solution trajectories can sprout from a single initial condition, and solution trajectories can cycle in and out of Nash equilibria. 
Despite these difficulties, we will see that analogues of incentive properties (NS) and (PC) still hold true. While the discontinuity of the best response protocol stands in violation of a basic desideratum from Chapter 4, one can obtain a continuous protocol by working with perturbed payoffs. The resulting perturbed best response dynamics are continuous (and even differentiable), and so have well-behaved solution trajectories. While the payoff perturbations prevent our incentive conditions from holding exactly, we show that appropriately perturbed versions of these conditions, defined in terms of so-called “virtual payoffs”, can be proved. Our final evolutionary dynamic, the projection dynamic, is motivated by geometric considerations: we define the growth rate vector under the projection dynamic to be 163 the closest approximation of the payoff vector by a feasible vector of motion. While the resulting dynamic is discontinuous, its solutions still exist, are unique, and are continuous in their initial conditions; moreover, both of our incentive conditions are easily verified. We show that the projection dynamic can be derived from protocols that reflect “revision driven by insecurity”. These protocols also reveal surprising connections between the projection dynamic and the replicator dynamic, connections that we develop further when studying the global behavior of evolutionary dynamics in Chapter 6. The dynamics studied in this chapter require us to introduce new mathematical techniques. Determining the basic properties of the best response dynamic and the projection dynamic requires ideas from the theory of differential inclusions (i.e., of set valued differential equations), which we develop in Appendix 5.A. A key tool for analyzing perturbed best response dynamics is the Legendre transform, whose basic properties are explained in Appendix 5.B. These properties are central to our analysis of perturbed maximization, which is offered in Appendix 5.C. 5.1 5.1.1 The Best Response Dynamic Definition and Examples Traditional game theoretic analysis is based on the assumption of equilibrium play. This assumption can be split into two distinct parts: that agents have correct beliefs about their opponents’ behavior, and that they choose their strategies optimally given those beliefs. When all agents simultaneously have correct beliefs and play optimal responses, their joint behavior constitutes a Nash equilibrium. It is natural to introduce an evolutionary dynamic based on similar principles. To accomplish this, we suppose that each agent’s revision opportunities arrive at a fixed rate, and that when an agent receives such an opportunity, he chooses a best response to the current population state. Thus, we assume that each agent responds optimally to correct beliefs whenever he is revising, but not necessarily at other points in time. Before introducing the best response dynamics, let us review the notions of exact target protocols and dynamics introduced in Section 3.1.3. Under an exact target protocol, conp p ditional switch rates ρi j (πp , xp ) ≡ σ j (πp , xp ) are independent of an agent’s current strategy. p These rates also satisfy j∈S σ j (πp , xp ) ≡ 1, so that σp (πp , xp ) ∈ ∆p is a mixed strategy. Such a protocol induces the exact target dynamic (5.1) ˙ xp = mp σp (Fp (x)) − xp . 164 ˙ Under (5.1), the vector of motion xp for population p has its tail at the current state xp and its head at mp σp , the representative of the mixed strategy σp ∈ ∆p in the state space Xp = mp ∆p . 
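For readers who want to experiment with exact target dynamics numerically, the following minimal Python sketch (my own, not from the text) Euler-integrates (5.1) for a single population of mass m^p = 1; the game matrix and the target rule σ are placeholders, since any map from payoff vectors into the simplex defines an exact target protocol.

import numpy as np

# Minimal sketch: Euler integration of the exact target dynamic (5.1)
# for one population of mass 1, so that xdot = sigma(F(x)) - x.
A = np.array([[0., -1., 1.],
              [1., 0., -1.],
              [-1., 1., 0.]])            # placeholder game: F(x) = Ax

def F(x):
    return A @ x

def sigma(pi, eta=0.1):
    # placeholder target rule mapping payoff vectors into the simplex
    w = np.exp((pi - pi.max()) / eta)
    return w / w.sum()

x = np.array([0.6, 0.3, 0.1])            # initial state in the simplex
dt = 0.01
for _ in range(20000):
    x = x + dt * (sigma(F(x)) - x)       # Euler step of (5.1); stays in the simplex
print(x)                                 # state after integrating for 200 time units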
The best response protocol is given by the multivalued map σp (πp , xp ) = Mp (πp ) ≡ argmax ( yp ) πp . (5.2) yp ∈∆p p Mp : Rn ⇒ ∆p is the maximizer correspondence for population p: the set Mp (πp ) consists of those mixed strategies that only place mass on pure strategies optimal under payoff vector πp . Inserting this protocol into equation (5.1) yields the best response dynamic: (BR) ˙ xp ∈ mp Mp (Fp (x)) − xp . We can also write (BR) as ˙ xp ∈ mp Bp (x) − xp . where Bp = Mp ◦ Fp is the best response correspondence for population p. Definition. The best response dynamic assigns each population game F ∈ F the set of solutions to the differential inclusion (BR). All of our dynamics from Chapter 4 are Lipschitz continuous, so the existence and uniqueness of their solutions is ensured by the Picard-Lindelof Theorem. Since the best ¨ response dynamic (BR) is a discontinuous differential inclusion, that theorem does not apply here. But while the map Mp is not a Lipschitz continuous function, it does exhibit other regularity properties: in particular, it is a convex-valued, upper hemicontinuous correspondence. These properties impose enough structure on the dynamic (BR) to establish an existence result. To state this result, we say that the Lipschitz continuous trajectory {xt }t≥0 is a Carath´odory e ˙ ˙ solution to the differential inclusion x ∈ V (x) if it satisfies xt ∈ V (xt ) at all but a measure zero set of times in [0, ∞). Theorem 5.1.1. Fix a continuous population game F. Then for each ξ ∈ X, there exists a trajectory {xt }t≥0 with x0 = ξ that is a Carath´odory solution to the differential inclusion (BR). e It is important to note that while solutions to the best response dynamic exist, they need not be unique: as the examples to follow will illustrate, multiple solution trajectories can emanate from a single initial condition. For a brief introduction to the theory of differential inclusions, see Appendix 5.A.1. 165 In Chapter 3, we justified our focus on the deterministic dynamic generated by a revision protocol through an appeal to a finite horizon approximation theorem. This result, which we present in Chapter 9, tells us that under certain regularity conditions, N the stochastic evolutionary process {Xt } generated by a game F and revision protocol ρ is well approximated by a solution to the mean dynamic (M) over any finite time horizon, so long as the population size is large enough. But because the revision protocol that generates the best response dynamic is discontinuous and multivalued, the finite horizon approximation theorem from Chapter 9 does not apply here: indeed, since σ is N multivalued, the Markov process {Xt } is not even uniquely defined! Nevertheless, we conjecture that it is possible to prove a version of the finite horizon approximation theorem that applies in the present setting (see the Notes). 5.1.2 Construction and Properties of Solution Trajectories Because solutions to the best response dynamic need not be unique, they can be distinctly more complicated than solutions to Lipschitz continuous dynamics, as we demonstrate in a series of examples below. But before doing this, we show another sense in which solutions to the best response dynamic are rather simple. Let {xt } be a solution to (BR), and suppose that at all times t ∈ [0, T], population p’s unique best response to state xt is the pure strategy i ∈ Sp . Then during this time interval, evolution in population p is described by the affine differential equation p ˙ xp = mp ei − xp . 
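A crude way to explore solutions of (BR) numerically is to select some maximizer at every step and take small Euler steps. The sketch below is my own (standard Rock-Paper-Scissors is only a placeholder game, and ties are broken arbitrarily by argmax); it anticipates the behavior derived in Example 5.1.2 below.

import numpy as np

# Rough sketch of (BR) for a single population of mass 1:
# Euler steps of xdot = M(F(x)) - x, with M(pi) putting all mass on a maximizer.
# The subtleties at states with multiple best responses are ignored here.
A = np.array([[0., -1., 1.],
              [1., 0., -1.],
              [-1., 1., 0.]])            # standard Rock-Paper-Scissors (w = l = 1)

def best_response(pi):
    y = np.zeros_like(pi)
    y[np.argmax(pi)] = 1.0               # argmax breaks ties arbitrarily
    return y

x = np.array([0.8, 0.1, 0.1])
dt = 0.005
for _ in range(4000):
    x = x + dt * (best_response(A @ x) - x)
print(x)   # approaches the Nash equilibrium (1/3, 1/3, 1/3); cf. Example 5.1.2 below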
p p In other words, the population state xp moves directly towards vertex vi = mp ei of the set Xp , proceeding more slowly as time passes. It follows that throughout the interval [0, T], p the state (xt )p lies on the line segment connecting (x0 )p and vi ; indeed, we can solve the previous equation to obtain an explicit formula for (xt )p : p (xt )p = (1 − e−t ) vi + e−t (x0 )p for all t ∈ [0, T]. Matters are more complicated at states that admit multiple best responses, since at such states more than one future course of evolution is possible. Still, not every element of Bp (x) need define a feasible direction of motion for population p: if {(xt )p } is to head toward ˆ ˆ state xp during a time interval of positive length, all pure strategies in the support of xp must remain optimal throughout the interval. Example 5.1.2. Standard Rock-Paper-Scissors. Suppose a population of agents is randomly 166 Figure 5.1.1: The best response dynamic in RPS. matched to play standard Rock-Paper-Scissors: 0 −l w w 0 −l A= −l w 0 with w = l. The phase diagram for the best response dynamic in F(x) = Ax is presented in Figure 5.1.1. The upper, lower left, and lower right regions of the figure contain the states at which Paper, Scissors, and Rock are the unique best responses; in each of these regions, all solution trajectories head directly toward the appropriate vertex. When the boundary of a best response region is reached, multiple directions of motion are possible, at least in principle. But at all states other than the unique Nash equilibrium x∗ = ( 1 , 1 , 1 ), the 333 only direction of motion that can persist for a positive amount of time is the one heading toward the new best response, and starting from x∗ , the only feasible solution trajectory is the stationary one. Putting this all together, we conclude that in standard RPS, the solution to the best response dynamic from each initial condition is unique. Figure 5.1.1 appears to show that every solution trajectory converges to the unique Nash equilibrium x∗ . To verify this, we prove that along every solution trajectory {xt }, 167 whenever the best response to xt is unique, we have that (5.3) d max Fk (xt ) = − max Fk (xt ). k ∈S dt k∈S Since the best response is unique at almost all times t, integrating equation (5.3) shows that (5.4) max Fk (xt ) = e−t max Fk (x0 ). k∈S k∈S Now in standard RPS, the maximum payoff function maxk∈S Fk is nonnegative, equalling zero only at the Nash equilibrium x∗ . This fact and equation (5.4) imply that the maximal payoff falls over time, converging as t approaches infinity to its minimum value of 0; over this same time horizon, the state xt converges to the Nash equilibrium x∗ . To prove equality (5.3), fix a state xt at which there is a unique optimal strategy—say, ˙ Paper. At this state, xt = eP − xt . Since FP (x) = w(xR − xS ), we can compute that d F (x ) dt P t ˙ = FP (xt ) xt = w(eR − eS ) (eP − xt ) = −w(eR − eS ) xt = −FP (xt ). § Example 5.1.3. Two-strategy coordination. Suppose that agents are randomly matched to play the two strategy game with strategy set S = {U, D} and payoff matrix 1 0 A= 0 2 . The resulting random matching game F(x) = Ax has three Nash equilibria, the two pure equilibria eU and eD , and the mixed equilibrium (x∗ , x∗ ) = ( 2 , 1 ). UD 33 To reduce the amount of notation, we let d = xD represent the proportion of players choosing strategy D, so that the mixed Nash equilibrium becomes d∗ = 1 . 
The best 3 response dynamic for this game is described in terms of the state d as follows: {−d} if d < d∗ , d˙ = [− 1 , 2 ] if d = d∗ , 33 {1 − d} if d > d∗ . 168 From every initial condition other than d∗ , the dynamic admits a unique solution trajectory that converges to a pure equilibrium: (5.5) (5.6) d0 < d∗ ⇒ dt = e−t d0 , d0 > d∗ ⇒ dt = e−t d0 + (1 − e−t ) = 1 − e−t (1 − d0 ). But there are many solution trajectories starting from d∗ : one solution is stationary; another proceeds to d = 0 according to equation (5.5), a third proceeds to d = 1 according to equation (5.6), and yet others follows the trajectories in (5.5) and (5.6) after some initial delay. Notice that solutions (5.5) and (5.6) quickly leave the vicinity of d∗ . This is unlike the behavior of Lipschitz continuous dynamics, under which solutions from all initial conditions are unique, and solutions that start near a stationary point move very slowly. § Exercise 5.1.4. Two-strategy anti-coordination. Suppose players are randomly matched to play the anticoordination game −1 0 A= 0 −1 . Show that there is a unique solution to this dynamic from each initial condition d0 . Also, 1 show that each solution reaches the unique Nash equilibrium d∗ = 2 in finite time, and express this time as a function of the initial condition d0 . This is unlike the behavior of Lipschitz continuous dynamics, under which solutions can only reach rest points in the limit as the time t approaches infinity. Example 5.1.5. Three-strategy coordination. Figure 5.1.2 presents the phase diagram for the best response dynamic generated by random matching in the pure coordination game 1 0 0 0 1 0 . A= 0 0 1 1 The speed of motion is fastest near the mixed Nash equilibrium x∗ = ( 3 , 1 , 1 ). As in 33 Example 5.1.3, solution trajectories are not unique: this time, whenever the state is on the Y-shaped set of boundaries between best response regions, it can leave this set and head into any adjoining basin of attraction. § 169 Figure 5.1.2: The best response dynamic in Pure Coordination. Exercise 5.1.6. Good and bad RPS. (i) Using a similar argument to that provided in Example 5.1.2, show that in any good RPS game, the unique Nash equilibrium x∗ = ( 1 , 1 , 1 ) is globally stable, and that it is reached in finite time from every 333 initial condition. (ii) Show that in any bad RPS game, solutions starting from almost all initial conditions converge to a limit cycle in the interior of the state space. In addition, argue that there are multiple solutions starting from the Nash equilibrium x∗ : one is stationary, while others spiral outward toward the limit cycle. The latter solutions are not differentiable at t = 0. It is therefore possible for a solution to escape a Nash equilibrium without the solution beginning its motion in a well-defined direction. (Hint: Consider backward solution trajectories from initial conditions in the region bounded by the cycle.) Example 5.1.7. Zeeman’s game. Consider the population game F(x) = Ax generated by random matching in the symmetric normal form game 0 6 −4 A = −3 0 5 −1 3 0 170 Figure 5.1.3: The best response dynamic in Zeeman’s game. 1 with strategy set S = {U, M, D}. The Nash equilibria of F are eU , x∗ = ( 3 , 1 , 1 ), and 33 y∗ = ( 4 , 0, 1 ). The best response dynamic for F is presented in Figure 5.1.3. Solution 5 5 trajectories from a majority of initial conditions are unique and converge to the pure equilibrium eU . However, some initial conditions generate multiple solutions. 
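Before examining these solutions in detail, it may help to verify the equilibria of Zeeman's game directly. The quick check below (my own sketch) confirms that the three states listed above, e_U, x*, and y*, are Nash equilibria: every strategy in the support of each state earns the maximal payoff.

import numpy as np

# Check that every used strategy earns the maximal payoff F_i(x).
A = np.array([[0., 6., -4.],
              [-3., 0., 5.],
              [-1., 3., 0.]])            # Zeeman's game

def is_nash(x, tol=1e-9):
    pi = A @ x
    return all(pi[i] >= pi.max() - tol for i in range(3) if x[i] > tol)

for x in (np.array([1., 0., 0.]),         # e_U
          np.array([1/3, 1/3, 1/3]),      # x*
          np.array([4/5, 0., 1/5])):      # y*
    print(x, is_nash(x))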
Consider, for example, solutions starting at the interior Nash equilibrium x∗ . There is a stationary solution at x∗ , as well as solutions that head toward the vertex eU , possibly after some delay. Other solutions head toward the Nash equilibrium y∗ . Some of these converge to y∗ ; others leave segment x∗ y∗ before reaching y∗ . Of those that leave, some head to eU , while others head toward eD and then return to x∗ . If x∗ is revisited, any of the behaviors just described can occur again. Therefore, there are solutions to (BR) that arrive at and depart x∗ in perpetuity. § 5.1.3 Incentive Properties In the previous chapter, we introduced two properties, Nash stationarity (NS) and positive correlation (PC), that link growth rates under evolutionary dynamics with payoffs in the underlying games. (NS) VF (x) = 0 if and only if x ∈ NE(F). 171 (PC) p VF (x) p 0 implies that VF (x) Fp (x) > 0 for all p ∈ P . Both of these properties are designed for single-valued differential equations. We now establish that analogues of these two properties are satisfied by the differential inclusion (BR). Theorem 5.1.8. The best response dynamic satisfies (5.7) (5.8) 0 ∈ VF (x) if and only if x ∈ NE(F). p ˆp (zp ) Fp (x) = mp max F j (x) for all zp ∈ VF (x). p j∈S ˙ Condition (5.7) requires that the differential inclusion x ∈ VF (x) have a stationary solution at every Nash equilibrium, but at no other states. As we have seen, this condition does not rule out the existence of additional solution trajectories that leave Nash equilibria. p Condition (5.8) asks that the correspondence x → VF (x) Fp (x) be single valued, always equaling the product of population p’s mass and its maximal excess payoff. It follows that this map is Lipschitz continuous and nonnegative, equaling zero if and only if all agents in population p are playing a best response (see Lemma 4.5.4). Summing over populations, we see that VF (x) F(x) = {0} if and only if x is a Nash equilibrium of F. p Proof. Property (5.7) is immediate. To prove property (5.8), fix x ∈ X, and let zp ∈ VF (x). Then zp = mp yp − xp for some yp ∈ Mp (Fp (x)). Therefore, p ˆp (zp ) Fp (x) = (mp yp − xp ) Fp (x) = mp max F j (x) − mp Fp (x) = mp max F j (x). p p j∈S 5.2 j∈S Perturbed Best Response Dynamics The best response dynamic is a fundamental model of evolution in games, as it provides an idealized description of the behavior of agents whose decisions condition on exact information about the current strategic environment. Of course, the flip side of exact information is discontinuity, a violation of our desideratum (C) for revision protocols (see Section 4.2.1). We now introduce revision protocols under which agents choose best responses to payoffs that have been subjected to perturbations. While the perturbations can represent actual payoff noise, they can also represent errors in agents’ perceptions of payoffs, or in the agents’ implementations of the best response rule. Regardless of their interpretation, the perturbations lead to revision protocols that are smooth functions of payoffs, and so to dynamics that can be analyzed using standard techniques. 172 The use of perturbed best response functions is not unique to evolutionary game theory. To mention one prominent example, researchers in experimental economics employ perturbed best response functions when attempting to rationalize experimental data. Consequently, the ideas we develop in this section provide dynamic foundations for solution concepts in common use in experimental research (see the Notes). 
5.2.1 Revision Protocols and Mean Dynamics Perturbed best response protocols are exact target protocols defined in terms of perp ˜ turbed maximizer functions Mp : Rn → int(∆p ): (5.9) ˜ σp (πp , xp ) = Mp (πp ). ˜ Unlike the maximizer correspondence Mp , the function Mp is single-valued, continuous, ˜ and even differentiable. The mixed strategy Mp (πp ) ∈ int(∆p ) places most of its mass on the optimal pure strategies, but places positive mass on all pure strategies. Precise definitions ˜ of Mp will be stated below. Example 5.2.1. Logit choice. When p = 1, the logit choice function with noise level η > 0 is written as ˜ Mi (π) = exp(η−1 πi ) . −1 j∈S exp(η π j ) ˜ For any value of η > 0, each strategy receives positive probability under M regardless of the payoff vector π. But if πi > π j for all j i, the probability with which strategy i is chosen approaches one as η approaches zero. Notice too that adding a constant vector to the payoff vector π has no effect on choice probabilities. When there are just two strategies, the logit choice function reduces to ˜ M1 (π) = exp(η−1 (π1 − π2 )) ˜ ˜ and M1 (π) + M2 (π) = 1. exp(η−1 (π1 − π2 )) + 1 In Figure 5.2.1, we fix π2 at 0, and graph as a function of π1 the logit(η) choice probabilities ˜ M1 (π) for η = .25, .1, and .02, as well as the optimal choice probabilities M1 (π). Evidently, ˜ M1 provides a smooth approximation of the discontinuous map M1 . While the function ˜ M1 cannot converge uniformly to the correspondence M1 as the noise level η goes to zero, ˜ one can show that the graph of M1 converges uniformly (in the Hausdorff metric—see the Notes) to the graph of M1 as η approaches zero. § 173 –1 0 1 π1 ˜ Figure 5.2.1: Logit choice probabilities M1 (π1 , 0) for noise levels η = .25 (red), η = .1 (green), and η = .02 (blue), along with optimal choice probabilities M1 (π1 , 0) (black). The protocol (5.9) induces the perturbed best response dynamic (5.10) ˜ ˙ xp = mp Mp (Fp (x)) − xp as its mean dynamic. We can also write (5.10) as ˜ ˙ xp = mp Bp (x) − xp , ˜ ˜ where the function Bp = Mp ◦ Fp , which maps social states to mixed strategies, is the perturbed best response function for population p; it is a perturbed version of the best response correspondence Bp = Mp ◦ Fp . 5.2.2 Perturbed Optimization: A Representation Theorem We now consider two methods of defining perturbed maximizer functions. To avoid superscripts, we focus here on the single population case. ˜ The traditional method of defining M, a method with a long history in the theory of discrete choice, is based on stochastic perturbations of the payoffs to each pure strategy. In this construction, an agent chooses the best response to the vector of payoffs π ∈ Rn , but only after the payoffs to his alternatives have been perturbed by some random vector ε. (5.11) ˜ Mi (π) = P i = argmax π j + ε j . j∈S We require the random vector ε to be an admissible stochastic perturbation: it must admit 174 ˜ a positive density on Rn , and this density must be smooth enough that the function M is continuously differentiable. For example, if the components εi are independent, standard ˜ results on convolutions imply that M is C1 whenever the densities of the components εi ˜ are bounded. In the discrete choice literature, the definition of M via equation (5.11) is known as the additive random utility model (ARUM). ˜ We can also define M by introducing a deterministic perturbation of the payoffs to each mixed strategy. 
Call the function v : int(∆) → R an admissible deterministic perturbation if it is differentiably strictly convex and steep near bd(∆). That is, v is admissible if the second derivative at y, D2 v( y) ∈ L2 (Rn , R), is positive definite for all y ∈ int(∆), and if | v( y)| s 0 approaches infinity whenever y approaches bd(∆). (Recall that Rn is an alternate notation 0 for T∆, the tangent space of the simplex.) With an admissible v in hand, we define the ˜ function M by (5.12) ˜ M(π) = argmax y π − v( y) . y∈int(∆) One interpretation of the function v is that it represents a “control cost” that becomes large whenever an agent puts too little probability on any particular pure strategy. Because the base payoffs to each strategy are bounded, the steepness of v near bd(∆) implies that it is never optimal for an agent to choose probabilities too close to zero. ˜ Note that under either definition, choice probabilities under M are unaffected by 1 constant shifts in the payoff vector π. The projection of Rn onto Rn , Φ = I − n 11 , employs 0 ˜ just such a shift, so we can express this property of M as follows: ˜ ˜ M(π) = M(Φπ) for all π ∈ Rn . ˜ With this motivation, we define M : Rn → int(∆) to be the restriction of M to the subspace 0 Rn . 0 As we noted above, the stochastic construction (5.11) is the traditional way of defining perturbed maximizer functions, and this construction is more intuitively appealing than the deterministic construction (5.12). But the latter construction is clearly more convenient for analysis: while under (5.11) choice probabilities must expressed as cumbersome multiple integrals, under (5.12) they are obtained as interior maximizers of a strictly concave function. ˜ Happily, we need not trade off intuitive appeal for convenience: every M defined via equation (5.11) can be represented in form (5.12). ˜ Theorem 5.2.2. Let M be a perturbed maximizer function defined in terms of an admissible 175 ˜ stochastic perturbation ε via equation (5.11). Then M satisfies equation (5.12) for some admissible ˜ deterministic perturbation v. In fact, M = M|Rn and v are invertible, and M = ( v)−1 . 0 Taking as given the initial statements in the theorem, it is easy to verify the last one. ˜ Indeed, suppose that M (and hence M) can be derived from the admissible deterministic perturbation v, that the gradient v : int(∆) → Rn is invertible, and that the payoff vector 0 ∗ ≡ M(π) satisfies n π is in R0 . Then y y∗ = argmax y π − v( y) . y∈int(∆) This is a strictly concave maximization problem with an interior solution. Taking the first order condition with respect to directions in Rn yields 0 Φ(π − v( y∗ )) = 0. Since π and v( y∗ ) are already in Rn , the projection Φ does nothing, so rearranging allows 0 us to conclude that M(π) = y∗ = ( v)−1 (π). In light of this argument, the main task in proving Theorem 5.2.2 is to show that a function v with the desired properties exists. Accomplishing this requires the use of the Legendre transform, a classical tool from convex analysis. We explain the basic properties of the Legendre transform in Appendix 5.B. This device is used to prove the representation theorem in Appendix 5.C , where some auxiliary results can also be found. ˜ One such result is worth mentioning now. Theorem 5.2.2 tells us that every M defined in terms of stochastic perturbations can be represented in terms of deterministic perturbations. 
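The first order condition Φ(π − ∇v(y*)) = 0 derived above is also easy to check numerically. The sketch below is my own; it takes as given the entropy perturbation that Exercise 5.2.6 below associates with logit choice, and verifies that the logit choice probabilities satisfy the projected first order condition up to rounding error.

import numpy as np

eta = 0.1
pi = np.array([0.7, 0.2, -0.1, 0.4])      # placeholder payoff vector
n = len(pi)
Phi = np.eye(n) - np.ones((n, n)) / n     # orthogonal projection onto R^n_0

y = np.exp(pi / eta) / np.exp(pi / eta).sum()    # candidate maximizer: logit choice
grad_v = eta * (np.log(y) + 1.0)                 # gradient of v(y) = eta * sum_j y_j log y_j

print(np.linalg.norm(Phi @ (pi - grad_v)))       # approximately zero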
Exercise 5.2.3 shows that the converse statement is false, and thus that the deterministic definition of M̃ is strictly more general than the stochastic one.

Exercise 5.2.3. Show that when n ≥ 4, there is no stochastic perturbation of payoffs which yields the same choice probabilities as the admissible deterministic perturbation v(y) = −Σ_{j∈S} log y_j. (Hint: Use Theorem 5.C.6 in the Appendix.)

5.2.3 Logit Choice and the Logit Dynamic

In Example 5.2.1, we introduced the best known example of a perturbed maximizer function: the logit choice function with noise level η > 0.

(5.13)  M̃_i(π) = exp(η^{-1}π_i) / Σ_{j∈S} exp(η^{-1}π_j).

This function generates as its mean dynamic the logit dynamic with noise level η:

(L)  ẋ_i^p = m^p exp(η^{-1}F_i^p(x)) / Σ_{j∈S^p} exp(η^{-1}F_j^p(x)) − x_i^p.

Rest points of logit dynamics are called logit equilibria.

Example 5.2.4. In Figure 5.2.2, we present phase diagrams for the 123 Coordination game

F(x) = Ax = diag(1, 2, 3) x = (x_1, 2x_2, 3x_3)′

under logit dynamics with a range of noise levels. As η passes from .01 to 1, the dynamics pass through four distinct regimes. At the lowest noise levels, the dynamics admit seven rest points, three stable and four unstable, corresponding to the seven Nash equilibria of F. When η reaches ≈ .22, two of the unstable rest points annihilate one another, leaving five rest points in total. At η ≈ .28, the stable rest point corresponding to Nash equilibrium e_1 and an unstable rest point eliminate one another, so that three rest points remain. Finally, when η ≈ .68, the stable rest point corresponding to Nash equilibrium e_2 and an unstable rest point annihilate each other, leaving just a single, stable rest point. If we continue to increase η, the last rest point ultimately converges to the central state (1/3, 1/3, 1/3).

This example provides an illustration of a deep topological result called the Poincaré–Hopf Theorem. In the present two-dimensional context, this theorem ensures that generically, the number of sinks plus the number of sources equals the number of saddles plus one. §

[Figure 5.2.2: Logit dynamics in 123 Coordination, for noise levels η = .001, .1, .2, .22, .27, .28, .4, .6, .68, .85, 1.2, and 3.]

Example 5.2.5. Stochastic derivation of logit choice. We can derive the logit choice function from stochastic perturbations that are i.i.d. with the double exponential distribution:

P(ε_i ≤ c) = exp(−exp(−η^{-1}c − γ)),

where γ = lim_{n→∞} (Σ_{k=1}^n 1/k − log n) ≈ 0.5772 is Euler's constant. For intuition, we mention without proof that Eε_i = 0 and Var(ε_i) = η^2 π^2 / 6, so that SD(ε_i) ≈ 1.2826η.

To see that these perturbations generate logit choice, note that the density of ε_i is f(x) = η^{-1} exp(−η^{-1}x − γ) exp(−exp(−η^{-1}x − γ)). Using the substitutions y = exp(−η^{-1}x − γ) and m_j = exp(η^{-1}π_j), we compute as follows:

P( i = argmax_{j∈S} (π_j + ε_j) ) = ∫_{−∞}^{∞} f(x) Π_{j≠i} F(π_i + x − π_j) dx
  = −∫_{∞}^{0} η^{-1} y exp(−y) Π_{j≠i} exp(−y m_j/m_i) (η/y) dy
  = −∫_{∞}^{0} exp(−y Σ_{j∈S} m_j/m_i) dy
  = m_i / Σ_{j∈S} m_j
  = exp(η^{-1}π_i) / Σ_{j∈S} exp(η^{-1}π_j). §

Exercise 5.2.6. Deterministic derivation of logit choice. According to the representation theorem, it must also be possible to derive the logit choice function from an admissible deterministic perturbation. Show that this is accomplished using the (negated) entropy function v(y) = η Σ_{j∈S} y_j log y_j.
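To experiment with Example 5.2.4, one can look for logit equilibria directly. The sketch below is my own device rather than anything from the text: it iterates x ↦ M̃(F(x)) from several initial states in the 123 Coordination game and records the (approximate) rest points on which the iteration settles at a few placeholder noise levels.

import numpy as np

def F(x):
    return np.array([1., 2., 3.]) * x        # 123 Coordination: F(x) = (x1, 2x2, 3x3)

def logit(pi, eta):
    w = np.exp((pi - pi.max()) / eta)
    return w / w.sum()

for eta in (0.1, 0.5, 1.0):                   # placeholder noise levels
    found = set()
    for x0 in (np.eye(3)[0], np.eye(3)[1], np.eye(3)[2], np.ones(3) / 3):
        x = x0.copy()
        for _ in range(10000):
            x = logit(F(x), eta)              # fixed-point iteration for x = M~(F(x))
        if np.allclose(x, logit(F(x), eta), atol=1e-8):
            found.add(tuple(np.round(x, 3)))  # record only points that are (nearly) rest points
    print(eta, sorted(found))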
The next exercise gives explicit formulas for various functions from the proof of the ˜ representation theorem in the case of logit choice. Included is the derivative matrix DM(π), a useful item in analyses of local stability (see Chapter 7.) The exercise also shows how ˜ the entropy function v can be derived from the function M. Exercise 5.2.7. Additional results on logit choice. ˜ ˜ (i) Show that µ(π) = η log( j∈S exp(η−1 π j )) is a potential function for M. (For the interpretation of this function, see Observation 5.C.3 and Theorem 5.C.4 in the Appendix.) 1 1 ˜ ˜ ¯ ˜ ¯ (ii) Let µ be the restriction of µ to Rn , so that µ(π) = ΦM(π) = M(π) − n 1 = M(π) − n 1. 0 1 ˆ For y ∈ int(∆), let y ≡ y − n 1. Show that 1 log y1 − n . . ˆ ¯ ( µ)−1 ( y) = M−1 ( y) = η . log y − 1 n n j∈S j∈S log y j . log y j ¯ ¯ (iii) Let (C∗ , µ∗ ) be the Legendre transform of (Rn , µ), and define v : int(∆) → R by 0 ∗ ( y). Show by direct computation that v( y) = η ¯ v( y) = µ ˆ j∈S y j log y j . −1 ˜ (iv) Show that v( y) = M ( y). (Hint: Let v be the natural extension of v to Rn , and use + 180 ˜ the fact v( y) = Φ v( y).) (v) Show that 2 v( y) = η Φ diag([ y−1 ]) Φ, where [ y−1 ] j = y−1 for all j ∈ S. j (vi) Show that if π ∈ Rn , then 0 ˜ DM(π) = 2 ˜ µ(π) = η−1 diag(M(π)) − M(π)M(π) = 2 ¯ µ(π) = DM(π). ¯ (vii) Show that 2 v(M(π)) = ( 2 µ(π))−1 when these matrices are viewed as linear maps from Rn to Rn . (Hint: Since both of these maps are of full rank on Rn , it is enough 0 0 0 ¯ to show that 2 µ(π) 2 v(M(π)) = Φ, the orthogonal projection onto Rn .) 0 ˜ Exercise 5.2.8. Suppose that M is a perturbed maximizer function derived from an admissible deterministic perturbation as in equation (5.12) (or from an admissible stochastic ˜ perturbation as in equation (5.11)). Show that if M can be expressed as (5.14) ˜ Mi (π) = α(πi ) j∈S α(π j ) ˜ for some increasing differentiable function α : R → (0, ∞), then M is a logit choice function with some noise level η > 0. (Hint: Combine equation (5.14) with the fact that ˜ the derivative matrix DM(π) must be symmetric (see Corollary 5.C.5 and Theorem 5.C.6 in the Appendix).) Exercise 5.2.9. The variable-rate logit dynamic. The variable-rate logit dynamic with noise level η > 0 is defined by (5.15) p p p p exp(η−1 F j (x)). ˙ xi = mp exp(η−1 Fi (x)) − xi j∈Sp The previous exercise shows that the logit dynamic is the only perturbed best response dynamic that admits a modification of this sort. (i) Describe a simple revision protocol that generates this dynamic, and provide an interpretation. (ii) Show that if p = 1, then (5.15) is equivalent to the logit dynamic (L) up to a change in the speed at which solution trajectories are traversed. Explain why this is not the case when p ≥ 2. (iii) Compare this dynamic with the excess payoff dynamics from Chapter 4. Explain why those dynamics cannot be modified so as to resemble the logit dynamic (L). 181 5.2.4 Perturbed Incentive Properties via Virtual Payoffs Because they incorporate payoff disturbances, perturbed best response dynamics cannot satisfy positive correlation (PC) or Nash stationarity (NS). We now show that these dynamics do satisfy suitably perturbed versions of the two incentive properties. In light of the representation theorem, there is no loss of generality in focusing on dynamics generated by admissible deterministic perturbations v = (v1 , . . . , vp ). We can describe the set of Nash equilibria of F in terms of the best response correspondences Bp : NE(F) = {x ∈ X : xp ∈ mp Bp (x) for all p ∈ P }. 
In similar fashion, we define the set of perturbed equilibria of the pair (F, v) in terms of the ˜ perturbed best response functions Bp : ˜ PE(F, v) = {x ∈ X : xp = mp Bp (x) for all p ∈ P }. By definition, the rest points of the perturbed best response dynamic (5.10) are the perturbed equilibria of (F, v). Observation 5.2.10. All perturbed best response dynamics satisfy perturbed stationarity: (5.16) V (x) = 0 if and only if x ∈ PE(F, v). We can derive an alternate characterization of perturbed equilibrium using the notion ˜ of virtual payoffs. Define the virtual payoffs F : int(X) → Rn for the pair (F, v) by 1 ˜ Fp (x) = Fp (x) − vp ( mp xp ). Thus, the virtual payoff function for population p is the difference between the population’s true payoff function and gradient of its deterministic perturbation. For intuition, let us consider the single population case. When x is far from the ˜ boundary of the simplex X, the perturbation v is relatively flat, so the virtual payoffs F(x) are close to the true payoffs F(x). But near the boundary of X, true and virtual payoffs are quite different. For example, when xi is the only component of x that is close to zero, then for each alternate strategy j i, moving “inward” in direction ei − e j sharply decreases the v value of v; thus, the directional derivative ∂(e∂−e j ) (x) is large in absolute value and negative. i ˜ ˜ It follows that the difference Fi (x) − F j (x) between these strategies’ virtual payoffs is large ˜ and positive. In other words, rare strategies are quite desirable in the “virtual game” F. 182 Individual agents do not use virtual payoffs to decide how to act: to obtain the maximized function in definition (5.12) from the virtual payoff function, we must replace 1 the normalized population state mp xp with the vector of choice probabilities yp . But at 1 perturbed equilibria, mp xp and yp agree. Therefore, perturbed equilibria of (F, v) correspond ˜ to “Nash equilibria” of the “virtual game” F. ˜ Theorem 5.2.11. Let x ∈ X be a social state. Then x ∈ PE(F, v) if and only if ΦFp (x) = 0 for all p ∈ P. ˜ ˜ The equality ΦFp (x) = 0 means that Fp (x) is a constant vector. Since uncommon strategies ˜ are quite desirable in the “virtual game” F, no state that includes an unused strategy can ˜ be a “Nash equilibrium” of F; thus, equality of all virtual payoffs in each population is the ˜ right definition of “Nash equilibrium” in F. Theorem 5.2.11 follows immediately from perturbed stationarity (5.16) and Lemma 5.2.12 below. ˜ Lemma 5.2.12. Let x ∈ X be a social state. Then V p (x) = 0 if and only if ΦFp (x) = 0. ˜ Proof. Using the facts that Mp (πp ) = Mp (Φπp ), that Mp = ( vp )−1 , and that the range of p vp is Rn (so that vp = Φ ◦ vp ), we argue as follows: 0 ˜ V p (x) = 0 ⇔ mp Mp (Fp (x)) = xp ⇔ Mp (ΦFp (x)) = 1p x mp 1 vp ( mp xp ) ⇔ ΦFp (x) = ˜ ⇔ ΦFp (x) = 0. Turning now to disequilibrium behavior, recall that positive correlation is defined in terms of inner products of growth rate vectors and payoff vectors: (PC) p VF (x) p 0 implies that VF (x) Fp (x) > 0 for all p ∈ P . In light of the discussion above, the natural analogue of property (PC) for perturbed best ˜ response dynamics replaces the true payoffs Fp (x) with virtual payoffs Fp (x). Doing so yields virtual positive correlation: (5.17) V p (x) ˜ 0 implies that V p (x) Fp (x) > 0 for all p ∈ P . To conclude this section, we verify that all perturbed best response dynamics heed this property. 183 Theorem 5.2.13. 
All perturbed best response dynamics satisfy virtual positive correlation (5.17). Proof. Let x ∈ X be a social state at which V p (x) (5.18) ˜ yp ≡ Mp (Fp (x)) = Mp (ΦFp (x)) 0. Then by definition, 1p x. mp Since vp = (Mp )−1 , we can rewrite the equality in expression (5.18) as vp ( yp ) = ΦFp (x). Therefore, since V p (x) ∈ TXp , we find that ˜ ˜ ˜ V p (x) Fp (x) = mp Mp (Fp (x)) − xp ΦFp (x) = mp Mp (ΦFp (x)) − xp = mp yp − 1p x mp 1 ΦFp (x) − vp ( mp xp ) 1 vp ( yp ) − vp ( mp xp ) > 0, where the final inequality follows from the fact that yp of vp . 5.3 5.3.1 1p x mp and from the strict convexity The Projection Dynamic Definition Our main payoff monotonicity condition for evolutionary dynamics is positive correlation (PC). In geometric terms, (PC) requires that at each state where population p is not at rest, the growth rate vector V p (x) must form an acute angle with the payoff vector Fp (x). Put differently, (PC) demands that growth rate vectors not distort payoff vectors to too great a degree. Is there an evolutionary dynamic that minimizes this distortion? If the vector field V is to define an evolutionary dynamic, each growth rate vector V (x) must represent a feasible direction of motion, in the sense of lying in the tangent cone TX(x). Thus, the most direct approach to our question is to always take V (x) to be the closest point in TX(x) to the payoff vector F(x). Definition. The projection dynamic associates each population game F ∈ F with a differential equation (P) ˙ x = ΠTX(x) (F(x)), where ΠTX(x) is the closest point projection of Rn onto the tangent cone TX(x). 184 It is easy to provide an explicit formula for (P) at social states in the interior of X. Since at such states TX(x) = TX, the closest point projection ΠTX(x) is simply Φ, the orthogonal projection onto the subspace TX. In fact, whenever xp ∈ int(Xp ), we have that p p ˙ xi = (ΦFp (x))i = Fi − p Fk (x). 1 n k ∈S Thus, when xp is an interior population state, the growth rate of strategy i ∈ Sp is the difference between its payoff and the unweighted average of the payoffs to population p’s strategies. When x is a boundary state, then the projection ΠTX(x) does not reduce to an orthogonal projection, so providing an explicit formula for (P) becomes more complicated. Exercise 5.3.1 describes the possibilities in a three-strategy game, while Exercise 5.3.2 provides an explicit formula for the general case. Exercise 5.3.1. Let F be a three-strategy game. Give an explicit formula for V (x) = ΠTX(x) F(x) when (i) x ∈ int(X); (ii) x1 = 0 but x2 , x3 > 0; (iii) x1 = 1. Exercise 5.3.2. Let F be an arbitrary single population game. Show that the projection ΠTX(x) (v) can be expressed as follows: v i − (ΠTX(x) (v))i = 0 1 #S (v,x) j∈S (v,x) vj if i ∈ S (v, x). otherwise; Here, the set S (v, x) ⊆ S contains all strategies in support(x), along with any subset of S − support(x) that maximizes the average #S (1v,x) j∈S (v,x) v j . 5.3.2 Solution Trajectories The dynamic (P) is clearly discontinuous at the boundary of X, so the existence and uniqueness results for Lipschitz continuous differential equations do not apply. We nevertheless have the following result, which is an immediate consequence of Theorem 5.A.4 in the Appendix. Theorem 5.3.3. Fix a Lipschitz continuous population game F. Then for each ξ ∈ X, there exists a unique Carath´odory solution {xt }t≥0 to the projection dynamic (P) with x0 = ξ. 
Moreover, e 185 solutions to (P) are Lipschitz continuous in their initial conditions: if {xt }t≥0 and { yt }t≥0 are solutions to (P), then | yt − xt | ≤ | y0 − x0 | eKt for all t ≥ 0, where K is the Lipschitz coefficient for F. Theorem 5.3.3 shows that the discontinuous differential equation (P) enjoys many of the properties of Lipschitz continuous differential equations. But there are important differences between the two types of dynamics. One difference is easy to spot: solutions to (P) are solutions in the Carath´ odory sense, and so can have kinks at a measure zero set e of times. Other differences are more subtle. For instance, while the theorem ensures the uniqueness of the forward solution trajectory from each state ξ ∈ X, backward solutions need not be unique. It is therefore possible for distinct solution trajectories of the projection dynamic to merge with one another. Example 5.3.4. Figure 5.3.1 presents phase diagrams for the projection dynamic in good RPS (w = 2, l = 1), standard RPS (w = l = 1), and bad RPS (w = 1, l = 2). In all three games, 1 most solutions spiral around the Nash equilibrium x∗ = ( 1 , 3 , 1 ) in a counterclockwise 3 3 direction. In good RPS (Figure 5.3.1(i)), all solutions converge to the Nash equilibrium. Solutions that begin close to a vertex hit and then travel along an edge of the simplex before heading into the interior of the simplex forever. Thus, there is a portion of each edge that is traversed by solutions starting from a positive measure set of initial conditions. In standard RPS (Figure 5.3.1(ii)), all solutions enter closed orbits at a fixed distance 1 from x∗ . Solutions starting at distance √6 or greater from x∗ (i.e., all solutions at least as 11 1 far from x∗ as the state (0, 2 , 2 )) quickly enter the closed orbit at distance √6 from x∗ ; other solutions maintain their initial distance from x∗ forever. In bad RPS (Figure 5.3.1(iii)), all solutions other than the one starting at x∗ enter the same closed orbit. This orbit alternates between segments through the interior of X and segments along the boundaries. Notice that in all three cases, solution trajectories starting in the interior of the state space can reach the boundary in finite time. This is impossible under any of our previous dynamics, including the best response dynamic. § ˙ Exercise 5.3.5. (i) Under what conditions is the dynamic (P) described by x = ΦF(x) at all states x ∈ X (i.e., not just at interior states)? (ii) Suppose that F(x) = Ax is generated by random matching in the symmetric normal form game A. What do the conditions from part (i) reduce to in this case? (Note ˙ that under these conditions, x = ΦAx is a linear differential equation; it is therefore possible to write down explicit formulas for the solution trajectories (see Chapter 7).) 186 (i) good RPS (ii) standard RPS (iii) bad RPS Figure 5.3.1: The projection dynamic in three Rock-Paper-Scissors games. 187 5.3.3 Incentive Properties That solutions to the projection dynamic exist, are unique, and are continuous in their initial conditions is not obvious. But given this fact and the manner in which the dynamic is defined, it is not surprising that the dynamic satisfies both of our incentive properties. The proofs of these properties are simple applications of the Moreau Decomposition Theorem: given any closed convex cone K ⊆ Rn and any vector π ∈ Rn , the projections ΠK (π) and ΠK◦ (π) are the unique vectors satisfying ΠK (π) ∈ K, ΠK◦ (π) ∈ K◦ , and ΠK (π) + ΠK◦ (π) = π (see Appendix 1.B). Theorem 5.3.6. 
The projection dynamic satisfies Nash stationarity (NS) and positive correlation (PC).

Proof. Using the Moreau Decomposition Theorem and the normal cone characterization of Nash equilibrium (see Theorem 1.3.2), we find that

Π_{TX(x)}(F(x)) = 0 ⇔ F(x) ∈ NX(x) ⇔ x ∈ NE(F),

establishing (NS). To prove (PC), we again use the Moreau Decomposition Theorem:

V^p(x)′F^p(x) = Π_{TX^p(x^p)}(F^p(x))′ ( Π_{TX^p(x^p)}(F^p(x)) + Π_{NX^p(x^p)}(F^p(x)) ) = |Π_{TX^p(x^p)}(F^p(x))|^2 ≥ 0,

where the cross term vanishes because the two components of the Moreau decomposition are orthogonal. The inequality binds if and only if Π_{TX^p(x^p)}(F^p(x)) = V^p(x) = 0.

5.3.4 Revision Protocols and Connections with the Replicator Dynamic

To this point, we have motivated the projection dynamic entirely through geometric considerations. Can this dynamic be derived from a model of individual choice? In this section, we describe revision protocols that generate the projection dynamic as their mean dynamics, and use these protocols to argue that the projection dynamic models "revision driven by insecurity". Our analysis reveals close connections between the projection dynamic and the replicator dynamic, connections that we will develop further in the next chapter. In the remainder of this section, we focus on the single population setting; the extension to multipopulation settings is straightforward.

If we focus exclusively on interior states, the connections between the replicator and projection dynamics are especially strong. In Chapter 3, we introduced three revision protocols that generate the replicator dynamic as their mean dynamics:

(5.19)  ρ_{ij}(π, x) = x_j [π_j − π_i]_+;
(5.20)  ρ_{ij}(π, x) = x_j (K − π_i);
(5.21)  ρ_{ij}(π, x) = x_j (π_j + K).

The x_j term in each formula reflects the fact that these protocols are driven by imitation. For instance, to implement the first protocol, an agent whose clock rings picks an opponent from his population at random; he then imitates this opponent only if the opponent's payoff is higher, doing so with probability proportional to the payoff difference. The x_j term in these protocols endows their mean dynamic with a special functional form: the growth rate of each strategy is proportional to its prevalence in the population. For protocol (5.19), the derivation of the mean dynamic proceeds as follows:

ẋ_i = Σ_{j∈S} x_j ρ_{ji}(F(x), x) − x_i Σ_{j∈S} ρ_{ij}(F(x), x)
  = Σ_{j∈S} x_j x_i [F_i(x) − F_j(x)]_+ − x_i Σ_{j∈S} x_j [F_j(x) − F_i(x)]_+
  = x_i Σ_{j∈S} x_j (F_i(x) − F_j(x))
  = x_i ( F_i(x) − Σ_{j∈S} x_j F_j(x) ).

To derive the projection dynamic on int(X), we use analogues of the revision protocols above, replacing x_j with 1/(n x_i):

(5.22)  ρ_{ij}(π, x) = [π_j − π_i]_+ / (n x_i);
(5.23)  ρ_{ij}(π, x) = (K − π_i) / (n x_i);
(5.24)  ρ_{ij}(π, x) = (π_j + K) / (n x_i).

Thus, while in each of the imitative protocols ρ_{ij} is proportional to the mass of agents playing the candidate strategy j, in the protocols just above ρ_{ij} is inversely proportional to the mass of agents playing the current strategy i. One can therefore designate the projection
1 n j∈S 1 Because of the xi term in the revision protocol, the mean dynamic above does not depend directly on the value of xi , allowing the disappearance rates of rare strategies to stay bounded away from zero. In other words, it is because unpopular strategies can be abandoned quite rapidly that solutions to the projection dynamic can travel from the interior to the boundary of the state space in a finite amount of time. ˙ Except in cases where the projection dynamic is defined by x = ΦF(x) at all states (cf Exercise 5.3.5), the revision protocols above do not generate the projection dynamic on the boundary of X. Exercise 5.3.7 presents a revision protocol that achieves this goal, even while maintaining connections with the replicator dynamic. Exercise 5.3.7. Consider the following two revision protocols (5.25) (5.26) ˆ x j [π j ]+ [πi ]− · ˆ if xk [πk ]+ > 0, ˆ ˆ ρi j (π, x) = k∈S k∈S xk [πk ]+ 0 otherwise. [πS ] ˜j [πS ] ˜i − + ˜k · if xi [πS ]+ > 0, S ρi j (π, x) = xi ˜ k∈S (π,x) k∈S (π,x) [πk ]+ 0 otherwise. ˜i The set S (π, x) in equation (5.26) is defined in Exercise 5.3.2, and πS = πi − #S (1 x) k∈S (π,x) πk . π, (i) Provide an interpretation of protocol (5.25), and show that it generates the replicator dynamic as its mean dynamic. (ii) Provide an interpretation of protocol (5.26), and show that it generates the projec190 tion dynamic as its mean dynamic. Appendix 5.A Differential Inclusions 5.A.1 Basic Theory A correspondence (i.e., a set valued map) V : Rn ⇒ Rn defines a differential inclusion via (DI) ˙ x ∈ V (x). We call (DI) a good upper hemicontinuous (or good UHC) differential inclusion if V is: (i) (ii) (iii) (iv) Nonempty: V (x) ∅ for all x ∈ Rn ; Convex valued: V (x) is convex for all x ∈ X; Bounded: There exists a K ∈ R such that sup{| y| : y ∈ V (x)} ≤ K for all x ∈ Rn ; Upper hemicontinuous: The graph of V , gr(V ) = {(x, y) : y ∈ V (x)}, is closed. While solutions to good UHC differential inclusions are neither as easily defined nor as well behaved as those of Lipschitz continuous differential equations, we will see that analogues of all the main properties of solutions to the latter can be established in the present setting. The set of feasible directions of motion under (DI) changes abruptly at discontinuities of the correspondence V . Our solution notion for (DI) must therefore admit trajectories with kinks: rather than requiring the relation (DI) to hold at every instant in time, it asks only that (DI) hold at almost all times. To formalize this notion, recall that the set Z ⊆ R has measure zero if for every ε > 0, there is a countable collection of open intervals of total length less than ε that covers Z. A property is said to hold for almost all t ∈ [0, T] if it holds on subset of [0, T] whose complement has measure zero. Finally, we say that a trajectory ˙ {xt }t∈[0,T] is a (Carath´odory) solution to (DI) if it is Lipschitz continuous and if xt ∈ V (xt ) at e ˙ almost all times t ∈ [0, T]. Since {xt } is Lipschitz continuous, its derivative xt exists for t ˙ almost all t ∈ [0, T], and the Fundamental Theorem of Calculus holds: xt − xs = s xu du. ˙ Observe that if {xt } is a Carath´ odory solution to a continuous ODE x = V (x), it is also e ˙ a solution to the ODE in the usual sense: xt = V (xt ) at all times t ∈ [0, T]. While our new concept does not introduce new solutions to standard differential equations, it enables us 191 to find solutions in settings where solutions of the old sort do not exist. In particular, we have the following existence result. 
Theorem 5.A.1. Let (DI) be a good UHC differential inclusion. Then for each ξ ∈ Rn there exists a (Carath´odory) solution {xt }t∈[0,T] to (DI) with x0 = ξ. e Our forward invariance result for ODEs extends to the current setting as follows: Theorem 5.A.2. Let C ⊆ Rn be a closed convex set, and let V : C ⇒ Rn satisfy conditions (i)-(iv) above. Suppose that V (x) ⊆ TC(x) for all x ∈ C. Extend the domain of V to all of Rn by letting V ( y) = V (ΠC ( y)) for all y ∈ Rn − C, and let this extension define the differential inclusion (DI) on Rn . Then (i) (DI) is a good UHC differential inclusion. (ii) (DI) admits a forward solution {xt }t∈[0,T] from each x0 ∈ Rn . (iii) C is forward invariant under (DI). Our examples of best response dynamics in Section 5.1 show that differential inclusions can admit multiple solution trajectories from a single initial condition, and hence that solutions need not be continuous in their initial conditions. However, the set of solutions to a differential inclusion still possesses considerable structure. To formalize this claim, let C[0,T] denote the space of continuous trajectories through Rn over the time interval [0, T], equipped with the maximum norm: C[0,T] = {x : [0, T] → Rn : x is continuous}, and ||x|| = max |xt | for x ∈ C[0,T] . t∈[0,T] Now recall two definitions from metric space topology. A set A ⊆ C[0,T] is connected if it cannot be partitioned into two nonempty sets, each of which is disjoint from the closure of the other. The set A is compact if every sequence of elements of A admits a subsequence that converges to an element of A . Now let S[0,T] (V, ξ) be the set of solutions to (DI) with initial condition ξ: S[0,T] (V, ξ) = {x ∈ C[0,T] : x is a solution to (DI) with x0 = ξ}. Theorem 5.A.3. Let (DI) be a good UHC differential inclusion. Then (i) For each ξ ∈ Rn , S[0,T] (V, ξ) is connected and compact. (ii) The correspondence S[0,T] (V, ·) : Rn → C[0,T] is upper hemicontinuous. 192 Although an initial condition ξ may be the source of many solution trajectories of (DI), part (i) of the theorem shows that the set S[0,T] (V, ξ) of such trajectories has a simple structure: it is connected and compact. Given any continuous criterion f : C[0,T] → R (where continuity is defined with respect to the maximum norm on C[0,T] ) and any initial condition ξ, connectedness implies that the set of values f (S[0,T] (V, ξ)) is an interval, while compactness implies that this set of values is compact; thus, there is a solution which is optimal according to criterion f among those that start at ξ. Part (ii) of the theorem provides an analogue of continuity in initial conditions. It tells us that if a sequence of solution trajectories {xk }∞ 1 to (DI) (with possibly differing initial conditions) converges to k= some trajectory x ∈ C[0,T] , then x is also a solution to (DI). 5.A.2 Differential Equations Defined by Projections Let X ⊆ Rn be a compact convex set, and let F : X → Rn be Lipschitz continuous. We consider the differential equation (P) ˙ x = ΠTX(x) (F(x)), where ΠTX(x) is the closest point projection onto the tangent cone TX(x). This equation ˙ provides the closest approximation to the equation x = F(x) that is consistent with the forward invariance of X. Since the right hand side of (P) changes discontinuously at the boundary of X, the Picard-Lindelof Theorem does not apply here. 
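For numerical experimentation, a standard way to approximate solutions of a projected dynamic like (P) is a projected Euler scheme: take an unconstrained Euler step and then project the state back onto X. The sketch below is my own; it uses the Euclidean projection onto the simplex rather than the tangent cone formula of Exercise 5.3.2, and the game is the bad RPS example from Example 5.3.4.

import numpy as np

def proj_simplex(v):
    # Euclidean projection of v onto the unit simplex (sort-based formula)
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.max(np.where(u + (1 - css) / np.arange(1, len(v) + 1) > 0)[0])
    theta = (1 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

A = np.array([[0., -2., 1.],
              [1., 0., -2.],
              [-2., 1., 0.]])             # bad RPS (w = 1, l = 2), as in Example 5.3.4

x = np.array([0.55, 0.35, 0.10])
h = 0.001
for _ in range(50000):
    x = proj_simplex(x + h * (A @ x))     # Euler step, then project back onto X
print(x)   # trajectories can reach the boundary in finite time and travel along it

Such a discretization is only a device for exploration; it does not by itself settle the existence and uniqueness questions addressed in the rest of this appendix.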
Indeed, solutions to (P) have different ¨ properties than solutions of standard ODEs: for instance, solution trajectories from different initial conditions can merge after a finite amount of time has passed. But like solutions to standard ODEs, forward solutions to the dynamic (P) exist, are unique, and are Lipschitz continuous in their initial conditions. Theorem 5.A.4. Let F be Lipschitz continuous. Then for each ξ ∈ X, there exists a unique (Carath´odory) solution {xt }t≥0 to (P) with x0 = ξ. Moreover, solutions are Lipschitz continuous e in their initial conditions: | yt − xt | ≤ | y0 − x0 | eKt for all t ≥ 0, where K is the Lipschitz coefficient for F. We now sketch a proof of this result. Define the multivalued map V : X ⇒ Rn by V (x) = ε>0 cl conv y∈X:| y−x|≤ε ΠTX( y) (F( y)) . 193 In words, V (x) is the closed convex hull of all values of ΠTX( y) (F( y)) that obtain at points y arbitrarily close to x. It is easy to check that V is upper hemicontinuous with closed convex values. Moreover, V (x) ∩ TX(x), the set of feasible directions of motion from x contained in V (x), is always equal to {ΠTX(x) (F(x))}, and so in particular is nonempty. Because V (x) ∩ TX(x) ∅, an extension of Theorem 5.A.2 called the Viability Theorem implies that ˙ for each ξ ∈ X, a solution {xt }t≥0 to x ∈ V (x) exists. But since V (x) ∩ TX(x) = {ΠTX(x) (F(x))}, this solution must also solve the original equation (P). This establishes the existence of solutions to (P). To prove uniqueness and continuity, let {xt } and { yt } be solutions to (P). Using the chain rule, the Moreau Decomposition Theorem, and the Lipschitz continuity of F, we see that d dt 2 yt − xt = 2( yt − xt ) (ΠTX( yt ) (F( yt )) − ΠTX(xt ) (F(xt ))) = 2( yt − xt ) (F( yt ) − F(xt )) − 2( yt − xt ) (ΠNX( yt ) (F( yt )) − ΠNX(xt ) (F(xt ))) = 2( yt − xt ) (F( yt ) − F(xt )) + 2(xt − yt ) ΠNX( yt ) (F( yt )) + 2( yt − xt ) ΠNX(xt ) (F(xt )) ≤ 2( yt − xt ) (F( yt ) − F(xt )) 2 ≤ 2K yt − xt , and hence that 2 t 2 yt − xt ≤ y0 − x0 + 2K ys − xs ds. 0 Gronwall’s inequality then implies that 2 2 yt − xt ≤ y0 − x0 e2Kt . Taking square roots yields the inequality stated in the theorem. 5.B The Legendre Transform The classical Legendre transform is the key tool for proving Theorem 5.2.2, the representation theorem for the additive random utility model. A generalization of this tool, the so-called Legendre-Fenchel transform, underlies the large deviations techniques we will introduce in Chapter 10. In this section, we introduce Legendre transforms of convex functions defined on open intervals and, more generally, on multidimensional convex domains. 194 5.B.1 Legendre Transforms of Functions on Open Intervals Let C = (a, b) ⊆ R be an open interval, and let f : C → R be a strictly convex, continuously differentiable function that becomes steep at the boundaries of C: lim f (x) = −∞ if a > −∞ , and lim f (x) = ∞ if b < ∞. x↓a x↑b The Legendre transform associates with the strictly convex function f a new strictly convex function f ∗ . Because f : C → R is strictly convex, its derivative f : C → R is strictly increasing, and thus invertible. We denote its inverse by ( f )−1 : C∗ → R, where the open interval C∗ is the range of f . Since ( f )−1 is itself strictly increasing, its integral, which we denote f ∗ : C∗ → R, is strictly convex. With the right choice of the constant of integration K, the pair (C∗ , f ∗ ) is the Legendre transform of the pair (C, f ). 
In summary: f ∗ ≡ ( f )−1 + K is strictly convex f : C → C∗ is strictly increasing − − → ( f )−1 : C∗ → C is strictly increasing −− f : C → R is strictly convex The cornerstone of the construction above is this observation: the derivative of f ∗ is the inverse of the derivative of f . That is, (5.27) ( f ∗ ) = ( f )−1 . Or, in other words, (5.28) f ∗ has slope x at y ⇔ f has slope y at x. Surprisingly enough, we can specify the function f ∗ described above in a simple, direct way. We define the Legendre transform (C∗ , f ∗ ) of the pair (C, f ) by C∗ = range( f ) and f ∗ ( y) = max xy − f (x). x∈C The first order condition of the program at right is y = f (x∗ ( y)), or, equivalently, ( f )−1 ( y) = x∗ ( y). On the other hand, if we differentiate f ∗ with respect to y, the envelope theorem yields ( f ∗ ) ( y) = x∗ ( y). Putting these equations together, we see that ( f ∗ ) ( y) = ( f )−1 ( y), which is property (5.27). Suppose that f exists and is positive. Then by differentiating both sides of the identity ( f ∗ ) ( y) = ( f )−1 ( y), we find this simple relationship between the second derivatives of f 195 and f ∗ : ( f ∗ ) ( y) = (( f )−1 ) ( y) = 1 , where x = ( f )−1 ( y) = x∗ ( y). f (x) In words: to find ( f ∗ ) ( y), evaluate f at the point x ∈ C corresponding to y ∈ C∗ , and then take the reciprocal. Our initial discussion of the Legendre transform suggests that it is a duality relation: in other words, that one can generate (C, f ) from (C∗ , f ∗ ) using the same procedure through which (C∗ , f ∗ ) is generated from (C, f ). To prove this, we begin with the simple observations that C∗ is itself an open interval, and that f ∗ is itself strictly convex and continuously differentiable. It is also easy to check that |( f ∗ ) ( y)| diverges whenever y approaches bd(C∗ ); in fact, this is just the contrapositive of the corresponding statement about f . It is easy to verify that (C∗ )∗ = C: (C∗ )∗ = range(( f ∗ ) ) = range(( f )−1 ) = domain( f ) = C. To show that ( f ∗ )∗ = f , we begin with the definition of ( f ∗ )∗ : ( f ∗ )∗ (x) = max xy − f ∗ ( y) y∈C∗ Taking the first order condition yields x = ( f ∗ ) ( y∗ (x)), and hence y∗ (x) = (( f ∗ ) )−1 (x) = f (x). Since ( f )−1 ( y) = x∗ ( y), y∗ and x∗ are inverse functions. We therefore conclude that ( f ∗ )∗ (x) = xy∗ (x) − f ∗ ( y∗ (x)) = xy∗ (x) − x∗ ( y∗ (x)) y∗ (x) − f (x∗ ( y∗ (x))) = f (x). Putting this all together, we obtain our third characterization of the Legendre transform and of the implied bijection between C and C∗ : (5.29) x maximizes xy − f (x) ⇔ y maximizes xy − f ∗ ( y). Example 5.B.1. If C = R and f (x) = ex , then the Legendre transform of (C, f ) is (C∗ , f ∗ ), where C∗ = (0, ∞) and f ∗ ( y) = y log y − y. § Example 5.B.2. Suppose that c : R → R is a strictly convex cost function. (For convenience, we allow negative levels of output; the next example shows that this is without loss of generality if c (0) = 0.) If output can be sold at price p ∈ C∗ = range(c ), then maximized 196 C* g–1= ( f *)’ g=f’ y f *(y) f(x) x C Figure 5.B.1: A Legendre transform. profit equals π(p) = max xp − c(x). x∈R Thus, by definition, (C∗ , π) is the Legendre transform of (R, c). The duality relation tells us that if we started instead with the maximized profit function π : C∗ → R, we could recover the cost function c via the dual program c(x) = max xp − π(p). § p∈C∗ Example 5.B.3. 
To obtain the class of examples that are easiest to visualize, let the function g : R → R be continuous, strictly increasing, and satisfy lim g(x) = −∞, lim g(x) = ∞, and g(0) = 0. x↓−∞ x↑∞ x If we define f (x) = 0 g(s) ds on domain R, then the Legendre transform of (R, f ) is (R, f ∗ ), y where f ∗ ( y) = 0 g−1 (t) dt. Evidently, ( f ∗ ) = g−1 = ( f )−1 . Indeed, Figure 5.B.1 illustrates that x maximizes xy − f (x) if and only if y maximizes xy − f ∗ ( y), and that f ∗ has slope x at y if and only if f has slope y at x. § 197 5.B.2 Legendre Transforms of Functions on Multidimensional Domains Analogues of all of the previous results can be established in settings with multidimensional domains. Let Z be a linear subspace of Rn . We call (C, f ) a Legendre pair if C ⊆ Z is (relatively) open and convex, and f is C1 , strictly convex, and steep near bd(C), where f is steep near bd(C) if | f (x)| → ∞ whenever x → bd(C). Our goal is to define a pair (C∗ , f ∗ ) that satisfies properties (5.30), (5.31), and (5.32): (5.31) f ∗ = ( f )−1 . f ∗ has slope x at y ⇔ f has slope y at x. (5.32) x maximizes x y − f (x) ⇔ y maximizes x y − f ∗ ( y). (5.30) As before, we can imagine obtaining f ∗ from f by differentiating, inverting, and then integrating, as we illustrate in the diagram below: f : C → R is strictly convex f : C → C∗ is invertible f ∗ ≡ ( f )−1 + K is strictly convex −−→ −− ( f )−1 : C∗ → C is invertible Since the domain of f is C ⊆ Z, the derivative of f , D f , is a map from C into L(Z, R), the set of linear forms on Z. The gradient of f at x is the unique vector f (x) ∈ Z that represents D f (x); thus, f is a map from C into Z. We define the Legendre transform (C∗ , f ∗ ) of the pair (C, f ) by C∗ = range( f ) and f ∗ ( y) = max x y − f (x). x∈C Theorem 5.B.4 summarizes the Legendre transform’s basic properties. Theorem 5.B.4. Suppose that (C, f ) is a Legendre pair. Then: (i) (C∗ , f ∗ ) is a Legendre pair. (ii) f : C → C∗ is bijective, and ( f )−1 = f ∗ . (iii) f (x) = max y∈C∗ x y − f ∗ ( y). (iv) The maximizers x∗ and y∗ satisfy x∗ ( y) = f ∗ ( y) = ( f )−1 ( y) and y∗ (x) = ( f ∗ )−1 (x). f (x) = As in the one dimensional case, we can relate the second derivatives of f ∗ to the second derivatives of f . The second derivative D2 f is a map from C to L2 (Z, R), the set s 2 n ×n of symmetric bilinear forms on Z × Z. The Hessian of f at x, f (x) ∈ R , is the unique 198 representation of D2 f (x) by a symmetric matrix whose rows and columns are in Z. In fact, since the map z → 2 f (x) z has range Z, we can view the matrix 2 f (x) as a linear map from Z to Z. We rely on this observation in the following result. Corollary 5.B.5. If D2 f (x) exists and is positive definite for all x ∈ C, then D2 f ∗ ( y) exists and is positive definite for all y ∈ C∗ . In fact, 2 f ∗ ( y) = ( 2 f (x))−1 as linear maps from Z to Z, where x = ( f )−1 ( y). In the one dimensional setting, the derivative f is invertible because it is strictly increasing. Both of these properties also follow from the stronger assumption that f (x) > 0 for all x ∈ C. In the multidimensional setting, it makes no sense to ask whether f is strictly increasing. But there is an analogue of the second derivative condition: namely, that the Hessian 2 f (x) is positive definite on Z × Z for all x ∈ C. 
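The one-dimensional properties above can also be verified numerically from the variational definition f*(y) = max_x (xy - f(x)). A brief sketch (illustrative only; it uses the exponential function of Example 5.B.1 and SciPy's scalar optimizer) checks the value of the transform, the duality (f*)* = f, and the reciprocal relationship between second derivatives.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def conjugate(g, y, lo, hi):
    """g*(y) = max_x (x*y - g(x)), computed by numerical maximization on [lo, hi]."""
    res = minimize_scalar(lambda x: -(x * y - g(x)), bounds=(lo, hi), method='bounded')
    return -res.fun

f = np.exp                                    # Example 5.B.1: C = R, f(x) = e^x

# f*(y) should equal y log y - y on C* = (0, oo).
for y in [0.5, 1.0, 3.0]:
    print(f"f*({y}) = {conjugate(f, y, -20, 10):.6f}   y log y - y = {y*np.log(y)-y:.6f}")

# Duality: applying the same construction to f* recovers f, so (f*)*(1) = e^1.
f_star = lambda y: y * np.log(y) - y
print("(f*)*(1.0) =", conjugate(f_star, 1.0, 1e-9, 50.0), "   e^1 =", np.e)

# Reciprocal second derivatives (the one-dimensional case of Corollary 5.B.5):
# (f*)''(y) = 1 / f''(x) with x = (f')^{-1}(y) = log y.
y, h = 3.0, 1e-4
d2 = (f_star(y + h) - 2 * f_star(y) + f_star(y - h)) / h**2
print("(f*)''(3) =", round(d2, 6), "   1 / f''(log 3) =", round(1 / 3, 6))
```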
According to the Global Inverse Function Theorem, any function on a convex domain that is proper (i.e., preimages of compact sets are compact) and whose Jacobian determinant is everywhere nonvanishing is invertible; thus, the fact that 2 f (x) is always positive definite implies that ( f )−1 exists. However, this deep result is not needed to prove Theorem 5.B.4 or Corollary 5.B.5. 5.C Perturbed Optimization 5.C.1 Proof of the Representation Theorem We now use the results on Legendre transforms from Appendix 5.B to prove Theorem ˜ 5.2.2. We defined the perturbed maximizer function M using stochastic perturbations via (5.11) ˜ Mi (π) = P i = argmax π j + ε j . j∈S Here, the random vector ε is an admissible stochastic perturbation if it has a positive density ˜ ˜ on Rn , and if this density is sufficiently smooth that M is C1 . We defined M using deterministic perturbations via (5.12) ˜ M(π) = argmax ( y π − v( y)). y∈int(∆) Here, the function v : int(∆) → R is an admissible deterministic perturbation if the Hessian matrix 2 v( y) is positive definite on Rn × Rn for all y ∈ int(∆), and if | v( y)| approaches 0 0 infinity whenever y approaches bd(∆). 199 ˜ Theorem 5.2.2. Let M be a perturbed maximizer function defined in terms of an admissible ˜ stochastic perturbation ε via equation (5.11). Then M satisfies equation (5.12) for some admissible ˜ deterministic perturbation v. In fact, M = M|Rn and v are invertible, and M = ( v)−1 . 0 Proof. The probability that alternative i is chosen when the payoff vector is π is ˜ Mi (π) = P(πi + εi ≥ π j + ε j for all j ∈ S) = P(ε j ≤ πi + εi − π j for all j) πi +xi −π1 ∞ = πi +xi −πi−1 πi +xi −πi+1 −∞ f (x) dxn . . . dxi+1 dxi−1 . . . dx1 dxi , ··· ··· −∞ πi +xi −πn −∞ −∞ −∞ where f is the joint density function of the random perturbations ε. The following lemma ˜ lists some properties of the derivative of M. Lemma 5.C.1. For all π ∈ Rn we have ˜ (i) DM(π) 1 = 0. ˜ (ii) DM(π) is symmetric. ˜ (iii) DM(π) has strictly negative off-diagonal elements. ˜ (iv) DM(π) is positive definite with respect to Rn × Rn . 0 0 ˜ ˜ Proof. Part (i) follows from differentiating the identity M(π) = M(Φπ). To establish parts (ii) and (iii), let i and j > i be two distinct strategies. Then using the change of ˆ variable x j = πi + xi − π j , we find that ˜ ∂Mi (π) = − ∂π j πi +xi −π1 ∞ πi +xi −πi−1 πi +xi −πi+1 ··· −∞ −∞ πi +xi −π j−1 πi +xi −π j+1 ··· −∞ −∞ πi +xi −πn f (x1 , . . . , x j−1 , ··· −∞ −∞ −∞ πi + xi − π j , x j+1 , . . . , xn ) dxn . . . dx j+1 dx j−1 . . . dxi+1 dxi−1 . . . dx1 dxi ˆ π j +x j −π 1 ∞ =− ˆ π j +x j −πi−1 ˆ π j +x j −πi+1 ··· −∞ −∞ ˆ π j +x j −π j−1 ˆ π j +x j −π j+1 ··· −∞ −∞ ˆ π j +x j −πn f (x1 , . . . , xi−1 , ··· −∞ −∞ −∞ ˆ ˆ π j + x j − πi , xi+1 , . . . , xn ) dxn . . . dx j+1 dx j−1 . . . dxi+1 dxi−1 . . . dx1 dx j = ˜ ∂M j (π), ∂πi which implies claims (ii) and (iii). To establish claim (iv), let z ∈ Rn . Then using claims (i), 0 (ii), and (iii) in succession yields ˜ z DM(π)z = i∈S j∈S ˜ ∂Mi (π) zi z j = ∂π j i∈S 200 ji ˜ ∂Mi (π) zi z j + ∂π j i∈S − ji ˜i 2 ∂M (π) zi ∂π j = i ji ˜ ∂Mi (π) zi z j − z2 = i ∂π j =− i j <i ˜ ∂M i (π) zi − z j ∂π j 2 i j<i ˜ ∂Mi (π) 2zi z j − z2 − z2 i j ∂π j > 0. ˜ ˜ Since the derivative matrix DM(π) is symmetric, the vector field M admits a potential ˜ ˜ ˜ function µ : Rn → R (that is, a function satisfying µ(π) = M(π) for all π ∈ Rn ). Let ¯ ¯ ˜0 ˜ µ = µ|Rn be the restriction of µ to Rn . 
Then for all π ∈ Rn , µ(π) ∈ Rn is given by 0 0 0 1 1 ˜ ˜ ¯ ˜ µ(π) = Φ µ(π) = ΦM(π) = M(π) − n 1 = M(π) − n 1 , ˜ where the third equality uses the fact that M(π) ∈ ∆. ¯ ¯ Since 2 µ(π) = DM(π) is positive definite with respect to Rn × Rn , µ is strictly convex; 0 0 n n ∗ , µ∗ ) be the Legendre ¯ thus, since bd(R0 ) is empty, (R0 , µ) is a Legendre pair. Let the pair (C ¯ 1 1 n ¯ ¯ transform of (R0 , µ), and define the function v : (C∗ + n 1) → R by v( y) = µ∗ ( y − n 1). Theorem 5.2.2 then follows immediately from Lemma 5.C.2. 1 Lemma 5.C.2. (i) C∗ + n 1 = int(∆). (ii) v : int(∆) → Rn is the inverse of M : Rn → int(∆). 0 0 (iii) v is an admissible deterministic perturbation. ˜ (iv) M(π) = argmax y∈int(∆) y π − v( y) for all π ∈ Rn . 1 ¯ Proof. (i) The set C∗ = range( µ) = range(M) − n 1 is convex by Theorem 5.B.4(i). Moreover, if the components π j , j ∈ J ⊂ S stay bounded while the remaining components ˜ ˜ approach infinity, then M j (π) → 0 for all j ∈ J: that is, M(π) converges to a subface of the ˜ simplex ∆. Thus, range(M) = range(M) ⊆ int(∆) contains points arbitrarily close to each corner of the simplex. Since range(M) is convex, it must equal int(∆). (ii) Let y ∈ int(∆). Using Theorem 5.B.4(ii), we find that 1 1 ¯ ¯ v( y) = µ∗ ( y − n 1) = ( µ)−1 ( y − n 1) = M−1 ( y). 1 ¯ (iii) (C∗ , µ∗ ) is a Legendre pair by Theorem 5.B.4(i); thus, if y → bd(∆) = bd(C∗ ) + n 1, 1 ¯ ¯ then | v( y)| = µ∗ ( y − n 1) diverges. In addition, since 2 µ(π) = DM(π) is positive definite with respect to Rn × Rn for all π ∈ Rn , Corollary 5.B.5 implies that 2 v( y) = 0 0 0 1 2∗ ¯ µ ( y − n 1) is positive definite with respect to Rn × Rn for all y ∈ int(∆). 0 0 201 (iv) ˜ ˜ Since M(·) = M(Φ(·)), it is enough to consider π ∈ Rn . For such π, 0 ˆ ¯ ˆ 1 argmax y π − v( y) = argmax y π − µ∗ ( y) + n 1 1 y∈int(∆) ˆ y∈int(∆)− n 1 1 ¯ = µ(π) + n 1 ˜ = M(π), where the second equality follows from Theorem 5.B.4(iv). This completes the proof of Theorem 5.2.2. 5.C.2 Additional Results We conclude this section by stating without proof a few additional results on perturbed ˜ optimization. The first two of these concern the construction of the potential function µ ˜ of the perturbed maximizer function M. In fact, two constructions are available, one for each sort of perturbation. ˜ If we define M in terms of an admissible deterministic perturbation v, then one can verify (using the Envelope Theorem or a direct calculation) that the perturbed maximum ˜ function associated with v is a potential function for M. ˜ Observation 5.C.3. The function µ : Rn → R defined by ˜ µ(π) = max y π − v( y) y∈int(∆) ˜ is a potential function for M as defined in (5.12). ˜ Alternatively, suppose we define M in terms of an admissible stochastic perturbation ε. In this case, the expectation of the maximal perturbed payoff is a potential function for ˜ M. ˜ Theorem 5.C.4. The function µ : Rn → R defined by ˜ µ(π) = E max(π j + ε j ) j∈S ˜ is a potential function for M as defined in (5.11). The intuition behind this result is simple. If we marginally increase the value of πi , the value of the maximum function max j π j + ε j goes up at a unit rate at those values of ε 202 where strategy i is optimal. The set of ε at which strategy i is optimal also changes, but the contribution of these points to the value of the maximum function is negligible. Building on these observations, one can show that ˜ ∂µ ˜ (π) = E1{i=argmax j π j +ε j } = P i = argmax j π j + ε j = Mi (π). ∂πi Which functions are perturbed maximizer functions? 
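Before turning to that question, the results above can be made concrete in the logit case, in which the perturbations are i.i.d. standard Gumbel random variables (one admissible stochastic perturbation). For that specification the choice probabilities are proportional to e^{pi_i}, and E max_j (pi_j + eps_j) equals the log-sum-exp function plus Euler's constant. The sketch below (illustrative, not part of the proof) checks the derivative properties of Lemma 5.C.1 and the potential property of Theorem 5.C.4 for this specification.

```python
import numpy as np

def logit_choice(pi):
    """M~(pi) for i.i.d. standard Gumbel perturbations (the logit rule)."""
    e = np.exp(pi - pi.max())
    return e / e.sum()

pi = np.array([1.0, 0.3, -0.5, 0.8])
M = logit_choice(pi)
DM = np.diag(M) - np.outer(M, M)          # derivative matrix of the logit rule

# Lemma 5.C.1: DM 1 = 0, DM symmetric, negative off-diagonals, positive definite on R^n_0.
print("DM @ 1        :", DM @ np.ones(4))
print("symmetric     :", np.allclose(DM, DM.T))
print("off-diag < 0  :", (DM[~np.eye(4, dtype=bool)] < 0).all())
z = np.array([1.0, -2.0, 0.5, 0.5])       # a vector in R^n_0 (components sum to zero)
print("z' DM z > 0   :", z @ DM @ z > 0)

# Theorem 5.C.4: mu~(pi) = E max_j (pi_j + eps_j) is a potential for M~.
rng = np.random.default_rng(0)
eps = rng.gumbel(size=(1_000_000, 4))
mu_mc = (pi + eps).max(axis=1).mean()                # Monte Carlo estimate of E max
mu_cf = np.log(np.exp(pi).sum()) + np.euler_gamma    # closed form in the Gumbel case
print("E max: Monte Carlo", round(mu_mc, 3), "  log-sum-exp + gamma", round(mu_cf, 3))
# The gradient of the log-sum-exp potential is the choice rule itself.
print("choice probs  :", M)
print("argmax freqs  :", np.bincount((pi + eps).argmax(axis=1), minlength=4) / len(eps))
```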
The following characterization of the perturbed maximizer functions that can be derived from admissible deterministic perturbations follows easily from the proof of Theorem 5.2.2. ˜ Corollary 5.C.5. A bijective function M : Rn → int(∆) can be derived from an admissible ˜ deterministic perturbation if and only if DM(π) is symmetric, positive definite on Rn , and satisfies 0 ˜ (π)1 = 0. DM The counterpart of this result for stochastic perturbations is known as the Williams-DalyZachary Theorem. ˜ Theorem 5.C.6. A bijective function M : Rn → int(∆) can be derived from an admissible ˜ stochastic perturbation if and only if DM(π) is symmetric, positive definite on Rn , and satisfies 0 ˜ ˜ DM(π)1 = 0, as well as the additional requirement that the partial derivatives of M satisfy ˜ ∂k M i 0 >0 (−1) ∂πi1 · · · ∂πik k for each k = 1, . . . , n − 1 and each set of k + 1 distinct indices {i0 , i1 , . . . , ik } ⊆ S. To establish the necessity of the kth order derivative conditions, one repeatedly differenti˜ ates the definition of M. The first order derivative condition is derived in this way in the proof of Theorem 5.2.2. These two results show that deterministic perturbations generate a strictly larger class of perturbed maximizer functions than stochastic perturbations; see Exercise 5.2.3 for an explicit example. 5.N Notes Section 5.1: The best response dynamic was introduced by Gilboa and Matsui (1991) and further studied by Matsui (1992), Hofbauer (1995b), and Gaunersdorfer and Hofbauer (1995). Hofbauer (1995b) introduced the interpretation of the best response dynamic as a differential inclusion. 203 Example 5.1.7 is introduced by Zeeman (1980), who shows that the interior Nash equilibrium of this game is not an ESS but is nevertheless asymptotically stable under the replicator dynamic. The properties of the best response dynamic described in the example are pointed out by Hofbauer (1995b). A complete analysis of best response dynamics in Rock-Paper-Scissors games can be found in Gaunersdorfer and Hofbauer (1995). An approximation theorem for collections of Markov processes whose mean dynamics are differential inclusions is proved by Bena¨m et al. (2005). They work in a setting in which ı the step size of the increments of the Markov processes shrinks over time; we conjecture that their result is also true in the present constant step size setting. Such a result would provide a foundation not just for the best response dynamic, but for the projection dynamic as well. Section 5.2: This section is based on Hofbauer and Sandholm (2002, 2007). The perturbed best response dynamic first appears in the work of Fudenberg and Kreps (1993) on stochastic fictitious play, while the logit dynamic first appears in Fudenberg and Levine (1998). For the Hausdorff metric mentioned in Example 5.2.1, see Ok (2007). For further references on logit models in game theory, see the Notes to Chapter 10. In the experimental economics literature, perturbed equilibrium goes by the name of quantal response equilibrium, a term introduced by McKelvey and Palfrey (1995). Some authors use this term more narrowly to refer to logit equilibrium. For more on the use of these concepts in the experimental literature, see Camerer (2003) and the references therein. ˜ The properties of the derivative matrix DM(π) have long been known in the discrete choice literature—see McFadden (1981) or Anderson et al. (1992). The control cost interpretation of deterministic perturbations is suggested by van Damme (1991, Chapter ˜ 4). 
That independent εi with bounded densities generate a continuously differentiable M follows from standard results on convolutions; see Hewitt and Stromberg (1965, Theorem 21.33). An intuitive discussion of the Poincar´ -Hopf Theorem can be found in Hofbauer and e Sigmund (1988, Section 19); see Milnor (1965) for a formal treatment. See Ritzberger (1994), Demichelis and Germano (2000, 2002), and Demichelis and Ritzberger (2003) for intriguing uses of topological ideas to study the global properties of evolutionary game dynamics. Section 5.3: Nagurney and Zhang (1996, 1997), building on work of Dupuis and Nagurney (1993), introduce the projection dynamic in the context of congestion games. Earlier, Friedman (1991) introduced an evolutionary dynamic that is equivalent to the projection dynamic on int(X), but that is different at states in bd(X). The presentation in this section 204 follows Lahkar and Sandholm (2008) and Sandholm et al. (2008). Appendix 5.A: Smirnov (2002) provides a readable introduction to the theory of differential inclusions. A more comprehensive but less readable reference is Aubin and Cellina (1984). The existence of solutions to differential inclusions defined by projections of multivalued maps was proved by Henry (1973); the approach described here follows Aubin and Cellina (1984, Section 5.6). Restricting attention to differential equations defined by projections of Lipschitz continuous functions allows one to establish uniqueness and continuity results, a point noted, e.g., by Dupuis and Nagurney (1993). Appendix 5.B: Formal treatments of the Legendre transform can be found in Rockafellar (1970) and Hiriart-Urruty and Lemar´ chal (2001). Example 5.B.3 is borrowed from Roberts e and Varberg (1973, Section 15). For the Global Inverse Function Theorem, see Gordon (1972). Appendix 5.C : Theorem 5.2.2 is due to Hofbauer and Sandholm (2002). For proofs of Theorem 5.C.4 and 5.C.6, see McFadden (1981) or Anderson et al. (1992). The latter source is a good general reference on discrete choice theory. 205 206 Part III Convergence and Nonconvergence 207 CHAPTER SIX Global Convergence of Evolutionary Dynamics 6.0 Introduction In the preceding chapters, we introduced a variety of classes of evolutionary dynamics and exhibited their basic properties. Most conspicuously, we established links between the rest points of each dynamic and the Nash equilibria of the underlying game, links that are valid regardless of the nature of the game at hand. This connection is expressed in its strongest form by dynamics satisfying Nash stationarity (NS), under which rest points and Nash equilibria coincide. Still, once one specifies an explicitly dynamic model of behavior, the most natural approach to prediction is not to focus immediately on equilibrium points, but to determine where the dynamic leads when set in motion from various initial conditions. If equilibrium occurs as the limiting state of this adjustment process, we can feel some confidence in predicting equilibrium play. If instead our dynamics lead to limit cycles or other more complicated limit sets, then these sets rather than the unstable rest points provide superior predictions of behavior. In this chapter, we seek conditions on games and dynamics under which behavior converges to equilibrium from all or nearly all initial population states. 
We therefore reconsider the three classes of population games introduced in Chapter 2 (potential games, stable games, and supermodular games) and derive conditions on evolutionary dynamics that ensure convergence in each class of games. We also establish convergence results for dominance solvable games, but we shall see in Chapter 8 that these results are not robust to small changes in the dynamics for which they hold.

The most common method for proving global convergence in a dynamical system is by constructing a strict Lyapunov function: a scalar-valued function that the dynamic ascends whenever it is not at rest. When the underlying game is a potential game, the game's potential function provides a natural candidate Lyapunov function for evolutionary dynamics. We verify in Section 6.1 that the potential function serves as a Lyapunov function under any evolutionary dynamic that satisfies our basic monotonicity condition, positive correlation (PC). We then use this fact to prove global convergence in potential games under all of the evolutionary dynamics studied in Chapters 4 and 5.

Unlike potential games, stable games do not come equipped with a scalar-valued function that is an obvious candidate Lyapunov function for evolutionary dynamics. But the structure of payoffs in these games, already reflected in the fact that their sets of Nash equilibria are convex, makes it natural to expect convergence results to hold. We develop this intuition in Section 6.2, where we describe approaches to constructing Lyapunov functions for stable games. We find that distance-like functions serve as Lyapunov functions for the replicator and projection dynamics, allowing us to establish global convergence results for these dynamics in strictly stable games. For target dynamics, including excess payoff, best response, and perturbed best response dynamics, we find that integrability of the revision protocol is the key to establishing convergence results. We argue in Section 6.2.2 that in the presence of payoff monotonicity, integrability of the protocol ensures that on average, the vector of motion deviates from the vector of payoffs in the direction of the equilibrium; given the geometry of equilibrium in stable games, this is enough to ensure convergence to equilibrium. All told, we prove global convergence results for all six of our fundamental dynamics.

In Section 6.3, we turn our attention to supermodular games. As these games' essential property is the monotonicity of their best response correspondences, it is not surprising that our convergence results address dynamics that respect this monotone structure. We begin by considering the best response dynamic, using elementary methods to prove a convergence result for supermodular games generated by two-player normal form games that satisfy a "diminishing returns" condition. To obtain convergence results that demand less structure of the game, we appeal to methods from the theory of cooperative differential equations: these are smooth differential equations under which increasing the value of one component of the state variable increases the growth rates of all other components. The smoothness requirement precludes applying these methods to the best response dynamic, but we are able to use them to study perturbed best response dynamics. We prove that after a natural change of coordinates, perturbed best response functions generated by stochastic perturbations of payoffs are monotone.
Ultimately, this allows us to show that the corresponding perturbed best response dynamics converge to perturbed equilibrium 210 from almost all initial conditions. In Section 6.4, we study evolution in games with strictly dominated strategies. We find that under the best response dynamic and under imitative dynamics, strictly dominated strategies are eliminated; so are strategies ruled out by iterative removal of strictly dominated strategies. It follows that in games that are dominance solvable—that is, in games where this iterative procedure leaves only one strategy for each population—the best response dynamic and all imitative dynamics converge to the dominance solution. We should emphasize, however, that these elimination results are not robust: we will see in Chapter 8 that under many small modifications of the dynamics covered by our elimination results, strictly dominated strategies can survive. The definitions and tools from dynamical systems theory needed for our analyses are treated in the Appendix. Appendix 6.A introduces notions of stability, limit behavior, and recurrence for deterministic dynamics. Appendix 6.B presents stability and convergence results for dynamics that admit Lyapunov functions. Finally, Appendix 6.C introduces the theory of cooperative differential equations and monotone dynamical systems. 6.1 6.1.1 Potential Games Potential Functions as Lyapunov Functions In a potential game F : X → Rn , all information about incentives is captured by the potential function f : X → R, in that (6.1) f (x) = ΦF(x) for all x ∈ X. In Chapter 2, we characterized Nash equilibria of F as those states that satisfy the KuhnTucker first order conditions for maximizing f on X. We now take a further step, using the potential function to describe disequilibrium adjustment. In Lemma 6.1.1, we show that any evolutionary dynamic that satisfies positive correlation, (PC) p VF (x) p 0 implies that VF (x) Fp (x) > 0, must ascend the potential function f . To state this result, we introduce the notion of a Lyapunov function. The C1 function ˙ L : X → R is an (increasing) strict Lyapunov function for the differential equation x = VF (x) ˙ if L(x) ≡ L(x) VF (x) ≥ 0 for all x ∈ X, with equality only at rest points of VF . 211 Lemma 6.1.1. Let F be a potential game with potential function f . Suppose the evolutionary ˙ dynamic x = VF (x) satisfies positive correlation (PC). Then f is a strict Lyapunov function for VF . Proof. Follows immediately from condition (PC) and the fact that f˙(x) = p ˙ f (x) x = (ΦF(x)) VF (x) = Fp (x) VF (x). p∈P The initial equality in the expression above follows from an application of the chain rule (Section 2.A.4) to the composition t → xt → f (xt ). Versions of this argument will be used often in the proofs of the results to come. If a dynamic admits a strict Lyapunov function, all solution trajectories of the dynamic converge to equilibrium. Combining this fact with Lemma 6.1.1 allows us to prove a global convergence result for potential games. To state this result, we briefly present some definitions concerning limit behavior of deterministic trajectories; for more on these notions and on Lyapunov functions, see Appendices 6.A and 6.B. The ω-limit of trajectory {xt }t≥0 is the set of all points that the trajectory approaches arbitrarily closely infinitely often: ω({xt }) = y ∈ X : there exists {tk }∞ 1 with lim tk = ∞ such that lim xtk = y . 
k= k→∞ k→∞ ˙ For dynamics x = VF (x) that admit a unique solution trajectory from each initial condition, we write ω(ξ) for the ω-limit set of the trajectory starting from state ξ, and we let Ω(VF ) = ω(ξ) ξ∈X denote the set of all ω-limit points of all solution trajectories. The set Ω(VF ) (or its closure, when Ω(VF ) is not closed) provides a basic notion of recurrence for deterministic dynamics. ˙ Theorem 6.1.2. Let F be a potential game, and let x = VF (x) be an evolutionary dynamic for F that admits a unique solution from each initial condition and that satisfies positive correlation (PC). Then Ω(VF ) = RP(VF ). In particular, (i) If VF is an imitative dynamic, then Ω(VF ) = RE(F), the set of restricted equilibria of F. (ii) If VF is an excess payoff dynamic, a pairwise comparison dynamic, or the projection dynamic, then Ω(VF ) = NE(F). Proof. Immediate from Lemma 6.1.1, Theorem 6.B.4, and the characterizations of rest points from Chapters 4 and 5. 212 Example 6.1.3. 123 Coordination. Figure 6.1.1 presents phase diagrams for the six fundamental dynamics in 123 Coordination: (6.2) 1 0 0 x1 x1 0 2 0 x = 2x . 2 2 F(x) = Ax = 0 0 3 x 3x 3 3 In the first five cases, the phase diagram is plotted atop the potential function (6.3) f (x) = 1 2 (x1 )2 + 2(x2 )2 + 3(x3 )2 . Of these, the first four cases (replicator, projection, BNN, Smith) are covered by Theorem 6.1.2; evidently, every solution trajectory in diagrams (i)–(iv) ascends the potential function, ultimately converging to one of the seven Nash equilibria of F. It is worth noting that these equilibria are not all locally stable. The interior equilibrium is a source, with all nearby solution trajectories moving away from the equilibrium. The three equilibria with two-strategy supports are saddles: for each of these, there is one solution trajectory that converges to the equilibrium, while all other nearby trajectories eventually move away from the equilibrium. Only the three remaining equilibria—the three pure equilibria—are locally stable. We defer further discussion of local stability to Chapter 7, which is devoted to this topic. § Our convergence results for best response and perturbed best response dynamics require additional work. In the case of the best response dynamic (BR) p ˙ xp ∈ mp Mp (Fp (x)) − xp , where Mp (πp ) = argmax πi , i∈Sp we must account for the fact that the dynamic is multivalued. ˙ Theorem 6.1.4. Let F be a potential game with potential function f , and let x ∈ VF (x) be the best response dynamic for F. Then ∂f (x) = ∂z ˆp mp max F j (x) for all z ∈ VF (x) and x ∈ X. p p∈P j∈S Therefore, every solution trajectory {xt } of VF satisfies ω({xt }) ⊆ NE(F). Proof. Recall from Theorem 5.1.8 that the best response dynamic satisfies the following 213 1 2 1 32 3 (i) replicator (ii) projection 1 1 2 32 3 (iii) BNN (iv) Smith 1 1 2 32 (v) best response 3 (vi) logit(.5) Figure 6.1.1: Six basic dynamics in 123 Coordination. The contour plots are the potential function in (i)-(v), and the logit potential function in (vi). 214 refinement of condition (PC): p p ˆ (zp ) Fp (x) = mp max F j (x) for all zp ∈ VF (x). p j∈S This condition immediately implies that ∂f (x) ≡ ∂z f (x) z = (ΦF(x)) z = p Fp (x) zp = p∈P ˆ mp max F j (x). p p∈P j∈S ∂f Thus, ∂z (x) ≥ 0 for all x ∈ X and z ∈ VF (x), and Lemma 4.5.4 implies that equality holds if and only if x ∈ NE(F). The convergence result now follows from Theorem 6.B.5. Example 6.1.5. 123 Coordination revisited. 
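Before examining the best response phase diagram, it is straightforward to check Theorem 6.1.2 numerically in this game. The sketch below (an illustration, not from the text) runs an Euler discretization of the replicator dynamic from several random interior states and confirms that the potential function (6.3) never decreases along the resulting paths.

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0])                            # 123 Coordination, F(x) = Ax as in (6.2)
f = lambda x: 0.5 * (x[0]**2 + 2*x[1]**2 + 3*x[2]**2)   # potential function (6.3)

def replicator_step(x, dt):
    Fx = A @ x
    return x + dt * x * (Fx - x @ Fx)                   # x_i' = x_i (F_i(x) - x'F(x))

rng = np.random.default_rng(1)
for _ in range(5):                                      # several random interior initial states
    x = rng.dirichlet(np.ones(3))
    values = []
    for _ in range(20_000):                             # horizon T = 20 with dt = 1e-3
        values.append(f(x))
        x = replicator_step(x, 1e-3)
    ascending = all(b >= a - 1e-12 for a, b in zip(values, values[1:]))
    print("state at T = 20:", np.round(x, 3), "   potential ascends:", ascending)
```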
Figure 6.1.1(v) presents the phase diagram of the best response dynamic in 123 Coordination (6.2), again atop the potential function (6.3). As in Example 5.1.5, there are multiple solutions starting from each initial condition on the Y-shaped set of boundaries between the best response regions. It is not hard to verify that each of these solutions converges to a Nash equilibrium. § Finally, we turn to perturbed best response dynamics, considering the (more general) definition of these dynamics via admissible deterministic perturbations vp : int(∆p ) → R. ˜ ˜ ˙ xp = mp Mp (F(x)) − xp , where Mp (πp ) = argmax ( yp ) πp − vp ( yp ), yp ∈int(∆p ) While these dynamics do not satisfy positive correlation (PC), Theorem 5.2.13 showed that these dynamics do satisfy a perturbed analogue called virtual positive correlation: V p (x) ˜ 0 implies that V p (x) Fp (x) > 0 for all p ∈ P , ˜ where the virtual payoffs F : int(X) → Rn for the pair (F, v) are defined by 1 ˜ Fp (x) = Fp (x) − vp ( mp xp ). Accordingly, the Lyapunov function for a perturbed best response dynamic is not the potential function f , but a perturbed version thereof. ˙ Theorem 6.1.6. Let F be a potential game with potential function f , and let x = VF,v (x) be the perturbed best response dynamic for F generated by the admissible deterministic perturbations 215 v = (v1 , . . . , vp ). Define the perturbed potential function f˜ : int(X) → R by (6.4) f˜(x) = f (x) − 1 mp vp ( mp xp ). p∈P Then f˜ is a strict Lyapunov function for VF,v , and so Ω(VF,v ) = PE(F, v). Proof. That f˜ is a strict Lyapunov function for VF,v follows immediately from virtual positive correlation and the fact that ˙ f˜(x) ≡ p ˙ f˜(x) x = p ˜ Fp (x) VF,v (x). 1 Fp (x) − vp ( mp xp ) VF,v (x) = p∈P p∈P Since PE(F, v) ≡ RP(VF,v ), that Ω(VF,v ) = PE(F, v) follows from Theorem 6.B.4. Example 6.1.7. 123 Coordination rerevisited. Figure 6.1.1(vi) presents the phase diagram for the logit(.5) dynamic in 123 Coordination (6.2). Here the contour plot is the logit potential function 3 f˜(x) = 1 2 (x1 ) + 2(x2 ) + (3(x3 ) − .5 2 2 xi log xi . 2 i =1 Because the noise level is rather high, this phase diagram looks very different than the others—in particular, it includes only three rest points (two stable and one unstable) rather than seven. Nevertheless, every solution trajectory ascends the relevant Lyapunov function f˜, ultimately converging to a perturbed equilibrium. § 6.1.2 Gradient Systems for Potential Games Lemma 6.1.1 tells us that in potential games, any dynamic that satisfies condition (PC) must ascend the potential function f . We now turn to a more refined question: is there an evolutionary dynamic that ascends f in the fastest possible way? A first answer to this question is suggested by Figure 6.1.1(ii): in 123 Coordination, solution trajectories of the projection dynamic, (P) ˙ x = ΠTX(x) (F(x)), cross the level sets of the potential function orthogonally. In fact, we have 216 Observation 6.1.8. Let F : X → Rn be a potential game with potential function f : X → R. On int(X), the projection dynamic (P) is the gradient system for f : (6.5) ˙ x= f (x) on int(X). Surprisingly, there is an alternative answer to our question: it turns out that the replicator dynamic, (R) p p ˆp ˙ xi = xi Fi (x), also defines a gradient system for the potential function f ; however, this is only true after we apply a clever change of variable. 
In addition to its inherent interest, this fact demonstrates a close connection between the replicator and projection dynamics; another such connection will be made in Section 6.2.1 below. We restrict our analysis to the single population case. Define the set X = {x ∈ Rn : + 2 i∈S x i = 4} to be the portion of the raidus 2 sphere lying in the positive orthant. Our change of variable is given by the Akin transformation H : int(Rn ) → int(Rn ), where + + √ Hi (x) = 2 xi . Evidently, H is a diffeomorphism that maps the simplex X onto the set X . The transformation makes changes in component xi look large when xi itself is small. Theorem 6.1.9 tells us that the replicator dynamic is a gradient dynamic on int(X) after a change of variable that makes changes in the use of rare strategies look important relative to changes in the use of common ones. Intuitively, this reweighting accounts for the fact that under imitative dynamics, changes in the use of rare strategies are necessarily slow. Theorem 6.1.9. Let F : X → Rn be a potential game with potential function f : X → R. Suppose we transport the replicator dynamic for F from int(X) to int(X ) using the Akin transformation H. Then the resulting dynamic is the gradient dynamic for the transported potential function φ = f ◦ H −1 . Proof. We prove Theorem 6.1.9 in two steps: first, we derive the transported version of the replicator dynamic; then we derive the gradient system for the transported version of the potential function, and show that it is the same dynamic on X . The following notation will simplify our calculations: when y ∈ Rn and a ∈ R, we let [ ya ] ∈ Rn be the vector whose + ith component is ( yi )a . We can express the replicator dynamic on X as ˙ x = R(x) = diag(x) (F(x) − 1x F(x)) = diag (x) − xx F(x). 217 The transported version of this dynamic can be computed as ˙ x = R (x ) = DH(H−1 (x ))R(H−1 (x )). In words: given a state x ∈ X , we first find the corresponding state x = H−1 (x ) ∈ X and direction of motion R(x). Since R(x) represents a displacement from state x, we transport it to X by premultiplying it by DH(x), the derivative of H evaluated at x. Since x = H(x) = 2 [x1/2 ], the derivative of H at x is given by DH(x) = diag([x−1/2 ]) Using this fact, we derive a primitive expression for R (x ) in terms of x = H−1 (x ) = 1 [x 2 ]: 4 (6.6) ˙ x = R (x ) = DH(x)R(x) = diag([x−1/2 ])(diag(x) − xx )F(x) = diag([x1/2 ]) − [x1/2 ]x F(x). Now, we derive the gradient system on X generated by φ = f ◦ H−1 . To compute φ(x ), we need to define an extension of φ to all of Rn , compute its gradient, and then project the + result onto the tangent space of X at x . The easiest way to proceed is to let f˜ : int(Rn ) → R + 1 ˜ : int(Rn ) → R by φ = f˜◦ H−1 . ˜ be an arbitrary C extension of f , and to define the extension φ + Since X is a portion of a sphere centered at the origin, the tangent space of X at x is the subspace TX (x ) = {z ∈ Rn : x z = 0}. The orthogonal projection onto this set is represented by the n × n matrix PTX (x ) = I − 1 xx 1 4 xx = I − xx = I − [x1/2 ][x1/2 ] . Also, since Φ f˜(x) = f (x) = ΦF(x) by construction, it follows that some scalar-valued function c : X → R. Using the chain rule (Section 2.A.4), we compute that f˜(x) = F(x) + c(x)1 for ˜ φ(x ) = D( f˜ ◦ H−1 )(x ) = (D f (H−1 (x )) DH−1 (x )) = DH−1 (x ) f˜(x), while applying the chain rule to the identity H−1 (H(x)) ≡ x and then rearranging yields DH−1 (x ) = DH(x)−1 . 
218 Marshaling these observations, we find that the gradient system on X generated by φ is ˙ x = φ(x ) ˜ = PTX (x ) φ(x ) = PTX (x ) DH−1 (x ) f˜(x) = PTX (x ) (DH(x)−1 ) (F(x) + c(x)1) = I − [x1/2 ][x1/2 ] diag([x1/2 ]) (F(x) + c(x)1) = diag([x1/2 ]) − [x1/2 ]x (F(x) + c(x)1) = diag([x1/2 ]) − [x1/2 ]x F(x). This agrees with equation (6.6), completing the proof of the theorem. Example 6.1.10. 123 Coordination one last time. Figure 6.1.2 illustrates Theorem 6.1.9 by ˙ presenting phase diagrams of the transported replicator dynamic x = R (x ) for 123 Coordination (cf Example 6.1.3). These phase diagrams on X are drawn atop contour plots 1 of the transported potential function φ(x ) = ( f ◦ H−1 )(x ) = 32 ((x1 )4 + 2(x2 )4 + 3(x3 )4 ). According to Theorem 6.1.9, the solution trajectories of R should cross the level sets of φ orthogonally. Looking at Figure 6.1.2, we find that the crossings look orthogonal at the center of the figure, but not by the boundaries. This is an artifact of our drawing a portion of the sphere in R3 by projecting it orthogonally onto a sheet of paper. (For exactly the same reason, latitude and longitude lines in an orthographic projection of the Earth only appear to cross at right angles in the center of the projection, not on the left and right sides.) To check whether the crossings near a given state x ∈ X are truly orthogonal, we must minimize the distortion of angles near x by making x the origin of the projection—that is, the point where the sphere touches the sheet of paper. In the phase diagrams in Figure 6.1.2, we mark the projection origins with pink dots; evidently, the crossings are orthogonal near these points. § 6.2 Stable Games Recall that the population game F is stable if it satisfies (6.7) ( y − x) (F( y) − F(x)) ≤ 0 for all x, y ∈ X. 219 1 2 3 (i) origin = H( 1 , 1 , 1 ) 333 1 2 3 1 (ii) origin = H( 7 , 1 , 5 ) 77 ˙ Figure 6.1.2: The phase diagram of the transported replicator dynamic x = R (x ) for a coordination game. The pink dots represent the positions of the projection origins. 220 When F is C1 , this condition is equivalent to self-defeating externalities: (6.8) z DF(x) z ≤ 0 for all z ∈ TX and x ∈ X. The set of Nash equilibria of a stable game is convex, and most often a singleton. In general, uniqueness of equilibrium is not enough to ensure convergence of evolutionary dynamics. As we shall see in Chapter 8, there are many simple examples of games with a unique Nash equilibrium in which dynamics fail to converge. Nevertheless, we show in this section that under many evolutionary dynamics, the structure provided by self-defeating externalities is enough to ensure convergence. While fewer dynamics converge here than in potential games, convergence does obtain under all six fundamental dynamics. Our convergence proofs for stable games again rely on the construction of Lyapunov functions, but here we will need to construct a distinct Lyapunov function for each dynamic we consider. It will be natural to write these Lyapunov functions so that their values fall over time: thus, we say that a C1 function L is a (decreasing) strict Lyapunov ˙ ˙ function for the dynamic x = VF (x) if L(x) ≤ 0 for all x ∈ X, with equality only at rest points of VF . Apart from those for perturbed best response dynamics, the Lyapunov functions introduced below are also gap functions: they are continuous and nonnegative, with zeros precisely at the Nash equilibria of the underlying game F. 
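Condition (6.8) is easy to check numerically when F(x) = Ax is generated by random matching, since then DF(x) = A at every state. The sketch below (illustrative only) samples directions in TX and confirms that good Rock-Paper-Scissors (w = 3, l = 2, the game of Example 6.2.6 below) satisfies the condition strictly, while a bad RPS game (w < l) violates it.

```python
import numpy as np

def max_quadratic_on_TX(A, trials=10_000, seed=0):
    """Largest sampled value of z'Az over directions z in TX = {z : sum z = 0}.
    For linear F(x) = Ax, condition (6.8) holds iff this quantity is <= 0."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(trials, A.shape[0]))
    Z -= Z.mean(axis=1, keepdims=True)            # project samples onto TX
    return np.einsum('ti,ij,tj->t', Z, A, Z).max()

good_rps = np.array([[0., -2., 3.], [3., 0., -2.], [-2., 3., 0.]])   # w = 3 > l = 2
bad_rps  = np.array([[0., -3., 2.], [2., 0., -3.], [-3., 2., 0.]])   # w = 2 < l = 3
print("good RPS: max z'Az over TX samples =", max_quadratic_on_TX(good_rps))   # negative
print("bad  RPS: max z'Az over TX samples =", max_quadratic_on_TX(bad_rps))    # positive
```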
6.2.1 The Projection and Replicator Dynamics in Strictly Stable Games To obtain convergence results for the projection and replicator dynamics, we must restrict attention to strictly stable games: that is, games in which condition (6.7) holds strictly for all x, y ∈ X. The Lyapunov functions for these dynamics are based on explicit notions of “distance” from the the game’s unique Nash equilibrium x∗ . Theorem 6.2.1 shows that under the projection dynamic, x∗ is globally asymptotically stable: all solution trajectories converge to x∗ , and solutions that start near x∗ never move too far away from x∗ (see Appendix 6.A.2). ˙ Theorem 6.2.1. Let F be a strictly stable game with unique Nash equilibrium x∗ , and let x = VF (x) be the projection dynamic for F. Let the function Ex∗ : X → R+ , defined by 2 Ex∗ (x) = x − x∗ , represent squared Euclidean distance from x∗ . Then Ex∗ is a strict Lyapunov function for VF , and so x∗ is globally asymptotically stable under VF . 221 Proof. Since F is a strictly stable game, its unique Nash equilibrium x∗ is also its unique GESS: (x − x∗ ) F(x) < 0 for all x ∈ X − {x∗ }. This fact and the Moreau Decomposition Theorem imply that ˙ ˙ Ex∗ (x) = Ex∗ (x) x ∗) Π = 2(x − x TX(x) (F(x)) = 2(x − x∗ ) F(x) + 2(x∗ − x) ΠNX(x) (F(x)) ≤ 2(x∗ − x) Π (F(x)) NX(x) ≤ 0, where the penultimate inequality is strict whenever x NE(F) then follows from Corollary 6.B.7. x∗ . Global asymptotic stability of Exercise 6.2.2. Let F be a stable game, and let x∗ be a Nash equilibrium of F. (i) Show that x∗ is Lyapunov stable under (P). (ii) Suppose that F is a null stable game (i.e., that ( y − x) (F( y) − F(x)) = 0 for all x, y ∈ X). Show that if x∗ ∈ int(X), then Ex∗ defines a constant of motion for (P) on int(X): the value of Ex∗ is constant along interior portions of solution trajectories of (P). Exercise 6.2.3. Show that if F is a C1 stable game, then the squared speed of motion L(x) = |ΦF(x)|2 is a Lyapunov function for (P) on int(X). Show that if F is null stable, then L defines a constant of motion for (P) on int(X). (Notice that unlike that of Ex∗ , the definition of L does not directly incorporate the Nash equilibrium x∗ .) Under the replicator dynamic (R), as under any imitative dynamic, strategies that are initially unused remain unused for all time. Therefore, if state x places no mass on a strategy in the support of the Nash equilibrium x∗ , the solution to (R) starting from x cannot converge to x∗ . Thus, in stating our convergence result for the replicator dynamic, we need to be careful to specify the set of states from which convergence to equilibrium occurs. p With this motivation, let Sp (xp ) = {i ∈ Sp : xi > 0} denote the support of xp . Then p X yp = xp ∈ Xp : Sp ( yp ) ⊆ Sp (xp ) is the set of states in Xp whose supports contain the support p of yp , and X y = p∈P X yp is the set of states in X whose supports contain the support of y. 222 p p To construct our Lyapunov function, we introduce the function h yp : X yp → R, defined by p p h yp (xp ) p yi = log i∈Sp ( yp ) yi p. xi p If population p is of unit mass, so that yp and xp are probability distributions, h yp (xp ) is known as the relative entropy of yp given xp . ˙ Theorem 6.2.4. Let F be a strictly stable game with unique Nash equilibrium x∗ , and let x = VF (x) be the replicator dynamic for F. Define the function Hx∗ : Xx∗ → R+ by p Hx∗ (x) = h(x∗ )p (xp ). p∈P − Then Hx∗1 (0) = {x∗ }, and Hx∗ (x) approaches infinity whenever x approaches X − Xx∗ Moreover, ˙ Hx∗ (x) ≤ 0, with equality only when x = x∗ . 
Therefore, x∗ is globally asymptotically stable with respect to Xx∗ . Proof. (p = 1) To see that Hx∗ is a gap function, observe that by Jensen’s inequality, xi xi x∗ · ∗ = log xi ≤ log 1 = 0, −Hx∗ (x) = x∗ log ∗ ≤ log i i ∗ ∗ xi xi ∗) i∈S(x ) i∈S(x ) i∈S(x with equality if and only if x = x∗ . The second claim is immediate. For the third claim, note that since F is strictly stable, x∗ is a GESS, so for all x ∈ Xx∗ we have that ˙ ˙ Hx∗ (x) = Hx∗ (x) x x∗ i ˆ =− · xi Fi (x) xi ∗ i∈S(x ) ˆ x∗ Fi (x) i =− i∈S = −(x∗ ) (F(x) − 1 x F(x)) = −(x∗ − x) F(x) ≤ 0, where the inequality binds precisely when x = x∗ . The conclusions about stability then follow from Theorems 6.B.2 and 6.B.4. 223 Exercise 6.2.5. Let F be a stable game, and let x∗ be a Nash equilibrium of F. (i) Show that x∗ is Lyapunov stable under (R). (ii) Show that if F is a null stable game and x∗ ∈ int(X), then Hx∗ defines a constant of motion for (R) on int(X). 6.2.2 Integrable Target Dynamics Of our six fundamental dynamics, three of them—the BNN, best response, and logit dynamics, can be expressed as target dynamics of the form ˆ τp (πp , xp ) = τp (πp ), under which conditional switch rates only depend on on the vector of excess payoffs 1 ˆ πp = πp − mp 1(xp ) πp . This is obviously true of the BNN dynamic. For the other two cases, note that shifting all components of the payoff vector by the same constant has no effect on either exact or perturbed best responses: in particular, the definitions (5.2) and (5.12) p of the maximizer correspondence Mp : Rn ⇒ ∆p and the perturbed maximizer function p ˜ ˜ˆ ˜ ˆ Mp : Rn → ∆p satisfy Mp (πp ) = Mp (πp ) and Mp (πp ) = Mp (πp ). In this section, we show that these three dynamics converge to equilibrium in all stable games, as do all close enough relatives of these dynamics. Unlike in the context of potential games, monotonicity properties alone are not enough to ensure that a dynamic converges: in addition, integrability of the revision protocol plays a key role in establishing convergence results. To begin, we provide an example to illustrate that monotonicity properties alone do not ensure convergence of target dynamics in stable games. Example 6.2.6. Cycling in good RPS. Fix ε > 0, and let gε : R → R be a continuous decreasing function that equals 1 on (–∞, 0], equals ε2 on [ε, ∞), and is linear on [0, ε]. Then define the revision protocol τ for Rock-Paper-Scissors games by (6.9) τR (π) [πR ]+ gε (πS ) ˆ ˆ ˆ τ (π) = [π ] gε (π ) . P ˆ ˆP + ˆR τ (π) [π ] gε (π ) ˆS + ˆP Sˆ Under this protocol, the weight placed on a strategy is proportional to positive part of the strategy’s excess payoff, as in the protocol for the BNN dynamic; however, this weight is only of order ε2 if the strategy it beats in RPS has an excess payoff greater than ε. 224 R P S Figure 6.2.1: An excess payoff dynamic in good RPS (w = 3, l = 2). It is easy to verify that protocol (6.9) satisfies acuteness (4.20): ˆˆ ˆ+ ˆ ˆ+ ˆ ˆ+ ˆ τ(π) π = [πR ]2 gε (πS ) + [πP ]2 gε (πR ) + [πS ]2 gε (πP ), ˆ which is positive when π ∈ int(Rn ). Thus, the target dynamic induced by τ is an excess ∗ payoff dynamic. In Figure 6.2.1 we presents a phase diagram for this dynamic in the good RPS game 0 −2 3 xR 3 F(x) = Ax = 0 −2 xP . −2 3 0 xS Evidently, solutions from many initial conditions lead to a limit cycle. § To explain why cycling occurs in the example above, we review some ideas about the geometry of stable games and target dynamics. By Theorem 2.3.16, every Nash equilibrium x∗ of a stable game is a GNSS. 
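The two Lyapunov functions of Section 6.2.1 can be checked numerically in the good RPS game of Example 6.2.6, whose unique Nash equilibrium is x* = (1/3, 1/3, 1/3). The sketch below (illustrative only; it uses an Euler discretization and the interior form of the projection dynamic, ẋ = ΦF(x), which is valid here because the trajectories remain interior) confirms that the squared Euclidean distance decreases along projection trajectories and the relative entropy decreases along replicator trajectories.

```python
import numpy as np

A = np.array([[0., -2., 3.], [3., 0., -2.], [-2., 3., 0.]])   # good RPS: strictly stable
x_star = np.ones(3) / 3                                       # its unique Nash equilibrium
Phi = np.eye(3) - np.ones((3, 3)) / 3                         # orthogonal projection onto TX

E = lambda x: np.sum((x - x_star) ** 2)                       # squared distance (Thm 6.2.1)
H = lambda x: np.sum(x_star * np.log(x_star / x))             # relative entropy (Thm 6.2.4)

def run(step, lyap, x0, n=8000, dt=1e-3):
    """Euler-discretize a dynamic and report whether the Lyapunov value keeps falling."""
    x, vals = np.array(x0), []
    for _ in range(n):
        vals.append(lyap(x))
        x = x + dt * step(x)
    return x, all(b < a for a, b in zip(vals, vals[1:])), vals[0], vals[-1]

projection = lambda x: Phi @ (A @ x)          # (P) at interior states
replicator = lambda x: x * (A @ x - x @ (A @ x))

for name, step, lyap in [("projection / E_x*", projection, E),
                         ("replicator / H_x*", replicator, H)]:
    x_T, mono, v0, vT = run(step, lyap, [0.5, 0.3, 0.2])
    print(f"{name}: decreasing along path: {mono};  value {v0:.4f} -> {vT:.6f};"
          f"  final state {np.round(x_T, 4)}")
```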
Geometrically, this means that at every nonequilibrium state x, the projected payoff vector ΦF(x) forms an acute or right angle with the line segment from x back to x∗ (Figures 2.3.3, 2.3.5, and 2.3.6). Meanwhile, our monotonicity condition for dynamics, positive correlation (PC), requires that away from equilibrium, each vector of motion VF (x) forms an acute angle with the projected payoff vector ΦF(x) (Figures 4.2.1 and 4.2.2). Combining these observations, we conclude that if the law of 225 ˙ motion x = VF (x) tends to deviate from the projected payoffs ΦF in “outward” directions— that is, in directions heading away from equilibrium—then cycling will occur (compare Figure 2.3.6 with Figure 6.2.1). On the other hand, if the deviations of VF from ΦF tend to be “inward”, then solutions should converge to equilibrium. By this logic, we should be able to guarantee convergence of target dynamics in stable games by ensuring that the deviations of VF from ΦF are toward the equilibrium, at least in some average sense. To accomplish this, we introduce an additional condition for revision protocols: integrability. (6.10) p There exists a C1 function γp : Rn → R such that τp ≡ γp . We call the functions γp introduced in this condition revision potentials. To give this condition a behavioral interpretation, it is useful to compare it to separability: (6.11) p p ˆ ˆ τi (πp ) is independent of π−i . The latter condition is stronger than the former: if τp satisfies (6.11), then it satisfies (6.10) with p (6.12) ˆ πi ˆ γ (π ) = p p i∈Sp 0 p τi (s) ds. In Example 6.2.6, the protocol (6.9) that generated cycling has the following noteworthy feature: the weights agents place on each strategy depend systematically on the payoffs of the next strategy in the best response cycle. Building on this motivation, one can obtain a game-theoretic interpretation of integrability. Roughly speaking, integrability (6.10) is equivalent to a requirement that in expectation, learning the weight placed on strategy j does not convey information about other strategies’ excess payoffs. It thus generalizes separability (6.11), which requires that learning the weight placed on strategy j conveys no information at all about other strategies’ excess payoffs (see the Notes). Before turning to our convergence theorems, we address a missing step in the motivating argument above: how does integrability ensure that the law of motion VF tends to deviate from the projected payoffs ΦF in the direction of equilibrium? To make this link, let us recall a characterization of integrablility from Section 2.A.9: the map τ : Rn → Rn is integrable if and only if its line integral over any piecewise smooth closed curve C ⊂ Rn 226 evaluates to zero: ˆ ˆ τ(π) · dπ = 0. (6.13) C Example 6.2.7. Let the population game F be generated by random matching in standard RPS: 0 −1 1 xR 1 F(x) = Ax = 0 −1 xP . −1 1 0 xS 1 The unique Nash equilibrium of F is the GNSS x∗ = ( 1 , 3 , 1 ). Game F has the convenient 3 3 property that at each state x ∈ X, the payoff vector F(x), the projected payoff vector ΦF(x), ˆ and the excess payoff vector F(x) are all the same, a fact that will simplify the notation in the argument to follow. Since F is null stable, we know that at each state x x∗ , the payoff vector F(x) is orthogonal to the vector x∗ − x. In Figure 2.3.6, these payoff vectors point counterclockwise relative to x∗ . 
Since positive correlation (PC) requires that the direction of motion VF (x) form an acute angle with F(x), dynamics satisfying (PC) also travel counterclockwise around the equilibrium. To address whether the deviations of VF from F tend to be inward or outward, let C ⊂ X 1 be a circle of radius c ∈ (0, √6 ] centered at the equilibrium x∗ . This circle is parameterized by the function ξ : [0, 2π] → X, where (6.14) √ −2 sin α c 3 cos α + sin α + x∗ . ξα = √ √ 6 − 3 cos α + sin α Here α is the counterclockwise angle between the vector ξα − x∗ and a rightward horizontal vector (see Figure 6.2.2). Since state ξα lies on the circle C, the vector x∗ − ξα can be drawn as a radius of C; thus, the payoff vector πα ≡ F(ξα ), which is orthogonal to x∗ − ξα , must be tangent to C at ξα , as shown in Figure 6.2.2. This observation is easy to verify analytically: (6.15) √ −2 3 cos α √ c −3 sin α + √3 cos α = 3 d ξ . πα = F(ξα ) = √ dα α √ 6 3 sin α + 3 cos α 227 R L⊥ (ξα ) C +α F(ξα ) = √3 dd ξα α σ (F(ξα )) σ – x* x* V (ξα ) F ξα– x* L(ξα ) ξα P S Figure 6.2.2: Integrability and inward motion of target dynamics in standard RPS. If we differentiate both sides of identity (6.15) with respect to the angle α, and note that d2 ξ = −(ξα − x∗ ), we can link the rate of change of the payoff vector πα = F(ξα ) to the (dα)2 α displacement of state ξα from x∗ : (6.16) d π dα α = √ √ d2 3 (dα)2 ξα = − 3(ξα − x∗ ). Now introduce an acute, integrable revision protocol τ. By combining integrability condition (6.13) with equation (6.16), we obtain 2π (6.17) 0= τ(π) · dπ ≡ C τ(πα ) 0 d π dα α 2π √ dα = − 3 τ(πα ) ξα − x∗ dα. 0 τ (π) Let us write λ(π) = i∈S τi (π) and σi (π) = λi(π) to express the dynamic in target form. Then because ξα − x∗ ∈ TX is orthogonal to x∗ = 1 1, we can conclude from equation (6.17) that 3 2π (6.18) λ(F(ξα )) σ(F(ξα )) − x∗ ξα − x∗ dα = 0. 0 Equation (6.18) is a form of the requirement described at the start of this section: it 228 asks that at states on the circle C, the vector of motion under the target dynamic (6.19) ˙ x = VF (x) = λ(F(x)) σ(F(x)) − x . typically deviates from the payoff vector F(x) in an inward direction—that is, in the direction of the equilibrium x∗ . To reach this interpretation of equation (6.18), note first that if the target state σ(F(ξα )) lies on or even near line L⊥ (ξα ), then motion from ξα toward σ(F(ξα )) is initially inward, as shown in Figure 6.2.2. (Of course, target state σ(F(ξα )) lies above L(ξα ) by virtue of positive correlation (PC).) Now, the integrand in (6.18) contains the inner product of the vectors σ(F(ξα )) − x∗ and ξα − x∗ . This inner product is zero precisely when then the two vectors are orthogonal, or, equivalently, when target state σ(F(ξα )) lies on L⊥ (ξα ). While equation (6.18) does not require the two vectors to be orthogonal, it asks that this be true on average, where the average is taken over states ξα ∈ C, and weighted by the rates λ(F(ξα )) at which ξα approaches σ(F(ξα )). Thus, in the presence of acuteness, integrability implies that on average, the dynamic (6.19) tends to point inward, toward the equilibrium x∗ . § The foregoing arguments suggest that together, monotonicity and integrability are enough to ensure global convergence of target dynamics in stable games. We now develop this intuition into formal results by constructing suitable Lyapunov functions. 
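Both of these claims, the orthogonality of F(ξα) to x* − ξα and identity (6.15), can be confirmed by finite differences; the following sketch (illustrative only, with radius c = 0.2) does so at a few sample angles.

```python
import numpy as np

A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])   # standard RPS
x_star = np.ones(3) / 3
c = 0.2                                                       # radius of the circle C

def xi(alpha):                                                # parameterization (6.14)
    return (c / np.sqrt(6)) * np.array([-2 * np.sin(alpha),
                                        np.sqrt(3) * np.cos(alpha) + np.sin(alpha),
                                        -np.sqrt(3) * np.cos(alpha) + np.sin(alpha)]) + x_star

h = 1e-6
for alpha in [0.3, 1.7, 4.0]:
    x = xi(alpha)
    payoff = A @ x                                            # F(xi_alpha)
    dxi = (xi(alpha + h) - xi(alpha - h)) / (2 * h)           # d(xi_alpha)/d(alpha)
    print(f"alpha = {alpha}:",
          "orthogonal to x* - xi:", np.isclose(payoff @ (x_star - x), 0.0),
          "  equals sqrt(3) d(xi)/d(alpha), eq. (6.15):", np.allclose(payoff, np.sqrt(3) * dxi))
```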
As a point of comparison, recall from Section 6.1.1 that in the case of dynamics for potential games, monotonicity conditions alone are sufficient to prove global convergence results: as the game’s potential function serves as a Lyapunov function for any dynamic satisfying positive correlation (PC). Unlike potential games, stable games do not come equipped with candidate Lyapunov functions. But if the revision protocol agents follow is integrable, then the revision potential of this protocol provides a building block for constructing a suitable Lyapunov function. Evidently, this Lyapunov function will vary with the dynamic under study, even when the game under consideration is fixed. Our first result concerns integrable excess payoff dynamics: that is, target dynamics whose protocols τp are Lipschitz continuous, acute (4.20), and integrable (6.10). The prototype p p ˆ ˆ for this class is the BNN dynamic: its protocol τi (πp ) = [πi ]+ is not only acute and p ˆ+ ˆ integrable, but also separable (6.11), and so admits potential function γp (πp ) = 1 i∈Sp [πi ]2 2 (cf equation (6.12)). ˙ Theorem 6.2.8. Let F be a C1 stable game, and let x = VF (x) be the integrable excess payoff p dynamic for F based on revision protocols τ with revision potentials γp . Define the C1 function 229 Γ : X → R by Γ(x) = ˆ mp γp (Fp (x)). p∈P Then Γ is a strict Lyapunov function for VF , and NE(F) is globally attracting. In addition, if F admits a unique Nash equilibrium, or if the protocols τp also satisfy separability (6.11), then we can choose Γ to be nonnegative with Γ−1 (0) = NE(F), and so NE(F) is globally asymptotically stable. For future reference, observe that the value of the Lyapunov function Γ at state x is the (mp -weighted) sum of the values of the revision potentials γp evaluated at the excess payoff ˆ vectors Fp (x). The conditions introduced in the last sentence of the theorem are needed to ensure that the Lyapunov function Γ is constant on the set NE(F) of Nash equilibria. Were this not the case, the set NE(F) could be globally attracting without being Lyapunov stable—see Example 6.B.3. The proof of Theorem 6.2.8 and those to come make heavy use of multivariate product and chain rules, which we review in Section 2.A.4. ˆ Proof of Theorem 6.2.8. (p = 1) Recall that the excess payoff vector F(x) is equal to ¯ ¯ F(x) − 1F(x), where F(x) = x F(x) is the population’s average payoff. By the product rule, ¯ the derivative of F is ¯ DF(x) = x DF(x) + F(x) . ˆ ¯ Therefore, the derivative matrix for the excess payoff function F(x) = F(x) − 1F(x) is ˆ ¯ DF(x) = D(F(x) − 1 F(x)) ¯ = DF(x) − 1 DF(x) (6.20) = DF(x) − 1(x DF(x) + F(x) ). Using (6.20) and the chain rule, we can compute the time derivative of Γ: ˙ ˙ Γ(x) = Γ(x) x ˆ ˆ˙ = γ(F(x)) DF(x)x ˆ ˙ = τ(F(x)) DF(x) − 1 (x DF(x) + F(x) ) x ˆ ˆ ˆ ˙ ˙ = τ(F(x)) − τ(F(x)) 1x DF(x)x − τ(F(x)) 1 F(x) x ˆ ˙ ˙ ˙ = x DF(x)x − (τ(F(x)) 1)(F(x) x) 230 ≤ 0, where the inequality follows from the facts that F is stable and VF satisfies positive correlation (PC). We now show that this inequality binds precisely on the set NE(F). To begin, note that if ˆ ˙ ˙ x ∈ RP(VF ) (i.e., if x = 0), then Γ(x) = 0. On the other hand, if x RP(VF ), then τ(F(x)) 1 > 0 ˙ ˙ and F(x) x > 0 (by condition (PC)), implying that Γ(x) < 0. Since NE(F) = RP(VF ), the claim is proved. That NE(F) is globally attracting then follows from Theorem 6.B.4. 
If F admits a unique Nash equilibrium x∗ , then the foregoing argument implies that x∗ is the unique minimizer of Γ: since the value of Γ is nonincreasing over time, a solution starting from a state x with Γ(x) < Γ(x∗ ) could not converge to x∗ , contradicting that x∗ is globally attracting. Thus, after normalizing by an additive constant, we find that Γ is nonnegative with Γ−1 (0) = {x∗ }, so the global asymptotic stability of x∗ follows from Corollary 6.B.7. If instead τ satisfies separability (6.11), we can define the revision potential γ as in equation (6.12). It then follows from Exercise 4.5.7 that Γ is nonnegative, with Γ(x) = 0 if ˆ and only if F(x) ∈ bd(Rn ). Thus, Lemma 4.5.4 implies that Γ(x) = 0 if and only if x ∈ NE(F), ∗ and so the global asymptotic stability of NE(F) again follows from Corollary 6.B.7. Next we consider the best response dynamic, which we here express by applying the maximizer correspondence ˆ ˆ Mp (πp ) = argmax ( yp ) πp yp ∈∆p to the vector of excess payoffs, yielding the exact target dynamic (BR) ˆ ˙ xp ∈ mp Mp (Fp (x)) − xp . Following the previous logic, we can assess the possibilities for convergence in stable games by checking monotonicity and integrability. Monotonicity was established in Theorem 5.1.8, which showed that (BR) satisfies an analogue of positive correlation (PC) appropriate for differential inclusions. For integrability, one can argue that the protocol Mp , despite being multivalued, is integrable in a suitably defined sense, with its “potential function” being given by the maximum function p µp (πp ) = max( yp ) πp = max πi . p p p y ∈∆ i∈S ˆ Note that if the payoff vector πp , and hence the excess payoff vector πp , have a unique 231 p ˆ maximizing component i ∈ Sp , then the gradient of µp at πp is the standard basis vector ei . ˆ But this vector corresponds to the unique mixed best response to πp , and so p ˆ ˆ µp (πp ) = ei = Mp (πp ). One can account for multiple optimal components using a broader notion of differentiaˆ ˆ ˆ tion: for all πp ∈ Rn , Mp (πp ) is the subdifferential of the convex function µp at πp (see the Notes). Having verified monotonicity and integrability, we again construct our candidate Lyapunov function by plugging the excess payoff vectors into the revision potentials µp . The resulting function G is very simple: it measures the difference between the payoffs agents could obtain by choosing optimal strategies and their actual aggregate payoffs. ˙ Theorem 6.2.9. Let F be a C1 stable game, and let x ∈ VF (x) be the best response dynamic for F. Define the Lipschitz continuous function G : X → R+ by ˆ G(x) = max ( y − x) F(x) = max Fi (x). y∈X i∈S Then G−1 (0) = NE(F). Moreover, if {xt }t≥0 is a solution to VF , then for almost all t ≥ 0 we have ˙ that G(xt ) ≤ −G(xt ), and so NE(F) is globally asymptotically stable under VF . Proof. (p = 1) That G−1 (0) = NE(F) follows from Lemma 4.5.4. To prove the second claim, let {xt }t≥0 be a solution to VF , and let S∗ (t) ⊆ S be the set of pure best responses to state ˆ xt . Since {xt }t≥0 is Lipschitz continuous, the map t → Fi (xt ) is also Lipschitz continuous ˆ for each strategy i ∈ S. Thus, since G(x) = max y∈X ( y − x) F(x) = maxi∈S Fi (x), it follows from Danskin’s Envelope Theorem (see the Notes) that the map t → G(xt ) is Lipschitz continuous, and that at almost all t ∈ [0, ∞), (6.21) ˙ G(xt ) ≡ d dt ˆ max Fi (xt ) = i∈S for all i∗ ∈ S∗ (t). 
dˆ F ∗ (xt ) dt i ˙ Applying equation (6.20), we find that for t satisfying equation (6.21) and at which xt exists, we have that ˙ G(xt ) = dˆ F ∗ (xt ) dt i ˙ = ei∗ DF(xt ) − xt DF(xt ) − F(xt ) xt ˙ ˙ = ( y∗ − x ) DF(x )x − F(x ) x t t t t t ˙ ˙ ˙ = xt DF(xt )xt − F(xt ) xt ˙ ≤ −F(xt ) xt 232 for all i∗ ∈ S∗ (t) for all i∗ ∈ S∗ (t) for all y∗ ∈ argmax y∈∆ ˆ y F(xt ) = − max F(xt ) ( y − xt ) y∈X = −G(xt ), where the inequality follows from the fact that F is a stable game. (Note that the the equality of the third to last and last expressions is also implied by Theorem 5.1.8.) The global asymptotic stability of NE(F) then follows from Theorems 6.B.2 and 6.B.6. Finally, we consider convergence under perturbed best response dynamics. These are exact target dynamics of the form ˜ˆ ˙ xp = mp Mp (Fp (x)) − xp ; here, the target protocol is the perturbed maximizer function ˜ˆ ˆ Mp (πp ) = argmax ( yp ) πp − vp ( yp ), yp ∈int(∆p ) where vp : int(∆p ) → R is an admissible deterministic perturbation (see Section 5.2.2). Once again, we verify the two conditions that underlie convergence. Theorem 5.2.13 showed that all perturbed best response dynamics satisfy virtual positive correlation (5.17), establishing the required monotonicity. As for integrability, Observation 5.C.3 showed ˜ that the protocol Mp is integrable; its revision potential, (6.22) ˜ µp (πp ) = pmax p ( yp ) πp − vp ( yp ), y ∈int(∆ ) is the perturbed maximum function induced by vp . Now, mimicking Theorem 6.2.8, we ˜ construct our Lyapunov function by composing the revision potentials µp with the excess p ˆ payoff functions F . ˙ Theorem 6.2.10. Let F be a C1 stable game, and let x = VF,v (x) be the perturbed best response dynamic for F generated by the admissible deterministic perturbations v. Define the function ˜ G : int(X) → R+ by ˜ G(x) = 1 ˜ˆ mp µp (Fp (x)) + vp ( mp xp ) , p∈P ˜ Then G−1 (0) = PE(F, v), and this set is a singleton. Moreover, G is a strict Lyapunov function for VF,v , and so PE(F, v) is globally asymptotically stable under VF,v . ˜ Proof. (p = 1) As in Section 5.2, let F(x) = F(x) − v(x) be the virtual payoff function 233 generated by (F, v). Then ˜ ˜ x ∈ PE(F, v) ⇔ ΦF(x) = 0 ⇔ x = argmax y F(x) − v( y) ⇔ G(x) = 0. y∈int(∆) ˜ To prove that G is a strict Lyapunov function, recall from Observation 5.C.3 that the ˜ perturbed maximum function µ defined in equation (6.22) is a potential function for the ˜ ˜ ˜ perturbed maximizer function M: that is, µ ≡ M. Therefore, since F is stable, virtual positive correlation (5.17) implies that ˙ ˜ G(x) = d dt ˜ˆ µ(F(x)) + v(x) = d dt ˜ µ(F(x)) − (x F(x) − v(x)) ˜ ˙ ˙˙ ˙ = M(F(x)) DF(x) x − (x DF(x) x + x F(x) − x v(x)) ˜ ˙˙ = (M(F(x)) − x) DF(x) x − x (F(x) − v(x)) ˙ ˙ ˙˜ = x DF(x) x − x F(x) ≤ 0, with equality if and only if x is a rest point. But RP(VF,v ) = PE(F, v) by definition, so Corollary 6.B.7 implies that PE(F, v) is globally asymptotically stable. Finally, we prove that PE(F, v) is a singleton. Let ˜ φx,h (t) = h F(x + t h) for all x ∈ X, h ∈ TX − {0}, and t ∈ R such that x + th ∈ int(X). Since F is stable and D2 v(x + th) is positive definite with respect to TX × TX, we have that (6.23) ˜ φx,h (t) = h DF(x + t h) h = h DF(x + t h) h − h D2 v((x + t h)) h < 0, and so φx,h (t) is decreasing in t. Moreover, (6.24) ˜ x ∈ PE(F, v) ⇔ F(x) is a constant vector ⇔ φx,h (0) = 0 for all h ∈ TX − {0}. Now let x ∈ PE(F, v) and y ∈ X − {x}. Then y = x + t y h y for some t y > 0 and h y ∈ TX − {0}. 
Statements (6.23) and (6.24) imply that ˜ ˜ φ y,hy (0) = h y F( y) = h y F(x + t y h y ) = φx,hy (t y ) < φx,hy (0) = 0. Therefore, statement (6.24) implies that y PE(F, v), and hence that PE(F, v) = {x}. 234 6.2.3 Impartial Pairwise Comparison Dynamics In Section 4.6, we defined pairwise comparison dynamics by considering Lipschitz continuous revision protocols ρp that only condition on payoffs and that are sign preserving: p p p sgn(ρi j (πp )) = sgn([π j − πi ]+ ) for all i, j ∈ Sp and p ∈ P . To obtain a general convergence result for stable games, we require an additional condition called impartiality: (6.25) p p p p p ρi j (πp ) = φ j (π j − πi ) for some functions φ j : R → R+ . Combining this restriction with mean dynamic equation (M), we see that impartial pairwise comparison dynamics take the form p p ˙ xi = p p p p p x j φi (Fi (x) − F j (x)) − xi j∈Sp p p φ j (F j (x) − Fi (x)). j∈Sp p p Under impartiality (6.25), the function of the payoff difference π j − πi that describes the conditional switch rate from i to j does not depend on an agent’s current strategy i. This restriction introduces at least a superficial connection with the target dynamics studied in Section 6.2.2, as both restrict the dependence of agents’ decisions on their current choices of strategy. Theorem 6.2.11 shows that together, sign preservation and impartiality ensure global convergence to Nash equilibrium in stable games. ˙ Theorem 6.2.11. Let F be a C1 stable game, and let x = VF (x) be an impartial pairwise comparison dynamic for F. Define the Lipschitz continuous function Ψ : X → R+ by p Ψ(x) = p p p p∈P i∈Sp j∈Sp d p xi ψ j (F j (x) − Fi (x)), where ψk (d) = 0 p φk (s) ds p ˙ is the definite integral of φk . Then Ψ−1 (0) = NE(F). Moreover, Ψ(x) ≤ 0 for all x ∈ X, with equality if and only if x ∈ NE(F), and so NE(F) is globally asymptotically stable. To understand the role played by impartiality (6.25), recall the general formula for the mean dynamic: (M) pp p ˙ xi = p p ρi j (Fp (x), xp ). x j ρ ji (Fp (x), xp ) − xi j∈Sp j∈Sp 235 According to the second term of this expression, the rate of outflow from strategy i is p p p xi k∈Sp ρik ; thus, the percentage rate of outflow from i, k∈Sp ρik , varies with i. It follows that strategies with high payoffs can nevertheless have high percentage outflow rates: even if p p p p πi > π j , one can still have ρik > ρ jk for k i, j. Having good strategies lose players more quickly than bad strategies is an obvious impediment to convergence to Nash equilibrium. Impartiality (6.25) places controls on these percentage outflow rates. If the conditional p switch rates φ j are monotone in payoffs, then condition (6.25) ensures that better strategies have lower percentage outflow rates. If the conditional switch rates are not monotone, but merely sign-preserving, condition (6.25) still implies that the integrated conditional switch p rates ψk are ordered by payoffs. According to the analysis below, this control is enough to ensure convergence of pairwise comparison dynamics to Nash equilibrium in stable games. Proof. (p = 1) The first claim is proved as follows: Ψ(x) = 0 ⇔ [xi = 0 or ψ j (F j (x) − Fi (x)) = 0] for all i, j ∈ S ⇔ [xi = 0 or Fi (x) ≥ F j (x)] for all i, j ∈ S ⇔ [xi = 0 or Fi (x) ≥ max j∈S F j (x)] for all i, j ∈ S ⇔ x ∈ NE(F). 
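As a concrete instance of Theorem 6.2.11, the sketch below (an added illustration; the initial condition and step size are arbitrary) uses the Smith protocol φⱼ(d) = [d]₊, an impartial, sign-preserving choice, in standard RPS, and checks that the function Ψ with ψₖ(d) = ½[d]₊² decreases along an Euler approximation of the resulting trajectory.

```python
import numpy as np

A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]])             # standard RPS, a (null) stable game

def payoff_gaps(x):
    Fx = A @ x
    return np.maximum(Fx[:, None] - Fx[None, :], 0)   # gaps[i, j] = [F_i(x) - F_j(x)]_+

def smith(x):
    """Smith dynamic: conditional switch rates phi_j(d) = [d]_+ (impartial, sign preserving)."""
    gaps = payoff_gaps(x)
    return gaps @ x - x * gaps.sum(axis=0)

def Psi(x):
    """Lyapunov function of Theorem 6.2.11 with psi_k(d) = [d]_+^2 / 2."""
    gaps = payoff_gaps(x)
    return 0.5 * x @ (gaps ** 2).sum(axis=0)

x = np.array([0.7, 0.2, 0.1])
dt = 0.01
trace = []
for k in range(4000):
    if k % 800 == 0:
        trace.append(Psi(x))
    x = x + dt * smith(x)              # forward Euler step

print("Psi along the trajectory:", np.round(trace, 6))
print("final state:", np.round(x, 4))  # approaches (1/3, 1/3, 1/3)
assert trace[-1] < trace[0]
```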
To begin the proof of the second claim, we compute the partial derivatives of Ψ: ∂Ψ (x) = ∂x l xi ρi j i∈S j∈S ∂F j ∂Fi (x) − (x) + ∂xl ∂xl xi ρi j − x j ρ ji = i∈S j∈S = ˙ xj j∈S ∂F j (x) + ∂x l ∂F j (x) + ∂xl ψk (Fk (x) − Fl (x)) k∈S ψk (Fk (x) − Fl (x)) k ∈S ψk (Fk (x) − Fl (x)). k∈S Using this expression, we find the rate of change of Ψ over time along solutions to (M): ˙ ˙ Ψ(x) = Ψ(x) x ˙ ˙ = x DF(x)x + ψk (Fk − Fi ) ˙ xi i∈S k∈S ˙ ˙ = x DF(x)x + x j ρ ji − xi ρi j i∈S j∈S ψk (Fk − Fi ) k∈S 236 ˙ ˙ = x DF(x)x + i∈S j∈S x ρ j ji ψk (Fk − Fi ) − ψk Fk − F j k∈S . To evaluate the summation, first observe that if Fi (x) > F j (x), then ρ ji (F(x)) ≡ φi (Fi (x) − F j (x)) > 0 and Fk (x) − Fi (x) < Fk (x) − F j (x); since each ψk is nondecreasing, it follows that ψk (Fk − Fi ) − ψk (Fk − F j ) ≤ 0. In fact, when k = i, the comparison between payoff differences becomes 0 < Fi (x) − F j (x); since each ψi is increasing on [0, ∞), it follows that ψi (0) − ψi (Fi − F j ) < 0. We therefore conclude that if Fi (x) > F j (x), then ρ ji (F(x)) > 0 and k∈S ψk (Fk − Fi ) − ψk Fk − F j < 0. On the other hand, if F j (x) ≥ Fi (x), we have ˙ ˙ immediately that ρ ji (F(x)) = 0. And of course, x DF(x)x ≤ 0 since F is stable. ˙ Marshaling these facts, we find that Ψ(x) ≤ 0, and that (6.26) ˙ Ψ(x) = 0 if and only if x j ρ ji (F(x)) = 0 for all i, j ∈ S. Lemma 4.6.5 shows that the second condition in (6.26) defines the set RP(VF ), which is equal to NE(F) by Theorem 4.6.3; this proves the second claim. Finally, the global asymptotic stability of NE(F) follows from Corollary 6.B.7. Exercise 6.2.12. Construct a pairwise comparison dynamic that generates cycling in the good RPS game from Example 6.2.6. 6.2.4 Summary In Table 6.1, we summarize the results in this section by presenting the Lyapunov functions for single-population stable games for the six fundamental evolutionary dynamics. The Lyapunov functions divide into three classes: those based on an explicit notion of “distance” from equilibrium, those based on revision potentials for target protocols, and the Lyapunov function for the Smith dynamic, which stands alone. Example 6.2.13. Matching Pennies. In Figure 6.2.3, we present phase diagrams of the six fundamental dynamics in two-population Matching Pennies: 1 FH (x) 0 0 1 −1 x1 x2 − x2 H h t 1 F (x) 0 T T t 0 −1 1 x1 x2 − x2 h = = . 2 2 1 F (x) −1 1 0 0 xh xT − x1 h H 2 2 1 Ft (x) 1 −1 0 0 xt xH − x1 T Each phase diagram is drawn atop a contour plot of the relevant Lyapunov function. Since Matching Pennies is a zero-sum game, F is null stable; thus, the Lyapunov functions 237 H h t H h T T (i) replicator H h (ii) projection t H T h t T (iii) Brown-von Neumann-Nash H t h (iv) Smith t H T h t T (v) best response (vi) logit(.2) Figure 6.2.3: Six basic dynamics in Matching Pennies. The contour plots are the corresponding Lyapunov functions. 238 Dynamic Formula Lyapunov function projection ˙ x = ΠTX(x) (F(x)) Ex∗ (x) = |x − x∗ |2 replicator ˆ ˙ xi = xi Fi (x) best response ˆ ˙ x ∈ M(F(x)) − x ˆ G(x) = µ(F(x)) logit ˜ˆ ˙ x = M(F(x)) − x ˜ ˜ˆ G(x) = µ(F(x)) + v(x) ˆ ˙ xi = [Fi (x)]+ − xi BNN Smith ˙ xi = j∈S Hx∗ (x) = ˆ j∈S [F j (x)]+ x j [Fi (x) − F j (x)]+ − xi [F j (x) − Fi (x)]+ j∈S Γ(x) = Ψ(x) = 1 2 i∈S(x∗ ) 1 2 i∈S j∈S x∗ log i x∗ i xi 2 ˆ i∈S [Fi (x)]+ xi [F j (x) − Fi (x)]2 + Table 6.1: Lyapunov functions for the six fundamental dynamics in stable games. for the replicator and projection dynamics define constants of motion for these dynamics, with solution trajectories cycling along level curves. 
In the remaining cases, all solutions 11 1 converge to the unique Nash equilibrium, x∗ = (( 2 , 2 ), ( 1 , 2 )). § 2 6.3 Supermodular Games In a supermodular game, higher choices by one’s opponents make one’s own higher strategies look relatively more desirable. In Section 2.4, we used this property to show that the best response correspondences of supermodular games are monotone in the stochastic dominance order; this implies in turn that these games admit minimal and maximal Nash equilibria. Given this monotone structure on best response correspondences, it is natural to look for convergence results for supermodular games under the best response dynamic (BR). In Section 6.3.1, we use elementary methods to establish a global convergence result for (BR) under some strong additional assumptions on the underlying game: in particular, it must be derived from a two-player normal form game that satisfies both supermodularity and “diminishing returns” conditions. To prove more general convergence results, we appeal to the theory of cooperative differential equations. These are smooth differential equations under which increasing the value any component of the state variable increases the growth rates of all other components. Under some mild regularity conditions, almost all solutions of these equations converge to rest points. 239 Because of the smoothness requirement, these techniques cannot be applied to the best response dynamic itself. Happily, the needed monotonicity carries over from exact best responses to perturbed best responses, although only those that can be generated from stochastic perturbations of payoffs. In Section 6.3.2, we use this idea to prove almost global convergence of stochastically perturbed best response dynamics in supermodular games. 6.3.1 The Best Response Dynamic in Two-Player Normal Form Games Let U = (U1 , U2 ) be a two-player normal form game, and let F be the population game obtained when members of two populations are randomly matched to play U (cf Example 1.2.2). Then the best response dynamic (BR) for F takes the form (BR) ˙ x1 ∈ B1 (x) − x1 = M1 (F1 (x)) − x1 = M1 (U1 x2 ) − x1 , ˙ x2 ∈ B2 (x) − x2 = M2 (F2 (x)) − x2 = M2 ((U2 ) x1 ) − x2 . Our convergence result for supermodular games concerns simple solutions of this dynamic. A solution {xt }t≥0 of (BR) is simple if the set of times at which it is not differentiable has no accumulation point, and if at other times, the target states Bp (xt ) are pure (i.e., vertices of Xp ). Exercise 6.3.1. (i) Given an example of a Nash equilibrium x∗ of a 2 × 2 game such that no solution to (BR) starting from x∗ is simple. (ii) Show that there exists a simple solution to (BR) from every initial condition in game ˆ ˆ U = (U1 , U2 ) if for all nonempty sets S1 ⊆ S1 and S2 ⊆ S2 , the game in which players ˆ ˆ are restricted to strategies in S1 and S2 admits a pure Nash equilibrium. (Theorem 2.4.12 implies that U has this property if it is supermodular.) If {xt }t≥0 is a simple solution trajectory of (BR), we can list the sequence of times {tk } at which the solution is not differentiable (i.e., at which the target state for at least one population changes). During each open interval of times Ik = (tk−1 , tk ), the pure strategies ik ∈ S1 and jk ∈ S2 selected by revising agents are fixed. We call ik and jk the interval k selections for populations 1 and 2. 
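A minimal simulation of (BR) may help fix ideas. The sketch below (an added illustration; the 2×2 coordination game is a hypothetical example chosen to be strictly supermodular and nondegenerate) integrates the two-population best response dynamic with Euler steps. Since each population has a unique pure best response along the computed path, the trajectory behaves like a simple solution, and it converges to a pure Nash equilibrium, in line with the convergence result established below.

```python
import numpy as np

# Two-player pure coordination game (an illustrative choice, not taken from the text).
U1 = np.array([[1.0, 0.0],
               [0.0, 2.0]])      # row player's payoffs
U2 = np.array([[1.0, 0.0],
               [0.0, 2.0]])      # column player's payoffs; U2[i, j]: player 1 plays i, player 2 plays j

def pure_target(payoffs):
    """Vertex placing all mass on a maximizing strategy (ties broken by lowest index)."""
    e = np.zeros(len(payoffs))
    e[np.argmax(payoffs)] = 1.0
    return e

x1 = np.array([0.8, 0.2])
x2 = np.array([0.7, 0.3])
dt = 0.01
for _ in range(2000):
    b1 = pure_target(U1 @ x2)     # best response of population 1 to x2
    b2 = pure_target(U2.T @ x1)   # best response of population 2 to x1
    x1 = x1 + dt * (b1 - x1)      # (BR): xdot^1 = M1(U1 x2) - x1
    x2 = x2 + dt * (b2 - x2)      # (BR): xdot^2 = M2((U2)' x1) - x2

print(np.round(x1, 3), np.round(x2, 3))   # both populations approach the pure equilibrium (1, 0)
```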
The following lemma links shows that ik+1 must perform at least as well as ik against both jk and jk+1 , and that the analogous comparisons between the payoffs of jk and jk+1 also hold. Lemma 6.3.2. Suppose that revising agents select strategies i = ik and j = jk during interval Ik , and strategies i = ik+1 and j = jk+1 during interval Ik+1 . Then (i) Ui1 j ≥ Ui1j and Ui2j ≥ Ui2j , and 240 (ii) Ui1 j ≥ Ui1j and Ui2 j ≥ Ui2 j . Exercise 6.3.3. Prove Lemma 6.3.2. (Hint: Start by verifying that x2k is a convex combination t 2 2 2 of xtk−1 and the vertex v j , and that xtk+1 is a convex combination of x2k and v2 .) t j Now, recall from Exercise 2.4.4 that U = (U1 , U2 ) is supermodular if (6.27) Ui1+1, j+1 − Ui1, j+1 ≥ Ui1+1, j − Ui1, j and Ui2+1, j+1 − Ui2+1, j ≥ Ui2, j+1 − Ui2, j for all i < n1 , j < n2 . (When (6.27) holds, the population game F induced by U is supermodular as well.) If the inequalities in (6.27) always hold strictly, we say that U is strictly supermodular. Our convergence result requires two additional conditions on U. We say that U exhibits strictly diminishing returns if for each fixed strategy of the opponent, the benefit a player obtains by increasing his strategy is decreasing—in other words, if payoffs are “concave in own strategy”: Ui1+2, j − Ui1+1, j < Ui1+1, j − Ui1, j for all i ≤ n1 − 2 and j ∈ S2 , and Ui2, j+2 − Ui2, j+1 < Ui2, j+1 − Ui2, j for all i ∈ S1 and j ≤ n2 − 2. Finally, we say that U is nondegenerate if for each fixed pure strategy of the opponent, a player is not indifferent among any of his pure strategies. Theorem 6.3.4. Suppose that F is generated by random matching in a two-player normal form game U that is strictly supermodular, exhibits strictly diminishing returns, and is nondegenerate. Then every simple solution trajectory of the best response dynamic (BR) converges to a pure Nash equilibrium. Proof. To begin, suppose that the sequence of times {tk } is finite, with final element tK . Let i∗ and j∗ be the selections made by revising agents after time tK . Then the pure state x∗ = (v1∗ , v2∗ ) is in B(xt ) for all t ≥ tK , and {xt } converges to x∗ . Since payoffs are continuous, i j it follows that x∗ ∈ B(x∗ ), and so that x∗ is a Nash equilibrium. To complete the proof of the theorem, we establish by contradiction that the sequence of times {tk } cannot be infinite. To begin, note that at time tk , agents in each population p are indifferent between their interval k and interval k + 1 selections. Moreover, since U exhibits strictly decreasing returns, it is easy to verify that whenever such an indifference occurs, it must be between two consecutive strategies in Sp . Putting these observations together, we find that each transition in the sequence {(ik , jk )} is of length 1, in the sense that |ik+1 − ik | ≤ 1 and jk+1 − jk ≤ 1 for all k. 241 1 ~ j ĵ n2 1 ĩ n1 Figure 6.3.1: The proof of Theorem 6.3.4. Next, we say that there is an improvement step from (i, j) ∈ S to (i , j ) ∈ S, denoted (i, j) (i , j ), if either (i) Ui1 j > Ui1j and j = j, or (ii) i = i and Ui2j > Ui2j . Lemma 6.3.2(i) and the fact that U is nondegenerate imply that (ik , jk ) (ik+1 , jk+1 ) if either ik = ik+1 or jk = jk+1 . Moreover, applying both parts of the lemma, we find that if ik ik+1 and jk jk+1 , we have that (ik , jk ) (ik+1 , jk ) (ik+1 , jk+1 ), and also that (ik , jk ) (ik , jk+1 ) (ik+1 , jk+1 ). Now suppose that the sequence {tk } is infinite. Then since S is finite, there must be a strategy profile that is the interval k selection for more than one k. 
In this case, the arguments in the previous two paragraphs imply that there is a length 1 improvement cycle: that is, a sequence of length 1 improvement steps beginning and ending with the same strategy profile. Evidently, this cycle must contain an improvement step of the form (˜, ˜) ı (˜, ˜ + 1) for ı some (˜, ˜) ∈ S. Strict supermodularity of U then implies that ı (6.28) (i, ˜) (i, ˜ + 1) for all i ≥ ˜. ı It follows that for the sequence of length 1 improvement steps to return to (˜, ˜), there must ı be an improvement step of the form (˜, ˆ) ı (˜ − 1, ˆ) for some ˆ > ˜ (see Figure 6.3.1). This ı time, strict supermodularity of U implies that (6.29) (˜, j) ı (˜ − 1, j) for all j ≤ ˆ. ı From (6.28) and (6.29), it follows that no cycle of length 1 improvement steps containing (˜, ˜) ı (˜, ˜ + 1) can reach any strategy profile (i, j) with i ≥ ˜ and j ≤ ˜. In particular, ı ı the cycle cannot return to (˜, ˜), which is a contradiction. This completes the proof of the ı theorem. 242 6.3.2 Stochastically Perturbed Best Response Dynamics While Theorem 6.3.4 was proved using elementary techniques, it was not as general as one might hope: it restricted attention to two-player normal form games, and required not only the assumption of supermodularity, but also that of decreasing returns. In order to obtain a more general convergence result, we turn from exact best response dynamics to perturbed best response dynamics. Doing so allows us to avail ourselves of a powerful set of techniques for smooth dynamics with a monotone structure: the theory of cooperative differential equations. To begin, let us recall the transformations used to discuss the stochastic dominance p p p p ˜ order. In Section 2.4, we defined the matrices Σ ∈ R(n −1)×n , Σ ∈ Rn ×(n −1) , and Ω ∈ Rn×n by 0 1 · · · 1 . . . . .. . . . , Σ= . . . 0 · · · 0 1 −1 0 1 −1 ˜ =0 Σ 1 . . . . .. .. . 0 ··· 0 1 . .. . 0 . . .. , and Ω = 0 . 0 . . .. . . −1 0 0 1 1 · · · · · · 1 0 · · · · · · 0 . .. . . . 0 . . .. . . . . . . 0 ··· ··· 0 We saw that yp ∈ Xp stochastically dominates xp ∈ Xp if and only if Σ yp ≥ Σxp . We also verified that (6.30) ˜ ΣΣ = I − Ω. Since Ω is the null operator on TXp , equation (6.30) describes a sense in which the stochastic ˜ dominance operator Σ is “inverted” by the difference operator Σ. Applying the change of coordinates Σ to the set Xp yields the set of transformed population states p X p ≡ ΣXp = x p ∈ Rn −1 : mp ≥ x 1 ≥ · · · ≥ x p p −1 ≥ 0 . n p By postmultiplying both sides of (6.30) by xp and letting xp = (mp , 0, . . . , 0) denote the minimal state in Xp , we find that the inverse of the map Σ : Xp → X p is described as follows: (6.31) ˜ x p = Σxp ⇔ xp = Σx p + xp . To work with full social states x ∈ X, we introduce the block diagonal matrices Σ = ˜ ˜ ˜ diag(Σ, . . . , Σ) and Σ = diag(Σ, . . . , Σ), and let X ≡ ΣX = p∈P X p . If we let x = (x1 , . . . , xp ) 243 be the minimal state in X, then the inverse of the map Σ : X → X is described by (6.32) ˜ x = Σx ⇔ x = Σx + x. To simplify the discussion to follow, let us assume for convenience that each population is of mass 1. Then our stochastically perturbed best response dynamics take the form (6.33) ˜ ˙ xp = Mp (Fp (x)) − xp , where p p ˜p Mi (πp ) = P i = argmax j∈Sp π j + ε j for some admissible stochastic perturbations ε = (ε1 , . . . , εp ). Rather than study this dynamic directly, we apply the change of variable (6.32) to obtain a new dynamic on the set X : (6.34) ˜ ˜ ˙ x p = ΣMp (Fp (Σx + x)) − x p . 
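Before interpreting (6.34), it may help to verify the transformation machinery numerically. The sketch below (an added illustration) writes out the matrices explicitly under the convention, as I read Section 2.4, that Σ computes upper-tail sums, Σ̃ is the corresponding difference operator, and Ω has ones in its first row and zeros elsewhere; it then checks identity (6.30) and the inversion formula (6.31) for a single population with n = 4 strategies and mass m = 1.

```python
import numpy as np

n, m = 4, 1.0                          # n strategies, population mass m (illustrative values)

# Sigma computes "upper tail" sums: (Sigma x)_i = x_{i+1} + ... + x_n for i = 1, ..., n-1,
# so y stochastically dominates x exactly when Sigma y >= Sigma x componentwise.
Sigma = np.triu(np.ones((n - 1, n)), k=1)

# Sigma_tilde is the difference operator that undoes Sigma on the tangent space.
Sigma_tilde = np.zeros((n, n - 1))
Sigma_tilde[0, 0] = -1.0
for i in range(1, n - 1):
    Sigma_tilde[i, i - 1], Sigma_tilde[i, i] = 1.0, -1.0
Sigma_tilde[n - 1, n - 2] = 1.0

# Omega has ones in its first row and zeros elsewhere; it annihilates TX.
Omega = np.zeros((n, n))
Omega[0, :] = 1.0

assert np.allclose(Sigma_tilde @ Sigma, np.eye(n) - Omega)   # identity (6.30)

# Inverse of the change of coordinates (equation (6.31)): x = Sigma_tilde x' + x_min,
# where x_min = (m, 0, ..., 0) is the minimal state.
x_min = np.zeros(n)
x_min[0] = m
x = np.array([0.1, 0.4, 0.3, 0.2]) * m
x_prime = Sigma @ x
assert np.allclose(Sigma_tilde @ x_prime + x_min, x)
print("Transformed state x' =", x_prime)   # nonincreasing components between 0 and m
```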
˜ Given the current state x ∈ X , we use the inverse transformation x → x ≡ Σx + x to obtain the input for the payoff function Fp , and we use the original transformation ˜ ˜ Mp (Fp (x)) → ΣMp (Fp (x)) to convert the perturbed best response into an element of X p . The next observation verifies the relationship between solutions to the transformed dynamic (6.34) and solutions to the original dynamic (6.33). ˜ Observation 6.3.5. (6.33) and (6.34) are affinely conjugate: {xt } = {Σx t + x} solves (6.33) if and only if {x t } = {Σxt } solves (6.34). Our next task is to show that if F is a supermodular game, then (6.34) is a cooperative ˙ differential equation: writing this dynamic as x = V (x ), we want to show that ∂Vi p q ∂x j (x ) ≥ 0 for all x ∈ X whenever (i, p) ( j, q). If this inequality is always satisfied strictly, we say that (6.34) is strongly cooperative. As we explain in Section 6.C, strongly cooperative differential equations converge to rest points from almost all initial conditions. Thus, if we can prove that equation (6.34) is strongly cooperative, we can conclude that almost all solutions of our original dynamic (6.33) converge to perturbed equilibria. 244 To prove that (6.34) is strongly cooperative, we marshal our facts about supermodular games and stochastically perturbed best responses. Recall from Chapter 2 that if the population game F is C1 , then F is a supermodular if and only if ˜ ˜ Σ DF(x)Σ ≥ 0 for all x ∈ X. Our result requires an additional nondegeneracy condition: we say that F is irreducible if ˜ ˜ each column of Σ DF(x)Σ contains a strictly positive element. ˜ Next, we recall from Lemma 5.C.1 the basic properties of DM(π), the derivative matrix ˜ of the stochastically perturbed best response function M. ˜ Lemma 6.3.6. Fix π ∈ Rn , and suppose that the perturbed best response function M is derived ˜ from admissible stochastic payoff perturbations. Then the derivative matrix DM(π) is symmetric, ˜ has negative off-diagonal elements, and satisfies DM(π)1 = 0. Combining these facts yields the desired result: Theorem 6.3.7. Let F be a C1 irreducible supermodular game, and let (6.33) be a stochastically perturbed best response dynamic for F. Then the transformed dynamic (6.34) is strongly cooperative. ˙ Proof. (p = 1) Write the dynamic (6.34) as x = V (x ). Then (6.35) ˜ ˜ DV (x ) = D(ΣM(F(Σx + x))) − I. Since all off-diagonal elements of I equal zero, it is enough to show that the first term on the right hand side of (6.35) has all positive components. ˜ ˜ ˜ Let x = Σx + x and π = F(x). Using the facts that ΣΣ = I − Ω and DM(π)1 = 0, we express the first term of the right hand side of (6.35) as follows: ˜ ˜ ˜ ˜ D(ΣM(F(Σx + x))) = Σ DM(π) DF(x) Σ ˜ ˜ ˜ = Σ DM(π) (Σ Σ + Ω ) DF(x) Σ ˜ ˜ ˜ = (ΣDM(π)Σ )(Σ DF(x)Σ). Lemma 6.3.6 and the fact that ˜ (ΣDM(π)Σ )i j = ˜ DM(π)kl k>i l> j ˜ imply that every component of ΣDM(π)Σ is positive (see Exercise 6.3.8). Since F is ˜ ˜ supermodular and irreducible, Σ DF(x)Σ is nonnegative, with each column containing a 245 positive element. Thus, the product of these two matrices has all positive elements. This completes the proof of the theorem. ˜ Exercise 6.3.8. (i) Prove that every component of ΣDM(π)Σ is positive. ˜ (ii) Explain why Theorem 6.3.7 need not hold when M is generated by deterministic perturbations. Observation 6.3.5, Theorem 6.3.7, and Theorems 6.C.1, 6.C.2, and 6.C.3 immediately imply the following “almost global” convergence result. In part (i) of the theorem, x = ¯ (x1 , . . . 
, xp ) is the minimal state in X introduced above; similarly, xp = (0, . . . , mp ) is the p 1 p ) is the maximal state in X. ¯ ¯ ¯ maximal state in X , and x = (x , . . . , x ˙ Theorem 6.3.9. Let F be a C1 irreducible supermodular game, and let x = VF,ε (x) be a stochastically perturbed best response dynamic for F. Then ¯ ¯ (i) States x∗ ≡ ω(x) and x∗ ≡ ω(x) exist and are the minimal and maximal elements of PE(F, ε). ∗ , x∗ ] contains all ω-limit points of V and is globally asymptotically stable. Moreover, [x ¯ F,ε ˙ (ii) Solutions to x = VF,ε (x) from an open, dense, full measure set of initial conditions in X converge to states in PE(F, ε). Our final example shows that the conclusion of Theorem 6.3.9 cannot be extended from convergence from almost all initial conditions to convergence from all initial conditions. Example 6.3.10. Let U be a normal form game with p ≥ 5 players and two strategies per player. Each player p in U obtains a payoff of 1 if he chooses the same strategy as player p + 1 (with the convention that p + 1 = 1) and obtains a payoff of 0 otherwise. U has three Nash equilibria: two strict equilibria in which all players coordinate on the same strategy, 1 and the mixed equilibrium x∗ = (( 1 , 1 ), . . . , ( 1 , 2 )). If F is the p population game generated 22 2 by random matching in U, it can be shown that F is supermodular and irreducible (see Exercise 6.3.11(i)). pp We now introduce random perturbations εp = (ε1 , ε2 ) to each player’s payoffs. These p p perturbations are such that the differences ε2 − ε1 admit a common density g that is symmetric about 0, is decreasing on R+ , and satisfies g(0) > 1 . It can be shown that the 2 resulting perturbed best response dynamic (6.33) possesses exactly three rest points: the mixed equilibrium x∗ , and two stable symmetric rest points that approximate the two pure Nash equilibria (see Exercise 6.3.11(ii)). One can show that the rest point x∗ is unstable under (6.33). It then follows from Theorem 6.3.9 that the two stable rest points of (6.33) attract almost all initial conditions in X, and that the basins of attraction for these rest points are separated by a p − 1 dimensional 246 invariant manifold M that contains x∗ . Furthermore, one can show that when p ≥ 5, the rest point x∗ is unstable with respect to the manifold M . Thus, solutions from all states in M − {x∗ } fail to converge to a rest point. § The details of these last arguments require techniques for determining the local stability of rest points. This is the topic of the next chapter. Exercise 6.3.11. (i) Prove that the game F introduced in Example 6.3.10 is supermodular and irreducible. (ii) Prove that under the assumption on payoff perturbations stated in the example, there are exactly three perturbed equilibria, all of which are symmetric. 6.4 Dominance Solvable Games The elimination of strictly dominated strategies is the mildest requirement employed in standard game-theoretic analyses, and so it seems natural to expect evolutionary dynamics obey this dictum. In this section, we provide some positive results on the elimination of dominated strategies: under the best response dynamic, any strictly dominated strategy must vanish in the limit; the same is true under any imitative dynamic so long as we focus on interior initial conditions. Arguing inductively, we show next that any strategy that does not survive iterated elimination of strictly dominated strategies vanishes as well. 
In particular, if a game is dominance solvable—that is, if removing iteratively dominated strategies leaves only one strategy for each population, then best response and imitative dynamics select this strategy. These results may seem unsurprising. However, we argue in Chapter 8 that they are actually borderline cases: under “typical” evolutionary dynamics, strictly dominated strategies can survive in perpetuity. 6.4.1 Dominated and Iteratively Dominated Strategies Let F be a population game. We say that strategy i ∈ Sp is strictly dominated if there exists a strategy j ∈ Sp such that F j (x) > Fi (x) for all x ∈ X: that is, if there is a strategy j that ˆ outperforms strategy i regardless of the population state. Similarly, if Sp is a nonempty p p p ˆ ˆ ˆ subset of S and S = p∈P S , we say that i ∈ S is strictly dominated relative to S, denoted ˆ ˆ i ∈ D p (S), if there exists a strategy j ∈ Sp such that F j (x) > Fi (x) for all x ∈ X that satisfy ˆ support(xp ) ⊆ Sp for all p ∈ P . We can use these definitions to introduce the notion of iterative dominance. Set S0 = S. Then D p (S0 ) is the set of strictly dominated strategies for population p, and 247 p p so S1 = S0 − D p (S0 ) is the set of strategies that are not strictly dominated. Proceeding inductively, we define D p (Sk ) to be the set of strategies that are eliminated during the p p (k + 1)st round of removal of iteratively dominated strategies, and we let Sk+1 = Sk − D p (Sk ) be the set of strategies that survive k + 1 rounds of removal of such strategies. Since the number of strategies is finite, this iterative procedure must converge, leaving p us with nonempty sets S1 , . . . , S∗ . Strategies in these sets are said to survive iterative removal ∗ of strictly dominated strategies. If each of these sets is a singleton, then the game F is said to be dominance solvable. In this case, the pure social state at which each agent plays his population’s sole surviving strategy is the game’s unique Nash equilibrium; we call this state the dominance solution of F. 6.4.2 The Best Response Dynamic Under the best response dynamic, revising agents always switch to optimal strategies. Since strictly dominated strategies are never optimal, such strategies cannot persist: Observation 6.4.1. Let {xt } be a solution trajectory of (BR) for population game F, in which p strategy i ∈ Sp is strictly dominated. Then limt→∞ (xt )i = 0. p p ˙ Indeed, since i is never a best response, we have that (xt )i ≡ −(xt )i , and hence that p p −t (xt )i = (x0 )i e : the mass playing the dominated strategy converges to zero exponentially quickly. An inductive argument takes us from the observation above to the following result. Theorem 6.4.2. Let {xt } be a solution trajectory of (BR) for population game F, in which strategy p i ∈ Sp does not survive iterative elimination of strictly dominated strategies. Then limt→∞ (xt )i = 0. In particular, if F is dominance solvable, then all solutions of (BR) converge to the dominance solution. p p Proof. Observation 6.4.1 provides the basis for this induction: if i S1 , then limt→∞ (xt )i = p 0. As the inductive hypothesis, suppose that this same equality holds for all i Sk . Now p p p p p let j ∈ Sk − Sk+1 . Then by definition, there exists a j ∈ Sk+1 such that F j (x) > F j (x) whenever p p x ∈ Xk , where Xk = {x ∈ X : xi > 0 ⇒ i ∈ Sk } is the set of social states in which all agents p in each population p choose a strategies in Sk . 
Since Xk is compact and F is continuous, it p p follows that for some c > 0, we have that F j (x) > F j (x) + c whenever x ∈ Xk , and so that p p p p for some ε > 0, we have that F j (x) > F j (x) whenever x ∈ Xk,ε = {x ∈ X : xi > ε ⇒ i ∈ Sk }. By the inductive hypothesis, there exists a T > 0 such that xt ∈ Xk,ε for all t ≥ T. Thus, for p p ˙ such t, j is not a best response to xt . This implies that (xt ) j = −(xt ) j for t ≥ T, and hence p p that (xt ) j = (xT ) j eT−t , which converges to 0 as t approaches infinity. 248 Exercise 6.4.3. Show that under (BR), the time until convergence to the set X∗,ε = {x ∈ X : p p xi > ε ⇒ i ∈ S∗ } is uniform over initial conditions in X. 6.4.3 Imitative Dynamics We now establish analogous results for imitative dynamics. Since these dynamics leave the boundary of the state space invariant, the elimination results can only hold for solutions starting from interior initial conditions. Theorem 6.4.4. Let {xt } be an interior solution trajectory of an imitative dynamic for population p game F, in which strategy i ∈ Sp is strictly dominated. Then limt→∞ (xt )i = 0. ˙ Proof. (p = 1) Observation 4.4.16 tells us that all imitative dynamics x = VF (x) exhibit monotone percentage growth rates (4.16): we can write the dynamic as (6.36) ˙ xi = xi Gi (x), where the continuous function G : X → Rn satisfies (6.37) Gk (x) ≤ Gl (x) if and only if Fk (x) ≤ Fl (x) for all x ∈ int(X). Now suppose strategy i is strictly dominated by strategy j ∈ S. Since X is compact and F is continuous, we can find a c > 0 such that F j (x) − Fi (x) > c for all x ∈ X. Since G is continuous as well, equation (6.37) implies that for some C > 0, we have that G j (x) − Gi (x) > C for all x ∈ X. Now write r = xi /x j . Equation (6.36) and the quotient rule imply that (6.38) ˙ ˙ d xi xi x j − x j xi xi Gi (x)x j − x j G j (x)xi d r= = = = r Gi (x) − G j (x) . dt dt x j (x j )2 (x j )2 ˙ Thus, along every interior solution trajectory {xt } of x = VF (x) we have that t rt = r0 + t rs ds. rs Gi (x) − G j (x) ds ≤ r0 − C 0 0 Gronwall’s Inequality (Lemma 3.A.7) then tells us that rt ≤ r0 exp(−Ct), and hence that rt ¨ vanishes as t approaches infinity. Since (xt ) j is bounded above by 1, (xt )i must approach 0 as t approaches infinity. An argument similar to the one used to prove Theorem 6.4.2 can be used to prove that iteratively dominated strategies are eliminated by imitative dynamics. 249 Theorem 6.4.5. Let {xt } be an interior solution trajectory of an imitative dynamic for population game F, in which strategy i ∈ Sp does not survive iterative elimination of strictly dominated p strategies. Then limt→∞ (xt )i = 0. In particular, if F is dominance solvable, then all interior solutions of any imitative dynamic converge to the dominance solution. Exercise 6.4.6. (i) Prove Theorem 6.4.5. p p (ii) Is the time until convergence to X∗,ε = {x ∈ X : xi > ε ⇒ i ∈ S∗ } uniform over initial conditions in int(X)? Explain. Appendix 6.A Limit and Stability Notions for Deterministic Dynamics We consider differential equations and differential inclusions that are forward invariant on the compact set X ⊂ Rn . (D) ˙ x = V (x), a unique forward solution exists from each ξ ∈ X. (DI) ˙ x ∈ V (x), V is nonempty, convex-valued, bounded, and upper hemicontinuous. When V is discontinuous, we allow solutions to be of the Carath´ odory type—that is, to e ˙ ˙ satisfy xt = V (xt ) (or xt ∈ V (xt )) at almost all t ∈ [0, ∞). 
6.A.1 ω-Limits and Notions of Recurrence Let {xt } = {xt }t≥0 be a solution trajectory to (D) or (DI). The ω-limit of {xt } is the set of all points that the trajectory approaches arbitrarily closely infinitely often: ω({xt }) = y ∈ X : there exists {tk }∞ 1 with lim tk = ∞ such that lim xtk = y . k= k→∞ The following proposition lists some basic properties of ω-limit sets. Proposition 6.A.1. Let {xt } be a solution to (D) (or (DI)). Then (i) ω({xt }) is non-empty and connected. (ii) ω({xt }) is closed. In fact, ω({xt }) = t≥0 cl({xs : s ≥ t}). (iii) ω({xt }) is invariant under (D) (or (DI)). 250 k→∞ If {xt } is the unique solution to (D) with initial condition x0 = ξ, we write ω(ξ) in place of ω({xt }). In this case, the set Ω= ω(ξ), ξ∈X contains all points that are approached arbitrarily closely infinitely often by some solution of (D). Among other things, Ω contains all rest points, periodic orbits, and chaotic attractors ¯ of (D). Since Ω need not be closed, its closure Ω = cl(Ω) is used to define a standard notion of recurrence for differential equations. Example 6.A.2. To see that Ω need not be closed, consider the replicator dynamic in 1 1 standard Rock-Paper-Scissors (Figure 4.3.1(i)). The unique Nash equilibrium x∗ = ( 3 , 1 , 3 ) 3 is a rest point, and solution trajectories from all other interior initial conditions form closed orbits around x∗ . The vertices eR , eP , and eS are also rest points, and each trajectory starting from a boundary point that is not a vertex converges to a vertex. Thus, Ω = ¯ int(X) ∪ {eR , eP , eS }, but Ω = X. § While we will not make much use of them here, many other notions of recurrence ¯ besides Ω are available. To obtain a more demanding notion of recurrence for (D), call the state ξ recurrent if the solution from (D) returns arbitrarily close to ξ infinitely often—in other words, if ξ ∈ ω(ξ). The Birkhoff center of (D) is the closure of the set of all recurrent points of (D). More inclusive notions of recurrence—most importantly, the notion of chain recurrence—can be obtained by allowing occasional short jumps between nearby solution trajectories. In addition to its uses in the theory of learning in games, chain recurrence is the key idea needed to state the Fundamental Theorem of Dynamical Systems: the domain of any smooth flow can be decomposed into two sets: a set on which the flow admits a Lyapunov function, and the set of chain recurrent points. 6.A.2 Stability of Sets of States Let A ⊆ X be a closed set, and call O ⊆ X a neighborhood of A if it is open relative to X and contains A. We say that A is Lyapunov stable under (D) (or (DI)) if for every neighborhood O of A there exists a neighborhood O of A such that every solution {xt } that starts in O is contained in O: that is, x0 ∈ O implies that xt ∈ O for all t ≥ 0. A is attracting if there is a neighborhood Y of A such that every solution that starts in Y converges to A: that is, x0 ∈ Y implies that ω({xt }) ⊆ A. A is globally attracting if it is attracting with Y = X. 251 Finally, the set A is asymptotically stable if it is Lyapunov stable and attracting, and it is globally asymptotically stable if it is Lyapunov stable and globally attracting. Example 6.A.3. Attracting sets need not be asymptotically stable. A counterexample is provided by a flow on the unit circle that moves clockwise except at a single point. The fact that the domain is the unit circle is unimportant, since one can embed this flow as a limit cycle in a flow on the plane. § Example 6.A.4. 
Invariance is not included in the definition of asymptotic stability. Thus, ˙ under the dynamic x = −x on R, any closed interval containing the origin is asymptotically stable. § 6.B Stability Analysis via Lyapunov Functions Let Y ⊆ X. The function L : Y → R is a Lyapunov function for (D) or (DI) if its value changes monotonically along every solution trajectory. We state the results to follow for the case in which the value of L decreases along solution trajectories; of course, the obvious analogues of these results hold for the opposite case. The following lemma will prove useful in a number of the analyses to come. Lemma 6.B.1. Suppose that the function L : Y → R and the trajectory {xt }t≥0 are Lipschitz continuous. ˙ (i) If L(xt ) ≤ 0 for almost all t ≥ 0, then the map t → L(xt ) is nonincreasing. ˙ (ii) If in addition L(xs ) < 0, then L(xt ) < L(xs ) for all t > s. Proof. The composition t → L(xt ) is Lipschitz continuous. Thus, the Fundamental Theorem of Calculus tells us that when t > s, we have that t L(xt ) − L(xs ) = ˙ L(xu ) du ≤ 0, s ˙ where the inequality is strict if L(xs ) < 0. 6.B.1 Lyapunov Stable Sets The basic theorem on Lyapunov stability applies both to differential equations (D) and differential inclusions (DI). 252 Theorem 6.B.2 (Lyapunov stability). Let A ⊆ X be closed, and let Y ⊆ X be a neighborhood of A. Let L : Y → R+ be Lipschitz continuous with L−1 (0) = A. If each solution {xt } of (D) (or (DI)) ˙ satisfies L(xt ) ≤ 0 for almost all t ≥ 0, then A is Lyapunov stable under (D) (or (DI)). Proof. Let O be a neighborhood of A such that cl(O) ⊂ Y. Let c = minx∈bd(O) L(x), so that c > 0. Finally, let O = {x ∈ O : L(x) < c}. Lemma 6.B.1 implies that solution trajectories that start in O do not leave O, and hence that A is Lyapunov stable. Example 6.B.3. The requirement that the function L be constant on A cannot be dispensed with. Consider a flow on the unit circle C = {x ∈ R2 : (x1 )2 + (x2 )2 = 1} that moves clockwise at states x with x1 > 0 and is at rest at states in the semicircle A = {x ∈ C : x1 ≤ 0}. If we let ˙ L(x) = x2 , then L(x) ≤ 0 for all x ∈ C, and A is attracting (see Theorem 6.B.4 below), but A is not Lyapunov stable. We can extend this example so that the flow is defined on the unit disk D = {x ∈ R2 : (x1 )2 + (x2 )2 ≤ 1}. Suppose that when x1 > 0, the flow travels clockwise along the circles centered at the origin, and that the half disk A = {x ∈ D : x1 ≤ 0} consists entirely of rest ˙ points. Then L(x) = x2 satisfies L(x) ≤ 0 for all x ∈ D, and A is attracting, but A is not Lyapunov stable. § 6.B.2 ω-Limits and Attracting Sets We now provide some results that use Lyapunov functions to characterize ω-limits of solution trajectories that begin in the Lyapunov function’s domain. These results immediately yield sufficient conditions for a set to be attracting. To state our results, we call the (relatively) open set Y ⊂ X inescapable if for each solution trajectory {xt }t≥0 with x0 ∈ Y, we have that cl ({xt }) ∩ bd(Y) = ∅. Our first result focuses on the differential equation (D). Theorem 6.B.4. Let Y ⊂ X be relatively open and inescapable under (D). Let L : Y → R be C1 , ˙ ˙ and suppose that L(x) ≡ L(x) V (x) ≤ 0 for all x ∈ Y. Then ω(x0 ) ⊆ {x ∈ Y : L(x) = 0} for all ˙ x0 ∈ Y. Thus, if L(x) = 0 implies that V (x) = 0, then ω(x0 ) ⊆ RP(V ) ∩ Y. Proof. Let {xt } be the solution to (D) with initial condition x0 = ξ ∈ Y, let χ ∈ ω(ξ), and let { yt } be the solution to (D) with y0 = χ. 
Since Y is inescapable, the closures of trajectories {xt } and { yt } are contained in Y. ˙ Suppose by way of contradiction that L(χ) 0. Since χ ∈ ω(ξ), we can find a divergent sequence of times {tk }∞ 1 such that limk→∞ xtk = χ = y0 . Since solutions to (D) are unique, k= 253 and hence continuous in their initial conditions, we have that (6.39) lim xtk +1 = y1 , and hence that lim L(xtk +1 ) = L( y1 ). k→∞ k→∞ ˙ But since y0 = χ ∈ ω(ξ) and L(χ) 0, applying Lemma 6.B.1 to both {xt } and { yt } yields L(xt ) ≥ L(χ) > L( y1 ) for all t ≥ 0, contradicting the second limit in (6.39). This proves the first claim of the theorem, and the second claim follows immediately from the first. Theorem 6.B.5 is an analogue of Theorem 6.B.4 for upper hemicontinous differential inclusions. Where the proof of Theorem 6.B.4 relied on the continuity of solutions to (D) in their initial conditions, the proof of Theorem 6.B.5 takes advantage of the upper hemicontinuity of the map from initial conditions ξ to solutions of (DI) starting from ξ. Theorem 6.B.5. Let Y ⊂ X be relatively open and inescapable under (DI). Let L : Y → R be C1 and satisfy (i) ∂L (x) ≡ L(x) v ≤ 0 for all v ∈ V (x) and x ∈ Y, and (ii) [0 V (x) implies that ∂v ∂L (x) < 0] for all v ∈ V (x) and x ∈ Y. Then for all solutions {xt } of (DI) with x0 ∈ Y, we have that ∂v ω({xt }) ⊆ {x ∈ Y : 0 ∈ V (x)}. Proof. Suppose that χ ∈ ω({xt }), but that 0 V (χ). Then ∂L (χ) < 0 for all v ∈ V (χ). ∂v Thus, since V (χ) is compact by assumption, there exists a b > 0 such that ∂L (χ) < −b for all ∂v b ˆ v ∈ V (χ). Because V is upper hemicontinuous and L is C1 , it follows that ∂L (χ) < − 2 for all ˆ ∂v ˆ ˆ ˆ v ∈ V (χ) and all χ sufficiently close to χ. So since V is bounded, there is a time u ∈ (0, 1] such that all solutions { yt } of (DI) with y0 = χ satisfy (6.40) L( yt ) ≤ L( ys ) ≤ L(χ) − bs 2 for all s ∈ [0, u] and t > s. Now let {tk }∞ 1 be a divergent sequence of times such that limk→∞ xtk = χ, and for each k= k, define the trajectory {xk }t≥0 by xk = xt+tk . Since the set of continuous trajectories C[0,T] (X) t t is compact in the sup norm topology, the sequence of trajectories {{xk }}∞ 1 has a convergent t k= subsequence, which we take without loss of generality to be {{xk }}∞ 1 itself. We call the t k= ˆ ˆ limit of this subsequence { yt }. Evidently, y0 = χ. Given our conditions on the correspondence V , the set-valued map ˆ ˆ χ → {{xt } : {xt } is a solution to (DI) with x0 = χ} is upper hemicontinuous with respect to the sup norm topology on C[0,T] (X) (see Appendix 254 ˆ 5.A). It follows that { yt } is a solution to (DI). Consequently, (6.41) ˆ ˆ lim xtk +1 = y1 , and so lim L(xtk +1 ) = L( y1 ). k→∞ k→∞ But Lemma 6.B.1 and inequality (6.40) imply that ˆ L(xt ) ≥ L(χ) > L( y1 ) for all t ≥ 0, contradicting the second limit in (6.41). Theorem 6.B.6 is a simple convergence result for differential inclusions. Here the Lyapunov function need only be Lipschitz continuous (rather than C1 ), but the condition on the rate of decrease of this function is stronger than in the previous results. Theorem 6.B.6. Let Y ⊂ X be relatively open and inescapable under (DI), and let L : Y → R+ be Lipschitz continuous. Suppose that along each solution {xt } of (DI) with x0 ∈ Y, we have that ˙ L(xt ) ≤ −L(xt ) for almost all t ≥ 0. Then ω({xt }) ⊂ {x ∈ Y : L(x) = 0}. Proof. 
Observe that t L(xt ) = L(x0 ) + t ˙ L(xu ) du ≤ L(x0 ) + 0 −L(xu ) du = L(x0 ) e−t , 0 t where the final equality follows from the fact that α0 + 0 −αu du is the value at time t of the ˙ solution to the linear ODE αt = −αt with initial condition α0 ∈ R. It follows immediately that limt→∞ L(xt ) = 0. 6.B.3 Asymptotically Stable and Globally Asymptotically Stable Sets Combining Theorem 6.B.2 with Theorem 6.B.4, 6.B.5, or 6.B.6 yields asymptotic stability and global asymptotic stability results for deterministic dynamics. Corollary 6.B.7 offers such a result for the differential equation (D). Corollary 6.B.7. Let A ⊆ X be closed, and let Y ⊆ X be a neighborhood of A. Let L : Y → R+ be ˙ C1 with L−1 (0) = A. If L(x) ≡ L(x) V (x) < 0 for all x ∈ Y − A, then A is asymptotically stable under (D). If in addition Y = X, then A is globally asymptotically stable under (D). 255 6.C Cooperative Differential Equations Cooperative differential equations are defined by the property that increases in the value of one component of the state variable increase the growth rates of all other components. Their solutions have appealing monotonicity and convergence properties. Let ≤ denote the standard partial order on Rn : that is, x ≤ y if and only if xi ≤ yi for all i ∈ {1, . . . , n}. We write x < y when x ≤ y and x y, so that x j < y j for some j. Finally, we write x y when xi < yi for all i ∈ {1, . . . , n}. We call a vector or a matrix strongly positive if all of its components are positive; thus, x ∈ Rn is strongly positive if x 0. n Let X ⊂ R be a compact convex set that possesses a minimal and a maximal element with respect to the partial order ≤. Let V : X → Rn be a C1 vector field with V (x) ∈ TX(x) for all x ∈ X, so that the differential equation (6.42) ˙ x = V (x) is forward invariant on X. We call the differential equation (6.42) cooperative if (6.43) ∂Vi (x) ≥ 0 for all i ∂x j j and x ∈ X. Equation (6.42) is irreducible if for every x ∈ X and every nonempty proper subset I of the index set {1, . . . n}, there exist indices i ∈ I and j ∈ {1, . . . , n} − I such that ∂Vji (x) 0. ∂x An obvious sufficient condition for (6.42) to be irreducible is that it be strongly cooperative, meaning that the inequality in condition (6.43) is strict for all i j and x ∈ X. In Appendix 3.A.3, we saw how to represent all solutions to the dynamic (6.42) simultaneously via the semiflow φ : R+ × X → X, defined by φt (ξ) = xt , where {xt }t≥0 is the solution to (6.42) with initial condition x0 = ξ. We say that the semiflow φ is monotone if x ≤ y implies that φt (x) ≤ φt ( y) for all t ≥ 0: that is, weakly ordered initial conditions induce weakly ordered solution trajectories. If in addition x < y implies that φt (x) φt ( y) for all t > 0, we say that φ is strongly monotone. Theorem 6.C.1 tells us that cooperative irreducible differential equations generate strongly monotone semiflows. ˙ Theorem 6.C.1. Suppose that x = V (x) is cooperative and irreducible. Then (i) For all t > 0, the derivative matrix of its semiflow φ is strongly positive: Dφt (x) (ii) The semiflow φ is strongly monotone. 0. For the intuition behind Theorem 6.C.1, let {xt } and { yt } be solutions to (6.42) with x0 < y0 . Suppose that at some time t > 0, we have that xt ≤ yt and (xt )i = ( yt )i . If we could 256 show that Vi (xt ) ≤ Vi ( yt ), then it seems reasonable to expect that (xt+ε )i will not be able to surpass ( yt+ε )i . 
But since xt and yt only differ in components other than i, the vector z = yt − xt ≥ 0 has zi = 0, and so 1 Vi ( yt ) − Vi (xt ) = 1 Vi (xt + αz) z dα = 0 0 ji ∂V i (xt + αz) z j dα. ∂x j The final expression is nonnegative as long as ∂Vji ≥ 0 whenever j i. ∂x The next theorem sets out the basic properties of strongly monotone semiflows on X. To state this result, we let C(φ) = {x ∈ X : ω(x) = {x∗ } for some x∗ ∈ RP(φ)} denote the set of initial conditions from which the semiflow φ converges to a rest point. Also, let Ω(φ) = x∈X ω(x) be the set of ω-limit points under φ. Theorem 6.C.2. Suppose that the semiflow φ on X is strongly monotone. Then (i) (Convergence criteria) If φT (x) ≥ x for some T > 0, then ω(x) is periodic with period T. If φt (x) ≥ x over some nonempty open interval of times, then x ∈ C(φ). (ii) (Unordered ω-limit sets) If x, y ∈ ω(z), then x y and y x. ¯ (iii) (Minimal and maximal rest points) Let x = min X and x = max X. Then x∗ = min RP(φ) ∗ = max RP(φ) exist; in fact, ω(x) = x∗ and ω(x) = x∗ . Moreover, [x∗ , x∗ ] contains ¯ ¯ ¯ ¯ and x Ω(φ) and is globally asymptotically stable. Proof. (i) If φT (x) ≥ x, then φ(n+1)T (x) ≥ φnT (x) for all positive integers n, so monotonicity and the compactness of X imply that limn→∞ φnT (x) = y for some y ∈ X. By the continuity and group properties of the flow, φt+T ( y) = φt+T lim φnT (x) = lim φt+(n+1)T (x) = lim φt (φ(n+1)T (x)) = φt ( y), n→∞ n→∞ n→∞ so the flow from y is T-periodic. A continuity argument shows that the orbit from y is none other than ω(x). The proof of the second claim is omitted. (ii) Suppose that x, y ∈ ω(z) and that x < y. Since φ is strongly monotone, and by the continuity of φt (ξ) in ξ, there are neighborhoods Nx , N y ⊂ X of x and y and a time T > 0 such that φT (Nx ) φT (N y ). Choose τ y > τx > 0 such that φτx (z) ∈ Nx and φτy (z) ∈ N y . Then for all t close enough to τ y , φτx +T (z) φt+T (z) = φt−τx (φτx +T (z)). Therefore, part (i) implies that ω(z) is a singleton. ¯ (iii) Since x and x are the minimal and maximal points in X, part (i) implies that ω(x) = ∗ and ω(x) = x∗ for some x∗ , x∗ ∈ RP(φ). Hence, if x ∈ X ⊆ [x, x], then φ (x) ≤ φ (x) ≤ φ (x) ¯ ¯ ¯ ¯ x t t t¯ 257 ¯ ¯ for all t ≥ 0, so taking limits yields x∗ ≤ ω(x) ≤ x∗ ; thus, Ω(φ) ⊆ [x∗ , x∗ ]. Finally, if ¯ ¯ [x∗ , x∗ ] ⊆ [ y, z] ⊆ X, then x ∈ [ y, z] implies that φt (x) ∈ [φt ( y), φt (z)] ⊆ [ y, z], so [x∗ , x∗ ] is Lyapunov stable, and hence globally asymptotically stable by the previous argument. If the derivative matrices of the semiflow are strongly positive, one can obtain even stronger results, including the convergence of solution trajectories from generic initial conditions to rest points. Theorem 6.C.3. Suppose that the semiflow φ on X is strongly monotone, and that its derivative matrices Dφt (x) are strongly positive for all t > 0. Then (i) (Limit set dichotomy) If x < y, then either ω(x) < ω( y), or ω(x) = ω( y) = {x∗ } for some x∗ ∈ RP(φ). (ii) (Generic convergence to equilibrium) C(φ) is an open, dense, full measure subset of X. 6.N Notes Section 6.1. The results in Section 6.1.1 are proved for symmetric random matching games in Hofbauer (2000), the seminal reference on Lyapunov functions for evolutionary dynamics. Global convergence in all potential games of dynamics satisfying positive correlation is proved in Sandholm (2001), building on earlier work of Hofbauer and Sigmund (1988) and Monderer and Shapley (1996). Convergence of perturbed best response dynamics in potential games is proved by Hofbauer and Sandholm (2007). 
Shahshahani (1979), building on the early work of Kimura (1958), showed that the replicator dynamic for a potential game is a gradient dynamic after a “change in geometry”— that is, after the introduction of an appropriate Riemannian metric on int(X). Subsequently, Akin (1979, 1990) proved that Shahshahani’s (1979) result can also be represented using the change of variable presented in Theorem 6.1.9. The direct proof offered in the text is from Sandholm et al. (2008). Section 6.2. Theorem 6.2.1 is due to Nagurney and Zhang (1997); the proof in the text is from Sandholm et al. (2008). Theorem 6.2.4 was first proved for normal form games with an interior ESS by Hofbauer et al. (1979) and Zeeman (1980). Akin (1990, Theorem 6.4) and Aubin (1991, Section 1.4) extend this result to nonlinear single population games, while Cressman et al. (2001) extend it to linear multipopulation games. Section 6.2.2 follows Hofbauer and Sandholm (2007, 2008). These papers take inspiration from Hart and Mas-Colell (2001), which points out the role of integrability in models of regret-based learning in repeated normal form games. Hofbauer (2000) proves the convergence of the BNN, best response, and perturbed best response dynamics in normal 258 form games with an interior ESS. A proof of the existence of a cycle in Example 6.2.6 can be found in Hofbauer and Sandholm (2008); this reference also contains a statement and proof of the version of Danskin’s Envelope Theorem cited in the text. The probabilistic characterization of integrability alluded to the text is presented in Sandholm (2006b). For subdifferentials of convex functions, see Hiriart-Urruty and Lemar´ chal (2001); their e Example D.3.4 is especially relevant to our discussion in the text. Smith (1984) proves Theorem 6.2.11 for his dynamic; the general result presented here is due to Hofbauer and Sandholm (2008). Kojima and Takahashi (2007) consider a class of single population random matching games called anti-coordination games, in which at each state x, the worst response to x is always in the support of x. They prove (see also Hofbauer (1995b)) that such games must have a unique equilibrium, that this equilibrium is interior, and that it is globally asymptotically stable under the best response dynamic. However, also they present an example (due to Hofbauer) showing that neither the replicator dynamic nor the logit dynamic need converge in these games, the latter even at arbitrarily low noise levels. Section 6.3. Section 6.3.1 follows Berger (2007). Exercise 6.3.1(ii) is due to Hofbauer (1995b), and Lemma 6.3.2(i) is due to Monderer and Sela (1997). It is worth noting that Theorem 6.3.4 extends immediately to ordinal supermodular games (also known as quasi-supermodular games; see Milgrom and Shannon (1994)). Moreover, since ordinal potential games (Monderer and Shapley (1996)) are defined by the absence of cycles of improvement steps, a portion of the proof of Theorem 6.3.4 establishes the convergence of simple solutions of (BR) in nondegenerate ordinal potential games. Section 6.3.2 follows Hofbauer and Sandholm (2002, 2007). Section 6.4. Akin (1980) shows that starting from any interior population state, the replicator dynamic eliminates strategies that are strictly dominated by a pure strategy. Versions of Theorems 6.4.4 and 6.4.5 can be found in Nachbar (1990) and Samuelson and Zhang (1992); see also Hofbauer and Weibull (1996). Section 6.A. 
For properties of ω-limit sets of differential equations, see Robinson (1995); for ω-limit sets of differential inclusions, see Bena¨m et al. (2005). For applications of chain ı recurrence in the theory of learning in games, see Bena¨m and Hirsch (1999), Hofbauer ı and Sandholm (2002), and Bena¨m et al. (2005, 2006b). The Fundamental Theorem of ı Dynamical Systems is due to Conley (1978); see Robinson (1995) for a textbook treatment. Other good general references on notions of recurrence for differential equations include Nemytskii and Stepanov (1960), Akin (1993), and Bena¨m (1998, 1999). ı Section 6.B. The standard reference on Lyapunov functions for flows is Bhatia and Szeg˝ (1970). o 259 Section 6.C. The standard reference on cooperative differential equations and monotone dynamical systems is Smith (1995). Theorems 6.C.1, 6.C.2(i), 6.C.2(ii), and 6.C.3(i) in the text are Smith’s (1995) Theorems 4.1.1, 1.2.1, 1.2.3, and 2.4.5, respectively. Theorem 6.C.3(ii) combines Theorem 2.4.7 of Smith (1995) with Theorem 1.1 of Hirsch (1988), the latter after a reversal of time. 260 CHAPTER SEVEN Local Stability under Evolutionary Dynamics 7.0 Introduction In Chapter 6, we analyzed classes of games in which many evolutionary dynamics converge to equilibrium from all or most initial conditions. While we argued in Chapter 2 that games from many applications lie in these classes, it is certain that at least as many interesting games do not. In cases where global convergence results are not available, one can turn instead to analyses of local stability. If a society somehow finds itself playing a particular equilibrium, how can we tell whether this equilibrium will persist in the face of occasional, small disturbances in behavior? This chapter introduces a refinement of Nash equilibrium— that of an evolutionarily stable state (or ESS)—and establishes that an ESS is locally stable under many evolutionary dynamics. Our definition of ESS is one of many related definitions considered in the literature. We present our definition, its alternatives, and the connections among them in Section 7.3. We will see that games with an ESS share some structural properties with stable games, at least in the neighborhood of the ESS. Taking advantage of this connection, we show in Section 7.4 how to establish local stability of ESS under some dynamics through the use of local Lyapunov functions. Our results here build on our analyses in Section 6.2, where we constructed local Lyapunov functions for many dynamics for use in stable games. The other leading approach to local stability analysis is linearization. Given a rest point of a nonlinear (but smooth) dynamic, one can approximate the behavior of the dynamic in a neighborhood of the rest point by studying an appropriate linear dynamic: namely, the one defined by the derivative matrix of the nonlinear dynamic, evaluated at the rest 261 point in question. In Sections 7.5 and 7.6, we use linearization to study the two families of smooth dynamics introduced in Chapters 4 and 5: the imitative dynamics, and the perturbed best response dynamics. Surprisingly, this analysis will lead us to a deep and powerful connection between the replicator and logit dynamics, one that seems difficult to reach by other means. It is worth noting now that linearization is also very useful for establishing instability results. For this reason, the techniques we develop in this chapter will be a very important ingredient of our analyses in Chapter 8, where we study nonconvergent dynamics. 
The first two sections of the chapter formally establish some results that were hinted at in earlier chapters. In Section 7.1, we indicate two senses in which a non-Nash rest point of an imitative dynamic cannot be stable. In Section 7.2, we show that under most dynamics, a Nash equilibrium of a potential game is locally stable if and only if it is a local maximizer of potential. The linearization techniques used in Sections 7.5 and 7.6 and in Chapter 8 require a working knowledge of matrix analysis and linear differential equations; we present these topics in detail in Appendices 7.A and 7.B. The main theorems of linearization theory are themselves presented in Appendix 7.C. 7.1 Non-Nash Rest Points of Imitative Dynamics We saw in Chapters 4 and 5 that under five of our six classes of evolutionary dynamics, rest points are identical to Nash equilibria (or to perturbed versions thereof). The lone exception is the imitative dynamics: Theorem 4.4.21 shows that the rest points of these dynamics are the restricted equilibria, a set that includes not only the Nash equilibria, but also any state that would be Nash equilibria were the strategies unused at that state removed from the game. Theorem 4.7.1 established one sense in which these extra rest points are fragile: by combining a small amount of a “better behaved” dynamic with an imitative dynamic, one obtains a new dynamic that satisfies Nash stationarity. But we mentioned in Section 4.4.6 that this fragility can be expressed more directly: there we claimed that non-Nash rest points of imitative dynamics cannot be locally stable, and so are not plausible predictions of play. We are now in a position to formally establish this last claim. Recall from Observation 4.4.16 that imitative dynamics exhibit monotone percentage growth rates: they can be expressed in the form (7.1) p p p ˙ xi = xi Gi (x), 262 p p with the percentage growth rates Gi (x) ordered by payoffs Fi (x) as in equation (4.16). This fact drives our instability result. ˆ Theorem 7.1.1. Let VF be an imitative dynamic for population game F, and let x be a non-Nash ˆ rest point of VF . Then x is not Lyapunov stable under VF , and no interior solution trajectory of VF ˆ converges to x. ˆ Proof. (p = 1) Since x is a restricted equilibrium that is not a Nash equilibrium, each ˆ ˆ ˆ ˆ strategy j in the support of x satisfies F j (x) = F(x), and any best response i to x is an unused ˆ ˆ ˆ strategy that satisfies Fi (x) > F(x). Also, since x is a rest point of VF , equation (7.1) implies ˆ ˆ that each j in the support of x has G j (x) = 0. Thus, monotonicity of percentage growth ˆ ˆ rates implies that Gi (x) > G j (x) = 0, and so the continuity of Gi implies Gi (x) ≥ k > 0 on ˆ some small neighborhood O of x. Now let {xt } be an interior solution trajectory of VF (see Theorem 4.4.14). Then if xs ∈ O for all s ∈ (t, u), it follows that u u log((xu )i )−log((xt )i ) = t d ds log((xs )i ) ds = t ˙ (xs )i ds = (xs )i u Gi (xs ) ds ≥ k(u−t). t Rearranging and exponentiating yields (xu )i ≥ (xt )i exp(k(u − t)). Thus, during intervals that xs is in O, (xs )i is strictly increasing. This immediately implies ˆ that there is no neighborhood O of x such that solutions starting in O stay in O, and so ˆ x is not Lyapunov stable. Also, since (xt ) j cannot decrease inside O ∩ int(X), no interior ˆ solution trajectory can converge to x. 
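As a quick numerical illustration of this instability (a sketch, not part of the text's formal development; it assumes Python with NumPy and SciPy), consider a hypothetical two-strategy matching game in which strategy 2 strictly dominates strategy 1, so that e_1 is a restricted equilibrium but not a Nash equilibrium. Simulating the replicator dynamic from an interior state near e_1 shows the unused best response growing at the exponential rate appearing in the proof's bound.

# Replicator dynamic near a non-Nash restricted equilibrium (illustrative sketch).
# The game A is hypothetical: strategy 2 strictly dominates strategy 1, so e1 is
# a rest point of the replicator dynamic but not a Nash equilibrium.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 0.0],
              [1.0, 1.0]])

def replicator(t, x):
    F = A @ x                      # payoff vector F(x) = Ax
    return x * (F - x @ F)         # x_i' = x_i (F_i(x) - Fbar(x))

x0 = np.array([1 - 1e-6, 1e-6])    # interior state near the non-Nash rest point e1
sol = solve_ivp(replicator, (0.0, 10.0), x0, t_eval=np.linspace(0.0, 10.0, 11))

for t, x2 in zip(sol.t, sol.y[1]):
    print(f"t = {t:4.1f}   x_2 = {x2:.3e}")

# Near e1 the excess payoff of strategy 2 is roughly 1, so x_2 grows roughly like
# exp(t) until the solution leaves the neighborhood, matching the proof's bound
# (x_u)_i >= (x_t)_i exp(k(u - t)).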
7.2 Local Stability in Potential Games We saw in Section 6.1 that in potential games, the potential function serves as a strict Lyapunov function for any evolutionary dynamic satisfying positive correlation (PC); solution trajectories of such dynamics ascend the potential function and converge to connected sets of rest points. For dynamics that also satisfy Nash stationarity (NS), these sets consist entirely of Nash equilibria. That the potential function is a strict Lyapunov function has important implications for local stability of sets of rest points. Call A ⊆ X a local maximizer set of the function f : X → R if it is connected, if f is constant on A, and if there exists a neighborhood O of 263 A such that f (x) > f ( y) for all x ∈ A and all y ∈ O − A. Theorem 2.1.7 implies that such a set consists entirely of Nash equilibria. We call the set A ⊆ NE(F) isolated if there is a neighborhood of A that does not contain any Nash equilibria other than those in A. If the value of f is nondecreasing along solutions of a dynamic, then they cannot escape a neighborhood of a local maximizer set. If the value of f is actually increasing in this neighborhood, then solutions in the neighborhood should converge to the set. This is the content of the following theorem. Theorem 7.2.1. Let F be a potential game with potential function f , let VF be an evolutionary dynamic for F, and suppose that A ⊆ NE(F) is a local maximizer set of f . (i) If VF satisfies positive correlation (PC), then A is Lyapunov stable under VF . (ii) If in addition VF satisfies Nash stationarity (NS) and A is isolated, then A is an asymptotically stable set under VF . Proof. Part (i) of the theorem follows immediately from Lemma 6.1.1 and Theorem 6.B.2. To prove part (ii), note that (NS), (PC), and the fact that A is isolated imply that there is a neighborhood O of A such that f˙(x) = f (x) VF (x) > 0 for all x ∈ O − A. Corollary 6.B.7 then implies that A is asymptotically stable. For dynamics satisfying (PC) and (NS), being an isolated local maximizer set is not only a sufficient condition for being asymptotically stable; it is also necessary. Theorem 7.2.2. Let F be a potential game with potential function f , let VF be an evolutionary dynamic for F that satisfies (PC) and (NS). Suppose that A ⊆ NE(F) is a smoothly connected asymptotically stable set under VF . Then A is an isolated local maximizer set of f . Proof. Since A is a smoothly connected set of Nash equilibria, Exercise 2.1.15 implies that f takes some fixed value c throughout A. Now let ξ be an initial condition in O − A, where O is the basin of attraction of A. Then ω(ξ) ⊆ A. But since f is a strict Lyapunov function for VF , it follows that f (ξ) < c. Since ξ ∈ O − A was arbitrary, we conclude that that A is an isolated local maximizer set. Theorems 7.2.1 and 7.2.2 allow us to characterize locally stable rest points for dynamics satisfying positive correlation (PC). Since the best response and perturbed best response dynamics do not satisfy this condition, the former because of lack of smoothness and the latter because of the perturbations, Theorems 7.2.1 and 7.2.2 do not apply. In the case of the best response dynamic, Theorem 5.1.8 establishes analogues of (NS) and (PC), which in turn imply that solution trajectories ascend the potential function and converge to Nash equilibrium (Theorem 6.1.4). By using these results along with the arguments above, we obtain 264 Theorem 7.2.3. 
Let F be a potential game with potential function f , let VF be the best response dynamic for F, and let A ⊆ NE(F) be smoothly connected. Then A is an isolated local maximizer set of f if and only if A is asymptotically stable under VF . In the case of perturbed best response dynamics, the roles of conditions (PC) and (NS) are played by virtual positive correlation and perturbed stationarity (Theorem 5.2.13 and Observation 5.2.10). These in turn ensure that the dynamics ascend the perturbed potential function f˜(x) = f (x) − 1 mp vp ( mp xp ) p∈P (Theorem 6.1.6). Substituting these results into the arguments above yields Theorem 7.2.4. Let F be a potential game with potential function f , let VF,v be the perturbed best response dynamic for F generated by the admissible deterministic perturbations v = (v1 , . . . , vp ), and let A ⊆ PE(F, v) be smoothly connected. Then A is an isolated local maximizer set of f˜ if and only if A is asymptotically stable under VF,v . 7.3 Evolutionarily Stable States We now turn to the main ideas of this chapter by introducing a refinement of Nash equilibrium that is of basic importance in evolutionary modeling. 7.3.1 Definition Let F be a game played by p ≥ 1 populations. We call x ∈ X an evolutionarily stable state (ESS) of F if there is a neighborhood O of x such that (7.2) ( y − x) F( y) < 0 for all y ∈ O − {x}. The notion of ESS was first introduced in the context of single population random matching. The following exercise shows that under multipopulation random matching, the ESS condition is very restrictive: Exercise 7.3.1. Suppose that F has no own-population interactions: Fp (x) is independent p of xp for all p ∈ P . Show that if x∗ is an ESS of F, then it is a pure social state: (x∗ )p = mp ei for some i ∈ Sp and p ∈ P . (Hint: If xp is not pure, consider an invasion by y = ( yp , x−p ), where yp is an alternate best response to x.) 265 The conclusion of this exercise is similar in spirit and in content to that of Proposition 2.3.10, which showed that a stable multipopulation game without own-population interactions must be null stable. But just as strictly stable multipopulation games are quite common if own-population interactions are allowed, so too are interior ESSs. 7.3.2 Variations Our definition (7.2) of ESS is one of many that can be found in the literature. Some alternatives are equivalent to definition (7.2) in a single population setting, but that differ substantially in multipopulation settings. Exercise 7.3.2. Consider the following condition on a social state x ∈ X: (7.3) For all y ∈ O − {x}, there is a p ∈ P such that ( yp − xp ) Fp ( y) < 0. (i) Show that every ESS satisfies condition (7.3), but that the converse statement is false if p ≥ 2. (ii) Show that if F has no own-population interactions, then any state satisfying condition (7.3) is pure social state (cf Exercise 7.3.1). While definition (7.2) is the most useful one for studying the evolutionary dynamics considered in this book, condition (7.3) is more in the spirit to the original motivation for ESS. See the Notes for an extended discussion of this point. There are also alternative requirements that are equivalent to definition (7.2) under single population random matching, but that are distinct from condition (7.2) in nonlinear games. The following exercises explore some of these alternatives; see the Notes for additional discussion. Exercise 7.3.3. 
Consider the following requirement on social state x ∈ X: (7.4) ¯ For each y ∈ O − {x}, there exists an ε > 0 such that ¯ ( y − x) F(ε y + (1 − ε)x) < 0 for all ε ∈ (0, ε). (i) Explain requirement (7.4) in words. (ii) Show that any ESS as defined in (7.2) satisfies condition (7.4). (iii) Show that if F(x) = Ax is a single population random matching game, then conditions (7.2) and (7.4) are equivalent. (iv) Construct a three-strategy game with a state x∗ that satisfies condition (7.4) but that 1 is not an ESS. (Hint: Let x∗ = (0, 1 , 2 ), and let D1 and D2 be closed disks in X ⊂ R3 2 266 that are tangent to bd(X) at x∗ and whose radii are r1 and r2 > r1 . Introduce a payoff function of the form F(x) = −c(x)(x − x∗ ), where c(x) is positive on int(D1 ) ∪ (X − D2 ) and negative on int(D2 ) − D1 . Then apply Exercise 7.3.5 below.) Exercise 7.3.4. Consider this pair of conditions on social state x ∈ X: (7.5) x is a Nash equilibrium: ( y − x) F(x) ≤ 0 for all y ∈ X. (7.6) For all y ∈ O − {x}, ( y − x) F(x) = 0 implies that ( y − x) F( y) < 0. Show that this pair of conditions is equivalent to condition (7.4), and so are satisfied if x is an ESS. Exercise 7.3.5. The previous exercises imply that every ESS is a Nash equilibrium. Show further that every ESS is isolated in the set of Nash equilibria. Exercise 7.3.6. Consider this pair of conditions on social state x ∈ X: (7.5) x is a Nash equilibrium. (7.7) For all y ∈ O − {x}, ( y − x) F(x) = 0 implies that ( y − x) DF(x)( y − x) < 0. (i) Show that conditions (7.5) and (7.7) imply that x is an ESS. (ii) Show that if F(x) = Ax is a single population random matching game, then conditions (7.5) and (7.7) hold if and only if x is an ESS. (iii) Give an example of a two-strategy game with an ESS x∗ that fails condition (7.7). 7.3.3 Regular ESS To prove some of our local stability results, we need a version of the ESS concept that is slightly stronger than any of those proposed above. To introduce this condition, we first recall that a strict equilibrium is a pure Nash equilibrium at which in each population, the strategy in use earns a strictly higher payoff than all strategies not in use. In a similar vein, we call state x a quasistrict equilibrium if within each population, all strategies in use earn the same payoff, which is higher than the payoff of each unused strategy. Put differently, x is a quasistrict equilibrium if p p p p ¯ Fi (x) = Fp (x) > F j (x) whenever xi > 0 and x j = 0. With this definition in hand, we can introduce our refinement of ESS: we call state x a regular ESS if it is a quasistrict equilibrium that satisfies condition (7.7). It is clear from Exercise 7.3.6 that every regular ESS is in fact an ESS. 267 Let us point out an alternate characterization of regular ESS that will be useful later on. p For any set of strategies I ⊂ p∈P Sp , let Rn = { y ∈ Rn : y j = 0 whenever j I} denote the I set of vectors in Rn whose components corresponding to strategies outside I equal zero. Also, let S(x) ⊆ p∈P Sp denote the support of state x. Observation 7.3.7. State x is a regular ESS if and only if (7.8) x is a quasistrict equilibrium; (7.9) z DF(x)z < 0 for all nonzero z ∈ TX ∩ Rn(x) . S Condition (7.9) resembles the derivative condition we associate with strictly stable games. However, the condition need only hold at the equilibrium, and negative definiteness is only required to hold in directions that move along the face of X on which the equilibrium lies. 
For instance, if x is pure, condition (7.9) is vacuous, so the definition of regular ESS reduces to that of strict equilibrium. 7.4 Local Stability via Lyapunov Functions In this remainder of this chapter, we show that any (regular) ESS x∗ is locally stable under many evolutionary dynamics. In this section, our approach is to construct a strict local Lyapunov function for each dynamic in question: that is, a nonnegative function defined in a neighborhood of x∗ that vanishes precisely at x∗ and whose value decreases along solution of the dynamic other than the stationary one at x∗ . The results presented in Appendix 6.B show that the existence of such a function ensures the asymptotically stability of x∗ . The similarity between the definitions of ESS and of stable games—in particular, the negative semidefiniteness conditions that play a central role in both contexts—suggests the Lyapunov functions for stable games from Section 6.2 as the natural starting points for our stability analyses of ESSs. In some cases—under the projection and replicator dynamics, and whenever the ESS is interior—we will be able to use the Lyapunov functions from Section 6.2 without amendment. But more generally, these functions will require modifications to become local Lyapunovs function for ESSs. 7.4.1 The Replicator and Projection Dynamics The analysis is simplest in the cases of the replicator and projection dynamics. In Section 6.2, we proved global convergence of these dynamics in every strictly stable game 268 by showing that measures of “distance” from the game’s unique Nash equilibrium served as global Lyapunov functions. The proofs of these convergence results relied on nothing about the payoff structure of the game apart from the fact that the game’s unique Nash equilibrium is also a GESS. This observation suggests that if state x∗ is an ESS of an arbitrary population game, the same “distance” functions will serve as local Lyapunov functions for x∗ under the two dynamics. We confirm this logic in the following theorem. Theorem 7.4.1. Let x∗ be an ESS of F. Then x∗ is asymptotically stable under (i) the replicator dynamic for F; (ii) the projection dynamic for F. Exercise 7.4.2. Prove Theorem 7.4.1 by showing that the functions Hx∗ and Ex∗ from Theorems 6.2.4 and 6.2.1 define strict local Lyapunov functions for the two dynamics. 7.4.2 Target and Pairwise Comparison Dynamics: Interior ESS In proving convergence results for other classes of dynamics in Section 6.2, we relied directly on the negative semidefiniteness condition (2.15) that characterizes stable games. If a game admits an interior ESS that satisfies the strict inequalities in (7.9), then condition (2.15) holds in a neighborhood of the ESS. This allows us again to use the Lyapunov functions from Section 6.2 without amendment to prove local stability results. Theorem 7.4.3. Let x∗ be a regular interior ESS of F. Then x∗ is asymptotically stable under (i) any separable excess payoff dynamic for F; (ii) the best response dynamic for F; (iii) any impartial pairwise comparison dynamic for F. Exercise 7.4.4. Prove Theorem 7.4.3 by showing that the functions Γ, G, and Ψ from Theorems 6.2.8, 6.2.9, and 6.2.11 define a strict local Lyapunov functions for an ESS x∗ under the three dynamics. Rest points of perturbed best response dynamics generally do not coincide with Nash equilibria, and hence with ESSs. Nevertheless, the next exercise indicates that an appropriate negative definiteness condition is still enough to ensure local stability. 
˜ Exercise 7.4.5. Let x be a perturbed equilibrium of (F, v) for some admissible deterministic ˜ perturbations v = (v1 , . . . vp ), and suppose that z DF(x)z < 0 for all nonzero z ∈ TX. Show ˜ ˜ that x is isolated in the set of perturbed equilibria, and that the function G from Theorem ˜ ˜ 6.2.10 defines a strict local Lyapunov function for x. (Hint: To show that x is isolated, use the argument at the end of the proof of Theorem 6.2.10.) 269 For consistency with our previous results, it is natural to try prove stability results for games with an interior ESS. To do so, we need to assume that the size of the perturbations is “small”, in the hopes that there will be a perturbed equilibrium that is “close” to the ESS. Since the logit dynamic is parameterized by a noise level η, it provides a natural setting for the result we seek. Theorem 7.4.6. Let x∗ be a regular interior ESS of F. Then for some neighborhood O of x∗ and each ˜ ˆ η > 0 less than some η > 0, there is a unique logit(η) equilibrium xη in O, and this equilibrium η ˜ is asymptotically stable under the logit(η) dynamic. Finally, x varies continuously in η, and ˜ limη→0 xη = x∗ . Proof. (p = 1) Theorem 6.2.10 and Exericises 5.2.6 and 5.2.7 show that for η > 0, the function −1 ˆ ˜ η (x) = η log G exp(η F j (x)) + η x j log x j . j∈S j∈S (with 0 log 0 ≡ 0) is a Lyapunov function for the logit(η) dynamic when F is a stable game. If we define ˜ ˆ G0 (x) ≡ G(x) = max F j (x). j∈S ˜ to be the Lyapunov function for the best response dynamic in stable games, then Gη (x) is continuous in (x, η) on X × [0, ∞), By Exercise 7.4.3, G defines a strict local Lyapunov function for the best response dynamic at the interior ESS x∗ . In particular, x∗ is local minimizer of G: there is an open, convex neighborhood O ⊂ X of x∗ such that G(x) > G(x∗ ) for all x ∈ O − {x∗ }. Moreover, since F is C1 and satisfies z DF(x∗ )z < 0 for all nonzero z ∈ TX, we can choose O in such a way that z DF(x)z < 0 for all nonzero z ∈ TX and x ∈ O. ˜ Because Gη (x) is continuous in (x, η), the Theorem of the Maximum (see the Notes) implies that the map ˜ ˜ η → β(η) ≡ argmin Gη (x). x∈cl(O) ˜ is upper hemicontinuous on [0, ∞). Thus, since β(0) = {x∗ } ⊂ O (in particular, since ˜ ˆ ˆ x∗ bd(O)), there is an η > 0 such that β(η) ⊂ O for all η < η. This implies that each η η ˜ ˜ ˜ x ∈ β(η) is a local minimum of G not only with respect to cl(O), but also with respect to the full state space X. 270 ˜ Exercise 7.4.5 implies that the value of Gη is decreasing along solutions to the logit(η) ˜ dynamic in the set O, implying that each local minimizer xη is a rest point of this dynamic— η ˜ indeed, x must be an asymptotically stable rest point. Finally, since O is convex, the last ˆ˜ paragraph of the proof of Theorem 6.2.10 shows that when η < η, β(η) ⊂ O is a singleton. This completes the proof of the theorem. 7.4.3 Target and Pairwise Comparison Dynamics: Boundary ESS It remains for us to prove local stability results for boundary ESSs for the the dynamics considered in Theorem 7.4.3. Theorem 7.4.7. Let x∗ be a regular ESS of F. Then x∗ is asymptotically stable under (i) any separable excess payoff dynamic for F; (ii) the best response dynamic for F; (iii) any impartial pairwise comparison dynamic for F. To prove Theorem 7.4.7, we show that suitably modified versions of the Lyapunov functions for stable games serve as local Lyapunov functions here. 
Letting Sp (x∗ ) = support((x∗ )p ) and C > 0, we augment the functions Γ, G, and Ψ from Section 6.2 by the function (7.10) p Υx∗ (x) = C p∈P j Sp (x∗ ) xj , which is proportional to the number of agents using strategies outside the support of x∗ . We provide a detailed proof of the theorem for the case of impartial pairwise comparison dynamics, and leave the proofs of the other two cases as exercises. ˙ Proof of Theorem 7.4.7(iii). (p = 1) Let x = VF (x) be an impartial pairwise comparison 1 dynamic for F. Define the C function Ψx∗ : X → R by Ψx∗ (x) = Ψ(x) + Υx∗ (x) = Ψ(x) + C x j. j S(x∗ ) Here Ψ is the Lyapunov function defined in Theorem 6.2.11, and Υx∗ is as defined in equation (7.10); the constant C > 0 will be determined later. Since VF is an impartial pairwise comparison dynamic, Theorem 6.2.11 shows that the function Ψ is nonnegative, with Ψ(x) = 0 if and only if x ∈ NE(F). It follows that Ψx∗ too is nonnegative, with Ψx∗ (x) = 0 if and only if x is a Nash equilibrium of F with 271 support(x) ⊆ support(x∗ ). Thus, since x∗ is a regular ESS, it is isolated in the set of Nash equilibria (see Exercise 7.3.5), so there is a neighborhood O of x∗ on which x∗ is the unique ˙ zero of Ψx∗ . If we can show that there is also a neighborhood O of x∗ such that Ψx∗ (x) < 0 ∗ }, then Ψ ∗ is a strict local Lyapunov function for x∗ , so the conclusion for all x ∈ O − {x x of the theorem will follow from Corollary 6.B.7. To reduce the amount of notation in the analysis to come, let 10 ∈ Rn be the vector whose jth component equals 0 if j ∈ support(x∗ ) and equals 1 otherwise, so that (10 ) x is the mass of agents who use strategies outside the support of x∗ at state x. Then we can write Ψx∗ (x) = Ψ(x) + C (10 ) x, and so can express the time derivative of Ψx∗ as ˙ ˙ ˙ Ψx∗ (x) = Ψ(x) + C (10 ) x. Now the proof of Theorem 6.2.11 shows that the time derivative of Ψ satisfies ˙ ˙ ˙ Ψ(x) ≤ x DF(x)x, with equality holding precisely at the Nash equilibria of VF . Thus, to finish the proof, it is enough to show that ˙ ˙ ˙ x DF(x)x + C (10 ) x ≤ 0 for all x ∈ O − {x∗ }. This follows directly from the following lemma, choosing C ≥ M/N. ˙ Lemma 7.4.8. Let x = VF (x) be a pairwise comparison dynamic for F, and let x∗ be a regular ESS of F. Then there is a neighborhood O of x∗ and constants M, N > 0 such that for all x ∈ O , ˙ ˙ (i) x DF(x)x ≤ M (10 ) x; ˙ (ii) (10 ) x ≤ −N (10 ) x. Proof. Suppose without loss of generality that S(x∗ ) = support(x∗ ) is given by {1, . . . , n∗ }. Then to complement 10 ∈ Rn , let 1∗ ∈ Rn be the vector whose first n∗ components equal 1 and whose remaining components equal 0, so that 1 = 1∗ + 10 . Next, decompose the identity matrix I as I∗ + I0 , where I∗ = diag(1∗ ) and I0 = diag(10 ), and finally, decompose I∗ 1 as Φ∗ +Ξ∗ , where Ξ∗ = n∗ 1∗ (1∗ ) and Φ∗ = I∗ −Ξ∗ . Notice that Φ∗ is the orthogonal projection p of Rn onto Rn ∩ Rn(x∗ ) = {z ∈ Rn : z j = 0 whenever j S(x∗ )}, and that I = Φ∗ + Ξ∗ + I0 . 0 0 S Using this decomposition of the identity matrix, we can write (7.11) ˙ ˙ ˙ ˙ x DF(x)x = ((Φ∗ + Ξ∗ + I0 ) x) DF(x)((Φ∗ + Ξ∗ + I0 ) x) ∗ x) DF(x)(Φ∗ x) + ((Ξ∗ + I0 ) x) DF(x) x + (Φ∗ x) DF(x)((Ξ∗ + I0 ) x). ˙ ˙ ˙ ˙ ˙ = (Φ ˙ 272 Since x∗ is a regular ESS, we know that z DF(x∗ )z < 0 for all nonzero z ∈ TX ∪ Rn(x∗ ) . Thus, S since DF(x) is continuous in x, there is a neighborhood O of x∗ on which the first term of (7.11) is nonpositive. 
˙ Turning to the second term, note that since 1 x = 0 and (10 ) = 1 I0 , we have that 1 1 ˙ ˙ ˙ (Ξ∗ + I0 )x = ( n∗ 1∗ (1∗ ) + I0 )x = (− n∗ 1∗ (10 ) + I0 )x = ((I − 1∗ 11 n∗ ˙ )I0 )x. Let A denote the spectral norm of the matrix A (see Appendix 7.A.6). Then applying spectral norm inequalities and the Cauchy-Schwarz inequality, we find that (7.12) ˙ ˙ ((Ξ∗ + I0 )x) DF(x)x = ((I − 1∗ 11 n∗ ˙ ˙ ˙ )I0 x) DF(x)x ≤ I0 x I− 1 1(1∗ ) n∗ DF(x) ˙ x. Since DF(x), VF (x), and ρi j (F(x)) are continuous in x on the compact set X, we can find constants K and R such that I− 1 1(1∗ ) n∗ DF(x) ˙ x ≤ K and max ρi j (F(x), x) ≤ R for all x ∈ X. i, j∈S From the bound on ρi j , it follows that ˙ I0 x = j>n∗ ˙ xj 2 ˙ xj ≤ j>n∗ = xk ρk j (F(x), x) − x j j>n∗ k∈S ≤ j>n∗ ρ jk (F(x), x) k∈S xk ρk j (F(x), x) + x j k∈S ≤ 2Rn k∈S ρ jk (F(x), x) xj j>n∗ = 2Rn (10 ) x. We therefore conclude that at all x ∈ O , ˙ ˙ ((Ξ∗ + I0 )x) DF(x)x ≤ 2KRn (10 ) x. Essentially the same argument provides a similar bound on the third term of (7.11), 273 completing the proof of part (i) the lemma. We proceed with the proof of part (ii). Since x∗ is quasistrict, we have that Fi (x∗ ) = ¯ F(x∗ ) > F j (x∗ ) for all i ∈ support(x∗ ) = {1, . . . , n∗ } and all j support(x∗ ) = {n∗ + 1, . . . , n}. Therefore, since the pairwise comparison dynamic satisfies sign preservation (4.23), we have for such i and j that ρ ji (F(x∗ )) > 0 and ρi j (F(x∗ )) = 0. So, since F and ρ are continuous, sign preservation implies that there is a neighborhood O of x∗ and an r > 0 such that ρ ji (F(x)) > r and ρi j (F(x)) = 0 for all i ≤ n∗ , j > n∗ , and x ∈ O . Applying this observation and then canceling like terms, we find that for all x ∈ O , ˙ (10 ) x = ˙ xj j>n∗ xk ρk j (F(x)) − x j = j>n∗ k∈S = xk ρk j (F(x)) − x j j>n∗ k>n∗ =− k∈S ρ ji (F(x)) xj j>n∗ k∈S ρ jk (F(x)) ρ jk (F(x)) i≤n∗ ≤ −r n∗ (10 ) x. This completes the proof of the lemma, and thus the proof of Theorem 7.4.7. Exercise 7.4.9. Prove Theorem 7.4.7(ii) (for p = 1) by showing that under the best response dynamic, the function Gx∗ (x) = G(x) + Υx∗ (x) = max ( y − x) F(x) + C y∈X xj j S(x∗ ) is a strict local Lyapunov function for any regular ESS x∗ . (Hint: The proof is nearly the same as the one above, but building on the proof of Theorem 6.2.9 instead of the proof of Theorem 6.2.11, and using Theorems 6.B.2 and 6.B.6 in place of Corollary 6.B.7.) Exercise 7.4.10. Prove Theorem 7.4.7(i) (for p = 1) by showing that under the separable excess payoff dynamic with revision protocol τ, the function ˆ Fi (x) Γx∗ (x) = Γ(x) + Υx∗ (x) = τi (s) ds + C i∈S 0 xj j S(x∗ ) is a strict local Lyapunov function for any regular ESS x∗ . (Hint: Establish this variant of 274 Lemma 7.4.8: under the excess payoff dynamic generated by τ, there is a neighborhood O of x∗ such that ˙ ˙ (i) x DF(x)x ≤ K T(x) (10 ) x and ˙ (ii) (10 ) x = −T(x) (10 ) x, for all x ∈ O , where T(x) = 7.5 i∈S ˆ τi (Fi (x)).) Linearization of Imitative Dynamics In this section and the next, we study the stability of rest points of evolutionary dynamics using linearization. This technique requires the dynamic in question to be smooth, at least near the rest point in question, and it can be inconclusive in borderline cases. But, more optimistically, it does not require the guesswork needed to find Lyapunov functions. Furthermore, instead of establishing just asymptotic stability, a rest point found stable via linearization (that is, one that is linearly stable) must attract solutions from all nearby initial conditions at an exponential rate. 
Linearization is also very useful for proving that a rest point is unstable, a fact we will avail ourselves of repeatedly when studying nonconvergence in Chapter 8. Finally, linearization techniques allow us to prove local stability results for imitative dynamics other than the replicator dynamic, for which no Lyapunov functions have been proposed. The appendix to this chapter explains the techniques from matrix analysis (Appendix 7.A), linear differential equation theory (Appendix 7.B), and linearization theory (Appendix 7.C) used in this chapter and the next. We assume in the remainder of this chapter and in the next chapter that payoffs are defined on the positive orthant (see Appendix 2.A.7), as doing so will allow us to avoid the using affine calculus. Reviewing multivariate product and chain rules from Appendix 2.A.4 may be helpful for following the arguments to come. We begin the analysis with some general background on linearization of evolutionary dynamics. Recall that a single population dynamic (D) ˙ x = V (x) describes the evolution of the population state through the simplex X. In evaluating the stability of the rest point x∗ using linearization, we are relying on the fact that near x∗ , the dynamic (D) can typically be well approximated by the linear dynamic (L) ˙ y = DV (x∗ ) y. 275 Because we are only interested in how (D) behaves on the simplex, we only care about how (L) behaves on the tangent space TX. Indeed, it is only because (D) defines a dynamic on X that it makes sense to think of (L) as a dynamic on TX. At each state x ∈ X, V (x) ∈ TX describes the current direction of motion through the simplex. It follows that the derivative DV (x) must map any tangent vector z into TX, as one can verify by writing V (x + z) = V (x) + DV (x)z + o(|z|) ˙ and noting that V (x) and V (x + z) are both in TX. Thus, in (L), y lies in TX whenever y lies in TX, implying that TX is invariant under (L). Keeping this argument in mind is important when using linearization to study stability under the dynamic (D): rather than looking at all the eigenvalues of DV (x∗ ), we should only consider those associated with the restricted linear map DV (x∗ ) : TX → TX, which sends each tangent vector z ∈ TX to a new tangent vector DV (x∗ )z ∈ TX. The scalar λ = a + ib is an eigenvalue of this restricted map if DV (x∗ )z = λz for some vector z whose real and imaginary parts are both TX. If all eigenvalues of this restricted map have negative real part, then the rest point x∗ is linearly stable under (D) (cf Corollary 7.C.2). Hines’s Lemma, stated next and proved in Appendix 7.A.7, is often the key to making these determinations. In stating this result, we let Rn = {z ∈ Rn : z 1 = 0} denote the 0 tangent space of the simplex. In the single population case, TX and Rn are the same, but it 0 p is useful to separate these two notations in multipopulation cases, where TX = p∈P Rn 0 Lemma 7.5.1. Suppose that Q ∈ Rn×n is symmetric, satisfies Q1 = 0, and is positive definite with respect to Rn , and that A ∈ Rn×n is negative definite with respect to Rn . Then each eigenvalue of 0 0 the linear map QA : Rn → Rn has negative real part. 0 0 7.5.1 The Replicator Dynamic In this section, we show that any regular ESS x∗ is linearly stable under the replicator dynamic. To begin, we focus on the case in which x∗ is interior. Theorem 7.5.2. Let x∗ be an regular interior ESS of F. Then x∗ is linearly stable under the replicator dynamic. Proof. 
(p = 1) The single population replicator dynamic is given by

(R)  ẋ_i = V_i(x) = x_i F̂_i(x).

To compute DV(x), recall from equation (6.20) that the derivative of the excess payoff function F̂(x) = F(x) − 1F̄(x) is given by

DF̂(x) = DF(x) − 1(x′DF(x) + F(x)′) = (I − 1x′)DF(x) − 1F(x)′.

Then applying the product rule for componentwise products (see Appendix 2.A.4), we find that

(7.13)  DV(x) = D(diag(x)F̂(x))
              = diag(x)DF̂(x) + diag(F̂(x))
              = diag(x)((I − 1x′)DF(x) − 1F(x)′) + diag(F̂(x))
              = Q(x)DF(x) − x F(x)′ + diag(F̂(x)),

where we write Q(x) = diag(x) − xx′.

Since x∗ is an interior Nash equilibrium, F(x∗) is a constant vector, implying that F(x∗)′Φ = 0′ and that F̂(x∗) = 0. Thus, equation (7.13) becomes

(7.14)  DV(x∗)Φ = Q(x∗)DF(x∗)Φ.

Since the matrices Q(x∗) and DF(x∗)Φ satisfy the conditions of Hines's Lemma, the eigenvalues of DV(x∗)Φ (and hence of DV(x∗)) corresponding to directions in R^n_0 have negative real part. This completes the proof of the theorem.

Exercise 7.5.3. Let x∗ be an interior Nash equilibrium of F that satisfies z′DF(x∗)z > 0 for all nonzero z ∈ TX. Show that x∗ is a source under the replicator dynamic: all relevant eigenvalues of DV(x∗) have positive real part, implying that all solutions of the replicator dynamic that start near x∗ are repelled. (Hint: See the discussion in Appendix 7.A.7.) Also, construct a game with an equilibrium that satisfies the conditions of this result.

Exercise 7.5.4. Show that any regular interior ESS is linearly stable under the projection dynamic.

The next example highlights the fact that being a regular ESS is only a sufficient condition for an interior equilibrium to be locally stable under the replicator dynamic, not a necessary condition.

Example 7.5.5. Zeeman's game revisited. In Example 5.1.7, we introduced the single population game F(x) = Ax generated by random matching in

A = (  0   6  −4 )
    ( −3   0   5 )
    ( −1   3   0 ).

This game admits Nash equilibria at states x∗ = (1/3, 1/3, 1/3), (4/5, 0, 1/5), and e_1; the replicator dynamic has rest points at these states, as well as at the restricted equilibria (0, 5/8, 3/8), e_2, and e_3. Examining the phase diagram in Figure 7.5.1, we see that the behavior of the dynamic near the non-Nash rest points is consistent with Theorem 7.1.1.

Since F is not a stable game (why not?), Theorem 7.5.2 does not tell us whether x∗ is stable. But we can check this directly: following the proof of Theorem 7.5.2, we compute

DV(x∗)Φ = Q(x∗)DF(x∗)Φ = Q(x∗)AΦ = (1/9) (  4   9  −13 )
                                          ( −5  −9   14 )
                                          (  1   0   −1 ).

In addition to the irrelevant eigenvalue of 0 corresponding to eigenvector 1, this matrix has a pair of complex eigenvalues, −1/3 ± i(√2/3), corresponding to eigenvectors (−2 ± i(3√2), 1 ∓ i(3√2), 1), whose real and imaginary parts lie in R^n_0. Since the real parts of the relevant eigenvalues are both −1/3, the Nash equilibrium x∗ is linearly stable under the replicator dynamic. §

We now establish the stability of all regular ESSs.

Theorem 7.5.6. Let x∗ be a regular ESS of F. Then x∗ is linearly stable under the replicator dynamic.

Proof. (p = 1) Suppose without loss of generality that the support of x∗ is {1, . . . , n∗}, so that the number of unused strategies at x∗ is n_0 = n − n∗. For any matrix M ∈ R^{n×n}, we let M_{++} ∈ R^{n∗×n∗} denote the upper left n∗ × n∗ block of M, and we define the blocks M_{+0} ∈ R^{n∗×n_0}, M_{0+} ∈ R^{n_0×n∗}, and M_{00} ∈ R^{n_0×n_0} similarly. Also, for each vector v ∈ R^n, we let v_+ ∈ R^{n∗} and v_0 ∈ R^{n_0} denote the upper and lower "blocks" of v.
Recall our expression (7.13) for the derivative matrix of the replicator dynamic: ˆ DV (x) = Q(x)DF(x) − x F(x) + diag(F(x)), ˆ where Q(x) = diag(x) − xx . Now observe that x∗ = 0 for all j > n∗ , that Fi (x∗ ) = 0 for all j ˆ i ≤ n∗ , and, since x∗ is quasistrict, that F j (x∗ ) < 0 for all j > n∗ (see the proof of Lemma 278 1 2 3 Figure 7.5.1: The replicator dynamic in Zeeman’s game. ˆ ˆ 4.5.4). Therefore, by writing Q = Q(x∗ ), D = DF(x∗ ), π = F(x∗ ), and π = F(x∗ ), we can express DV (x∗ ) in the block diagonal form (7.15) Q++ D++ − (x∗ )+ (π+ ) DV (x∗ ) = 0 Q++ D+0 − x∗ (π0 ) . 0 ˆ diag(π ) To complete the proof of the theorem, we need to show that if v + iw with v, w ∈ Rn is an 0 ∗ ) with eigenvalue a + ib, then a < 0. eigenvector of DV (x We split the analysis into two cases. Suppose first that (v + iw)0 = 0 (i.e., that v j = w j = 0 whenever j > n∗ ). Then it is easy to see that (v + iw)+ must be an eigenvector of DV (x∗ )++ = Q++ D++ − (x∗ )+ (π+ ) . Now because x∗ is a Nash equilibrium with support {1, . . . , n∗ }, π+ is a constant vector, and since v, w ∈ Rn and (v + iw)0 = 0, the components of (v + iw)+ sum to 0 ∗∗ zero. Together, these observations imply that (x∗ )+ (π+ ) (v + iw)+ = 0. Finally, Q++ ∈ Rn ×n ∗∗ and D++ ∈ Rn ×n satisfy the conditions of Hines’s Lemma, the latter by requirement (7.9) for regular ESSs, and so this lemma enables us to conclude that a < 0. Now suppose that (v + iw)0 0, so that v j + iw j 0 for some j > n∗ . Then since ˆ the lower right block of DV (x∗ ) is the diagonal matrix diag(π0 ), the jth component of the ∗ ) is π (v + iw ) = (a + ib)(v + iw ), implying that a = π (and ˆj j ˆj eigenvector equation for DV (x j j j 279 also that b = w j = 0). But as we noted above, the fact that x∗ is a quasistrict equilibrium ˆ implies that π j < 0, and so that a < 0. This completes the proof of the theorem. Exercise 7.5.7. Suppose that x∗ = ei is a strict equilibrium of F. Show that for each j the vector e j − ei is an eigenvector of DV (x∗ ) with eigenvalue F j (x∗ ) − Fi (x∗ ). i, Exercise 7.5.8. Suppose that x∗ is a quasistrict Nash equilibrium of F. We saw in Theorem ˆ 7.5.6 that for each unused strategy j, the excess payoff F j (x∗ ) is an eigenvalue of DV (x∗ ) ˆ corresponding to an eigenvector in TX. Assume that F j (x∗ ) is not an eigenvalue of DV (x∗ ) corresponding to an eigenvector in TX ∩ Rn(x∗ ) . Show that S 1 ζ + n∗ 1 −ι ∈ TX j ˆ is an eigenvector of DV (x∗ ) corresponding to eigenvalue F j (x∗ ), where ι j is the appropriate 0 ∗ standard basis vector in Rn , and where ζ is the unique vector in Rn satisfying 1 ζ = 0 and ˆ ˆ1 (Q++ D++ − π j I) ζ = π j ( n∗ 1 − (x∗ )+ ) + Q++ (D+0 ι j − 1 D++ 1). n∗ Why is there exactly one vector that satisfies these conditions? What goes wrong if the ˆ restriction on F j (x∗ ) does not hold? 7.5.2 General Imitative Dynamics Theorem 7.5.6 established the local stability of all regular ESSs under the replicator dynamic. Theorem 7.5.9 parlays the previous analysis into a local stability for all imitative dynamics. Theorem 7.5.9. Assume that x∗ is a hyperbolic rest point of both the replicator dynamic (R) and a given imitative dynamic (4.5). Then x∗ is linearly stable under (R) if and only if it is linearly stable under (4.5). Thus, if x∗ is a regular ESS that satisfies the hyperbolicity assumptions, it is linearly stable under (4.5). Proof. (p = 1) We only consider the case in which x∗ is interior; for boundary cases, see Exercise 7.5.12. 
Recall from Observation 4.4.16 that any imitative dynamic (4.5) has monotone percentage growth rates: we can express the dynamic as (7.16) ˙ xi = xi Gi (x), where 280 (7.17) Gi (x) ≥ G j (x) if and only if Fi (x) ≥ F j (x). Lemma 7.5.10 shows that property (7.17) imposes a remarkable amount of structure on the derivative matrix of the percentage growth rate function G at the equilibrium x∗ . Lemma 7.5.10. Let x∗ be an interior Nash equilibrium, and suppose that ΦDF(x∗ ) and ΦDG(x∗ ) define invertible maps from TX to itself. Then ΦDG(x∗ )Φ = c ΦDF(x∗ )Φ for some c > 0. Proof. Since x∗ is a Nash equilibrium, and hence a rest point of (7.16), we have that ΦF(x∗ ) = ΦG(x∗ ) = 0. It follows that (7.18) ΦF(x∗ + εz) = εΦDF(x∗ )z + o(ε) and ΦG(x∗ + εz) = εΦDG(x∗ )z + o(ε). for all z ∈ TX. Since we can rewrite condition (7.17) as (ei − e j ) G(x) ≥ 0 if and only if (ei − e j ) F(x) ≥ 0, and since ei − e j ∈ TX, equation (7.18) implies that for all i, j ∈ S and z ∈ TX, (7.19) (ei − e j ) ΦDG(x∗ )z ≥ 0 if and only if (ei − e j ) ΦDF(x∗ )z ≥ 0. (This observation is trivial when z = 0, and when z 0 it follows from the fact that the linear terms dominate in (7.18) when ε is small.) By Proposition 2.B.6, condition (7.19) is equivalent to the requirement that for all i, j ∈ S, there is a ci j > 0 such that (7.20) (ei − e j ) ΦDG(x∗ )Φ = ci j (ei − e j ) ΦDF(x∗ )Φ. Now write gi j = (ei − e j ) ΦDG(x∗ )Φ and fi j = (ei − e j ) ΦDF(x∗ )Φ. Since by assumption ΦDF(x∗ )Φ is an invertible map from TX to itself, so is its transpose (see Exercise 7.5.11 below). Therefore, when i, j, and k are distinct, the unique decomposition of fik as a linear combination of fi j and f jk is as fi j + f jk . But equation (7.20) reveals that ci j fi j + c jk f jk = gi j + g jk = gik = cik fik , and so ci j = c jk = cik . This and the fact that ci j = c ji imply that ci j is independent of i and j. So, since vectors of the form ei − e j span TX, we conclude from equation (7.20) that ΦDG(x∗ )Φ = c ΦDF(x∗ )Φ, where c is the common value of the constants ci j . This completes the proof of the lemma. ˆ We proceed with the proof of Theorem 7.5.9. Let V (x) = diag(x)F(x) and W (x) = 281 diag(x)G(x) denote the replicator dynamic (R) and the dynamic (7.16), respectively. Since ˆ W (x) ∈ TX, we have that 1 W (x) = x G(x) = 0, and hence that G(x) ≡ G(x) − 1x G(x) = G(x). ˆ Thus (7.16) can be rewritten as W (x) = diag(x)G(x). Now, repeating calculation (7.13) reveals that ˆ DW (x) = Q(x)DG(x) − 1G(x) + diag(G(x)). Since x∗ is an interior rest point of W , G(x∗ ) is a constant vector, and so DW (x∗ )Φ = Q(x∗ )DG(x∗ )Φ = Q(x∗ )ΦDG(x∗ )Φ, where the second equality follows from the fact that Q(x∗ )1 = 0. Similar reasoning for the replicator dynamic V shows that DV (x∗ )Φ = Q(x∗ )ΦDF(x∗ )Φ Lemma 7.5.10 tells us that ΦDG(x∗ )Φ = cΦDF(x∗ )Φ for some c > 0. We therefore conclude from the previous two equations that if x∗ is a hyperbolic rest point under V and W , its stability properties under the two dynamics are the same. Exercise 7.5.11. Suppose that A ∈ Rn×n defines an invertible map from Rn to itself and 0 maps the vector 1 to the origin. Show that A must also have these properties. (Hint: Use the Fundamental Theorem of Linear Algebra (7.27).) Exercise 7.5.12. Extend the proof of Theorem 7.5.9 above to the case of boundary equilibria. (Hint: Combine Lemma 7.5.10 with the proof of Theorem 7.5.6.) 
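Before turning to perturbed best response dynamics, here is a small computational check of the linearization carried out in Example 7.5.5 (a sketch, not part of the text; it assumes Python with NumPy). It forms Q(x∗)AΦ for Zeeman's game at the interior equilibrium x∗ = (1/3, 1/3, 1/3) and confirms that, apart from the irrelevant zero eigenvalue in the direction of 1, the eigenvalues are approximately −1/3 ± i√2/3, so both relevant eigenvalues have negative real part.

# Linearization of the replicator dynamic at the interior equilibrium of
# Zeeman's game (numerical check of Example 7.5.5).
import numpy as np

A = np.array([[ 0.0, 6.0, -4.0],
              [-3.0, 0.0,  5.0],
              [-1.0, 3.0,  0.0]])
x_star = np.ones(3) / 3                         # interior Nash equilibrium
Q = np.diag(x_star) - np.outer(x_star, x_star)  # Q(x*) = diag(x*) - x*x*'
Phi = np.eye(3) - np.ones((3, 3)) / 3           # orthogonal projection onto R^n_0

M = Q @ A @ Phi                                 # DV(x*)Phi = Q(x*)DF(x*)Phi
print(np.round(np.linalg.eigvals(M), 4))
# Output: one zero eigenvalue plus approximately -0.3333 +/- 0.4714j,
# i.e. -1/3 +/- i*sqrt(2)/3, in agreement with Example 7.5.5.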
7.6 Linearization of Perturbed Best Response Dynamics Linearization is also a useful tool for studying perturbed best response dynamics, our other main class of differentiable evolutionary dynamics. 7.6.1 Deterministically Perturbed Best Response Dynamics In Chapter 5, we saw that perturbed best response dynamics can be defined in terms of either stochastic or deterministic payoff perturbations. But Theorem 5.2.2 showed that there is no loss of generality in focusing on the later case, and so we will do so here. 282 Our first result shows that a negative definiteness condition on the payoff derivative is a sufficient condition for stability. The conclusion here is similar to that from Exercise 7.4.5, but the analysis is much simpler, and establishes not only asymptotic stability, but also linear stability. ˜ Theorem 7.6.1. Consider the perturbed best response dynamic for the pair (F, v), and let x be ˜ ˜ a perturbed equilibrium of this pair. If DF(x) is negative definite with respect to TX, then x is linearly stable. Proof. (p = 1) In the single population case, the stochastically perturbed best response dynamic takes the form (7.21) ˜ ˙ x = M(F(x)) − x, ˜ where the perturbed maximizer function M is defined in equation (5.12). By the chain rule, the derivative of law of motion (7.21) is (7.22) ˜ DV (x) = DM(F(x))DF(x) − I. ˜ To determine the eigenvalues of the product DM(F(x))DF(x), let us recall the properties ˜ of the derivative matrix DM(π) from Corollary 5.C.5: it is symmetric, positive definite on ˜ ˜ Rn , and satisfies DM(π)1 = 0. Since we have assumed that DF(x) is negative definite with 0 ˜ ˜ ˜ respect to Rn , Hines’s Lemma implies that the eigenvalues of DM(F(x))DF(x) (as a map 0 n from R0 to itself) have negative real part. Subtracting the identity matrix I from the matrix product reduces each of these eigenvalues by 1, so the theorem is proved. Exercise 7.6.2. Show that the conclusion of the theorem continues to hold if DF(x) is only negative semidefinite with respect to TX. (Hint: See the discussion in Appendix 7.A.7.) ¯ ˜ Exercise 7.6.3. Let x be a perturbed equilibrium for (F, v). Let λ be the largest eigenvalue ˜ ˜ ¯ ˜ of DM(F(x)), and let s be the largest singular value of ΦDF(x)Φ (see Section 7.A.6). Show ¯¯ ˜ ˜ that if λ s < 1, then x is linearly stable: that is, x is stable whenever choice probabilities are not too sensitive to changes in payoffs, or payoffs are not too sensitive to changes in the state. 7.6.2 The Logit Dynamic Imposing the additional structure provided by logit choice allows us to carry our local stability analysis further. First, building on Theorem 7.4.6, we argue that any regular 283 interior ESS must have a linearly stable logit(η) equilibrium nearby whenever the noise level η is sufficiently small. Corollary 7.6.4. Let x∗ be a regular interior ESS of F. Then for some neighborhood O of x∗ and ˆ all η > 0 less than some η > 0, there is a unique and linearly stable logit(η) equilibrium xη in O. Proof. (p = 1) Theorem 7.4.6 tells us that for η small enough, the equilibrium xη exists ˜ and is unique, and that limη→0 xη = x∗ . Since x∗ is a regular interior ESS, DF(x∗ ) is negative ˜ definite with respect to TX, so by continuity, DF(xη ) is negative definite with respect to TX for all η close enough to 0. The result therefore follows from Theorem 7.6.1. The derivative matrix for the logit dynamic takes an especially appealing form. 
Recall from Exercise 5.2.7 that the derivative matrix of the logit(η) choice function is (7.23) ˜ ˜ ˜ ˜ ˜ DMη (π) = η−1 diag(Mη (π)) − Mη (π)Mη (π) = η−1 Q(Mη (π)). ˜ ˜ ˜ ˜ Now by definition, the logit equilibrium xη satisfies Mη (F(xη )) = xη . Substituting this fact into equations (7.22) and (7.23) yields (7.24) ˜ ˜ ˜ DV η (xη ) = η−1 Q(xη )DF(xη ) − I. To see the importance of this equation, recall from equation (7.14) that at interior rest points, the derivative matrix for the replicator dynamic satisfies (7.25) DV (x∗ )Φ = Q(x∗ )DF(x∗ )Φ. Together, equations (7.24) and (7.25) show that when evaluated at their respective rest points and in the relevant tangent directions, the linearizations of the replicator and logit dynamics at their interior rest points differ only by a positive affine transformation! Example 7.6.5. To obtain the cleanest connections between the two dynamics, consider a 1 game that admits a Nash equilibrium x∗ = n 1 at the barycenter of the simplex. Then by symmetry, xη = x∗ is also a logit(η) equilibrium for every η > 0. By the logic above, λ is a relevant eigenvalue of (7.25) if and only if η−1 λ − 1 is a relevant eigenvalue of (7.24). It follows that if x∗ is linearly stable under the replicator dynamic, then it is also linearly stable under the logit(η) dynamic for any η > 0. § The foregoing discussion shows how analyses of local stability under the replicator and logit dynamics can be closely linked. Pushing these arguments further, one can use equations (7.24) and (7.25) to connect the long run behaviors of the replicator and best 284 response dynamics starting from arbitrary initial conditions—see the Notes for further discussion. Appendix 7.A Matrix Analysis In this section we review some basic ideas from matrix analysis. In doing so, we lay the groundwork for our introduction to linear differential equations in Appendix 7.B; this in turn underlies our introduction to local linearization of nonlinear differential equations in Appendix 7.C. The techniques presented here are also used to perform the explicit calculations that arise when using linearization to analyze evolutionary dynamics. 7.A.1 Rank and Invertibility While in most of this section we focus on square matrices, we start by considering matrices A ∈ Rm×n of arbitrary dimensions. The rank of A is the number of linearly independent columns of A, or, equivalently, the dimension of its range. The nullspace (or kernel) of A is the set of vectors that the matrix maps to the origin, and the dimension of this set is called the nullity of A. The rank and nullity of a matrix must sum to its number of columns: dim(nullspace(A)) + dim(range(A)) = n; (7.26) dim(nullspace(A )) + dim(range(A )) = m. In Appendix 2.B.2, we introduced the Fundamental Theorem of Linear Algebra: (7.27) range(A) = (nullspace(A ))⊥ . To derive a key implication of (7.27) for the ranks of matrices, first recall that any subspace V ⊆ Rm satisfies dim(V ) + dim(V ⊥ ) = m. Letting V = nullspace(A ) and then combining the result with equation (7.26), we obtain dim(range(A )) = dim((nullspace(A )⊥ ). Therefore, (7.27) yields dim(range(A )) = dim(range(A)). 285 In words: every matrix has the same rank as its transpose. From this point forward, we suppose that A ∈ Rn×n is a square matrix. We say that A is invertible if it admits an inverse matrix A−1 : that is, a matrix satisfying A−1 A = I. Such a matrix also satisfies AA−1 = I, and when an inverse matrix exists, it is unique. 
Invertible matrices can be characterized in a variety of ways: for instance, a matrix is invertible if and only if it has full rank (i.e., if A ∈ R^{n×n} has rank n); alternatively, a matrix is invertible if and only if its determinant is nonzero.

7.A.2 Eigenvectors and Eigenvalues

Let A ∈ R^{n×n}, and suppose that

(7.28)  Ax = λx

for some complex scalar λ ∈ C and some nonzero complex vector x ∈ C^n. Then we call λ an eigenvalue of A, and x an eigenvector of A associated with λ; sometimes, the pair (λ, x) is referred to as an eigenpair. The eigenvector equation (7.28) can be rewritten as (λI − A)x = 0. This equation can only be satisfied by a nonzero vector if (λI − A) is not invertible, or, equivalently, if det(λI − A) = 0. It follows that λ is an eigenvalue of A if and only if λ is a root of the characteristic polynomial det(tI − A). Since det(tI − A) is a polynomial of degree n in t, the Fundamental Theorem of Algebra ensures that it has n complex roots:

(7.29)  det(tI − A) = (t − λ_1)(t − λ_2) · · · (t − λ_n).

To be sure to obtain n roots, we must "count multiplicities": if the values of λ_i in the above expression are not all distinct, the repeated values must be tallied each time they appear. Evidently, each λ_i in (7.29) is an eigenvalue of A; if the value λ is repeated k times in (7.29), we say that λ is an eigenvalue of A of (algebraic) multiplicity k.

We note in passing that the sum and the product of the eigenvalues of A can be described very simply:

Σ_{i=1}^n λ_i = tr(A);    Π_{i=1}^n λ_i = det(A).

(Here, the trace tr(A) of the matrix A is the sum of its diagonal elements.) To remember these formulas, notice that they are trivially true if A is a diagonal matrix, since in this case the eigenvalues of A are its diagonal entries.
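For readers who want to see these identities at work on a non-diagonal matrix, the following brief check (a sketch, not from the text; it assumes Python with NumPy) compares the sum and product of the numerically computed eigenvalues of a generic real matrix with its trace and determinant.

# Sum of eigenvalues = trace, product of eigenvalues = determinant.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))      # a generic (non-diagonal) real 4x4 matrix

lam = np.linalg.eigvals(A)           # the n complex eigenvalues, with multiplicity
print(np.allclose(lam.sum(), np.trace(A)))            # True
print(np.allclose(np.prod(lam), np.linalg.det(A)))    # True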
§ 7.A.3 Similarity, (Block) Diagonalization, and the Spectral Theorem The matrix A ∈ Rn×n is similar to matrix B ∈ Rn×n if there exists an invertible matrix S ∈ Cn×n , called a similarity matrix, such that B = S−1 AS. 287 When A is similar to B, the linear transformations x → Ax and y → By are equivalent up to a linear change of variable. Similarity defines an equivalence relation on the set of n × n matrices, and matrices that are similar have the same characteristic polynomial and the same eigenvalues, counting either algebraic or geometric multiplicities. If A is similar to a diagonal matrix D—that is, if A is diagonalizable—then the eigenvalues of A are simply the diagonal elements of D. In this definition the similarity matrix is allowed to be complex; if the similarity can be achieved via a real similarity matrix S ∈ Rn×n , then the diagonal matrix D is also real, and we call A real diagonalizable. It follows easily from our definitions that a matrix A is diagonalizable if and only if the sum of the geometric multiplicities of the eigenvalues of A is n. Equivalently, A is diagaonalizable if and only if each of its eigenvalues has equal algebraic and geometric multiplicities. It is simple to verify that in this case, a similarity matrix S can be constructed by choosing n linearly independent eigenvectors of A to be its columns. It is especially convenient when similarity can be achieved using similarity matrix that is itself of a simple form. The most important instance occurs when this matrix is an orthogonal matrix, meaning that its columns form an orthonormal basis for Rn : each column is of length 1, and distinct columns are orthogonal. (It would make more sense to call such a matrix an “orthonormal matrix”, but the term “orthogonal matrix” is traditional.) Orthogonal matrices can be characterized in a variety of ways: Theorem 7.A.2. The following are equivalent: (i) R is an orthogonal matrix. (ii) RR = I. (iii) R = R−1 . (iv) The map x → Rx preserves lengths: |Rx| = |x| for all x ∈ Rn . (v) The map x → Rx preserves inner products: (Rx) (Ry) = x y for all x, y ∈ Rn ;. (vi) The map x → Rx is a composition of rotations and reflections. The last three items are summarized by saying that the linear transformation x → Rx defined by an orthogonal matrix R is a Euclidean isometry. While showing that a matrix is similar to a diagonal matrix is quite useful, showing similarity to a block diagonal matrix often serves just as well. We focus on block diagonal matrices with diagonal blocks of these two types: a b J1 = (λ); J2 = −b a . For reasons that will become clear in Section 7.A.5, we call block diagonal matrices of this 288 form simple Jordan matrices. Calculations with simple Jordan matrices are often little more difficult than those with diagonal matrices: for instance, multiplying such a matrix by itself retains its block diagonal structure. To muster these ideas, let us call the matrix A ∈ Rn×n normal if it commutes with itself: that is, if A A = AA . Theorem 7.A.3 (The Spectral Theorem for Real Normal Matrices). The matrix A ∈ Rn×n is normal if and only if it is similar via an orthogonal matrix R to a simple Jordan matrix B = R−1 AR. The matrix B is unique up to the ordering of the diagonal blocks. The spectral decomposition of A provides a full account of the eigenvalues and eigenvectors of A. Each J1 block ( λ ) contains a real eigenvalue of A, and the pair of complex numbers a ± i b derived from each J2 block are complex eigenvalues of A. 
Moreover, columns of the orthogonal similarity matrix R either are real eigenvectors of A, or are real and imaginary parts of complex eigenvectors of A. The spectral theorem tells us that if A is normal, the behavior of the linear map x → Ax = RBR−1 x can be decomposed into three simple steps. First, one applies the orthogonal transformation R−1 = R to x, obtaining y = R x. Second, one applies the block diagonal matrix B to y: each J1 block rescales a component of y, while each J2 block rotates and rescales a pair of components of y (cf Example 7.A.1). Third, one applies R to BR x to undo the initial orthogonal transformation. Additional restrictions on the J1 and J2 blocks yield characterizations of important subclasses of the normal matrices. Corollary 7.A.4. (i) The matrix A ∈ Rn×n is symmetric (A = A) if and only if it is similar via an orthogonal matrix R to a simple Jordan matrix containing only J1 blocks. Thus, the symmetric matrices are the normal matrices with real eigenvalues. (ii) The matrix A ∈ Rn×n is skew-symmetric (A = −A) if and only if it is similar via an orthogonal matrix R to a simple Jordan matrix whose J1 blocks all have λ = 0 and whose J2 blocks all have a = 0. Thus, the skew-symmetric matrices are the normal matrices with purely imaginary eigenvalues. (iii) The matrix A ∈ Rn×n is orthogonal (A = A−1 ) if and only if if it is similar via an orthogonal matrix R to a simple Jordan matrix whose J1 blocks all have λ2 = 1 and whose J2 blocks all have a2 + b2 = 1. Thus, the orthogonal matrices are the normal matrices whose eigenvalues have modulus 1. 289 7.A.4 Symmetric Matrices Which matrices are real diagonalizable by an orthogonal matrix? The spectral theorem for symmetric matrices tells us that A is real diagonalizable by an orthogonal matrix if and only if it is symmetric. (This is just a restatement of Corollary 7.A.4.) Among other things, the spectral theorem implies that the eigenvalues of a symmetric matrix are real. While we often associate a matrix A with the linear transformation x → Ax, a symmetric matrix is naturally associated with a quadratic form, x → x Ax. In fact, the eigenvalues of a symmetric matrix can be characterized in terms of its quadratic form. The RayleighRitz Theorem provides simple descriptions of the λ and λ, the maximal and minimal eigenvalues of A: λ = max x Ax; λ = min x Ax. n n x∈R : |x|=1 x∈R : |x|=1 The Courant-Fischer Theorem shows how the remaining eigenvalues of A can be expressed by in terms of a related sequence of minmax problems. We say that the matrices A, B ∈ Rn×n are congruent if there is an invertible matrix Q ∈ Rn×n such that B = QAQ . Congruence plays the same role for quadratic forms as similarity does for linear transformations: if two symmetric matrices are congruent, they define the same quadratic form up to a linear change of variable. Like similarity, congruence defines an equivalence relation on the set of n × n matrices. Lastly, note that two symmetric matrices that are similar by an orthogonal matrix Q are also congruent, since in this case Q = Q−1 . The eigenvalues of congruent symmetric matrices are closely linked. Define the inertia of a symmetric matrix to be the ordered triple consisting of the numbers of positive, negative, and zero eigenvalues of the matrix. Sylvester’s Law of Inertia tells us that congruent symmetric matrices have the same inertia. 
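The following small script (my own illustration, not from the text) shows both characterizations at work on a randomly generated symmetric matrix: the Rayleigh–Ritz description of the extreme eigenvalues, approximated by sampling unit vectors, and the invariance of the inertia under congruence. The matrix sizes and the random seed are arbitrary.

```python
# Rayleigh-Ritz and Sylvester's Law of Inertia, checked on a random symmetric matrix.
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                       # a symmetric matrix
eigs = np.linalg.eigvalsh(A)            # real eigenvalues, in ascending order

# Rayleigh-Ritz: max/min of x'Ax over unit vectors equal the extreme eigenvalues.
# (Random sampling only approximates the max and min from below/above.)
xs = rng.standard_normal((10000, n))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
quad = np.einsum('ki,ij,kj->k', xs, A, xs)
print("sampled max/min of x'Ax:", quad.max(), quad.min())
print("largest/smallest eigenvalue:", eigs[-1], eigs[0])

# Sylvester's Law: the congruent matrix B = QAQ' has the same inertia as A.
def inertia(S, tol=1e-10):
    lam = np.linalg.eigvalsh(S)
    return (int((lam > tol).sum()), int((lam < -tol).sum()), int((np.abs(lam) <= tol).sum()))

Q = rng.standard_normal((n, n))         # invertible with probability one
B = Q @ A @ Q.T
print("inertia of A:", inertia(A), " inertia of B:", inertia(B))
```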
Ostrowski's Theorem provides a quantitative extension of Sylvester's Law of Inertia: if we list the eigenvalues of A and the eigenvalues of B in increasing order, then the ratios between pairs of corresponding eigenvalues are bounded by the minimal and maximal eigenvalues of Q′Q.

7.A.5 The Real Jordan Canonical Form

How can we tell if two matrices are similar? If the matrices are diagonalizable, then one can check for similarity by diagonalizing the two matrices and seeing whether the same diagonal matrix is obtained in each case. To apply this logic beyond the diagonalizable case, we would need to find a simple class of matrices with the property that every matrix is similar to a unique representative from this class. Such a class of matrices would also provide a powerful computational aid, since calculations involving arbitrary matrices could be reduced by similarity to calculations with these simple matrices.

With this motivation, we define a real Jordan matrix to be a block diagonal matrix whose diagonal blocks, known as Jordan blocks, are of these four types:

$$J_1 = (\lambda); \qquad J_2 = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}; \qquad
J_3 = \begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}; \qquad
J_4 = \begin{pmatrix} J_2 & I & & \\ & J_2 & \ddots & \\ & & \ddots & I \\ & & & J_2 \end{pmatrix}.$$

Theorem 7.A.5. Every matrix A ∈ Rn×n is similar via a real similarity matrix S to a real Jordan matrix J = S−1AS. The latter matrix is unique up to the ordering of the Jordan blocks.

The real Jordan matrix in the statement of the theorem is called the real Jordan canonical form of A. The blocks in the real Jordan form of A provide detailed information about the eigenvalues of A: each J1 block corresponds to a real eigenvalue λ; each J2 block corresponds to a pair of complex eigenvalues a ± i b; each J3 block corresponds to a real eigenvalue with less than full geometric multiplicity; and each J4 block corresponds to a pair of complex eigenvalues with less than full geometric multiplicities. (We can say more if each Jordan block represents a distinct eigenvalue: then each eigenvalue has geometric multiplicity 1; the J1 and J2 blocks correspond to eigenvalues whose algebraic multiplicities are also 1; and the J3 and J4 blocks correspond to eigenvalues with higher algebraic multiplicities, with these multiplicities being given by the number of appearances of λ (in a J3 block) or of J2 blocks (in a J4 block).)

Example 7.A.6. Suppose that A ∈ R2×2 has complex eigenvalues a ± i b with complex eigenvectors v ± i w. Then A(v + i w) = (a + i b)(v + i w). Equating the real and imaginary parts of this equation yields

$$A\begin{pmatrix} v & w \end{pmatrix} = \begin{pmatrix} v & w \end{pmatrix}\begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$

Premultiplying by (v w)−1 reveals that the real Jordan form of A is a single J2 block. §

Example 7.A.7. Suppose that A ∈ R2×2 has a lone eigenvalue, λ ∈ R, which is of algebraic multiplicity 2 but geometric multiplicity 1. Let x ∈ R2 be an eigenvector of A, so that (A − λI)x = 0. It can be shown that there exists a vector y that is linearly independent of x and that satisfies (A − λI)y = x. (Such a vector (and, more generally, vectors that satisfy higher iterates of this equation) is called a generalized eigenvector of A.) Rewriting the two equations above, we obtain

$$A\begin{pmatrix} x & y \end{pmatrix} = \begin{pmatrix} \lambda x & x + \lambda y \end{pmatrix} = \begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}.$$

Premultiplying the first and last expressions by (x y)−1 shows that A has a real Jordan form consisting of a single J3 block. §

7.A.6 The Spectral Norm and Singular Values

It is often useful to be able to place bounds on the amount of “expansion” generated by a linear map x → Ax, or by a composite linear map x → Bx → ABx.
One can obtain such bounds by introducing the spectral norm of a matrix A ∈ Rn×n , defined by A = max |Ax| . x: |x|=1 (As always in this book, |x| denotes the Euclidean norm of the vector x.) It is not difficult to check that the spectral norm is submultiplicative, in the following two senses: |Ax| ≤ A |x| ; and AB ≤ A B. These inequalities often work hand in hand with the Cauchy-Schwarz inequality, which expresses the submultiplicativity of inner products of vectors: x y ≤ |x| y . To compute the spectral norm of a matrix, it is best to describe it in a different way. The product A A generated by any matrix A is symmetric. It therefore has n real eigenvalues (see Section 7.A.4), and it can be shown that these eigenvalues are nonnegative. The square roots of the eigenvalues of A A are called the singular values of A. 292 One can show that the spectral norm of A equals the largest singular value of A: A = max √ λ : λ is an eigenvalue of A A . It makes no difference here if we replace A A with AA , since for any A, B ∈ Rn×n , AB and BA have the same eigenvalues. The notion of a singular value also underpins the singular value decomposition Theorem 7.A.8. Every matrix A ∈ Rn×n can be expressed as A = V ΣW , where V and W are orthogonal matrices, and where Σ is a diagonal matrix whose diagonal entries are the singular values of A. In this decomposition, the columns of V are eigenvectors of AA , and the columns of W are eigenvectors of A A. 7.A.7 Hines’s Lemma In Section 7.5, we introduced Hines’s Lemma: Lemma 7.5.1. Suppose that Q ∈ Rn×n is symmetric, satisfies Q1 = 0, and is positive definite with respect to Rn , and that A ∈ Rn×n is negative definite with respect to Rn . Then each eigenvalue 0 0 of the linear map QA : Rn → Rn has negative real part. 0 0 If we ignored the complications caused by the fact that our dynamics are restricted to the simplex, Lemma 7.5.1 would reduce to Lemma 7.A.9. If Q is symmetric positive definite and A is negative definite, then the eigenvalues of QA have negative real parts. The proof of Lemma 7.A.9 is a simpler version of the proof below. The argument below can also be used when other definiteness conditions are imposed on A. In particular, if A is only negative semidefinite with respect to Rn , then the relevant 0 eigenvalues of QA have nonpositive real parts, and if A is positive definite with respect to Rn , the relevant eigenvalues of QA have positive real part. 0 Proof of Lemma 7.5.1. Since Q is positive definite with respect to Rn , since Q1 = 0, and 0 n n since R = R0 ⊕ span({1}), we have that nullspace(Q) = span({1}). Thus, because Q is symmetric, the Fundamental Theorem of Linear Algebra (7.27) tells us that range(Q) = (nullspace(Q ))⊥ = (nullspace(Q))⊥ = (span({1}))⊥ = Rn . 0 293 In other words, Q maps Rn onto itself, and so is invertible on this space. 0 Now suppose that (7.30) QA(v + iw) = (a + ib)(v + iw) for some v, w ∈ Rn with v + iw 0 and some a, b in R. Since Q is invertible on Rn , there 0 0 exist y, z ∈ Rn , at least one of which is not 0, such that Qy = v and Qw = z. We can thus 0 rewrite equation (7.30) as QA(v + iw) = (a + ib)Q( y + iz). Since Q is invertible on Rn , this implies that 0 A(v + iw) = (a + ib)( y + iz). Premultiplying by (v − iw) = (Q( y − iz)) yields (v − iw)A(v + iw) = (a + ib)( y − iz) Q( y + iz). Equating the real parts of each side yields v Av + w Aw = a( y Q y + z Qz). Since Q is positive definite with respect to Rn and A is negative definite with respect to 0 Rn , we conclude that a < 0. 
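The conclusion of the unconstrained version, Lemma 7.A.9, is easy to test numerically. The sketch below (my own illustration, not from the text) draws random matrices Q that are symmetric positive definite and random matrices A that are negative definite as quadratic forms, and confirms that every eigenvalue of QA has negative real part; the particular way of generating the test matrices is an arbitrary choice.

```python
# Numerical spot-check of Lemma 7.A.9 on randomly generated test matrices.
import numpy as np

rng = np.random.default_rng(1)
n = 6
for trial in range(1000):
    M = rng.standard_normal((n, n))
    Q = M @ M.T + np.eye(n)                        # symmetric positive definite
    B = rng.standard_normal((n, n))
    # Negative definite as a quadratic form (x'Ax < 0 for x != 0), not necessarily symmetric:
    A = -(B @ B.T + np.eye(n)) + 0.5 * (B - B.T)
    eigs = np.linalg.eigvals(Q @ A)
    assert np.all(eigs.real < 0), "unexpected eigenvalue with nonnegative real part"
print("all trials passed: every eigenvalue of QA had negative real part")
```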
0 7.B Linear Differential Equations The simplest ordinary differential equations on Rn are linear differential equations: (L) ˙ x = Ax, where A ∈ Rn×n . Although our main interest in this book is in nonlinear differential equations, linear differential equations are still very important to us: as we explain in Section 7.C, the behavior of a nonlinear equation in the neighborhood of a rest point is often well appproximated by the behavior of linear equation in a neighborhood of the origin. 294 7.B.1 Examples Example 7.B.1. Linear dynamics on the line. In the one-dimensional case, equation (L) ˙ becomes x = ax. We described the solution to this equation from initial condition x0 = ξ in Example 3.A.1: they are of the form xt = ξ exp(at). Thus, if a 0, the equation has its unique rest point at the origin. If a > 0, all solutions other than the stationary one move away from the origin, while if a < 0, all solutions converge to the origin. § One can always apply a linear change of variable to (L) to reduce it to a simpler form. ˙ ˙ In particular, if B = SAS−1 is similar to A, let y = Sx; then since y = Sx, we can rewrite (L) ˙ ˙ as S−1 y = AS−1 y, and hence as y = By. It follows from this observation and from Theorem 7.A.5 that to understand linear differential equations, it is enough to understand linear differential equations defined by real Jordan matrices. Example 7.B.2. Linear dynamics on the plane. There are three generic types of 2 × 2 matrices: diagonalizable matrices with two real eigenvalues, diagonalizable matrices with two complex eigenvalues, and nondiagonlizable matrices with a single real eigenvalue. The corresponding real Jordan forms are a diagonal matrix (which contains two J1 blocks), a J2 matrix, and a J3 matrix, respectively. We therefore consider linear differential equations based on these three types of real Jordan matrices. When A is diagonal, the linear equation (L) and its solution from initial condition x0 = ξ are of the following form: λ 0 x1 ˙ x = Ax = 0 µ x ; 2 ξ1 eλt xt = µt . ξ e 2 The phase diagrams in Figure 7.B.1 show that the behavior of this dynamic depends on the values of the eigenvalues λ and µ: if both are negative, the origin is a stable node, if their signs differ, the origin is a saddle, and if both are positive, the origin is an unstable node. Now suppose that A is the real Jordan form of a matrix with complex eigenvalues a ± i b. Then we have a b x1 ξ1 eat cos bt + ξ2 eat sin bt ˙ x = Ax = −b a x ; xt = −ξ eat sin bt + ξ eat cos bt . 2 1 2 Phase diagrams for this equation are presented in Figure 7.B.2. Evidently, the stability of the origin is determined by the real part of the eigenvalues: if a < 0, the origin is a stable spiral, while if a > 0, the origin is an unstable spiral. In the nongeneric case where a = 0, 295 (i) stable node (µ < λ < 0) (ii) saddle (µ > 0 > λ) (iii) unstable node ( > λ > 0) Figure 7.B.1: Linear dynamics on the plane: two real eigenvalues λ, µ. (i) stable spiral (a < 0) (ii) center (a = 0) (iii) unstable spiral (a > 0) Figure 7.B.2: Linear dynamics on the plane: complex eigenvalues a ± i b, b < 0. the origin is a center, with each solution following a closed orbit around the origin. The value of b determines the orientation of the cycles. The diagrams in Figure 7.B.2 use b < 0, which causes solutions to cycle counterclockwise; had we chosen b > 0, these orientations would have been reversed. Finally, suppose that A is the real Jordan form of a nondiagonalizable matrix with lone eigenvalue λ. 
Then we obtain λ 1 x1 ˙ x = Ax = 0 λ x ; 2 ξ1 eλt + ξ2 teλt . xt = λt ξe 1 The phase diagrams in Figure 7.B.3 reveal the origin to be an improper (or degenerate) node. It is stable if the eigenvalue λ is negative and unstable if λ is positive. § 296 (i) stable improper node (λ < 0) (ii) unstable improper node (λ > 0) Figure 7.B.3: Linear dynamics on the plane: A not diagonalizable, one real eigenvalue λ. 7.B.2 Solutions The Picard-Lindelof Theorem (Theorem 3.A.2) implies that for any matrix A ∈ Rn×n ¨ there is a unique solution to the linear equation (L) starting from each initial condition ξ ∈ Rn . While solutions of nonlinear differential equations generally cannot be expressed in closed form, the solutions to linear equations can always be described explicitly. In the planar case, Example 7.B.2 provided explicit formulas when A is a Jordan matrix, and the solutions for other matrices can be obtained through a change of variable. Similar logic can be employed in the general case, yielding the following result: Theorem 7.B.3. Let {xt }t∈(−∞,∞) be the solution to (L) from initial condition x0 . Then each coordinate of xt is a linear combination of terms of the form tk eat cos(bt) and tk eat sin(bt), where a + i b ∈ C is an eigenvalue of A and k ∈ Z+ is less than the algebraic multiplicity of this eigenvalue. For analytic purposes, it is often convenient to express solutions of the linear equation (L) in terms of matrix exponentials. Given a matrix A ∈ Rn×n , we define eA ∈ Rn×n by applying the series definition of the exponential function to the matrix A: that is, ∞ e= A k =0 Ak , k! where Ak denotes the kth power of A and A0 ≡ I is the identity matrix. Recall that the flow φ : (−∞, ∞) × Rn → Rn generated by (L) is defined by φt (ξ) = xt , where {xt }t∈(−∞,∞) is the solution to (L) with initial condition x0 = ξ. Theorem 7.B.4 provides a concise expression for solutions to (L) in terms of matrix exponentials. Theorem 7.B.4. The flow of (L) is φt (ξ) = eAt ξ. 297 A benefit of representing solutions to (L) in this way is that properties established for matrix exponentials can be given immediate interpretations in terms of solutions to (L). For examples, consider these properties: Proposition 7.B.5. (i) If A and B commute, then eA+B = eB eA . (ii) If B = S−1 AS, then eB = S−1 eA S. (iii) e(A ) = (eA ) . Applying part (i) of the proposition to matrices As and At yields the group property of the flow of (L): φs+t (ξ) = φt (φs (ξ)). Part (ii) shows that linear flows generated by similar matrices are linearly conjugate (i.e., that they are equivalent up to a linear change of variables), as we discussed before Example 7.B.2. Applying parts (iii) and (i) to At when A is skew-symmetric shows that in this case, eAt is an orthogonal matrix: thus, for each fixed time t, the map ξ → φt (ξ) is a Euclidean isometry (cf Figure 7.B.2(ii)). 7.B.3 Stability and Hyperbolicity Theorem 7.B.3 shows in generic cases, the stability of the origin under the linear equation (L) is determined by the eigenvalues {a1 + i b1 , . . . , an + i bn } of A: more precisely, by the real parts ai of these eigenvalues. If each ai is negative, then all solutions to (L) converge to the origin; in this case, the origin is called a sink, and the flow φt (x) = eAt x is called a contraction. If instead each ai is positive, then all solutions besides the stationary solution at the origin move away from the origin; in this case, the origin is called a source, and the flow of (L) is called an expansion. 
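A brief numerical illustration (not from the text; the matrices and time points are arbitrary examples, and scipy.linalg.expm is used to compute matrix exponentials) may help make Theorems 7.B.3 and 7.B.4 and Proposition 7.B.5 concrete: the flow e^{At}ξ matches the closed-form planar solution, commuting matrices satisfy the exponential identity, a skew-symmetric generator yields an isometric flow, and negative real parts produce contraction toward the origin.

```python
# Matrix exponentials and the flow of x' = Ax (sketch with arbitrary example matrices).
import numpy as np
from scipy.linalg import expm

# J2 case: compare expm(A t) @ xi with the closed-form spiral solution.
a, b = -0.3, 1.0
A = np.array([[a, b], [-b, a]])
xi = np.array([1.0, 0.5])
for t in [0.0, 0.7, 2.5]:
    flow = expm(A * t) @ xi
    closed = np.exp(a * t) * np.array([xi[0] * np.cos(b * t) + xi[1] * np.sin(b * t),
                                       -xi[0] * np.sin(b * t) + xi[1] * np.cos(b * t)])
    assert np.allclose(flow, closed)

# Since a < 0 here, the origin is a sink: the flow contracts the initial condition.
assert np.linalg.norm(expm(A * 10.0) @ xi) < np.linalg.norm(xi)

# Proposition 7.B.5(i): for commuting matrices, e^(A+B) = e^B e^A (here B = 2A).
B = 2.0 * A
assert np.allclose(expm(A + B), expm(B) @ expm(A))

# Skew-symmetric generator: e^(St) is orthogonal, so the time-t map is an isometry.
S = np.array([[0.0, 2.0], [-2.0, 0.0]])
R = expm(S * 1.3)
assert np.allclose(R.T @ R, np.eye(2))
print("all checks passed")
```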
When the origin is a sink, solutions to (L) converge to the origin at an exponential rate. Define a norm on Rn by x = |S−1 x|, where S is the similarity matrix from the Jordan decomposition J = S−1 AS of A. Then for any a > 0 satisfying a < |ai | for all i ∈ {1, . . . n}, the flow φ of (L) satisfies 0 is a sink ⇔ φt (ξ) ≤ e−at ξ for all t ≥ 0 and all ξ ∈ Rn . A similar statement in terms of the Euclidean norm holds if one introduces an appropriate multiplicative constant C = C(a) ≥ 1: (7.31) 0 is a sink ⇔ φt (ξ) ≤ Ce−at |ξ| for all t ≥ 0 and all ξ ∈ Rn . 298 If the origin is the source, analogous statements hold if time is run backward: for instance, (7.32) 0 is a source ⇔ φt (ξ) ≤ Ce−a|t| |ξ| for all t ≤ 0 and all ξ ∈ Rn . More generally, the flow of (L) may be contracting in some directions and expanding in others. In the generic case in which each real part ai of an eigenvalue of A is nonzero, ˙ the differential equation x = Ax, its rest point at the origin, and its flow φt (x) = eAt x are all said to be hyperbolic. Hyperbolic linear flows come in three varieties: contractions (if all ai are negative), expansions (if all ai are positive), and saddles (if there is at least one ai of each sign). If a flow is hyperbolic, then the origin is globally asymptotically stable if it is a sink, and it is unstable otherwise. If (L) is hyperbolic, then A has k eigenvalues with negative real part (counting algebraic multiplicities) and n − k eigenvalues with positive real part. In this case, we can view Rn = Es ⊕ Eu as the direct sum of subspaces of dimensions dim(Es ) = k and dim(Eu ) = n − k, where the stable subspace Es contains all solutions of (L) that converge to the origin at an exponential rate (as in (7.31)), while the unstable subspace Eu contains all solutions of (L) that converge to the origin at an exponential rate if time is run backward (as in (7.32)). If A is real diagonalizable, then it follows easily from Theorem 7.B.3 that Es and Eu are the spans of the eigenvectors of A corresponding to the negative and positive eigenvalues of A, respectively. More generally, Es and Eu can be computed by way of the real Jordan form J = S−1 AS of A. Arrange S and J so that the Jordan blocks of J corresponding to eigenvalues of A with negative real parts appear in the first k rows and columns, while the blocks corresponding to eigenvalues with positive real parts appear in the remaining n − k rows and columns. Then Es is the span of the first k columns of the similarity matrix S, and Eu is the span of the remaining n − k columns of S. (The columns of S are the real and imaginary parts of the so-called generalized eigenvectors of A—see Example 7.A.7.) 7.C Linearization of Nonlinear Differential Equations Virtually all of the differential equations we study in this book are nonlinear. Nevertheless, when studying the behavior of nonlinear equations in the neighborhood of a rest point, the theory of linear equations takes on a central role. Consider the C1 differential equation (D) ˙ x = V (x) with rest point x∗ . By the definition of the derivative, we can approximate the value of V 299 in the neighborhood of x∗ via V ( y) = 0 + DV (x∗ )( y − x∗ ) + o( y − x∗ ). This suggests that the behavior of the dynamic (D) near x∗ can be approximated by the behavior near the origin of the linear equation (L) ˙ y = DV (x∗ ) y. To make this idea precise, we must introduce the notion of topological conjugacy of flows. To begin, let X and Y be subsets of Rn . 
Then the function h : X → Y is homeomorphism if it is bijective (i.e., one-to-one and onto) and continuous with a continuous inverse. Now let I be an interval containing 0, and let φ : I × X → X and ψ : I × Y → Y be two flows. We say that φ and ψ are topologically conjugate on X and Y if there is a homeomorphism h : X → Y such that φt (x0 ) = h−1 ◦ ψt ◦ h (x0 ) for all times t ∈ I. In other words, φ and ψ are topologically conjugate if there is a continuous map with continuous inverse that sends trajectories of φ to trajectories of ψ (and vice versa), preserving the rate of passage of time. Therefore, to find φt (x0 ), the position at time t under flow φ when the initial state is x0 ∈ X, one can apply h : X → Y to x0 to obtain the transformed initial condition y0 = h(x0 ) ∈ Y, then run the flow ψ from y0 for t time units, and finally apply h−1 to the result. We summarize this construction in the diagram below: x0 φt h −−→ −− h(x0 ) ψt φt (x0 ) ← − − ψt (h(x0 )) −− h−1 The use of linearization to study the behavior of nonlinear differential equations around fixed points is justified by the Hartman-Grobman Theorem. Theorem 7.C.1 (The Hartman-Grobman Theorem). Let φ and ψ be the flows of the C1 equation (D) and the linear equation (L), where x∗ is a hyperbolic rest point of (D). Then there exist neighborhoods Ox∗ of x∗ and O0 of the origin 0 on which φ and ψ are topologically conjugate. Combining the Hartman-Grobman Theorem with our analysis in Section 7.B.3 provides a simple characterization of the stability of hyperbolic rest points of (D). Corollary 7.C.2. Let x∗ be a hyperbolic rest point of (D). Then x∗ is asymptotically stable if all eigenvalues of DV (x∗ ) have strictly negative real parts, and x∗ is unstable otherwise. 300 By virtue of these results, we say that x∗ is linearly stable if the eigenvalues of DV (x∗ ) all have negative real part. While the Hartman-Grobman Theorem implies that a linearly stable rest point is asymptotically stable, it can be shown further that solutions starting near a linearly stable rest point converge to it at an exponential rate, as in equation (7.31). We say that x∗ is linearly unstable if DV (x∗ ) has at least one eigenvalue with positive real part. (We do not require x∗ to be hyperbolic.) It can be shown that as long as one eigenvalue of DV (x∗ ) has positive real part, most solutions of (D) will move away from x∗ at an exponential rate. While the topological conjugacy established in Theorem 7.C.1 is sufficient for local stability analysis, one should understand that topological conjugacy need not preserve the geometry of a flow. The following result for linear equations makes this point clear. ˙ ˙ Theorem 7.C.3. Let x = Ax and y = By be hyperbolic linear differential equations on Rn with flows φ and ψ. If A and B have the same numbers of eigenvalues with negative real part (counting algebraic multiplicities), then φ and ψ are topologically conjugate throughout Rn . Looking back at Example 7.B.2, we see that the phase diagrams of stable nodes (Figure 7.B.1(i)), stable spirals (Figure 7.B.2(i)), and stable improper nodes (Figure 7.B.3(i)) have very different appearances. Nevertheless, Theorem 7.C.3 reveals that the flows described in these figures are topologically conjugate—that is, they can be continuously transformed into one another! 
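In applications, Corollary 7.C.2 is typically used computationally: one evaluates DV(x∗) (analytically or by finite differences) and inspects the real parts of its eigenvalues. The sketch below is my own illustration, not from the text; the damped-pendulum vector field is an arbitrary stand-in for (D), and the finite-difference step size is a heuristic choice.

```python
# Linearize a nonlinear vector field at a rest point and classify it (Corollary 7.C.2).
import numpy as np

def V(x):
    # x' = V(x); the origin is a rest point of this example system.
    return np.array([x[1], -np.sin(x[0]) - 0.5 * x[1]])

def jacobian(V, xstar, h=1e-6):
    # Central finite-difference approximation of DV(x*).
    n = len(xstar)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = h
        J[:, j] = (V(xstar + e) - V(xstar - e)) / (2 * h)
    return J

xstar = np.zeros(2)
eigs = np.linalg.eigvals(jacobian(V, xstar))
print("eigenvalues of DV(x*):", np.round(eigs, 4))
if np.all(eigs.real < 0):
    print("x* is linearly stable, hence asymptotically stable")
elif np.any(eigs.real > 0):
    print("x* is linearly unstable")
else:
    print("x* is not hyperbolic; linearization is inconclusive")
```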
To ensure that the geometry of phase diagrams is preserved, one needs not only topological conjugacy, but rather differentiable conjugacy: that is, conjugacy under a diffeomorphism (a differentiable transformation with differentiable inverse). As it turns out, it is possible to establish a local differentiable conjugacy between (D) near x∗ and (L) near 0 if V is sufficiently smooth, and if the eigenvalues of DV (x∗ ) are distinct and satisfy a mild nonresonance condition (see the Notes). Much additional information about the flow of (D) can be surmised from the derivative matrix DV (x∗ ) at a hyperbolic rest point x∗ . Suppose that DV (x∗ ) has k eigenvalues with negative real part and n − k eigenvalues with positive real part, counting algebraic multiplicities. The Stable Manifold Theorem tells us that within some neighborhood of x∗ , there is k dimensional local stable manifold Ms on which solutions converge to x∗ at an loc exponential rate (as in (7.31)), and an n − k dimensional local unstable manifold Mu on which loc solutions converge to x∗ at an exponential rate if time is run backward (as in (7.32)). Moreover, both of these manifolds can be extended globally: the k dimensional (global) stable manifold Ms includes all solutions of (D) that converge to x∗ , while the n − k dimensional (global) unstable manifold Mu includes all solutions that converge to x∗ as time runs backward. Among other implications of the existence of these manifolds, it follows that if 301 x∗ is hyperbolic and unstable, then the set Ms of states from which solutions converge to x∗ is of measure zero, while the complement of this set is open, dense, and of full measure. 7.N Notes Section 7.1: Theorem 7.1.1 is established by Bomze (1986) for the replicator dynamic and by Nachbar (1990) for general imitative dynamics; see also Weibull (1995). Section 7.2: This section follows Sandholm (2001). Bomze (2002) provides an exhaustive treatment of local stability under the replicator dynamic for single-population linear potential games (which are generated by random matching in common interest games), and the connections between this stability analysis and quadratic programming. Section 7.3: The notion of an evolutionarily stable strategy was first defined in a single population random matching setting by Maynard Smith and Price (1973) via conditions (7.5) and (7.6). The equivalent definition (7.4) is due to Taylor and Jonker (1978). Taylor and Jonker (1978) also introduce the notion of a regular ESS for single-population nonlinear games; in this paper, they also introduce the replicator dynamic, and prove that a regular ESS is asympotically stable under this dynamic. Basic references on ESS theory include the survey of Hines (1987) and the monographs of Bomze and Potscher (1989) and Cressman ¨ (1992). Our definition (7.2) of ESS generalizes the definition of two-population ESS introduced by Taylor (1979) (also see Schuster et al. (1981a)) and the definition of “local ESS” for single population nonlinear games of Pohley and Thomas (1983). Exercise 7.3.1 is essentially due to Selten (1980); see also van Damme (1991) and Swinkels (1992). Exercise 7.3.3(iii) is Example 18 of Bomze and Potscher (1989). ¨ When Maynard Smith and Price (1973) introduced the notion of ESS, the situation they aimed to capture was rather different from the one studied in this book. They envisioned a population of animals, each member of which plays the same mixed strategy x ∈ X as they are randomly matched to play a symmetric normal form game A. 
Occasionally, this population is invaded by a small group of mutants, each member of which plays the same mixed strategy y x. Maynard Smith and Price (1973) call x an evolutionarily stable strategy if regardless of the strategy y x played by the mutants, the payoff of the incumbents is exceeds that of the mutants in the post-entry population. They captured this notion using conditions (7.5) and (7.6), which as we have noted are equivalent to condition (7.4). To preserve the sense of this definition in multipopulation settings, Cressman (1992) (see also Cressman (1995, 1996) and Cressman et al. (2001)) calls a strategy profile x = 302 (x1 , . . . , xp ) satisfying condition (7.3) a monomorphic ESS. (The later papers call such a profile a p -species ESS.) To justify this definition, Cressman (1992) introduces a collection of p dimensional replicator systems, one for each alternative strategy profile y = ( y1 , . . . , yp ). The pth component of the state variable of this system describes the fraction of the pth species using mixed strategy yp ; the remainder of the species uses the incumbent mixed strategy xp . In games with linear payoffs, asymptotic stability of the origin (i.e., of the state at which all members each species p choose mixed strategy xp ) in this system is equivalent to condition (7.3). Notice that condition (7.3) is less restrictive than our definition (7.2): (7.3) requires just one of the incumbent species to outperform its mutant counterpart, whereas (7.2) requires that the incumbent species do at least as well as the mutant species on average. To understand why condition (7.3) ensures stability under Cressman’s (1992) dynamics, note first that a single successful incumbent species p will drive its mutant ˆ counterpart to extinction. This effectively changes the invading strategy profile to y = ˆ ˆ ( y1 , . . . , xp , . . . , yp ), which in turn must have a species that is outperformed by its incumbent counterpart. Iteration of this logic shows that the origin is locally stable. Interestingly, Cressman (1992) shows that in two-population linear games, a monomorphic ESS is asymptotically stable under the replicator dynamic, but that once there are three populations, this implication no longer need hold. Given that Maynard Smith and Price’s (1973) motivation for single population ESS (and Cressman’s (1992) extension to multiple populations) have little to do with the usual pure strategy replicator dynamic, the fact that monomorphic ESS implies stability under this dynamic in any cases at all is “only good fortune” (Cressman et al. (2001, p. 10)). Other interesting extensions of the ESS concept to multipopulation random matching settings allow set-valued solutions, which are particularly useful in the context of random matching in extensive form games; see Thomas (1985), Swinkels (1992), Balkenborg and Schlag (2001), and Cressman (2003). The alternatives definitions of ESS for games with nonlinear payoffs that we describe in Section 7.3.2 are studied by Vickers and Cannings (1988), Bomze and Potscher (1989), ¨ and Bomze (1990, 1991). One alternative we did not mention is that of an uninvadable strategy, which is based on the requirement of a uniform invasion barrier: namely, that the ¯ threshold ε > 0 in definition (7.4) be independent of the mutant y. 
It can be shown that an uninvadable strategy must satisfy our definition (7.2) of ESS, and that uninvadability is strictly weaker than the pair of conditions (7.5) and (7.7): see Theorem 35 and Corollaries 39 and 43 of Bomze and Potscher (1989). As we noted in the text, our definition (7.2) ¨ of ESS, the alternative definitions of ESS presented in equation (7.4), equations (7.5) and (7.6), and equations (7.5) and (7.7), and therefore the notion of uninvadability as well, are 303 equivalent in single-population linear games. See Bomze and Potscher (1989) and Weibull ¨ (1995) for further discussion. To sum up, the motivations for the alternative definitions of ESS for multipopulation settings and for nonlinear games come from the monomorphic population, mixedstrategist environment studied by Maynard Smith and Price (1973). Since our focus in this book is on behavior of agents who choose among different pure strategies, our aim is to employ simple definitions that support general asymptotic stability results in this context. This goal motivates the definitions of ESS and regular ESS put forward here. Section 7.4: Theorem 7.4.1(i) on the local stability of ESS under the replicator dynamic is one of the earliest results on evolutionary game dynamics; see Taylor and Jonker (1978), Taylor (1979), Hofbauer et al. (1979), Zeeman (1980), and Schuster et al. (1981a). Theorem 7.4.1(ii) follows easily from results of Nagurney and Zhang (1997); see also Sandholm et al. (2008). The results in Section 7.4.2 are extensions of ones from Hofbauer and Sandholm (2008). For the Theorem of the Maximum, see Ok (2007). Theorem 7.4.7 is due to Sandholm (2008a). Hofbauer (1995b) establishes the asymptotic stability of ESS under the best response dynamic in a single population random matching using a different construction than the one presented here. Section 7.5: Lemma 7.5.1 is due to Hines (1980); see also Hofbauer and Sigmund (1988), Hopkins (1999), and Sandholm (2007a). Versions of Theorems 7.5.2 and 7.5.6 can be found in Taylor and Jonker (1978), Taylor (1979), Hines (1980), and Cressman (1992, 1997). Example 7.5.5 is taken from Zeeman (1980). Theorem 7.5.9 is due to Cressman (1997). Section 7.6: Linearization of perturbed best response dynamics is studied by Hopkins (1999, 2002), Hofbauer (2000), Hofbauer and Sandholm (2002, 2007), Hofbauer and Hopkins (2005), and Sandholm (2007a). Exercise 7.6.3 is used in Sandholm (2007a) to show that Nash equilibria of normal form games can always be purified (in the sense of Harsanyi (1973)) in an evolutionarily stable fashion through an appropriate choice of payoff noise. See Ellison and Fudenberg (2000) and Ely and Sandholm (2005) for related results. Example 7.6.5 is due to Hopkins (1999). Hopkins (2002) uses this result to show that the replicator dynamic closely approximates the evolution of choice probabilities under stochastic fictitious play. Hofbauer et al. (2007) use similar ideas to establish an exact relationship between the long run time averaged behavior of the replicator dynamic and the long run behavior of the best response dynamic. Appendix 7.A: Horn and Johnson (1985) is an outstanding general reference on matrix analysis. Many of the results we described are also presented in Hirsch and Smale (1974). Appendix 7.B: Both Hirsch and Smale (1974) and Robinson (1995) provide thorough treatments of linear differential equations at the undergraduate and graduate levels, re- 304 spectively. 
Appendix 7.C: Robinson (1995) is an excellent reference on dynamical systems in general and on linearization in particular. For more on differentiable conjugacy around rest points, see Hartman (1964). 305 306 CHAPTER EIGHT Nonconvergence of Evolutionary Dynamics 8.0 Introduction We began our study of the global behavior of evolutionary dynamics in Chapter 6, focusing on combinations of games and dynamics generating global or almost global convergence to equilibrium. The analysis there demonstrated that global payoff structure—in particular, the structure captured in the definitions of potential, stable, and supermodular games—makes compelling evolutionary justifications of the Nash prediction possible. On the other hand, once we move beyond these classes of well-behaved games, it is not clear how often convergence will occur. The present chapter counterbalances Chapter 6 by investigating nonconvergence of evolutionary dynamics for games, describing a variety of environments in which cycling or chaos offer the best predictions of long run behavior. Section 8.1 leads with a study of conservative properties of evolutionary dynamics, focusing on the existence of constants of motion and on the preservation of volume under the replicator and projection dynamics. Section 8.2 continues with a panoply of examples of nonconvergence. Among other things, this section offers games in which no reasonable evolutionary dynamic converges to equilibrium, demonstrating that no evolutionary dynamic can provide a blanket justification for the prediction of Nash equilibrium play. Section 8.3 proceeds by offering examples of chaotic evolutionary dynamics—that is, dynamics exhibiting complicated attracting sets and sensitive dependence on initial conditions. The possibility of nonconvergence has surprising implications for evolutionary support of traditional solution concepts. Under dynamics that satisfy Nash stationarity (NS), solution trajectories that converge necessarily converge to Nash equilibria. But since no 307 reasonable evolutionary dynamic converges in all games, general support for standard solution concepts is not assured. Since the Nash prediction is not always supported by an evolutionary analysis, it is natural to turn to a less demanding notion—namely, the elimination of strategies that are strictly dominated by a pure strategy. As this requirement is the mildest employed in standard game theoretic analyses, it is natural to expect to find support for this requirement via an evolutionary approach. In Section 8.4, we present the striking finding that evolutionary dynamics satisfying four mild conditions—continuity, Nash stationarity, positive correlation, and innovation— do not eliminate strictly dominated strategies in all games. Moreover, while we saw in Chapter 6 that imitative dynamics and the best response dynamic eliminate strictly dominated strategies, we show here that small perturbations of these dynamics do not. This analysis demonstrates that evolutionary dynamics provide surprisingly little support for a basic rationality criterion. As always, the appendices provide the mathematical background necessary for our analysis. Appendix 8.A describes some classical theorems on nonconvergence used throughout the chapter. Appendix 8.B introduces the notion of an attractor of a dynamic, and establishes the continuity properties of attractors that underlie our analysis of dominated strategies. 
8.1 Conservative Properties of Evolutionary Dynamics

It is often impossible to provide precise descriptions of long run behavior under nonconvergent dynamics. An important exception occurs in cases where the dynamics lead certain quantities to be preserved. We explore this idea in the current section, where we argue that in certain strategic environments, the replicator and projection dynamics exhibit noteworthy conservative properties.

8.1.1 Constants of Motion in Null Stable Games

In Section 6.2.1, we introduced null stable population games. These games are defined by the requirement that (y − x)′(F(y) − F(x)) = 0 for all x, y ∈ X, and include zero-sum games (Example 2.3.7) and multi-zero-sum games (Exercise 2.3.9) as special cases.

In Exercise 6.2.2, we saw that if x∗ is an interior Nash equilibrium of a null stable game F : X → Rn, then the value of the function Ex∗(x) = |x − x∗|² is preserved along interior segments of solution trajectories of the projection dynamic: thus, as these segments are traversed, Euclidean distance from the equilibrium x∗ is fixed. Similar conclusions hold for interior solutions of the replicator dynamic: Exercise 6.2.5 shows that such solutions preserve the value of the function

$$H_{x^*}(x) = \sum_{p \in \mathcal{P}} h_{x^{*p}}(x^p),$$

where

$$h_{y^p}(x^p) = \sum_{i \in S^p} y^p_i \log\frac{y^p_i}{x^p_i}$$

is a relative entropy function. When x∗ is interior, the level sets of Ex∗ and Hx∗ foliate from x∗ like the layers of an onion. Each solution trajectory is limited to one of these layers, a manifold whose dimension is one less than that of X.

Example 8.1.1. In Figure 4.3.1, we presented phase diagrams of the six basic evolutionary dynamics for standard Rock-Paper-Scissors,

$$F(x) = \begin{pmatrix} F_R(x) \\ F_P(x) \\ F_S(x) \end{pmatrix} = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix}\begin{pmatrix} x_R \\ x_P \\ x_S \end{pmatrix} = \begin{pmatrix} x_S - x_P \\ x_R - x_S \\ x_P - x_R \end{pmatrix},$$

a zero-sum game with unique Nash equilibrium x∗ = (1/3, 1/3, 1/3). Figures 4.3.1(i) and 4.3.1(ii) show that interior solutions of the replicator and projection dynamics form closed orbits around x∗. These orbits describe the level sets of the functions Ex∗ and Hx∗. Note that an affine transformation of Hx∗ yields a simpler constant of motion for the replicator dynamic, H(x) = −Σ_{i∈S} log xi. §

When dim(X) > 2, the level sets of Ex∗ and Hx∗ need not pin down the locations of interior solutions of (P) and (R). But if the null stable game F has multiple Nash equilibria, then there are multiple collections of level sets, and intersections of these sets do determine the positions of interior solutions.

Example 8.1.2. Consider the population game F generated by random matching in the symmetric zero-sum game A:

$$(8.1) \qquad F(x) = Ax = \begin{pmatrix} 0 & -1 & 0 & 1 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ -1 & 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} x_4 - x_2 \\ x_1 - x_3 \\ x_2 - x_4 \\ x_3 - x_1 \end{pmatrix}.$$

The Nash equilibria of F are the points on the line segment NE connecting states (1/2, 0, 1/2, 0) and (0, 1/2, 0, 1/2).

The arguments above show that interior solutions to the projection dynamic maintain a constant distance from every Nash equilibrium of F. This is illustrated in Figure 8.1.1, which presents solutions on the sphere inscribed in the pyramid X; this is the level set on which Ex∗ takes the value √3/12, where x∗ = (1/4, 1/4, 1/4, 1/4). Each solution drawn in the figure is a circular closed orbit orthogonal to line segment NE.

[Figure 8.1.1: Solutions of the projection dynamic on the level set Ex∗(x) = √3/12, x∗ = (1/4, 1/4, 1/4, 1/4).]

Figure 8.1.2 presents solution trajectories of the replicator dynamic for game F. Diagrams (i) and (ii) show solutions on level sets of Hx∗ where x∗ = (1/4, 1/4, 1/4, 1/4); the first (smaller) level set is nearly spherical, while the second approximates the shape of the pyramid X. Diagrams (iii) and (iv) present solutions on level sets of Hx∗ with x∗ = (3/8, 1/8, 3/8, 1/8) and x∗ = (1/8, 3/8, 1/8, 3/8). By our previous discussion, the intersection of the two level sets is a closed curve describing a single orbit of the dynamic. §

Example 8.3.2 will show that even in zero-sum games, very complicated dynamics can arise within the level sets of Hx∗.

Exercise 8.1.3. (i) Suppose that A ∈ Rn×n is skew-symmetric. Show that the eigenvalues of A all have zero real part, and hence that the number of nonzero eigenvalues is even.
Dia1 grams (i) and (ii) show solutions on level sets of Hx∗ where x∗ = ( 1 , 4 , 1 , 1 ); the first (smaller) 4 44 level set is nearly spherical, while the second approximates the shape of the pyramid X. 3 Diagrams (iii) and (iv) present solutions on level sets of Hx∗ with x∗ = ( 3 , 1 , 8 , 1 ) and 88 8 1 3 x∗ = ( 8 , 3 , 1 , 8 ) . By our previous discussion, the intersection of the two level sets is a 88 closed curve describing a single orbit of the dynamic. § Example 8.3.2 will show that even in zero-sum games, very complicated dynamics can arise within the level sets of Hx∗ . Exercise 8.1.3. (i) Suppose that A ∈ Rn×n is skew-symmetric. Show that the eigenvalues of A all have zero real part, and so that the number of nonzero eigenvalues is even. 310 1 1 2 3 3 4 4 1 (i) x∗ = ( 4 , 1 , 1 , 1 ), Hx∗ (x) = .02 444 1 (ii) x∗ = ( 1 , 1 , 1 , 4 ), Hx∗ (x) = .58 444 1 1 3 3 4 4 (iii) x∗ = ( 3 , 1 , 3 , 1 ), Hx∗ (x) = .35 8888 3 (iv) x∗ = ( 1 , 8 , 1 , 3 ), Hx∗ (x) = .35 8 88 Figure 8.1.2: Solutions of the replicator dynamic on level sets of Hx∗ . 311 (ii) Suppose that A ∈ Rn×n is a symmetric zero-sum game that admits an interior Nash equilibrium x∗ . Show that if n is even, then x∗ is contained in a line segment consisting entirely of Nash equilibria. (Hint: Consider the matrix ΦAΦ.) The previous analysis shows that in zero-sum games, typical solutions of the replicator dynamic do not converge. The next exercise shows that the time averages of these solutions do converge, and that the limits of the time averages are Nash equilibria. Exercise 8.1.4. Convergence of time averages under the replicator dynamic. Let F(x) = Ax be the population game generated by the symmetric normal form game A ∈ Rn×n , and let ˙ x = VF (x) be the replicator dynamic for this game. Suppose that {xt }t≥0 is a solution to VF that is bounded away from bd(X) (i.e., that there is an ε > 0 such that (xt )i ≥ ε for all t ≥ 0 and i ∈ S). Let 1 ¯ xt = t t xs ds 0 be the average value of the state over the time interval [0, t]. Following the steps below, ¯ prove that {xt }t≥0 converges to the set of (interior) Nash equilibria of F as t approaches infinity: (8.2) ¯ lim min xt − x∗ = 0. t→∞ x∗ ∈NE(F) ¯ In particular, if F has a unique interior Nash equilibrium x∗ , then {xt } converges to x∗ . d (i) Define yt ∈ Rn by ( yt )i = log (xt )i . Compute dt yt . (ii) Show that 1 1 yt − y0 = t t t Axs − 1xs Axs ds. 0 ¯ ¯ ¯ (iii) Let x∗ be an ω-limit point of the trajectory {xt }. Show that Ax∗ is a constant vector, ¯ and hence that x∗ is a Nash equilibrium. (Hint: Use the fact that the trajectory { yt } is constrained to a compact set.) ¯ (iv) Conclude that (8.2) holds. (Hint: Use the fact that the trajectory {xt } is constrained to a compact set.) Exercise 8.1.5. Prove that the conclusion of Exercise 8.1.4 continues to hold in a twopopulation random matching setting. Exercise 8.1.6. Explain why the argument in Exercise 8.1.4 does not allow its conclusion to be extended to random matching in p ≥ 3 populations. 312 8.1.2 Preservation of Volume ˙ Let x = V (x) be differential equation on X with flow φ : R × X → X, and let µ denote Lebesgue measure on X. The differential equation is said to volume preserving (or incompressible) on Y ⊆ X if for any measurable set A ⊆ Y, we have µ(φt (A)) = µ(A) for all t ∈ R. 
Preservation of volume has strong implications for local stability of rest points: since an asymptotically stable rest point must draw in all nearby initial condition, no such rest points can exist in regions where volume is preserved (see Theorem 8.A.4). We now show that in single population zero-sum games, the replicator dynamic is volume preserving after a well-chosen change in speed. Compared to the standard replicator dynamic, the speed-adjusted replicator dynamic on int(X), (8.3) p pp ˆ ˙ xi = q(x) xi Fi (x), where q(x) = r∈P j∈S 1 , xrj moves relatively faster at states closer to the boundary of the simplex, with speeds approaching infinity as the boundary is approached. The solution trajectories of (8.3) have the same locations as those of the standard replicator dynamic (see Exercise 4.4.10), so the implications of volume preservation for stability of rest points extend immediately to the latter dynamic. Theorem 8.1.7. Let F(x) = Ax be generated by random matching in the symmetric zero-sum game A = −A ∈ Rn×n . Then the dynamic (8.3) for F is volume preserving on int(X). Therefore, no interior Nash equilibrium of F is asymptotically stable under the replicator dynamic. The proof of Theorem 8.1.7 is based on Liouville’s Theorem, which tells us that the rate ˙ at which the dynamic x = V (x) expands or contracts volume near state x is given by the divergence divV (x) ≡ tr(DV (x)). More precisely, Liouville’s Theorem tells us that d µ(φt (A)) dt = φt (A) divV (x) dµ(x). for each Lebesgue measurable set A. Thus, if divV ≡ 0, so that V is divergence free, then the flow φ is volume preserving. See Section 8.A.1 for a proof and further discussion of this result. Proof. The replicator dynamic is described by the vector field R : X → TX, where R(x) = diag(x)(F(x) − 1x F(x)). 313 Since F(x) = Ax, and since x Ax ≡ 0 (because A is symmetric zero-sum), we can simplify the previous expression to (8.4) R(x) = diag(x)Ax. The dynamic (8.3) can be written as V (x) = q(x)R(x), where q is the function from int(X) → R+ defined in equation (8.3). If we can show that V is divergence free on int(X), then our result will follow from Liouville’s Theorem. ˆ ˆ To compute DV (x), let q : int(Rn ) → R+ and R : Rn → Rn be the natural extensions of q + ˆ ˆ and R, so that q(x) = Φ q(x) and DR(x) = DR(x)Φ. Then the chain rule implies that (8.5) ˆ ˆ DV (x) = q(x)DR(x) + R(x) q(x) = q(x)DR(x) + R(x) q(x) Φ. To evaluate this expression, write [x−1 ] = ( x11 , . . . , x1n ) , and compute from equations (8.3) and (8.4) that ˆ ˆ ˆ q(x) = −q(x)[x−1 ] and DR(x) = diag(x)A + diag(Ax). Substituting into equation (8.5) yields DV (x) = q(x) (diag(x)A + diag(Ax) − diag(x)Ax[x−1 ] Φ = q(x) diag(x)A + diag(Ax) − diag(x)Ax[x−1 ] − 1 n diag(x)A + diag(Ax) − diag(x)Ax[x−1 ] 11 . Therefore, divV (x) = q(x) xi Aii + i∈S − (Ax)i − i∈S 1 n xi i∈S xi (Ax)i 1 xi i∈S Ai j − Ai j x j + 1 n j∈S i∈S j∈S 1 n xi Ai j x j i∈S j∈S k∈S 1 . xk The first term in the brackets equals 0 since Aii = 0; the second and third terms cancel; the fourth and fifth terms cancel since Ai j = −A ji ; and the sixth term is 0 since x Ax = 0. We therefore conclude that divV (x) = 0 on int(X), and hence that the flow of (8.3) is volume 314 preserving. The conclusion about asymptotic stability follows from Theorem 8.A.4. Under single population random matching, volume preservation under the replicator dynamic is only assured in zero-sum games. 
Remarkably, moving to multipopulation random matching ensures volume preservation regardless of the payoffs in the underlying normal form game. Suppose the population game F is generated by random matching of members of p ≥ 2 populations to play a p player normal form game. Since each agent’s opponents in a match will be members of the other populations, the agent’s payoffs do not depend on his own population’s state: Fp (x) ≡ Fp (x−p ). Theorem 8.1.8 shows that this last condition is sufficient to prove that the flow of the replicator dynamic for F is volume preserving. Theorem 8.1.8. Let F be a game played by p ≥ 2 populations that satisfies Fp (x) ≡ Fp (x−p ). Then the dynamic (8.3) for F is volume preserving on int(X). Therefore, no interior Nash equilibrium of F is asymptotically stable under the replicator dynamic. Exercise 8.1.9. Prove Theorem 8.1.8. To simplify the notation, assume that each population is of unit mass. (Hint: To prove that the vector field V from equation (8.3) is divergence free, start by showing that the derivative matrix of V p at x with respect to directions in TXp is the np × np matrix ¯ ¯ DTXp V p (x) = q(x) diag(πp ) − πp I − xp (πp ) − diag(xp ) πp [(xp )−1 ] + πp xp [(xp )−1 ] Φ, ¯ ¯ where πp = Fp (x−p ) and πp = Fp (x−p ) = (xp ) πp .) Analogues of Theorems 8.1.7 and 8.1.8 can be established for the projection dynamics via much simpler calculations, and without introducing a change in speed. Exercise 8.1.10. Let F(x) = Ax be generated by random matching in the symmetric zero sum game A = −A ∈ Rn×n . Show that the projection dynamic for F is volume preserving on int(X). Exercise 8.1.11. Let F be a game played by p ≥ 2 unit-mass populations that satisfies Fp (x) ≡ Fp (x−p ). Show that the projection dynamic for F is volume preserving on int(X). 8.2 Games with Nonconvergent Evolutionary Dynamics In this section, we introduce examples of games for which many evolutionary dynamics fail to converge to equilibrium. 315 8.2.1 Circulant Games The matrix A ∈ Rn×n is called a circulant matrix if it is of the form a0 a1 · · · an−2 an−1 an−1 a0 a1 · · · an−2 . . . ... ... ... ... . A= a 2 · · · an−1 a0 a1 a1 a2 · · · an−1 a0 When we view A as the payoff matrix for a symmetric normal form game, we refer to A 1 as a circulant game. Such games always include the central state x∗ = n 1 among their Nash equilibria. Note that Rock-Paper-Scissors games are circulant games with n = 3, a0 = 0, a1 = −l, and a2 = w. Most of the specific games considered below will also have diagonal payoffs equal to 0. Their symmetric structure make circulant games simple to analyze. In doing so, we will find it convenient to refer to strategies modulo n. Exercise 8.2.1. Verify that the eigenvalue/eigenvector pairs of the circulant matrix A are (8.6) n−1 jk (λk , vk ) = a j ιn , (1, ιk , . . . , ι(n−1)k ) , k = 0, . . . , n − 1, n n j =0 π π π where ιn = exp( 2n i ) = cos( 2n ) + i sin( 2n ) is the nth root of unity. Exercise 8.2.2. Let F(x) = Ax be generated by random matching in the circulant game A, ˙ and let x = R(x) = diag(x)(Ax − 1x Ax) be the replicator dynamic for F. Show that the 1 derivative matrix of R at the Nash equilibrium x∗ = n 1 is the circulant matrix 1 ¯ DR(x∗ ) = n (A − 2 11 a), ¯1 where a = n 1 a is the average of the components of the vector a = (a0 , a1 , . . . , an−1 ) . It then follows from the previous exercise that the eigenvalue/eigenvector pairs (λk , vk ) of DR(x∗ ) are given by (8.7) 1 (λk , vk ) = n n −1 j=0 jk ( ¯ (a j − 2a)ιn , (1, ιk , . . . 
, ιnn−1)k ) , k = 0, . . . , n − 1. n Example 8.2.3. The hypercycle system. Suppose that a0 = . . . = an−2 = 0 and that an−1 = 1, so that each strategy yields a positive payoff only against the strategy that precedes it 316 λ2 λ3 1 3 λ4 1 4 λ2 λ3 1 5 λ2 λ1 λ1 λ1 (i) n = 3 (ii) n = 4 (iii) n = 5 Figure 8.2.1: Eigenvalues of the hypercycle system. 1 (modulo n). In this case, x∗ = n 1 is the unique Nash equilibrium of F, and the replicator dynamic for A is known as the hypercycle system. We determine the local stability of the rest point x∗ by considering the eigenvalues of DR(x∗ ). Substituting into equations (8.6) and (8.7) shows that the eigenvector/eigenvalue pairs are of the form 1 ( 2 (λk , vk ) = ιnn−1)k − 2 n n n−1 j =0 jk ιn , (1, ιk , . . . , ι(n−1)k ) , k = 0, . . . , n − 1. n n 1 2 1 Eigenvalue λ0 = n − n = − n corresponds to eigenvector v0 = 1 and so has no bearing on the stability analysis. For k ≥ 1, the sum in the formula for λk vanishes (why?), leaving 1 1 us with λk = n ι(n−1)k = n ι−k . The stability of x∗ therefore depends on whether any λk with n n k > 0 has positive real part. As Figure 8.2.1 illustrates, this largest real part is negative when n ≤ 3, zero when n = 4, and positive when n ≥ 5. It follows that x∗ is asymptotically stable when n ≤ 3, but unstable when n ≥ 5. Exercise 8.2.4 shows that the local stability results can be extended to global stability results, and that global stability can also be proved when n = 4. When n ≥ 5, it is possible to show that the boundary of X is repelling, as it is in the lower dimensional cases, and that the dynamic admits a stable periodic orbit (see the Notes). § Exercise 8.2.4. Consider the function H : int(X) → R defined by H (x) = − i∈S log xi (cf Example 8.1.1.) (i) Show that under the hypercycle equation with n = 2 or 3, H is a strict Lyapunov function on int(X), and hence that x∗ is globally asymptotically stable with respect to int(X). 317 ˙ (ii) Show that under the hypercycle equation with n = 4 we have H (x) ≤ 0 on int(X), with equality if and only if x lies in Y = { y ∈ int(X) : y1 + y3 = y2 + y4 }. Show that the sole invariant subset of Y is {x∗ }. Then use Theorems 6.B.2 and 6.B.4 and Proposition 6.A.1(iii) to conclude that x∗ is globally asymptotically stable with respect to int(X). Example 8.2.5. Monocyclic games. A circulant game A is monocyclic if a0 = 0, a1 , . . . , an−2 ≤ 0, 1 ¯ ¯1 and an−1 > 0. Let a = n i ai . If we assume that a < 0, then the Nash equilibrium x∗ = n 1, ¯ which yields a payoff of a for each strategy, is the unique interior Nash equilibrium of F(x) = Ax. More importantly, there is an open, dense, full measure set of initial conditions from which the best response dynamic for F(x) = Ax converges to a limit cycle; this limit cycle is contained in the set where M(x) = maxi∈S Fi (x) equals 0. Here is a sketch of the proof. Consider a solution trajectory {xt } of the best response dynamic that lies in set B1 = {x ∈ X : argmaxi∈S Fi (x) = {1}} during time interval [0, T). For any t ∈ [0, T), we have that xt = e−t x0 + (1 − e−t ) e1 . Since the diagonal elements of A all equal zero, it follows that (8.8) For j (8.9) M(xt ) = F1 (xt ) = e−t F1 (x0 ) = e−t M(x0 ). {1, 2} we have that F j (xt ) = e−t F j (x0 ) + (1 − e−t )A j1 ≤ e−t F j (x0 ) < e−t F1 (x0 ) = F1 (xt ). 
Equations (8.8) and (8.9) and the fact that F1 (e1 ) = 0 < an−1 = F2 (e1 ) imply that a solution starting in region B1 must hit the set B12 = {x ∈ X : argmaxi∈S Fi (x) = {1, 2}}, and then immediately enter region B2 = {x ∈ X : argmaxi∈S Fi (x) = {2}}. Repeating the foregoing argument shows that the trajectory next enters best response regions B3 , B4 , . . . , B0 in succession before returning to region B1 . Therefore, if we denote by B the set of states at which there are at most two best responses, then B is forward invariant under the best response dynamic. Moreover, equation (8.8) implies that the maximal payoff M(xt ) approaches 0 along all solution trajectories in B. In light of this discussion, we can define the return map r : B12 → B12 , where r(x) is the position at which a solution starting at x ∈ B12 first returns to B12 . All fixed points of r lie in 318 M−1 (0). In fact, it can be shown that r is a contraction on M−1 (0) for an appropriate choice of metric, and so that r has a unique fixed point (see the Notes). We therefore conclude that any solution trajectory starting in the open, dense, full measure set B converges to the closed orbit that passes through the unique fixed point of the return map r. § 8.2.2 Continuation of Attractors for Parameterized Games The games we construct in the examples to come will generate nonconvergent behavior for large classes of evolutionary dynamics. Recall our general formulation of evolutionary dynamics from Chapter 3: each revision protocol ρ defined a map from population games ˙ F to differential equations x = VF (x) via (8.10) p p pp ˙ xi = (VF )i (x) = p p x j ρ ji (Fp (x), xp ) − xi j∈Sp ρi j (Fp (x), xp ). j∈Sp In Chapter 4, we introduced the following three desiderata for ρ and V . (C) (NS) (PC) Continuity: ρp is Lipschitz continuous. Nash stationarity: VF (x) = 0 if and only if x ∈ NE(F). p p Positive correlation: VF (x) 0 implies that VF (x) Fp (x) > 0. We have seen that under continuity condition (C), any Lipschitz continuous population game F will generate a Lipschitz continuous differential equation (8.10), an equation that admits unique solutions from every initial condition in X. But a distinct consequence of condition (C)—one involving comparisons of dynamics across games—is equally important for the analyses to come. Suppose we have a collection of population games {Fε }ε∈(−ε,ε) that have identical strategy ¯¯ sets and whose payoffs vary continuously in ε. Then under condition (C), the law of ˙ motion x = VFε (x) varies continuously in ε. Moreover, if we let φε : R+ × X → X denote the semiflow under VFε , then the map (ε, t, x) → φε (x) is continuous as well. This fact is t important for understanding how evolution under V(·) changes as we vary the underlying game. To capture the effects on long run behavior under V(·) , we must introduce the notion of an attractor. We keep the introduction here brief; additional details can be found in Appendix 8.B. A set A ⊆ X is an attractor of the flow φ if it is nonempty, compact, and invariant under φ, and if there is a neighborhood U of A such that (8.11) lim sup dist(φt (x), A ) = 0. t→∞ x∈U 319 The set B(A ) = {x ∈ X : ω(x) ⊆ A } is called the basin of A . Put differently, attractors are asymptotically stable sets that are also invariant under the flow. A key property of attractors for the current context is known as continuation. Fix an attractor A = A 0 of the flow φ0 . 
Then as ε varies continuously from 0, there exist attractors A ε of the flows φε that vary upper hemicontinuously from A ; their basins B(A ε ) vary lower hemicontinuously from B(A ). Thus, if we slightly change the parameter ε, the attractors that exist under φ0 continue to exist, and they do not explode. Exercise 8.2.6. In defining an attractor via equation (8.11), we require that it attract solutions from all nearby states uniformly in time. To understand the role of uniformity in this definition, let φ be a flow on the unit circle that moves clockwise except at the topmost point x∗ (cf Example 6.A.3). Explain why {x∗ } is not an attractor under this flow. As a first application of these ideas, consider the 4 × 4 circulant game (8.12) 0 0 −1 ε x1 ε 0 0 −1 x2 ε ε . F (x) = A x = −1 ε 0 0 x3 0 −1 ε 0 x4 When ε = 0, the payoff matrix Aε = A0 is symmetric, so F0 is a potential game with 1 potential function f (x) = 1 x A0 x = −x1 x3 − x2 x4 . The function f attains its minimum of − 4 2 1 1 at states v = ( 1 , 0, 2 , 0) and w = (0, 1 , 0, 2 ), has a saddle point with value − 1 at the Nash 2 2 8 1 1 equilibrium x∗ = ( 4 , 1 , 4 , 1 ), and attains its maximum of 0 along the closed path of Nash 4 4 equilibria γ consisting of edges e1 e2 , e2 e3 , e3 e4 , and e4 e1 . It follows from results in Section ˙ 6.1 that if x = VF0 (x) satisfies (NS) and (PC), then all solutions whose initial conditions ξ satisfy f (ξ) > − 1 converge to γ. (In fact, if x∗ is a hyperbolic rest point of VFε , then the 8 Stable Manifold Theorem (see Appendix 7.C) tells us that the set of initial conditions from which solutions converge to x∗ is a manifold of dimension no greater than 2, and hence has measure zero.) The phase diagram for the Smith dynamic game F0 is presented in Figure 8.2.2(i). Now suppose that ε > 0. If our revision protocol satisfies continuity (C), then the attractor γ of VF0 continues to an attractor γε of VFε ; γε is contained in a neighborhood of γ, and its basin approximates that of γ (see Figure 8.2.2(ii)). At the same time, the unique Nash equilibrium of Fε is the central state x∗ . We have therefore proved Proposition 8.2.7. Let V(·) be an evolutionary dynamic that satisfies (C), (PC), and (NS), let Fε ˙ be given by (8.12), and let δ > 0. Then for ε > 0 suffficiently small, solutions to x = VFε (x) from 320 1 3 4 (i) ε = 0 1 3 4 (ii) ε = 1 10 Figure 8.2.2: The Smith dynamic in game Fε . 321 T H H T H T Figure 8.2.3: The replicator dynamic in Mismatching Pennies. all initial conditions x with f (x) > − 1 + δ converge to an attractor γε on which f exceeds −δ; in 8 ε particular, γ contains neither Nash equilibria nor rest points. 8.2.3 Mismatching Pennies Mismatching Pennies is a three-player normal form game in which each player has two strategies, Heads and Tails. Player p receives a payoff of 1 for choosing a different strategy than player p + 1 and a payoff of 0 otherwise, where players are indexed modulo 3. If we let F be the population game generated by random matching in Mismatching Pennies, then for each population p ∈ P = {1, 2, 3} we have that p p+1 F (x) x Fp (x) = H = T+1 . p p F (x) x T H 1 1 The unique Nash equilibrium of F is the central state x∗ = (( 2 , 1 ), ( 1 , 1 ), ( 1 , 2 )). Since 2 22 2 p there are two strategies per player, it will simplify our analysis to let yp = xH be the proportion of population p players choosing Heads, and to focus on the new state variable y = ( y1 , y2 , y3 ) ∈ Y = [0, 1]3 (see Exercise 8.2.12 for details). Example 8.2.8. 
The replicator dynamic for Mismatching Pennies. After our change of variable, 322 ˆ ˙ the replicator dynamic y = VF ( y) for Mismatching Pennies takes the form y1 y1 (1 − y1 )(1 − 2 y2 ) ˙ ˙ ˙ y = y2 = y2 (1 − y2 )(1 − 2 y3 ) . y3 y3 (1 − y3 )(1 − 2 y1 ) ˙ 11 The derivative matrix for an arbitrary state y and the equilibrium state y∗ = ( 2 , 2 , 1 ) are 2 ˆ DV ( y) = (1−2 y1 )(1−2 y2 ) −2 y1 (1− y1 ) 0 0 (1−2 y2 )(1−2 y3 ) −2 y2 (1− y2 ) −2 y3 (1− y3 ) 0 (1−2 y3 )(1−2 y1 ) ˆ and DV ( y∗ ) = 1 0 −2 0 1 0 0 −2 1 −2 0 0 . ˆ DV ( y∗ ) is a circulant matrix with an eigenvalue of − 1 corresponding to eigenvector 1, 2 √ √√ 3 1 and eigenvalues of 4 ± 4 i corresponding to eigenvectors (−1, −1, 2) ± (− 3, 3, 0) ; note √√ that 1, (−1, −1, 2) , and (− 3, 3, 0) are mutually orthogonal. The phase diagram for the replicator dynamic is a spiral saddle: interior solutions on the diagonal where y1 = y2 = y3 head directly toward y∗ , while all other orbits are attracted to a two-dimensional manifold containing an unstable spiral. This is depicted in Figure 8.2.3, where behavior in populations 1, 2, and 3 is measured on the left-right, front-back, and top-bottom axes, respectively. Solutions on the manifold containing the unstable spiral converge to a sixsegment heteroclinic cycle; this cycle agrees with the best response cycle of the underlying normal form game. § Example 8.2.9. The best response dynamic in Mismatching Pennies. The analysis of the best response dynamic in Mismatching Pennies is very similar to the corresponding analysis in monocyclic games (Example 8.2.5). Divide the state space Y = [0, 1]3 into eight octants in the natural way. Then the two octants corresponding to vertices HHH and TTT are backward invariant, while solutions starting in any of the remaining six octants proceed through those octants according to the best response cycle of the underlying game (see Exercise 8.2.10). As Figure 8.2.4 illustrates, almost all solutions to the best response dynamic converge to a six-sided closed orbit in the interior of Y. § Exercise 8.2.10. (i) Give an explicit formula for the best response dynamic for Mismatching Pennies in terms of the state variable y ∈ Y = [0, 1]3 . (ii) Prove that octants HHH and TTT described in the previous example are backward invariant. (iii) Prove that solutions starting in any of the remaining octants proceed through those octants according to the best response cycle of the underlying game. The following proposition shows that the previous two examples are not exceptional. 323 T H H T H T T H H T T H Figure 8.2.4: The best response dynamic in Mismatching Pennies (two viewpoints). 324 Proposition 8.2.11. Let V(·) be an evolutionary dynamic that is generated by a C1 revision protocol ρ and that satisfies Nash stationarity (NS). Let F be Mismatching Pennies, and suppose that the ˙ unique Nash equilibrium x∗ of F is a hyperbolic rest point of x = VF (x). Then x∗ is unstable under VF , and there is an open, dense, full measure set of initial conditions from which solutions to VF do not converge. Proposition 8.2.11 is remarkable in that it does not require the dynamic to satisfy a payoff monotonicity condition. Instead, it takes advantage of the fact that by definition, the revision protocol for population p does not condition on the payoffs of other populations. In fact, the specific payoffs of Mismatching Pennies are not important to obtain the instability result; any three-player game whose unique Nash equilibrium is interior works equally well. 
The proof of the theorem makes these points clear. Proof. For ε close to 0, let Fε be generated by a perturbed version of Mismatching 1+2ε Pennies in which player 3’s payoff for playing H when player 1 plays T is not 1, but 1−2ε . Then like Mismatching Pennies itself, Fε has a unique Nash equilibrium, here given by 1 (( 1 + ε, 1 − ε), ( 1 , 1 ), ( 2 , 1 )). 2 2 22 2 For convenience, let us argue in terms of the state variable y = (x1 , x2 , x3 ) ∈ Y = [0, 1]3 HHH ˆ ˙ ˙ (see Exercise 8.2.12). If y = VFε ( y) is the dynamic x = VFε (x) expressed in terms of y, then Nash stationarity (NS) tells us that (8.13) ˆ1 VFε ( 2 + ε, 1 , 1 ) = 0 22 whenever |ε| is small. Now by definition, the law of motion for population 1 does not depend directly on payoffs in the other populations, regardless of the game at hand (cf equation (8.10)). Therefore, since changing the game from Fε to F0 does not alter population 1’s payoff function, equation (8.13) implies that ˆ1 VF0 ( 1 + ε, 1 , 1 ) = 0 2 22 whenever |ε| is small. This observation and the fact that the dynamic is differentiable at 1 y∗ = ( 2 , 1 , 1 ) imply that 22 ˆ1 ∂VF0 ∂ y1 ( y∗ ) = 0. ˆ Repeating this argument for the other populations shows that the trace of DVF0 ( y∗ ), ∗ ), is 0. Since y∗ is a hyperbolic rest point ˆ and hence the sum of the eigenvalues of DVF0 ( y ˆ ˆ of VF0 , it follows that some eigenvalue of DVF0 ( y∗ ) has positive real part, and thus that y∗ 325 ˆ is unstable under VF0 . Thus, the Stable Manifold Theorem (see Appendix 7.C) tells us that the set of initial conditions from which solutions converge to y∗ is of dimension at most 2, and that its complement is open, dense, and of full measure in Y. Exercise 8.2.12. Let X be the state space for a p population game with two strategies per population, and let Y = [0, 1]p , so that TY = Rp . (i) Show that the change of variable h : X → Y has inverse h−1 : Y → X, where y1 x1 1− y1 1 . . . −1 . and h ( y) = . . . h(x) = p p y x 1 1− yp (ii) Show that the derivative of h at x, Dh(x) : TX → TY, and the derivative of h−1 at y, ˜ Dh−1 ( y) : TY → TX, can be written as Dh(x)z = Mz and Dh−1 ( y)ζ = Mζ for some ˜ matrices M ∈ Rp ×2p and M ∈ R2p ×p . Show that if M is viewed as a linear map from ˜ TX to TY, then its inverse is M. ˆ (iii) Fix a C1 vector field V : X → TX, and define the new vector field V : Y → TY by ˆ ˆ ˙ ˙ V ( y) = h(V (h−1 ( y))). Show that the dynamics x = V (x) and y = V ( y) are linearly conjugate under H: that is, that {xt } solves the former equation if and only if {h(xt )} solves the latter. (iv) Let x∗ be a rest point of V , and let y∗ = h(x∗ ) be the corresponding rest point of ˆ V . Show that the eigenvalues of DV (x∗ ) with respect to TX are identical to the ˆ eigenvalues of DV ( y∗ ) with respect to TY. What is the relationship between the corresponding pairs of eigenvectors? 8.2.4 The Hypnodisk Game The virtue of Proposition 8.2.11 is that apart from hyperbolicity of equilibrium, virtually no assumptions about the evolutionary dynamic V(·) were needed to establish nonconvergence. We now show that if one is willing to introduce a payoff monotonicity condition—namely, positive correlation (PC)—then one can obtain a nonconvergence without smoothness conditions, and using a two-dimensional state variable, rather than a three-dimensional one as in Mismatching Pennies. This low dimensionality will turn out to be crucial when we study survival of dominated strategies in Section 8.4. Our construction will be based on potential games. 
In Figure 8.2.5, we present the potential function and projected payoff vector field of the coordination game

$F^C(x) = Cx = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = x$.

[Figure 8.2.5: A coordination game. (i) The potential function; (ii) the projected payoff vector field.]

By our analysis in Chapter 2, solutions to any evolutionary dynamic $\dot{x} = V_{F^C}(x)$ satisfying conditions (NS) and (PC) ascend the potential function $f^C(x) = \frac{1}{2}x'Cx = \frac{1}{2}((x_1)^2 + (x_2)^2 + (x_3)^2)$ drawn in diagram (i), or, equivalently, travel at acute angles to the projected payoff vectors in diagram (ii). It follows that solutions to $V_{F^C}$ from most initial conditions converge to the strict Nash equilibria at the vertices of X.

As a second example, suppose that agents are randomly matched to play the anticoordination game $-C$. In Figure 8.2.6, we draw the resulting population game $F^{-C}(x) = -Cx = -x$ and its concave potential function $f^{-C}(x) = -\frac{1}{2}x'Cx = -\frac{1}{2}((x_1)^2 + (x_2)^2 + (x_3)^2)$. Both pictures reveal that under any evolutionary dynamic satisfying conditions (NS) and (PC), all solution trajectories converge to the unique Nash equilibrium $x^* = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$.

[Figure 8.2.6: An anticoordination game. (i) The potential function; (ii) the projected payoff vector field.]

The construction of the hypnodisk game $H : X \to \mathbb{R}^3$ is easiest to describe in geometric terms. Begin with the coordination game $F^C(x) = Cx$ pictured in Figure 8.2.5(ii). Then draw two circles centered at state $x^* = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$ with radii $0 < r < R < \frac{1}{\sqrt{6}}$, as shown in Figure 8.2.7(i); the second inequality ensures that both circles are contained in the simplex. Twist the portion of the vector field lying outside of the inner circle in a clockwise direction, excluding larger and larger circles as the twisting proceeds, so that the outer circle is reached when the total twist is 180° (Figure 8.2.7(ii)).

Exercise 8.2.13. Provide an explicit formula for the resulting population game $H(x)$.

What does this construction accomplish? Examining Figure 8.2.7(ii), we see that inside the inner circle, H is identical to the coordination game $F^C$. Thus, solutions to dynamics satisfying (NS) and (PC) starting at states other than $x^*$ in the inner circle must leave the inner circle. At states outside the outer circle, H is identical to the anticoordination game $F^{-C}$, so solutions to dynamics satisfying (NS) and (PC) starting at states outside the outer circle must enter the outer circle. Finally, at each state x in the annulus bounded by the two circles, $H(x)$ is not a componentwise constant vector. Therefore, states in the annulus are not Nash equilibria, and so are not rest points of dynamics satisfying (NS). We assemble these observations in the following proposition.

Proposition 8.2.14. Let $V_{(\cdot)}$ be an evolutionary dynamic that satisfies (C), (NS), and (PC), and let H be the hypnodisk game. Then every solution to $\dot{x} = V_H(x)$ other than the stationary solution at $x^*$ enters the annulus with radii r and R and never leaves, ultimately converging to a cycle therein.

The claim of convergence to limit cycles in the final sentence of the proposition follows from the Poincaré-Bendixson Theorem (Theorem 8.A.5).

[Figure 8.2.7: Construction of the hypnodisk game. (i) Projected payoff vector field for the coordination game; (ii) projected payoff vector field for the hypnodisk game.]
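For readers who want to experiment with Exercise 8.2.13, the sketch below gives one concrete way to write down a game with the properties used in Proposition 8.2.14. The linear interpolation of the twist angle across the annulus, the particular radii, and the helper name hypnodisk are our own choices rather than the text's; any construction whose total twist reaches 180° at the outer circle would serve equally well.

```python
# One possible explicit construction of a hypnodisk-style payoff function on the
# simplex in R^3 (a sketch, not the book's formula). The twist angle grows
# linearly across the annulus; flip the sign of theta for the opposite twist.
import numpy as np

C = np.eye(3)                            # coordination game payoff matrix
x_star = np.ones(3) / 3                  # center of the simplex
Phi = np.eye(3) - np.ones((3, 3)) / 3    # orthogonal projection onto TX

def hypnodisk(x, r=0.1, R=0.3):
    d = x - x_star
    rho = np.linalg.norm(d)
    if rho <= r:                         # inside the inner circle: coordination game
        return C @ x
    if rho >= R:                         # outside the outer circle: anticoordination game
        return -C @ x
    v = Phi @ (C @ x)                    # projected payoffs of the coordination game
    theta = np.pi * (rho - r) / (R - r)  # total twist of 180 degrees across the annulus
    n = np.ones(3) / np.sqrt(3)          # unit normal to the tangent space TX
    # Rodrigues' rotation of v about the axis n by angle theta
    return (np.cos(theta) * v
            + np.sin(theta) * np.cross(n, v)
            + (1 - np.cos(theta)) * np.dot(n, v) * n)
```

Because conditions (NS) and (PC) only involve payoffs through their projections onto TX, returning the rotated projected vector in the annulus (which agrees with the projection of $-Cx$ when the twist reaches 180°) is enough for the argument, even though it differs from $-Cx$ by a constant vector.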
8.3 Chaotic Evolutionary Dynamics

In all of the phase diagrams we have seen so far, ω-limit sets have taken a fairly simple form: solution trajectories have converged to rest points, closed orbits, or chains of rest points and connecting orbits. When we consider games with just two or three strategies, this is unavoidable: clearly, all solution trajectories of continuous time dynamics in one dimension converge to equilibrium, while in two-dimensional systems, the Poincaré-Bendixson Theorem (Theorem 8.A.5) tells us that the three types of ω-limit sets described above exhaust all possibilities. Once we move to flows in three or more dimensions, ω-limit sets can be much more complicated sets, often referred to as chaotic (or strange) attractors. Central to most definitions of chaos is sensitive dependence on initial conditions: solution trajectories starting from close together points on the attractor move apart at an exponential rate. Chaotic attractors can also be recognized in phase diagrams by their rather intricate appearance. Rather than delving deeply into these ideas, we content ourselves by presenting a few examples.

Example 8.3.1. Consider the single population game F generated by random matching in the normal form game A below:

$F(x) = Ax = \begin{pmatrix} 0 & -12 & 0 & 22 \\ 20 & 0 & 0 & -10 \\ -21 & -4 & 0 & 35 \\ 10 & -2 & 2 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}$.

The lone interior Nash equilibrium of this game is the central state $x^* = (\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, \frac{1}{4})$. Let $\dot{x} = V_F(x)$ be the replicator dynamic for game F. One can calculate that the eigenvalues of $DV_F(x^*)$ are approximately $-3.18$ and $.34 \pm 1.98i$, so like the Nash equilibrium of Mismatching Pennies (Example 8.2.8), the interior equilibrium $x^*$ here is a spiral saddle with an unstable spiral.

Figure 8.3.1 presents the initial portion of the solution of $\dot{x} = V_F(x)$ from initial condition $x_0 = (.24, .26, .25, .25)$. This solution spirals clockwise about $x^*$. Near the rightmost point of each circuit, where the value of $x_3$ gets close to zero, solutions sometimes proceed along an "outside" path on which the value of $x_3$ surpasses .6. But they sometimes follow an "inside" path on which $x_3$ remains below .4, and at other times do something in between. Which of these alternatives occurs is difficult to predict from approximate information about the previous behavior of the system.

[Figure 8.3.1: A chaotic attractor under the replicator dynamic. Time paths of $x_1$, $x_2$, $x_3$, and $x_4$.]
[Figure 8.3.2: Sensitive dependence on initial conditions under the replicator dynamic.]

Sensitive dependence on initial conditions is illustrated directly in Figure 8.3.2, which tracks the solutions from two nearby initial conditions, $(.47, .31, .11, .11)$ and $(.46999, .31, .11, .11001)$. Apparently, the two solutions stay close together through time $t = 50$ but diverge thereafter; after time $t = 60$, the current position of one of the solutions provides little hint about the current position of the other. §

The scattered payoff entries in the previous example may seem to suggest that chaos only occurs in "artificial" examples. To dispute this view, we now show that chaotic behavior can occur in very simple games.

Example 8.3.2. Asymmetric Rock-Paper-Scissors.
Suppose that two populations of agents are randomly matched to play the two-player zero-sum game $U = (U^1, U^2)$, in which player I chooses a row and player II a column:

            r                p                s
    R    1/2, -1/2        -1, 1            1, -1
    P    1, -1            1/2, -1/2        -1, 1
    S    -1, 1            1, -1            1/2, -1/2

U is an asymmetric version of Rock-Paper-Scissors in which a "draw" results in a half-credit win for player 1.

Figures 8.3.3 and 8.3.4 each present a single solution trajectory of the replicator dynamic for $F^U$. Since the social state $x = (x^1, x^2)$ is four-dimensional, we draw it in two pieces, with $x^1$ represented on the left hand side of each figure and $x^2$ represented on the right. Because U is a zero-sum game with Nash equilibrium $((\frac{1}{3}, \frac{1}{3}, \frac{1}{3}), (\frac{1}{3}, \frac{1}{3}, \frac{1}{3}))$, each solution of the replicator dynamic lives within a level set of $H(x) = -\sum_{p \in \mathcal{P}} \sum_{i \in S^p} \log x_i^p$. In Figure 8.3.3, whose initial condition is $((.5, .25, .25), (.5, .25, .25))$, the solution trajectory appears to follow a periodic orbit, much like those in our examples from Section 8.1.1. But in Figure 8.3.4, whose initial condition $((.5, .01, .49), (.5, .25, .25))$ is closer to the boundary of X, the solution trajectory travels around the level set of H in a seemingly haphazard way. Thus, despite the regularity provided by the constant of motion, the evolution of behavior in this simple game is complicated indeed. §

8.4 Survival of Dominated Strategies

By now we have thoroughly considered whether the prediction of Nash equilibrium play can be justified using evolutionary arguments. On the positive side, Chapters 4 and 5 show that there are many dynamics whose rest points are always identical to the Nash equilibria of the underlying game, and Chapter 6 shows that convergence to Nash equilibrium can be assured under many of these dynamics in particular classes of games. But the final word on this question appears in Section 8.2, which demonstrates that no evolutionary dynamic can converge to Nash equilibrium in all games.

This negative result leads us to consider a more modest question. Rather than seek evolutionary support for equilibrium play, we instead turn our attention to a more basic rationality requirement: namely, the avoidance of strategies that are strictly dominated. Theorem 6.4.4 seems to bear out the intuition that evolutionary dynamics select against dominated strategies. But upon further reflection, one finds that there is no a priori reason to expect dominated strategies to be eliminated. Evolutionary dynamics are built upon the notion that agents switch to strategies whose current payoffs are reasonably good. But even if a strategy is dominated, it can have reasonably good payoffs at many population states. Put differently, domination is a "global" property, depending on payoffs at all states, while decision making in evolutionary models is "local", depending only on the payoffs available at present. By this logic, there is no reason to expect evolutionary dynamics to eliminate dominated strategies as a general rule.

To turn this intuition into a formal result, we introduce one further condition on evolutionary dynamics.

(IN) Innovation: If $x \notin NE(F)$, $x_i = 0$, and $i \in \mathrm{argmax}_{j \in S} F_j(x)$, then $(V_F)_i(x) > 0$.

[Figure 8.3.3: Cycling in asymmetric Rock-Paper-Scissors. Time paths of $x_R$, $x_P$ (population 1) and $x_r$, $x_p$ (population 2).]
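The constant of motion claimed in Example 8.3.2 is easy to check numerically. The sketch below is our own illustration, with scipy's integrator standing in for an exact solution: it integrates the two-population replicator dynamic for the game U from the two initial conditions used in Figures 8.3.3 and 8.3.4 and reports how much $H(x) = -\sum_p \sum_i \log x_i^p$ varies along each solution; up to integration error, the variation should be zero.

```python
# Numerical check (a sketch) that H is constant along two-population replicator
# trajectories in the asymmetric RPS game U of Example 8.3.2.
import numpy as np
from scipy.integrate import solve_ivp

U1 = np.array([[0.5, -1.0, 1.0],
               [1.0, 0.5, -1.0],
               [-1.0, 1.0, 0.5]])
U2 = -U1.T                                # zero-sum: player 2's payoff matrix, transposed

def replicator(t, x):
    x1, x2 = x[:3], x[3:]
    F1, F2 = U1 @ x2, U2 @ x1             # payoffs to the two populations' strategies
    return np.concatenate([x1 * (F1 - x1 @ F1), x2 * (F2 - x2 @ F2)])

for x0 in ([0.5, 0.25, 0.25, 0.5, 0.25, 0.25],
           [0.5, 0.01, 0.49, 0.5, 0.25, 0.25]):
    sol = solve_ivp(replicator, [0, 200], x0, rtol=1e-10, atol=1e-12)
    H = -np.log(sol.y).sum(axis=0)
    print(H.max() - H.min())              # near zero: H is a constant of motion
```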
334 R r P S p s xR xP 1 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 50 100 150 200 t 50 xr 150 200 t 0.4 0.2 100 0.6 0.4 50 0.8 0.6 200 t 1 0.8 150 xp 1 100 0.2 50 100 150 200 t Figure 8.3.4: Chaos in asymmetric Rock-Paper-Scissors. 335 Innovation (IN) requires that when a non-Nash population state includes an unused optimal strategy, this strategy’s growth rate must be strictly positive. In other words, if an unplayed strategy is sufficiently rewarding, some members of the population will discover it and select it. We are now in a position to state our survival theorem. Theorem 8.4.1. Suppose the evolutionary dynamic V(·) satisfies (C), (NS), (PC), and (IN). Then ˙ there is a game F such that under x = VF (x), along solutions from most initial conditions, there is a strictly dominated strategy played by a fraction of the population bounded away from 0. Proof. Let H be the hypnodisk game introduced in Section 8.2.4. Let F be the fourstrategy game obtained from H by adding a twin to strategy 3: Fi (x1 , x2 , x3 , x4 ) = Hi (x1 , x2 , x3 + x4 ) for i ∈ {1, 2, 3}; F4 (x) = F3 (x). Strategies 3 and 4 are identical, in that they always yield the same payoff and always have the same payoff consequences for other strategies. The set of Nash equilibria of F is the line segment NE = x∗ ∈ X : x∗ = x∗ = x∗ + x∗ = 2 3 4 1 1 3 . Let I = x ∈ X : (x1 − 1 )2 + (x2 − 1 )2 + (x3 + x4 − 1 )2 ≤ r2 and 3 3 3 O = x ∈ X : (x1 − 1 )2 + (x2 − 1 )2 + (x3 + x4 − 1 )2 ≤ R2 3 3 3 be concentric cylindrical regions in X surrounding NE, as pictured in Figure 8.4.1. By construction, we have that 1 0 0 ˜ = 0 1 0 F(x) = Cx 0 0 1 001 0 0 1 1 x1 x 2 . x 3 x4 at all x ∈ I, so under any dynamic satisfying (PC) and (NS), solutions starting in I − NE ˜ ascend the potential function f C (x) = 1 ((x1 )2 + (x2 )2 + (x3 + x4 )2 ) until leaving the set I. At 2 ˜ states outside the set O, we have that F(x) = −Cx, so solutions starting in X − O ascend ˜ ˜ f −C (x) = − f C (x) until entering O. In summary: 336 1 2 3 4 Figure 8.4.1: Regions O, I, and D = O − I. Lemma 8.4.2. Suppose that V(·) is an evolutionary dynamic that satisfies conditions (C), (NC) ˙ and (PC), and let F be the “hypnodisk with a twin” game. Then every solution to x = VF (x) other than the stationary solutions in NE enter region D = O − I and never leave. Define the flow from the set U ⊆ X under the dynamic VF by ˙ φt (U) = {ξ ∈ X : there is a solution {xs } to x = VF (x) with x0 ∈ U and xt = ξ.} In words, φt (U) contains the time t positions of solutions to VF whose initial conditions are in U. ˜ Since solutions to VF starting in I − NE ascend the function f C until leaving the set I, the reverse time flow is well-defined from all such states, and NE is a repellor under VF . This means that all backward-time solutions to VF that begin in some neighborhood U of NE converge to NE uniformly over time, or, equivalently, that NE is an attractor of the ˙ time-reversed equation y = −VF ( y) (see Appendix 8.B). The dual attractor A of the repellor NE is the forward-time limit of the flow of VF starting from the complement of cl(U): A= φt (X − cl(U)). t≥0 337 A is nonempty, compact, and (both forward and backward) invariant under VF , and Lemma 8.4.2 tells us that A ⊂ D. We now show that the twin strategy is used by a positive mass of agents throughout the attractor A . Let Z = {x ∈ X : x4 = 0} be the face of X on which the twin strategy is unused; we prove Lemma 8.4.3. The attractor A and the face Z are disjoint. Proof. 
Since VF is Lipschitz continuous and satisfies (VF )i (x) ≥ 0 whenever xi = 0, solutions to VF that start in X − Z cannot approach Z more than exponentially quickly, and in particular cannot reach Z in finite time (see Exercise 8.4.4). Equivalently, backward solutions to VF starting from states in Z cannot enter int(X). Now suppose by way of contradiction that there exists a state ξ in A ∩ Z. Then by our previous arguments, the entire backward orbit from ξ is also contained in A ∩ Z, and hence in D ∩ Z. Since the latter set contains no rest points by condition (PC), the Poincar´ e Bendixson Theorem (Theorem 8.A.5) implies that the backward orbit from ξ converges to a closed orbit γ in D ∩ Z that circumnavigates I ∩ Z. By construction, the annulus D ∩ Z can be split into three regions: one in which strategy 1 is the best response, one in which strategy 2 is the best response, and one in which strategy 3 (and hence strategy 4) is a best response (Figure 8.4.2). Each of these regions is bounded by a simple closed curve that intersects the inner and outer boundaries of the annulus. Therefore, the closed orbit γ, on which strategy 4 is unused, passes through the region in which strategy 4 is optimal. This contradicts innovation (IN). Exercise 8.4.4. Use Gronwall’s Inequality (Lemma 3.A.7) to check the initial claim in the ¨ proof of the lemma. To complete the proof, we now make the twin strategy “feeble”: we uniformly reduce its payoff by ε, creating the new game Fε (x) = F(x) − εe4 . Observe that strategy 4 is strictly dominated by strategy 3 in game Fε . As increasing ε from 0 continuously changes the game from F to Fε , doing so also continuously changes the dynamic from VF to VFε . Thus, by Theorem 8.B.3 on continuation of attractors, we have that for small ε, the attractor A of VF continues to an attractor A ε of VFε on which x4 > 0: thus, the dominated strategy survives throughout A ε . The basin of the attractor A ε contains all points outside of a thin tube around the set NE of Nash equilibria of F. This completes the proof of Theorem 8.4.1. 338 1 3 2 1 2 3 Figure 8.4.2: The best response correspondence of the hypnodisk game. We conclude this chapter with some examples that illustrate and extend the analysis above. Example 8.4.5. We use the hypnodisk game as the basis for the proof of Theorem 8.4.1 because it generates cycling under any dynamic that satisfies (NS) and (PC). But the use of this game is not essential: once we fix the dynamic under consideration, we can find a simpler game that leads to cycling; then the argument based on the introduction of twin strategies can proceed as above. We illustrate this point by constructing an example of survival under the Smith dynamic. Figure 8.4.3 contains the phase diagram for the Smith dynamic in the bad RockPaper-Scissors game 0 −l w x1 G(x) = Ax = w 0 −l x2 , −l w 0 x 3 where w = 1 and l = 2. Evidently, the unique Nash equilibrium x∗ = ( 1 , 1 , 1 ) is unstable, 333 and most solution trajectories converge to a cycle located in int(X). 339 Figure 8.4.3: The Smith dynamic in bad RPS. R R P P S S T (i) bad RPS with a twin T (ii) bad RPS with a feeble twin Figure 8.4.4: The Smith dynamic in two games. 340 Figure 8.4.4(i) presents the Smith dynamic in “bad RPS with a twin”, (8.14) 0 −l w w x1 w 0 −l −l x 2 ˜ = F(x) = Ax −l w 0 0 x . 3 −l w 0 0 x4 1 The Nash equilibria of F are the states on line segment NE = {x∗ ∈ X : x∗ = ( 3 , 1 , c, 1 − c)}, 3 3 which is a repellor under the Smith dynamic. 
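Before the analytical derivation that follows, here is a quick numerical illustration, our own rather than the text's, of the claim established just below: under the Smith dynamic in "bad RPS with a twin" (equation (8.14) with w = 1 and l = 2), the gap between the weights on Scissors and Twin shrinks along solutions, so trajectories approach the plane P = {x ∈ X : x3 = x4}.

```python
# Sketch: the Smith dynamic in "bad RPS with a twin" pulls x3 - x4 toward zero.
import numpy as np
from scipy.integrate import solve_ivp

w, l = 1.0, 2.0
A = np.array([[0.0, -l,  w,  w],
              [w,  0.0, -l, -l],
              [-l,  w, 0.0, 0.0],
              [-l,  w, 0.0, 0.0]])

def smith(t, x):
    F = A @ x
    gain = np.maximum(F[:, None] - F[None, :], 0.0)   # gain[i, j] = [F_i - F_j]_+
    return gain @ x - x * gain.sum(axis=0)            # inflow minus outflow

x0 = np.array([0.5, 0.2, 0.25, 0.05])
sol = solve_ivp(smith, [0, 60], x0, rtol=1e-8, atol=1e-10)
print(np.abs(sol.y[2] - sol.y[3])[[0, -1]])           # |x3 - x4| at the start and end
```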
Furthermore, since Scissors and Twin always earn the same payoffs (F3 (x) ≡ F4 (x)), we can derive a simple expression for the rate of change of the difference between their utilization levels: (8.15) ˙ ˙ x3 − x4 = x j [F3 (x) − F j (x)]+ − x3 j∈S j∈S − [F j (x) − F3 (x)]+ x j [F4 (x) − F j (x)]+ − x4 j∈S j∈S = −(x3 − x4 ) [F j (x) − F4 (x)]+ [F j (x) − F4 (x)]+ . j∈S Intuitively, strategies lose agents at rates proportional to their current levels of use, but gain strategies at rates that depend on their payoffs; thus, when the dynamics are not at rest, the weights x3 and x4 move closer together. It follows that except at Nash equilibrium states, the dynamic moves toward the plane P = {x ∈ X : x3 = x4 } on which the identical twins receive equal weight (see Exercise 8.4.6). Figure 8.4.4(ii) presents the Smith dynamic in “bad RPS with a feeble twin”, (8.16) 0 −l w w x1 w 0 −l −l x2 , ˜ Fε (x) = Aε x = −l w 0 0 x3 −l − ε w − ε −ε −ε x4 1 where ε = 10 . Evidently, the attractor from the previous figure moves slightly to the left, and the strictly dominated strategy Twin survives. Indeed, since the Nash equilibrium of “RPS with a twin” on plane P puts mass 1 on Twin, when ε is small solutions to the Smith 6 1 dynamic in “RPS with a feeble twin” place mass greater than 6 on the strictly dominated strategy Twin infinitely often. This lower bound is driven by the fact that in the game with an exact twin, solutions converge to plane P; thus, the bound will obtain under any 341 R R P P S S T (i) RPS with a twin T (ii) RPS with a feeble twin Figure 8.4.5: The replicator dynamic in two games. dynamic that treats different strategies symmetrically. § Exercise 8.4.6. Show that under the Smith dynamic in “RPS with a twin”, solutions from states not on the line of Nash equilibria NE converge to the plane P where the weights on Scissors and Twin are equalized. (Hint: Use equation (8.15) and the Poincar´ -Bendixson e Theorem. You may take as given that the set NE is a repellor.) Example 8.4.7. Theorem 6.4.4 showed that dominated strategies are eliminated along interior solutions of imitative dynamics. But Theorem 8.4.1 shows that this result is not robust to small changes in these dynamics. To understand why, consider evolution under the replicator dynamic in “(standard) RPS with a twin”. In standard Rock-Paper-Scissors, interior solutions of the replicator dynamic are closed orbits (see, e.g., Section 8.1.1). When we introduce an exact twin, xS equation (6.38) tells us that the ratio xT is constant along every solution trajectory. This is xS evident in Figure 8.4.5(i), which shows that the planes on which the ratio xT is constant are all invariant sets. If we make the twin feeble by lowering its payoff uniformly by ε, we xS obtain the dynamics pictured in Figure 8.4.5(ii): now the ratio xT increases monotonically, and the dominated strategy is eliminated. The existence of a continuum of invariant hyperplanes under imitative dynamics in games with identical twins is crucial to this argument. At the same time, dynamics with a continuum of invariant hyperplanes are structurally unstable. If we fix the game but slightly alter the agents’ revision protocol, these invariant sets can collapse, overturning the elimination result. 342 R R P P S S T T (i) bad RPS with a twin Figure 8.4.6: The 9 10 (ii) bad RPS with a feeble twin replicator + 1 10 Smith dynamic in two games. 
To make this argument concrete, suppose that instead of always following an imitative protocol, agents occasionally use a protocol that requires direct evaluation of payoffs. Such a situation is illustrated in Figure 8.4.6(i), which contains the phase diagram for 11 91 “bad RPS with a twin” (with w = 1 and l = 10 ) under a ( 10 , 10 ) convex combination of the replicator and Smith dynamics. While Figure 8.4.5(i) displayed a continuum of invariant hyperplanes, Figure 8.4.6(i) shows almost all solution trajectories converging to a limit cycle on the plane where xS = xT . If we make the twin feeble, the limit cycle moves slightly to the left, as in Figure 8.4.6(ii), and the dominated strategy survives. § Exercise 8.4.8. Show that an analogue of equation (6.38) holds for the projection dynamic on int(X). Explain why this does not imply that dominated strategies are eliminated along all solutions to the projection dynamic starting from interior initial conditions. Appendix 8.A Three Classical Theorems on Nonconvergent Dynamics 8.A.1 Liouville’s Theorem ˙ Let V : Rn → Rn be a C1 vector field, and consider the differential equation x = V (x) with flow φ : R × Rn → Rn . Let the set A ⊂ Rn be measurable with respect to Lebesgue 343 measure µ on Rn . Liouville’s Theorem concerns the time evolution of µ(φt (A)), the measure (or volume) of the time t image of A under φ. Theorem 8.A.1 (Liouville’s Theorem). d µ(φt (A)) dt = φt (A) tr(DV (x)) dµ(x). The quantity tr(DV (x)) = i ∂Vii (x) ≡ divV (x) is known as the divergence of V at x. ∂x According to Liouville’s Theorem, divV governs the local rates of change in volume ˙ under the flow φ of x = V (x). In particular, if divV = 0 on an open set O ⊆ Rn —that is, if V is divergence-free on this set—then the flow φ conserves volume on O. Before proceeding with the proof of Liouville’s Theorem, let us note that it extends immediately to cases in which the law of motion V : X → TX defined on an affine set X ⊂ Rn with tangent space TX. In this case, µ represents Lebesgue measure on (the affine hull of) X. The only cautionary note is that the derivative of V at state x ∈ X must be represented using the derivative matrix DV (x) ∈ Rn×n , which by definition has rows in ˆ TX. We showed how to compute this matrix in Appendix 2.B.3: if V : Rn → Rn is a C1 ˆ extension of V , then DV (x) = DV (x)PTX , where PTX ∈ Rn×n is the orthogonal projection of Rn onto the subspace TX. Proof. Using the standard multivariate change of variable, we express the measure of the set φt (A) as (8.17) µ(φt (A)) = φt (A) 1 dµ(xt ) = det(Dφt (x0 )) dµ(x0 ). A The derivative matrix Dφt (x0 ) in equation (8.17) captures changes in φt (x0 ), the time t ˙ position of the solution to x = V (x) from initial condition x0 , as this initial condition is varied. It follows from arguments below that det(Dφt (x0 )) > 0, so that the absolute value taken in equation (8.17) is unnecessary. Taking the time derivative of this equation and then differentiating under the integral sign thus yields (8.18) d µ(φt (A)) dt = A d dt det(Dφt (x0 )) dµ(x0 ). Evaluating the right hand side of equation (8.18) requires two lemmas. The first of these is stated in terms of the time inhomogeneous linear equation (8.19) ˙ yt = DV (xt ) yt , ˙ where {xt } is the solution to x = V (x) from initial condition x0 . Equation (8.19) is known as ˙ the (first) variation equation associated with x = V (x). 344 Lemma 8.A.2. 
The matrix trajectory {Dφt (x0 )}t≥0 is the matrix solution to the first variation equation from initial condition Dφ0 (x0 ) = I ∈ Rn×n . More explicitly, (8.20) d Dφt (x0 ) dt = DV (φt (x0 )) Dφt (x0 ). In words, Lemma 8.A.2 tells us that the column trajectories of {Dφt (x0 )}t≥0 are the solutions to the first variation equation whose initial conditions are the standard basis vectors e1 , . . . , en ∈ Rn . d Proof. By definition, the time derivative of the flow from x0 satisfies dt φt (x0 ) = V (φt (x0 )). Differentiating with respect to x0 and then reversing the order of differentiation yields (8.20). Lemma 8.A.3 provides two basic matrix identities, the first of which is sometimes called Liouville’s formula. Lemma 8.A.3. Let M ∈ Rn×n . Then (i) det(exp(M)) = exp(tr(M)). d (ii) dt det(exp(Mt)) t=0 = tr(M). Proving part (i) of the lemma is not difficult, but the intuition is clearest when M is a diagonal matrix: λ1 det exp ... ··· 0 . . . . . . 0 ··· λn eλ1 . = det .. ··· 0 . . . . . . 0 ··· eλn = e = exp λi i i λ1 λi = exp tr ... ··· 0 . .. . . . 0 ··· λn . Part (ii) follows from part (i) by replacing M with Mt and differentiating. Lemmas 8.A.2 and 8.A.3(ii) enable us to evaluate equation (8.18). First, note that Lemma 8.A.2 and the linearity of the first variation equation imply that Dφt (x0 ) ≈ exp(DV (x0 )t) when t is close to 0. Combining this observation with Lemma 8.A.3(ii) shows that d dt det(Dφt (x0 ))) t=0 ≈ d dt det(exp(DV (x0 )t)) = tr(DV (x0 )). By substituting this equality into of equation (8.18) and noting that our focus on time t = 0 has been arbitrary, we obtain Liouville’s Theorem. Liouville’s Theorem can be used to prove the nonexistence of asymptotically stable sets. Since solutions in a neighborhood of such a set all approach the set, volume must 345 be contracted in this neighborhood. It follows that a region in which divergence is nonnegative cannot contain an asymptotically stable set. Theorem 8.A.4. Suppose divV ≥ 0 on the open set O ⊆ Rn , and let A ⊂ O be compact. Then A ˙ is not asymptotically stable under x = V (x). Theorem 8.A.4 does not rule out the existence of Lyapunov stable sets. In fact, the example of the replicator dynamic in standard Rock-Paper-Scissors shows that such sets are not unusual when V is divergence-free. 8.A.2 The Poincar´ -Bendixson and Bendixson-Dulac Theorems e We now present two classical results concerning differential equations on the plane. The celebrated Poincar´-Bendixson Theorem characterizes the possible long run behave iors of such dynamics, and provides a simple way of establishing the existence of periodic orbits. Recall that a periodic (or closed) orbit of a differential equation is a nonconstant solution {xt }t≥0 such that xT = x0 for some T > 0. Theorem 8.A.5 (The Poincar´ -Bendixson Theorem). Let V : R2 → R2 be Lipschitz continue ˙ ous, and consider the differential equation x = V (x). (i) Let x ∈ R2 . If ω(x) is compact, nonempty, and contains no rest points, then it is a periodic orbit. (ii) Let Y ⊂ R2 . If Y is nonempty, compact, forward invariant, and contains no rest points, then it contains a periodic orbit. Theorem 8.A.5 tells us that in planar systems, the only possible ω-limit sets are rest points, sequences of trajectories leading from one rest point to another (called heteroclinic orbits where there are multiple rest points in the sequence and homoclinic orbits when there is just one), and periodic orbits. 
In part (i) of the theorem, the requirement that ω(x) be compact and nonempty are automatically satisfied when the dynamic is forward invariant on a compact set—see Proposition 6.A.1. The next result, the Bendixson-Dulac Theorem, provides a method of ruling out the existence of closed orbits in planar systems. To state this theorem, we recall that a set Y ⊂ R2 is simply connected if it contains no holes: more precisely, if every closed curve in Y can be continuously contracted within Y to a single point. Theorem 8.A.6 (The Bendixson-Dulac Theorem). Let V : R2 → R2 be C1 , and consider the ˙ differential equation x = V (x). If divV 0 throughout the simply connected set Y, then Y does not contain a closed orbit. 346 Proof. If γ is a closed orbit in Y, then the region R bounded by γ is invariant under φ. Thus d µ(φt (R)) dt = φt (R) divV (x) dµ(x) by Liouville’s Theorem. Since divV is continuous and nonzero throughout Y, its sign must be constant throughout Y. If this sign is negative, then the volume of R contracts under φ; if it is positive, then the volume of R expands under φ. Either conclusion contradicts the invariance of R under φ. Both of the results above extend to dynamics defined on two-dimensional affine spaces in the obvious way. 8.B Attractors and Continuation 8.B.1 Attractors and Repellors Let φ be a semiflow on the compact set X ⊂ Rn : that is, φ : [0, ∞) × X → X is a continuous map with φ0 (x) = x that satisfies the group property φt (φs (x)) = φt+s (x) for all s, t ≥ 0 and x ∈ X. We call the set A ⊆ X forward invariant under φ if φt (A ) = A for all t ≥ 0. Note that in this case, the sets {φt (A )}t≥0 are nested. We call A invariant under φ if φt (A ) = A for all t ∈ R. It is implicit in this definition that on the set A we have not only a semiflow, but also a flow: on A , we can extend the map φ to be well-defined and satisfy the group property not just for times in [0, ∞), but also for times in (−∞, ∞). A set A ⊆ X is an attractor of φ if it is nonempty, compact, and invariant under φ, and if there is a neighborhood U of A such that lim sup dist(φt (x), A ) = 0. t→∞ x∈U The set B(A ) = {x ∈ X : ω(x) ⊆ A } is called the basin of A . Attractors only differ from asymptotically stable sets (as defined in Chapter 6) only in that the latter need not be invariant. Attractors can be defined in a number of equivalent ways. In the following lemma, the ω-limit of the set U ⊆ X is defined as ω(U) = cl φs (U) . t≥0 s≥t 347 Proposition 8.B.1. The following statements are equivalent: (i) A is an attractor of φ. (ii) There is a neighborhood U of A such that A = ω(U). (iii) A = t≥0 φt (O), where O is open and satisfies φT (cl(O)) ⊂ O for some T > 0. (iv) A = t≥0 φt (O), where O is open, forward invariant, and satisfies φT (cl(O)) ⊂ O for some T > 0. (v) A = t≥0 φt (O), where O is open, forward invariant, and satisfies φt (cl(O)) ⊂ O for all t > 0. In parts (iii), (iv), and (v), the set O is known as a weak trapping region, a trapping region, and a strongly forward invariant trapping region, respectively. Now suppose that φ : (−∞, ∞) × X → X is a flow on X. In this case, the set A ∗ = B(A ) − A is known as the dual repellor of A . A ∗ is the α-limit of of a neighborhood of itself (i.e., it is the ω-limit of a neighborhood of itself under the time-reversed version of φ), it is nonempty, compact, and invariant under φ. Theorem 8.B.2 shows that the behavior of the flow on B(A ) = X − (A ∪ A ∗ ) is very simple: it admits a strict Lyapunov function. Theorem 8.B.2. 
Let (A , A ∗ ) be an attractor-repellor pair of the flow φ on X. Then there exists a continuous function L : X → [0, 1] with L−1 (0) = A ∗ and L−1 (1) = A such that L is strictly increasing on B(A ) under φ. If φ is only a semiflow, one can still find a continuous Lyapunov function L : (B(A ) ∪ A ) → [0, 1] with L−1 (1) = A that is strictly increasing on B(A ). 8.B.2 Continuation of Attractors ˙ Consider now a one-parameter family of differential equations x = V ε (x) in Rn with ε ε unique solutions xt = φt (x0 ) such that (ε, x) → V (x) is continuous. Then (ε, t, x) → φε (x) t n is continuous as well. Suppose that X ⊂ R is compact and forward invariant under the semi-flows φε . For ε = 0 we omit the superscript in φ. The following continuation theorem for attractors is part of the folklore of dynamical systems. Theorem 8.B.3. Let A be an attractor for φ with basin B(A ). Then for each small enough ε > 0 there exists an attractor A ε of φε with basin B(A ε ), such that the map ε → A ε is upper hemicontinuous and the map ε → B(A ε ) is lower hemicontinuous. Upper hemicontinuity cannot be replaced by continuity in this result. Consider the ˙ family of differential equations x = (ε + x2 )(1 − x) on the real line. The semi-flow φ 348 corresponding to ε = 0 admits A = [0, 1] as an attractor, but when ε > 0 the unique attractor of φε is A ε = {1}. This example shows that perturbations can cause attractors to implode; the theorem shows that perturbations cannot cause attractors to explode. Theorem 8.B.3 is a direct consequence of the following lemma. Lemma 8.B.4. Let A be an attractor for φ with basin B(A ), and let U1 and U2 be open sets satisfying A ⊆ U1 ⊆ U2 ⊆ cl(U2 ) ⊆ B(A ). Then for each small enough ε > 0 there exists an attractor A ε of φε with basin B(A ε ), such that A ε ⊆ U1 and U2 ⊆ B(A ε ). In this lemma, one can always set U1 = {x : dist(x, A ) < δ} and U2 = {x ∈ B(A ) : dist(x, X − B(A )) > δ} for some small enough δ > 0. Proof of Lemma 8.B.4. Since A is an attractor and ω(cl(U2 )) = A , there is a T > 0 such that φt (cl(U2 )) ⊂ U1 for t ≥ T. By the continuous dependence of the flow on the parameter ε and the compactness of φT (cl(U2 )), we have that φε (cl(U2 )) ⊂ U1 ⊆ U2 for all small T enough ε. Thus, U2 is a weak trapping region for the semi-flow φε , and so A ε ≡ ω(U2 ) is an attractor for φε . In addition, A ε ⊂ U1 (since A ε = φε (A ε ) ⊆ φε (cl(U2 )) ⊂ U1 ) and T T U2 ⊂ B(A ε ). 8.N Notes Section 8.1. The conservative properties of dynamics studied in this chapter—the existence of a constant of motion and the preservation of volume—are basic properties of Hamiltonian systems. For more on this connection, see Akin and Losert (1984) and Hofbauer (1995a, 1996); for a general introduction to Hamiltonian systems, see Marsden and Ratiu (2002). Exercises 8.1.4 and 8.1.5 are due to Schuster et al. (1981b,c). Theorem 8.1.7 is due to Akin and Losert (1984) and Hofbauer (1995a), while Theorem 8.1.8 is due to Hofbauer and Sigmund (1988) and Ritzberger and Weibull (1995) (also see Weibull (1995)). Section 8.2. Circulant games were introduced by Hofbauer and Sigmund (1988), who call them “cyclically symmetric games”; also see Schuster et al. (1981c). The hypercycle system was proposed by Eigen and Schuster (1979) to model of cyclical catalysis in a collection of polynucleotides during prebiotic evolution. That the boundary of the simplex is repelling under the hypercycle system when n ≥ 5, a property known as permanence, was established by Hofbauer et al. 
(1981); the existence of stable limit cycles in this context was proved by Hofbauer et al. (1991). Monocyclic games are studied in the context of the replicator dynamic by Hofbauer and Sigmund (1988), who call them “essentially hypercyclic” games. The uniqueness of 349 the interior Nash equilibrium in Example 8.2.5 follows from the fact that the replicator dynamic is permanent in this game: see Theorems 19.5.1 and 20.5 of Hofbauer and Sigmund (1988) (or Theorems 13.5.1 and 14.5.1 of Hofbauer and Sigmund (1998)). The analysis of the best response dynamic in this example is due to Hofbauer (1995b), Gaunersdorfer and Hofbauer (1995), and Bena¨m et al. (2006a). Lahkar (2007), building on work of ı Hopkins and Seymour (2002), employs these results to establish the dynamic instability of dispersed price equilibria in Burdett and Judd’s (1983) model of equilibrium price dispersion. Proposition 8.2.7 is due to Hofbauer and Swinkels (1996); also see Hofbauer and Sigmund (1998, Section 8.6). The Mismatching Pennies game was introduced by Jordan (1993), and was inspired by a 3 × 3 example due to Shapley (1964); see Sparrow et al. (2008) for a recent analysis of Shapley’s (1964) example. The analyses of the replicator and best response dynamics in Mismatching Pennies are due to Gaunersdorfer and Hofbauer (1995). Proposition 8.2.11 is due to Hart and Mas-Colell (2003). The hypnodisk game is introduced in Hofbauer and Sandholm (2006). Section 8.3. For introductions to chaotic differential equations, see Hirsch et al. (2004) at the undergraduate level or Guckenheimer and Holmes (1983) at the graduate level. Example 8.3.1 is due to Arneodo et al. (1980), who introduce it in the context of the LotkaVolterra equations; also see Skyrms (1992). The attractor in this example is known as a Shilnikov attractor; see Hirsch et al. (2004, Chapter 16). Example 8.3.2 is due to Sato et al. (2002). Section 8.4. This section follows Hofbauer and Sandholm (2006). That paper builds on the work of Berger and Hofbauer (2006), who establish that strictly dominated strategies can survive under the BNN dynamic. For a survival result for the projection dynamic, see Sandholm et al. (2008). Section 8.A. For further details on Liouville’s Theorem, see Sections 4.1 and 5.3 of Hartman (1964). Theorem 8.A.4 in the text is Proposition 6.6 of Weibull (1995). For treatments of the Poincar´ -Bendixson Theorem, see Hirsch and Smale (1974) and Robinson e (1995). Section 8.B. The definition of attractor we use is from Bena¨m (1999). Definition (ii) ı in Proposition 8.B.1 is from Conley (1978), and definitions (iii), (iv), and (v) are from Robinson (1995). Theorem 8.B.2 is due to Conley (1978). Theorem 8.B.3 is part of the folklore of dynamical systems theory; compare Proposition 8.1 of Smale (1967). The analysis presented here is from Hofbauer and Sandholm (2006). 350 Part IV Stochastic Evolutionary Models 351 CHAPTER NINE Stochastic Evolution and Deterministic Approximation 9.0 Introduction In Parts II and III of this book, we investigated the evolution of aggregate behavior under deterministic dynamics. We provided foundations for these dynamics in Chapter 3: there we showed that given any revision protocol ρ and population game F, we can ˙ derive a mean dynamic x = VF (x). This differential equation describes expected motion under the stochastic process that ρ and F implicitly define. 
We justified our focus on this deterministic equation through an informal appeal to a law of large numbers: since all of the randomness in our evolutionary model is idiosyncratic, it should be averaged away in the aggregate so long as the population size is sufficiently large. Our goal in this chapter is to make this argument rigorous. To do so, we explicitly derive a stochastic evolutionary process—a Markov process—from a given population game F, revision protocol ρ, and finite population size N.

Our main result in this chapter, Theorem 9.2.3, is a finite horizon deterministic approximation theorem. Building on our earlier intuition, the theorem shows that over any finite time span, the behavior of the stochastic evolutionary process is indeed nearly deterministic: if the population size is large enough, the stochastic process closely follows a solution trajectory of the mean dynamic with probability close to one.

The Markov process we introduce in this chapter provides a precise description of the stochastic evolution of aggregate behavior. Theorem 9.2.3 tells us that over time horizons of moderate length, we can do without studying this Markov process directly, as the deterministic approximation is adequate to address most questions of interest. But if we want to understand behavior in a society over very long time spans, then the deterministic approximation theorem no longer applies, and we must study the Markov process directly. This infinite horizon analysis is the subject of our final chapter.

This chapter and the next employ a variety of techniques from the theory of probability and stochastic processes. These techniques are reviewed in Appendices 9.A and 9.B and in the appendices to Chapter 10. Appendix 9.C describes an extension of the deterministic approximation theorem for discrete-time models.

9.1 The Markov Process

Let us review and slightly modify the model of evolution from Chapter 3. We take a population game $F : X \to \mathbb{R}^n$ with pure strategy sets $(S^1, \ldots, S^p)$ as given. It will be convenient to assume that the population masses $(m^1, \ldots, m^p)$ are integer-valued. Agents' choice procedures are described by a revision protocol ρ, where $\rho^p : \mathbb{R}^{n^p} \times X^p \to \mathbb{R}^{n^p \times n^p}_+$ is a map that takes current payoffs and population states as inputs and returns collections of conditional switch rates $\rho^p_{ij}(F^p(x), x^p)$ as outputs.

To set the stage for our limiting analysis, we suppose that population $p \in \mathcal{P}$ has $Nm^p$ members, where the integer parameter N is called the population size. The feasible social states therefore lie in the discrete grid $X^N = X \cap \frac{1}{N}\mathbb{Z}^n = \{x \in X : Nx \in \mathbb{Z}^n\}$.

The stochastic process $\{X_t^N\}$ generated by F, ρ, and N is described as follows. Each agent in the society is equipped with a rate R Poisson alarm clock. The ringing of a clock signals the arrival of a revision opportunity for the clock's owner: if the owner is currently playing strategy $i \in S^p$, he switches to strategy $j \neq i$ with probability $\rho^p_{ij}/R$. Finally, the model respects independence assumptions that ensure that "the future is independent of the past except through the present": different agents' clocks ring independently of one another, strategy choices are made independently of the timing of the clocks' rings, and as evolution proceeds, the clocks and the agents are only influenced by the history of the process by way of the current value of the social state.
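The description above translates directly into a simulation. The sketch below is a minimal single-population implementation (population mass m = 1); the arguments F, rho, and R are placeholders to be supplied by the reader, and the function name is our own. It is meant only to make the clock-and-switch mechanics concrete, not to reproduce any code from the text.

```python
# A bare-bones simulation of {X_t^N} for one population of mass 1 (a sketch).
import numpy as np

def simulate(F, rho, N, R, x0, T, seed=0):
    """F(x): payoff vector; rho(pi, x): matrix of conditional switch rates whose
    off-diagonal row sums must not exceed the clock rate R; x0: state in X^N."""
    rng = np.random.default_rng(seed)
    x, t, n = np.array(x0, dtype=float), 0.0, len(x0)
    path = [(0.0, x.copy())]
    while t < T:
        t += rng.exponential(1.0 / (N * R))     # next ring among the N rate-R clocks
        i = rng.choice(n, p=x / x.sum())        # the revising agent currently plays i
        switch = rho(F(x), x)[i] / R            # switch probabilities rho_ij / R
        switch[i] = 0.0
        switch[i] = 1.0 - switch.sum()          # stay put with the remaining probability
        j = rng.choice(n, p=switch)
        x[i] -= 1.0 / N                         # one agent moves from strategy i to j
        x[j] += 1.0 / N
        path.append((t, x.copy()))
    return path
```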
In Chapter 3, we argued informally that the stochastic process described above is well approximated by solutions to the mean dynamic

(M)  $\dot{x}_i^p = \sum_{j \in S^p} x_j^p \rho_{ji}^p(F^p(x), x^p) - x_i^p \sum_{j \in S^p} \rho_{ij}^p(F^p(x), x^p)$.

The rest of this chapter provides a formal defense of this approximation result.

We begin by giving a more formal account of the stochastic evolutionary process $\{X_t^N\}$. The independence assumptions above ensure that $\{X_t^N\}$ is a continuous-time Markov process on the finite state space $X^N$. To describe this process explicitly, it is enough to specify its jump rates $\{\lambda_x^N\}_{x \in X^N}$ and transition probabilities $\{p_{xy}^N\}_{x, y \in X^N}$ (see Appendix 9.B).

Suppose that the current social state is $x \in X^N$. Then there are $Nx_i^p$ agents playing strategy $i \in S^p$, $Nm^p$ agents in population $p \in \mathcal{P}$, and NM agents in total, where $M = \sum_{q \in \mathcal{P}} m^q$ is the total mass of the p populations. Since agents receive revision opportunities independently at exponential rate R, the basic properties of the exponential distribution (see Proposition 9.A.1) imply that revision opportunities arrive in the society as a whole at exponential rate NMR.

When an agent playing strategy $i \in S^p$ receives a revision opportunity, he switches to strategy $j \neq i$ with probability $\rho_{ij}^p/R$. Since this choice is independent of the arrivals of revision opportunities, the probability that the next revision opportunity goes to an agent playing strategy i who then switches to strategy j is

$\dfrac{Nx_i^p}{NM} \times \dfrac{\rho_{ij}^p}{R} = \dfrac{x_i^p \rho_{ij}^p}{MR}$.

This switch decreases the number of agents playing strategy i by one and increases the number playing j by one, shifting the state by $\frac{1}{N}(e_j^p - e_i^p)$. Summarizing this analysis yields the following observation, which specifies the parameters of the Markov process $\{X_t^N\}$.

Observation 9.1.1. A population game F, a revision protocol ρ, and a population size N define a Markov process $\{X_t^N\}$ on the state space $X^N$. This process is described by some initial state $X_0^N = x_0^N$, the jump rates $\lambda_x^N = NMR$, and the transition probabilities

$p^N_{x, x+z} = \begin{cases} \dfrac{x_i^p \rho_{ij}^p(F^p(x), x^p)}{MR} & \text{if } z = \tfrac{1}{N}(e_j^p - e_i^p),\ i, j \in S^p,\ i \neq j,\ p \in \mathcal{P}, \\[1ex] \displaystyle\sum_{p \in \mathcal{P}} \sum_{i \in S^p} \frac{x_i^p \bigl(R - \sum_{j \neq i} \rho_{ij}^p(F^p(x), x^p)\bigr)}{MR} & \text{if } z = 0, \\[1ex] 0 & \text{otherwise.} \end{cases}$

9.2 Finite Horizon Deterministic Approximation

In the previous section, we formally defined the Markov process $\{X_t^N\}$ generated by a population game F, a revision protocol ρ, and a population size N. Earlier, we introduced the mean dynamic (M), an ordinary differential equation that captures the expected motion of this process; solutions $\{x_t\}$ of (M) are continuous paths through the set of social states. Can we say more precisely how the stochastic and deterministic processes are linked? The main result in this chapter, Theorem 9.2.3, shows that when the population size N is sufficiently large, the Markov process $\{X_t^N\}$ is well approximated over finite time spans by the deterministic trajectory $\{x_t\}$.

9.2.1 Kurtz's Theorem

To begin, we state a general result on the convergence of a sequence $\{\{X_t^N\}\}_{N = N_0}^{\infty}$ of Markov processes with decreasing step sizes. We suppose that the process indexed by N takes values in the state space $X^N = \{x \in X : Nx \in \mathbb{Z}^n\}$, and we let $\lambda^N \in \mathbb{R}_+^{X^N}$ and $p^N \in \mathbb{R}_+^{X^N \times X^N}$ denote the jump rate vector and transition matrix of this process.
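The objects $\lambda^N$ and $p^N$ just introduced are exactly the ones computed in Observation 9.1.1. For concreteness, the sketch below (our own, with hypothetical helper names and a dictionary representation of the transition law) tabulates them for a single population of unit mass, so that M = 1.

```python
# Jump rate and transition probabilities of Observation 9.1.1 for one population
# with mass 1 (a sketch; F, rho, and R are to be supplied).
import numpy as np

def markov_parameters(F, rho, R, N, x):
    n = len(x)
    lam = N * R                                  # lambda^N_x = N*M*R with M = 1
    switch = rho(F(x), x) / R                    # rho_ij / R
    e = np.eye(n)
    transitions, stay = {}, 0.0
    for i in range(n):
        stay += x[i] * (1.0 - (switch[i].sum() - switch[i, i]))
        for j in range(n):
            if j != i:
                z = tuple((e[j] - e[i]) / N)     # increment (e_j - e_i)/N
                transitions[z] = transitions.get(z, 0.0) + x[i] * switch[i, j]
    transitions[(0.0,) * n] = stay               # z = 0: the revising agent stays put
    return lam, transitions
```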
To simplify the definitions to follow, we let $\zeta_x^N$ be a random variable (defined on an arbitrary probability space) whose distribution describes the stochastic increment of $\{X_t^N\}$ from state x:

(9.1)  $P(\zeta_x^N = z) = P\bigl(X_{\tau_{k+1}}^N = x + z \mid X_{\tau_k}^N = x\bigr)$,

where $\tau_k$ is the time of the process's kth jump. We then define the functions $V^N : X^N \to TX$, $A^N : X^N \to \mathbb{R}$, and $A_\delta^N : X^N \to \mathbb{R}$ by

$V^N(x) = \lambda_x^N\, E\zeta_x^N$,  $A^N(x) = \lambda_x^N\, E|\zeta_x^N|$,  $A_\delta^N(x) = \lambda_x^N\, E\bigl(|\zeta_x^N|\, \mathbf{1}_{\{|\zeta_x^N| > \delta\}}\bigr)$.

$V^N(x)$, the product of the jump rate at state x and the expected increment per jump at x, represents the expected increment per time unit from x under $\{X_t^N\}$. $V^N$ is thus an alternate definition of the mean dynamic of $\{X_t^N\}$. In a similar vein, $A^N(x)$ is the expected absolute displacement per time unit, while $A_\delta^N(x)$ is the expected absolute displacement per time unit due to jumps traveling further than δ.

With these definitions in hand, we can state the basic approximation result.

Theorem 9.2.1 (Kurtz's Theorem). Let $V : X \to TX$ be a Lipschitz continuous vector field. Suppose that for some sequence $\{\delta^N\}_{N=N_0}^{\infty}$ converging to 0, we have

(9.2)  $\lim_{N \to \infty} \sup_{x \in X^N} |V^N(x) - V(x)| = 0$,
(9.3)  $\sup_N \sup_{x \in X^N} A^N(x) < \infty$, and
(9.4)  $\lim_{N \to \infty} \sup_{x \in X^N} A_{\delta^N}^N(x) = 0$,

and that the initial conditions $X_0^N = x_0^N$ converge to $x_0 \in X$. Let $\{x_t\}_{t \geq 0}$ be the solution to the mean dynamic

(M)  $\dot{x} = V(x)$

starting from $x_0$. Then for each $T < \infty$ and $\varepsilon > 0$, we have that

$\lim_{N \to \infty} P\Bigl(\sup_{t \in [0, T]} |X_t^N - x_t| < \varepsilon\Bigr) = 1$.

[Figure 9.2.1: Kurtz's Theorem.]

Fix a finite time horizon $T < \infty$ and an error bound $\varepsilon > 0$. Kurtz's Theorem tells us that when the index N is large, nearly all sample paths of the Markov process $\{X_t^N\}$ stay within ε of a solution of the mean dynamic (M) through time T. By making N large enough, we can ensure that with probability close to one, $X_t^N$ and $x_t$ differ by no more than ε for all t between 0 and T (Figure 9.2.1).

What conditions do we need to reach this conclusion? Condition (9.2) demands that as N grows large, the expected displacements per time unit $V^N$ converge uniformly to a Lipschitz continuous vector field V. Lipschitz continuity of V ensures the existence and uniqueness of solutions of the mean dynamic $\dot{x} = V(x)$. Condition (9.3) requires that the expected absolute displacement per time unit is bounded. Finally, condition (9.4) demands that jumps larger than $\delta^N$ make vanishing contributions to the motion of the processes, where $\{\delta^N\}_{N=N_0}^{\infty}$ is a sequence of constants that approaches zero.

The intuition behind Kurtz's Theorem can be explained as follows. At each revision opportunity, the increment in the process $\{X_t^N\}$ is stochastic. However, the expected number of revision opportunities that arrive during the brief time interval $I = [t, t + dt]$ is of order $\lambda_x^N\, dt$. Whenever it does not vanish, this quantity grows without bound as the population size N becomes large. Conditions (9.3) and (9.4) ensure that when N is large, each increment in the state is likely to be small. This ensures that the total change in the state during time interval I is small, so that jump rates and transition probabilities vary little during this interval. Since during I there are a very large number of revision opportunities, each generating nearly the same expected increment, intuition from the law of large numbers suggests that the change in $\{X_t^N\}$ during the interval should be almost completely determined by the expected motion of $\{X_t^N\}$.
This expected motion is captured by the limiting mean dynamic V, whose solutions approximate the stochastic process {X_t^N} over finite time spans with probability close to one.

Exercise 9.2.2. Suppose that {X_t^N} is a Markov process on X^N = {0, 1/N, 2/N, . . . , 1} with λ_x^N ≡ N. To ensure that 1/2 is always a state, restrict attention to even N. Give examples of sequences of transition probabilities from state 1/2 that
(i) satisfy condition (9.2) of Kurtz's Theorem, but not conditions (9.3) or (9.4);
(ii) satisfy conditions (9.2) and (9.4), but not condition (9.3);
(iii) satisfy conditions (9.2) and (9.3), but not condition (9.4).
(Hint: it is enough to consider transition probabilities under which V^N(1/2) = 0.) Guided by your answers to parts (ii) and (iii), explain intuitively what conditions (9.3) and (9.4) require.

9.2.2 Deterministic Approximation of the Stochastic Evolutionary Process

Returning to our model of evolution, we now use Kurtz's Theorem to show that the Markov processes {{X_t^N}}_{N=N_0}^∞ defined in Section 9.1 can be approximated by solutions to the mean dynamic (M) derived in Section 3.1.2.

To begin, we compute the expected increment per time unit V^N(x) of the process {X_t^N}. Defining the random variable ζ_x^N as in equation (9.1) above, we find that

  V^N(x) = λ_x^N E ζ_x^N
         = NMR Σ_{p∈P} Σ_{i∈S^p} Σ_{j≠i} (1/N)(e_j^p − e_i^p) P(ζ_x^N = (1/N)(e_j^p − e_i^p))
         = NMR Σ_{p∈P} Σ_{i∈S^p} Σ_{j≠i} (1/N)(e_j^p − e_i^p) x_i^p ρ_{ij}^p / MR
         = Σ_{p∈P} Σ_{i∈S^p} Σ_{j∈S^p} (e_j^p − e_i^p) x_i^p ρ_{ij}^p
         = Σ_{p∈P} ( Σ_{j∈S^p} e_j^p Σ_{i∈S^p} x_i^p ρ_{ij}^p − Σ_{i∈S^p} e_i^p x_i^p Σ_{j∈S^p} ρ_{ij}^p )
         = Σ_{p∈P} Σ_{i∈S^p} e_i^p ( Σ_{j∈S^p} x_j^p ρ_{ji}^p − x_i^p Σ_{j∈S^p} ρ_{ij}^p ).

Thus, the vector field V^N = V is independent of N, and is expressed more concisely as

(M)   V_i^p(x) = Σ_{j∈S^p} x_j^p ρ_{ji}^p − x_i^p Σ_{j∈S^p} ρ_{ij}^p,

as we established using a different calculation in Section 3.1.2.

Conditions (9.3) and (9.4) of Kurtz's Theorem require that the motions of the processes {X_t^N} not be too abrupt. To verify these conditions, observe that since |e_j^p − e_i^p| = √2 for any distinct i, j ∈ S^p, the increments of {X_t^N} are always either of length √2/N or of length zero. If we choose δ_N = √2/N, this observation immediately implies condition (9.4):

  A_{√2/N}^N(x) = λ_x^N E(|ζ_x^N| 1{|ζ_x^N| > √2/N}) = 0.

The observation also helps us verify condition (9.3):

  A^N(x) = λ_x^N E |ζ_x^N| ≤ NMR × √2/N = √2 RM.

With these calculations in hand, we can present the deterministic approximation theorem.

Theorem 9.2.3 (Deterministic Approximation of {X_t^N}). Let {{X_t^N}}_{N=N_0}^∞ be the sequence of stochastic evolutionary processes defined in Observation 9.1.1. Suppose that V = V^N is Lipschitz continuous. Let the initial conditions X_0^N = x_0^N converge to state x_0 ∈ X, and let {x_t}_{t≥0} be the solution to the mean dynamic (M) starting from x_0. Then for all T < ∞ and ε > 0,

  lim_{N→∞} P( sup_{t∈[0,T]} |X_t^N − x_t| < ε ) = 1.

Choose a finite time span T and two small constants δ and ε. Then for all large enough population sizes N, the probability that the process {X_t^N} stays within ε of the deterministic trajectory {x_t} through time T is at least 1 − δ.

A key requirement of Theorem 9.2.3 is that V must be Lipschitz continuous, ensuring that the mean dynamic (M) admits a unique solution from every initial condition in X. This requirement is satisfied by members of the families of dynamics (imitative, excess payoff, pairwise comparison) studied in Chapter 4, as well as the perturbed best response dynamics from Chapter 5.
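Since formula (M) involves only the switch rates ρ and the current state, the vector field V is easy to evaluate numerically. The single-population sketch below implements (M) directly; the rock-paper-scissors payoff matrix and the pairwise proportional imitation protocol ρ_ij = x_j [F_j(x) − F_i(x)]_+ used to test it are illustrative choices (under that protocol, (M) reduces to the replicator dynamic).

import numpy as np

def mean_dynamic(x, F, rho):
    # V_i(x) = sum_j x_j rho_ji - x_i sum_j rho_ij, as in (M).
    r = rho(F(x), x)                 # r[i, j] = switch rate from strategy i to j
    return x @ r - x * r.sum(axis=1)

# Illustrative ingredients: random matching in rock-paper-scissors, and
# pairwise proportional imitation, rho_ij = x_j * max(F_j - F_i, 0).
A = np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]])
F = lambda x: A @ x
rho = lambda pi, x: x[None, :] * np.maximum(pi[None, :] - pi[:, None], 0.0)

x = np.array([0.5, 0.3, 0.2])
print(mean_dynamic(x, F, rho))       # equals x_i (F_i(x) - x'F(x)), the replicator dynamic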
The best response and projection dynamics, being discontinuous, are not covered by Theorem 9.2.3, but it seems likely that deterministic approximation results that apply to these dynamics can be proved (see the Notes).

It is well worth emphasizing that Theorem 9.2.3 is a finite horizon approximation result, and that it cannot be extended to an infinite horizon result. To see why not, consider the logit choice protocol (Example 3.2.5). Under this protocol, switches between all pairs of strategies occur with positive probability regardless of the current state. It follows that the induced Markov process {X_t^N} is irreducible: there is a positive probability path between each ordered pair of states in X^N. As we shall see in Chapter 10, irreducibility implies that every state in X^N is visited infinitely often with probability one. This fact clearly precludes an infinite horizon analogue of Theorem 9.2.3. Indeed, infinite horizon analysis of {X_t^N} requires a different set of tools, which we present in Chapter 10.

Example 9.2.4. Toss and switch. Suppose that agents play a game with strategy set S = {L, R} using the constant revision protocol ρ, where ρ_LL = ρ_LR = ρ_RL = ρ_RR = 1/2. Under the simplest interpretation of this protocol, each agent receives revision opportunities at rate 1; upon receiving an opportunity, an agent flips a fair coin, switching strategies if the coin comes up Heads.

For each population size N, the protocol generates a Markov process {X_t^N} with common jump rate λ_x^N ≡ N and transition probabilities

  p_{x,x+z}^N =
    (1/2) x_R      if z = (1/N)(e_L − e_R),
    (1/2) x_L      if z = (1/N)(e_R − e_L),
    1/2            if z = 0.

We can simplify the notation by replacing the vector state variable x = (x_L, x_R) ∈ X with the scalar state variable r = x_R ∈ [0, 1]. The resulting Markov process {R_t^N} has common jump rate λ_r^N ≡ N and transition probabilities

  p_{r,r+z}^N =
    (1/2) r          if z = −1/N,
    (1/2)(1 − r)     if z = 1/N,
    1/2              if z = 0.

Its mean dynamic is thus

  V^N(r) = λ_r^N E ζ_r^N = N ( (−1/N) · (1/2) r + (1/N) · (1/2)(1 − r) ) = 1/2 − r,

regardless of the population size N. To solve this differential equation, we move the rest point r = 1/2 to the origin using the change of variable s = r − 1/2. The equation

  ṡ = ṙ = 1/2 − r = 1/2 − (s + 1/2) = −s

has the general solution s_t = s_0 e^{−t}, implying that r_t = 1/2 + (r_0 − 1/2) e^{−t}.

Fix a time horizon T < ∞. Theorem 9.2.3 tells us that when N is sufficiently large, the process {R_t^N} is very likely to stay very close to an almost deterministic trajectory; this trajectory converges to state r = 1/2, with convergence occurring at exponential rate 1. If we instead fix the population size N and look at behavior over the infinite time horizon (T = ∞), the process {R_t^N} eventually splits off from the deterministic trajectory, visiting all states in {0, 1/N, . . . , 1} infinitely often. §
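To see Theorem 9.2.3 at work in this example, one can simulate {R_t^N} directly and compare it with the solution r_t = 1/2 + (r_0 − 1/2)e^{−t}. The sketch below does this for one arbitrarily chosen population size and initial condition; for large N the maximal deviation over [0, T] is typically small, in line with the theorem.

import numpy as np

def simulate_toss_and_switch(N, r0, T, seed=0):
    # {R_t^N}: jumps arrive at rate N; a jump moves the state by -1/N with
    # probability r/2, by +1/N with probability (1 - r)/2, and by 0 otherwise.
    rng = np.random.default_rng(seed)
    t, r = 0.0, r0
    times, states = [t], [r]
    while t < T:
        t += rng.exponential(1.0 / N)    # next revision opportunity
        u = rng.random()
        if u < 0.5 * r:
            r -= 1.0 / N                 # an R player switches to L
        elif u < 0.5:
            r += 1.0 / N                 # an L player switches to R
        # otherwise the coin comes up Tails and the state does not move
        times.append(t); states.append(r)
    return np.array(times), np.array(states)

N, r0, T = 10000, 0.9, 5.0
times, states = simulate_toss_and_switch(N, r0, T)
exact = 0.5 + (r0 - 0.5) * np.exp(-times)        # r_t = 1/2 + (r_0 - 1/2) e^{-t}
print("max deviation from the mean dynamic:", np.abs(states - exact).max())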
Exercise 9.2.5. Consider a single population playing population game F using revision protocol ρ.
(i) Show that the resulting mean dynamic can be expressed as ẋ = (x′R(x))′, where R(x) ∈ R^{n×n} is given by

  R_ij(x) = ρ_ij(F(x), x)              if i ≠ j,
  R_ii(x) = − Σ_{k≠i} ρ_ik(F(x), x)    if i = j.

Note that when ρ is independent of F(x) and x as in the previous example, the matrix R is independent of x as well. In this case we obtain the linear dynamic ẋ = (x′R)′, whose solutions can be expressed in closed form (see Appendix 7.B).
(ii) Suppose that ρ_ij = 1 for all i and j. Describe the parameters of the resulting Markov process {X_t^N}, and write down the corresponding mean dynamic. Show that solutions to the mean dynamic take the form x_t = x* + (x_0 − x*) e^{−nt}, where x* = (1/n) 1.

9.3 Extensions

9.3.1 Finite Population Effects

In our model of individual choice, the revision protocol ρ was defined independently of the population size. In some cases, it is more appropriate to allow the revision protocol to depend on N in some vanishing way—for example, to account for the effects of sampling from a finite population, or for the fact that an agent whose choices are based on imitation will not imitate himself. If we include these effects, then ρ varies with N, so the normalized expected increments V^N vary with N as well. Fortunately, Kurtz's Theorem allows for these sorts of effects so long as they are vanishing in size: examining condition (9.2), we see that as long as the functions V^N converge uniformly to a limiting mean dynamic V, the finite horizon approximation continues to hold.

9.3.2 Discrete Time Models

It is also possible to prove deterministic approximation results for discrete time models of stochastic evolution. To do so, we assume that the number of discrete periods that pass per unit of clock time grows with the population size N. In this situation, one can employ a discrete time version of Kurtz's Theorem (Theorem 9.C.1 in Appendix 9.C), the requirements of which are direct analogues of those from Theorem 9.2.1 above.

So, let us suppose that when the population size is N, each discrete time period is of duration d_N = 1/(NMR), so that periods begin at times in the set T_N = {0, d_N, 2d_N, . . .}. We consider two specifications of the discrete time evolutionary process {X_t^N}_{t∈T_N}.

Exercise 9.3.1. Discrete time model I: One revision opportunity per period. Suppose that during each period, exactly one agent is selected at random and granted a revision opportunity, with each agent being equally likely to be chosen. The chosen agent's choices are then governed by the conditional switch probabilities ρ_ij^p / R. Using Theorem 9.C.1, show that Theorem 9.2.3 extends to this discrete time model.

Discrete time models can allow a possibility that our continuous time model cannot: they permit many agents to switch strategies simultaneously. The next exercise shows that deterministic approximation is still possible even when simultaneous revisions by many agents are possible, so long as they are sufficiently unlikely.

Exercise 9.3.2. Discrete time model II: Random numbers of revision opportunities in each period. Suppose that during each period, each agent tosses a coin that comes up heads with probability 1/(NM). Every agent who tosses a head receives a revision opportunity; choices for such agents are again governed by the conditional switch probabilities ρ_ij^p / R. Use the Poisson Limit Theorem (Propositions 9.A.4(ii) and 9.A.5) and Theorem 9.C.1 to show that Theorem 9.2.3 extends to this model. (Hint: In any given period, the number of agents whose tosses come up heads is binomially distributed with parameters NM and 1/(NM).)

Appendix

9.A The Exponential and Poisson Distributions

9.A.1 Basic Properties

The random variable T with support [0, ∞) has an exponential distribution with rate λ, denoted T ∼ exponential(λ), if its decumulative distribution is P(T ≥ t) = e^{−λt}, so that its density function is f(t) = λe^{−λt}. A Taylor approximation shows that for small dt > 0,

(9.5)   P(T ≤ dt) = 1 − e^{−λ dt} = 0 + λe^{−λ·0} dt + O((dt)²) ≈ λ dt.
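The quality of the approximation in (9.5) is easy to check numerically. The sketch below compares the empirical probability P(T ≤ dt) from simulated exponential(λ) draws with the exact value 1 − e^{−λ dt} and the linear approximation λ dt; the rate λ = 2 is an arbitrary choice, and the gap shrinks at the rate (dt)² promised by the Taylor expansion.

import numpy as np

lam = 2.0
rng = np.random.default_rng(4)
draws = rng.exponential(1.0 / lam, size=1_000_000)     # samples of T ~ exponential(lam)
for dt in (0.1, 0.01, 0.001):
    empirical = (draws <= dt).mean()                   # estimate of P(T <= dt)
    print(dt, empirical, 1 - np.exp(-lam * dt), lam * dt)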
Exponential random variables are often used to model the random amount of time that passes before a certain occurrence: the arrival of a customer at a queue, the decay of a particle, and so on. We often describe the behavior of exponential random variables using the rhetorical device of a "stochastic alarm clock" that rings after an exponentially distributed amount of time has passed.

Some basic properties of the exponential distribution are listed next.

Proposition 9.A.1. Let T_1, . . . , T_n be independent with T_i ∼ exponential(λ_i). Then
(i) E T_i = λ_i^{−1};
(ii) P(T_i ≥ u + t | T_i ≥ u) = P(T_i ≥ t) = e^{−λ_i t};
(iii) If M_n = min{T_1, . . . , T_n} and I_n = argmin_j T_j, then M_n ∼ exponential(Σ_{i=1}^n λ_i), P(I_n = i) = λ_i / Σ_{j=1}^n λ_j, and M_n and I_n are independent.

Property (ii), memorylessness, says that if the time before one's alarm clock rings is exponentially distributed, then one's beliefs about how long from now the clock will ring do not depend on how long one has already been waiting. Together, this property and equation (9.5) above tell us that until the time when the clock rings, the conditional probability that it rings during the next dt time units is proportional to dt:

  P(T_i ≤ t + dt | T_i ≥ t) = P(T_i ≤ dt) ≈ λ_i dt.

The exponential distributions are the only continuous distributions with these properties.

Property (iii) says that given a collection of independent exponential alarm clocks, the time until the first clock rings is itself exponentially distributed, the probability that a particular clock rings first is proportional to its rate, and the time until the first ring and the ringing clock's identity are independent random variables. These facts are essential to the workings of our stochastic evolutionary model.

Proof. Parts (i) and (ii) are easily verified. To establish part (iii), set λ = Σ_{i=1}^n λ_i, and compute the distribution of M_n as follows:

  P(M_n ≥ t) = P(∩_{i=1}^n {T_i ≥ t}) = Π_{i=1}^n P(T_i ≥ t) = Π_{i=1}^n e^{−λ_i t} = e^{−λt}.

To prove the remaining claims from part (iii), observe that

(9.6)   P(∩_{j≠i} {T_i ≤ T_j} ∩ {T_i ≥ t}) = ∫_t^∞ Π_{j≠i} ( ∫_{t_i}^∞ λ_j e^{−λ_j t_j} dt_j ) λ_i e^{−λ_i t_i} dt_i
                                           = ∫_t^∞ Π_{j≠i} e^{−λ_j t_i} λ_i e^{−λ_i t_i} dt_i
                                           = ∫_t^∞ λ_i e^{−λ t_i} dt_i
                                           = (λ_i / λ) e^{−λt}.

Setting t = 0 in equation (9.6) yields P(I_n = i) = λ_i/λ, and an arbitrary choice of t shows that P(M_n ≥ t, I_n = i) = P(M_n ≥ t) P(I_n = i).

A random variable R has a Poisson distribution with rate λ, denoted R ∼ Poisson(λ), if P(R = r) = e^{−λ} λ^r / r! for all r ∈ {0, 1, 2, . . .}. Poisson random variables are used to model the number of occurrences of rare events (see Propositions 9.A.3 and 9.A.4). Two of their basic properties are listed below.

Proposition 9.A.2. If R_1, . . . , R_n are independent with R_i ∼ Poisson(λ_i), then
(i) E(R_i) = λ_i;
(ii) Σ_{j=1}^n R_j ∼ Poisson(Σ_{j=1}^n λ_j).

Proof. (i)

  E(R_i) = Σ_{r=1}^∞ r e^{−λ_i} (λ_i)^r / r! = λ_i e^{−λ_i} Σ_{r=1}^∞ (λ_i)^{r−1} / (r−1)! = λ_i Σ_{s=0}^∞ e^{−λ_i} (λ_i)^s / s! = λ_i.

(ii) When n = 2, we can compute that

  P(R_1 + R_2 = r) = Σ_{r_1=0}^r P(R_1 = r_1) P(R_2 = r − r_1) = Σ_{r_1=0}^r e^{−λ_1} (λ_1)^{r_1}/r_1! · e^{−λ_2} (λ_2)^{r−r_1}/(r − r_1)!
                   = e^{−(λ_1+λ_2)} Σ_{r_1=0}^r (λ_1)^{r_1} (λ_2)^{r−r_1} / (r_1! (r − r_1)!) = e^{−(λ_1+λ_2)} (λ_1 + λ_2)^r / r!,

where the final equality follows from the binomial expansion

  (λ_1 + λ_2)^r = Σ_{r_1=0}^r ( r! / (r_1! (r − r_1)!) ) (λ_1)^{r_1} (λ_2)^{r−r_1}.

Iterating yields the general result.
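Property (iii) is also easy to check by simulation. The sketch below draws many triples of independent exponential clocks with arbitrarily chosen rates 1, 2, and 3, and compares the sample mean of the minimum and the empirical "rings first" frequencies with the values given by Proposition 9.A.1(iii).

import numpy as np

# Monte Carlo check of Proposition 9.A.1(iii) for three clocks with rates 1, 2, 3:
# the minimum should be exponential(6), and clock i should ring first with
# probability lambda_i / 6.
rng = np.random.default_rng(0)
rates = np.array([1.0, 2.0, 3.0])
T = rng.exponential(1.0 / rates, size=(100000, 3))   # T[k, i] ~ exponential(rates[i])
M = T.min(axis=1)                                    # time of the first ring
I = T.argmin(axis=1)                                 # identity of the ringing clock

print("E[min] =", M.mean(), "vs 1/sum(rates) =", 1.0 / rates.sum())
print("P(I = i) =", np.bincount(I) / len(I), "vs", rates / rates.sum())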
The exponential and Poisson distributions are fundamentally linked. Let {T_i}_{i=1}^∞ be a sequence of i.i.d. exponential(λ) random variables. We can interpret T_1 as the first time that an exponential alarm clock rings, T_2 as the interval between the first and second rings, and T_k as the interval between the (k−1)st and kth rings. In this interpretation, the sum S_n = Σ_{k=1}^n T_k represents the time of the nth ring, while R_t = max{n : S_n ≤ t} represents the number of rings through time t. Figure 9.A.1 presents a single realization of the ring time sequence {S_n}_{n=1}^∞ and the number-of-rings process {R_t}_{t≥0}.

[Figure 9.A.1: Ring times S_n and numbers of rings R_t of an exponential alarm clock.]

Proposition 9.A.3 derives the distribution of R_t, establishing a key connection between the exponential and Poisson distributions.

Proposition 9.A.3. R_t ∼ Poisson(λt).

Proof. To begin, we prove that S_n has density

(9.7)   f_n(t) = λe^{−λt} (λt)^{n−1} / (n − 1)!.

This formula is obviously correct when n = 1. Suppose it is true for some arbitrary n. Then using the convolution formula f_{X+Y}(z) = ∫_{−∞}^∞ f_Y(z − x) f_X(x) dx, we find that

  f_{n+1}(t) = ∫_0^t f_n(t − s) f_1(s) ds = ∫_0^t λ (λ(t − s))^{n−1}/(n − 1)! e^{−λ(t−s)} × λe^{−λs} ds = λe^{−λt} (λt)^n / n!.

Next, we show that this equation implies that S_n has cumulative distribution

  P(S_n ≤ t) = Σ_{m=n}^∞ e^{−λt} (λt)^m / m!.

Since Σ_{m=0}^∞ (λt)^m / m! = e^{λt}, this statement is equivalent to

  P(S_n ≤ t) = 1 − Σ_{m=0}^{n−1} e^{−λt} (λt)^m / m!.

Differentiating shows that this expression is in turn equivalent to the density of S_n taking form (9.7), as established above.

To complete the proof, we express the event that at least n rings have occurred by time t in two equivalent ways: {R_t ≥ n} = {S_n ≤ t}. This observation and the expression for P(S_n ≤ t) above imply that

  P(R_t = n) = P(R_t ≥ n) − P(R_t ≥ n + 1) = P(S_n ≤ t) − P(S_{n+1} ≤ t) = e^{−λt} (λt)^n / n!.
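A short simulation makes Proposition 9.A.3 concrete: summing i.i.d. exponential(λ) interarrival times until they exceed t and counting the rings should produce a Poisson(λt) sample. In the sketch below, the sample mean and sample variance of the simulated counts should both be close to λt; the parameter values are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
lam, t, trials = 2.0, 5.0, 20000
counts = np.empty(trials, dtype=int)
for k in range(trials):
    s, n = 0.0, 0
    while True:
        s += rng.exponential(1.0 / lam)   # next interarrival time T_i
        if s > t:
            break
        n += 1                            # the nth ring occurred by time t
    counts[k] = n                         # R_t = max{n : S_n <= t}

print("mean:", counts.mean(), " variance:", counts.var(), " lam*t:", lam * t)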
9.A.2 The Poisson Limit Theorem

Proposition 9.A.3 shows that the Poisson distribution describes the number of rings of an exponential alarm clock during a fixed time span. We now establish a discrete analogue of this result.

The random variable X^p has a Bernoulli distribution with parameter p ∈ [0, 1], denoted X^p ∼ Bernoulli(p), if P(X^p = 1) = p and P(X^p = 0) = 1 − p. Let {X_i^p}_{i=1}^n be a sequence of i.i.d. Bernoulli(p) random variables (e.g. coin tosses), and let S_n^p = Σ_{i=1}^n X_i^p denote their sum (the number of heads in n tosses). Then S_n^p has a binomial distribution with parameters n and p (S_n^p ∼ binomial(n, p)):

  P(S_n^p = s) = (n choose s) p^s (1 − p)^{n−s}   for all s ∈ {0, 1, . . . , n}.

Finally, the random variable Z has a standard normal distribution (Z ∼ N(0, 1)) if its density function is f(z) = (1/√(2π)) exp(−z²/2).

Proposition 9.A.4 considers the behavior of the binomial random variables S_n^p when the number of tosses n becomes large. Recall that the sequence of random variables {Y_n}_{n=1}^∞ with distribution functions {F_n}_{n=1}^∞ converges in distribution (or converges weakly) to the random variable Y with distribution function F (denoted Y_n ⇒ Y, or F_n ⇒ F) if lim_{n→∞} F_n(x) = F(x) at all points x ∈ R at which F is continuous.

Proposition 9.A.4. Let S_n^p ∼ binomial(n, p). Then as n → ∞,
(i) (S_n^p − np)/√(np(1 − p)) ⇒ Z, where Z ∼ N(0, 1).
(ii) S_n^{λ/n} ⇒ R^λ, where R^λ ∼ Poisson(λ).

If we increase the number of tosses n of a coin whose bias p is fixed, the Central Limit Theorem tells us that the distribution of the number of heads S_n^p approaches a normal distribution. (In statement (i), we subtract the mean E S_n^p = np off of S_n^p and then divide by the standard deviation SD(S_n^p) = √(np(1 − p)) to obtain convergence to a fixed distribution.)

Suppose instead that as we increase the number of tosses n, we decrease the probability of heads p in such a way that the expected number of heads np = λ remains fixed. Then statement (ii), the Poisson Limit Theorem, tells us that the distribution of S_n^p approaches a Poisson distribution. The basic calculation needed to prove this is as follows:

  P(S_n^{λ/n} = s) = ( n! / (s!(n − s)!) ) (λ/n)^s (1 − λ/n)^{n−s}
                   = (λ^s / s!) (1 − λ/n)^n × (1 − λ/n)^{−s} n! / ((n − s)! n^s)
                   = P(R^λ = s) × ( (1 − λ/n)^n / e^{−λ} ) × Π_{r=0}^{s−1} (n − r)/(n − λ)  →  P(R^λ = s).

The second term of the penultimate expression above is independent of s and is less than 1 (because (1 − λ/n)^n increases to e^{−λ}), while the final term attains its maximum over s when s = λ + 1 and decreases to 1 as n grows large. Together, these observations yield the following upper bound, which is needed in Exercise 9.3.2.

Proposition 9.A.5. P(S_n^{λ/n} = s) ≤ C_λ P(R^λ = s) for some C_λ ∈ R independent of n and s.

9.B Countable State Markov Processes

9.B.1 Countable Probability Models

We begin our review of probability theory by discussing probability models with a countable sample space. A countable probability model is a pair (Ω, P), where the sample space Ω is a finite or countable set, 2^Ω is the set of all subsets of Ω, and P : 2^Ω → [0, 1] is a probability measure: that is, a function satisfying P(∅) = 0, P(Ω) = 1, and countable additivity: if {A_k} is a finite or countable collection of disjoint events (i.e., subsets of Ω), then P(∪_k A_k) = Σ_k P(A_k).

A random variable X is a function whose domain is Ω. The distribution of X is defined by P(X ∈ A) = P(ω ∈ Ω : X(ω) ∈ A) for all subsets A of the range of X. To define a finite collection of discrete random variables {X_k}_{k=1}^n, we specify a probability model (Ω, P) and then define the random variables as functions on Ω. To interpret this construction, imagine picking an ω at random from the sample space Ω according to the probability distribution P. The value of ω so selected determines the realizations X_1(ω), X_2(ω), . . . , X_n(ω) of the entire sequence of random variables X_1, X_2, . . . , X_n.

Example 9.B.1. Repeated rolls of a fair die. Suppose we would like to construct a sequence of random variables {X_k}_{k=1}^n, where X_k is to represent the kth roll of a fair die. To accomplish this, we let R = {1, 2, 3, 4, 5, 6} be the set of possible results of an individual roll, and let the sample space be the set of n-vectors Ω = R^n, with typical element ω = (ω_1, . . . , ω_n). To define the probability measure P, it is enough to let P({ω}) = (1/6)^n for all ω ∈ Ω; additivity then determines the probabilities of all other events in 2^Ω. The random variables X_k can then be defined as coordinate functions: X_k(ω) = ω_k for all ω ∈ Ω and k ∈ {1, . . . , n}. Observe once again that by randomly selecting an ω ∈ Ω, we determine the realizations of all n random variables.

Since P(X_k = x_k) = P(ω ∈ Ω : X_k(ω) = x_k) = P(ω ∈ Ω : ω_k = x_k) = 1/6 for all x_k ∈ R, the random variables X_k have the correct marginal distributions. Moreover, if A_k ⊆ R for k ∈ {1, . . . , n}, it is easy to confirm that

  P(∩_{k=1}^n {X_k ∈ A_k}) = Π_{k=1}^n P(X_k ∈ A_k),

so the X_k are independent, as desired. §

The expected value of a random variable is its integral with respect to the probability measure P. In the case of the kth die roll,

  E X_k = ∫_Ω X_k(ω) dP(ω) = Σ_{ω∈Ω} ω_k P({ω}) = Σ_{ω_k∈R} ω_k Σ_{ω_{−k}} P(ω_k, ω_{−k}) = Σ_{i=1}^6 i × 1/6 = 3½.

We can create new random variables out of old ones using functional operations.
For instance, the total of the results of the n die rolls is a new random variable S_n defined by S_n = Σ_{k=1}^n X_k, or, more explicitly, by S_n(ω) = Σ_{k=1}^n X_k(ω) for all ω ∈ Ω.

9.B.2 Uncountable Probability Models and Measure Theory

While the constructions above are sufficient for finite collections of discrete random variables, they do not suffice when individual random variables take an uncountable number of values, or when we are interested in infinite numbers of random variables. To handle these situations, we need the sample space Ω to be uncountable: that is, not expressible as a sequence of elements.

Unfortunately, uncountable sample spaces introduce a serious new technical difficulty. As an illustration, suppose we want to construct a random variable representing a uniform draw from the unit interval. It is natural to choose Ω = [0, 1] as our sample space and to define our random variable as the identity function on Ω: that is, X(ω) = ω. But then we encounter a major difficulty: it is impossible to define a countably additive probability measure P that specifies the probability of every subset of Ω. To resolve this problem, one chooses a set of subsets F ⊆ 2^Ω whose probabilities will be specified, and then introduces corresponding restrictions on the definition of a random variable. A random variable satisfying these restrictions is said to be measurable, and this general approach to studying functions defined on uncountable domains is known as measure theory.

To summarize some of the foregoing discussion: an uncountable probability model consists of a triple (Ω, F, P), where Ω is a sample space, F ⊆ 2^Ω is a collection (more specifically, a σ-algebra) of subsets of Ω, and P : F → [0, 1] is a countably additive probability measure.

Suppose we would like to study a collection of random variables described by some prespecified joint distributions. How do we know whether it is possible to construct these random variables on some well-chosen probability space? Happily, as long as the marginal and joint distributions satisfy certain obviously necessary consistency conditions, existence of the probability space and the random variables is ensured by the Carathéodory Extension Theorem and the Kolmogorov Extension Theorem.

9.B.3 Distributional Properties and Sample Path Properties

The reader may wonder why we bother with the explicit construction of random variables. After all, once we specify the joint distributions of the basic random variables of interest, we also determine the joint distributions of any random variables that can be derived from our original collection. Why not work entirely in terms of these distributions and avoid the explicit construction of the random variables altogether?

If we are only interested in distributional properties of our random variables, explicit construction of the random variables is not essential. However, many key results in probability theory concern not the distributional properties of random variables, but rather their sample path properties. These are properties of realization sequences: i.e., the sequences of values X_1(ω), X_2(ω), X_3(ω), . . . that arise for each choice of ω ∈ Ω. The differences between the two sorts of properties can be illustrated through a simple example.

Example 9.B.2. Consider the probability model (Ω, P) with sample space Ω = {−1, 1} and probability measure P({−1}) = P({1}) = 1/2.
Define the sequences of random variables {X_i}_{i=1}^∞ and {X̂_i}_{i=1}^∞ as follows:

  X_i(ω) = ω;        X̂_i(ω) = −ω if i is odd,  X̂_i(ω) = ω if i is even.

If we look only at marginal distributions, {X_i}_{i=1}^∞ and {X̂_i}_{i=1}^∞ are identical, as both sequences consist of random variables equally likely to have realizations −1 and 1. But from the sample path point of view, the two sequences are different: for either choice of ω, the sequence {X_i(ω)}_{i=1}^∞ is constant, while the sequence {X̂_i(ω)}_{i=1}^∞ alternates between 1 and −1 forever.

We illustrate these ideas in Figures 9.B.1 and 9.B.2, which provide graphical representations of our two sequences of random variables. In these pictures, the vertical axis represents the sample space Ω, the horizontal axis represents indices (or "times") of the trials, and the interiors of the figures contain the realizations X_i(ω) and X̂_i(ω). To focus on distributional properties of a sequence of random variables, we focus on the collection of outcomes in each vertical section of the picture (Figure 9.B.1). In this respect, each X_i is identical to its partner X̂_i, and in fact all of the random variables in both sequences share the same distribution. To focus on sample path properties, we look instead at the sequences of outcomes in each horizontal slice of each picture (Figure 9.B.2). By doing so, we see that for each ω, the sample path {X_i(ω)}_{i=1}^∞ is quite different from the sample path {X̂_i(ω)}_{i=1}^∞. §

[Figure 9.B.1: Distributional properties of X and X̂.]

[Figure 9.B.2: Sample path properties of X and X̂.]

Example 9.B.3. Properties of i.i.d. random variables. The distinction between distributional properties and sample path properties can be used to classify the fundamental theorems about sequences of i.i.d. random variables. Let {X_i}_{i=1}^∞ be a sequence of i.i.d. random variables, each of which is a function on the (uncountable) probability space (Ω, F, P). For simplicity, assume that each X_i has mean zero and variance one. Then the sum S_n = Σ_{i=1}^n X_i has mean zero and variance n, while the sample average X̄_n = S_n/n has mean zero and variance 1/n.

The laws of large numbers concern the convergence of the sample averages X̄_n as the number of trials n grows large. The Weak Law of Large Numbers is a distributional result: as n goes to infinity, the distributions of the random variables X̄_n approach a point mass at 0.

  The Weak Law of Large Numbers: For all ε > 0, lim_{n→∞} P(X̄_n ∈ [−ε, ε]) = 1.

In contrast, the Strong Law of Large Numbers is a sample path result: for almost every choice of ω ∈ Ω, the sequence of realizations {X̄_n(ω)}_{n=1}^∞ converges to zero.

  The Strong Law of Large Numbers: P(ω ∈ Ω : lim_{n→∞} X̄_n(ω) = 0) = 1.

Note that while the WLLN can be stated directly in terms of distributions, the SLLN only makes sense if our random variables are defined as functions on a probability space.
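The two viewpoints can be illustrated with a few lines of simulation. In the sketch below (which uses fair ±1 coin flips as a convenient mean-zero, variance-one example), reading along a row shows a single sample path whose running average settles down to zero, as in the SLLN, while reading across paths at a fixed n shows the cross-sectional concentration described by the WLLN.

import numpy as np

rng = np.random.default_rng(2)
n, paths = 10000, 200
X = rng.choice([-1.0, 1.0], size=(paths, n))          # i.i.d., mean 0, variance 1
Xbar = X.cumsum(axis=1) / np.arange(1, n + 1)         # running sample averages

# Sample path viewpoint (SLLN): along each fixed path omega, Xbar_n(omega) -> 0.
print("three individual paths at n = 10000:", Xbar[:3, -1])
# Distributional viewpoint (WLLN): across paths at a fixed n, Xbar_n concentrates at 0.
print("fraction of paths with |Xbar_n| <= 0.05:", (np.abs(Xbar[:, -1]) <= 0.05).mean())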
A second pair of results focuses on variation. The Central Limit Theorem concerns distributions: as n goes to infinity, the distributions of the normalized sums S_n/√n converge to the standard normal distribution.

  The Central Limit Theorem: lim_{n→∞} P(S_n/√n ∈ [a, b]) = ∫_a^b (1/√(2π)) e^{−x²/2} dx.

The Law of the Iterated Logarithm looks at variation within individual sample paths: for almost every choice of ω ∈ Ω, the sequence of realizations {S_n(ω)}_{n=1}^∞ exceeds (1 − ε)√(2n log log n) infinitely often, but exceeds (1 + ε)√(2n log log n) only finitely often.

  The Law of the Iterated Logarithm: P(ω ∈ Ω : lim sup_{n→∞} S_n(ω)/√(2n log log n) = 1) = 1. §

In Chapter 10, we present distributional and sample path convergence theorems for Markov processes; these results are the key to describing the evolution of behavior over infinite time horizons.

9.B.4 Countable State Markov Chains

Markov chains and Markov processes are collections of random variables {X_t}_{t∈T} with the property that "the future only depends on the past through the present". We focus on settings where these random variables take values in some finite or countable state space X. (Of course, even if the state space X is countable, the random variables X_t : Ω → X must be defined on a probability model with an uncountable sample space Ω if the set of times T is infinite.) We use the terms "Markov chain" and "Markov process" to distinguish between the discrete time (T = {0, 1, . . .}) and continuous time (T = [0, ∞)) frameworks. (Some authors use these terms to distinguish between discrete and continuous state spaces.)

The sequence of random variables {X_t} = {X_t}_{t=0}^∞ is a Markov chain if it satisfies the Markov property:

  P(X_{t+1} = x_{t+1} | X_0 = x_0, . . . , X_t = x_t) = P(X_{t+1} = x_{t+1} | X_t = x_t)

for all times t ∈ {0, 1, . . .} and all collections of states x_0, . . . , x_{t+1} ∈ X for which the conditional expectations are well defined. We only consider temporally homogeneous Markov chains, which are Markov chains whose one-step transition probabilities are independent of time: P(X_{t+1} = y | X_t = x) = p_{xy}.

We call the matrix p ∈ R_+^{X×X} the transition matrix for the Markov chain {X_t}. The vector π ∈ R_+^X defined by P(X_0 = x) = π_x is the initial distribution of {X_t}; when π puts all of its mass on a single state x_0, we call x_0 the initial condition or the initial state. The vector π and the matrix p fully determine the joint distributions of {X_t} via

  P(X_0 = x_0, . . . , X_t = x_t) = π_{x_0} Π_{s=1}^t p_{x_{s−1} x_s}.

Since certain properties of Markov chains do not depend on the initial distribution π, it is sometimes left unspecified.

9.B.5 Countable State Markov Processes

A (temporally homogeneous) Markov process on the countable state space X is a collection of random variables {X_t} = {X_t}_{t≥0} with continuous time index t. This collection must satisfy the following three properties:

(MP) The (continuous time) Markov property: P(X_{t_{k+1}} = x_{t_{k+1}} | X_{t_0} = x_{t_0}, . . . , X_{t_k} = x_{t_k}) = P(X_{t_{k+1}} = x_{t_{k+1}} | X_{t_k} = x_{t_k}) for all 0 ≤ t_0 < . . . < t_{k+1} and x_{t_0}, . . . , x_{t_{k+1}} ∈ X with P(X_{t_0} = x_{t_0}, . . . , X_{t_k} = x_{t_k}) > 0.

(TH) Temporal homogeneity: P(X_{t+u} = y | X_t = x) = p_{xy}(u) for all t, u ≥ 0.

(RCLL) Right continuity and left limits: For every ω ∈ Ω, the sample path {X_t(ω)}_{t≥0} is continuous from the right and has left limits. That is, lim_{s↓t} X_s(ω) = X_t(ω) for all t ∈ [0, ∞), and lim_{s↑t} X_s(ω) exists for all t ∈ (0, ∞).

While conditions (MP) and (TH) are restrictions on the (joint) distributions of {X_t}, condition (RCLL) is a restriction on the sample paths of {X_t}.

Processes satisfying the distributional requirements (MP) and (TH) must take this form: there must be an initial distribution π ∈ R_+^X, a jump rate vector λ ∈ R_+^X, and a transition matrix p ∈ R_+^{X×X} such that

(i) The initial distribution of the process is given by P(X_0 = x) = π_x.
(ii) When the process is in state x, the random time before the next jump is exponentially distributed with rate λ_x.

(iii) The state at which a jump from x lands follows the distribution {p_{xy}}_{y∈X}. (Note that the landing state can be x itself if p_{xx} > 0.)

(iv) Times between and landing states of jumps are independent of each other, and are also independent of the past conditional on the current state.

The objects π, λ, and p implicitly define the joint distributions of the random variables {X_t}, so the Kolmogorov Extension Theorem (Section 9.B.2) tells us that a collection of random variables with these joint distributions exists (i.e., can be defined as functions on some well chosen probability space). However, Kolmogorov's Theorem does not ensure that the random variables so constructed satisfy the sample path continuity property (RCLL).

Fortunately, it is not too difficult to construct the process {X_t} explicitly. Let {Y_k}_{k=0}^∞ be a discrete time Markov chain with initial distribution π and transition matrix p, and let {T_k}_{k=1}^∞ be a sequence of i.i.d. exponential(1) random variables that are independent of the Markov chain {Y_k}. (Since both of these collections are countable, questions of sample path continuity do not arise; the existence of these random variables as functions defined on a common probability space is ensured by Kolmogorov's Theorem.) Define the random jump times {τ_n}_{n=0}^∞ by τ_0 = 0 and

  τ_n = Σ_{k=1}^n T_k / λ_{Y_{k−1}},   so that   τ_n − τ_{n−1} = T_n / λ_{Y_{n−1}}.

Finally, define the process {X_t}_{t≥0} by X_t = Y_n when t ∈ [τ_n, τ_{n+1}).

The process {X_t} begins at some initial state X_0 = Y_0 = y_0. It remains there for the random duration τ_1 ∼ exponential(λ_{y_0}), at which point a transition to some new state X_{τ_1} = Y_1 = y_1 occurs; the process then remains at y_1 for the random duration τ_2 − τ_1 ∼ exponential(λ_{y_1}), at which point a transition to X_{τ_2} = Y_2 = y_2 occurs; and so on. By construction, the sample paths of {X_t} are right continuous with left limits, and it is easy to check that the joint distributions of {X_t} are the ones we desire.

Example 9.B.4. The Poisson Process. Consider a Markov process {X_t} with state space X = Z_+, initial condition X_0 = 0, jump rates λ_x = λ > 0 for all x ∈ X, and transition matrix p_{xy} = 1{y = x+1} for all x, y ∈ X. Under this process, jumps arrive randomly at the fixed rate λ, and every jump increases the state by exactly one unit. A Markov process fitting this description is called a Poisson process. By the definition of this process,

(P1) The waiting times τ_n − τ_{n−1} are i.i.d. with τ_n − τ_{n−1} ∼ exponential(λ)   (n ∈ {1, 2, . . .}).

In fact, it can be shown that under the sample path continuity condition (RCLL), condition (P1) is equivalent to

(P2) The increments X_{t_n} − X_{t_{n−1}} are independent random variables, and (X_{t_n} − X_{t_{n−1}}) ∼ Poisson(λ(t_n − t_{n−1}))   (0 < t_1 < . . . < t_n).

Proposition 9.A.3 established part of this result: it showed that if condition (P1) holds, then X_t ∼ Poisson(λt) for all t > 0. But the present result says much more: a "pure birth process" whose waiting times are i.i.d. exponentials is not only Poisson distributed at each time t; in fact, all increments of the process are Poisson distributed, and nonoverlapping increments are stochastically independent. Conversely, if one begins with the assumption that the increments of the process are independent and Poisson, then the waits between jumps must be i.i.d. and exponential. §
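The explicit construction above translates directly into a simulation recipe: run the jump chain {Y_k}, and hold each visited state for an independent exponential(1) clock time divided by that state's jump rate. The sketch below implements this recipe; the three-state parameter values at the bottom are arbitrary placeholders.

import numpy as np

def sample_ctmc_path(pi, lam, p, T, rng):
    # Construct a sample path of {X_t} on [0, T] from the jump chain {Y_k} and
    # independent exponential(1) clocks, as in Section 9.B.5.
    states = np.arange(len(pi))
    y = rng.choice(states, p=pi)              # Y_0 ~ pi
    t, path = 0.0, [(0.0, int(y))]
    while True:
        t += rng.exponential(1.0) / lam[y]    # tau_n - tau_{n-1} = T_n / lam_{Y_{n-1}}
        if t > T:
            return path
        y = rng.choice(states, p=p[y])        # next state of the jump chain
        path.append((t, int(y)))

rng = np.random.default_rng(3)
pi  = np.array([1.0, 0.0, 0.0])
lam = np.array([1.0, 2.0, 0.5])
p   = np.array([[0.0, 0.7, 0.3],
                [0.5, 0.0, 0.5],
                [0.2, 0.8, 0.0]])
print(sample_ctmc_path(pi, lam, p, T=5.0, rng=rng)[:5])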
9.C Kurtz's Theorem in Discrete Time

To obtain a deterministic approximation theorem for discrete time Markov chains, we must assume that the length of a period with respect to clock time becomes vanishingly small as the population size N increases. Let d_N be the duration of a period under the Markov chain {X_t^N}, so that this chain is initialized at time 0 and has transitions at times d_N, 2d_N, . . . . We can define {X_t^N} at all times in [0, ∞) by letting X_t^N = X_{kd_N}^N when t ∈ [kd_N, (k+1)d_N), making each sample path {X_t^N(ω)} = {X_t^N(ω)}_{t≥0} a step function whose jumps occur at multiples of d_N.

Theorem 9.C.1 (Kurtz's Theorem in Discrete Time). Suppose that lim_{N→∞} d_N = 0. Define the distributions of the random variables ζ_x^N by

  P(ζ_x^N = z) = P(X_{(k+1)d_N}^N = x + z | X_{kd_N}^N = x),

and define the functions V^N, A^N, and A_δ^N by

  V^N(x) = (1/d_N) E ζ_x^N,   A^N(x) = (1/d_N) E |ζ_x^N|,   and   A_δ^N(x) = (1/d_N) E(|ζ_x^N| 1{|ζ_x^N| > δ}).

Then the conclusions of Theorem 9.2.1 hold for the sequence of Markov chains {{X_t^N}}_{N=N_0}^∞.

9.N Notes

Section 9.2. Kurtz's Theorem first appeared in Kurtz (1970). For an advanced textbook treatment and further references, see Ethier and Kurtz (1986, Chapter 11).

The first formal results in the game theory literature akin to Theorem 9.2.3 focus on specific revision protocols. Boylan (1995) shows how evolutionary processes based on random matching schemes converge to deterministic trajectories when the population size grows large. Binmore et al. (1995), Börgers and Sarin (1997), and Schlag (1998) consider particular models of evolution that converge to the replicator dynamic. Binmore and Samuelson (1999) prove a general deterministic approximation result for discrete time models of evolution under a somewhat restrictive timing assumption. Sandholm (2003) uses Kurtz's Theorem to prove a general finite horizon convergence result. This paper also shows that after spatial normalization, the behavior of {X_t^N} near rest points of the mean dynamic can be approximated by a diffusion. The strongest deterministic approximation results can be found in Benaïm and Weibull (2003). These authors establish an exponential bound on the probability of deviations of {X_t^N} from solutions of the mean dynamic. They also establish results relating the infinite horizon behavior of {X_t^N} to the mean dynamic; we introduce these results in Chapter 10.
This means that the independence of the k individuals’ choices at time 0 persists over any finite time span, a phenomenon sometimes called propagation of chaos. One can further show that the empirical distribution of the N agents’ choice trajectories also converges to the measure ν. (Since ν is the (limiting) distribution of each individual’s stochastic choice trajectory, this result is a generalization of the Glivenko-Cantelli Theorem.) Now N the time t marginal of this empirical distribution is none other than our state variable Xt , 376 so Theorem 9.2.3 tells us that the collection of time t marginals of ν is none other than the solution to our mean dynamic (M). For an overview of the mathematical literature relevant to this discussion, see Sznitman (1991). Appendices 9.A and 9.B: Billingsley (1995) and Durrett (2005) are excellent graduate level probability texts. The former book provides more thorough coverage of the topics considered in this chapter, and contains an especially clear treatment of the Poisson process. Norris (1997), Br´ maud (1999), and Stroock (2005) are all excellent books on e Markov chains and Markov processes. The first of these is at an undergraduate level, the last at a graduate level, and the middle one somewhere in between. Appendix 9.C: This section follows Kurtz (1970). 377 378 CHAPTER TEN Infinite Horizon Behavior and Equilibrium Selection 10.0 Introduction To be added. 379 380 BIBLIOGRAPHY Abraham, R. and Robbin, J. (1967). Transversal Mappings and Flows. W. A. Benjamin, New York. Akin, E. (1979). The Geometry of Population Genetics. Springer, Berlin. Akin, E. (1980). Domination or equilibrium. Mathematical Biosciences, 50:239–250. Akin, E. (1990). The differential geometry of population genetics and evolutionary games. In Lessard, S., editor, Mathematical and Statistical Developments of Evolutionary Theory, pages 1–93. Kluwer, Dordrecht. Akin, E. (1993). The General Topology of Dynamical Systems. American Mathematical Society, Providence, RI. Akin, E. and Losert, V. (1984). Evolutionary dynamics of zero-sum games. Journal of Mathematical Biology, 20:231–258. Anderson, S. P., de Palma, A., and Thisse, J.-F. (1992). Discrete Choice Theory of Product Differentiation. MIT Press, Cambridge. Arneodo, A., Coullet, P., and Tresser, C. (1980). Occurrence of strange attractors in threedimensional Volterra equations. Physics Letters, 79A:259–263. Aubin, J.-P. (1991). Viability Theory. Birkh¨ user, Boston. a Aubin, J.-P. and Cellina, A. (1984). Differential Inclusions. Springer, Berlin. Avriel, M. (1976). Nonlinear Programming: Analysis and Methods. Prentice-Hall, Englewood Cliffs, NJ. Balkenborg, D. and Schlag, K. H. (2001). Evolutionarily stable sets. International Journal of Game Theory, 29:571–595. Beckmann, M., McGuire, C. B., and Winsten, C. B. (1956). Studies in the Economics of Transportation. Yale University Press, New Haven. 381 Bena¨m, M. (1998). Recursive algorithms, urn processes, and the chaining number of chain ı recurrent sets. Ergodic Theory and Dynamical Systems, 18:53–87. Bena¨m, M. (1999). Dynamics of stochastic approximation algorithms. In Az´ ma, J. et al., ı e editors, S´minaire de Probabilit´s XXXIII, pages 1–68. Springer, Berlin. e e Bena¨m, M. and Hirsch, M. W. (1999). Mixed equilibria and dynamical systems arising ı from fictitious play in perturbed games. Games and Economic Behavior, 29:36–72. Bena¨m, M., Hofbauer, J., and Hopkins, E. (2006a). Learning in games with unstable ı equilibria. 
Unpublished manuscript, Universit´ de Neuchˆ tel, University of Vienna, e a and University of Edinburgh. Bena¨m, M., Hofbauer, J., and Sorin, S. (2005). Stochastic approximation and differential ı inclusions. SIAM Journal on Control and Optimization, 44:328–348. Bena¨m, M., Hofbauer, J., and Sorin, S. (2006b). Stochastic approximation and differential ı inclusions II: Applications. Mathematics of Operations Research, 31:673–695. Bena¨m, M. and Weibull, J. W. (2003). Deterministic approximation of stochastic evolution ı in games. Econometrica, 71:873–903. Berger, U. (2007). Two more classes of games with the continuous-time fictitious play property. Games and Economic Behavior, 60:247–261. Berger, U. and Hofbauer, J. (2006). Irrational behavior in the Brown-von Neumann-Nash dynamics. Games and Economic Behavior, 56:1–6. Bhatia, N. P. and Szeg˝ , G. P. (1970). Stability Theory of Dynamical Systems. Springer, Berlin. o Billingsley, P. (1995). Probability and Measure. Wiley, New York, third edition. Binmore, K. and Samuelson, L. (1999). Evolutionary drift and equilibrium selection. Review of Economic Studies, 66:363–393. Binmore, K., Samuelson, L., and Vaughan, R. (1995). Musical chairs: Modeling noisy evolution. Games and Economic Behavior, 11:1–35. Bishop, D. T. and Cannings, C. (1978). A generalised war of attrition. Journal of Theoretical Biology, 70:85–124. Bjornerstedt, J. and Weibull, J. W. (1996). Nash equilibrium and evolution by imitation. In ¨ Arrow, K. J. et al., editors, The Rational Foundations of Economic Behavior, pages 155–181. St. Martin’s Press, New York. Bomze, I. M. (1986). Non-cooperative two-person games in biology: A classification. International Journal of Game Theory, 15:31–57. 382 Bomze, I. M. (1990). Dynamical aspects of evolutionary stability. Monatshefte fur Mathe¨ matik, 110:189–206. Bomze, I. M. (1991). Cross entropy minimization in uninvadable states of complex populations. Journal of Mathematical Biology, 30:73–87. Bomze, I. M. (2002). Regularity versus degeneracy in dynamics, games, and optimization: A unified approach to different aspects. SIAM Review, 44:394–414. Bomze, I. M. and Potscher, B. M. (1989). Game Theoretical Foundations of Evolutionary ¨ Stability. Springer, Berlin. Borgers, T. and Sarin, R. (1997). Learning through reinforcement and the replicator dy¨ namics. Journal of Economic Theory, 77:1–14. Boylan, R. T. (1995). Continuous approximation of dynamical systems with randomly matched individuals. Journal of Economic Theory, 66:615–625. ¨ Braess, D. (1968). Uber ein Paradoxen der Verkehrsplanung. Unternehmensforschung, 12:258–268. Br´ maud, P. (1999). Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. e Springer, New York. Brown, G. W. and von Neumann, J. (1950). Solutions of games by differential equations. In Kuhn, H. W. and Tucker, A. W., editors, Contributions to the Theory of Games I, volume 24 of Annals of Mathematics Studies, pages 73–79. Princeton University Press, Princeton. Bulow, J. and Klemperer, P. (1999). The generalized war of attrition. American Economic Review, 89:175–189. Burdett, K. and Judd, K. L. (1983). Equilibrium price dispersion. Econometrica, 51:955–969. Camerer, C. (2003). Behavioral Game Theory. Princeton University Press, Princeton. Conley, C. (1978). Isolated Invariant Sets and the Morse Index. American Mathematical Society, Providence, RI. Cooper, R. W. (1999). Coordination Games: Complementarities and Macroeconomics. Cambridge University Press, Cambridge. Cressman, R. (1992). 
The Stability Concept of Evolutionary Game Theory: A Dynamic Approach. Springer, Berlin. Cressman, R. (1995). Evolutionary game theory with two groups of individuals. Games and Economic Behavior, 11:237–253. Cressman, R. (1996). Frequency-dependent stability for two-species interactions. Theoretical Population Biology, 49:189–210. 383 Cressman, R. (1997). Local stability of smooth selection dynamics for normal form games. Mathematical Social Sciences, 34:1–19. Cressman, R. (2003). Evolutionary Dynamics and Extensive Form Games. MIT Press, Cambridge. Cressman, R., Garay, J., and Hofbauer, J. (2001). Evolutionary stability concepts for nspecies frequency-dependent interactions. Journal of Theoretical Biology, 211:1–10. Crouzeix, J.-P. (1998). Characterizations of generalized convexity and generalized monotonicity: A survey. In Crouzeix, J.-P. et al., editors, Generalized Convexity, Generalized Monotonicity: Recent Results, pages 237–256. Kluwer, Dordrecht. Crow, J. F. and Kimura, M. (1970). An Introduction to Population Genetics Theory. Harper and Row, New York. Dafermos, S. and Sparrow, F. T. (1969). The traffic assignment problem for a general network. Journal of Research of the National Bureau of Standards B, 73:91–118. Dawkins, R. (1976). The Selfish Gene. Oxford University Press, Oxford. Dawkins, R. (1982). The Extended Phenotye. Oxford University Press, Oxford. Demichelis, S. and Germano, F. (2000). On the indices of zeros of Nash fields. Journal of Economic Theory, 94:192–217. Demichelis, S. and Germano, F. (2002). On (un)knots and dynamics in games. Games and Economic Behavior, 41:46–60. Demichelis, S. and Ritzberger, K. (2003). From evolutionary to strategic stability. Journal of Economic Theory, 113:51–75. Dupuis, P. and Nagurney, A. (1993). Dynamical systems and variational inequalities. Annals of Operations Research, 44:9–42. Durrett, R. (2005). Probability: Theory and Examples. Brooks-Cole, Belmont, CA, third edition. Eigen, M. and Schuster, P. (1979). The Hypercycle: A Principle of Natural Self-Organization. Springer, Berlin. Ellison, G. and Fudenberg, D. (2000). Learning purified mixed equilibria. Journal of Economic Theory, 90:84–115. Ely, J. C. and Sandholm, W. H. (2005). Evolution in Bayesian games I: Theory. Games and Economic Behavior, 53:83–109. Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. Wiley, New York. 384 Friedberg, S. H., Insel, A. J., and Spence, L. E. (1989). Linear Algebra. Prentice-Hall, Englewood Cliffs, NJ, second edition. Friedman, D. (1991). Evolutionary games in economics. Econometrica, 59:637–666. Fudenberg, D. and Kreps, D. M. (1993). Learning mixed equilibria. Games and Economic Behavior, 5:320–367. Fudenberg, D. and Levine, D. K. (1998). Theory of Learning in Games. MIT Press, Cambridge. Fudenberg, D. and Tirole, J. (1991). Game Theory. MIT Press, Cambridge. Gaunersdorfer, A. and Hofbauer, J. (1995). Fictitious play, Shapley polygons, and the replicator equation. Games and Economic Behavior, 11:279–303. Gilboa, I. and Matsui, A. (1991). Social stability and equilibrium. Econometrica, 59:859–867. Gordon, W. B. (1972). On the diffeomorphisms of euclidean space. American Mathematical Monthly, 79:755–759. Guckenheimer, J. and Holmes, P. (1983). Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. Springer, Berlin. Haigh, J. (1975). Game theory and evolution. Advances in Applied Probability, 7:8–11. Hamilton, W. D. (1967). Extraordinary sex ratios. Science, 156:477–488. Hamilton, W. D. 
(1996). Narrow Roads of Gene Land, volume 1. W. H. Freeman/Spektrum, Oxford. Harker, P. T. and Pang, J.-S. (1990). Finite-dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms, and applications. Mathematical Programming, 48:161–220. Harsanyi, J. C. (1973). Games with randomly disturbed payoffs: A new rationale for mixed-strategy equilibrium points. International Journal of Game Theory, 2:1–23. Hart, S. and Mas-Colell, A. (2001). A general class of adaptive strategies. Journal of Economic Theory, 98:26–54. Hart, S. and Mas-Colell, A. (2003). Uncoupled dynamics do not lead to Nash equilibrium. American Economic Review, 93:1830–1836. Hartman, P. (1964). Ordinary Differential Equations. Wiley, New York. Henry, C. (1973). An existence theorem for a class of differential equations with multivalued right-hand side. Journal of Mathematical Analysis and Applications, 41:179–186. Hewitt, E. and Stromberg, K. (1965). Real and Abstract Analysis. Springer, Berlin. 385 Hines, W. G. S. (1980). Three characterizations of population strategy stability. Journal of Applied Probability, 17:333–340. Hines, W. G. S. (1987). Evolutionary stable strategies: A review of basic theory. Theoretical Population Biology, 31:195–272. Hiriart-Urruty, J.-B. and Lemar´ chal, C. (2001). Fundamentals of Convex Analysis. Springer, e Berlin. Hirsch, M. W. (1988). Systems of differential equations that are competitive or cooperative III: Competing species. Nonlinearity, 1:51–71. Hirsch, M. W. and Smale, S. (1974). Differential Equations, Dynamical Systems, and Linear Algebra. Academic Press, San Diego. Hirsch, M. W., Smale, S., and Devaney, R. L. (2004). Differential Equations, Dynamical Systems, and an Introduction to Chaos. Elsevier, Amsterdam. Hofbauer, J. (1981). On the occurrence of limit cycles in the Volterra-Lotka equation. Nonlinear Analysis, 5:1003–1007. Hofbauer, J. (1985). The selection mutation equation. Journal of Mathematical Biology, 23:41–53. Hofbauer, J. (1995a). Imitation dynamics for games. Unpublished manuscript, University of Vienna. Hofbauer, J. (1995b). Stability for the best response dynamics. Unpublished manuscript, University of Vienna. Hofbauer, J. (1996). Evolutionary dynamics for bimatrix games: A Hamiltonian system? Journal of Mathematical Biology, 34:675–688. Hofbauer, J. (2000). From Nash and Brown to Maynard Smith: Equilibria, dynamics, and ESS. Selection, 1:81–88. Hofbauer, J. and Hopkins, E. (2005). Learning in perturbed asymmetric games. Games and Economic Behavior, 52:133–152. Hofbauer, J., Mallet-Paret, J., and Smith, H. L. (1991). Stable periodic solutions for the hypercycle system. Journal of Dynamics and Differential Equations, 3:423–436. Hofbauer, J. and Sandholm, W. H. (2002). On the global convergence of stochastic fictitious play. Econometrica, 70:2265–2294. Hofbauer, J. and Sandholm, W. H. (2006). Survival of dominated strategies under evolutionary dynamics. Unpublished manuscript, University of Vienna and University of Wisconsin. 386 Hofbauer, J. and Sandholm, W. H. (2007). Evolution in games with randomly disturbed payoffs. Journal of Economic Theory, 132:47–69. Hofbauer, J. and Sandholm, W. H. (2008). Stable games and their dynamics. Unpublished manuscript, University of Vienna and University of Wisconsin. Hofbauer, J., Schuster, P., and Sigmund, K. (1979). A note on evolutionarily stable strategies and game dynamics. Journal of Theoretical Biology, 81:609–612. Hofbauer, J., Schuster, P., and Sigmund, K. (1981). 
Competition and cooperation in catalytic self-replication. Journal of Mathematical Biology, 11:155–168. Hofbauer, J. and Sigmund, K. (1988). Theory of Evolution and Dynamical Systems. Cambridge University Press, Cambridge. Hofbauer, J. and Sigmund, K. (1998). Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge. Hofbauer, J. and Sigmund, K. (2003). Evolutionary game dynamics. Bulletin of the American Mathematical Society (New Series), 40:479–519. Hofbauer, J., Sorin, S., and Viossat, Y. (2007). Time average replicator and best reply dynamics. Unpublished manuscript, University of Vienna. Hofbauer, J. and Swinkels, J. M. (1996). A universal Shapley example. Unpublished manuscript, University of Vienna and Northwestern University. Hofbauer, J. and Weibull, J. W. (1996). Evolutionary selection against dominated strategies. Journal of Economic Theory, 71:558–573. Hopkins, E. (1999). A note on best response dynamics. Games and Economic Behavior, 29:138–150. Hopkins, E. (2002). Two competing models of how people learn in games. Econometrica, 70:2141–2166. Hopkins, E. and Seymour, R. M. (2002). The stability of price dispersion under seller and consumer learning. International Economic Review, 43:1157–1190. Horn, R. A. and Johnson, C. R. (1985). Matrix Analysis. Cambridge University Press, Cambridge. Imhof, L. A. (2005). The long-run behavior of the stochastic replicator dynamics. Annals of Applied Probability, 15:1019–1045. Jordan, J. S. (1993). Three problems in learning mixed-strategy Nash equilibria. Games and Economic Behavior, 5:368–386. 387 Kimura, M. (1958). On the change of population fitness by natural selection. Heredity, 12:145–167. Kojima, F. and Takahashi, S. (2007). Anti-coordination games and dynamic stability. International Game Theory Review, 9:667–688. Krantz, S. G. and Parks, H. R. (1999). The Geometry of Domains in Space. Birkh¨ user, Boston. a Kuhn, H. W. (2003). Lectures on the Theory of Games. Princeton University Press, Princeton. Kurtz, T. G. (1970). Solutions of ordinary differential equations as limits of pure jump Markov processes. Journal of Applied Probability, 7:49–58. Lahkar, R. (2007). The dynamic instability of dispersed price equilibria. Unpublished manuscript, University College London. Lahkar, R. and Sandholm, W. H. (2008). The projection dynamic and the geometry of population games. Games and Economic Behavior, forthcoming. Lang, S. (1997). Undergraduate Analysis. Springer, New York, second edition. Lax, P. D. (2007). Linear Algebra and Its Applications. Wiley, Hoboken, NJ, second edition. Lotka, A. J. (1920). Undamped oscillation derived from the law of mass action. Journal of the American Chemical Society, 42:1595–1598. Luce, R. D. and Raiffa, H. (1957). Games and Decisions: Introduction and Critical Survey. Wiley, New York. Marsden, J. E. and Ratiu, T. S. (2002). Introduction to Mechanics and Symmetry: A Basic Exposition of Classical Mechanical Systems. Springer, Berlin, second edition. Matsui, A. (1992). Best response dynamics and socially stable strategies. Journal of Economic Theory, 57:343–362. Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge University Press, Cambridge. Maynard Smith, J. and Price, G. R. (1973). The logic of animal conflict. Nature, 246:15–18. McFadden, D. (1981). Econometric models of probabilistic choice. In Manski, C. F. and McFadden, D., editors, Structural Analysis of Discrete Data with Econometric Applications, pages 198–272. MIT Press, Cambridge. McKelvey, R. D. 
and Palfrey, T. R. (1995). Quantal response equilibria for normal form games. Games and Economic Behavior, 10:6–38. Milgrom, P. and Roberts, J. (1990). Rationalizability, learning, and equilibrium in games with strategic complementarities. Econometrica, 58:1255–1278. 388 Milgrom, P. and Shannon, C. (1994). Monotone comparative statics. Econometrica, 62:157– 180. Milnor, J. W. (1965). Topology from the Differentiable Viewpoint. Princeton University Press, Princeton. Minty, G. J. (1967). On the generalization of a direct method of the calculus of variations. Bulletin of the American Mathematical Society, 73:315–321. Monderer, D. and Sela, A. (1997). Fictitious play and no-cycling conditions. Unpublished manuscript, The Technion. Monderer, D. and Shapley, L. S. (1996). Potential games. Games and Economic Behavior, 14:124–143. Nachbar, J. H. (1990). ’Evolutionary’ selection dynamics in games: Convergence and limit properties. International Journal of Game Theory, 19:59–89. Nagurney, A. (1999). Network Economics: A Variational Inequality Approach. Kluwer, Dordrecht, second edition. Nagurney, A. and Zhang, D. (1996). Projected Dynamical Systems and Variational Inequalities with Applications. Kluwer, Dordrecht. Nagurney, A. and Zhang, D. (1997). Projected dynamical systems in the formulation, stability analysis, and computation of fixed demand traffic network equilibria. Transportation Science, 31:147–158. Nash, J. F. (1951). Non-cooperative games. Annals of Mathematics, 54:287–295. Nemytskii, V. V. and Stepanov, V. V. (1960). Qualitative Theory of Differential Equations. Princeton University Press, Princeton. Norris, J. R. (1997). Markov Chains. Cambridge University Press, Cambridge. Ok, E. A. (2007). Real Analysis with Economic Applications. Princeton University Press, Princeton. Patriksson, M. (1994). The Traffic Assignment Problem: Models and Methods. VSP, Utrecht. Pohley, H.-J. and Thomas, B. (1983). Non-linear ESS models and frequency dependent selection. BioSystems, 16:87–100. Ritzberger, K. (1994). The theory of normal form games from the differentiable viewpoint. International Journal of Game Theory, 23:207–236. Ritzberger, K. and Weibull, J. W. (1995). Evolutionary selection in normal form games. Econometrica, 63:1371–1399. 389 Roberts, A. W. and Varberg, D. E. (1973). Convex Functions. Academic Press, New York. Robinson, C. (1995). Dynamical Systems: Stability, Symbolic Dynamics, and Chaos. CRC Press, Boca Raton, FL. Rockafellar, R. T. (1970). Convex Analysis. Princeton University Press, Princeton. Rosenthal, R. W. (1973). A class of games possessing pure strategy Nash equilibria. International Journal of Game Theory, 2:65–67. Roughgarden, T. (2005). Selfish Routing and the Price of Anarchy. MIT Press, Cambridge. ´ Roughgarden, T. and Tardos, E. (2002). How bad is selfish routing? Journal of the ACM, 49:236–259. ´ Roughgarden, T. and Tardos, E. (2004). Bounding the inefficiency of equilibria in nonatomic congestion games. Games and Economic Behavior, 49:389–403. Samuelson, L. and Zhang, J. (1992). Evolutionary stability in asymmetric games. Journal of Economic Theory, 57:363–391. Sandholm, W. H. (2001). Potential games with continuous player sets. Journal of Economic Theory, 97:81–108. Sandholm, W. H. (2002). Evolutionary implementation and congestion pricing. Review of Economic Studies, 69:81–108. Sandholm, W. H. (2003). Evolution and equilibrium under inexact information. Games and Economic Behavior, 44:343–378. Sandholm, W. H. (2005a). 
Sandholm, W. H. (2005b). Negative externalities and evolutionary implementation. Review of Economic Studies, 72:885–915.
Sandholm, W. H. (2006a). Pairwise comparison dynamics and evolutionary foundations for Nash equilibrium. Unpublished manuscript, University of Wisconsin.
Sandholm, W. H. (2006b). A probabilistic characterization of integrability for game dynamics. Unpublished manuscript, University of Wisconsin.
Sandholm, W. H. (2007a). Evolution in Bayesian games II: Stability of purified equilibria. Journal of Economic Theory, 136:641–667.
Sandholm, W. H. (2007b). Pigouvian pricing and stochastic evolutionary implementation. Journal of Economic Theory, 132:367–382.
Sandholm, W. H. (2008a). Local stability under evolutionary game dynamics. Unpublished manuscript, University of Wisconsin.
Sandholm, W. H. (2008b). Potential functions for normal form games and for population games. Unpublished manuscript, University of Wisconsin.
Sandholm, W. H., Dokumacı, E., and Lahkar, R. (2008). The projection dynamic and the replicator dynamic. Games and Economic Behavior, forthcoming.
Sato, Y., Akiyama, E., and Farmer, J. D. (2002). Chaos in learning a simple two-person game. Proceedings of the National Academy of Sciences, 99:4748–4751.
Schlag, K. H. (1998). Why imitate, and if so, how? A boundedly rational approach to multi-armed bandits. Journal of Economic Theory, 78:130–156.
Schuster, P. and Sigmund, K. (1983). Replicator dynamics. Journal of Theoretical Biology, 100:533–538.
Schuster, P., Sigmund, K., Hofbauer, J., Gottlieb, R., and Merz, P. (1981a). Selfregulation of behaviour in animal societies III: Games between two populations with selfinteraction. Biological Cybernetics, 40:16–25.
Schuster, P., Sigmund, K., Hofbauer, J., and Wolff, R. (1981b). Selfregulation of behaviour in animal societies I: Symmetric contests. Biological Cybernetics, 40:1–8.
Schuster, P., Sigmund, K., Hofbauer, J., and Wolff, R. (1981c). Selfregulation of behaviour in animal societies II: Games between two populations without selfinteraction. Biological Cybernetics, 40:9–15.
Selten, R. (1980). A note on evolutionarily stable strategies in asymmetric animal conflicts. Journal of Theoretical Biology, 84:93–101.
Shahshahani, S. (1979). A new mathematical framework for the study of linkage and selection. Memoirs of the American Mathematical Society, 211.
Shapley, L. S. (1964). Some topics in two person games. In Dresher, M., Shapley, L. S., and Tucker, A. W., editors, Advances in Game Theory, volume 52 of Annals of Mathematics Studies, pages 1–28. Princeton University Press, Princeton.
Sheffi, Y. (1985). Urban Transportation Networks: Equilibrium Analysis with Mathematical Programming Methods. Prentice-Hall, Englewood Cliffs, NJ.
Shiga, T. and Tanaka, H. (1985). Central limit theorem for a system of Markovian particles with mean field interactions. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 69:439–459.
Skyrms, B. (1990). The Dynamics of Rational Deliberation. Harvard University Press, Cambridge.
Skyrms, B. (1992). Chaos in game dynamics. Journal of Logic, Language, and Information, 1:111–130.
Slade, M. E. (1994). What does an oligopoly maximize? Journal of Industrial Economics, 42:45–51.
Smale, S. (1967). Differentiable dynamical systems. Bulletin of the American Mathematical Society, 73:747–817.
Smirnov, G. V. (2002). Introduction to the Theory of Differential Inclusions. American Mathematical Society, Providence, RI.
Smith, H. L. (1995). Monotone Dynamical Systems: An Introduction to the Theory of Competitive and Cooperative Systems. American Mathematical Society, Providence, RI.
Smith, M. J. (1984). The stability of a dynamic model of traffic assignment—an application of a method of Lyapunov. Transportation Science, 18:245–252.
Sparrow, C., van Strien, S., and Harris, C. (2008). Fictitious play in 3 × 3 games: The transition between periodic and chaotic behaviour. Games and Economic Behavior, 63:259–291.
Stroock, D. W. (2005). An Introduction to Markov Processes. Springer, Berlin.
Swinkels, J. M. (1992). Evolutionary stability with equilibrium entrants. Journal of Economic Theory, 57:306–332.
Swinkels, J. M. (1993). Adjustment dynamics and rational play in games. Games and Economic Behavior, 5:455–484.
Sznitman, A. (1991). Topics in propagation of chaos. In Hennequin, P. L., editor, École d'Été de Probabilités de Saint-Flour XIX, 1989, pages 167–251. Springer, Berlin.
Tanabe, Y. (2006). The propagation of chaos for interacting individuals in a large population. Mathematical Social Sciences, 51:125–152.
Tanaka, H. (1983). Some probabilistic problems in the spatially homogeneous Boltzmann equation. In Kallianpur, G., editor, Theory and Application of Random Fields (Bangalore, 1982), pages 258–267. Springer, Berlin.
Taylor, P. D. (1979). Evolutionarily stable strategies with two types of players. Journal of Applied Probability, 16:76–83.
Taylor, P. D. and Jonker, L. (1978). Evolutionarily stable strategies and game dynamics. Mathematical Biosciences, 40:145–156.
Thomas, B. (1985). On evolutionarily stable sets. Journal of Mathematical Biology, 22:105–115.
Topkis, D. (1979). Equilibrium points in nonzero-sum n-person submodular games. SIAM Journal on Control and Optimization, 17:773–787.
Topkis, D. (1998). Supermodularity and Complementarity. Princeton University Press, Princeton.
Ui, T. (2000). A Shapley value representation of potential games. Games and Economic Behavior, 31:121–135.
van Damme, E. (1991). Stability and Perfection of Nash Equilibria. Springer, Berlin, second edition.
Vickers, G. T. and Cannings, C. (1988). Patterns of ESS's I. Journal of Theoretical Biology, 132:381–510.
Vives, X. (1990). Nash equilibrium with strategic complementarities. Journal of Mathematical Economics, 19:305–321.
Vives, X. (2000). Oligopoly Pricing: Old Ideas and New Tools. MIT Press, Cambridge.
Vives, X. (2005). Complementarities and games: New developments. Journal of Economic Literature, 43:437–479.
Volterra, V. (1931). Leçons sur la Théorie Mathématique de la Lutte pour la Vie. Gauthier-Villars, Paris.
Weibull, J. W. (1995). Evolutionary Game Theory. MIT Press, Cambridge.
Weibull, J. W. (1996). The mass action interpretation. Excerpt from 'The work of John Nash in game theory: Nobel Seminar, December 8, 1994'. Journal of Economic Theory, 69:165–171.
Zeeman, E. C. (1980). Population dynamics from game theory. In Nitecki, Z. and Robinson, C., editors, Global Theory of Dynamical Systems (Evanston, 1979), number 819 in Lecture Notes in Mathematics, pages 472–497. Springer, Berlin.