6.254: Game Theory with Engineering Applications
Lecture 11: Learning in Games
Asu Ozdaglar
MIT
March 11, 2010

Outline

- Learning in Games
- Fictitious Play
- Convergence of Fictitious Play

Reading: Fudenberg and Levine, Chapters 1 and 2.

Learning in Games

Most economic theory relies on equilibrium analysis based on Nash equilibrium or its refinements. The traditional explanation for when and why equilibrium arises is that it results from analysis and introspection by the players in a situation where the rules of the game, the rationality of the players, and the payoff functions of the players are all common knowledge.

In this lecture, we develop an alternative explanation: equilibrium arises as the long-run outcome of a process in which less than fully rational players grope for optimality over time.

One of the earliest learning rules, introduced in Brown (1951), is fictitious play. The most compelling interpretation of fictitious play is as a "belief-based" learning rule, i.e., players form beliefs about opponent play (from the entire history of past play) and behave rationally with respect to these beliefs.

Setup

We focus on a two-player strategic form game $\langle \mathcal{I}, (S_i)_{i \in \mathcal{I}}, (u_i)_{i \in \mathcal{I}} \rangle$. The players play this game at times $t = 1, 2, \ldots$. The stage payoff of player $i$ is again given by $u_i(s_i, s_{-i})$ (for the pure strategy profile $(s_i, s_{-i})$).

For $t = 1, 2, \ldots$ and $i = 1, 2$, define the function $\eta_i^t : S_{-i} \to \mathbb{N}$, where $\eta_i^t(s_{-i})$ is the number of times player $i$ has observed the action $s_{-i}$ before time $t$. Let $\eta_i^0(s_{-i})$ represent a starting point (or fictitious past).

For example, consider a two-player game with $S_2 = \{U, D\}$. If $\eta_1^0(U) = 3$ and $\eta_1^0(D) = 5$, and player 2 plays $U$, $U$, $D$ in the first three periods, then $\eta_1^3(U) = 5$ and $\eta_1^3(D) = 6$.
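The count bookkeeping in the Setup section can be sketched in a few lines of Python. This is an illustrative sketch, not part of the lecture; the names `counts_after`, `eta0`, and `observed` are my own.

```python
def counts_after(eta0, observed):
    """Update fictitious-past counts eta0 with a list of observed opponent actions."""
    eta = dict(eta0)  # start from the fictitious past eta_i^0
    for s in observed:
        eta[s] = eta.get(s, 0) + 1  # one more observation of action s
    return eta

# The slide's example: eta_1^0(U) = 3, eta_1^0(D) = 5,
# and player 2 plays U, U, D in the first three periods.
eta3 = counts_after({"U": 3, "D": 5}, ["U", "U", "D"])
print(eta3)  # {'U': 5, 'D': 6}, matching eta_1^3(U) = 5, eta_1^3(D) = 6
```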
The Basic Idea

The basic idea of fictitious play is that each player assumes that his opponent is using a stationary mixed strategy, and updates his beliefs about this stationary mixed strategy at each step. Players choose actions in each period (or stage) to maximize that period's expected payoff given their prediction of the distribution of the opponent's actions, which they form according to

$$\mu_i^t(s_{-i}) = \frac{\eta_i^t(s_{-i})}{\sum_{\bar{s}_{-i} \in S_{-i}} \eta_i^t(\bar{s}_{-i})},$$

i.e., player $i$ forecasts player $-i$'s strategy at time $t$ to be the empirical frequency distribution of past play.

Fictitious Play Model of Learning

Given player $i$'s belief/forecast about his opponent's play, he chooses his action at time $t$ to maximize his payoff, i.e.,

$$s_i^t \in \arg\max_{s_i \in S_i} u_i(s_i, \mu_i^t).$$ ...
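The full update loop — form the empirical-frequency belief $\mu_i^t$, best-respond to it, record the opponent's action — can be simulated directly. The sketch below is my own illustration under stated assumptions: payoff matrices are indexed `[own action][opponent action]`, the example coordination game and all function names (`best_response`, `fictitious_play`) are hypothetical, and ties in the argmax are broken toward the lowest-indexed action.

```python
def best_response(payoff, belief):
    """Pure action maximizing expected payoff against a belief over opponent actions."""
    expected = [sum(payoff[a][b] * belief[b] for b in range(len(belief)))
                for a in range(len(payoff))]
    return max(range(len(payoff)), key=lambda a: expected[a])

def fictitious_play(payoff1, payoff2, eta1, eta2, rounds):
    """Run fictitious play; eta1/eta2 are each player's counts of opponent actions."""
    for _ in range(rounds):
        mu1 = [c / sum(eta1) for c in eta1]  # player 1's forecast of player 2
        mu2 = [c / sum(eta2) for c in eta2]  # player 2's forecast of player 1
        a1 = best_response(payoff1, mu1)
        a2 = best_response(payoff2, mu2)
        eta1[a2] += 1  # player 1 observes player 2's action
        eta2[a1] += 1  # player 2 observes player 1's action
    return eta1, eta2

# Coordination game: payoff 1 if both pick the same action, 0 otherwise.
# With fictitious pasts [1, 2], each player initially forecasts the opponent
# to play action 1 with probability 2/3, so both best-respond with action 1
# and play locks in on that pure Nash equilibrium.
coord = [[1, 0], [0, 1]]
eta1, eta2 = fictitious_play(coord, coord, [1, 2], [1, 2], rounds=50)
print(eta1)  # [1, 52]: action 1 observed in all 50 rounds
```

After 50 rounds the empirical frequencies concentrate on the equilibrium action, illustrating the kind of convergence the next part of the lecture examines.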
This note was uploaded on 05/08/2010 for the course 6.254, taught by Professor Asu Ozdaglar during the Spring '10 term at MIT.