4/1/10

Intro to Game Theory
CPS 170
Ron Parr
(with many slides courtesy of Vince Conitzer)

What is game theory?
• Game theory studies settings where multiple parties (agents) each have
  – different preferences (utility functions),
  – different actions that they can take
• Each agent’s utility (potentially) depends on all agents’ actions
  – What is optimal for one agent depends on what other agents do
  – Very circular!
• Game theory studies how agents can rationally form beliefs over what other agents will do, and (hence) how agents should act
  – Useful for acting as well as (potentially) predicting behavior of others
• Game theory does not directly aim to be a descriptive theory

Real World Game Theory Examples
• War
• Auctions
• Animal behavior
• Networking protocols, peer to peer networking behavior
• Road traffic
• Mechanism design: Suppose we want people to do X. How do we engineer the situation so that they will act that way?

Penalty kick example
[Figure: penalty kick diagram with action probabilities .7 / .3, 1, and .6 / .4]
Is this a “rational” outcome? If not, what is?

Rock-paper-scissors
• Column player AKA player 2 (simultaneously) chooses a column
• Row player AKA player 1 chooses a row

              Rock      Paper     Scissors
  Rock        0, 0     -1, 1      1, -1
  Paper       1, -1     0, 0     -1, 1
  Scissors   -1, 1      1, -1     0, 0

• A row or column is called an action or (pure) strategy
• Row player’s utility is always listed first, column player’s second
• Zero-sum game: the utilities in each entry sum to 0 (or a constant)
• Three-player game would be a 3D table with 3 utilities per entry, etc.

“Chicken”
• Two players drive cars towards each other
• If one player goes straight, that player wins
• If both go straight, they both die

  (S = go straight, D = dodge)

         D         S
  D     0, 0     -1, 1
  S     1, -1    -5, -5

• Not zero-sum

Rock-paper-scissors – Seinfeld variant
MICKEY: All right, rock beats paper!
(Mickey smacks Kramer's hand for losing)
KRAMER: I thought paper covered rock.
MICKEY: Nah, rock flies right through paper.
KRAMER: What beats rock?
MICKEY: (looks at hand) Nothing beats rock.

              Rock      Paper     Scissors
  Rock        0, 0      1, -1     1, -1
  Paper      -1, 1      0, 0     -1, 1
  Scissors   -1, 1      1, -1     0, 0

Dominance
• Player i’s strategy $s_i$ strictly dominates $s_i'$ if
  – for any $s_{-i}$, $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$
• $s_i$ weakly dominates $s_i'$ if
  – for any $s_{-i}$, $u_i(s_i, s_{-i}) \geq u_i(s_i', s_{-i})$; and
  – for some $s_{-i}$, $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$
• $-i$ = “the player(s) other than i”

              Rock      Paper     Scissors
  Rock        0, 0      1, -1     1, -1
  Paper      -1, 1      0, 0     -1, 1
  Scissors   -1, 1      1, -1     0, 0

• Strict dominance: Rock over Paper
• Weak dominance: Rock over Scissors

Prisoner’s Dilemma
• Pair of criminals has been caught
• District attorney has evidence to convict them of a minor crime (1 year in jail); knows that they committed a major crime together (3 years in jail) but cannot prove it
• Offers them a deal:
  – If both confess to the major crime, they each get a 1 year reduction
  – If only one confesses, that one gets a 3 year reduction

                   confess     don’t confess
  confess          -2, -2       0, -3
  don’t confess    -3, 0       -1, -1

“Should I buy an SUV?”
• Purchasing + gas cost
  – SUV: cost: 5
  – smaller car: cost: 3
• Accident cost
  – SUV vs. SUV: cost: 5, cost: 5
  – smaller car vs. SUV: cost: 8 (smaller car), cost: 2 (SUV)
  – smaller car vs. smaller car: cost: 5, cost: 5

                 SUV          smaller car
  SUV           -10, -10      -7, -11
  smaller car   -11, -7       -8, -8

“2/3 of the average” game
• Everyone writes down a number between 0 and 100
• Person closest to 2/3 of the average wins
• Example:
  – A says 50
  – B says 10
  – C says 90
  – Average(50, 10, 90) = 50
  – 2/3 of average = 33.33
  – A is closest (50 - 33.33 = 16.67), so A wins

Iterated dominance
• Iterated dominance: remove (strictly/weakly) dominated strategy, repeat
• Iterated strict dominance on Seinfeld’s RPS:

              Rock      Paper     Scissors
  Rock        0, 0      1, -1     1, -1
  Paper      -1, 1      0, 0     -1, 1
  Scissors   -1, 1      1, -1     0, 0

  Paper is strictly dominated by Rock for both players; removing it leaves:

              Rock      Scissors
  Rock        0, 0      1, -1
  Scissors   -1, 1      0, 0

“2/3 of the average” game revisited
[Figure: the interval from 0 to 100, with marks at (2/3)*100 and (2/3)*(2/3)*100]
• Numbers between (2/3)*100 and 100: dominated
• Numbers between (2/3)*(2/3)*100 and (2/3)*100: dominated after removal of (originally) dominated strategies
• …
• 0

Mixed strategies
• Mixed strategy for player i = probability distribution over player i’s (pure) strategies
• E.g. 1/3 rock, 1/3 paper, 1/3 scissors
• Example of dominance by a mixed strategy:

  1/2    3, 0    0, 0
  1/2    0, 0    3, 0
         1, 0    1, 0

  – Mixing the first two rows 1/2 : 1/2 gives the row player 1.5 against either column, which beats the 1 from the third row

Nash equilibrium [Nash 50]
• A vector of strategies (one for each player) is called a strategy profile
• A strategy profile $(\sigma_1, \sigma_2, \ldots, \sigma_n)$ is a Nash equilibrium if each $\sigma_i$ is a best response to $\sigma_{-i}$
  – That is, for any i, for any $\sigma_i'$, $u_i(\sigma_i, \sigma_{-i}) \geq u_i(\sigma_i', \sigma_{-i})$
• Note that this does not say anything about multiple agents changing their strategies at the same time
• In any (finite) game, at least one Nash equilibrium (possibly using mixed strategies) exists [Nash 50]
• (Note: singular is equilibrium, plural is equilibria)

Nash equilibria of “chicken”

         D         S
  D     0, 0     -1, 1
  S     1, -1    -5, -5

• (D, S) and (S, D) are Nash equilibria
  – They are pure-strategy Nash equilibria: nobody randomizes
  – They are also strict Nash equilibria: changing your strategy will make you strictly worse off
• No other pure-strategy Nash equilibria

Rock-paper-scissors

              Rock      Paper     Scissors
  Rock        0, 0     -1, 1      1, -1
  Paper       1, -1     0, 0     -1, 1
  Scissors   -1, 1      1, -1     0, 0
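
A table this small can be checked for pure-strategy Nash equilibria by brute force. The sketch below is one hypothetical way to do that in Python and is not part of the slides: `pure_nash_equilibria` and `expected_row_utilities` are illustrative names, payoffs are assumed to be signed (win = 1, loss = -1, both players dying in Chicken = -5), and rows and columns are assumed to be ordered rock, paper, scissors (D, S for Chicken).

```python
from fractions import Fraction
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all (row, col) cells where neither player gains by deviating alone.

    payoffs[r][c] is a (row_utility, col_utility) pair.
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        u_row, u_col = payoffs[r][c]
        # Row player compares against every other row in the same column.
        row_is_best = all(payoffs[r2][c][0] <= u_row for r2 in range(n_rows))
        # Column player compares against every other column in the same row.
        col_is_best = all(payoffs[r][c2][1] <= u_col for c2 in range(n_cols))
        if row_is_best and col_is_best:
            equilibria.append((r, c))
    return equilibria


def expected_row_utilities(payoffs, col_mix):
    """Row player's expected utility for each pure action against a column mix."""
    return [
        sum(p * payoffs[r][c][0] for c, p in enumerate(col_mix))
        for r in range(len(payoffs))
    ]


# Assumed signed payoffs, listed as (row player, column player).
RPS = [
    [(0, 0), (-1, 1), (1, -1)],   # rock
    [(1, -1), (0, 0), (-1, 1)],   # paper
    [(-1, 1), (1, -1), (0, 0)],   # scissors
]
CHICKEN = [
    [(0, 0), (-1, 1)],     # D
    [(1, -1), (-5, -5)],   # S
]

chicken_equilibria = pure_nash_equilibria(CHICKEN)
rps_equilibria = pure_nash_equilibria(RPS)
uniform = [Fraction(1, 3)] * 3
rps_vs_uniform = expected_row_utilities(RPS, uniform)
```

Under these assumptions the search returns the cells (D, S) and (S, D) for Chicken and no cell at all for rock-paper-scissors, and every rock-paper-scissors action earns expected utility 0 against an opponent who plays each action with probability 1/3.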
• Any pure-strategy Nash equilibria? No: in every cell, at least one player can gain by switching actions
• But it has a mixed-strategy Nash equilibrium: Both players put probability 1/3 on each action
• If the other player does this, every action will give you expected utility 0
  – Might as well randomize

Nash equilibria of “chicken”…

         D         S
  D     0, 0     -1, 1
  S     1, -1    -5, -5

• Is there a Nash equilibrium that uses mixed strategies? Say, where player 1 uses a mixed strategy?
• If a mixed strategy is a best response, then all of the pure strategies that it randomizes over must also be best responses
• So we need to make player 1 indifferent between D and S
• Player 1’s utility for playing D = $-p^c_S$, where $p^c_S$ is the probability that the column player plays S
• Player 1’s utility for playing S = $p^c_D - 5p^c_S = 1 - 6p^c_S$ (using $p^c_D = 1 - p^c_S$)
• So we need $-p^c_S = 1 - 6p^c_S$, which means $p^c_S = 1/5$
• Then, player 2 needs to be indifferent as well
• Mixed-strategy Nash equilibrium: ((4/5 D, 1/5 S), (4/5 D, 1/5 S))
  – People may die! Expected utility -1/5 for each player

Computational Issues
• Zero-sum games can be solved efficiently as linear programs (see slides from earlier in the semester)
• General-sum games may require exponential time (in # of actions) to find a single equilibrium (no known efficient algorithm and good reasons to suspect that none exists)
• Some better news: Despite bad worst-case complexity, many games can be solved quickly

Game Theory Issues
• How descriptive is game theory?
  – Some evidence that people play equilibria
  – Some evidence that people act irrationally
  – If it is computationally intractable to solve for equilibria of large games, it would seem unlikely that people are doing this
• How reasonable is game theory?
  – Are payoffs known?
  – Are situations really simultaneous move with no information about how the other player will act?
  – Are situations really single-shot?

Extensions
• Partial information (just as MDPs are extended to POMDPs)
• Uncertainty about the game parameters, e.g., payoffs (Bayesian games)
• Multistep games with distributions over next states (game theory + MDPs = stochastic games)
• Multistep + partial information (partially observable stochastic games)
• Game theory is so general that it can encompass essentially all aspects of strategic, multiagent behavior, e.g., negotiating, threats, bluffs, coalitions, bribes, etc.
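
As a check on the Chicken calculation from the “Nash equilibria of chicken” slide, the indifference condition can be solved mechanically. The following is a minimal sketch under stated assumptions, not the method used in the course: `indifference_mix` is a hypothetical helper name, the payoffs are assumed to be the signed Chicken payoffs (0, -1, 1, -5 for the row player), and exact fractions are used so the result can be compared with 1/5 directly.

```python
from fractions import Fraction


def indifference_mix(a, b, c, d):
    """Probability q on the opponent's second action that makes a player indifferent.

    The player's payoffs are assumed to be laid out as:
        first action:  a (opponent plays first), b (opponent plays second)
        second action: c (opponent plays first), d (opponent plays second)

    Solves a*(1 - q) + b*q == c*(1 - q) + d*q for q.
    The result is only a valid probability if it lies in [0, 1].
    """
    denom = (a - c) + (d - b)
    if denom == 0:
        raise ValueError("no unique indifference point for these payoffs")
    return Fraction(a - c, denom)


# Row player's assumed Chicken payoffs: D vs D = 0, D vs S = -1, S vs D = 1, S vs S = -5.
q_S = indifference_mix(0, -1, 1, -5)   # probability the column player goes straight

# Row player's expected utility for each pure action against that mix.
u_D = 0 * (1 - q_S) + (-1) * q_S
u_S = 1 * (1 - q_S) + (-5) * q_S
```

With these payoffs `q_S` comes out to 1/5 and both `u_D` and `u_S` equal -1/5, matching the slide. Because the game is symmetric, the same call gives the row player's probability of going straight.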
 Spring '11
 Parr
 Artificial Intelligence