4/21/2010
Strategic Behavior in
Elections and Markets
Can game theory have anything to say about behavior by
participants in elections or markets?
We often imagine that, in such environments, individual actors
cannot have any impact on outcomes, i.e. th
4/8/2010
Signaling Games
In incomplete information games, one player has
more information than the other player.
So far, we have focused on the case where the type
of the more informed player was known to that
player but unknown to the les
2/1/2010
Bargaining Games
An Application of Sequential Move
Games
The Bargaining Problem
The Bargaining Problem arises in economic situations
where there are gains from trade, for example, when a
buyer values an item more than a seller.
The problem is how to divide the gains, for example,
w
Reinforcement Learning
Lectures 4 and 5
Gillian Hayes
17th January 2008
Reinforcement Learning
Framework
Rewards, Returns
Environment Dynamics
Components of a Problem
Values and Action Values, V and Q
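The state-value and action-value functions listed above have standard definitions; as a sketch in the usual notation (discount factor γ, reward r, policy π), consistent with Sutton and Barto:

```latex
V^{\pi}(s)   = \mathbb{E}_{\pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\middle|\, s_t = s\right]
Q^{\pi}(s,a) = \mathbb{E}_{\pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\middle|\, s_t = s,\ a_t = a\right]
```

V gives the expected discounted return from a state under π; Q conditions additionally on the first action taken.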
Economics 1200
Fall 2010
Homework # 5
Write your answers to the following questions on separate sheets of paper. Your answers
are due in class on Tuesday, April 13. No late homeworks are accepted.
1. Consider a simple version of the two-firm, Cournot duopo
3/22/2010
Repeated Games
This week we examine the effect of repetition on strategic
behavior in games with perfect information.
If a game is played repeatedly, with the same players, the
players may behave very differently than if the game is
played jus
Reinforcement Learning
Lecture 2
Gillian Hayes
10th January 2008
Reinforcement Learning: How Does It Work?
We detect a state
We choose an action
We get a reward
Our aim is to learn a policy: what action to cho
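The sense-act-reward loop described above can be sketched in a few lines. The environment (a 5-state corridor with reward at the end) and the placeholder random policy are invented here purely for illustration:

```python
import random

random.seed(0)

def step(state, action):
    """Toy environment: move along a 5-state corridor, reward 1.0 at the end."""
    next_state = max(0, min(4, state + action))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward

def random_policy(state):
    """Placeholder policy: the object RL tries to improve."""
    return random.choice([-1, +1])

state = 0
total_reward = 0.0
for t in range(20):
    action = random_policy(state)        # we choose an action
    state, reward = step(state, action)  # we detect a state, we get a reward
    total_reward += reward
```

A learning agent would replace `random_policy` with something that uses the observed rewards to prefer better actions.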
2/5/2010
Simultaneous Move Games
Decision making without knowledge
of the strategy choice of opponents
Simultaneous Moves
Arise when players have to make their strategy choices
simultaneously, without knowing the strategies that have been
chosen by the other player(s).
Stude
Economics 1200
Spring 2010
Homework # 4
Write your answers to the following questions on separate sheets of paper. Your answers
are due in class on Tuesday, March 30. No late homeworks are accepted.
1. An employer and its unionized workforce are having di
3/27/2010
More on Sequential and Simultaneous
Move Games
So far we have studied two types of games: 1)
sequential move (extensive form) games where
players take turns choosing actions and 2)
strategic form (normal form) games where players
simultaneously
Economics 1200
Fall 2010
Homework # 6
This is for practice only. You do not have to turn this homework in, but you should be
familiar with the answers in preparation for the final exam. We will go over the answers
in class on Thursday, April 22. Try to wo
Economics 1200
Fall 2010
Answers to Homework # 5
1.
a. If Firm 2 has High Costs, then Low Output is its weakly dominant strategy. If Firm 2 has
Low Costs, then High Output is its strictly dominant strategy. We shall assume that Firm 2
always plays its domi
R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction
Evaluating actions vs. instructing by giving correct actions
Pure evaluative feedback depends totally on the action taken.
Pure instructive feedback depends not at all on the action t
Reinforcement Learning
Lecture 11
Gillian Hayes
11th February 2008
Algorithms for Solving RL: Temporal Difference
Learning (TD)
Incremental Monte Carlo Algorithm
TD Prediction
TD vs MC vs DP
TD for control:
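The TD prediction topic listed above rests on one update rule. A minimal sketch of tabular TD(0), run on an invented two-state chain (states 0 → 1 → end, reward 1 on the final transition):

```python
# TD(0) prediction:  V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
# The chain MDP below is made up for illustration.
alpha, gamma = 0.1, 0.9
V = {0: 0.0, 1: 0.0, "end": 0.0}

def episode():
    """Transitions as (state, reward, next_state) triples."""
    return [(0, 0.0, 1), (1, 1.0, "end")]

for _ in range(200):
    for s, r, s_next in episode():
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
```

V(1) converges toward 1 and V(0) toward γ·V(1) = 0.9, each update bootstrapping from the current estimate of the successor state rather than waiting for the complete return.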
Reinforcement Learning
Lecture 3
Gillian Hayes
14th January 2008
Learning from Interaction
with environment
to achieve some goal
Baby playing. No teacher. Sensorimotor connection to environment.
Cause eff
2/22/2010
Probability, Expected Payoffs and Expected Utility
In thinking about mixed strategies, we will need to make use
of probabilities. We will therefore review the basic rules of
probability and then derive the notion of expected value.
We will als
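The expected-value rule being derived here is just a probability-weighted sum of payoffs; the gamble below (win 10 with probability 0.3, win 4 with probability 0.5, win 0 otherwise) is invented to illustrate the arithmetic:

```python
# Expected value: sum over outcomes of probability * payoff.
outcomes = [(0.3, 10.0), (0.5, 4.0), (0.2, 0.0)]

# probabilities must sum to 1 for a valid lottery
assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9

expected_value = sum(p * payoff for p, payoff in outcomes)
# 0.3*10 + 0.5*4 + 0.2*0 = 5.0
```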
Economics 1200
Fall 2010
Answers to Homework # 4
1.
(a) Using best response analysis, there are no pure strategy Nash equilibria for the game
as presented in normal form.
(b) Let B=build, DB=don't build, S=strike and NS=no strike.
Union
DB
B
Employer
B
Uni
3/30/2010
Complete vs. Incomplete Information Games
All games can be classified as complete information games or
incomplete information games.
Complete information games: the player whose turn it is to move
knows at least as much as those who moved before
R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction
Monte Carlo methods learn from complete sample returns
Only defined for episodic tasks
Monte Carlo methods learn directly from experience
On-line: No model necessary and still att
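Learning from complete sample returns, as described above, can be sketched as first-visit Monte Carlo policy evaluation: average the first-visit returns observed for each state. The two episodes below are invented sample data:

```python
from collections import defaultdict

gamma = 1.0
returns = defaultdict(list)  # returns observed per state
V = {}

# Each episode is a list of (state, reward) pairs, as if generated by a policy;
# the reward is the one received on leaving that state.
episodes = [
    [("A", 0.0), ("B", 1.0)],
    [("A", 0.0), ("B", 0.0)],
]

for ep in episodes:
    G = 0.0
    # walk backwards, accumulating the return G
    for t in range(len(ep) - 1, -1, -1):
        s, r = ep[t]
        G = gamma * G + r
        if s not in [x[0] for x in ep[:t]]:  # first visit to s in this episode
            returns[s].append(G)

for s, gs in returns.items():
    V[s] = sum(gs) / len(gs)
```

No model of the environment's dynamics is needed: the value estimates come entirely from averaging sampled returns, which is why the method is only defined once episodes terminate.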
Reinforcement Learning
Lecture 19a
Gillian Hayes
10th March 2008
Focussed Web Crawling Using RL
Searching web for pages relevant to a specific subject
No organised directory of web pages
Web Crawling: start
R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction
Overview of a collection of classical solution methods
for MDPs known as dynamic programming (DP)
Show how DP can be used to compute value functions,
and hence, optimal policies
Disc
Economics 1200
Spring 2010
Homework # 2
Write your answers to the following questions on separate sheets of paper. Your answers are due
in class on Tuesday, February 9. Late homeworks are not accepted.
1. Consider the two-round home bargaining game discus
Reinforcement Learning
Lecture 14
Gillian Hayes
21st February 2008
Using Q-Learning on Robots: Asterix and Obelix
Asterix and Obelix robots
Setting up the problem so that learning can be use
Q-learning
Stat
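The one-step Q-learning rule named above can be sketched as follows; the states, actions, and the single observed transition are invented for illustration:

```python
# One-step Q-learning update:
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
from collections import defaultdict

alpha, gamma = 0.5, 0.9
Q = defaultdict(float)          # Q[(state, action)], initialised to 0
actions = ["left", "right"]

def q_update(s, a, r, s_next):
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

q_update("s0", "right", 1.0, "s1")   # one observed transition with reward 1
```

On a robot, the hard part is not this update but defining discrete states and actions from noisy sensors and motors so that the table stays tractable.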
R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction
describe the RL problem we will be studying for the
remainder of the course
present idealized form of the RL problem for which we
have precise theoretical results;
introduce key comp
Chapter 8:
Generalization
Objectives of this chapter:
Look at how experience with a limited part of the state set can
be used to produce good behavior over a much larger part.
Overview of function approximation (FA) methods and how
they can be adapted to RL
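Generalizing beyond the visited states replaces the value table with a parameterized function. A minimal sketch of gradient-descent TD(0) with a linear approximator v(s, w) = w · x(s); the one-hot features and the single transition below are assumptions for illustration:

```python
def v(w, x):
    """Linear value estimate: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def td0_fa_update(w, x, r, x_next, alpha=0.1, gamma=0.9):
    """w <- w + alpha * (r + gamma*v(s',w) - v(s,w)) * grad_w v(s,w),
    where the gradient of a linear v is just the feature vector x."""
    delta = r + gamma * v(w, x_next) - v(w, x)
    return [wi + alpha * delta * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0]
x_s, x_s_next = [1.0, 0.0], [0.0, 1.0]   # one-hot features (an assumption)
w = td0_fa_update(w, x_s, 1.0, x_s_next)
```

With overlapping (non-one-hot) features, a single update also moves the estimates of similar states, which is exactly the generalization the chapter is about.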
Economics 1200
Fall 2010
Answers to Homework # 3
1. [1.5 points] (a) The game in normal form can be depicted using two payoff tables. Payoffs are net of
contribution costs, and are in the order: Larry, Curly, Moe.
Moe Chooses Not Contribute
Curly's Choice
D
Reinforcement Learning
Lecture 7
Gillian Hayes
28th January 2008
Algorithms for Solving RL: Monte Carlo Methods
What are they?
Monte Carlo Policy Evaluation
First-visit policy evaluation
Estimating Q-valu