Lecture12Notes - C&O 355 Mathematical Programming Fall 2010...

C&O 355: Mathematical Programming
Fall 2010
Lecture 12 Notes
Nicholas Harvey
http://www.math.uwaterloo.ca/~harvey/

1 Zero-Sum Games

Let $M$ be any $m \times n$ real matrix, which we use as the payoff matrix for a two-player, zero-sum game. Von Neumann's theorem states that

$$\max_x \min_y x^T M y = \min_y \max_x x^T M y,$$

where the max and min are over distributions $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$. Recall that "distribution" means that $x \geq 0$ and $\sum_{i=1}^m x_i = 1$. Consequently, there exist distributions $x^* \in \mathbb{R}^m$ and $y^* \in \mathbb{R}^n$ such that

$$\max_x \min_y x^T M y = x^{*T} M y^* = \min_y \max_x x^T M y. \qquad (1)$$

This quantity is called the value of the game and is denoted by $v$.

Observation 1. Note that for any fixed $x$, we have $\min_y x^T M y \leq v$. (In particular, $x^T M y^* \leq v$.) Similarly, for any particular $y$, we have $\max_x x^T M y \geq v$. (In particular, $x^{*T} M y \geq v$.)

Observation 2. For any fixed $x$, there is a $y$ achieving $\min_y x^T M y$ such that $y$ has only one non-zero coordinate (which must have value 1). Such a $y$ corresponds to Bob choosing a single action, rather than a randomized choice of actions.

Fix any desired error $\delta \in (0, 1)$. We will give a method to find distributions $\hat{x}$ and $\hat{y}$ such that

$$\left| \min_y \hat{x}^T M y - v \right| \leq \delta \quad \text{and} \quad \left| \max_x x^T M \hat{y} - v \right| \leq \delta. \qquad (2)$$

Due to Observation 1, we see that (2) is equivalent to

$$\min_y \hat{x}^T M y \geq v - \delta \quad \text{and} \quad \max_x x^T M \hat{y} \leq v + \delta. \qquad (3)$$

In other words, if Alice plays according to distribution $\hat{x}$, then no matter how Bob plays, she is guaranteed a payoff of at least $v - \delta$. Conversely, if Bob plays according to distribution $\hat{y}$, then no matter how Alice plays, he is guaranteed to pay her at most $v + \delta$.

2 The Multiplicative Weights Update Method

By scaling, we may assume that $M_{i,j} \in [-1, 1]$ for every $i, j$. Set $\epsilon = \delta/3$. Alice will assign some "weights" to each of her actions, then simulate the game by herself for $T = (\ln m)/[\ldots]$ rounds, modifying her weights between each round. These weights are essentially a probability distribution, except they
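
To make Observations 1 and 2 and condition (3) of Section 1 concrete, the following is a minimal Python sketch, not code from the notes. The function names (`alice_guarantee`, `bob_guarantee`, `satisfies_condition_3`) and the 2 x 2 "matching pennies" matrix are assumptions introduced purely for illustration. By Observation 2, the inner minimum (or maximum) only needs to be taken over the other player's single actions, so no optimization solver is required.

```python
def alice_guarantee(M, x):
    """Return min_y x^T M y for a fixed distribution x.

    By Observation 2 the minimum is attained when Bob picks a single
    column j, so it equals min_j sum_i x[i] * M[i][j].
    """
    m, n = len(M), len(M[0])
    return min(sum(x[i] * M[i][j] for i in range(m)) for j in range(n))


def bob_guarantee(M, y):
    """Return max_x x^T M y for a fixed distribution y.

    Symmetrically, the maximum is attained when Alice picks a single
    row i, so it equals max_i sum_j M[i][j] * y[j].
    """
    m, n = len(M), len(M[0])
    return max(sum(M[i][j] * y[j] for j in range(n)) for i in range(m))


def satisfies_condition_3(M, x_hat, y_hat, v, delta):
    """Check condition (3) for a game whose value v is already known."""
    return (alice_guarantee(M, x_hat) >= v - delta
            and bob_guarantee(M, y_hat) <= v + delta)


# Hypothetical example: matching pennies, with entries already in [-1, 1].
M = [[1, -1],
     [-1, 1]]

# For the uniform distributions the two guarantees coincide, so by
# Observation 1 their common value is the value v of this game.
uniform = [0.5, 0.5]
v = alice_guarantee(M, uniform)

# A slightly non-uniform strategy for Alice guarantees her less than v.
x_hat = [0.6, 0.4]
print(alice_guarantee(M, x_hat), v, bob_guarantee(M, uniform))
print(satisfies_condition_3(M, x_hat, uniform, v, delta=0.25))
```

In this sketch the game value is found by inspection (the two guarantees meet), which only works for such a small example. For a general matrix, $v$ would have to be computed some other way, for instance by linear programming.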