SP11 cs188 lecture 7 -- expectimax search 6PP

CS 188: Artificial Intelligence
Spring 2011
Lecture 7: Minimax and Alpha-Beta Search
2/9/2011
Pieter Abbeel, UC Berkeley
Many slides adapted from Dan Klein

Announcements
- W1 out and due Monday 4:59pm
- P2 out and due next week Friday 4:59pm

Overview
- Deterministic zero-sum games
- Minimax
- Limited depth and evaluation functions
- Alpha-Beta pruning
- Stochastic games
- Expectimax
- Non-zero-sum games

Game Playing State-of-the-Art
- Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions. Checkers is now solved!
- Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue examined 200 million positions per second, and used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply. Current programs are even better, if less historic.
- Othello: Human champions refuse to compete against computers, which are too good.
- Go: Human champions are beginning to be challenged by machines, though the best humans still beat the best machines. In Go, the branching factor b > 300, so most programs use pattern knowledge bases to suggest plausible moves, along with aggressive pruning.
- Pacman: unknown

GamesCrafters
http://gamescrafters.berkeley.edu/
Dan Garcia.

Game Playing
- Many different kinds of games!
- Axes:
  - Deterministic or stochastic?
  - One, two, or more players?
  - Perfect information (can you see the state)?
- Want algorithms for calculating a strategy (policy) which recommends a move in each state
Deterministic Games
- Many possible formalizations, one is:
  - States: S (start at s0)
  - Players: P = {1, ..., N} (usually take turns)
  - Actions: A (may depend on player / state)
  - Transition Function: S × A → S
  - Terminal Test: S → {t, f}
  - Terminal Utilities: S × P → R
- Solution for a player is a policy: S → A

Deterministic Single-Player?
- Deterministic, single player, perfect information:
  - Know the rules
  - Know what actions do
  - Know when you win
  - E.g. Freecell, 8-Puzzle, Rubik's cube
- … it's just search!
- Slight reinterpretation:
  - Each node stores a value: the best outcome it can reach
  - This is the maximal outcome of its children (the max value)
  - Note that we don't have path sums as before (utilities at end)
- After search, can pick move that leads to best node
- Often: not enough time to search till bottom before taking the next action
[Figure: search tree with leaf outcomes win, lose, lose]

Adversarial Games
- Deterministic, zero-sum games:
  - Tic-tac-toe, chess, checkers
  - One player maximizes result
  - The other minimizes result
- Minimax search:
  - A state-space search tree
  - Players alternate turns
  - Each node has a
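
The node values described in the last two slides can be sketched in a few lines of Python. This is a minimal illustration, not the course's code: the tree encoding (a nested list whose leaves are terminal utilities), the function names, and the choice of +1 for a win and -1 for a loss are all assumptions made here. The slide text on minimax is cut off in this excerpt, so `minimax_value` follows the standard textbook definition (max at one player's nodes, min at the other's) rather than the slide wording.

```python
from typing import List, Union

# Hypothetical encoding: a tree is either a terminal utility (a number)
# or a list of child subtrees. Utilities live only at the leaves.
Tree = Union[float, List["Tree"]]


def max_value(node: Tree) -> float:
    """Single-player value: the best terminal utility reachable from node."""
    if not isinstance(node, list):
        return node  # terminal state: return its utility
    return max(max_value(child) for child in node)


def minimax_value(node: Tree, maximizing: bool = True) -> float:
    """Two-player zero-sum value, with the players alternating by depth."""
    if not isinstance(node, list):
        return node  # terminal state: return its utility
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_single_player_move(root: List[Tree]) -> int:
    """Index of the root child that leads to the best single-player value."""
    return max(range(len(root)), key=lambda i: max_value(root[i]))


if __name__ == "__main__":
    WIN, LOSE = 1.0, -1.0  # assumed utilities for the win/lose leaves
    tree = [[WIN, LOSE], LOSE]
    print(max_value(tree))                # single player can reach the win
    print(best_single_player_move(tree))  # move toward the subtree holding it
    print(minimax_value(tree))            # an adversary steers to a loss
```

On this small tree the single-player value is a win, because the agent controls every choice, while the minimax value is a loss, because the opponent picks the losing leaf in the left subtree. Neither function adds up costs along the path, which matches the note that utilities appear only at the end.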