Search in Games
CPS 170, Ron Parr

Why Study Games?
• Many human activities can be modeled as games
  – Negotiations
  – Bidding
  – TCP/IP
  – Military confrontations
  – Pursuit/evasion
• Games are used to train the mind
  – Human game playing, animal play fighting

Why Are Games Good for AI?
• Games typically have concise rules
• Well-defined starting and end points
• Sensing and effecting are simplified
  – Not true for sports games
  – See RoboCup
• Games are fun!
• Downside: Getting taken seriously (not)
  – See robo search and rescue

Some History of Games in AI
• Computer games have been around almost as long as computers (perhaps longer)
  – Chess: Turing (and others) in the 1950s
  – Checkers: Samuel, 1950s learning program
• Usually start with naïve optimism
• Follow with naïve pessimism
• Simon: Predicted computer chess champ by 1967
• Many, e.g., Kasparov, predicted that a computer would never be champion

Games Today
• Computers perform at champion level
  – Backgammon, Checkers (solved), Chess, Othello
• Computers perform well
  – Bridge, poker
• Computers still do badly (but recent breakthroughs show promise)
  – Go, Hex

Simple Game Setup
• Most commonly, we study games that are:
  – 2 player
  – Alternating
  – Zero sum
  – Perfect information
• Examples: Checkers, chess, backgammon
• Assumptions can be relaxed at some expense
• Economics studies the case where the number of agents is very large
  – Individual actions don't change the dynamics

Zero Sum Games
• Assign values to different outcomes
• Win = 1, Loss = -1
• With zero sum games, every gain comes at the other player's expense
• Sum of both players' scores must be 0
• Are any games truly zero sum?

Characterizing Games
• Two-player alternating move games are very much like search
  – Initial state
  – Successor function
  – Terminal test
  – Objective function (heuristic function)
• Unlike search
  – Terminal states are often a large set
  – Full search to terminal states usually impossible

Game Trees
[Figure: partial tic-tac-toe game tree, with Player 1 (x) and Player 2 (o) alternating moves]

Game Trees (abstracted)
[Figure: abstract game tree with a max node at the root (actions A1, A2, A3), min nodes below (actions A11–A32), and terminal nodes at the leaves]

Minimax
• Max player tries to maximize his return
• Min player tries to minimize his return
• This is optimal for both (assuming zero sum)

  minimax(n_max) = max_{s ∈ successors(n)} minimax(s)
  minimax(n_min) = min_{s ∈ successors(n)} minimax(s)

Minimax Values
[Figure: example tree with minimax values propagated from terminal nodes up through min nodes to the max root]

Minimax Properties
• Minimax can be run depth first
  – Time O(b^m)
  – Space O(bm)
• Assumes that opponent plays optimally
• Based on a worst-case analysis
• What if this is incorrect?

Minimax in the Real World
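As a concrete reference point before the practical limitations are discussed, the minimax recursion above can be sketched in a few lines of Python. The tree encoding here (a number is a terminal payoff, a list is a set of successor nodes) is an illustration chosen for brevity, not an API from the slides, and the example values are hypothetical.

```python
def minimax(node, is_max):
    """Return the minimax value of a node.

    Implements minimax(n_max) = max over successors and
    minimax(n_min) = min over successors, from the slides.
    A node is either a number (terminal payoff, from the max
    player's perspective) or a list of child nodes.
    """
    if not isinstance(node, list):  # terminal node: return its payoff
        return node
    # Players alternate levels, so the flag flips on recursion
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# Small example: a max root over three min nodes.
# Min-node values are 3, 2, 2, so the max player gets 3.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # 3
```

Because the recursion is depth first, the space cost is linear in the depth, matching the O(bm) space bound above.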
• Search trees are too big
• Alternating turns double the depth of the search
  – 2 ply = 1 full turn
• Branching factors are too high
  – Chess: 35
  – Go: 361
• Full search from start to end never terminates in non-trivial games

Evaluation Functions
• Like heuristic functions
• Try to estimate the value of a node without expanding all the way to termination
• Using evaluation functions
  – Do a depth-limited search
  – Treat evaluation function as if it were terminal
• What's wrong with this?
• How do you pick the depth?
• How do you manage your time?
• Iterative deepening, quiescence

Desiderata for Evaluation Functions
• Would like to put the same ordering on nodes (even if values aren't totally right)
• Is this a reasonable thing to ask for?
• What if you have a perfect evaluation function?
• How are evaluation functions made in practice?
  – Buckets
  – Linear combinations
    • Chess pieces (material)
    • Board control (positional, strategic)

Search Control Issues
• Horizon effects
  – Something interesting is just beyond the horizon?
  – How do you know?
• When to generate more nodes?
• If you selectively extend your frontier, how do you decide where?
• If you have a fixed amount of total game time, how do you allocate this?

Pruning
• The most important search control method is figuring out which nodes you don't need to expand
• Use the fact that we are doing a worst-case analysis to our advantage
  – Max player cuts off search when he knows min player can force a provably bad outcome
  – Min player cuts off search when he knows max can force a provably good (for max) outcome

Alpha-beta pruning
[Figure: the minimax example tree again, showing which branches alpha-beta can skip]

How to prune
• We still do (bounded) DFS
• Expand at least one path to the "bottom"
• If current node is a max node, and min can force a lower value, then prune siblings
• If current node is a min node, and max can force a higher value, then prune siblings

Max node pruning
[Figure: small example of a max node whose remaining sibling subtrees can be pruned]

Implementing alpha-beta
• alpha = value of best alternative available to max player
• beta = value of best alternative available to min player

  max_value(state, alpha, beta)
    if cutoff(state) then return eval(state)
    for each s in successors(state) do
      alpha = max(alpha, min_value(s, alpha, beta))
      if alpha >= beta then return beta
    end
    return alpha

  min_value(state, alpha, beta)
    if cutoff(state) then return eval(state)
    for each s in successors(state) do
      beta = min(beta, max_value(s, alpha, beta))
      if beta <= alpha then return alpha
    end
    return beta

Amazing facts about alpha-beta
• Empirically, alpha-beta has the effect of reducing the branching factor by half for many problems
• Effectively doubles the search horizon
• Alpha-beta makes the difference between novice and expert computer players

What About Probabilities?
[Figure: game tree with chance nodes between the max and min levels; outcome probabilities label the chance-node edges, e.g. P=0.5/0.5, P=0.6/0.4, P=0.9/0.1]

Expectiminimax
• n random outcomes per chance node
• O(b^m n^m) time
  eminimax(n_max) = max_{s ∈ successors(n)} eminimax(s)
  eminimax(n_min) = min_{s ∈ successors(n)} eminimax(s)
  eminimax(n_chance) = Σ_{s ∈ successors(n)} p(s) · eminimax(s)
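The three-case recursion above can be sketched directly in Python. The node encoding (a number is a terminal value; tuples tag max, min, and chance nodes) and the example probabilities are illustrative assumptions, not from the slides.

```python
def eminimax(node):
    """Expectiminimax over a tagged tree.

    A node is a number (terminal value) or a tuple:
      ('max', [children])            - max player to move
      ('min', [children])            - min player to move
      ('chance', [(p, child), ...])  - random event with probabilities p
    """
    if not isinstance(node, tuple):  # terminal node
        return node
    kind, children = node
    if kind == 'max':
        return max(eminimax(c) for c in children)
    if kind == 'min':
        return min(eminimax(c) for c in children)
    # chance node: probability-weighted sum, matching eminimax(n_chance)
    return sum(p * eminimax(c) for p, c in children)

# Example: a max node choosing between two gambles.
game = ('max', [
    ('chance', [(0.5, 2), (0.5, 10)]),   # expected value 6.0
    ('chance', [(0.9, 5), (0.1, 50)]),   # expected value 9.5
])
print(eminimax(game))
```

Note how the chance-node case returns an expectation rather than a bound achievable by either player, which is exactly why the pruning arguments below become harder.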
Expectiminimax is nasty
• High branching factor
• Randomness makes evaluation functions difficult
  – Hard to predict many steps into the future
  – Values tend to smear together
  – Preserving order is not sufficient
• Pruning is problematic
  – Need to prune based upon a bound on an expectation
  – Need a priori bounds on the evaluation function

Multiplayer Games
• Things sort of generalize, but can get complicated
• Maintain a vector of possible values for each player at each node
• Might assume that each player acts greedily, but what's wrong with this?
• Correct treatment requires the full machinery of game theory

Conclusions
• Game tree search is a special kind of search
• Rely heavily on heuristic evaluation functions
• Alpha-beta is a big win
• Most successful players use alpha-beta
• Final thought: Tradeoff between search effort and evaluation function effort
• When is it better to invest in your evaluation function?
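To make the final tradeoff concrete, here is a minimal sketch of the kind of linear-combination evaluation function described earlier (chess material as a weighted sum). The piece weights and the flat-list board encoding are illustrative assumptions, not part of the slides.

```python
# Conventional material weights (pawn = 1); these are a common
# illustrative choice, not values prescribed by the slides.
PIECE_VALUES = {'p': 1, 'n': 3, 'b': 3, 'r': 5, 'q': 9}

def material_eval(board):
    """Linear-combination evaluation: material balance only.

    board: iterable of piece letters; uppercase pieces belong to the
    max player, lowercase to the min player. Returns the balance from
    the max player's perspective. Kings are ignored (score 0).
    """
    score = 0
    for piece in board:
        value = PIECE_VALUES.get(piece.lower(), 0)
        score += value if piece.isupper() else -value
    return score

# Max has a queen and rook; min has two rooks and a pawn:
# (9 + 5) - (5 + 5 + 1) = 3
print(material_eval(['Q', 'R', 'r', 'r', 'p']))  # 3
```

A richer evaluation would add positional terms (board control, king safety) to the same weighted sum; making those terms cheap to compute is exactly the search-effort versus evaluation-effort tradeoff noted in the conclusions.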
This note was uploaded on 02/17/2012 for the course COMPSCI 170 taught by Professor Parr during the Spring '11 term at Duke.