To understand Alpha-Beta, consider the following situation. It's White's turn; white is trying to maximize the score, black is trying to minimize it.
White evaluates moves A, B, and C and finds the best score is 20, with C. Now consider what happens when evaluating move D:
If white selects move D, we need to consider counter-moves by black. Early on, we find black can capture the white queen, and that subtree gets a MIN score of 5 due to the lost queen. However, we have not yet considered all of black's counter-moves. Is it worth checking the rest? No.
We don't care whether black can force a score even lower than 5, because white's move C can already guarantee a score of 20. Black will not choose a counter-move with a score higher than 5, because he is trying to MINimize the score and has already found a move with a score of 5. For white, move C is preferred over move D as soon as the MIN for D (5 so far) drops below that of C (20 for sure). So we "prune" the rest of that subtree, pop back up a level, and evaluate white's moves E, F, G, H... to the end.
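The pruning described above can be sketched in a few lines (this is an illustration I'm adding, not code from the original; the tree shape and scores are assumptions chosen to match the A/B/C/D story):

```python
# Minimal alpha-beta sketch. A game tree is a nested list:
# leaves are scores, inner lists are move choices.

def alpha_beta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    if not isinstance(node, list):       # leaf: return its static score
        return node
    if maximizing:                       # white: try to raise alpha
        for child in node:
            alpha = max(alpha, alpha_beta(child, alpha, beta, False))
            if alpha >= beta:            # opponent already has a better option elsewhere
                break                    # prune the remaining children
        return alpha
    else:                                # black: try to lower beta
        for child in node:
            beta = min(beta, alpha_beta(child, alpha, beta, True))
            if beta <= alpha:
                break                    # prune the remaining children
        return beta

# White's moves A, B, C, D; each inner list holds black's replies.
# Move C guarantees 20; under D, black's first reply scores 5, so the
# rest of D's subtree (the 99 and 42 leaves) is never explored.
tree = [[10, 12], [12, 15], [20, 25], [5, 99, 42]]
print(alpha_beta(tree))  # 20
```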
Hope that helps.
I noticed you said you found the problem, but shouldn't the minimax alpha-beta pruning be:
if it is MAX's turn to move
    for child in children
        result = alphaBetaMinimax(child, alpha, beta)
        if result > alpha
            alpha = result
            if node is root
                bestMove = operator of child
        if alpha >= beta
            return alpha
    return alpha
if it is MIN's turn to move
    for child in children
        result = alphaBetaMinimax(child, alpha, beta)
        if result < beta
            beta = result
            if node is root
                bestMove = operator of child
        if beta <= alpha
            return beta
    return beta
you wrote:

if alpha >= beta
    return beta
return alpha
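As a runnable sketch of the corrected pseudocode (the nested-dict game tree, the move labels, and the leaf scores are my assumptions for illustration; the control flow follows the pseudocode above, with the fix of returning alpha on a MAX cutoff and beta on a MIN cutoff):

```python
# Runnable version of the corrected pseudocode above.
best_move = None  # set only at the root, as in the pseudocode

def alpha_beta_minimax(node, alpha, beta, is_max, is_root=False):
    global best_move
    if not isinstance(node, dict):          # leaf: static evaluation
        return node
    if is_max:
        for move, child in node.items():
            result = alpha_beta_minimax(child, alpha, beta, False)
            if result > alpha:
                alpha = result
                if is_root:
                    best_move = move        # "operator of child"
            if alpha >= beta:
                return alpha                # cutoff: return alpha, not beta
        return alpha
    else:
        for move, child in node.items():
            result = alpha_beta_minimax(child, alpha, beta, True)
            if result < beta:
                beta = result
                if is_root:
                    best_move = move
            if beta <= alpha:
                return beta                 # cutoff: return beta, not alpha
        return beta

# Hypothetical two-ply tree: white chooses A or B, black replies.
tree = {"A": {"a1": 3, "a2": 5}, "B": {"b1": 6, "b2": 9}}
score = alpha_beta_minimax(tree, float("-inf"), float("inf"), True, is_root=True)
print(score, best_move)  # 6 B
```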
Best Answer
I know this is an old question, however....
Yes, Alpha-Beta and minimax return the same answer. All Alpha-Beta does is prevent minimax from evaluating positions that are 100% guaranteed NOT to lead to an optimal state for the current player (MAX or MIN).
You may, however, have equivalent actions for a given state. Which of the equivalent actions your algorithm returns depends on how it is implemented. If sets/unordered lists are used somewhere, the order in which evaluations are made may change.
This may also depend on what you do when the Alpha/Beta value is equal to the current best option. Since an equal value would not produce a better result, there's no point in exploring that path further, so you would just keep the "first best action encountered". With Minimax, however, you explore everything anyway, so you may decide to keep the "last best" value. That's one case where Minimax would return a different action than Alpha-Beta. But they are still equivalent as far as your scoring function is concerned...
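The tie-breaking point can be made concrete with a small sketch (mine, not from the answer; the move labels and scores are assumptions). With identical child scores, a "keep the last best" policy and a "keep the first best" policy return different but equally scored moves:

```python
# Two root-level tie-breaking policies over already-evaluated moves.

def minimax_last_best(children):
    """Pick the move with the max score; on ties, keep the LAST one seen."""
    best_move, best = None, float("-inf")
    for move, score in children:
        if score >= best:               # '>=': a later equal score wins
            best_move, best = move, score
    return best_move, best

def alphabeta_first_best(children):
    """Pick the move with the max score; on ties, keep the FIRST one seen."""
    best_move, best = None, float("-inf")
    for move, score in children:
        if score > best:                # '>': an equal score is not "better"
            best_move, best = move, score
    return best_move, best

# Two moves with the same score of 7: same value, different chosen action.
children = [("A", 7), ("B", 7)]
print(minimax_last_best(children))      # ('B', 7)
print(alphabeta_first_best(children))   # ('A', 7)
```

Either choice is "correct" as far as the scoring function goes; only the reported action differs.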