12. Alpha-Beta Pruning#
12.1. Optimizing Minimax with Alpha-Beta Pruning#
12.1.1. Introduction#
While the Minimax algorithm guarantees optimal decisions, it has a significant drawback: it explores every possible branch of the game tree, which can be computationally expensive. Alpha-Beta Pruning is an optimization technique that can significantly reduce the number of nodes evaluated without affecting the final result.
12.1.2. What is Alpha-Beta Pruning?#
Alpha-Beta Pruning works by eliminating branches that cannot possibly influence the final decision. It maintains two values:
Alpha (α): The best value that the maximizer (Pacman) has found so far
Beta (β): The best value that the minimizer (Ghost) has found so far
12.1.3. The Pruning Principle#
The key insight is:
If at any point we discover that the current path will not be chosen, we can stop exploring it.
Specifically:
In MAX nodes: If the value becomes ≥ β, prune (the minimizer won’t allow this path)
In MIN nodes: If the value becomes ≤ α, prune (the maximizer won’t choose this path)
12.1.4. Alpha-Beta Algorithm#
def alphabeta(node, depth, alpha, beta, is_maximizing_player):
if depth == 0 or node.is_terminal():
return node.evaluate()
if is_maximizing_player:
max_eval = float('-inf')
for child in node.get_children():
eval_score = alphabeta(child, depth-1, alpha, beta, False)
max_eval = max(max_eval, eval_score)
alpha = max(alpha, eval_score)
if beta <= alpha: # Prune!
break
return max_eval
else:
min_eval = float('inf')
for child in node.get_children():
eval_score = alphabeta(child, depth-1, alpha, beta, True)
min_eval = min(min_eval, eval_score)
beta = min(beta, eval_score)
if beta <= alpha: # Prune!
break
return min_eval
12.2. Application to Pacman#
Now we will apply Alpha-Beta pruning to the Pacman game, where:
Pacman (Agent 0): Maximizing player - wants to maximize score
Ghosts (Agents 1+): Minimizing players - want to minimize Pacman’s score
12.2.1. Game Structure in Pacman#
In Pacman Alpha-Beta with 2 ghosts:
One ply = Pacman moves + Ghost 1 moves + Ghost 2 moves
Depth 1 = One complete ply (Pacman + both ghosts)
Depth 2 = Two complete plies
12.2.2. Agent Order#
The agents take turns in this specific order:
Pacman (Agent 0) - MAX player
Ghost 1 (Agent 1) - MIN player
Ghost 2 (Agent 2) - MIN player
Back to Pacman (Agent 0) - depth increments
This cycle repeats until the maximum depth is reached.
12.2.3. Detailed Example: Depth 2 with 2 Ghosts#
Let’s walk through a complete example with depth = 2 and 2 ghosts:
12.2.3.1. Initial Setup#
Game State:
- Pacman at position (1,1)
- Ghost 1 at position (3,1)
- Ghost 2 at position (1,3)
- Food dot at (2,1)
- Evaluation scores for different scenarios:
* Pacman eats food: +10
* Ghost catches Pacman: -500
* Normal move: -1 (time penalty)
12.2.3.2. The Alpha-Beta Game Tree (Depth 2)#
ROOT: Pacman Turn (MAX)
α=-∞, β=+∞
/ | \
NORTH EAST SOUTH
Score: ? Score: ? Score: ?
| | |
DEPTH 0: Ghost1 Turn (MIN) DEPTH 0: Ghost1 Turn (MIN) DEPTH 0: Ghost1 Turn (MIN)
α=-∞, β=+∞ α=+1, β=+∞ α=+1, β=+∞
/ | \ / | \ / ✗ ✗
N E S N E S N PRUNED
| | | | | | |
DEPTH 0: Ghost2 Turn (MIN) DEPTH 0: Ghost2 Turn (MIN) DEPTH 0: Ghost2 Turn (MIN)
α=-∞, β=+∞ α=+1, β=+∞ α=+1, β=+∞
/ | \ / | ✗ |
N E S N E PRUNED N → returns -1
| | | | | ↓
| | | | | PRUNE!
DEPTH 1: Pacman Turn (MAX) DEPTH 1: Pacman Turn (MAX) ALL REMAINING
α=-∞, β=+1 α=+1, β=+1 BRANCHES
/ | \ / | \ PRUNED
N E S N E S
| | | | | |
[... Ghost1 → Ghost2 → Evaluate ...]
DEPTH 2: EVALUATE DEPTH 2: EVALUATE
+1 +2 +3 +1 +1 +1
↓ ↓
Returns +1 Returns +1
Legend:
✗ = Pruned branches (not evaluated)
α, β = Alpha-beta values at each node
Numbers = Final evaluation scores
12.2.4. Step-by-Step Alpha-Beta Execution#
12.2.4.1. Initial State#
Root: α = -∞, β = +∞
Pacman tries actions in order: NORTH, EAST, SOUTH
12.2.4.2. Branch 1: Pacman NORTH#
1.1. Pacman moves NORTH
Call
alphabeta(1, 0, -∞, +∞, False)(Ghost 1’s turn)
1.2. Ghost 1’s turn (MIN node)
α = -∞, β = +∞
Ghost 1 tries NORTH first → Through complete evaluation → returns +3
β = min(+∞, +3) = +3
Ghost 1 tries EAST → returns +2
β = min(+3, +2) = +2
Ghost 1 tries SOUTH → returns +1
β = min(+2, +1) = +1
1.3. Ghost 1 returns +1
Back to root: α = max(-∞, +1) = +1
12.2.4.3. Branch 2: Pacman EAST#
2.1. Pacman moves EAST
Call
alphabeta(1, 0, +1, +∞, False)(Ghost 1’s turn)Note: α is now +1 from previous branch!
2.2. Ghost 1’s turn (MIN node)
α = +1, β = +∞
Ghost 1 tries NORTH → through Ghost 2 → returns +2
β = min(+∞, +2) = +2
2.3. Ghost 1 tries EAST
Call
alphabeta(2, 0, +1, +2, False)(Ghost 2’s turn)
2.4. Ghost 2’s turn (MIN node)
α = +1, β = +2
Ghost 2 tries NORTH → through complete depth evaluation → returns +1
β = min(+2, +1) = +1
Ghost 2 tries EAST → returns +1
β = min(+1, +1) = +1
Check: β(+1) ≤ α(+1)? YES!
PRUNE! Skip Ghost 2 SOUTH
2.5. Ghost 2 returns +1
Back to Ghost 1: β = min(+2, +1) = +1
Check: β(+1) ≤ α(+1)? YES!
PRUNE! Skip Ghost 1 SOUTH
2.6. Ghost 1 returns +1
Back to root: α = max(+1, +1) = +1
12.2.4.4. Branch 3: Pacman SOUTH#
3.1. Pacman moves SOUTH
Call
alphabeta(1, 0, +1, +∞, False)(Ghost 1’s turn)Note: α is still +1!
3.2. Ghost 1’s turn (MIN node)
α = +1, β = +∞
Ghost 1 tries NORTH → Ghost 2 turn
3.3. Ghost 2’s turn (MIN node)
α = +1, β = +∞
Ghost 2 tries NORTH → Pacman (depth=1) → eventually returns -1
v = min(+∞, -1) = -1
β = min(+∞, -1) = -1
Check: v(-1) < α(+1)? YES!
PRUNE! Return -1 immediately
3.4. Ghost 2 returns -1
Back to Ghost 1: β = min(+∞, -1) = -1
Check: β(-1) ≤ α(+1)? YES!
PRUNE! Skip Ghost 1’s EAST and SOUTH
3.5. Ghost 1 returns -1
Back to root: α remains +1
12.2.4.5. Final Decision#
Root Node Final State:
NORTH: +1
EAST: +1 ← TIE
SOUTH: -1 (pruned early)
Pacman chooses NORTH or EAST (both give +1)
12.2.5. Implementation for Pacman#
class AlphaBetaAgent(MultiAgentSearchAgent):
"""
Minimax agent with alpha-beta pruning
"""
def getAction(self, gameState: GameState):
"""
Returns the alpha-beta action using self.depth and self.evaluationFunction
"""
def alphabeta(agentIndex, depth, gameState, alpha, beta):
# Base case: Check if the game is over or if we've reached the maximum depth
if gameState.isWin() or gameState.isLose() or depth == self.depth:
return self.evaluationFunction(gameState)
# Pacman (maximizer) is agentIndex 0
if agentIndex == 0:
return maxValue(agentIndex, depth, gameState, alpha, beta)
# Ghosts (minimizer) are agentIndex 1 or higher
else:
return minValue(agentIndex, depth, gameState, alpha, beta)
def maxValue(agentIndex, depth, gameState, alpha, beta):
# Initialize max value
v = float('-inf')
# Get Pacman's legal actions
legalActions = gameState.getLegalActions(agentIndex)
if not legalActions:
return self.evaluationFunction(gameState)
# Iterate through all possible actions and update alpha-beta values
for action in legalActions:
successor = gameState.generateSuccessor(agentIndex, action)
v = max(v, alphabeta(1, depth, successor, alpha, beta)) # Ghosts start at index 1
if v > beta:
return v # Prune the remaining branches
alpha = max(alpha, v)
return v
def minValue(agentIndex, depth, gameState, alpha, beta):
# Initialize min value
v = float('inf')
# Get the current agent's legal actions (ghosts)
legalActions = gameState.getLegalActions(agentIndex)
if not legalActions:
return self.evaluationFunction(gameState)
# Get the next agent's index and check if we need to increase depth
nextAgent = agentIndex + 1
if nextAgent == gameState.getNumAgents():
nextAgent = 0 # Go back to Pacman
depth += 1 # Increase the depth since we've gone through all agents
# Iterate through all possible actions and update alpha-beta values
for action in legalActions:
successor = gameState.generateSuccessor(agentIndex, action)
v = min(v, alphabeta(nextAgent, depth, successor, alpha, beta))
if v < alpha:
return v # Prune the remaining branches
beta = min(beta, v)
return v
# Pacman (agentIndex 0) will choose the action with the best alpha-beta score
bestAction = None
bestScore = float('-inf')
alpha = float('-inf')
beta = float('inf')
for action in gameState.getLegalActions(0): # Pacman's legal actions
successor = gameState.generateSuccessor(0, action)
score = alphabeta(1, 0, successor, alpha, beta) # Start with Ghost 1, depth 0
if score > bestScore:
bestScore = score
bestAction = action
alpha = max(alpha, score)
return bestAction
12.2.6. Complete Execution Trace#
Let’s trace the alpha-beta execution step by step with 2 ghosts:
Initial Call: getAction(gameState)
1. Try Pacman action: NORTH
generateSuccessor(0, NORTH)→ New state with Pacman at (1,2)Call
alphabeta(1, 0, successor_state, -∞, +∞)
2. Ghost 1’s turn (agentIndex=1, depth=0)
minValue(1, 0, state, -∞, +∞)calledGhost 1 legal actions: [NORTH, EAST, SOUTH]
2.1 Try Ghost 1 action: NORTH
generateSuccessor(1, NORTH)→ Ghost 1 at (3,0)nextAgent = 2, nextDepth = 0
Call
alphabeta(2, 0, successor_state, -∞, +∞)
2.1.1 Ghost 2’s turn (agentIndex=2, depth=0)
minValue(2, 0, state, -∞, +∞)calledGhost 2 legal actions: [NORTH, EAST, SOUTH]
2.1.1.1 Try Ghost 2 NORTH → Eventually returns +3
v = min(+∞, +3) = +3
β = min(+∞, +3) = +3
2.1.1.2 Try Ghost 2 EAST → Eventually returns +2
v = min(+3, +2) = +2
β = min(+3, +2) = +2
2.1.1.3 Try Ghost 2 SOUTH → Eventually returns +1
v = min(+2, +1) = +1
β = min(+2, +1) = +1
2.1.1 Ghost 2 returns +1
2.1 Back to Ghost 1 NORTH → value +1
v = min(+∞, +1) = +1
β = min(+∞, +1) = +1
2.2 Try Ghost 1 action: EAST → Eventually returns +2
v = min(+1, +2) = +1
β = min(+1, +2) = +1
2.3 Try Ghost 1 action: SOUTH → Eventually returns +3
v = min(+1, +3) = +1
β = min(+1, +3) = +1
3. Ghost 1 returns minimum: +1
Back to root: α = max(-∞, +1) = +1
4. Try Pacman action: EAST
generateSuccessor(0, EAST)→ New state with Pacman at (2,1)Call
alphabeta(1, 0, successor_state, +1, +∞)
5. Ghost 1’s turn (agentIndex=1, depth=0)
minValue(1, 0, state, +1, +∞)calledNote: α is now +1 from previous branch!
5.1 Try Ghost 1 action: NORTH → Through Ghost 2 → returns +2
v = min(+∞, +2) = +2
β = min(+∞, +2) = +2
5.2 Try Ghost 1 action: EAST → Through Ghost 2 → returns +1
v = min(+2, +1) = +1
β = min(+2, +1) = +1
Check: β(+1) ≤ α(+1)? YES!
PRUNE! Skip Ghost 1 SOUTH action
6. Ghost 1 returns +1
Back to root: α = max(+1, +1) = +1
7. Try Pacman action: SOUTH
generateSuccessor(0, SOUTH)→ New state with Pacman at (1,0)Call
alphabeta(1, 0, successor_state, +1, +∞)
8. Ghost 1’s turn (agentIndex=1, depth=0)
minValue(1, 0, state, +1, +∞)called
8.1 Try Ghost 1 action: NORTH
generateSuccessor(1, NORTH)→ Ghost 1 at (3,0)Call
alphabeta(2, 0, successor_state, +1, +∞)
8.1.1 Ghost 2’s turn (agentIndex=2, depth=0)
minValue(2, 0, state, +1, +∞)called
8.1.1.1 Try Ghost 2 NORTH → Eventually returns -1
v = min(+∞, -1) = -1
β = min(+∞, -1) = -1
Check: v(-1) < α(+1)? YES!
PRUNE! Return -1 immediately
8.1 Ghost 2 returns -1
Back to Ghost 1: v = min(+∞, -1) = -1
β = min(+∞, -1) = -1
Check: β(-1) ≤ α(+1)? YES!
PRUNE! Skip Ghost 1’s EAST and SOUTH actions
9. Ghost 1 returns -1
Back to root: α = max(+1, -1) = +1
10. Final decision: max(NORTH:+1, EAST:+1, SOUTH:-1) = NORTH or EAST (tie)
12.2.7. Key Alpha-Beta Insights#
12.2.7.1. 1. Multiple Minimizer Levels#
With 2 ghosts, we have two consecutive MIN levels:
Ghost 1 minimizes over Ghost 2’s choices
Ghost 2 minimizes over Pacman’s future choices
Both can trigger pruning independently
12.2.7.2. 2. Cascading Pruning#
When Ghost 2 triggers pruning (β ≤ α), it not only skips Ghost 2’s remaining moves but can also cause Ghost 1 to skip its remaining moves.
12.2.7.3. 3. Agent-Specific Alpha-Beta#
# In MIN nodes for both ghosts: prune when β ≤ α
if v < alpha:
return v # Don't explore remaining actions
# Update beta for current ghost
beta = min(beta, v)
12.2.7.4. 4. Order Matters#
The order of exploring children affects pruning efficiency:
# Better move ordering → more pruning
# In practice, try "promising" moves first
12.2.7.5. 5. Alpha-Beta Conditions#
# In MAX nodes: prune when v ≥ β
if v > beta:
return v
# In MIN nodes: prune when v ≤ α
if v < alpha:
return v
12.2.7.6. 6. Maintaining Bounds#
# MAX node updates alpha
alpha = max(alpha, value)
# MIN node updates beta
beta = min(beta, value)
12.2.8. Comparison: Minimax vs Alpha-Beta#
Minimax: Evaluates 81 nodes in the tree (3^4 for all combinations with 2 ghosts) Alpha-Beta: Evaluates 43 nodes in the tree Savings: 47% fewer evaluations!
The savings are even more dramatic with the two-ghost scenario. Notice how:
Early pruning in Ghost 1 EAST branch: When β ≤ α, we skip Ghost 1’s remaining moves
Massive pruning in SOUTH branch: Both ghosts get pruned early due to the improved α value
12.2.9. Important Note#
Alpha-Beta pruning returns the exact same result as regular Minimax - it’s just faster! The pruned branches are guaranteed to not affect the optimal decision. With multiple ghosts, the algorithm creates more opportunities for pruning, especially when the first few evaluations establish good alpha and beta bounds that help eliminate large portions of the search tree.
The Minimax algorithm with Alpha-Beta pruning guarantees that Pacman will make the best possible decision, assuming all ghosts also play optimally to minimize Pacman’s score, but does so with significantly improved computational efficiency.