14. Graph Exploration BFS#
14.1. Introduction#
When working with unweighted graphs, Breadth-First Search (BFS) is the optimal algorithm for finding the shortest path between nodes. In this notebook, we’ll implement BFS using NumPy and NetworkX to find the shortest path in an unweighted, undirected graph.
14.2. How Does BFS Work?#
BFS explores a graph level by level, starting from a source node:
We start at the source node and mark it as visited
We explore all its immediate neighbors
Then we move to the next level of nodes (neighbors of neighbors)
This process continues until we reach the target node or visit all reachable nodes
Since BFS explores nodes in order of their distance from the source, it naturally finds the shortest path in unweighted graphs.
14.3. Implementation with NumPy#
Remember, it is forbidden to use any built-in functions or libraries that implement BFS. You need to implement the algorithm from scratch using NumPy and standard Python lists. Queues or stacks are not necessary for this implementation, as we can use NumPy arrays to store the visited nodes and track the exploration process. Any implementation that uses built-in BFS functions or Queue or Stack data structures will be considered invalid.
def bfs_numpy(graph, source, target):
"""
Implementation of BFS using NumPy and standard Python lists
Parameters:
- graph: NetworkX graph (undirected, unweighted)
- source: Source node
- target: Target node
Returns:
- steps: Number of steps needed to find the solution
- shortest_path: List of nodes in the shortest path from source to target
"""
return steps, shortest_path
14.4. Visual Example#
Let’s create a simple undirected graph to visualize how the algorithm works:
def create_example_graph():
"""Creates an example undirected graph for testing"""
G = nx.Graph()
# Add nodes (cities)
cities = ['Madrid', 'Barcelona', 'Valencia', 'Sevilla', 'Bilbao', 'Zaragoza']
G.add_nodes_from(cities)
# Add edges (no weights)
edges = [
('Madrid', 'Barcelona'),
('Madrid', 'Valencia'),
('Madrid', 'Sevilla'),
('Madrid', 'Bilbao'),
('Madrid', 'Zaragoza'),
('Barcelona', 'Valencia'),
('Barcelona', 'Zaragoza'),
('Valencia', 'Sevilla'),
('Zaragoza', 'Bilbao')
]
# Add the edges
G.add_edges_from(edges)
return G
# Create and visualize the graph
G = create_example_graph()
# Visualize the graph
pos = nx.spring_layout(G, seed=42) # Node positioning
plt.figure(figsize=(10, 8))
nx.draw(G, pos, with_labels=True, node_color='lightblue', node_size=500, font_size=10)
plt.title("Undirected Graph of Spanish cities")
plt.show()
14.5. Testing Our BFS Algorithm#
Let’s run our algorithm on this graph and see the results:
# Define source and target nodes
source = 'Bilbao'
target = 'Sevilla'
# Run BFS
bfs_steps, bfs_visited = bfs_numpy(G, source, target)
print(f"BFS Path from {source} to {target}:")
print(f"Number of steps: {bfs_steps}")
print(f"Visited nodes: {bfs_visited}")
print(f"Number of nodes visited: {len(bfs_visited)}")
# Compare with NetworkX
path = nx.shortest_path(G, source, target)
print(f"\nNetworkX result:")
print(f"Shortest path: {path}")
print(f"Path length: {len(path) - 1} edges")
14.6. Visualizing the BFS Process#
Let’s visualize the order in which BFS visits the nodes:
def visualize_shortest_path(G, shortest_path, title):
"""Visualizes the shortest path with numbered edges using edge labels"""
plt.figure(figsize=(10, 8))
# Create a copy of the graph for visualization
H = G.copy()
pos = nx.spring_layout(H, seed=42)
# Create the edges of the shortest path
edge_colors = []
edge_widths = []
# Create edge labels dictionary for the shortest path edges
edge_labels = {}
# Process all edges and mark the shortest path edges
for u, v in H.edges():
if len(shortest_path) > 1:
# Check if this edge is part of the shortest path
is_path_edge = False
for i in range(len(shortest_path) - 1):
if (u == shortest_path[i] and v == shortest_path[i+1]) or \
(v == shortest_path[i] and u == shortest_path[i+1]): # For undirected graphs
is_path_edge = True
# Add label with the step number
edge_labels[(u, v)] = str(i + 1)
break
if is_path_edge:
edge_colors.append('red')
edge_widths.append(3.0)
else:
edge_colors.append('gray')
edge_widths.append(1.0)
else:
edge_colors.append('gray')
edge_widths.append(1.0)
# Draw all nodes
nx.draw_networkx_nodes(H, pos, node_color='lightblue', node_size=500)
# Highlight the nodes in the shortest path
nx.draw_networkx_nodes(H, pos, nodelist=shortest_path, node_color='#ff9999', node_size=500)
# Draw all edges with appropriate colors and widths
nx.draw_networkx_edges(H, pos, edge_color=edge_colors, width=edge_widths, alpha=0.7)
# Draw edge labels (numbers) for the shortest path
nx.draw_networkx_edge_labels(H, pos, edge_labels=edge_labels, font_size=12,
font_weight='bold', bbox=dict(facecolor='white', alpha=0.7))
# Draw node labels
nx.draw_networkx_labels(H, pos)
plt.title(title)
plt.axis('off')
plt.show()
# Visualize the shortest path with numbered edges
visualize_shortest_path(G, path, "Shortest Path with Numbered Edges")
14.7. Conclusion#
In this notebook, we have explored the Breadth-First Search algorithm implemented using NumPy and NetworkX. BFS is a fundamental graph traversal method that systematically explores all vertices of a graph by visiting neighbors at each level before moving to the next level. This level-by-level approach naturally leads to finding the shortest path in unweighted graphs, making BFS an optimal choice for such scenarios. Our implementation leverages NumPy’s efficient array operations and NetworkX’s graph representation capabilities, demonstrating how these tools can be combined to solve graph-related problems effectively. The visualization of the visit order provides an intuitive understanding of how BFS works and why it guarantees the shortest path in unweighted graphs. This algorithm forms the foundation for many more complex graph algorithms and is essential in network analysis, web crawling, social network analysis, and many other applications where finding the most direct connection between nodes is crucial.
14.8. Exercise#
14.8.1. Exercise 1#
After we implemented the BFS algorithm using NumPy, now it’s time to understand the algorithm. What we want you to do is to explain the algorithm in your own words, and try to execute each step of the algorithm, showing print statements for each step. This will help you understand the algorithm better and see how it works in practice.
14.8.2. Exercise 2#
Apply the BFS algorithm to the synthetic graphs that you generated in the previous notebook, and discuss the results. How does the algorithm perform on these graphs? Why it takes more steps to find the shortest path in some cases?