drawGraph([0, 1, 2, 3, 4, 5], [[0, 3], [1, 4], [1, 3], [2, 3], [3, 5]], -1, [], [], d3.range(6).map(() => ''), []);
7 Graphs
“To err is human — and to blame it on a computer is even more so.”
– Robert Orben
Learning Objectives
After completing this chapter, students are expected to understand:
the idea and implementation of graphs
how BFS and DFS work
how to find shortest distance paths between vertices in a graph
how to apply the two kinds of search algorithms to design optimal solutions
A graph is a collection of vertices (also called nodes) and edges between them. Sometimes, the vertices have vertex attributes and edges a direction and/or weights. More formally, a graph G is denoted by (V, E), where V=\{v_1,v_2,\ldots,v_n\} is the vertex set and E\subset V\times V is the collection of edges.
The above graph G has 6 vertices: V=\{0, 1, 2, 3, 4, 5\} and 5 edges: E=\{(0, 3), (1, 4), (1, 3), (2, 3), (3, 5)\}.
Note that the graph G above is an undirected graph, unlike a directed graph that has directed edges or arrows. We will consider only undirected graphs here.
Also, the edges of G are unweighted. In a weighted graph, each edge e\in E has an edge weight w_e.
Exercise 7.1 Draw the graph G=(V,E), where
V=\{v_0, v_1, v_2, v_3\}
E is the set of all possible edges.
Exercise 7.2 What is the minimum and maximum number of edges possible in a graph with n vertices.
Exercise 7.3 Implement a Vertex
class to store the vertex attributes of a graph.
Exercise 7.4 Implement an Edge
class to store the edge attributes of a graph.
7.1 Representing a Graph
There are two standard ways to represent a graph G=(V,E):
adjacency list, and
adjacency matrix.
Although both representations are equivalent, different graph algorithms prefer to tak one over the other as input due to convenience and efficiently. Also, the preference is sometimes dictated by how sparse the graph is.
Adjacency Matrix
The adjacency-matrix representation of a graph assumes that the vertices are numbered 1, 2,\ldots, |V| is some arbitrary manner. Then the adjacency matrix of G is |V|\times|V| matrix A=(a_{ij}) such that a_{ij}=\begin{cases} 1 &\text{ if }(i,j)\in E, \\ 0 &\text{ otherwise } \end{cases}
Memory
- The adjacency matrix requires \Theta(n^2) (meaning both O(n^2) and \Omega(n^2)) memory.
Exercise 7.5 Write down the adjacency matrix of G with V=\{1, 2, 3, 4, 5\} and E=\{(1, 2), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (4, 5)\}.
Exercise 7.6 What can you infer about the nature of a graph from its adjacency matrix?
Adjacency List
The adjacency-list representation of a graph G consists of a list Adj
such that each entry is the adjacency list Adj[u]
of a vertex of u\in V.
For each u\in V, the adjacency list Adj[u]
contains all the vertices v such that there is an edge (u,v)\in E. This is, Adj[u]
consists of all the vertices adjacent to u in G.
Memory
- The adjacency list requires \Theta(|V| + |E|) memory.
Exercise 7.7 Give an adjacency-list representation for a complete binary tree on 7 vertices. Give an equivalent adjacency-matrix representation. Assume that the edge are undirected and that the vertices are numbered from 1 to 7 as in a binary heap.
Exercise 7.8 What is the worst-case time-complexity of discovering all the edges?
7.2 Implementing a Graph
Exercise 7.9 Implement the Graph
class in Python. Add methods to add and delete vertices and edges.
Exercise 7.10 Implement a method in class Graph
to return the adjacency matrix representation.
7.3 Graph Traversal
We now discuss two most commonly used algorithms to search a graph. A graph can searched for a variety of reasons:
checking reachability of a query node from a source node
counting the number of connected components
computing the shortest distance of a query node from a source
extracting a spanning-tree rooted at a query source
searching for nodes with a query property
Breadth-First Search (BFS)
Given a graph G=(V, E) and a source vertex s, the BFS algorithm explores each vertex reachable from s. The idea is to discover the reachable vertices by gradually expanding the frontier of the seen vertices along the breadth of the frontier.
Alongside exploring the reachable vertices from the source, BFS computes the shortest-path distance from s to each reachable vertex and a BFS tree with root s containing all reachable vertices.
The basic logic of BFS is the following:
Color each vertex as
White
(not visited)Gray
(currently visiting or on the frontier)Black
(already visited).
In the beginning each node is colored
White
—except for the source node s beingGray
.Maintain a queue Q, to hold the currently exploring
Gray
nodes.In each iteration, do the following repeatedly until Q is empty:
deQueue the head u of Q and color it Black
using the adjacency list
Adj[u]
, discover the neighbors of u that are still Whitecolor these unexplored neighbors as
Gray
and enQueue them to Q
Pseudocode
Here is a pseudocode of BFS:
\begin{algorithm} \caption{BFS} \begin{algorithmic} \Procedure{BFS}{$G, s$} \For{each vertex $u\in G.V$ and $v\neq s$} \State{$u.color=\text{White}$} \State{$u.d=\infty$} \State{$u.\pi=\text{NIL}$} \EndFor \State{$s.color=\text{Gray}$} \State{$s.d=0$} \State{$s.\pi=\text{NIL}$} \State{Queue $Q=\emptyset$} \State{\call{EnQueue}{$Q,s$}} \While{$Q\neq\emptyset$} \State{$u=$\call{DeQueue}{$Q$}} \State{▷ search the neighbors of $u$} \For{each vertex $v\in G.Adj[u]$} \If{$v.color==\text{White}$} \Comment{is $v$ being discovered now?} \State{$v.color=\text{Gray}$} \State{$v.d=u.d + 1$} \State{$v.\pi=u$} \State{\call{EnQueue}{$Q,v$}}\Comment{$v$ is now on the frontier} \EndIf \EndFor \State{$u.color=\text{Black}$} \Comment{$u$ is now behind the frontier} \EndWhile \EndProcedure \end{algorithmic} \end{algorithm}
Demo
Shortest Paths
Define the shortest-path distance \delta(s, v) from s to v as the minimum number of edges in any path from vertex s to vertex v. A path of lenght \delta(s, v) is called a shortest path from s to v. If v is not reachable from s, then \delta(s, v)=\infty.
A remarkable byproduct of BFS is that it computes the shortest-path distances v.d
of all reachable vertices v from s along the way; see line #18
. The proof that v.d
is, in fact, \delta(s, v) is rather technical. An interested student may call upon the textbook (Cormen et al. 2009, 558–60).
Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms, Third Edition. 3rd ed. The MIT Press.
BFS Tree
Another noteworthy byproduct of BFS is the construction of a breadth-first tree, a tree rooted at s containing all reachable nodes.
The tree corresponds to the \pi attributes in the BFS algorithm; the blue edges in the demo. More specifically, v.π
denotes the parent in the tree of a reachable node v from s; see line #19
.
Depth-first Search (DFS)
Given a graph G=(V, E), the DFS algorithm searches deeper in the graph to explore all vertices of G. Depth-first search explores edges out of the most recently discovered vertex v that still has unexplored edges leaving it. Once all of v’s edges have been explored, the search backtracks to explore edges leaving the vertex from which v was discovered. This process continues until all vertices that are reachable from the original source vertex have been discovered. If any undiscovered vertices remain, then depth-first search selects one of them as a new source, repeating the search from that source.
Like BFS, the DFS algorithm keeps track of predecessor attribute v.\pi, producing a subgraph during the search. Since DFS chooses different sources, the resulting subgraph is often a forest, unlike a tree in case of BFS.
The basic logic of DFS is the following:
Color each vertex as
White
(not visited)Gray
(currently visiting)Black
(already visited).
In the beginning each node is colored
White
.Take a
White
vertex and run the subroutineDFS-VISIT
on it to recursively explore all theWhite
adjacent vertices.Take another unvisited vertex and repeat
Step 3
.
Pseudocode
Here is a pseudocode of DFS
:
\begin{algorithm} \caption{DFS} \begin{algorithmic} \Procedure{DFS}{$G$} \For{each vertex $u\in G.V$} \State{$u.color=\text{White}$} \State{$u.\pi=\text{NIL}$} \EndFor \State{$time=0$} \For{each vertex $u\in G.V$} \If{$u.color==\text{White}$} \State{\call{DFS-VISIT}{$G, u$}} \EndIf \EndFor \EndProcedure \Procedure{DFS-VISIT}{$G, u$} \State{$time=time+1$} \State{$u.d=time$} \State{$u.color=\text{Gray}$} \For{each vertex $u\in G.Adj[u]$} \If{$u.color==\text{White}$} \State{$v.\pi=u$} \State{\call{DFS-VISIT}{$G, v$}} \EndIf \State{$time=time+1$} \State{$u.f=time$} \State{$u.color=\text{Black}$} \EndFor \EndProcedure \end{algorithmic} \end{algorithm}
Demo
7.4 Exercises
Exercise 7.11 Take a graph of your choice with at least 8 nodes and 10 edges. Dry-run DFS
for on this graph choosing any source of your choice.
Exercise 7.12 Explain why DFS
terminates for any input graph.
Exercise 7.13 What would happen if we used a Stack instead of a Queue.
Exercise 7.14 What would happen if we removed line #23
in BFS
.
Exercise 7.15 Describe the worst-case running time of BFS
in terms of the number of vertices V and edges E of G.
Exercise 7.16 A bipartite graph is a graph such that each vertex is colored either red
or blue
and each edge connects to differently colored vertices. Design a algorithm to check if a graph is bipartite.