Studio 12: Breadth-First and Depth-First Search

This week's studio will give you some practice with graph traversals and will explore the relationship between DFS and topological orderings of a graph.

Part A: BFS and DFS Practice

Consider the following directed graph:

graph for BFS and DFS

  • First, perform a BFS on the graph starting with vertex E. If you have to choose between edges at a given step, pick the edge that ends with the vertex that comes first in the alphabet.

Question A1: what are the distances assigned by BFS to each vertex?

Question A2: which edges are present in the resulting shortest-path tree?

  • Now run DFS on the same graph, but this time, start with vertex A. Again, if you have to choose between edges at a given step, pick the edge that ends with the vertex that comes first in the alphabet. If you have to restart DFS to visit the entire graph, again do so with the unvisited vertex that comes first in the alphabet.

Question A3: what are the starting and finishing times assigned by DFS to each vertex?

Part B: Relationship of DFS to Topological Order

Recall that a topological ordering on a directed, acyclic graph $G$ is a total ordering of $G$'s vertices such that, if there exists a directed path from $u$ to $v$ in $G$, then $u$ comes before $v$ in the ordering. Topological orderings satisfy the precedence constraints of $G$ -- for each directed edge $(u,v)$, the ordering ensures that $u$ comes before $v$.

Question B1: does the graph in Part A have a topological ordering? If so, give one; if not, why not?

In class, we made the following bold claim based on a single example:

If we run DFS on a directed acyclic graph $G$ and enumerate its vertices in decreasing order of finishing times, the result is a topological ordering on $G$.

Let's see if we can convince ourselves that the claim is true for all $G$. To simplify the thought process, let's focus on a single directed edge in $G$ from vertex $u$ to vertex $v$. We'll argue that DFS always assigns $v$ an earlier finishing time than $u$.

Question B2: why does this latter claim imply the original claim?

Suppose that at some point, DFS traverses the edge $(u,v)$. Recall that in DFS, a vertex can have three statuses:

  • unstarted (no start or finish time assigned);
  • in-progress (start but no finish time assigned);
  • finished (both start and finish times assigned).

Question B3: if $v$ is unstarted when edge $(u,v)$ is traversed, what can you say about the relative finishing times of $v$ and $u$? Why?

Question B4: if $v$ is finished when edge $(u,v)$ is traversed, what can you say about the relative finishing times of $v$ and $u$? Why?

Question B5: can $v$ be in-progress when edge $(u,v)$ is traversed? Why or why not?

Question B6: using your observations in B2-B5, sketch the proof of the original claim.

Part C: The Space of Possible Topological Orderings

A directed acyclic graph may have multiple valid topological orderings. For example, consider the following graph:

graph for topological ordering question

Question C1: if you run DFS on this graph, using the same decision rules as in Part A, what topological ordering do you get?

Question C2: give three other topological orderings of this graph different from that in C1.

A reasonable question might be, is every topological ordering of a graph "reachable" by DFS? That is, for each topological ordering, is there an execution of DFS that produces it?

Question C3: for each of the orderings you found in C2, find an execution of DFS on the graph that produces it.

You might have to make different choices of which edge to follow first from a vertex and/or where to start, and perhaps restart, the search.

In fact, we can show that every possible topological ordering of a directed acyclic graph is generated by some execution of DFS. Indeed, let $G = (V,E)$ be a directed acyclic graph with $n$ vertices, and let $X = [v_1, v_2, \ldots, v_n]$ be a topological ordering of $G$'s vertices.

Question C4: which vertex $v \in X$ is guaranteed not to have any successors, that is, no $u$ such that directed edge $(v,u)$ exists?

Question C5: if we start DFS from the $v$ you picked in C4, what can you say about $v$'s finishing time relative to those of other vertices?

Question C6: generalize your observations in C4 and C5 to describe a possible execution of DFS whose finishing times yield the topological order $X$ on $G$.

Note, however, that the mapping between topological orderings and DFS executions is not bijective.

Question C7: give an example of two different DFS executions (in terms of start and finish times) on the graph in C1 that yield the same topological ordering.

Part D: Uniqueness of Topological Orderings

As we saw in Part C, a directed acyclic graph can have many different topological orderings. A reasonable follow-up question is, are there graphs with only one such ordering, and if so, which ones are they?

Suppose that a graph $G$ has a topological ordering $[v_1, v_2, \ldots, v_n]$.

Question D1: show that for any $1 \leq i < n$, the vertex ordering

\[ [v_1, \ldots v_{i-1}, v_{i+1}, v_i, v_{i+2}, \ldots, v_n] \]

is also a topological ordering for $G$ iff the directed edge $(v_i, v_{i+1})$ does not exist.

(This is the original ordering with just the adjacent pair $v_i, v_{i+1}$ reversed.)

The more interesting direction of this proof is the "only if", i.e. why can we change the order of $v_i$ and $v_{i+1}$ if there is no edge between them?

Question D2: based on your result in D1, argue that $G = (V,E)$ has a unique topological order iff it contains a directed path of $|V|-1$ edges that touches all $|V|$ vertices.

A path that touches every vertex in $G$ is called a Hamiltonian path, after William Rowan Hamilton, who studied such paths in regular polygons.

Question D3: give an efficient algorithm to test whether a directed acyclic graph $G=(V,E)$ contains a Hamiltonian path and return the path if it exists. What is the running time of your algorithm?

Interestingly, while testing for the existence of a Hamiltonian path is quite efficient in directed acyclic graphs, it is much harder in general directed graphs. In fact, this problem is known to be NP-complete -- it's one of a large class of problems that is considered unlikely to be solvable by any algorithm that runs in time polynomial in the input size (in this case, the size of the graph). Whether this problem, or any other NP-complete problem, has a polynomial-time algorithm (also known as the "P vs NP question") is one of the great unsolved questions of computer science.