Graph traversals

The concept of traversing across various nodes (or vertices) in a graph along connected edges is termed as graph traversal. The traversing across nodes is typically organized, but can sometimes be random. However, in both the scenarios, traversing begins from a specified start node and ends at a specified final node. Usually, the start and final nodes are not directly connected. In order to establish an indirect connection, a selective organized search is instantiated across the various connected paths. Graph traversal algorithms are generally designed to begin at a given start node and then search for subsequent connected nodes before terminating at the given final node.

Additionally, these traversing algorithms also need to factor in some plausible issues such as the following:

  • Infinite looping: Traversing can end in an infinite loop provided the graph contain cycles
  • Disconnected graph: Traversing (from a specified start node) sometimes can culminate without reaching all the nodes, as they are not connected with the traversed path

These issues are typically addressed by keeping track of the nodes encountered along the paths traversed. The nodes once marked are generally not visited again, unless otherwise specified. Thus, infinite looping can be prevented from happening. In addition, if all the nodes are not visited in a traverse, then a new traversal is initiated from an unmarked node, which continues until all the nodes are visited at least once.

The following R code shows the structure of a graph traverse algorithm. It takes in Graph_ADT, number of vertices n, and a vector of vertices as inputs:

graph_Traverse <- function(Graph_ADT,n,vertices) 
{ 
  ## Initialise marks to zero 
  verticesMarks <- list() 
  for( i in 1:n) 
  verticesMarks[[i]] <- Graph_ADT$initMark(vertices[i],0) ## 0 means 
  not visited 
 
  ## Initiate traversing upon checking for unmarked nodes 
  for(i in 1:n) 
  if(Graph_ADT$getMark(vertices[i])==0) 
  initTraverse(verticesMarks,vertices[i]) 
}

The following are some approaches to implement graph traversal algorithms (the initTraverse function):

  • Depth-first search (DFS)
  • Breadth-first search (BFS)
  • Topological sort

Depth-first search

DFS is a recursive implementation of the graph traversal algorithm, applicable to both directed and undirected graphs. At each step during the traverse, DFS recursively checks and visits all the unvisited nodes, which are directly connected with the node under consideration. Simultaneously, all the visited nodes along the path are pushed into a stack in the order of the traverse. During the traverse, if nodes with no unvisited nodes directly connected to them are encountered, then such nodes are popped out of the stack leaving behind the ones with directly connected unvisited nodes. This keeps track of those nodes that determine the forward traverse path such that all the nodes are visited prior to culmination. The following R code implements the DFS algorithm with three inputs. The inputs are Graph_ADT, n (number of nodes in the graph), and v (node under consideration to perform DFS):

DepthFirstSearch <- function(Graph_ADT, n, v) 
{ 
  ## Ensure all nodes are visited and processed prior node v 
  preVisit(v) 
 
  ## mark the node v under consideration as 1 (i.e. visited) 
  VerticesMarks <- list() 
  VerticesMarks[[v]] <- Graph_ADT$initMark(v,1) 
 
  ## Recursively visit all connected nodes of v till all are    marked 
  as 1 
  ## get the first vertex 
  node <- Graph_ADT$firstVertex(v)  
  ## check node belongs to neighboring nodes using conVert    function 
  while(node %in% conVert(v)){ 
    ## check if the node is unvisited 
    if(Graph_ADT$getMark(VerticesMarks[[node]] == 0))   
    ## recursively run DFS 
    DepthFirstSearch(Graph_ADT,n, node)   
    ## assign next neighbouring vertex 
    node <- Graph_ADT$nextVertex(v,node)      
  } 
 
  ## Run post processing remaining un-visited nodes 
  postVisit(v) 
}

An example of a DFS algorithm on an undirected graph is shown in Figure 8.4.

Depth-first search

Figure 8.4: Represents an undirected graph (left) along with its final search path (right) using a DFS algorithm

Figure 8.4 illustrates an initial undirected graph and its corresponding final search path obtained using the DFS algorithm. Figure 8.5 illustrates the working of DFS in detail using stacks. The sub-function nextVertex selects the node with the lowest index value among all the directly connected unvisited nodes. As the graph is undirected, the DFS algorithm can move in either of the directions as against the directed graph, where the DFS algorithm can move only in a single direction.

The asymptote of the DFS algorithm is θ(|E|+|V|). |E| represents a visit to each node (traversing all edges only once) and |V| represents visiting each node (only once).

Depth-first search

Fig 8.5: Illustration of recursive processing using the DFS algorithm and using A as a start vertex

Breadth-first search

BFS works on a principle similar to the DFS algorithm except the following:

  • BFS is not a recursive implementation unlike DFS
  • To keep track of the marked nodes, BFS uses queues as against the stacks used in DFS
  • Prior to moving to the next node, BFS ensures visits to all of its directly unmarked connected nodes unlike DFS, wherein only one of the unmarked connected nodes is visited at each iteration

The following R code implements the BFS algorithm using four inputs. The inputs are Graph_ADT, startVertex (starting nodes of the graph to begin the traverse), queue (an empty queue to keep track of connected nodes in the order of visit), and n (total number of nodes in the graph):

BreadthFirstSearch <- function(Graph_ADT,startVertex, queue, n) 
{ 
  ## initialise an empty queue with a start vertex 
  queue <- initQueue(startVertex) 
 
  ## Initialise first vertex by marking it as 1 (visited) 
  VerticesMarks <- list() 
  VerticesMarks[[v]] <- Graph_ADT$initMark(v,1) 
 
  ## Subsequently start processing in queues 
  while(length(queue) != 0){ 
    ## extract first element in the queue 
    v <- extQueue(queue) 
 
    ## Pre-Process all directly connected nodes of v 
    preVisit(v) 
 
    ## Mark visited nodes with 1 and accordingly queue the nodes 
    node <- firstVertex(v) 
    while(node %in% conVert(v)){ 
      if(getMark(graph[node] == 0)){ 
        graph <- Graph_ADT$initMark(node,1) 
        queue <- initQueue(node) 
      } 
      node <- Graph_ADT$nextVertex(startVertex,node) 
    } 
  } 
}

Figure 8.6 illustrates an initial undirected graph and its corresponding final search path obtained using the BFS algorithm.

Breadth-first search

Figure 8.6 : Represents an undirected graph (left) along with its final search path (right) using the BFS algorithm

Application of the BFS algorithm on queues is illustrated in Figure 8.7.

Breadth-first search

Figure 8.7 : Illustration of processing using the BFS algorithm and using 1 as a start vertex

Topological sort

The topological sort algorithm is primarily used in scenarios where nodes are conditionally dependent on previous nodes. In other words, the graph traversal can only happen if all its predecessor-connected nodes are visited (or processed). It is generally used in jobs where each stage is scheduled one after the other. For example, during construction of a tower, the columns cannot be raised till the foundation is complete, and roofing cannot be done till the columns are erected. Here, laying the foundation is followed by the erection of columns, which is followed by laying of the roof.

DAG form the basis of topological sort algorithms. In DAG, all the nodes are directionally connected, which takes care of order, and none of the nodes form a cycle, which ensures no conflict with any predecessor nodes (already visited and marked). Thus, DAG safeguards the linearity order among interconnected nodes, thereby being suitable for implementation of the topological sort algorithm. Figure 8.8 illustrates an example of a DAG acceptable for implementing the topological sort algorithm. The topological sort of this graph is 1, 2, 3, 4, 5, 6, 7, 8.

Topological sort

Figure 8.8: An example graph to perform topological sort

The topological sort algorithm is performed using both the DFS and BFS algorithm on DAGs.

In the case of a DFS approach, when a node is visited, no pre-processing is performed (using the preVisit function), whereas during recursive implementation, if the same node is revisited, then that node is returned as an output (using the postVisit function). Thus, the order of output-returned nodes is a reverse sort. The output for the preceding example DAG using the DFS algorithm is 8, 7, 6, 4, 5, 2, 3, 1. Thus, the topological sort is reverse of the output, that is-1, 3, 2, 5, 4, 6, 7, 8. This is also called In-order search (pre-order and post-order). Also, in-order search leads to a sorted output.

The following R code implements a topological sort using the recursive DFS algorithm. The inputs are Graph_ADT, n (total number of nodes in the graph), and vertices (a vector of vertices of the graph):

## Main function to perform topological sort 
Topological_DFS_sort <- function(Graph_ADT, n, vertices) 
{ 
  ## initialise all nodes with 0 (unvisited) 
  verticesMarks <- list() 
  for(i in 1:n) 
   verticesMarks[[i]] <<- Graph_ADT$initMark(vertices[i],0) 
   
  ## Process all nodes by recursive traversing 
  for(i in 1:n) 
   if(Graph_ADT$getMark(vertices[i]) == 0) 
    topological_secondary(Graph_ADT,i) 
} 
 
## recursive secondary function to help main function 
topological_secondary <- function(Graph_ADT,i) 
{ 
  ## Mark the node as 1 (visited) 
  verticesMarks[[i]] <<- Graph_ADT$initMark(vertices[i], 1) 
 
  ## Perform traversing across connected nodes 
  v <- Graph_ADT$firstVertex(vertices[i]) 
  while(v %in% conVert(vertices[i])){ 
    if(Graph_ADT$getMark(vertices[i] == 0)) 
    topological_secondary(vertices,v) 
    v <- Graph_ADT$nextVertex(vertices[i],v) 
  } 
  return(v) 
}

In the case of the BFS approach, the topological sort algorithm is implemented using queuing logic. Here, the nodes are inserted into the queue, not only purely based on their index value (as described in the previous section), but also taking the account of each node's pre-requisites. One of the widely used pre-requisites is the count of inward edges for each node. These counts determine the constraints for each node. Once each node is assigned its respective counts, the nodes with zero count are considered as starting nodes, and are placed in the queue in a predefined order (for example, based on their index value). Then the queuing process begins, where each node is pushed out of the queue, and all its relevant connected nodes are pushed into the queue. Once a node is pushed out, the counts of its directly connected nodes are decreased by a value of one, and the nodes, which have their current count reduced to zero, are pushed into the queue. The order in which the nodes are pushed out of the queue determines the output of the topological sort. Sometimes, the queue becomes empty and not all the nodes are visited.

These situations arise because of some cyclicity present in the graph or on violation of any of the node's prerequisites. The output of the preceding example using the BFS algorithm in topological sort is 1, 2, 3, 4, 5, 6, 7, 8, 9.

The following R code shows an implementation of BFS in topological sort. The inputs are Graph_ADT, x (total number of nodes in the graph), vertices (a vector of vertices of the graph), and queue (an empty queue to keep track of connected nodes in the order of visit):

Topological_BFS_sort <- function(Graph_ADT, queue, n, vertices) 
{ 
  ## Initialise a list to track count of inwards edges for each node 
  countEdge <- list() 
 
  ## initialise count of each node to 0 
  for (i in vertices) 
   countEdge[[i]] <- 0 
 
  ## Assign count (inward nodes) prerequisite to each node 
  for(i in vertices){ 
    v <- Graph_ADT$firstVertex(vertices[i]) 
    while(v %in% conVert(vertices[i])){ 
      countEdge[[v]] <- countEdge[[v]] + 1 
      v <- Graph_ADT$nextVertex(vertices[[i]],v) 
    } 
  } 
 
  ## Initialize queue with nodes which have zero count of inward    
  edges 
  for(i in vertices) 
   if(countEdge[[i]] == 0) 
    queue <-  Graph_ADT$initQueue(i) 
 
  ## Process the nodes which are in the queue 
  while(length(queue) != 0){ 
    v <- extQueue(queue) 
    print(v) 
    w <- Graph_ADT$firstVertex(v) 
    while(w %in% conVert(vertices[v])){ 
      ## Decrease the count prerequisite by 1 
      countEdge[[w]] <- countEdge[[w]] - 1 
      if(countEdge[[w]] == 0) ## no prerequisites 
      queue <- initQueue(w) 
      w <- Graph_ADT$nextVertex(vertices[v],w) 
    } 
  } 
}
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset