UNIT-IV NONLINEAR DATA STRUCTURE –GRAPH
Representation of Graphs, Topological Sort, Depth First Search and Breadth-First Search,
Minimum Spanning Tree – Prim's Algorithm, Shortest path algorithm – Dijikstra’s
Algorithm.
Graph and its representations
A Graph is a non-linear data structure consisting of vertices and edges. The vertices are
sometimes also referred to as nodes and the edges are lines or arcs that connect any two
nodes in the graph. More formally a Graph is composed of a set of vertices (V) and a set of
edges (E). The graph is denoted by G (V, E).
Representations of Graph
Here are the two most common ways to represent a graph: For simplicity, we are going to
consider only unweighted graphs in this post.
1. Adjacency Matrix
1. Adjacency List
Adjacency Matrix Representation
An adjacency matrix is a way of representing a graph as a matrix of boolean (0’s
and 1’s). Let’s assume there are n vertices in the graph So, create a 2D
matrix adjMat[n][n] having dimension n x n.
If there is an edge from vertex i to j, mark adjMat[i][j] as 1.
If there is no edge from vertex i to j, mark adjMat[i][j] as 0.
Representation of Undirected Graph as Adjacency Matrix:
The below figure shows an undirected graph. Initially, the entire Matrix is initialized
to 0. If there is an edge from source to destination, we insert 1 to both cases
(adjMat[destination] and adjMat[destination]) because we can go either way.
Graph Representation of Undirected graph to Adjaceny Matrix
Representation of Directed Graph as Adjacency Matrix:
The below figure shows a directed graph. Initially, the entire Matrix is initialized to 0. If
there is an edge from source to destination, we insert 1 for that
particular adjMat[destination].
Graph Representation of Directed graph to Adjaceny Matrix
Representation of Directed Graph as Adjacency list:
The below directed graph has 3 vertices. So, an array of list will be created of size 3,
where each indices represent the vertices. Now, vertex 0 has no neighbours. For vertex 1, it
has two neighbour (i.e, 0 and 2) so, insert vertices 0 and 2 at indices 1 of array. Similarly,
for vertex 2, insert its neighbours in array of list.
Graph Representation of Directed graph to Adjaceny list
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of
vertices such that for every directed edge u-v, vertex u comes before v in the ordering.
Note: Topological Sorting for a graph is not possible if the graph is not a DAG.
Example:
Input: Graph:
Example
Output: 5 4 2 3 1 0
Explanation: The first vertex in topological sorting is always a vertex with an in-degree of
0 (a vertex with no incoming edges). A topological sorting of the following graph is “5 4 2
3 1 0”. There can be more than one topological sorting for a graph. Another topological
sorting of the following graph is “4 5 2 3 1 0”.
Depth First Search or DFS for a Graph
Depth First Search (or DFS) for a graph is similar to Depth First Traversal of a tree.
Like trees, we traverse all adjacent vertices one by one. When we traverse an adjacent
vertex, we completely finish the traversal of all vertices reachable through that adjacent
vertex. After we finish traversing one adjacent vertex and its reachable vertices, we move to
the next adjacent vertex and repeat the process. This is similar to a tree, where we first
completely traverse the left subtree and then move to the right subtree. The key difference
is that, unlike trees, graphs may contain cycles (a node may be visited more than once). To
avoid processing a node multiple times, we use a Boolean visited array.
Example:
Input: adj = [[1, 2], [0, 2], [0, 1, 3, 4], [2], [2]]
Output: 1 2 0 3 4
Explanation: The source vertex s is 1. We visit it first, then we visit an adjacent. There are
two adjacent 1, 0 and 2. We can pick any of the two
Start at 1: Mark as visited. Output: 1
Move to 2: Mark as visited. Output: 2
Move to 0: Mark as visited. Output: 0 (backtrack to 2)
Move to 3: Mark as visited. Output: 3 (backtrack to 2)
Move to 4: Mark as visited. Output: 4 (backtrack to 1)
Input: [[2,3,1], [0], [0,4], [0], [2]]
Output: 0 2 4 3 1
Explanation: DFS Steps:
Start at 0: Mark as visited. Output: 0
Move to 2: Mark as visited. Output: 2
Move to 4: Mark as visited. Output: 4 (backtrack to 2, then backtrack to 0)
Move to 3: Mark as visited. Output: 3 (backtrack to 0)
Move to 1: Mark as visited. Output: 1
Breadth First Search (BFS) is a fundamental graph traversal algorithm. It begins with a
node, then first traverses all its adjacent. Once all adjacent are visited, then their adjacent
are traversed. This is different from DFS in a way that closest vertices are visited before
others. We mainly traverse vertices level by level. A lot of popular graph algorithms
like Dijkstra’s shortest path, Kahn’s Algorithm, and Prim’s algorithm are based on BFS.
BFS itself can be used to detect cycle in a directed and undirected graph, find shortest path
in an unweighted graph and many more problems.
BFS from a Given Source:
The algorithm starts from a given source and explores all reachable vertices from
the given source. It is similar to the Breadth-First Traversal of a tree. Like tree, we begin
with the given source (in tree, we begin with root) and traverse vertices level by level using
a queue data structure. The only catch here is that, unlike trees, graphs may contain cycles,
so we may come to the same node again. To avoid processing a node more than once, we
use a Boolean visited array.
Initialization: Enqueue the given source vertex into a queue and mark it as visited.
1. Exploration: While the queue is not empty:
Dequeue a node from the queue and visit it (e.g., print its value).
For each unvisited neighbor of the dequeued node:
Enqueue the neighbor into the queue.
Mark the neighbor as visited.
2. Termination: Repeat step 2 until the queue is empty.
This algorithm ensures that all nodes in the graph are visited in a breadth-first
manner, starting from the starting node.
What is Minimum Spanning Tree (MST)
A minimum spanning tree (MST) is defined as a spanning tree that has the minimum
weight among all the possible spanning trees. A spanning tree is defined as a tree-like
subgraph of a connected, undirected graph that includes all the vertices of the graph. Or, to
say in Layman’s words, it is a subset of the edges of the graph that forms a tree (acyclic)
where every node of the graph is a part of the tree. The minimum spanning tree has all the
properties of a spanning tree with an added constraint of having the minimum possible
weights among all possible spanning trees. Like a spanning tree, there can also be many
possible MSTs for a graph.
Minimum spanning tree for directed graph
Properties of a Spanning Tree:
The spanning tree holds the below-mentioned principles:
The number of vertices (V) in the graph and the spanning tree is the same.
There is a fixed number of edges in the spanning tree which is equal to one less than
the total number of vertices ( E = V-1 ).
The spanning tree should not be disconnected, as in there should only be a single
source of component, not more than that.
The spanning tree should be acyclic, which means there would not be any cycle in
the tree.
The total cost (or weight) of the spanning tree is defined as the sum of the edge
weights of all the edges of the spanning tree.
There can be many possible spanning trees for a graph.
The minimum spanning tree has all the properties of a spanning tree with an added
constraint of having the minimum possible weights among all possible spanning
trees. Like a spanning tree, there can also be many possible MSTs for a graph. Let’s
look at the MST of the above example Graph,
Algorithms to find Minimum Spanning Tree:
There are several algorithms to find the minimum spanning tree from a given graph
Introduction to prim’s algorithm:
Prim’s algorithm is a Greedy algorithm. This algorithm always starts with a single
node and moves through several adjacent nodes, in order to explore all of the connected
edges along the way.
The algorithm starts with an empty spanning tree. The idea is to maintain two sets of
vertices. The first set contains the vertices already included in the MST, and the other set
contains the vertices not yet included. At every step, it considers all the edges that connect
the two sets and picks the minimum weight edge from these edges. After picking the edge,
it moves the other endpoint of the edge to the set containing MST.
A group of edges that connects two sets of vertices in a graph is called cut in graph
theory. So, at every step of Prim’s algorithm, find a cut, pick the minimum weight edge
from the cut, and include this vertex in MST Set (the set that contains already included
vertices).
How does Prim’s Algorithm Work?
The working of Prim’s algorithm can be described by using the following steps:
Step 1: Determine an arbitrary vertex as the starting vertex of the MST.
Step 2: Follow steps 3 to 5 till there are vertices that are not included in the MST (known
as fringe vertex).
Step3: Find edges connecting any tree vertex with the fringe vertices.
Step 4: Find the minimum among these edges.
Step 5: Add the chosen edge to the MST if it does not form any cycle.
Step 6: Return the MST and exit
Illustration of Prim’s Algorithm:
Consider the following graph as an example for which we need to find the Minimum
Spanning Tree (MST).
Step 1: Firstly, we select an arbitrary vertex that acts as the starting vertex of the Minimum
Spanning Tree. Here we have selected vertex 0 as the starting vertex.
Step 2: All the edges connecting the incomplete MST and other vertices are the edges {0,
1} and {0, 7}. Between these two the edge with minimum weight is {0, 1}. So include the
edge and vertex 1 in the MST.
Step 3: The edges connecting the incomplete MST to other vertices are {0, 7}, {1, 7} and
{1, 2}. Among these edges the minimum weight is 8 which is of the edges {0, 7} and {1,
2}. Let us here include the edge {0, 7} and the vertex 7 in the MST. [We could have also
included edge {1, 2} and vertex 2 in the MST].
Step 4: The edges that connect the incomplete MST with the fringe vertices are {1, 2}, {7,
6} and {7, 8}. Add the edge {7, 6} and the vertex 6 in the MST as it has the least weight
(i.e., 1).
Step 5: The connecting edges now are {7, 8}, {1, 2}, {6, 8} and {6, 5}. Include edge {6, 5}
and vertex 5 in the MST as the edge has the minimum weight (i.e., 2) among them.
Step 6: Among the current connecting edges, the edge {5, 2} has the minimum weight. So
include that edge and the vertex 2 in the MST.
Step 7: The connecting edges between the incomplete MST and the other edges are {2, 8},
{2, 3}, {5, 3} and {5, 4}. The edge with minimum weight is edge {2, 8} which has weight
2. So include this edge and the vertex 8 in the MST.
Step 8: See here that the edges {7, 8} and {2, 3} both have same weight which are
minimum. But 7 is already part of MST. So we will consider the edge {2, 3} and include
that edge and vertex 3 in the MST.
Step 9: Only the vertex 4 remains to be included. The minimum weighted edge from the
incomplete MST to 4 is {3, 4}.
The final structure of the MST is as follows and the weight of the edges of the MST is (4 +
8 + 1 + 2 + 4 + 2 + 7 + 9) = 37.
Note: If we had selected the edge {1, 2} in the third step then the MST would look like the
following.
How to implement Prim’s Algorithm?
Follow the given steps to utilize the Prim’s Algorithm mentioned above for finding
MST of a graph:
1. Create a set mstSet that keeps track of vertices already included in MST.
2. Assign a key value to all vertices in the input graph. Initialize all key values as
INFINITE. Assign the key value as 0 for the first vertex so that it is picked
first.
3. While mstSet doesn’t include all vertices
a) Pick a vertex u that is not there in mstSet and has a minimum key
value.
b) Include u in the mstSet.
c) Update the key value of all adjacent vertices of u. To update the key
values, iterate through all adjacent vertices.
d) For every adjacent vertex v, if the weight of edge u-v is less than the
previous key value of v, update the key value as the weight of u-v.
The idea of using key values is to pick the minimum weight edge from the cut. The
key values are used only for vertices that are not yet included in MST, the key value for
these vertices indicates the minimum weight edges connecting them to the set of vertices
included in MST.
Shortest Path Algorithm
It is an algorithm for finding the shortest path between all the pairs of vertices in a
weighted graph. This algorithm follows the dynamic programming approach to find the
shortest path.
Types of Shortest Path Algorithms:
As we know there are various types of graphs (weighted, unweighted, negative, cyclic, etc.)
therefore having a single algorithm that handles all of them efficiently is not possible. In
order to tackle different problems, we have different shortest-path algorithms, which can be
categorised into two categories:
Single Source Shortest Path Algorithms:
In this algorithm we determine the shortest path of all the nodes in the graph with
respect to a single node i.e. we have only one source node in this algorithms.
1. Depth-First Search (DFS)
1. Breadth-First Search (BFS)
1. Multi-Source BFS
1. Dijkstra's algorithm
1. Bellman-Ford algorithm
1. Topological Sort
1. A* search algorithm
All Pair Shortest Path Algorithms:
Contrary to the single source shortest path algorithm, in this algorithm we determine
the shortest path between every possible pair of nodes.
1. Floyd-Warshall algorithm
1. Johnson's algorithm
Dijkstra’s Algorithm:
Dijkstra’s algorithm is a popular algorithms for solving many single-source shortest
path problems having non-negative edge weight in the graphs i.e., it is to find the shortest
distance between two vertices on a graph. It was conceived by Dutch computer scientist
Edsger W. Dijkstra in 1956.
The algorithm maintains a set of visited vertices and a set of unvisited vertices. It
starts at the source vertex and iteratively selects the unvisited vertex with the smallest
tentative distance from the source. It then visits the neighbours of this vertex and updates
their tentative distances if a shorter path is found. This process continues until the
destination vertex is reached, or all reachable vertices have been visited.
Need for Dijkstra’s Algorithm (Purpose and Use-Cases)
The need for Dijkstra’s algorithm arises in many applications where finding the
shortest path between two points is crucial.
For example, It can be used in the routing protocols for computer networks and also used
by map systems to find the shortest path between starting point and the Destination
Algorithm for Dijkstra’s Algorithm:
1. Mark the source node with a current distance of 0 and the rest with infinity.
2. Set the non-visited node with the smallest current distance as the current node.
3. For each neighbor, N of the current node adds the current distance of the adjacent
node with the weight of the edge connecting 0->1. If it is smaller than the current
distance of Node, set it as the new current distance of N.
4. Mark the current node 1 as visited.
5. Go to step 2 if there are any nodes are unvisited.
How does Dijkstra’s Algorithm works?
Let’s see how Dijkstra’s Algorithm works with an example given below:
Dijkstra’s Algorithm will generate the shortest path from Node 0 to all other Nodes in the
graph.
Consider the below graph:
The algorithm will generate the shortest path from node 0 to all the other nodes in the
graph.
For this graph, we will assume that the weight of the edges represents the distance
between two nodes.
As, we can see we have the shortest path from,
Node 0 to Node 1, from
Node 0 to Node 2, from
Node 0 to Node 3, from
Node 0 to Node 4, from
Node 0 to Node 6.
Initially we have a set of resources given below:
The Distance from the source node to itself is 0. In this example the source node is
0.
The distance from the source node to all other node is unknown so we mark all of
them as infinity.
Example: 0 -> 0, 1-> ∞,2-> ∞,3-> ∞,4-> ∞,5-> ∞,6-> ∞.
We’ll also have an array of unvisited elements that will keep track of unvisited or
unmarked Nodes.
Algorithm will complete when all the nodes marked as visited and the distance
between them added to the path. Unvisited Nodes: - 0 1 2 3 4 5 6.
Step 1: Start from Node 0 and mark Node as visited as you can check in below image
visited Node is marked red.
Step 2: Check for adjacent Nodes, Now we have to choices (Either choose Node1 with
distance 2 or either choose Node 2 with distance 6) and choose Node with minimum
distance. In this step Node 1 is Minimum distance adjacent Node, so marked it as visited
and add up the distance.
Distance: Node 0 -> Node 1 = 2
Step 3: Then Move Forward and check for adjacent Node which is Node 3, so marked it as
visited and add up the distance, Now the distance will be:
Distance: Node 0 -> Node 1 -> Node 3 = 2 + 5 = 7
Step 4: Again we have two choices for adjacent Nodes (Either we can choose Node 4 with
distance 10 or either we can choose Node 5 with distance 15) so choose Node with
minimum distance. In this step Node 4 is Minimum distance adjacent Node, so marked it as
visited and add up the distance.
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 = 2 + 5 + 10 = 17
Step 5: Again, Move Forward and check for adjacent Node which is Node 6, so marked it
as visited and add up the distance, Now the distance will be:
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 -> Node 6 = 2 + 5 + 10 + 2 = 19
So, the Shortest Distance from the Source Vertex is 19 which is optimal one