Graph Algarithms-Course Material
Graph Algarithms-Course Material
Introduction
• History of an algorithm can be traced back to the ancient in 50 AD
• An efficient method to find GCD was proposed by Euclid.
• With the arrival of computers, a variety of algorithms have been developed.
• An algorithm is a finite set of instructions to solve a given problem.
• Design and analysis of algorithms has been a key subject in any Computer Science
Curriculum.
• There has been a growing interest in the study of algorithms for graph theoretical
problems in the last few decades mainly because of numerous increasing applications
of graphs in real life, like biological networks, social networks, the Web, and the
Internet, which are commonly called complex networks which can be represented by
graphs.
• Study of graph algorithms is reported in various headings including algorithmic graph
theory, graphs and applications.
Algorithm
o Computer programming is the process that professionals use to write code that
instructs how a computer, application or software program performs. At its most
basic, computer programming is a set of instructions to facilitate specific actions.
o If you want to solve software project, there are two phases: Design and
Implementation. First, we have to design the program and latter implementation.
Algorithm Program
It was not depending on hardware and O.S Dependent on hardware and and
operating system
Every Experiment has its own inputs, definiteness, finiteness, effective ness and
output
For Example,
What are the inputs: Safety goggles and safety equipment, Beakers, Conical flasks,
Test tubes, Tongs, and Racks, Watch glasses, Vessels, Funnels.
• Further the experiment must be completed within finite time with a valid output.
• It does not make sense to have an algorithm that runs 2 days to estimate weather for
tomorrow since we know what it would be by then.
Algorithmic Approach
• English words and phrases can be used to express statements and processing steps.
• Words like read, write, compute and Set can be used for operations, computations
and assigning values to variables.
• Comparison operations are expressed as equal to, less than and greater than.
• Arithmetical operations are expressed using words like Add, Subtract, Multiply and
Divide Control structures are coded using the phrases like repeat, for, while, if, halt,
exit.
The Euclidean algorithm is an efficient method for computing the greatest common divisor
(GCD) of two integers (numbers), gcd(x,y)
Input and output statements are expressed using words read and write.
4. If X>Y then X ← 𝑋 − 𝑌
Exit
Classification of Algorithms
Depending on the strategy for solving a particular problem, algorithms are classified as
follows.
Iterative algorithms:
Example: Divide and conquer algorithms are used in searching and sorting problems.
Greedy Algorithms
Example:
Dijkstra's Algorithm
Back Tracking Algorithms:
• In back tracking algorithms all possible solutions are explored, until the end is
reached and then steps are tracked back.
Searching algorithms:
Sorting algorithms:
Compression algorithms:
Encoding algorithms
Geometrical algorithms:
Structures of an Algorithm
Three fundamental structures used in algorithms are the assignment, decision, and loops
Assignment
b←3
a ← b2
Here we assign an integer value of 3 to a variable b and then assign the square of b to a.
Decision
• Decisions are the key structures in algorithms as in daily life. We need to branch to
some part of the algorithm based on some condition.
• The following example uses if...then...else structure to determine the larger one of
two input numbers:
input a, b
else print b
end if
Loops
Yet another fundamental structure in algorithms is the loop structure to perform an action,
respectively. There are three main ways to perform loops in an algorithm as follows
for loops:
The for loops are commonly used when we know how many times the loop should
execute before we start with the loop. There is a loop variable and test condition.
• In the following example, i is the loop variable and it is incremented by 1, at the end
of each loop iteration and checked against the boundary value of 10. This simple loop
prints the squares of integers between 1 and 10.
for i=1 to 10
step 1 print i * i
end for
• while loops: In case we do not know how many times the loop will be executed,
while structure may be used.
• The following example illustrates the use of while loop, where the input Q for quitting
by the user and otherwise adds the two numbers given by the user.
• We do not know when the user may want to stop, so the use of while is appropriate
here.
input a, b
print a + b
input chr
end while
• This structure is used in similar situations to while loops when we do not know how
many times the loop will be executed. The main difference is that we do the test at
the end of the loop which means this loop is executed at least once whereas the
while loop may not be executed even once. We will write the previous example with
the repeat...until (or loop...until) structure with a clearly shorter code.
Repeat
input a, b
print a + b
Algorithm in next slide displays a procedure called Count that is called from the main
program with the parameter k. It displays all integers between 1 and k.
Running this algorithm will display 1, 1 2, 1 2 3, and 1 2 3 4 at the output calling the
procedure four times.
Chapter 1
Graphs have numerous applications including computer science, scientific computing,
chemistry, and sociology since they are simple yet effective to model real-life phenomenon.
A vertex of a graph represents some entity such as a person in a social network or a protein
in a biological network. An edge in such a network corresponds to a social interaction such
as friendship in a social network or a biochemical interaction between two proteins in the
cell.
Graph Algorithms
A sequential algorithm has a single flow of control and is executed sequentially. It accepts
input, works on the input, and provides an output.
For example, reading two integers, adding them and printing the sum is a sequential
algorithm that consists of three steps.
Let us assume the graph of Fig. given below represents a small network with vertices labeled
,v1, v2,..., v8- and having integers 1,…,13 associated with each edge.
The integer value for each edge may denote the cost of sending a message over that edge.
Commonly, the edge values are called the weights of edges. Let us assume our aim is to
design a sequential algorithm to find the edge with the maximum value. This may have some
practical usage, as we may need to find the highest cost edge in the network to avoid that
link while some communication or transportation is done.
Let us form a distance matrix D for this graph, which has entries d(i, j) showing the weights
of edges between vertices vi and v j as below:
If two vertices vi and v j are not connected, we insert a zero for that entry in D. We can have
a simple sequential algorithm that finds the largest value in each row of this matrix first and
then the maximum value of all these largest values in the second step as shown in Algorithm
1.1.
This algorithm requires 8
comparisons for each row for a total of 64 comparisons. It needs n2 comparisons in general
for a graph with n vertices.
Parallel graph algorithms aim for performance as all parallel algorithms. This way of
speeding up programs is needed especially for very large graphs representing complex
networks such as biological or social networks which consist of huge number of nodes and
edges. We have several processors working in parallel on the same problem and the results
are commonly gathered at a single processor for output. Parallel algorithms may synchronize
and communicate through shared memory, or they run as distributed memory algorithms
communicating by the transfer of messages only. The latter mode of communication is a
more common practice in parallel computing due to its versatility to realize in general
network architectures.
Let us reconsider the sequential algorithm in the previous section and attempt to parallelize
it using data partitioning. Since graph data is represented by the distance matrix, the first
thing to consider would be the partitioning of this matrix. Indeed, row-wise or column-wise
partitioning of a matrix representing a graph is commonly used in parallel graph algorithms.
Let us have a controlling processor we will call the supervisor or the root and two worker
processors to do the actual work. This mode of operation, sometimes called
supervisor/worker model, is also a common practice in the design of parallel algorithms.
Processors are commonly called processes to mean the actual processor may also be doing
some other work. We now have three processes p0, p1, and p2, and p0 is the supervisor.
The process p0 has the distance matrix initially, and it partitions and sends the first half of
the rows from 1 to 4 to p1 and 5 to 8 to p2 as shown below
In the distributed version of our sample maximum weight edge finding algorithm, we have
computational nodes of a computer network as the vertices of the graph, and our aim is that
each node in the network modeled by the graph should receive the largest weight edge of
the graph in the end.
In the following rounds, a node broadcasts the largest weight it has seen so far and after a
certain number of steps, the largest value will be propagated to all nodes of the graph as
shown in Algorithm 1.3. The number of steps is the diameter of the graph which is the
maximum number of edges between any two vertices
We now can see fundamental differences between parallel and distributed graph algorithms
using this example as follows.
• Parallel graph algorithms are needed mainly for the speedup they provide. There are a
number of processing elements that work in parallel which cooperate to finish an overall
task. The main relation between the number of processes and the size of the graph is that
we would prefer to use more processes for large graphs. We assume each processing
element can communicate with each other in general although there are some special
parallel computing architectures such as processors forming a cubic architecture of
communication as in the hypercube.
• In distributed graph algorithms, computational nodes are the vertices of the graph under
consideration and communicate with their neighbors only to solve a problem related to the
network represented by the graph. Note that the process number is the number of vertices
of the graph for these algorithms.
Algorithms for Large Graphs
Recent technical advancements in the last few decades have resulted in the availability of
data of very large networks. These networks are commonly called complex networks and
consist of tens of thousands of nodes and hundreds of thousands of links between the
nodes. One such type of network is the biological networks within the cell of living
organisms. A protein–protein interaction (PPI) network is a biological network formed with
interacting proteins outside the nucleus in the cell
Complexity of graph algorithms: A polynomial time algorithm has a complexity that can be
expressed by a polynomial function. There are very few polynomial Complexity of graph
algorithms: A polynomial time algorithm has a complexity that can be expressed by a
polynomial function. There are very few polynomial
Approximation algorithms:
Search for an approximation algorithm that finds a suboptimal solution rather than an
optimal one. In this case, we need to prove that the approximation algorithm always
provides a solution within an approximation ratio to the optimal solution. Various proof
techniques can be employed and there is no need to experiment with the approximation
algorithm other than for statistical purposes. Finding and proving approximation algorithms
are difficult for many graph problems.
Randomized algorithms:
These algorithms decide on the course of execution based on some random choice, for
example, selection of an edge at random. The output is presented typically as expected or
with high probability, meaning there is a chance, if even slightly, that the output may not be
correct. However, the randomized algorithms provide polynomial solutions to many difficult
graph problems
Heuristics:
In many cases, our only choice is the use of common-sense approaches called heuristics in
search of a solution. Choice of a heuristic is commonly pursued by intuition, and we need to
experiment the algorithm with the heuristic for a wide range of inputs to show it works
experimentally.
Performance:
Even with polynomial time graph algorithms, the size of the graph may restrict its use for
large graphs. Recent interest in large graphs representing large real-life networks demands
high-performance algorithms which are commonly realized by parallel computing. Biological
networks and social networks are examples of such networks. Therefore, there is a need for
efficient parallel algorithms to be implemented in these large graphs. However, some graph
problems are difficult to parallelize due to the structure of the procedures used.
• Distribution:
Several large real networks are distributed in a sense that each node of the network is an
autonomous computing element. The Internet, the Web, mobile ad hoc networks, and
wireless sensor networks are examples of such networks which can be termed as computer
networks in general. These networks can again be modeled conveniently by graphs.
However, the nodes of the network now actively participate in the execution of the graph
algorithm. This type of algorithm is termed distributed algorithms.
DFS Algorithm:
A standard DFS implementation puts each vertex of the graph into one of two categories:
• Visited
• Not Visited
The purpose of the algorithm is to mark each vertex as visited while avoiding cycles.
2. Take the top item of the stack and add it to the visited list.
3. Create a list of that vertex's adjacent nodes. Add the ones which aren't in the visited
list to the top of the stack.
d b
c
f e
t
Applications
Depth-first search
Informed Search:
Informed Search algorithms have information on the goal state which helps in more efficient
searching. This information is obtained by a function that estimates how close a state is to
the goal state.
Uninformed Search:
Uninformed search algorithms have no additional information on the goal node other than
the one provided in the problem definition. The plans to reach the goal state from the start
state differ only by the order and length of actions.
(1) The Breadth First Search (BFS) technique. Let G be a graph and let s, t be two specified
vertices of G. We will now describe a method of finding a path from s to t, if there is any,
which uses the least number of edges. Such a path, if it exists, is called a shortest path from
s to t. The method assigns labels 0,1,2,. . . to vertices of G and is called the Breadth First
Search (BFS for short) technique. It is given by the following Algorithm.
Step 2. Find all unlabeled vertices in G which are adjacent to vertices labeled i. If there are
no such vertices then t is not connected to s (by a path). If there are such vertices,
label them i + 1.
Stop.
Example:
https://favtutor.com/blogs/breadth-first-search-python
Breadth First Search
Breadth-First Search
The main idea of the breadth-first search (BFS) method is to visit all neighbors of a vertex
first before visiting other vertices, and hence the name. Starting from the source vertex s, we
first visit all neighbors of s in an arbitrary order. These neighbor vertices, N(s), all have a
distance of unity from the vertex s after the visit. We then visit neighbors of vertices in N(s)
which are labeled with distance of 2 to vertex s. This process continues until all vertices are
visited.
___________________________________________________________________
____________________________________________________________________
2: Output: D*n+ and Pre d*n+ distances and predecessors of vertices in BFS tree
4: D*v+←∞
5: Pred*v+ ←⊥
6: end for
8: Pred*s+ ← s
9: Q←s
15: Pred*u+ ← v
16: enque(Q, u)
17: end if
Applications
Breadth-first search
• Used to find available neighbor nodes in peer-to-peer networks such as Bit Torrent.
Dijkstra's Algorithm
Step 1. Set λ(S) = 0 and for all vertices v ≠ s, λ (V) =∞. Set T = V, the vertex set
Step 3. If u = t, stop.
Step 4. For every edge e = uv incident with u, if v belongs to T and λ ( v) > λ ( u )+w( e)
change the value of λ ( v) to λ ( u) + w( e) (i.e., given an edge e from an
"uncolored" vertex v to u, change λ (V) to min (λ (v), λ (u) + w(e)).
Step 5. Change T to T - ,u- and go to Step 2, (i.e., "color" u and then go back
to Step 2 to find an “uncolored" vertex with minimum label).
Kruskal’s Algorithm
Step 1. Choose e1, an edge of G, such that w( el) is as small as possible and el is not a loop.
Step 2. If edges el, e2,. .. , ei have been chosen, and then choose an edge ei+1, not already
Prims’s Algorithm
Step 2. Choose an edge el = v1 v2 of G such that v2 not equal to v1 and e1 has smallest
weight among the edges of G incident with v1
Step 3. If edges e1 e2, . . . , ei have been chosen involving end points v1 v2…….,vi+1
choose an edge vi+1= vj vk with vj belongs to , v1 v2…….,vi+1- and vk does not belongs
to , v1 v2…….,vi+1} such that ei+l has smallest weight among the edges of G with
precisely one end in , v1 v2…….,vi+1-.
Step 4. Stop after n - 1 edges have been chosen. Otherwise repeat Step 3.
Challenges in Graph Algorithms
o For example, assume an algorithm A to solve a graph problem P has time complexity
2n, n being the number of vertices in the graph. We can see that A may have poor
performance even for graphs with n > 20 vertices.
Performance:
o Even with polynomial time graph algorithms, the size of the graph may restrict its use
for large graphs. Recent interest in large graphs representing large real-life networks
demands high-performance algorithms which are commonly realized by parallel
computing.
Distribution:
• Several large real networks are distributed in a sense each node of the network is an
autonomous computing element.
• The Internet, the Web, mobile ad hoc networks, and wireless sensor networks are
examples of such networks which can be termed as computer networks in general.
These networks can again be modeled conveniently by graphs. However, the nodes
of the network now actively participate in the execution of the graph algorithm.
Applications
Bellman–Ford algorithm
• Used to find directions to travel from one location to another in mapping software
like Google maps or Apple maps.
DFS Algorithm:
It is an algorithm that searches a graph (or) tree data structure by starting at the root
node and exploring as far as possible down each branch before backtracking.
Note: DFS is a recursive algorithm that uses backtracking to exhaustively search all
nodes.
BFS Vs DFS
BFS is a network search method that investigates all surrounding nodes after starting
from root node.
DFS is an algorithm that starts with first node in the network and continue to explore
until it discovers the desired node (or) even a node with no children.
ntained in 𝑇 ′ . This process continues until all edges in the queue are processed.
This algorithm consists of the following steps: