DSA 5 Unit Merged Notes
Abstract Data Types (ADTs) - Stack ADT – Operations – Applications – Balancing Symbols –
Evaluating arithmetic expressions Infix to Postfix conversion – Queue ADT – Operations –
Circular Queue – Deque – Applications of Queues.
Stack Implementation:
1. Array Implementation
2. Linked List implementation
Array Implementation:
❖ A stack can be implemented using a one-dimensional array, which holds the elements of the stack.
❖ An array implementation of a stack can store only a fixed number of data values.
❖ In a stack, top is initially set to -1.
❖ Top is used to keep track of the index of the topmost element.
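As a minimal sketch (class and method names are ours, not from the notes), an array-based stack in Python could look like this:

class ArrayStack:
    def __init__(self, capacity):
        self.data = [None] * capacity    # fixed number of data values
        self.top = -1                    # -1 marks an empty stack

    def push(self, item):
        if self.top == len(self.data) - 1:
            raise OverflowError("stack is full")
        self.top += 1
        self.data[self.top] = item

    def pop(self):
        if self.top == -1:
            raise IndexError("stack is empty")
        item = self.data[self.top]
        self.top -= 1
        return item

s = ArrayStack(3)
s.push(10); s.push(20)
print(s.pop())   # 20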
Linked List implementation of Stack:
❖ Instead of using array, we can also use linked list to implement stack.
❖ A linked list allocates memory dynamically. However, the time complexity is the same in both
scenarios for all the operations, i.e., push, pop and peek.
❖ In linked list implementation of stack, the nodes are maintained non-contiguously in the
memory.
❖ Each node contains a pointer to its immediate successor node in the stack.
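A corresponding linked-list sketch (again with illustrative names of our own) pushes by making the new node point to the old top:

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None       # pointer to the immediate successor in the stack

class LinkedStack:
    def __init__(self):
        self.top = None

    def push(self, item):
        node = Node(item)
        node.next = self.top   # new node points to the old top
        self.top = node

    def pop(self):
        if self.top is None:
            raise IndexError("stack is empty")
        item = self.top.data
        self.top = self.top.next
        return item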
APPLICATIONS OF STACK :
o Evaluation of Arithmetic Expressions
o Balancing Symbols
o Backtracking
o Delimiter Checking
o Reverse a Data
o Processing Function Calls
A stack is a very effective data structure for evaluating arithmetic expressions in programming
languages. An arithmetic expression consists of operands and operators.
Backtracking
Backtracking is another application of the stack. It is a recursive algorithmic technique used for solving
optimization problems.
Delimiter Checking
A common application of the stack is delimiter checking, i.e., parsing that involves analyzing a source
program syntactically. It is also called parenthesis checking. Delimiters include parentheses ( ), curly
braces { }, square brackets [ ], and comment delimiters /* and */.
Every opening delimiter must match a closing delimiter.
Reverse a Data:
To reverse a given set of data, we need to reorder the data so that the first and last elements are
exchanged, the second and second-to-last elements are exchanged, and so on for all other elements.
o Reversing a string
o Converting Decimal to Binary
Stacks play an important role in programs that call several functions in succession.
Evaluation of an arithmetic expression requires two steps:
o First, convert the given expression into a special notation.
o Evaluate the expression in this new notation.
An arithmetic expression can be represented in three notations:
o Infix Notation
o Prefix Notation
o Postfix Notation
Infix Notation
The infix notation is a convenient way of writing an expression in which each operator is placed
between the operands. Infix expressions can be parenthesized or unparenthesized depending upon the
problem requirement.
Example: A + B, (C - D) etc.
All these expressions are in infix notation because the operator comes between the operands.
Prefix Notation
The prefix notation places the operator before the operands. This notation was introduced by the Polish
mathematician Jan Łukasiewicz and hence is often referred to as Polish notation.
Example: + A B, - C D
All these expressions are in prefix notation because the operator comes before the operands.
Postfix Notation
The postfix notation places the operator after the operands. This notation is just the reverse of Polish
notation and is also known as Reverse Polish notation.
Example: A B +, C D -
All these expressions are in postfix notation because the operator comes after the operands.
Stack is the ideal data structure to evaluate the postfix expression because the top element is always the
most recent operand. The next element on the Stack is the second most recent operand to be operated
on.
Before evaluating the postfix expression, the following conditions must be checked. If any one of the
conditions fails, the postfix expression is invalid.
o When an operator is encountered during scanning, the Stack must contain a pair of operands or
intermediate results previously calculated.
o When an expression has been completely evaluated, the Stack must contain exactly one value
Example:
def evaluate_postfix(postfix_expr):
    stack = []
    operators = set(['+', '-', '*', '/', '%', '^'])
    for elem in postfix_expr:
        if elem not in operators:
            stack.append(float(elem))    # operand: push onto the stack
        else:
            b = stack.pop()              # right operand (most recent)
            a = stack.pop()              # left operand
            if elem == '+':
                stack.append(a + b)
            elif elem == '-':
                stack.append(a - b)
            elif elem == '*':
                stack.append(a * b)
            elif elem == '/':
                stack.append(a / b)
            elif elem == '%':
                stack.append(a % b)
            elif elem == '^':
                stack.append(a ** b)
    return stack.pop()
postfix = '236*+'    # Infix: 2 + 3 * 6
print('{} \t: {}'.format(postfix, evaluate_postfix(postfix)))
postfix = '23*6+'    # Infix: 2 * 3 + 6
print('{} \t: {}'.format(postfix, evaluate_postfix(postfix)))
postfix = '23*42/+'  # Infix: 2 * 3 + 4 / 2
print('{} : {}'.format(postfix, evaluate_postfix(postfix)))
OUTPUT:
236*+ : 20.0
23*6+ : 12.0
23*42/+ : 8.0
BALANCING SYMBOLS:
Suppose we wish to check a text string T to see if the bracket characters it contains are well
formed. We use the term brackets here to include pairs of distinct opening and closing characters such
as { }, < >, and ( ). The brackets are well formed if the number of openers is the same as the number of
closers and, reading the string from left to right, the number of closers never exceeds the number of
openers.
• A stack can be used to verify whether a program contains balanced braces
▪ An example of balanced braces
abc{defg{ijk}{l{mn}}op}qr
▪ An example of unbalanced braces
abc{def}}{ghij{kl}m
abc{def}{ghij{kl}m
Each time you encounter a “}”, it matches an already encountered “{”. When we reach the end
of the string, every “{” has been matched.
Figure 1.4 Balancing parenthesis
Algorithm for Balancing Symbols:
Step 1: Make an empty stack.
Step 2: Read characters until end of file.
Step 3: If (Character == '(')
Step 4: Push '(' onto the stack.
Step 5: Else if (Character == ')')
Step 6: Pop a '(' from the stack; if the stack is empty, report an error.
Step 7: Else
Step 8: Print the error report.
At end of file, if the stack is not empty, report an error.
Thus, for example,
()()() is well formed.
(()())) is not well formed.
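The check above translates directly into code. A small Python sketch, generalized to all three bracket pairs (the function name is ours):

def is_balanced(text):
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in text:
        if ch in '([{':
            stack.append(ch)                       # push every opener
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False                       # closer without a matching opener
    return not stack                               # leftover openers mean imbalance

print(is_balanced('()()()'))    # True
print(is_balanced('(()()))'))   # False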
QUEUE:
❖ A queue is a linear data structure where an element is inserted at one end, called REAR, and
deleted from the other end, called FRONT.
❖ Front points to the beginning of the queue and Rear points to the end of the queue.
❖ Queue follows the FIFO (First - In - First Out) structure.
❖ According to its FIFO structure, element inserted first will also be removed first.
❖ The enqueue() and dequeue() are two important operations used in a queue.
Operations on Queue:
Following are the basic operations performed on a Queue.
Operation    Description
enqueue()    Adds an element into the queue.
dequeue()    Removes an element from the queue.
Queue Implementation:
There are several ways to implement queue. They are
1. Array implementation
2. Linked List implementation
Array implementation:
Array is the easiest way to implement a queue.
Array Name: Array_queue[10]
Input Values: 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
Step 1: Initially front = rear = -1 (the queue is empty).
Steps 2-11: [Figures omitted: the values 11 through 20 are enqueued one by one, with rear advancing from index 0 to 9.]
Queue Overflow:
When rear reaches the last index of the array (here 9) and another element is enqueued, the queue overflows.
DEQUEUE:
[Figures omitted: Steps 1 through 9 show successive dequeue operations, with front advancing by one index after each deletion.]
DEQUE (Double-Ended Queue):
A deque is a linear data structure in which insertion and deletion can be performed at both ends.
Types of deque
In an input-restricted deque, insertion can be performed at only one end, while deletion can be
performed from both ends.
In an output-restricted deque, deletion can be performed at only one end, while insertion can be
performed from both ends.
Operations performed on deque
o Insertion at front
o Insertion at rear
o Deletion at front
o Deletion at rear
In addition to the above operations, the following operations are also supported in a deque.
Insertion at the front end
In this operation, the element is inserted at the front end of the queue. Before implementing the
operation, we first have to check whether the queue is full or not. If the queue is not full, then the
element can be inserted at the front end using the conditions below:
o If the queue is empty, both rear and front are initialized with 0. Both will then point to the
first element.
o Otherwise, check the position of front: if front is less than 1 (front < 1), reinitialize it by
front = n - 1, i.e., the last index of the array; else decrement front by 1.
Insertion at the rear end
In this operation, the element is inserted from the rear end of the queue. Before implementing the
operation, we first have to check again whether the queue is full or not. If the queue is not full, then the
element can be inserted from the rear end by using the below conditions -
o If the queue is empty, both rear and front are initialized with 0. Now, both will point to the first
element.
o Otherwise, increment the rear by 1. If the rear is at last index (or size - 1), then instead of increasing
it by 1, we have to make it equal to 0.
Deletion at the front end
In this operation, the element is deleted from the front end of the queue. Before implementing the
operation, we first have to check whether the queue is empty or not.
If the queue is empty, i.e., front = -1, it is the underflow condition, and we cannot perform the deletion.
If the queue is not empty, then the element can be deleted from the front end using the conditions below:
If the deque has only one element, set rear = -1 and front = -1.
Else if front is at the end (that means front = size - 1), set front = 0.
Else increment front by 1.
Deletion at the rear end
In this operation, the element is deleted from the rear end of the queue. Before implementing the
operation, we first have to check whether the queue is empty or not.
If the queue is empty, i.e., front = -1, it is the underflow condition, and we cannot perform the deletion.
If the deque has only one element, set rear = -1 and front = -1. Else if rear = 0, set rear = n - 1;
otherwise decrement rear by 1.
Check empty
This operation is performed to check whether the deque is empty or not. If front = -1, it means that the
deque is empty.
Check full
This operation is performed to check whether the deque is full or not. If front = rear + 1, or front = 0 and
rear = n - 1 it means that the deque is full.
The time complexity of all of the above operations of the deque is O(1), i.e., constant.
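A minimal Python sketch of an array-based deque implementing exactly the front-end conditions above (the class name and error handling are our own choices):

class ArrayDeque:
    def __init__(self, n):
        self.a, self.n = [None] * n, n
        self.front = self.rear = -1

    def is_full(self):
        return (self.front == 0 and self.rear == self.n - 1) or self.front == self.rear + 1

    def insert_front(self, item):
        if self.is_full():
            raise OverflowError("deque is full")
        if self.front == -1:               # empty: first element
            self.front = self.rear = 0
        elif self.front < 1:               # front at index 0: wrap to the last index
            self.front = self.n - 1
        else:
            self.front -= 1
        self.a[self.front] = item

    def delete_front(self):
        if self.front == -1:               # underflow
            raise IndexError("deque is empty")
        item = self.a[self.front]
        if self.front == self.rear:        # only one element
            self.front = self.rear = -1
        elif self.front == self.n - 1:     # front at the end: wrap to index 0
            self.front = 0
        else:
            self.front += 1
        return item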
APPLICATIONS OF DEQUE
o Deque can be used as both stack and queue, as it supports both operations.
o Deque can be used as a palindrome checker: if reading the string from both ends gives the same
sequence, the string is a palindrome.
o Multiprocessor Scheduling: When multiple processes in a system are being carried out by
multiple processors like CPU, or Core, at that time that system uses a multiprocessor scheduling
algorithm. All the processors use our deque data structure to store all the different threads of
processes.
CIRCULAR QUEUE:
❖ In a circular queue, enqueue of a new element is performed at the very first location of
the queue if the last location of the queue is full.
❖ In such a case, the first element comes after the last element.
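A short Python sketch of a circular queue using modulo arithmetic for the wraparound (an equivalent alternative to explicit index checks; names are ours):

class CircularQueue:
    def __init__(self, n):
        self.a, self.n = [None] * n, n
        self.front = self.rear = -1

    def enqueue(self, item):
        if (self.rear + 1) % self.n == self.front:
            raise OverflowError("queue is full")
        if self.front == -1:                    # first element
            self.front = 0
        self.rear = (self.rear + 1) % self.n    # wraps to location 0 past the end
        self.a[self.rear] = item

    def dequeue(self):
        if self.front == -1:
            raise IndexError("queue is empty")
        item = self.a[self.front]
        if self.front == self.rear:             # queue becomes empty again
            self.front = self.rear = -1
        else:
            self.front = (self.front + 1) % self.n
        return item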
APPLICATIONS OF QUEUES
1. Task Scheduling: Queues can be used to schedule tasks based on priority or the order in which
they were received.
2. Resource Allocation: Queues can be used to manage and allocate resources, such as printers or
CPU processing time.
3. Batch Processing: Queues can be used to handle batch processing jobs, such as data analysis or
image rendering.
4. Message Buffering: Queues can be used to buffer messages in communication systems, such as
message queues in messaging systems or buffers in computer networks.
5. Event Handling: Queues can be used to handle events in event-driven systems, such as GUI
applications or simulation systems.
6. Traffic Management: Queues can be used to manage traffic flow in transportation systems,
such as airport control systems or road networks.
7. Operating systems: Operating systems often use queues to manage processes and resources.
For example, a process scheduler might use a queue to manage the order in which processes are
executed.
UNIT II
Tree – Binary Tree ADT – Tree Traversal Algorithms – Search Trees – Binary Search Trees – AVL Trees
(Insertion, Deletion) – Splay Trees (Insertion, Deletion, Searching) – Red-Black Trees.
A data structure can be defined as a collection of elements together with all the possible operations
required for that set of elements. Formally, a data structure is a set of domains D, a set of functions F
and a set of axioms A; the triple (D, F, A) denotes the data structure d.
The non-linear data structure is the kind of data structure in which the data may be arranged in
hierarchical fashion.
TREES:
A tree is a non-linear data structure that is used to represent hierarchical relationships
between individual data items. A tree is an ideal data structure for representing
hierarchical data. A tree can be theoretically defined as a finite set of one or more data
items (or nodes) such that:
1. There is a special node called the root of the tree.
2. The remaining nodes (or data items) are partitioned into a number of
mutually exclusive (i.e., disjoint) subsets, each of which is itself a tree,
called subtrees.
Array contents (positions 1-7): 50 30 40 10 20 -1 20
Fig 1.3: Array Representation of Binary Tree
The root element is always stored in position 1. The left child of node i is stored in
position 2i and the right child of node i is stored in position 2i + 1. Hence the following
formulae can be used to identify the parent, left child and right child of a particular
node.
Parent(i) = i / 2, if i ≠ 1. If i = 1 then i is the root node and the root does not have a parent.
The empty positions in the tree where no node is connected are represented in the array using -1,
indicating the absence of a node. Using the formula, we can see that for node 3, the parent is 3 / 2 = 1.
Referring to the array locations, we find that 50 is the parent of 40. The left child of node 3 is 2 * 3 =
6. But position 6 contains -1, indicating that the left child does not exist for node 3. Hence
40 does not have a left child. The right child of node 3 is 2 * 3 + 1 = 7. Position 7 in the array
contains 20. Hence, 20 is the right child of 40.
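These index formulae translate directly into code; a small sketch using the example array (the -1 sentinel follows the convention above):

def parent(i):           # valid for i != 1 (position 1 is the root)
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

tree = [None, 50, 30, 40, 10, 20, -1, 20]   # index 0 unused; -1 marks an absent node
print(tree[parent(3)], tree[right(3)])       # 50 20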
Preorder Traversal:
(1) Process the root R.
(2) Traverse the left subtree of R in preorder.
(3) Traverse the right subtree of R in preorder.
In order Traversal:
(1) Traverse the left subtree of R in in order.
(2) Process the root R.
(3) Traverse the right subtree of R in in order.
Post order Traversal:
(1) Traverse the left subtree of R in post order.
(2) Traverse the right subtree of R in post order.
(3) Process the root R.
Observe that each algorithm contains the same three steps, and that the left subtree of R
is always traversed before the right subtree. The difference between the algorithms is
the time at which the root R is processed. The three algorithms are sometimes called,
respectively, the node-left-right (NLR) traversal, the left-node-right (LNR) traversal
and the left-right-node (LRN) traversal. Now we can present the detailed algorithm for
these traversal methods in both recursive method and iterative method.
In the preorder traversal the node element is visited first, then the left subtree of the
node, and then the right subtree of the node. Consider the following case
where we have 6 nodes in the tree: A, B, C, D, E, F. The traversal always starts from the
root of the tree. The node A is the root and hence it is visited first. The value at this
node is processed. The processing can be doing some computation over it or just
printing its value. Now we check if there exists any left child for this node; if so, apply
the preorder procedure on the left subtree. Then check if there is any right subtree for
the node A; the preorder procedure is applied on the right subtree.
Since there exists a left subtree for node A, B is now considered as the root of the left
subtree of A and preorder procedure is applied. Hence, we find that B is processed next
and then it is checked if B has a left subtree. This recursive method is continued until
all the nodes are visited.
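A minimal recursive preorder sketch in Python, assuming nodes with data, left and right fields (the tree shape below is one possible arrangement of the six nodes):

class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def preorder(node):
    if node is None:
        return
    print(node.data, end=" ")   # process the root first
    preorder(node.left)         # then the whole left subtree
    preorder(node.right)        # then the whole right subtree

root = Node('A', Node('B', Node('D'), Node('E')), Node('C', Node('F')))
preorder(root)   # A B D E C F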
In order Traversal
In the in order traversal method, the left subtree of the current node is visited
first, then the current node is processed, and at last the right subtree of the current
node is visited. In the following example, the traversal starts with the root of the binary
tree. The node A is the root and it is checked if it has a left subtree. Then the inorder
traversal procedure is applied on the left subtree of node A. Now we find that node
D does not have a left subtree. Hence the node D is processed and then it is checked if
there is a right subtree for node D. Since there is no right subtree, control returns
back to the previous function, which was applied on B. Since the left subtree of B is already visited,
B is now processed. It is then checked if B has a right subtree; if so, apply the inorder
traversal method on the right subtree of node B. This recursive procedure is followed
till all the nodes are visited.
Fig 1.6 : In order Traversal
Algorithm
INORDER( ROOT )
Temp = ROOT
If temp = NULL
    Return
End if
If left(temp) ≠ NULL
    INORDER(left(temp))
End if
Print info(temp)
If right(temp) ≠ NULL
    INORDER(right(temp))
End if
End INORDER
Post order Traversal
In the post order traversal method, the left subtree is visited first, then the right
subtree and at last the current node is processed. In the following example, A is the root
node. Since A has a left subtree, the post order traversal method is applied recursively
on the left subtree of A. When the left subtree of A is completely processed, the post
order traversal method is recursively applied on the right subtree of node A. When the right
subtree is completely processed, the current node A is processed.
Fig 1.7: Post order Traversal
ALGORITHM:
POSTORDER( ROOT )
Temp = ROOT
If temp = NULL
    Return
End if
If left(temp) ≠ NULL
    POSTORDER(left(temp))
End if
If right(temp) ≠ NULL
    POSTORDER(right(temp))
End if
Print info(temp)
End POSTORDER
Algorithm
DELETE( ROOT, k )
SEARCH( ROOT, k )
If Loc = NULL
    Print "Item not found"
    Return
End if
If right(Loc) ≠ NULL and left(Loc) ≠ NULL
    CASEB(Loc, par)
Else
    CASEA(Loc, par)
End if
End DELETE
Case A:
The search operation is performed for the key value that has to be deleted. The search
operation returns the address of the node to be deleted in the pointer LOC and its
parent's address in a pointer PAR. If the node to be deleted has no children,
it is checked whether the node pointed to by LOC is the left child or the
right child of PAR. If it is the left child of PAR, then left of PAR is made NULL; else
right of PAR is made NULL.
The sequence of steps followed for deleting 20 from the tree is as shown.
Case B:
CASEB( Loc, par )
Temp = right(Loc)
Save = Loc
While left(temp) ≠ NULL
    Save = temp
    Temp = left(temp)
End while
Suc = temp
Parsuc = save
CASEA(suc, parsuc)
If par ≠ NULL
    If Loc = left(par)
        Left(par) = suc
    Else
        Right(par) = suc
    End if
Else
    ROOT = suc
End if
Left(suc) = left(Loc)
Right(suc) = right(Loc)
End CASEB
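The successor search at the start of Case B can be sketched in Python as follows (assuming nodes with left and right fields; names are ours):

def inorder_successor(loc):
    # returns (successor, parent of successor) when loc has a right child
    save, temp = loc, loc.right
    while temp.left is not None:
        save = temp
        temp = temp.left
    return temp, save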
HEIGHT BALANCED TREES ( AVL TREES )
The Height balanced trees were developed by researchers Adelson-Velskii and Landis.
Hence these trees are also called AVL trees. Height balancing attempts to maintain the
balance factor of the nodes within limit.
Height of the tree: Height of a tree is the number of nodes visited in traversing a
branch that leads to a leaf node at the deepest level of the tree.
Balance factor: The balance factor of a node is defined to be the difference
between the height of the node’s left sub tree and the height of the node’s right sub
tree.
Consider the following tree. The left height of the tree is 5, because there are 5 nodes
(45, 40, 35, 37 and 36) visited in traversing the branch that leads to a leaf node at the
deepest level of this tree.
Balance factor = height of left subtree – height of the right subtree
In the following tree the balance factor for each and every node is calculated and shown.
For example, the balance factor of node 35 is (0 – 2 ) = - 2.
The tree which is shown below is a binary search tree. The purpose of going for a binary
search tree is to make the searching efficient. But when the elements are added to the
binary search tree in such a way that one side of the tree becomes heavier, then the
searching becomes inefficient. The very purpose of going for a binary search tree is not
served. Hence we try to adjust this unbalanced tree to have nodes equally distributed on
both sides. This is achieved by rotating the tree using standard algorithms called the
AVL rotations. After applying AVL rotation, the tree becomes balanced and is called
the AVL tree or the height balanced tree.
Left-of-Left Rotation:
LEFT-OF-LEFT(pivot)
P = left(pivot)
Q = right(P)
Root = P
Right(P) = pivot
Left(pivot) = Q
End LEFT-OF-LEFT
Right-of- Right Rotation:
In this case, the pivot element is fixed as before. The new node is found to be added as the right child
to the right subtree of the pivot element. The first two steps in the algorithm sets the pointer P and Q
to the correct positions. The next two steps rotate the tree to balance it. The last two steps calculate
the new balance factor of the nodes
RIGHT-OF-RIGHT(pivot)
P = right(pivot)
Q = left(P)
Root = P
Left(P) = pivot
Right(pivot) = Q
End RIGHT-OF-RIGHT
Right-of-Left Rotation:
In this following tree, a new node 19 is added. This is added as the right child to the left
subtree of the pivot node. The node 20 fixed as the pivot node, as it disturbs the balance
of the tree. In the first two steps the pointers P and Q are positioned. In the next four
steps, tree is rotated. In the remaining steps, the new balance factors are calculated.
Fig 2.7: Right-of-Left Rotation
RIGHT-OF-LEFT(pivot)
P = left(pivot)
Q = right(P)
Left(pivot) = right(Q)
Right(P) = left(Q)
Root = Q
Left(Q) = P
Right(Q) = pivot
End RIGHT-OF-LEFT
Left-of-Right:
In the following tree, a new node 21 is added. The tree becomes unbalanced and
the node 20 is the node which has a deviated balance factor and hence fixed as the pivot
node. In the first two steps of the algorithm, the pointers P and Q are positioned. In the
next 4 steps the tree is rotated to make it balanced. The remaining steps calculate the
new balance factors for the nodes in the tree.
Fig 2.8: Left-of-Right Rotation
LEFT-OF-RIGHT(pivot)
P = right(pivot)
Q = left(P)
Right(pivot) = left(Q)
Left(P) = right(Q)
Root = Q
Left(Q) = pivot
Right(Q) = P
End LEFT-OF-RIGHT
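The two single rotations can be sketched in Python as follows (assuming nodes with left and right fields; balance-factor bookkeeping is omitted):

def left_of_left(pivot):
    # single right rotation (new node in the left subtree of the left child)
    p = pivot.left
    pivot.left = p.right     # Q, the right subtree of P, moves under the pivot
    p.right = pivot
    return p                 # P becomes the new root of this subtree

def right_of_right(pivot):
    # single left rotation (mirror image of the above)
    p = pivot.right
    pivot.right = p.left
    p.left = pivot
    return p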
Red-Black Tree
STRUCTURAL PROPERTIES
A red-black tree is a binary search tree whose height is logarithmic in the number of keys
stored.
1. Every node is colored red or black. (Colors are only examined during insertion and deletion.)
2. The root is black.
3. If a node is red, then both of its children are black.
4. Every simple path from a child of node X to a leaf has the same number of black nodes.
Example:
[Figure omitted: a red-black tree on the keys 10 through 170, with each node's color and the count of black nodes on paths below it annotated.]
Observations:
2. If a node X is not a leaf and its sibling is a leaf, then X must be red.
3. There may be many ways to color a binary search tree to make it a red-black tree.
4. If the root is colored red, then it may be switched to black without violating structural properties.
ROTATIONS
Rotation is the technique for rebalancing in most balanced binary search tree schemes. A rotation takes O(1) time.
[Figure omitted: single left and right rotations about nodes A, B, C with subtrees a, b, c, d.]
INSERTION
The new key is first inserted as in an ordinary binary search tree and the new node is colored red. This may violate structural property 3, leading to three rebalancing cases, along with symmetric versions.
Case 1: [Figure omitted.]
Case 2: [Figure omitted.]
Case 3: [Figure omitted.]
Example 1:
[Figures omitted: starting from a red-black tree on the keys 10, 20, 30, 40, 50, 60, 70, the keys 15, 13, 75 and 14 are inserted one by one, applying the rebalancing cases; after the final insertion the root is reset to black.]
Example 2:
[Figures omitted: the key 75 is inserted into a larger red-black tree; case 1 recoloring is applied as the violation moves up the tree, followed by a case 3 rotation.]
DELETION
The key is first deleted as in an ordinary binary search tree. What happens next depends on the
color of the deleted node:
Red: done.
Black: set a temporary "double black" pointer (x) at the sentinel, and determine which of four
rebalancing cases applies.
If the node has two children, delete its successor using the appropriate one of the previous two cases.
Case 1: [Figure omitted.]
Case 2: [Figure omitted.]
Case 3: [Figure omitted.]
Case 4: [Figure omitted.]
[Figures omitted: a worked example deleting keys from a red-black tree, applying the four rebalancing cases; deletions of 60, 70, 10, 40, 120 and 100 are traced step by step.]
If x reaches the root, then we are done; this is the only place in the tree where that happens.
SPLAY TREE
A splay tree is a binary search tree in which restructuring is done using a scheme called splay. The splay is
a heuristic method which moves a given vertex v to the root of the splay tree using a sequence of
rotations.
Introduction
Splay Tree is not strictly balanced like the AVL Tree but is intended to work faster for cache applications.
Splaying is an additional operation carried out during the insert, delete, and search operations. The
recently accessed node becomes the root and the tree changes for even read-only search operations.
• Roughly balanced: Rotations are used to bring the recently accessed node to the root. This helps
the tree be roughly balanced.
• Zig Rotation: Right rotation.
• Zag Rotation: Left rotation.
When we use BSTs for searching the data, the time complexity in an average case is O(log2N). This is
because the average case height of a binary tree is O(log2N). However, the worst case can be a left or a
right-skewed tree i.e. a linear tree. Then the worst-case time complexity of the operations like searching,
insertion, and deletion becomes O(N).
However, what if the trees are made self-balancing such that they never form skewed structures and the
height always remains O(log2N)? This is done in AVL trees and Red-Black trees. These are self-balancing
BSTs; hence, the worst-case time complexity of searching, insertion and deletion is O(log2N).
However, the question is, can we do better than this? The answer is yes. In some practical scenarios, we
can do better than AVL and Red-Black Trees. How? Using splay trees.
The main idea of splay trees is based on the “most frequently used elements”. A splay tree
supports all the operations of a normal binary search tree, i.e., searching, insertion,
deletion, etc. However, it also includes one special operation that is always performed,
called splaying. The idea is to bring the more frequently used elements closer to the root of the BST so
that, if they are searched again, they can be found more quickly than the less frequently used
elements.
Let us say we want to search for 8. So, we start from the root node. Since the value at the root node is 10
and we are searching for 8 i.e. a smaller element than the current node, we will move toward the left. So,
we go to the next node and we found 8.
This is a normal search operation in BST. However, we will do a little more than this in the splay tree. We
will perform splaying i.e. 8 will now become the root node of the tree. How? We can do this with some
rotations. There are 6 different types of rotations that can be performed in a splay tree. These are as
follows.
• Zig Rotation (a single right rotation)
• Zag Rotation (a single left rotation)
• Zig Zig Rotation (two right rotations)
• Zag Zag Rotation (two left rotations)
• Zig Zag Rotation (Right-Left Rotation, i.e., first do Zig then Zag)
• Zag Zig Rotation (Left-Right Rotation, i.e., first do Zag then Zig)
Zig Rotation (Right Rotation)
In this rotation, we move every node one position towards its right. This rotation is used when
the element that we are searching for is either present in the root node of the BST or is present in the
left child of the root node.
Consider that we are searching for node 9. This means that we are searching for the left child of the root
node. So, we can perform zig rotation.
So, the steps for searching node 9 are the same as we perform the search in BST and have already been
discussed. Now, since this is the zig rotation i.e. we need to rotate each element 1 position to its right. This
rotation is shown below.
So, every element has to go one position to its right. We can imagine every element moving one
position to its right: 9 will become the root node, 11 will be its right child, and the left
child will be 7. What about node 10?
So, node 10 is the right child of node 9 before rotation. So, after rotation, it will remain in the right subtree
of node 9. However, it will now become the left child of node 11 as shown below.
Zig Zig Rotation
In this rotation, 2 right rotations are performed one after another. This is used when the
element that we are searching for is neither the root node nor the left child of the root node but is present
below in the left subtree of the root node. This means that the node has both a parent and a grandparent.
Consider the tree shown below.
So, if we are searching for node 3, we find it at the last level. Now, we will perform the 1st right rotation.
After one rotation, the tree is shown below. Now, we perform the second rotation.
So, after 2 right rotations, node 3 is now the root node of the tree.
Zag Rotation (Left Rotation)
The Zag rotation or the left rotation is performed when the element that we are searching for is either the
root node or the right child of the root node.
So, consider that we are searching for node 13. Since it is the right child of the root node, we perform the
zag rotation. Every element will rotate and reach one position to its current left. This is shown below.
So, since after rotation, 14 has to become the right child of 13 and 12 was present as its left child, 12 will
remain in the left subtree but becomes the right child of node 11. Node 13 has become the root node now.
Zag Zag Rotation
Here, we perform 2 left rotations. Again, it is done when we are searching for an element that is not the root
or the right child of the root but is present in the right subtree and has both a parent and a grandparent. So,
consider the tree shown below.
If we are searching 7, the result of the zag zag rotation is shown below.
So, node 7 has now become the root node of the tree.
Zig Zag Rotation
This is a zig (right) rotation followed by a zag (left) rotation. The zig-zag
rotation is used when the parent node and the grandparent node of a node are in an LR or RL relationship
with each other. Consider the tree shown below.
If we want to search 5, then the parent and grandparent are in an RL relationship with each other. So, we
first perform the Right rotation i.e. Zig rotation about node 6, and then the left rotation i.e. Zag rotation
about the root node i.e. node 4, thus completing the Zig Zag rotation. This is shown below.
Finally, we can see that 5 is the root node of the tree. So, this is Zig Zag rotation.
Zag Zig Rotation
This rotation is the same as the zig-zag rotation, except that the left rotation happens first
and then the right rotation. Consider the tree shown below.
So, here if we want to search 5, the zag zig rotation is shown below.
So, we can see that node 5 is at the root of the tree. This is Zag Zig Rotation.
Now that we have studied splay trees, the rotations and other BST operations of a splay tree can be
implemented as in the sketch below.
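Since the original program is not reproduced in these notes, the following is a minimal Python sketch of the splay operation in its standard recursive form (the Node class and all names are ours; insertion and deletion are omitted):

class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None

def rotate_right(p):                 # zig
    q = p.left
    p.left, q.right = q.right, p
    return q

def rotate_left(p):                  # zag
    q = p.right
    p.right, q.left = q.left, p
    return q

def splay(root, key):
    # brings key (or the last node on its search path) to the root
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                          # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                        # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    else:
        if root.right is None:
            return root
        if key > root.right.key:                         # zag-zag
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        elif key < root.right.key:                       # zag-zig
            root.right.left = splay(root.right.left, key)
            if root.right.left is not None:
                root.right = rotate_right(root.right)
        return root if root.right is None else rotate_left(root)

root = Node(10); root.left = Node(8)   # searching 8 splays it to the root
root = splay(root, 8)
print(root.key)   # 8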
UNIT - III DIVIDE AND CONQUER STRATEGY AND GREEDY STRATEGY
Divide and Conquer Strategy: Quick Sort - Multiplication of large integers and Strassen's Matrix
Multiplication. Greedy Technique: Prim's Algorithm - Kruskal's Algorithm - Dijkstra's
Algorithm - Huffman Trees and Codes.
Divide and Conquer strategy:
Divide:
❖ Divide the array into two sub arrays by picking any key value in the array called pivot
element.
❖ Each element in the left sub array is less than or equal to pivot element.
Each element in the right sub-array is greater than the pivot element.
Conquer:
❖ Recursively divide the sub-arrays until each sub-array contains only one element.
Combine:
❖ Combine all the sorted elements in a group to form a list of sorted elements.
The following are some standard algorithms that follow Divide and Conquer algorithm.
1. Quicksort is a sorting algorithm. The algorithm picks a pivot element and rearranges the
array elements so that all elements smaller than the picked pivot element move to the
left side of the pivot, and all greater elements move to the right side. Finally, the
algorithm recursively sorts the sub-arrays on the left and right of the pivot element.
2. Merge Sort is also a sorting algorithm. The algorithm divides the array into two halves;
recursively sorts them, and finally merges the two sorted halves.
QUICKSORT:
Quick sort is a sorting algorithm that uses the divide and conquer
methodology. In this method, division is carried out dynamically.
SORTING:
Sorting is a mechanism for arranging unsorted elements in either ascending or descending
order.
There are two kinds of sorting.
1. Internal Sort – utilizes internal memory.
2. External Sort – utilizes external memory or secondary storage devices.
Algorithm:
Step (1): Take the first element as the 'pivot' element.
Step (2): Set the low index of the array to 'i' and the high index to 'j'.
Step (3): If (A[i] < pivot), increment 'i'.
Step (4): If (A[j] > pivot), decrement 'j'.
Step (5): If 'i' cannot be incremented and 'j' cannot be decremented, then swap(A[i], A[j]).
Step (6): If 'i' and 'j' have crossed and (A[j] < pivot), then swap(A[j], pivot).
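A runnable Python sketch of this partitioning scheme (variable names are ours):

def quick_sort(a, low, high):
    if low >= high:
        return
    pivot, i, j = a[low], low + 1, high
    while True:
        while i <= j and a[i] < pivot:   # Step 3: move i right
            i += 1
        while i <= j and a[j] > pivot:   # Step 4: move j left
            j -= 1
        if i > j:                        # Step 6: pointers have crossed
            break
        a[i], a[j] = a[j], a[i]          # Step 5: swap and keep scanning
        i += 1
        j -= 1
    a[low], a[j] = a[j], a[low]          # place the pivot at its final position
    quick_sort(a, low, j - 1)
    quick_sort(a, j + 1, high)

data = [50, 30, 10, 90, 80, 20, 40, 70]
quick_sort(data, 0, len(data) - 1)
print(data)   # [10, 20, 30, 40, 50, 70, 80, 90]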
Example:
Consider the list of elements – 50, 30, 10, 90, 80, 20, 40, 70.
Step:1
i j
50 30 10 90 80 20 40 70
pivot
Step:2
i j
50 30 10 90 80 20 40 70
pivot
We will increment 'i' if A[i] < pivot; we continue to increment 'i' until the element at A[i] is
greater than the pivot.
Step:3
i j
50 30 10 90 80 20 40 70
pivot
Increment 'i' as A[i] < pivot
Step:4
i j
50 30 10 90 80 20 40 70
pivot
A[i] > pivot, so stop incrementing 'i'
Step:5
A[j] > pivot, so decrement 'j'
i j
50 30 10 90 80 20 40 70
pivot
Step:6
A[j] < pivot, so stop decrementing 'j'
i j
50 30 10 90 80 20 40 70
pivot
Step:7
'i' cannot increment and 'j' cannot decrement, so swap(A[i], A[j])
i j
50 30 10 40 80 20 90 70
pivot
Step:8
A[i] < pivot => increment 'i'
i j
50 30 10 40 80 20 90 70
pivot
Step:9
A[i] > pivot => stop incrementing 'i'
i j
50 30 10 40 80 20 90 70
pivot
Step:10
A[j] > pivot => decrement 'j'
i j
50 30 10 40 80 20 90 70
pivot
Step:11
A[j] < pivot => stop decrementing 'j'
i j
50 30 10 40 80 20 90 70
pivot
Step:12
'i' cannot increment and 'j' cannot decrement, so swap(A[i], A[j])
i j
50 30 10 40 20 80 90 70
pivot
Step:13
A[i] < pivot => increment 'i'
i, j
50 30 10 40 20 80 90 70
pivot
Step:14
A[i] > pivot => stop incrementing 'i'
i, j
50 30 10 40 20 80 90 70
pivot
Step:15
A[j] > pivot => decrement 'j'
j i
50 30 10 40 20 80 90 70
pivot
Step:16
A[j] < pivot => stop decrementing 'j'
j i
50 30 10 40 20 80 90 70
pivot
Step:17
'i' and 'j' have crossed and A[j] < pivot => swap(A[j], pivot)
i j
20 30 10 40 50 80 90 70
pivot
Left sub-array                Right sub-array
MERGE SORT:
Merge Sort is a Divide and Conquer algorithm. It divides input array in two halves, calls
itself for the two halves and then merges the two sorted halves. The merge() function is used for
merging two halves
Algorithm:
step 1: start
step 2: declare the array and left, right, mid variables
step 3: recursively divide the array into two halves, sort each half, and merge the two sorted halves
step 4: stop
PROGRAM:
def merge(a, b):
    c = []
    while len(a) != 0 and len(b) != 0:
        if a[0] < b[0]:
            c.append(a[0])
            a.remove(a[0])
        else:
            c.append(b[0])
            b.remove(b[0])
    if len(a) == 0:
        c = c + b
    else:
        c = c + a
    return c

def divide(x):
    if len(x) == 0 or len(x) == 1:
        return x
    else:
        middle = len(x) // 2
        a = divide(x[:middle])
        b = divide(x[middle:])
        return merge(a, b)

x = [89, 13, 67, 34, 90, 100, 103, 167, 25]
print(divide(x))   # [13, 25, 34, 67, 89, 90, 100, 103, 167]
Complexity comparison of the sorting methods discussed:
Method       Worst Case    Average Case   Best Case     Space Complexity
Quick Sort   O(n^2)        O(n log n)     O(n log n)    O(log n)
Merge Sort   O(n log n)    O(n log n)     O(n log n)    O(n)
Dumb Approach:
Let us represent a number D as D = d(n-1) d(n-2) . . . d2 d1 d0. Each digit di is the ith least significant digit
in the number D. The value of each position is given by:
d0 = 10^0 position
d1 = 10^1 position
d2 = 10^2 position
...
d(n-1) = 10^(n-1) position.
According to position value,
45 = 4*101 + 5*100
23 = 2*101 + 3*100
For any real values a, b, c and d, (a + b)(c + d) = ac + ad + bc + bd. For example, 45 * 23 = 1035.
Let us derive the formula to multiply two numbers of two digits each. Consider C = A * B. If we
write A = a1 a0 and B = b1 b0, then
C = a1*b1*10^2 + a1*b0*10^1 + a0*b1*10^1 + a0*b0*10^0
C = a1*b1*10^2 + (a1*b0 + a0*b1)*10^1 + a0*b0*10^0
C = A * B = c2*10^2 + c1*10^1 + c0*10^0
Where,
c2 = a1 * b1
c1 = a1*b0 + a0*b1
c0 = a0 * b0
Generalization:
If the size of the integers is n digits, then we can generalize the multiplication as,
C = c2*10^n + c1*10^(n/2) + c0
Example: Multiply 2345 with 678 using the divide and conquer approach.
Solution:
The size of both operands must be even, so pad a zero to the multiplier: A = 2345, B = 0678,
giving a1 = 23, a0 = 45, b1 = 06, b0 = 78.
c2 = a1 * b1
= (23 * 06)
= 138
c1 = a1*b0 + a0*b1
= (23 * 78) + (45 * 06)
= 2064
c0 = a0 * b0
= (45 * 78)
= 3510
Clever Approach:
First approach:
According to dumb approach,
c2 = a1 * b1
c1 = a1*b0 + a0*b1 …(1)
c0 = a0*b0
Second approach:
Let us rewrite c1 as,
c1 = (a1 + a0) * (b1 + b0) – (c2 + c0)
= (a1b1 + a1b0 + a0b1 + a0b0) – (a1b1 + a0b0)
= a1*b0 + a0*b1 …(2)
Equation (1) and Equation (2) are the same, but the second approach of computing c1 involves
only one multiplication rather than two as it requires in the first approach.
c2 = a1*b1
= (23 * 06)
= 138
c0 = a0*b0
= (45 * 78)
= 3510
c1 = (a1 + a0) * (b1 + b0) – (c2 + c0)
= (23 + 45) * (06 + 78) – (138 + 3510)
= 68 * 84 – 3648
= 2064
It is the same as c1 = (a1*b0 + a0*b1) of the dumb multiplication approach, but does only one
multiplication rather than two.
C = c2 * 10^4 + c1 * 10^2 + c0
= 1380000 + 206400 + 3510 = 1589910
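Applied recursively, this three-multiplication idea is Karatsuba's algorithm. A short Python sketch (the function name is ours) reproduces the worked example:

def multiply(a, b):
    # Karatsuba-style recursive multiplication using three products
    if a < 10 or b < 10:
        return a * b
    n = max(len(str(a)), len(str(b)))
    half = n // 2
    a1, a0 = divmod(a, 10 ** half)
    b1, b0 = divmod(b, 10 ** half)
    c2 = multiply(a1, b1)
    c0 = multiply(a0, b0)
    c1 = multiply(a1 + a0, b1 + b0) - (c2 + c0)   # one multiplication instead of two
    return c2 * 10 ** (2 * half) + c1 * 10 ** half + c0

print(multiply(2345, 678))   # 1589910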
Strassen suggested a divide and conquer strategy-based matrix multiplication technique that
requires fewer multiplications than the traditional method. The multiplication operation is
defined as follows using Strassen’s method:
C11 = P + S – T + V
C12 = R + T
C21 = Q + S
C22 = P + R – Q + U
Where,
P = (A11 + A22) * (B11 + B22)
Q = (A21 + A22) * B11
R = A11 * (B12 – B22)
S = A22 * (B21 – B11)
T = (A11 + A12) * B22
U = (A21 – A11) * (B11 + B12)
V = (A12 – A22) * (B21 + B22)
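As a quick sanity check, a small Python sketch for the 2x2 case computes the seven products and assembles C (names are ours):

def strassen_2x2(A, B):
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    P = (a11 + a22) * (b11 + b22)
    Q = (a21 + a22) * b11
    R = a11 * (b12 - b22)
    S = a22 * (b21 - b11)
    T = (a11 + a12) * b22
    U = (a21 - a11) * (b11 + b12)
    V = (a12 - a22) * (b21 + b22)
    return [[P + S - T + V, R + T],
            [Q + S, P + R - Q + U]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]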
Greedy Approach:
A greedy algorithm is an approach for solving a problem by selecting the best option
available at the moment. It doesn't worry whether the current best result will bring the overall
optimal result. Used to solve an optimization problem.
Huffman Coding:
Suppose a string of 15 characters is to be sent. Each character occupies 8 bits, so a
total of 8 * 15 = 120 bits is required to send the string.
Using the Huffman Coding technique, we can compress the string to a smaller size.
Huffman coding first creates a tree using the frequencies of the character and then generates
code for each character.
Once the data is encoded, it has to be decoded. Decoding is done using the same tree.
Huffman Coding prevents any ambiguity in the decoding process using the concept of prefix
code ie. a code associated with a character should not be present in the prefix of any other code.
The tree created above helps in maintaining the property.
Huffman coding is done with the help of the following steps.
1. Calculate the frequency of each character in the string.
2. Sort the characters in increasing order of frequency.
3. Make each unique character a leaf node.
4. Create an empty node z. Assign the minimum frequency to the left child of z and assign
the second minimum frequency to the right child of z. Set the value of z as the sum
of the above two minimum frequencies.
5. Remove these two minimum frequencies and add the sum into the list of frequencies (*
denote the internal nodes in the figure above).
6. Insert node z into the tree.
7. Repeat steps 3 to 5 for all the characters
8. For each non-leaf node, assign 0 to the left edge and 1 to the right edge.
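A compact Python sketch of tree construction and code assignment using a min-heap (the function name and sample frequencies are ours):

import heapq

def huffman_codes(freq):
    # freq maps character -> frequency; a unique counter keeps heap
    # comparisons from ever reaching the tree payload
    heap = [[f, i, ch] for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)        # minimum frequency -> left child of z
        hi = heapq.heappop(heap)        # second minimum -> right child of z
        heapq.heappush(heap, [lo[0] + hi[0], count, (lo, hi)])
        count += 1
    codes = {}
    def walk(node, prefix):
        payload = node[2]
        if isinstance(payload, str):    # leaf: record the accumulated code
            codes[payload] = prefix or "0"
        else:
            walk(payload[0], prefix + "0")   # 0 on the left edge
            walk(payload[1], prefix + "1")   # 1 on the right edge
    walk(heap[0], "")
    return codes

print(huffman_codes({'B': 1, 'C': 6, 'A': 5, 'D': 3}))
# {'C': '0', 'B': '100', 'D': '101', 'A': '11'}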
DIJKSTRA'S ALGORITHM:
Dijkstra's algorithm finds the shortest paths from a source vertex to all other vertices in a weighted
graph by repeatedly selecting the unvisited vertex with the least tentative distance and relaxing its edges.
Starting vertex: 0
Mark vertex 0 as visited.
Initial table:
Vertex:   0  1  2  3  4  5  6  7  8
Distance: 0  ∞  ∞  ∞  ∞  ∞  ∞  ∞  ∞
The adjacent vertices of 0 are 1 and 7, so find the distances (0,1) and (0,7).
Distance (0,1):
d(x, y) = d(x) + c(x, y) < d(y)
d(0, 1) = d(0) + c(0, 1) < d(1)
= (0 + 4) < ∞
= 4 < ∞, so update the table: V(0,1) = 4
Distance (0,7):
d(x, y) = d(x) + c(x, y) < d(y)
d(0, 7) = d(0) + c(0, 7) < d(7)
= (0 + 8) < ∞
= 8 < ∞, so update the table: V(0,7) = 8
Now the altered table:
Vertex:   0  1  2  3  4  5  6  7  8
Distance: 0  4  ∞  ∞  ∞  ∞  ∞  8  ∞
The least value is 4, so vertex 1 is selected. Mark vertex 1 as visited.
The adjacent vertices of 1 are 0, 2 and 7. 0 is already visited, so find the distances (1,2) and (1,7).
Distance (1,2)
d(x, y) = d(x) + c(x, y) < d(y)
d(1,2)=d(1)+c(1,2)<d(2)
= (4 + 8) < ∞
= 12 < ∞ update the table V(1,2)=12
Distance (1,7)
d(x, y) = d(x) + c(x, y) < d(y)
d(1,7)=d(1)+c(1,7)<d(7)
= (4 + 11) < 8
= 15 < 8 is false, so the table is not updated; V(1,7) remains 8
Now the altered table:
Vertex:   0  1  2   3  4  5  6  7  8
Distance: 0  4  12  ∞  ∞  ∞  ∞  8  ∞
The least value among the unvisited vertices is 8, so vertex 7 is selected. Mark vertex 7 as visited.
The adjacent vertices of 7 are 0, 1, 6 and 8. 0 and 1 are already visited, so find the distances (7,6) and (7,8).
Distance (7,8)
d(x, y) = d(x) + c(x, y) < d(y)
d(7,8)=d(7)+c(7,8)<d(8)
= (8 + 7) < ∞
= 15 < ∞ update the table V(7,8)=15
Distance (7,6)
d(x, y) = d(x) + c(x, y) < d(y)
d(7,6)=d(7)+c(7,6)<d(6)
= (8 + 1) < ∞
= 9 < ∞ update the table V(7,6)=9
Now the altered table:
Vertex:   0  1  2   3  4  5  6  7  8
Distance: 0  4  12  ∞  ∞  ∞  9  8  15
The least value among the unvisited vertices is 9, so vertex 6 is selected. Mark vertex 6 as visited.
The adjacent vertices of 6 are 7, 8 and 5. 7 is already visited, so find the distances (6,8) and (6,5).
Distance (6,8)
d(x, y) = d(x) + c(x, y) < d(y)
d(6,8)=d(6)+c(6,8)<d(8)
= (9 + 15) < 15
= 24 < 15 is false, so the table is not updated; V(6,8) remains 15
Distance (6,5)
d(x, y) = d(x) + c(x, y) < d(y)
d(6,5)=d(6)+c(6,5)<d(5)
= (9 + 2) < ∞
= 11 < ∞ Update the table V(6,5)=11
Now the altered table:
Vertex:   0  1  2   3  4  5   6  7  8
Distance: 0  4  12  ∞  ∞  11  9  8  15
The least value among the unvisited vertices is 11, so vertex 5 is selected. Mark vertex 5 as visited.
The adjacent vertices of 5 are 2, 3, 4 and 6. 6 is already visited, so find the distances (5,2), (5,3) and (5,4).
Distance (5,2)
d(x, y) = d(x) + c(x, y) < d(y)
d(5,2) = d(5)+c(5,2) < d(2)
= (11 + 4) < 12
= 15 < 12 is false, so the table is not updated; V(5,2) remains 12
Distance (5,3)
d(x, y) = d(x) + c(x, y) < d(y)
d(5,3) = d(5)+c(5,3) < d(3)
= (11 + 14) < ∞
= 25 < ∞ update the table V(5,3) = 25
Distance (5,4)
d(x, y) = d(x) + c(x, y) < d(y)
d(5,4) = d(5) + c(5,4) < d(4)
= (11 + 10) < ∞
= 21 < ∞, so update the table: V(5,4) = 21
Now the altered table:
Vertex:   0  1  2   3   4   5   6  7  8
Distance: 0  4  12  25  21  11  9  8  15
The least value among the unvisited vertices is 12, so vertex 2 is selected. Mark vertex 2 as visited.
The adjacent vertices of 2 are 1, 3, 5 and 8. 1 and 5 are already visited, so find the distances (2,3) and (2,8).
Distance (2,3)
d(x, y) = d(x) + c(x, y) < d(y)
d(2,3) = d(2) + c(2,3) < d(3)
= (12 + 7) < 25
= 19 < 25, so update the table: V(2,3) = 19
Distance (2,8)
d(x, y) = d(x) + c(x, y) < d(y)
d(2,8) = d(2) + c(2,8) < d(8)
= (12 + 2) < 15
= 14 < 15 update the table V(2,8) =14
Now the altered table:
Vertex:   0  1  2   3   4   5   6  7  8
Distance: 0  4  12  19  21  11  9  8  14
The least value among the unvisited vertices is 14, so vertex 8 is selected. Mark vertex 8 as visited.
The adjacent vertices of 8 are 2, 6 and 7. All of them are already visited, so we take the next least-cost
vertex from the table.
The remaining unvisited vertices are 3 and 4. The least value is 19, so vertex 3 is selected and marked
visited. From vertex 3 no adjacent vertex's distance improves, so we take the next minimum-cost vertex,
which is vertex 4. All vertices are now visited.
Shortest paths between vertex 0 and the other vertices:
Source   Destination   Minimum Cost   Shortest Path
0        1             4              0-1
0        2             12             0-1-2
0        3             19             0-1-2-3
0        4             21             0-7-6-5-4
0        5             11             0-7-6-5
0        6             9              0-7-6
0        7             8              0-7
0        8             14             0-1-2-8
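A Python sketch of the algorithm using a priority queue; the edge list is reconstructed from the walkthrough above (the 6-8 edge is omitted because its weight is unclear in the notes), so treat it as illustrative:

import heapq

def dijkstra(adj, src):
    # adj maps u -> list of (v, weight); returns shortest distances from src
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue                          # stale queue entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w               # relax edge (u, v)
                heapq.heappush(pq, (dist[v], v))
    return dist

edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7), (2, 8, 2),
         (2, 5, 4), (3, 4, 9), (3, 5, 14), (4, 5, 10), (5, 6, 2), (6, 7, 1),
         (7, 8, 7)]
adj = {}
for u, v, w in edges:
    adj.setdefault(u, []).append((v, w))
    adj.setdefault(v, []).append((u, w))
print(dijkstra(adj, 0))
# distances: 0:0, 1:4, 2:12, 3:19, 4:21, 5:11, 6:9, 7:8, 8:14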
Prim's algorithm starts with a single node and at every step explores all the adjacent nodes with all the
connecting edges. The edges with minimal weight that cause no cycles in the
graph get selected.
Algorithm:
Step 1: Select a starting vertex
Step 2: Repeat Steps 3 and 4 until there are fringe vertices
Step 3: Select an edge 'e' connecting the tree vertex and fringe vertex that has minimum weight
Step 4: Add the selected edge and the vertex to the minimum spanning tree T
[END OF LOOP]
Step 5: EXIT
Discuss prim’s algorithm to find minimum spanning tree and analyze its complexity.
Apply the algorithm to find the minimum spanning tree for the following graph.
Now, the tree S-A is treated as one node and we check for all edges going out from it. We
select the one which has the lowest cost and include it in the tree.
After this step, S-A-C tree is formed. Now we'll again treat it as a node and will check all the
edges again. However, we will choose only the least cost edge. In this case, C-D is the new
edge, which is less than other edges' cost 8,4, etc.
After adding node D to the spanning tree, we now have two edges going out of it having the
same cost 2, i.e. D-T and D-B. Thus, we can add either one. But the next step will again yield
edge 2 as the least cost. Hence, we are showing a spanning tree with both edges included.
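A Python sketch of Prim's algorithm with a min-heap; the edge weights below are assumed values chosen only to be consistent with the narrative above, since the graph figure is not included:

import heapq

def prim(adj, start):
    # adj maps u -> list of (v, weight); returns the list of MST edges
    visited, mst = {start}, []
    pq = [(w, start, v) for v, w in adj[start]]
    heapq.heapify(pq)
    while pq:
        w, u, v = heapq.heappop(pq)        # cheapest edge leaving the tree
        if v in visited:
            continue
        visited.add(v)
        mst.append((u, v, w))
        for nxt, nw in adj[v]:
            if nxt not in visited:
                heapq.heappush(pq, (nw, v, nxt))
    return mst

# Hypothetical weights consistent with the narrative above
adj = {'S': [('A', 7), ('C', 8)],
       'A': [('S', 7), ('C', 3), ('B', 6)],
       'C': [('S', 8), ('A', 3), ('D', 4)],
       'B': [('A', 6), ('D', 2), ('T', 5)],
       'D': [('C', 4), ('B', 2), ('T', 2)],
       'T': [('B', 5), ('D', 2)]}
print(prim(adj, 'S'))
# [('S', 'A', 7), ('A', 'C', 3), ('C', 'D', 4), ('D', 'B', 2), ('D', 'T', 2)]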
Kruskal’s Algorithm:
Kruskal's Algorithm is used to find the minimum spanning tree for a
connected weighted graph. The main target of the algorithm is to find the subset of edges by
using which we can traverse every vertex of the graph. It follows the greedy approach that finds
an optimum solution at every stage instead of focusing on a global optimum.
Algorithm steps:
Step 1: Sort all edges in increasing order of their weight.
Step 2: Pick the edge with the least weight.
Step 3: Check if the new edge creates a cycle or loop in the spanning tree.
Step 4: If it doesn't form a cycle, then include that edge in the MST. Otherwise, discard it.
Discuss Kruskal’s algorithm to find minimum spanning tree and analyze its time
complexity. Apply the algorithm to find the minimum spanning tree for the following
graph.
Step 1: Arrange all edges in increasing order of weight.
The least cost is 2 and the edges involved are (B, D) and (D, T). We add them. Adding them does not
violate spanning tree properties, so we continue to our next edge selection.
The next cost is 3, and the associated edges are (A, C) and (C, D). We add them as well.
Next cost in the table is 4, and we observe that adding it will create a circuit in the graph. −
We ignore it. In the process we shall ignore/avoid all edges that create a circuit.
Edges with cost 5 and 6 also create circuits. We ignore them and move on.
Now we are left with only one node to be added. Between the two least cost edges available 7
and 8, we shall add the edge with cost 7.
We have included all the nodes of the graph and we now have minimum cost spanning tree.
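A Python sketch of Kruskal's algorithm with a simple union-find (the function name and sample edges are ours):

def kruskal(n, edges):
    # edges: list of (weight, u, v); vertices are 0 .. n-1
    parent = list(range(n))

    def find(x):                        # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):       # Step 1: increasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # Steps 3-4: keep the edge only if it adds no cycle
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))
# [(0, 1, 1), (1, 2, 2), (2, 3, 4)]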
UNIT IV DYNAMIC PROGRAMMING AND BACKTRACKING
Dynamic Programming: Computing binomial coefficient - Warshall's and Floyd's algorithm . Backtracking:
General method – N Queens Problem – Hamiltonian Circuits .Exhaustive search: DFS, BFS.
Dynamic Programming:
Dynamic programming is a technique for solving a complex problem by first breaking
into a collection of simpler subproblems, solving each subproblem just once, and then storing
their solutions to avoid repetitive computation.
Dynamic programming is a technique that breaks a problem into sub-problems and
saves the results for future use so that we do not need to compute them again. Optimizing the
sub-problems to optimize the overall solution is known as the optimal substructure
property. The main use of dynamic programming is to solve optimization problems.
Consider the Fibonacci series F(n) = F(n-1) + F(n-2), with the base values F(0) = 0 and F(1) = 1.
To calculate the other numbers, we follow this relationship. For example, F(2) is the sum of F(0)
and F(1), which is equal to 1.
How can we calculate F(20)?
The F(20) term is calculated using the same recurrence. [Figure omitted: the recursion tree for F(20).]
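A top-down (memoized) sketch in Python computes F(20) with each subproblem solved only once:

def fib(n, memo={}):
    # top-down: each F(i) is computed once and cached (memoized)
    if n < 2:
        return n
    if n not in memo:
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]

print(fib(20))   # 6765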
The following are the steps that the dynamic programming follows:
o It breaks down the complex problem into simpler subproblems.
o It finds the optimal solution to these sub-problems.
o It stores the results of subproblems (memoization). The process of storing the results
of subproblems is known as memoization.
o It reuses them so that the same sub-problem is not calculated more than once.
o Finally, calculate the result of the complex problem.
There are two approaches to dynamic programming:
Top-down approach
The top-down approach follows the memoization technique, while the bottom-up approach
follows the tabulation method. Here memoization is equal to recursion plus
caching. Recursion means calling the function itself, while caching means storing the
intermediate results.
Bottom-Up approach:
The bottom-up approach is also one of the techniques which can be used to
implement the dynamic programming.
It uses the tabulation technique to implement the dynamic programming
approach.
It solves the same kind of problems but it removes the recursion. If we remove
the recursion, there is no stack overflow issue and no overhead of the recursive
functions. In this tabulation technique, we solve the problems and store the
results in a matrix.
Computing Binomial Coefficient:
A binomial coefficient is a coefficient of a term in the expansion of the binomial (x + y)^n.
According to the binomial theorem,
(x + y)^n = nC0 x^n y^0 + nC1 x^(n-1) y^1 + nC2 x^(n-2) y^2 + ... + nCn x^0 y^n
where nC0, nC1, nC2, ..., nCn are the binomial coefficients.
Binomial Coefficient using Dynamic Programming:
Many sub problems are called again and again, since they have an overlapping
sub problems property. Re-computations of the same sub problems is avoided by
storing their results in the temporary array C[i, j] in a bottom up manner.
The optimal substructure for using dynamic programming is stated as C(n, r) = C(n - 1, r - 1) + C(n - 1, r), with base cases C(n, 0) = C(n, n) = 1.
r→ 0 1 2 3 4
n↓
0 1
1 1 1
2 1 2 1
3 1 3 3 1
4 1 4 6 4 1
5 1 5 10 10 5
6 1 6 15 20 15 C(6,4) =15
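A bottom-up Python sketch that fills the same table (names are ours):

def binomial(n, r):
    # bottom-up table: C(i, j) = C(i-1, j-1) + C(i-1, j)
    C = [[0] * (r + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, r) + 1):
            C[i][j] = 1 if j in (0, i) else C[i - 1][j - 1] + C[i - 1][j]
    return C[n][r]

print(binomial(6, 4))   # 15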
Problem : Given a set of items, each having different weight and value or profit
associated with it. Find the set of items such that the total weight is less than or equal
to a capacity of the knapsack and the total value earned is as large as possible.
The knapsack problem is useful in solving resource allocation problem. Let X = < x 1,
x2, x3 , . . . . . , xn> be the set of n items. Sets W = <w1 , w2, w3, . . . , w n> and V = < v1 ,
v2, v3, . . . , vn> are weight and value associated with each item in X. Knapsack
capacity is M unit.
The knapsack problem is to find the set of items which maximizes the profit such that
collective weight of selected items does not cross the knapsack capacity.
Select items from X and fill the knapsack such that it would maximize the profit.
Knapsack problem has two variations. 0/1 knapsack, that does not allow breaking of
items. Either add an entire item or reject it. It is also known as a binary knapsack.
Fractional knapsack allows breaking of items. Profit will be earned proportionally
Example: Find an optimal solution for following 0/1 Knapsack problem using dynamic
programming: Number of objects n = 4, Knapsack Capacity M = 5, Weights (W 1, W 2,
W3 , W 4) = (2, 3, 4, 5) and profits (P1 , P2, P3, P4 ) = (3, 4, 5, 6).
Solution:
The solution of the knapsack problem is defined by the recurrence
V[i, j] = V[i - 1, j]                                  if j < wi
V[i, j] = max { V[i - 1, j], vi + V[i - 1, j - wi] }   if j ≥ wi
Initial table:
Boundary conditions would be V[0, j] = V[i, 0] = 0. The initial configuration of the table looks
like:
Item   Weight   Value   i\j   0   1   2   3   4   5
-      -        -       0     0   0   0   0   0   0
I1     2        3       1     0
I2     3        4       2     0
I3     4        5       3     0
I4     5        6       4     0
Filling first column, j = 1
V [1, 1] ⇒ i = 1, j = 1, wi = w1 = 2
As, j < wi , V [i, j] = V [i – 1, j]
V [1, 1] = V [0, 1] = 0
V [2, 1] ⇒ i = 2, j = 1, wi = w2 = 3
As, j < wi , V [i, j] = V [i – 1, j]
V [2, 1] = V [1, 1] = 0
V [3, 1] ⇒ i = 3, j = 1, wi = w3 = 4
As, j < wi, V [i, j] = V [i – 1, j]
V [3, 1] = V [2, 1] = 0
V [4, 1] ⇒ i = 4, j = 1, wi = w4 = 5
As, j < wi, V[i, j] = V[i – 1, j]
V [4, 1] = V [3, 1] = 0
Filling second column, j = 2
V[1, 2] ⇒ i = 1, j = 2, wi = w1 = 2, vi = 3
As, j ≥ w i, V [i, j]=max {V [i – 1, j], vi + V[i – 1, j – wi] }
= max {V [0, 2], 3 + V [0, 0]}
V[1, 2] = max (0, 3) = 3
V[2, 2] ⇒ i = 2, j = 2, wi = w2 = 3, vi = 4
As, j < wi, V [i, j] = V[i – 1, j]
V[2, 2] = V[1, 2] = 3
V[3, 2] ⇒ i = 3, j = 2, wi = w3 = 4, vi = 5
As, j < wi, V[i, j] = V[i – 1, j]
V[3, 2] = V [2, 2] = 3
V[4, 2] ⇒ i = 4, j = 2, wi = w4 = 5, vi = 6
As, j < wi, V[i, j] = V[i – 1, j]
V[4, 2] = V[3, 2] = 3
Filling third column, j = 3
V[1, 3] ⇒ i = 1, j = 3, wi = w1 = 2, vi = 3
As, j ≥ w i, V [i, j]=max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [0, 3], 3 + V [0, 1]}
V[1, 3] = max (0, 3) = 3
V[2, 3] ⇒ i = 2, j = 3, wi = w2 = 3, vi = 4
As, j ≥ wi, V [i, j] = max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [1, 3], 4 + V [1, 0]}
V[2, 3] = max (3, 4) = 4
V[3, 3] ⇒ i = 3, j = 3, wi = w3 = 4, vi = 5
As, j < wi , V [i, j] = V [i – 1, j]
V[3, 3] = V [2, 3] = 4
V[4, 3] ⇒ i = 4, j = 3, wi = w4 = 5, vi = 6
As, j < wi, V[i, j] = V[i – 1, j]
V[4, 3] = V [3, 3] = 4
Filling fourth column, j = 4
V[1, 4] ⇒ i = 1, j = 4, wi = w1 = 2, vi = 3
As, j ≥ w i, V [i, j]=max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [0, 4], 3 + V [0, 2]}
V[1, 4] = max (0, 3) = 3
V[2, 4] ⇒ i = 2, j = 4, wi = w2 = 3 , vi = 4
As, j ≥ wi , V [i, j] =max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [1, 4], 4 + V [1, 1]}
V[2, 4] = max (3, 4 + 0) = 4
V[3, 4] ⇒ i = 3, j = 4, wi = w3 = 4, vi = 5
As, j ≥ w i, V [i, j]=max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [2, 4], 5 + V [2, 0]}
V[3, 4] = max (4, 5 + 0) = 5
V[4, 4] ⇒ i = 4, j = 4, wi = w4 = 5, vi = 6
As, j < wi , V [i, j] = V [i – 1, j]
V[4, 4] = V [3, 4] = 5
Filling fifth column, j = 5
V [1, 5] ⇒ i = 1, j = 5, wi = w1 = 2, vi = 3
As, j ≥ wi , V [i, j] = max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [0, 5], 3 + V [0, 3]}
V[1, 5] = max (0, 3) = 3
V[2, 5] ⇒ i = 2, j = 5, wi = w2 = 3, vi = 4
As, j ≥ wi , V [i, j] =max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [1, 5], 4 + V [1, 2]}
V[2, 5] = max (3, 4 + 3) = 7
V[3, 5] ⇒ i = 3, j = 5, w i = w3 = 4, vi = 5
As, j ≥ wi, V [i, j] = max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [2, 5], 5 + V [2, 1]}
V[3, 5] = max (7, 5 + 0) = 7
V [4, 5] ⇒ i = 4, j = 5, wi = w4 =5, vi = 6
As, j ≥ w i, V [i, j] = max {V [i – 1, j], vi + V [i – 1, j – wi] }
= max {V [3, 5], 6 + V [3, 0]}
V[4, 5] = max (7, 6 + 0) = 7
Final table:
Item   Weight   Value   i\j   0   1   2   3   4   5
-      -        -       0     0   0   0   0   0   0
I1     2        3       1     0   0   3   3   3   3
I2     3        4       2     0   0   3   4   4   7
I3     4        5       3     0   0   3   4   5   7
I4     5        6       4     0   0   3   4   5   7
To find the selected items, trace back from V[4, 5].
Step 1: i = 4, j = 5
V[i, j] = V[4, 5] = 7
V[i – 1, j] = V[3, 5] = 7
V[i, j] = V[i – 1, j], so don’t select ith item and check for the previous item.
so i = i – 1 = 4 – 1 = 3
Solution Set S = { }
Step 2 : i = 3, j = 5
V[i, j] = V[3, 5] = 7
V[i – 1, j] = V[2, 5] = 7
V[i, j] = V[i – 1, j], so don’t select ith item and check for the previous item.
so i = i – 1 = 3 – 1 = 2
Solution Set S = { }
Step 3 : i = 2, j = 5
V[i, j] = V[2, 5] = 7
V[i – 1, j] = V[1, 5] = 3
V[i, j] ≠ V[i – 1, j], so add item Ii = I2 in solution set.
Reduce problem size j by wi
j = j – wi = j – w2 = 5 – 3 = 2
i=i–1=2–1=1
Solution Set S = {I2}
Step 4 : i = 1, j = 2
V[1, j] = V[1, 2] = 3
V[i – 1, j] = V[0, 2] = 0
V[i, j] ≠ V[i – 1, j], so add item Ii = I1 in solution set.
Reduce problem size j by wi
j = j – wi = j – w1 = 2 – 2 = 0
Solution Set S = {I1 , I2}
Problem size has reached to 0, so final solution is
S = {I1, I2 } X = (1,1,0,0) Earned profit = P1 + P2 = 7
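A bottom-up Python sketch that builds the same V table and reproduces the result (names are ours):

def knapsack(weights, values, M):
    n = len(weights)
    V = [[0] * (M + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, M + 1):
            if j < weights[i - 1]:
                V[i][j] = V[i - 1][j]                       # item i does not fit
            else:
                V[i][j] = max(V[i - 1][j],                  # skip item i
                              values[i - 1] + V[i - 1][j - weights[i - 1]])
    return V[n][M]

print(knapsack([2, 3, 4, 5], [3, 4, 5, 6], 5))   # 7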
FLOYD'S ALGORITHM (ALL-PAIRS SHORTEST PATHS):
Algorithm:
For a graph with N vertices:
Step 1: Initialize the shortest paths between any 2 vertices with Infinity.
Step 2: Find all pair shortest paths that use 0 intermediate vertices, then find the shortest paths
that use 1 intermediate vertex and so on.. until using all N vertices as intermediate nodes.
Step 3: Minimize the shortest paths between any 2 pairs in the previous operation.
Step 4: For any 2 vertices (i,j) , one should actually minimize the distances between this pair
using the first K nodes, so the shortest path will be: min(dist[i][k]+dist[k][j],dist[i][j]).
dist[i][k] represents the shortest path that only uses the first K vertices, dist[k][j] represents the
shortest path between the pair k,j. As the shortest path will be a concatenation of the shortest
path from i to k, then from k to j.
for k in range(n):
    for i in range(n):
        for j in range(n):
            dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
Example: [Figure omitted.]
Warshall's Algorithm:
Warshall's algorithm computes the transitive closure of a directed graph with the same triple loop, using
boolean operations instead: reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j]).
1. Breadth-first Search
Disadvantages:
o It requires lots of memory since each level of the tree must be saved into memory to
expand the next level.
o BFS needs lots of time if the solution is far away from the root node.
Example:
In the tree structure below, we show the traversal of the tree using the BFS algorithm from
root node S to goal node K. The BFS algorithm traverses in layers, so it will follow the path
shown by the dotted arrow, and the traversed path will be:
1. S ---> A ---> B ---> C ---> D ---> G ---> H ---> E ---> F ---> I ---> K
Time Complexity: The time complexity of BFS can be obtained from the number of nodes
traversed until the shallowest solution: T(b) = 1 + b^2 + b^3 + ... + b^d = O(b^d), where d = depth
of the shallowest solution and b = branching factor (the number of successors of each node).
Space Complexity: The space complexity of BFS is given by the memory size of the frontier,
which is O(b^d).
from collections import defaultdict, deque

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)
    def add_edge(self, u, v):
        self.graph[u].append(v)          # directed edge u -> v
    def bfs(self, start):
        # sketched here so the example runs: visit nodes level by level
        visited, order, queue = {start}, [], deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in self.graph[node]:
                if nbr not in visited:
                    visited.add(nbr)
                    queue.append(nbr)
        return order

# Example usage:
g = Graph()
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)
g.add_edge(2, 0)
g.add_edge(2, 3)
g.add_edge(3, 3)
print(g.bfs(2))  # prints [2, 0, 3, 1]
2. Depth-first Search
o Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
o It is called the depth-first search because it starts from the root node and follows each
path to its greatest depth node before moving to the next path.
o DFS uses a stack data structure for its implementation.
o The process of the DFS algorithm is similar to the BFS algorithm.
Advantage:
o DFS requires much less memory, as it only needs to store a stack of the nodes on the
path from the root node to the current node.
o It takes less time than the BFS algorithm to reach the goal node (if it traverses the
right path).
Disadvantage:
o There is a possibility that many states keep re-occurring, and there is no guarantee
of finding a solution.
o The DFS algorithm searches deep down into the state space, and it may sometimes get
stuck in an infinite loop.
Example:
from collections import defaultdict

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)
    def add_edge(self, u, v):
        self.graph[u].append(v)          # directed edge u -> v
    def dfs(self, start, visited=None, order=None):
        # sketched here so the example runs: follow each path to its
        # greatest depth before backtracking
        if visited is None:
            visited, order = set(), []
        visited.add(start)
        order.append(start)
        for nbr in self.graph[start]:
            if nbr not in visited:
                self.dfs(nbr, visited, order)
        return order

# Example usage:
g = Graph()
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)
g.add_edge(2, 0)
g.add_edge(2, 3)
g.add_edge(3, 3)
print(g.dfs(2))  # prints [2, 0, 1, 3]
In the below search tree, we have shown the flow of depth-first search, and it will follow the
order as:
Root node ---> Left node ---> Right node.
It will start searching from root node S and traverse A, then B, then D and E. After traversing
E, it will backtrack the tree, as E has no other successor and the goal node has still not been
found. After backtracking, it will traverse node C and then G, where it will terminate, as it has
found the goal node.
Completeness: DFS search algorithm is complete within finite state space as it will expand
every node within a limited search tree.
Time Complexity: Time complexity of DFS will be equivalent to the node traversed by the
algorithm. It is given by:
T(n) = 1 + n^2 + n^3 + ... + n^m = O(n^m)
where m = maximum depth of any node, which can be much larger than d (the depth of the
shallowest solution).
Space Complexity: The DFS algorithm needs to store only a single path from the root node, hence
the space complexity of DFS is equivalent to the size of the fringe set, which is O(bm), where b is
the branching factor and m is the maximum depth.
3. Iterative Deepening Depth-first Search (IDDFS)
The iterative deepening algorithm is a combination of the DFS and BFS algorithms. This search
algorithm finds out the best depth limit and does it by gradually increasing the limit until a
goal is found.
This algorithm performs depth-first search up to a certain "depth limit", and it keeps
increasing the depth limit after each iteration until the goal node is found.
This Search algorithm combines the benefits of Breadth-first search's fast search and depth-
first search's memory efficiency.
Iterative deepening is a useful uninformed search strategy when the search space is large and
the depth of the goal node is unknown.
Advantages:
o It combines the benefits of BFS and DFS search algorithm in terms of fast search and
memory efficiency.
Disadvantages:
o The main drawback of IDDFS is that it repeats all the work of the previous phase.
from collections import defaultdict

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)
    def add_edge(self, u, v):
        self.graph[u].append(v)
    def dls(self, node, goal, limit):
        # depth-limited search: stop expanding once the limit is reached
        if node == goal:
            return True
        if limit == 0:
            return False
        return any(self.dls(n, goal, limit - 1) for n in self.graph[node])
    def iddfs(self, start, goal, max_depth):
        # sketched here so the example runs: repeat DLS with limits 0, 1, 2, ...
        for depth in range(max_depth + 1):
            if self.dls(start, goal, depth):
                return depth
        return -1

# Example usage:
g = Graph()
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 3)
g.add_edge(1, 4)
g.add_edge(2, 5)
g.add_edge(2, 6)
start_node = 0
goal_node = 6
print(g.iddfs(start_node, goal_node, 5))  # prints 2 (goal found at depth 2)
Example:
The following tree structure shows the iterative deepening depth-first search. The IDDFS
algorithm performs repeated iterations until it finds the goal node. The iterations performed
by the algorithm are given as:
1'st Iteration --------> A
2'nd Iteration -------> A, B, C
3'rd Iteration -------- >A, B, D, E, C, F, G
4'th Iteration --------- >A, B, D, H, I, E, C, F, K, G
In the fourth iteration, the algorithm will find the goal node.
Completeness:
This algorithm is complete if the branching factor is finite.
Time Complexity:
Let's suppose b is the branching factor and d is the depth; then the worst-case time complexity is
O(b^d).
Optimal:
IDDFS algorithm is optimal if path cost is a non-decreasing function of the depth of the node.
1. BACKTRACKING
General Method:
Backtracking is used to solve problems in which a sequence of objects is chosen from a specified set so that the
sequence satisfies some criterion. The desired solution is expressed as an n-tuple (x1, ..., xn) where each xi Є S, S
being a finite set.
The solution is based on finding one or more vectors that maximize, minimize, or satisfy a criterion function P
(x1, ..., xn). Form a solution and check at every step whether it has any chance of success. If the solution at any
point seems unpromising, ignore it. All solutions require a set of constraints divided into two categories: explicit
and implicit constraints.
Definition 1: Explicit constraints are rules that restrict each xi to take on values only from a given set. Explicit
constraints depend on the particular instance I of the problem being solved. All tuples that satisfy the explicit
constraints define a possible solution space for I.
Definition 2: Implicit constraints are rules that determine which of the tuples in the solution space of I satisfy the
criterion function.
For the 8-queens problem:
The explicit constraints, using the 8-tuple formulation, are S = {1, 2, 3, 4, 5, 6, 7, 8}.
The implicit constraints for this problem are that no two xi's can be the same (i.e., all queens must be on
different columns) and no two queens can be on the same diagonal.
Backtracking is a modified depth first search of a tree. Backtracking algorithms determine problem solutions by
systematically searching the solution space for the given problem instance. This search is facilitated by using a
tree organization for the solution space.
Backtracking is the procedure whereby, after determining that a node can lead to nothing but dead ends, we go
back (backtrack) to the node's parent and proceed with the search on the next child.
A backtracking algorithm need not actually create a tree.
Rather, it only needs to keep track of the values in the current branch being investigated. This is the way we
implement the backtracking algorithm. We say that the state space tree exists implicitly in the algorithm because
it is not actually constructed.
State space is the set of paths from the root node to other nodes. The state space tree is the tree organization of
the solution space. These state space trees are called static trees.
Terminology:
Problem state is each node in the depth first search tree.
Solution states are the problem states 'S' for which the path from the root node to 'S' defines a tuple in the
solution space.
Answer states are those solution states for which the path from the root node to 'S' defines a tuple that is a
member of the set of solutions.
Live node is a node that has been generated but whose children have not yet been generated.
E-node is a live node whose children are currently being explored. In other words, an E-node is a node currently
being expanded.
Dead node is a generated node that is not to be expanded or explored any further. All children of a dead node
have already been expanded.
Branch and Bound refers to all state space search methods in which all children of an E-node are generated
before any other live node can become the E-node.
Depth first node generation with bounding functions is called backtracking. State generation methods in which
the E-node remains the E-node until it is dead, lead to branch and bound methods.
N-Queens Problem:
Let us consider, N = 8. Then 8-Queens Problem is to place eight queens on an 8 x 8 chessboard so that no two
“attack”, that is, no two of them are on the same row, column, or diagonal.
All solutions to the 8-queens problem can be represented as 8-tuples (x1, ..., x8), where xi is the column of the
ith row where the ith queen is placed.
The explicit constraints using this formulation are Si = {1, 2, 3, 4, 5, 6, 7, 8}, 1 ≤ i ≤ 8.
Therefore the solution space consists of 8^8 8-tuples.
The implicit constraints for this problem are that no two xi's can be the same (i.e., all queens must be on
different columns) and no two queens can be on the same diagonal.
This realization reduces the size of the solution space from 8^8 tuples to 8! tuples.
The promising function must check whether two queens are in the same column or diagonal.
Suppose two queens are placed at positions (i, j) and (k, l). Then they are in the same column if
j = l, and on the same diagonal if |i – k| = |j – l|.
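A minimal backtracking sketch of this test, assuming x[i] holds the column of the queen in row i (the array and function names are illustrative):

def promising(x, i):
    # queen i clashes with an earlier queen k if they share a column
    # (x[k] == x[i]) or a diagonal (|x[k] - x[i]| == |k - i|)
    for k in range(1, i):
        if x[k] == x[i] or abs(x[k] - x[i]) == abs(k - i):
            return False
    return True

def n_queens(x, i, n, solutions):
    if i > n:
        solutions.append(x[1:n + 1])
        return
    for col in range(1, n + 1):
        x[i] = col
        if promising(x, i):          # backtrack when the node is not promising
            n_queens(x, i + 1, n, solutions)

solutions = []
n_queens([0] * 9, 1, 8, solutions)
print(len(solutions))  # 92 solutions to the 8-queens problem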
Floyd's Algorithm:
Initialization: The algorithm initializes a 2D array called the "distance matrix". This matrix
stores the shortest distances between all pairs of vertices in the graph. Initially, it is filled with
the direct edge weights where an edge exists between two vertices; otherwise the entry is set to
infinity. Also, the distance from a vertex to itself is set to 0.
Iterative Update: The algorithm iterates through all vertices and considers each vertex as an
intermediate vertex in the shortest path. For each pair of vertices (i, j), it checks if the shortest
path from vertex i to vertex j can be improved by including vertex k as an intermediate vertex. If
it can, the distance matrix is updated to reflect the new shortest path.
Termination: After all iterations, the distance matrix will contain the shortest distances between
all pairs of vertices in the graph.
Floyd's algorithm has a time complexity of O(V^3), where V is the number of vertices in the
graph.
INF = float('inf')

def floyd(graph):
    V = len(graph)
    # start from the direct edge weights
    dist = [[0] * V for _ in range(V)]
    for i in range(V):
        for j in range(V):
            dist[i][j] = graph[i][j]
    # allow each vertex k in turn as an intermediate vertex
    for k in range(V):
        for i in range(V):
            for j in range(V):
                dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    return dist
# Example graph represented as an adjacency matrix
graph = [
[0, 5, INF, 10],
[INF, 0, 3, INF],
[INF, INF, 0, 1],
[INF, INF, INF, 0]
]
shortest_distances = floyd(graph)
for row in shortest_distances:
print(row)
Vertex 0 is connected to vertex 1 with weight 5 and to vertex 3 with weight 10.
Vertex 1 is connected to vertex 2 with weight 3.
Vertex 2 is connected to vertex 3 with weight 1.
Applying Floyd's algorithm on this graph will yield the shortest distances between all pairs of
vertices:
[0, 5, 8, 9]
[inf, 0, 3, 4]
[inf, inf, 0, 1]
[inf, inf, inf, 0]
This output shows that the shortest distance from vertex i to vertex j is stored at index [i][j] in
the resulting distance matrix.
UNIT V BRANCH-AND-BOUND, NP PROBLEMS AND APPROXIMATION
ALGORITHMS
Branch and Bound - Assignment - Knapsack Problem - Travelling Salesman Problem -
NP-Complete and NP-Hard Problems. Approximation Algorithms - NP-Hard Problems -
Knapsack and Travelling Salesman Problem.
Branch-And-Bound:
Branch and Bound is another method to systematically search a solution space. Just
like backtracking, we will use bounding functions to avoid generating subtrees that do not
contain an answer node.
It has a bounding function, which goes far beyond the feasibility test as a means
to prune the search tree efficiently.
Branch and Bound refers to all state space search methods in which all children of the E-node
are generated before any other live node becomes the E-node.
Live node is a node that has been generated but whose children have not yet
been generated.
E-node is a live node whose children are currently being explored. In other words,
an E-node is a node currently being expanded.
Dead node is a generated node that is not to be expanded or explored any further.
All children of a dead node have already been expanded.
Assignment Problem:
Given n tasks and n agents.
Each agent has a cost to complete each task.
Assign each agent a task to minimize cost.
Example:
There are two approaches to calculate the cost function:
1. For each worker, we choose job with minimum cost from list of unassigned jobs (take minimum
entry from each row).
2. For each job, we choose a worker with lowest cost for that job from list of unassigned workers
(take minimum entry from each column).
In this example, the first approach is followed.
Let's take the example below and try to calculate the promising cost when Job 2 is assigned to worker A.
Since Job 2 is assigned to worker A (marked in green), the cost becomes 2, and Job 2 and worker A
become unavailable (marked in red).
Now we assign Job 3 to worker B, as it has the minimum cost among the unassigned jobs. The cost
becomes 2 + 3 = 5, and Job 3 and worker B also become unavailable.
Finally, Job 1 gets assigned to worker C, as it has the minimum cost among the unassigned jobs, and
Job 4 gets assigned to worker D, as it is the only job left. The total cost becomes 2 + 3 + 5 + 4 = 14.
The diagram below shows the complete search space, with the optimal solution path in green.
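A minimal sketch of the promising-cost calculation from approach 1 described above. The cost matrix here is an assumption consistent with the numbers in this example (rows = workers A–D, columns = Jobs 1–4), since the original figure is not reproduced in these notes:

cost = [[9, 2, 7, 8],   # worker A
        [6, 4, 3, 7],   # worker B
        [5, 8, 1, 8],   # worker C
        [7, 6, 9, 4]]   # worker D

def promising_cost(cost, assigned_jobs, cost_so_far, next_worker):
    # add, for each remaining worker, the cheapest still-unassigned job
    bound = cost_so_far
    n = len(cost)
    for w in range(next_worker, n):
        bound += min(cost[w][j] for j in range(n) if j not in assigned_jobs)
    return bound

# after assigning Job 2 (column index 1) to worker A at cost 2:
print(promising_cost(cost, {1}, 2, 1))  # prints 10 (= 2 + 3 + 1 + 4)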
Knapsack Problem:
Example:
Solve the following instance of the knapsack problem using branch and bound.
Traveling salesman problem:
The Travelling Salesman Problem (TSP) is defined as: "given n cities and the distance
between each pair of cities, find the path which visits each city exactly once and comes
back to the starting city, with the constraint of minimizing the travelling distance."
Algorithm :
1. Initially, graph is represented by cost matrix C, where
Cij = cost of edge, if there is a direct path from city i to city j
Cij = ∞, if there is no direct path from city i to city j.
2. Convert cost matrix to reduced matrix by subtracting minimum values from appropriate
rows and columns, such that each row and column contains at least one zero entry.
3. Find the cost of the reduced matrix. The cost is given by the summation of the amounts
subtracted from the cost matrix to convert it into the reduced matrix.
(c) Reduce A again, except rows and columns having all ∞ entries.
7. Compute the cost of the newly created reduced matrix as, Cost =
L + Cost(i, j) + r
Find the solution of following travelling salesman problem using branch and bound
method.
Solution:
Reduce the above cost matrix by subtracting the minimum value from each row and column.
Reduce it by subtracting the minimum value from the corresponding column. Doing this we get,
Cost of M1 = C(1)
= Row reduction cost + Column reduction cost
= (10 + 2 + 2 + 3 + 4) + (1 + 3) = 25
This means every tour in the graph has length at least 25; this is a lower bound on the cost of any
tour, not yet the cost of the optimal tour itself.
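A minimal sketch of the reduction step used to obtain such a bound, assuming the cost matrix uses INF for missing edges (the actual matrix for this problem appears in the figure, which is not reproduced here):

INF = float('inf')

def reduce_matrix(m):
    # subtract each row's minimum from that row, then each column's
    # minimum from that column; the bound is the total amount subtracted
    n = len(m)
    bound = 0
    for i in range(n):
        r = min(m[i])
        if 0 < r < INF:
            bound += r
            m[i] = [x - r if x < INF else x for x in m[i]]
    for j in range(n):
        c = min(m[i][j] for i in range(n))
        if 0 < c < INF:
            bound += c
            for i in range(n):
                if m[i][j] < INF:
                    m[i][j] -= c
    return bound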
M2 is already reduced.
Cost of node 2 :
Cost of node 3:
= 25 + 0 + 0 = 25
Cost of node 5:
Node 4 has minimum cost for path 1-4. We can go to vertex 2, 3 or 5. Let’s explore all
three nodes.
Matrix M8 is reduced.
Cost of node 8:
NP PROBLEMS:
P – POLYNOMIAL TIME
EXAMPLE :
NP VS P:
Polynomial Reduction:
NP Complete
Example: Solve the knapsack problem for the following instance using the greedy approach. Each item
must either be selected completely or skipped completely, as in the sketch below.
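A minimal greedy sketch for this 0/1 variant: sort the items by value/weight ratio and take each item whole if it still fits. The instance for this exercise is not reproduced in these notes, so the call below reuses the weights, values and capacity from the dynamic-programming example earlier (greedy happens to find the optimum 7 here, though in general greedy is not optimal for the 0/1 knapsack):

def greedy_knapsack(weights, values, capacity):
    # consider items in decreasing order of value per unit weight
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    total_value, remaining = 0, capacity
    for i in order:
        if weights[i] <= remaining:   # take the whole item, or skip it
            total_value += values[i]
            remaining -= weights[i]
    return total_value

print(greedy_knapsack([2, 3, 4, 5], [3, 4, 5, 6], 5))  # prints 7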