CS6402 DAA Notes (Unit-3)
Dept of CSE, RMD Engineering College
Dynamic Programming
Computing a Binomial Coefficient
Warshall's Algorithm
Floyd's Algorithm
Optimal Binary Search Trees
Knapsack Problem & Memory Functions
Greedy Technique
Prim's Algorithm
Kruskal's Algorithm
Dijkstra's Algorithm
Huffman Trees
DYNAMIC PROGRAMMING
Disadvantage of the divide-and-conquer technique:
One disadvantage of divide-and-conquer is that recursively solving separate subinstances can result in the same computations being performed repeatedly, since identical subinstances may arise.
What is dynamic programming?
Dynamic programming is a technique for solving problems with overlapping subproblems.
Main idea:
set up a recurrence relating a solution to a larger instance to solutions of some smaller instances
solve smaller instances once
record solutions in a table
extract solution to the initial instance from that table
Example:
The Fibonacci numbers are the elements of the sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, . . . ,
which can be defined by the simple recurrence
F(n) = F(n - 1) + F(n - 2) for n > 1, with F(0) = 0 and F(1) = 1.
[Recursion tree for F(4): F(4) calls F(3) and F(2); F(3) calls F(2) and F(1); F(2) calls F(1) and F(0). The value of F(2) computed while solving F(3) is saved in the table and simply reused when F(4) needs it, instead of being recomputed.]
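A minimal Python sketch of this idea (the function name and memo table are illustrative, not from the notes): each Fibonacci value is computed once, recorded in a table, and reused.

def fib(n, memo=None):
    """F(n) by dynamic programming: solve each subproblem once and
    record its solution in a table (memo)."""
    if memo is None:
        memo = {0: 0, 1: 1}              # base cases F(0) = 0, F(1) = 1
    if n not in memo:
        # solve the smaller instances once and record the results
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]

print(fib(4))                            # 3; F(2) is computed once and reused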
1. COMPUTING A BINOMIAL COEFFICIENT
The value of the binomial coefficient C(n, k) for nonnegative integers n >= k >= 0 is given explicitly by
C(n, k) = n! / (k! (n - k)!),
and it can be computed with the recurrence
C(n, k) = C(n - 1, k - 1) + C(n - 1, k) for n > k > 0, with C(n, 0) = C(n, n) = 1.
Analysis:
Each of the Θ(nk) table entries is computed in constant time from the recurrence, so the time efficiency of the algorithm is Θ(nk).
Example:
The value of C(5, 3) is computed by filling the table row by row:

        j = 0   j = 1   j = 2   j = 3
i = 0     1
i = 1     1       1
i = 2     1       2       1
i = 3     1       3       3       1
i = 4     1       4       6       4
i = 5     1       5      10      10

Thus C(5, 3) = 10.
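A short Python sketch of this bottom-up computation (identifiers are illustrative, not from the notes):

def binomial(n, k):
    """Bottom-up DP for C(n, k) using
    C(i, j) = C(i-1, j-1) + C(i-1, j), with C(i, 0) = C(i, i) = 1."""
    C = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                C[i][j] = 1                          # base cases
            else:
                C[i][j] = C[i - 1][j - 1] + C[i - 1][j]
    return C[n][k]

print(binomial(5, 3))                                # 10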
2. WARSHALL'S ALGORITHM
Transitive closure
The transitive closure of a directed graph with n vertices is the n-by-n boolean matrix T = {tij} in which tij = 1 if there is a nontrivial directed path from the ith vertex to the jth vertex, and 0 otherwise.
Adjacency matrix
The adjacency matrix A = {aij} of a directed graph is the boolean matrix that has 1 in its ith row and jth column if and only if there is a directed edge from the ith vertex to the jth vertex.
Warshall's algorithm constructs the transitive closure T as the last matrix in the sequence of n-by-n matrices R(0), ..., R(k), ..., R(n), where R(k)[i, j] = 1 iff there is a nontrivial path from i to j with only the first k vertices allowed as intermediate vertices.
Note that R(0) = A (the adjacency matrix) and R(n) = T (the transitive closure).
The matrices are computed with the recurrence
R(k)[i, j] = R(k-1)[i, j] or (R(k-1)[i, k] and R(k-1)[k, j]).
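A Python sketch of the algorithm (the function name and the 0/1 matrix encoding are illustrative):

def warshall(A):
    """Warshall's algorithm. A is the boolean adjacency matrix as nested
    lists of 0/1; returns the transitive closure T = R(n)."""
    n = len(A)
    R = [row[:] for row in A]                        # R(0) = A
    for k in range(n):                               # allow vertex k as intermediate
        for i in range(n):
            for j in range(n):
                # R(k)[i,j] = R(k-1)[i,j] or (R(k-1)[i,k] and R(k-1)[k,j])
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R

Updating R in place is safe here because the entries in row k and column k do not change on the kth pass.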
Example:
3. FLOYD'S ALGORITHM
Floyd's algorithm computes the shortest paths between all pairs of vertices in a weighted digraph.
Example:
Weight matrix
The weight matrix W = {wij} of a directed graph has a positive weight in its ith row and jth column if and only if there is a directed edge from the ith vertex to the jth vertex, and infinity otherwise.
Distance matrix
The distance matrix D = {dij}, where the element dij in the ith row and the jth column indicates the length of the shortest path from the ith vertex to the jth vertex.
Floyd's algorithm constructs the distance matrix D as the last matrix in the sequence of n-by-n matrices D(0), ..., D(k), ..., D(n), where D(k)[i, j] gives the length of the shortest path from i to j with only the first k vertices allowed as intermediate vertices.
Note that D(0) = W (the weight matrix) and D(n) = D (the all-pairs shortest-path matrix).
Recurrence:
D(k)[i, j] = min { D(k-1)[i, j], D(k-1)[i, k] + D(k-1)[k, j] } for k >= 1, with D(0)[i, j] = wij.
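A Python sketch of the computation (identifiers are illustrative; float('inf') plays the role of the infinity entries in W):

INF = float('inf')                                   # "no edge" entry of W

def floyd(W):
    """Floyd's algorithm. W is the weight matrix as nested lists
    (INF where there is no edge); returns the distance matrix D(n)."""
    n = len(W)
    D = [row[:] for row in W]                        # D(0) = W
    for k in range(n):                               # allow vertex k as intermediate
        for i in range(n):
            for j in range(n):
                # D(k)[i,j] = min(D(k-1)[i,j], D(k-1)[i,k] + D(k-1)[k,j])
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D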
4. OPTIMAL BINARY SEARCH TREES
Consider four keys A, B, C, D to be searched for with probabilities 0.1, 0.2, 0.4, and 0.3, respectively, and compare two of the 14 possible binary search trees containing them.
The average number of comparisons in a successful search in the first of these trees is 0.1 · 1 + 0.2 · 2 + 0.4 · 3 + 0.3 · 4 = 2.9, and for the second one it is 0.1 · 2 + 0.2 · 1 + 0.4 · 2 + 0.3 · 3 = 2.1. Neither of these two trees is, in fact, optimal. Instead of trying all 14 possible BSTs, the OBST algorithm computes the single optimal BST directly.
Let C[i, j] be the minimum average number of comparisons made in T[i, j], the optimal BST for keys ai < ... < aj, where 1 <= i <= j <= n. Consider the optimal BST among all BSTs with some ak (i <= k <= j) as their root; T[i, j] is the best among them.
Hence,
C[i, j] = min over i <= k <= j of { C[i, k-1] + C[k+1, j] } + Σ s=i..j ps.
key           A     B     C     D
probability   0.1   0.2   0.4   0.3
The tables below are filled diagonal by diagonal: the left one is filled using the recurrence
C[i, j] = min over i <= k <= j of { C[i, k-1] + C[k+1, j] } + Σ s=i..j ps,    C[i, i] = pi;
the right one, for the trees' roots, records the values of k giving the minima.
Thus, the average number of key comparisons in the optimal tree is equal to 1.7.
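A Python sketch of the diagonal-by-diagonal table filling (identifiers are illustrative, not from the notes); for the four-key instance above it reproduces the 1.7 average:

def optimal_bst(p):
    """DP for an optimal BST. p[0..n-1] holds the search probabilities of
    the keys in sorted order. Returns the minimum average number of
    comparisons C[1, n] and the root table R."""
    n = len(p)
    p = [0.0] + p                                    # shift to 1-based indexing
    C = [[0.0] * (n + 2) for _ in range(n + 2)]
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = p[i]                               # C[i, i] = pi
        R[i][i] = i
    for d in range(1, n):                            # fill diagonal by diagonal
        for i in range(1, n - d + 1):
            j = i + d
            best, best_k = min((C[i][k - 1] + C[k + 1][j], k)
                               for k in range(i, j + 1))
            C[i][j] = best + sum(p[i:j + 1])         # add ps for s = i..j
            R[i][j] = best_k                         # root giving the minimum
    return C[1][n], R

avg, roots = optimal_bst([0.1, 0.2, 0.4, 0.3])       # keys A, B, C, D
print(round(avg, 1))                                 # 1.7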
5. KNAPSACK PROBLEM
Definition
Given n items of
integer weights: w1, w2, ..., wn
values: v1, v2, ..., vn
and a knapsack of integer capacity W,
find the most valuable subset of the items that fits into the knapsack.
Consider the instance defined by the first i items and capacity j (j <= W).
Let V[i, j] be the optimal value of such an instance. Then
V[i, j] = max { V[i-1, j], vi + V[i-1, j - wi] }  if j >= wi,
V[i, j] = V[i-1, j]  if j < wi,
with initial conditions V[0, j] = 0 and V[i, 0] = 0.
Example: knapsack capacity W = 5 and four items with weights 2, 1, 3, 2 and values $12, $10, $20, $15, respectively.
Solution
Filling the table row by row with the recurrence gives V[4, 5] = 37, so the maximal value of a feasible subset is $37.
How to find the subset of items?
We can find the composition of an optimal subset by backtracing the computations of this entry in the table. Since V[4, 5] > V[3, 5], item 4 has to be included in an optimal solution along with an optimal subset for filling the 5 - 2 = 3 remaining units of the knapsack capacity. The value of the latter is V[3, 3]. Since V[3, 3] = V[2, 3], item 3 need not be in an optimal subset. Since V[2, 3] > V[1, 3], item 2 is a part of an optimal selection, which leaves element V[1, 3 - 1] to specify its remaining composition. Similarly, since V[1, 2] > V[0, 2], item 1 is the final part of the optimal solution {item 1, item 2, item 4}.
Complexity:
The time efficiency and space efficiency of this algorithm are both in Θ(nW).
The time needed to find the composition of an optimal solution is in O(n).
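A Python sketch of the bottom-up algorithm together with the backtracing step (identifiers are illustrative; the instance is the one assumed above):

def knapsack(weights, values, W):
    """Bottom-up DP for the knapsack problem.
    V[i][j] = best value using the first i items with capacity j."""
    n = len(weights)
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(W + 1):
            if weights[i - 1] <= j:                  # item i fits: take the max
                V[i][j] = max(V[i - 1][j],
                              values[i - 1] + V[i - 1][j - weights[i - 1]])
            else:                                    # item i does not fit
                V[i][j] = V[i - 1][j]
    items, j = [], W                                 # backtrace the table
    for i in range(n, 0, -1):
        if V[i][j] != V[i - 1][j]:                   # item i was taken
            items.append(i)
            j -= weights[i - 1]
    return V[n][W], sorted(items)

print(knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))   # (37, [1, 2, 4])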
Memory Functions
The direct top-down approach to finding a solution to such a recurrence leads to an algorithm that solves common
subproblems more than once and hence is very inefficient (typically, exponential or worse). The classic dynamic
programming approach, on the other hand, works bottom up: it fills a table with solutions to all smaller
subproblems, but each of them is solved only once. An unsatisfying aspect of this approach is that solutions to some
of these smaller subproblems are often not necessary for getting a solution to the problem given. Since this
drawback is not present in the top-down approach, it is natural to try to combine the strengths of the top-down and
bottom-up approaches. The goal is to get a method that solves only subproblems that are necessary and does so only
once. Such a method exists; it is based on using memory functions.
This method solves a given problem in the top-down manner but, in addition, maintains a table of the kind that would have been used by a bottom-up dynamic programming algorithm. Initially, all the table's entries are initialized with a special "null" symbol to indicate that they have not yet been calculated. Thereafter, whenever a new value needs to be computed, the method checks the corresponding table entry first: if the entry is not "null," it is simply retrieved; otherwise, it is computed by the recursive call whose result is then recorded in the table.
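A Python sketch of a memory-function version of the knapsack algorithm (identifiers are illustrative): top-down recursion backed by a table whose entries start as None, playing the role of the "null" symbol.

def mf_knapsack(weights, values, W):
    """Memory-function (top-down + table) knapsack."""
    n = len(weights)
    V = [[None] * (W + 1) for _ in range(n + 1)]     # None = not yet calculated
    for j in range(W + 1):
        V[0][j] = 0                                  # no items -> value 0
    for i in range(n + 1):
        V[i][0] = 0                                  # zero capacity -> value 0
    def solve(i, j):
        if V[i][j] is None:                          # compute only on demand
            if weights[i - 1] > j:
                V[i][j] = solve(i - 1, j)
            else:
                V[i][j] = max(solve(i - 1, j),
                              values[i - 1] + solve(i - 1, j - weights[i - 1]))
        return V[i][j]                               # otherwise just retrieve it
    return solve(n, W)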
Let us apply the memory function method to the instance considered in the previous example. Only 11 out of the 20 nontrivial values (i.e., those not in row 0 or in column 0) have been computed. Just one nontrivial entry, V[1, 2], is retrieved rather than being recomputed. For larger instances, the proportion of such entries can be significantly larger.
In general, we cannot expect more than a constant-factor gain in using the memory function method for the
knapsack problem, because its time efficiency class is the same as that of the bottom-up algorithm.
GREEDY TECHNIQUES
What is the greedy technique?
The greedy technique constructs a solution to an optimization problem piece by piece through a sequence of choices that are:
feasible, i.e., satisfying the problem's constraints
locally optimal, i.e., the best local choice among all feasible choices available at that step
irrevocable, i.e., once made, the choice cannot be changed on subsequent steps of the algorithm
For some problems, the greedy technique yields an optimal solution for every instance. For most, it does not, but it can still be useful for fast approximations.
Optimal solutions:
change making for normal coin denominations
minimum spanning tree (MST)
single-source shortest paths
Huffman codes
Approximations:
traveling salesman problem (TSP)
knapsack problem
other combinatorial optimization problems
1. PRIM'S ALGORITHM
This algorithm is used to find a minimum spanning tree for a given weighted connected graph.
Basic idea of the algorithm
Start with a tree T1 consisting of one (any) vertex and grow the tree one vertex at a time to produce the MST through a series of expanding subtrees T1, T2, ..., Tn.
On each iteration, construct Ti+1 from Ti by adding the vertex not in Ti that is closest to those already in Ti (this is the greedy step!); ties can be broken arbitrarily.
Stop when all vertices are included
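A Python sketch with a min-heap priority queue (the dict-of-adjacency-lists encoding and identifiers are illustrative, not from the notes):

import heapq

def prim(graph, start):
    """Prim's algorithm. graph maps each vertex to a list of
    (weight, neighbor) pairs; returns the MST edges as (weight, u, v)."""
    visited = {start}
    fringe = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(fringe)
    mst = []
    while fringe and len(visited) < len(graph):
        w, u, v = heapq.heappop(fringe)              # closest vertex: greedy step
        if v in visited:
            continue                                 # stale entry, already in tree
        visited.add(v)
        mst.append((w, u, v))
        for w2, x in graph[v]:
            if x not in visited:
                heapq.heappush(fringe, (w2, v, x))
    return mst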
Complexity:
O(n^2) for the weight-matrix representation of the graph and array implementation of the priority queue.
O(m log n) for the adjacency-list representation of a graph with n vertices and m edges and a min-heap implementation of the priority queue.
Example:
2. KRUSKAL'S ALGORITHM
This algorithm is also used to find a minimum spanning tree for a given weighted connected graph.
Basic idea of the algorithm
Sort the edges in non-decreasing order of their lengths.
Grow the tree one edge at a time to produce the MST through a series of expanding forests F1, F2, ..., Fn-1.
On each iteration, add the next edge from the sorted list unless this would create a cycle (if it would, skip the edge).
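A Python sketch with a simple union-find structure to detect cycles (identifiers are illustrative, not from the notes):

def kruskal(n, edges):
    """Kruskal's algorithm. Vertices are 0..n-1; edges is a list of
    (weight, u, v) triples. Returns the MST edges."""
    parent = list(range(n))
    def find(x):                                     # root of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]]            # path compression
            x = parent[x]
        return x
    mst = []
    for w, u, v in sorted(edges):                    # non-decreasing weights
        ru, rv = find(u), find(v)
        if ru != rv:                                 # joins two components: no cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst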
4. HUFFMAN TREES
Example:
Codewords:
Compression ratio:
(3 - 2.25)/3 · 100% = 25%.
Thus, Huffman's encoding of the text will use 25% less memory than its fixed-length encoding.
Advantages and disadvantages (fixed/static Huffman code)
Huffman's encoding is one of the most important file-compression methods. In addition to its simplicity and versatility, it yields an optimal, i.e., minimal-length, encoding (provided the frequencies of symbol occurrences are independent and known in advance).
The simplest version of Huffman compression calls, in fact, for a preliminary scanning of a given text to count the
frequencies of symbol occurrences in it. Then these frequencies are used to construct a Huffman coding tree and encode the text
as described above.
This scheme makes it necessary, however, to include the coding table in the encoded text to make its decoding possible. This drawback can be overcome by using dynamic Huffman encoding, in which the coding tree is updated each time a new symbol is read from the source text.
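A Python sketch of the tree construction with a min-heap (identifiers are illustrative; the frequencies below assume a five-symbol instance whose average codeword length is the 2.25 bits used in the compression ratio above):

import heapq

def huffman_codes(freq):
    """Build a Huffman tree from a symbol -> frequency mapping and
    return the table of codewords."""
    # heap entries are (weight, tiebreaker, tree); a tree is a symbol or a
    # (left, right) pair; the tiebreaker keeps comparisons well defined
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)              # two smallest-weight trees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, code):
        if isinstance(tree, tuple):
            walk(tree[0], code + "0")                # left edge labeled 0
            walk(tree[1], code + "1")                # right edge labeled 1
        else:
            codes[tree] = code or "0"                # single-symbol edge case
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"A": 0.35, "B": 0.1, "C": 0.2, "D": 0.2, "_": 0.15}))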