DAA_4
Example: A digraph, its adjacency matrix, and its transitive closure are given below.
Figure: (a) Digraph. (b) Its adjacency matrix. (c) Its transitive closure.
We can generate the transitive closure of a digraph with the help of depth-first search or breadth-first search. Performing either traversal starting at the ith vertex gives the information about the vertices reachable from it, and hence about the columns that contain 1's in the ith row of the transitive closure. Thus, doing such a traversal for every vertex as a starting point yields the transitive closure in its entirety; a sketch of this approach is given below.
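As an illustration, here is a minimal Python sketch of this traversal-based method, assuming the digraph is given by a 0/1 adjacency matrix A (the function name and representation are our choices, not the text's):

from collections import deque

def transitive_closure_bfs(A):
    """Transitive closure of a digraph given by adjacency matrix A,
    computed by running one BFS from every vertex."""
    n = len(A)
    T = [[0] * n for _ in range(n)]
    for i in range(n):                 # one traversal per start vertex
        queue, seen = deque([i]), set()
        while queue:
            u = queue.popleft()
            for v in range(n):
                if A[u][v] == 1 and v not in seen:
                    seen.add(v)        # v is reachable from i by a
                    T[i][v] = 1        # path of positive length
                    queue.append(v)
    return T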
Since this method traverses the same digraph several times, we can use a better algorithm called Warshall's algorithm. Warshall's algorithm constructs the transitive closure through a series of n × n boolean matrices
R(0), . . . , R(k−1), R(k), . . . , R(n).
Each of these matrices provides certain information about directed paths in the digraph. Specifically, the element rij(k) in the ith row and jth column of matrix R(k) (i, j = 1, 2, . . . , n, k = 0, 1, . . . , n) is equal to 1 if and only if there exists a directed path of a positive length from the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than k.
Thus, the series starts with R(0), which does not allow any intermediate vertices in its paths; hence, R(0) is nothing other than the adjacency matrix of the digraph. R(1) contains the information about paths that can use the first vertex as intermediate. The last matrix in the series, R(n), reflects paths that can use all n vertices of the digraph as intermediate and hence is nothing other than the digraph's transitive closure.
Suppose rij(k) = 1. This means that there exists a path from the ith vertex vi to the jth vertex vj with each intermediate vertex numbered not higher than k:
vi, a list of intermediate vertices each numbered not higher than k, vj. --- (*)
Two situations regarding this path are possible.
1. In the first, the list of its intermediate vertices does not contain the kth vertex. Then this path from vi to vj has each intermediate vertex numbered not higher than k − 1, i.e., rij(k−1) = 1.
2. The second possibility is that path (*) does contain the kth vertex vk among the intermediate vertices. Then path (*) can be rewritten as
vi, vertices numbered ≤ k − 1, vk, vertices numbered ≤ k − 1, vj,
i.e., rik(k−1) = 1 and rkj(k−1) = 1.
Thus, we have the following formula for generating the elements of matrix R(k) from the elements of matrix R(k−1):
rij(k) = rij(k−1) or (rik(k−1) and rkj(k−1)).
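A minimal Python sketch of Warshall's algorithm implementing this formula (the in-place update of a single matrix is a standard space-saving choice, discussed in the analysis below):

def warshall(A):
    """Warshall's algorithm: transitive closure of a digraph
    given by its n-by-n boolean adjacency matrix A."""
    n = len(A)
    R = [row[:] for row in A]          # R(0) is the adjacency matrix
    for k in range(n):                 # allow vertex k as intermediate
        for i in range(n):
            for j in range(n):
                # r_ij(k) = r_ij(k-1) or (r_ik(k-1) and r_kj(k-1))
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R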
As an example, the application of Warshall’s algorithm to the digraph is shown below. New
1’s are in bold.
Analysis
Its time efficiency is Θ(n³). We can make the algorithm run faster by treating matrix rows as bit strings and employing the bitwise or operation available in most modern computer languages, as sketched below.
Space efficiency: although the algorithm as described uses separate matrices for recording intermediate results, this can be avoided by performing all the updates in a single matrix, as in the sketch above.
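As an illustration of the bit-string speedup, here is a sketch in Python using arbitrary-precision integers as row bit strings (this particular packing is our choice):

def warshall_bitwise(A):
    """Warshall's algorithm with each matrix row packed into an
    integer, so the inner loop becomes one bitwise or per row."""
    n = len(A)
    # pack row i into an int whose jth bit is A[i][j]
    rows = [sum(A[i][j] << j for j in range(n)) for i in range(n)]
    for k in range(n):
        for i in range(n):
            if rows[i] >> k & 1:       # r_ik = 1: or in row k
                rows[i] |= rows[k]
    return [[rows[i] >> j & 1 for j in range(n)] for i in range(n)]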
Floyd's Algorithm for the All-Pairs Shortest-Paths Problem
Given a weighted graph or digraph, the all-pairs shortest-paths problem asks to find the lengths of the shortest paths from each vertex to all other vertices. It is convenient to record these lengths in an n × n matrix called the distance matrix. An example is given below.
Figure: (a) Digraph. (b) Its weight matrix. (c) Its distance matrix.
We can generate the distance matrix with an algorithm that is very similar to Warshall's algorithm; it is called Floyd's algorithm. Floyd's algorithm computes the distance matrix of a weighted graph with n vertices through a series of n × n matrices
D(0), . . . , D(k−1), D(k), . . . , D(n).
The element dij(k) in the ith row and the jth column of matrix D(k) (i, j = 1, 2, . . . , n, k = 0, 1, . . . , n) is equal to the length of the shortest path among all paths from the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than k. As in Warshall's algorithm, we can compute all the elements of each matrix D(k) from its immediate predecessor D(k−1).
We can partition all such paths into two disjoint subsets: those that do not use the kth vertex vk as intermediate and those that do.
i. Since the paths of the first subset have their intermediate vertices numbered not higher than k − 1, the shortest of them is, by definition of our matrices, of length dij(k−1).
ii. In the second subset the paths are of the form
vi, vertices numbered ≤ k − 1, vk, vertices numbered ≤ k − 1, vj,
and hence the shortest of them has length dik(k−1) + dkj(k−1).
Taking into account the lengths of the shortest paths in both subsets leads to the following recurrence:
dij(k) = min{dij(k−1), dik(k−1) + dkj(k−1)} for k ≥ 1, with dij(0) = wij.
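A minimal Python sketch of Floyd's algorithm implementing this recurrence (representing absent edges by float('inf') is our convention; the graph is assumed to have no cycle of negative length):

def floyd(W):
    """Floyd's algorithm: all-pairs shortest-path distances for a
    weighted digraph given by its weight matrix W, where W[i][j]
    is float('inf') if there is no edge from i to j."""
    n = len(W)
    D = [row[:] for row in W]          # D(0) is the weight matrix
    for k in range(n):                 # allow vertex k as intermediate
        for i in range(n):
            for j in range(n):
                # d_ij(k) = min(d_ij(k-1), d_ik(k-1) + d_kj(k-1))
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    return D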
Optimal Binary Search Trees
Let a1, . . . , an be distinct keys ordered from the smallest to the largest and let p1, . . . , pn be the probabilities of searching for them. Let C(i, j) be the smallest average number of comparisons made in a successful search in a binary search tree Tij made up of keys ai, . . . , aj, where i, j are some integer indices, 1 ≤ i ≤ j ≤ n.
Following the classic dynamic programming approach, we will find values of C(i, j) for all smaller instances of the problem, although we are interested just in C(1, n). To derive a recurrence underlying a dynamic programming algorithm, we will consider all possible ways to choose a root ak among the keys ai, . . . , aj. For such a binary search tree (Figure 8.8), the root contains key ak, the left subtree Ti,k−1 contains keys ai, . . . , ak−1 optimally arranged, and the right subtree Tk+1,j contains keys ak+1, . . . , aj, also optimally arranged. (Note how we are taking advantage of the principle of optimality here.)
If we count tree levels starting with 1 to make the comparison numbers equal the keys' levels, the following recurrence relation is obtained:
C(i, j) = min{C(i, k − 1) + C(k + 1, j) : i ≤ k ≤ j} + (pi + . . . + pj) for 1 ≤ i ≤ j ≤ n, --- (8.8)
with the initial conditions C(i, i − 1) = 0 for 1 ≤ i ≤ n + 1 and C(i, i) = pi for 1 ≤ i ≤ n.
The two-dimensional table in Figure 8.9 shows the values needed for computing C(i, j) by formula (8.8): they are in row i and the columns to the left of column j, and in column j and the rows below row i. The arrows point to the pairs of entries whose sums are computed in order to find the smallest one to be recorded as the value of C(i, j). This suggests filling the table along its diagonals, starting with all zeros on the main diagonal and given probabilities pi, 1 ≤ i ≤ n, right above it, and moving toward the upper right corner.
The algorithm we just sketched computes C(1, n), the average number of comparisons for successful searches in the optimal binary search tree. If we also want to get the optimal tree itself, we need to maintain another two-dimensional table to record the value of k for which the minimum in (8.8) is achieved. The table has the same shape as the table in Figure 8.9 and is filled in the same manner, starting with entries R(i, i) = i for 1 ≤ i ≤ n. When the table is filled, its entries indicate indices of the roots of the optimal subtrees, which makes it possible to reconstruct an optimal tree for the entire set given.
Example: Let us illustrate the algorithm by applying it to the following four-key set:
key:         A    B    C    D
probability: 0.1  0.2  0.4  0.3
Thus, out of two possible binary trees containing the first two keys, A and B, the root of the
optimal tree has index 2 (i.e., it contains B), and the average number of comparisons in a
successful search in this tree is 0.4. On finishing the computations we get the following final
tables:
Thus, the average number of key comparisons in the optimal tree is equal to 1.7. Since R(1, 4)
= 3, the root of the optimal tree contains the third key, i.e., C. Its left subtree is made up of keys
A and B, and its right subtree contains just key D. To find the specific structure of these subtrees, we first find their roots by consulting the root table again as follows. Since R(1, 2) = 2, the root of the optimal tree containing A and B is B, with A being its left child (and the root of the one-node tree: R(1, 1) = 1). Since R(4, 4) = 4, the root of this one-node optimal tree is its only key D. The figure given below presents the optimal tree in its entirety.
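A minimal Python sketch of the table-filling algorithm described above; it reproduces the numbers of this example (the function name and 1-based table layout are our choices):

def optimal_bst(p):
    """Dynamic programming for an optimal binary search tree.
    p[i-1] is the probability of searching for the ith key.
    Returns the C and R tables, indexed from 1 as in the text."""
    n = len(p)
    C = [[0.0] * (n + 2) for _ in range(n + 2)]
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = p[i - 1]             # one-key trees: C(i, i) = p_i
        R[i][i] = i
    for d in range(1, n):              # fill along the diagonals
        for i in range(1, n - d + 1):
            j = i + d
            best, best_k = float('inf'), i
            for k in range(i, j + 1):  # try each key a_k as the root
                val = C[i][k - 1] + C[k + 1][j]
                if val < best:
                    best, best_k = val, k
            C[i][j] = best + sum(p[i - 1:j])   # add p_i + ... + p_j
            R[i][j] = best_k
    return C, R

C, R = optimal_bst([0.1, 0.2, 0.4, 0.3])
print(C[1][4], R[1][4])   # 1.7 (up to floating-point rounding) and 3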
The Knapsack Problem
Recall the knapsack problem: given n items of known weights w1, . . . , wn and values v1, . . . , vn and a knapsack of capacity W, we need to find the most valuable subset of the items that fits into the knapsack. Let us consider an instance defined by the first i items, 1 ≤ i ≤ n, with weights w1, . . . , wi, values v1, . . . , vi, and knapsack capacity j, 1 ≤ j ≤ W. Let F(i, j) be the value of an optimal solution to this instance. We can divide all the subsets of the first i items that fit the knapsack of capacity j into two categories: those that do not include the ith item and those that do. Note the following:
i. Among the subsets that do not include the ith item, the value of an optimal subset is, by
definition, F(i − 1, j).
ii. Among the subsets that do include the ith item (hence, j − wi ≥ 0), an optimal subset is made up of this item and an optimal subset of the first i − 1 items that fits into the knapsack of capacity j − wi. The value of such an optimal subset is vi + F(i − 1, j − wi).
Thus, the value of an optimal solution among all feasible subsets of the first i items is the maximum of these two values:
F(i, j) = max{F(i − 1, j), vi + F(i − 1, j − wi)} if j − wi ≥ 0, and F(i, j) = F(i − 1, j) if j − wi < 0.
It is convenient to define the initial conditions as follows:
F(0, j) = 0 for j ≥ 0 and F(i, 0) = 0 for i ≥ 0.
Our goal is to find F(n, W), the maximal value of a subset of the n given items that fits into the knapsack of capacity W. A sketch of the bottom-up computation is given below.
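A minimal Python sketch of the bottom-up computation implementing this recurrence (the function name and list-of-lists table are our choices):

def knapsack(weights, values, W):
    """Bottom-up dynamic programming for the knapsack problem.
    weights[i-1], values[i-1] describe the ith item; W is the capacity.
    Returns the table F, with F[n][W] the optimal value."""
    n = len(weights)
    F = [[0] * (W + 1) for _ in range(n + 1)]   # row 0, column 0 are 0
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            if j >= weights[i - 1]:
                # max of leaving the ith item out and taking it
                F[i][j] = max(F[i - 1][j],
                              values[i - 1] + F[i - 1][j - weights[i - 1]])
            else:
                F[i][j] = F[i - 1][j]
    return F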
Example-1: Let us apply the algorithm to the instance with knapsack capacity W = 5 and four items: item 1 (w1 = 2, v1 = 12), item 2 (w2 = 1, v2 = 10), item 3 (w3 = 3, v3 = 20), and item 4 (w4 = 2, v4 = 15). Filling the table yields F(4, 5) = 37, the maximal value.
We can find the composition of an optimal subset by backtracing the computations of this entry in the table. Since F(4, 5) > F(3, 5), item 4 has to be included in an optimal solution along with an optimal subset for filling the 5 − 2 = 3 remaining units of the knapsack capacity. The value of the latter is F(3, 3). Since F(3, 3) = F(2, 3), item 3 need not be in an optimal subset. Since F(2, 3) > F(1, 3), item 2 is a part of an optimal selection, which leaves element F(1, 3 − 1) to specify its remaining composition. Similarly, since F(1, 2) > F(0, 2), item 1 is the final part of the optimal solution {item 1, item 2, item 4}. This backtracing is sketched in code below.
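A sketch of the backtracing step in Python, building on the knapsack function above (the helper name knapsack_items is ours):

def knapsack_items(weights, values, W):
    """Return the optimal value and the (1-based) indices of an
    optimal subset, by backtracing the filled table F."""
    F = knapsack(weights, values, W)
    items, j = [], W
    for i in range(len(weights), 0, -1):
        if F[i][j] != F[i - 1][j]:     # the ith item was taken
            items.append(i)
            j -= weights[i - 1]
    return F[len(weights)][W], sorted(items)

# the instance of Example-1: prints (37, [1, 2, 4])
print(knapsack_items([2, 1, 3, 2], [12, 10, 20, 15], 5))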
Analysis
The time efficiency and space efficiency of this algorithm are both in Θ(nW). The time needed
to find the composition of an optimal solution is in O(n).
Memory Functions
The direct top-down approach to finding a solution to a recurrence such as the one above leads to an algorithm that solves common subproblems more than once and hence is very inefficient.
The classic dynamic programming approach, on the other hand, works bottom up: it fills a table
with solutions to all smaller subproblems, but each of them is solved only once. An unsatisfying
aspect of this approach is that solutions to some of these smaller subproblems are often not
necessary for getting a solution to the problem given. Since this drawback is not present in the
top-down approach, it is natural to try to combine the strengths of the top-down and bottom-up
approaches. The goal is to get a method that solves only subproblems that are necessary and
does so only once. Such a method exists; it is based on using memory functions.
This method solves a given problem in the top-down manner but, in addition, maintains a table
of the kind that would have been used by a bottom-up dynamic programming algorithm.
Initially, all the table’s entries are initialized with a special “null” symbol to indicate that they
have not yet been calculated. Thereafter, whenever a new value needs to be calculated, the
method checks the corresponding entry in the table first: if this entry is not “null,” it is simply
retrieved from the table; otherwise, it is computed by the recursive call whose result is then
recorded in the table.
The following algorithm implements this idea for the knapsack problem. After initializing the
table, the recursive function needs to be called with i = n (the number of items) and j = W (the
knapsack capacity).
Algorithm MFKnapsack(i, j)
//Implements the memory function method for the knapsack problem
//Input: A nonnegative integer i indicating the number of the first items being
//       considered and a nonnegative integer j indicating the knapsack capacity
//Output: The value of an optimal feasible subset of the first i items
//Note: Uses as global variables input arrays Weights[1..n], Values[1..n], and
//      table F[0..n, 0..W] whose entries are initialized with −1's except for
//      row 0 and column 0 initialized with 0's
if F[i, j] < 0
    if j < Weights[i]
        value ← MFKnapsack(i − 1, j)
    else
        value ← max(MFKnapsack(i − 1, j),
                    Values[i] + MFKnapsack(i − 1, j − Weights[i]))
    F[i, j] ← value
return F[i, j]
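The same idea as a runnable Python sketch (the wrapper name mf_knapsack and the nested helper are our own choices; the table layout follows the pseudocode):

def mf_knapsack(weights, values, W):
    """Memory function (top-down with a table) for the knapsack
    problem; computes only the table entries that are needed."""
    n = len(weights)
    # -1 marks "not yet computed"; row 0 and column 0 are 0
    F = [[0] * (W + 1)] + [[0] + [-1] * W for _ in range(n)]

    def solve(i, j):
        if F[i][j] < 0:                # not computed yet
            if j < weights[i - 1]:
                F[i][j] = solve(i - 1, j)
            else:
                F[i][j] = max(solve(i - 1, j),
                              values[i - 1] + solve(i - 1, j - weights[i - 1]))
        return F[i][j]

    return solve(n, W)

# the instance of Example-1 again: prints 37
print(mf_knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))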
Example-2: Let us apply the memory function method to the instance considered in Example-1. The table in the figure given below gives the results. Only 11 out of 20 nontrivial values (i.e., values not in row 0 or in column 0) have been computed. Just one nontrivial entry, F(1, 2), is retrieved rather than being recomputed. For larger instances, the proportion of such entries can be significantly larger.
Figure: Example of solving an instance of the knapsack problem by the memory function algorithm
In general, we cannot expect more than a constant-factor gain in using the memory function method for the knapsack problem, because its time efficiency class is the same as that of the bottom-up algorithm.