[go: up one dir, main page]

0% found this document useful (0 votes)
38 views3 pages

Dynamic Programming Solution To The Longest Common Subsequence Problem

This document summarizes a dynamic programming solution to find the longest common subsequence (LCS) between two sequences. It presents the key ideas of the algorithm, including characterizing the optimal substructure, recursively defining the value of the optimal solution, computing it bottom-up using dynamic programming, and constructing the optimal solution by backtracking through the computed information. The overall runtime is O(mn) where m and n are the lengths of the two sequences, and O(mn) additional space is used.

Uploaded by

Hur
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
38 views3 pages

Dynamic Programming Solution To The Longest Common Subsequence Problem

This document summarizes a dynamic programming solution to find the longest common subsequence (LCS) between two sequences. It presents the key ideas of the algorithm, including characterizing the optimal substructure, recursively defining the value of the optimal solution, computing it bottom-up using dynamic programming, and constructing the optimal solution by backtracking through the computed information. The overall runtime is O(mn) where m and n are the lengths of the two sequences, and O(mn) additional space is used.

Uploaded by

Hur
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 3

Dynamic Programming Solution to the

Longest Common Subsequence Problem


Cheng Li, Virgil Pavlu
[this solution follows “Introduction to Algorithms” book by Cormen et al]

Longest Common Subsequence Problem


Given two sequences X =< x1 , x2 , . . . , xm > and Y =< y1 , y2 , . . . , yn >, find a maximum length common
subsequence of X and Y .

Methodology
(1) Characterize the Structure of an Optimal Solution. The LCS problem exhibits optimal sub-
structure in the following manner. Given a sequence X =< x1 , x2 , . . . , xm >, we define the ith prefix of X,
for i = 0, 1, . . . , m, as Xi =< x1 , x2 , . . . , xi >.
Claim 1 Let X =< x1 , x2 , . . . , xm > and Y =< y1 , y2 , . . . , yn > be sequences, and let Z =< z1 , z2 , . . . , zk >
be any LCS of X and Y .
1. If xm = yn , then zk = xm = yn and Zk−1 is an LCS of Xm−1 and Yn−1 .
2. If xm 6= yn , then zk 6= xm implies that Z is an LCS of Xm−1 and Y .
3. If xm 6= yn , then zk 6= yn implies that Z is an LCS of X and Yn−1 .
Proof: (1) By contradiction, assume zk 6= xm , then by appending xm = yn to Z, we get a common
subsequence of X and Y of length k + 1, contradicting the supposed optimality of Z. So zk = xm = yn .
Thus, the prefix Zk−1 is a common subsequence of Xm−1 and Yn−1 . Next we show that it is an LCS. Suppose
for the purpose of contradiction that there exists a common subsequence W of Xm−1 and Yn−1 with length
greater than k − 1. We can append xm = yn to W and get a common subsequence of X and Y whose length
is greater than k, which contradicting the supposed optimality of Z.
(2) zk 6= xm implies that Z is a common subsequence of Xm−1 and Y . By contradiction, suppose
that there is a common subsequence W of Xm−1 and Y with length greater than k, then W is a common
subsequence of Xm and Y , contradicting the supposed optimality of Z.
(3) The proof is similar to (2). 2

(2) Recursively Define the Value of the Optimal Solution. Let C[i, j] be the length of an LCS of
the sequences Xi and Yj . If either i = 0 or j = 0, one of the sequences has length 0, and so the LCS has
length 0. If i, j > 0 and xm = yn we should first find an LCS of Xm−1 and Yn−1 and then append xm = yn
to this LCS to get an LCS of X and Y . If i, j > 0 and xm 6= yn , then we must first find an LCS of Xm−1
and Y and an LCS of X and Yn−1 , and then choose the longer one as an LCS of X and Y . We thus have
the following recurrence.
Claim 2 
0
 if i = 0 or j = 0,
C[i, j] = C[i − 1, j − 1] + 1 if i, j > 0 and xi = yj ,

max(C[i, j − 1], C[i − 1, j]) if i, j > 0 and xi 6= yj .

Proof: The correctness of this recursive definition is embodied in the paragraph which proceeds it. 2

1
(3) Compute the Value of the Optimal Solution Bottom-up. Consider the following piece of pseu-
docode, where X =< x1 , x2 , . . . , xm >, Y =< y1 , y2 , . . . , yn >.

LCS-Length(X, Y )
1 m = X.length
2 n = Y.length
3 let S[1..m, 1..n] and C[0..m, 0..n] be new tables
4 for i = 1 to m
5 C[i, 0] = 0
6 for j = 0 to n
7 C[0, j] = 0
8 for i = 1 to m
9 for j = 1 to n
10 if xi == yj
11 C[i, j] = C[i − 1, j − 1] + 1
12 S[i, j] = “ - ”
13 elseif C[i − 1, j] ≥ C[i, j − 1]
14 C[i, j] = C[i − 1, j]
15 S[i, j] = “ ↑ ”
16 else C[i, j] = C[i, j − 1]
17 S[i, j] = “ ← ”
18 return C and S

Claim 3 When the above procedure terminates, C[i, j] will contain the length of an LCS of the sequences
Xi and Yj , and S[i, j] will point to the table entry corresponding to the optimal subproblem solution chosen
when computing C[i, j].
Proof: The correctness of the above procedure is based on the fact that it correctly implements the recursive
definition given above. The base case is properly handled in Line 4-7, and the recursive case is properly
handled in Lines 8 to 17. Note that since the loop defined in Line 8 goes from 1 to m and the loop defined
in Line 9 goes from 1 to n, no element of C is accessed in either Line 11,13,14 or 16 before it has been
computed. 2

(4) Construct the Optimal Solution from the Computed Information. Consider the following
piece of pseudocode, where S is the table computed above.

Print-LCS(S, X, i, j)
1 if i == 0 or j == 0
2 return
3 if S[i, j] == “ - ”
4 Print-LCS(S, X, i − 1, j − 1)
5 print xi
6 elseif S[i, j] == “ ↑ ”
7 Print-LCS(S, X, i − 1, j)
8 elsePrint-LCS(S, X, i, j − 1)

Claim 4 The above procedure prints out an LCS of X and Y .


Proof: The above procedure traces through the table by following the arrows. When S[i, j]= “-”, xi = yj
is an element of the LCS, and the procedure will print it out. 2

2
j 0 1 2 3 4 5 6

i yi B D C A B A

0 xi
0 0 0 0 0 0 0
↑ ↑ ↑ - ← -
1 A
0 0 0 0 1 1 1
- ← ← ↑ - ←
2 B
0 1 1 1 1 2 2
↑ ↑ - ← ↑ ↑
3 C
0 1 1 2 2 2 2
- ↑ ↑ ↑ - ←
4 B
0 1 1 2 2 3 3
↑ - ↑ ↑ ↑ ↑
5 D
0 1 2 2 2 3 3
↑ ↑ ↑ - ↑ -
6 A
0 1 2 2 3 3 4
- ↑ ↑ ↑ - ↑
7 B
0 1 2 2 3 4 4

Figure 1: The C and S tables computed by LCS-LENGTH on the sequence X =< A, B, C, B, D, A, B >
and Y =< B, D, C, A, B, A >.

(5) Running Time and Space Requirements. The LCS-Length procedure runs in Θ(mn) since each
table entry takes Θ(1) time to compute, and it uses Θ(mn) additional space in the form of the tables S and
C. The Print-LCS procedure runs in time O(m + n) since it decrements at least one of i and j in each
recursive call. It uses no additional space beyond the inputs given. Thus, the total running time is Θ(mn)
and the total space requirement is Θ(mn).

You might also like