8000 Merge pull request #30 from mstrechen/master · nemanator/Algorithms-Explanation@10d1761 · GitHub
[go: up one dir, main page]

Skip to content

Commit 10d1761

Browse files
authored
Merge pull request TheAlgorithms#30 from mstrechen/master
Add Longest Common subsequence algorithm
2 parents f85f0f4 + ef39cc7 commit 10d1761

File tree

1 file changed

+80
-0
lines changed

1 file changed

+80
-0
lines changed
Lines changed: 80 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,80 @@
1+
# Longest Common Subsequence
2+
3+
#### Problem Statement
4+
5+
Given two strings `S` and `T`, find the length of the longest common subsequence (<b>LCS</b>).
6+
7+
#### Approach
8+
9+
Let the `dp[i][j]` be the length of the longest common subsequence of prefixes `S[1..i]` and `T[1..j]`. Our answer (the length of LCS) is `dp[|S|][|T|]` since the prefix of the length of string is the string itself.
10+
11+
Both `dp[0][i]` and `dp[i][0]` are `0` for any `i` since the LCS of empty prefix and anything else is an empty string.
12+
13+
Now let's try to calculate `dp[i][j]` for any `i`, `j`. Let's say `S[1..i] = *A` and `T[1..j] = *B` where `*` stands for any sequence of letters (could be different for `S` and `T`), `A` stands for any letter and `B` stands for any letter different from `A`. Since `A != B`, our LCS doesn't include `A` or `B` as a last character. So we could try to throw away `A` or `B` character. If we throw `A`, our LCS length will be `dp[i - 1][j]` (since we have prefixes `S[1..i - 1]` and `T[1..j]`). If we try to throw `B` character, we will have prefixes `S[1..i]` and `T[1..j - 1]` so the length of LCS will be `dp[i][j - 1]`. As we are looking for the <b>Longest</b> common subsequence, we will pick <b>the maximum value</b> from `dp[i][j - 1]` and `dp[i - 1][j]`.
14+
15+
But what if `S[1..i] = *A` and `T[1..j] = *A`? We could say that the LCS of our prefixes is LCS of prefixes `S[1..i - 1]` and `T[1..j - 1]` <b>plus</b> the letter `A`. So `dp[i][j] = dp[i - 1][j - 1] + 1` if `S[i] = T[j]`.
16+
17+
We could see that we can fill our `dp` table row by row, column by column. So our algorithm will be like:
18+
19+
- Let's say that we have strings `S` of the length N and `T` of the length M (numbered from 1). Let's create the table `dp` of size `(N + 1) x (M + 1)` numbered from 0.
20+
- Let's fill the 0th row and the 0th column of `dp` with 0.
21+
- Then, we follow the algorithm:
22+
```
23+
for i in range(1..N):
24+
for j in range(1..M):
25+
if(S[i] == T[j])
26+
dp[i][j] = dp[i - 1][j - 1] + 1
27+
else
28+
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
29+
```
30+
31+
32+
#### Time Complexity
33+
34+
`O(N * M)` In any case
35+
36+
#### Space Complexity
37+
38+
`O(N * M)` - simple implementation
39+
`O(min {N, M})` - two-layers implementation (as `dp[i][j]` depends on only i-th and i-th layers, we coudld store only two layers).
40+
41+
#### Example
42+
43+
Let's say we have strings `ABCB` and `BBCB`. We will build such a table:
44+
```
45+
# # A B C B
46+
# 0 0 0 0 0
47+
B 0 ? ? ? ?
48+
B 0 ? ? ? ?
49+
C 0 ? ? ? ?
50+
B 0 ? ? ? ?
51+
```
52+
Now we will start to fill our table from 1st row. Since `S[1] = A` and `T[1] = B`, the `dp[1][1]` will be tha maximal value of `dp[0][1] = 0` and `dp[1][0] = 0`. So `dp[1][1] = 0`. But now `S[2] = B = T[1]`, so `dp[1][2] = dp[0][1] + 1 = 1`. `dp[1][3]` is `1` since `A != C` and we pick `max{dp[1][2], dp[0][3]}`. And `dp[1][4] = dp[0][3] + 1 = 1`.
53+
```
54+
# # A B C B
55+
# 0 0 0 0 0
56+
B 0 0 1 1 1
57+
B 0 ? ? ? ?
58+
C 0 ? ? ? ?
59+
B 0 ? ? ? ?
60+
```
61+
Now let's fill the other part of the table:
62+
```
63+
# # A B C B
64+
# 0 0 0 0 0
65+
B 0 0 1 1 1
66+
B 0 0 1 1 2
67+
C 0 0 1 2 2
68+
B 0 0 1 2 3
69+
```
70+
So the length of LCS is `dp[4][4] = 3`.
71+
72+
#### Code Implementation Links
73+
74+
- [Java](https://github.com/TheAlgorithms/Java/blob/master/Dynamic%20Programming/LongestCommonSubsequence.java)
75+
- [Python](https://github.com/TheAlgorithms/Python/blob/master/dynamic_programming/longest_common_subsequence.py)
76+
- [C++](https://github.com/TheAlgorithms/C-Plus-Plus/blob/master/Dynamic%20Programming/Longest%20Common%20Subsequence.cpp)
77+
78+
#### Video Explanation
79+
80+
[Video explanation by Tushar Roy](https://youtu.be/NnD96abizww)

0 commit comments

Comments
 (0)
0