|
| 1 | +# Longest Common Subsequence |
| 2 | + |
| 3 | +#### Problem Statement |
| 4 | + |
| 5 | +Given two strings `S` and `T`, find the length of the longest common subsequence (<b>LCS</b>). |
| 6 | + |
| 7 | +#### Approach |
| 8 | + |
| 9 | +Let the `dp[i][j]` be the length of the longest common subsequence of prefixes `S[1..i]` and `T[1..j]`. Our answer (the length of LCS) is `dp[|S|][|T|]` since the prefix of the length of string is the string itself. |
| 10 | + |
| 11 | +Both `dp[0][i]` and `dp[i][0]` are `0` for any `i` since the LCS of empty prefix and anything else is an empty string. |
| 12 | + |
| 13 | +Now let's try to calculate `dp[i][j]` for any `i`, `j`. Let's say `S[1..i] = *A` and `T[1..j] = *B` where `*` stands for any sequence of letters (could be different for `S` and `T`), `A` stands for any letter and `B` stands for any letter different from `A`. Since `A != B`, our LCS doesn't include `A` or `B` as a last character. So we could try to throw away `A` or `B` character. If we throw `A`, our LCS length will be `dp[i - 1][j]` (since we have prefixes `S[1..i - 1]` and `T[1..j]`). If we try to throw `B` character, we will have prefixes `S[1..i]` and `T[1..j - 1]` so the length of LCS will be `dp[i][j - 1]`. As we are looking for the <b>Longest</b> common subsequence, we will pick <b>the maximum value</b> from `dp[i][j - 1]` and `dp[i - 1][j]`. |
| 14 | + |
| 15 | +But what if `S[1..i] = *A` and `T[1..j] = *A`? We could say that the LCS of our prefixes is LCS of prefixes `S[1..i - 1]` and `T[1..j - 1]` <b>plus</b> the letter `A`. So `dp[i][j] = dp[i - 1][j - 1] + 1` if `S[i] = T[j]`. |
| 16 | + |
| 17 | +We could see that we can fill our `dp` table row by row, column by column. So our algorithm will be like: |
| 18 | + |
| 19 | +- Let's say that we have strings `S` of the length N and `T` of the length M (numbered from 1). Let's create the table `dp` of size `(N + 1) x (M + 1)` numbered from 0. |
| 20 | +- Let's fill the 0th row and the 0th column of `dp` with 0. |
| 21 | +- Then, we follow the algorithm: |
| 22 | + ``` |
| 23 | + for i in range(1..N): |
| 24 | + for j in range(1..M): |
| 25 | + if(S[i] == T[j]) |
| 26 | + dp[i][j] = dp[i - 1][j - 1] + 1 |
| 27 | + else |
| 28 | + dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]) |
| 29 | + ``` |
| 30 | +
|
| 31 | +
|
| 32 | +#### Time Complexity |
| 33 | +
|
| 34 | +`O(N * M)` In any case |
| 35 | +
|
| 36 | +#### Space Complexity |
| 37 | +
|
| 38 | +`O(N * M)` - simple implementation |
| 39 | +`O(min {N, M})` - two-layers implementation (as `dp[i][j]` depends on only i-th and i-th layers, we coudld store only two layers). |
| 40 | +
|
| 41 | +#### Example |
| 42 | +
|
| 43 | +Let's say we have strings `ABCB` and `BBCB`. We will build such a table: |
| 44 | +``` |
| 45 | +# # A B C B |
| 46 | +# 0 0 0 0 0 |
| 47 | +B 0 ? ? ? ? |
| 48 | +B 0 ? ? ? ? |
| 49 | +C 0 ? ? ? ? |
| 50 | +B 0 ? ? ? ? |
| 51 | +``` |
| 52 | +Now we will start to fill our table from 1st row. Since `S[1] = A` and `T[1] = B`, the `dp[1][1]` will be tha maximal value of `dp[0][1] = 0` and `dp[1][0] = 0`. So `dp[1][1] = 0`. But now `S[2] = B = T[1]`, so `dp[1][2] = dp[0][1] + 1 = 1`. `dp[1][3]` is `1` since `A != C` and we pick `max{dp[1][2], dp[0][3]}`. And `dp[1][4] = dp[0][3] + 1 = 1`. |
| 53 | +``` |
| 54 | +# # A B C B |
| 55 | +# 0 0 0 0 0 |
| 56 | +B 0 0 1 1 1 |
| 57 | +B 0 ? ? ? ? |
| 58 | +C 0 ? ? ? ? |
| 59 | +B 0 ? ? ? ? |
| 60 | +``` |
| 61 | +Now let's fill the other part of the table: |
| 62 | +``` |
| 63 | +# # A B C B |
| 64 | +# 0 0 0 0 0 |
| 65 | +B 0 0 1 1 1 |
| 66 | +B 0 0 1 1 2 |
| 67 | +C 0 0 1 2 2 |
| 68 | +B 0 0 1 2 3 |
| 69 | +``` |
| 70 | +So the length of LCS is `dp[4][4] = 3`. |
| 71 | +
|
| 72 | +#### Code Implementation Links |
| 73 | +
|
| 74 | +- [Java](https://github.com/TheAlgorithms/Java/blob/master/Dynamic%20Programming/LongestCommonSubsequence.java) |
| 75 | +- [Python](https://github.com/TheAlgorithms/Python/blob/master/dynamic_programming/longest_common_subsequence.py) |
| 76 | +- [C++](https://github.com/TheAlgorithms/C-Plus-Plus/blob/master/Dynamic%20Programming/Longest%20Common%20Subsequence.cpp) |
| 77 | +
|
| 78 | +#### Video Explanation |
| 79 | +
|
| 80 | +[Video explanation by Tushar Roy](https://youtu.be/NnD96abizww) |
0 commit comments