From a65307b34add4dc4da0e9c91619799aca46f5750 Mon Sep 17 00:00:00 2001
From: Matvii Strechen
Date: Fri, 26 Oct 2018 11:46:26 +0300
Subject: [PATCH 1/4] Add Longest Common Subsequence algorithms

---
 .../Longest Common Subseqence.md | 80 +++++++++++++++++++
 1 file changed, 80 insertions(+)
 create mode 100644 Dynamic Programming/Longest Common Subseqence.md

diff --git a/Dynamic Programming/Longest Common Subseqence.md b/Dynamic Programming/Longest Common Subseqence.md
new file mode 100644
index 00000000..bcb28407
--- /dev/null
+++ b/Dynamic Programming/Longest Common Subseqence.md
@@ -0,0 +1,80 @@
+# Longest Common Subsequence
+
+#### Problem Statement
+
+Given two strings `S` and `T`, find the length of the longest common subsequence (LCS).
+
+#### Approach
+
+Let `dp[i][j]` be the length of the longest common subsequence of the prefixes `S[1..i]` and `T[1..j]`. Our answer (the length of the LCS) is `dp[|S|][|T|]`, since a prefix whose length equals the length of the string is the string itself.
+
+Both `dp[0][i]` and `dp[i][0]` are `0` for any `i`, since the LCS of an empty prefix and anything else is the empty string.
+
+Now let's try to calculate `dp[i][j]` for any `i`, `j`. Let's say `S[1..i] = *A` and `T[1..j] = *B` where `*` stands for any sequence of letters (which may differ between `S` and `T`), `A` stands for any letter and `B` stands for any letter different from `A`. Since `A != B`, the LCS cannot end with both of these last characters matched to each other, so at least one of them is unused and we can try throwing away either `A` or `B`. If we throw away `A`, the LCS length will be `dp[i - 1][j]` (since we are left with prefixes `S[1..i - 1]` and `T[1..j]`). If we throw away `B`, we are left with prefixes `S[1..i]` and `T[1..j - 1]`, so the LCS length will be `dp[i][j - 1]`. As we are looking for the longest common subsequence, we pick the maximum of `dp[i][j - 1]` and `dp[i - 1][j]`.
+
+But what if `S[1..i] = *A` and `T[1..j] = *A`? We could say that the LCS of our prefixes is the LCS of the prefixes `S[1..i - 1]` and `T[1..j - 1]` plus the letter `A`. So `dp[i][j] = dp[i - 1][j - 1] + 1` if `S[i] = T[j]`.
+
+We can see that the `dp` table can be filled row by row, going left to right within each row. So our algorithm will be like:
+
+- Let `S` have length `N` and `T` have length `M` (characters numbered from 1). Create the table `dp` of size `(N + 1) x (M + 1)`, indexed from 0.
+- Fill the 0th row and the 0th column of `dp` with 0.
+- Then, follow the algorithm:
+  ```
+  for i in range(1..N):
+      for j in range(1..M):
+          if(S[i] == T[j])
+              dp[i][j] = dp[i - 1][j - 1] + 1
+          else
+              dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
+  ```
+
+
+#### Time Complexity
+
+`O(N * M)` in every case
+
+#### Space Complexity
+
+`O(N * M)` - simple implementation
+`O(min {N, M})` - two-layer implementation (as `dp[i][j]` depends only on the i-th and (i - 1)-th layers, we could store only two layers).
+
+#### Example
+
+Let's say we have strings `S = ABCB` and `T = BBCB`. We will build the following table, with the characters of `S` labelling the columns and the characters of `T` labelling the rows (so in this example the first index of `dp` runs over `T`; the LCS is symmetric, so the answer is unaffected):
+```
+# # A B C B
+# 0 0 0 0 0
+B 0 ? ? ? ?
+B 0 ? ? ? ?
+C 0 ? ? ? ?
+B 0 ? ? ? ?
+```
+Now we will start to fill our table from the 1st row. Since `S[1] = A` and `T[1] = B`, `dp[1][1]` will be the maximum of `dp[0][1] = 0` and `dp[1][0] = 0`. So `dp[1][1] = 0`. But now `S[2] = B = T[1]`, so `dp[1][2] = dp[0][1] + 1 = 1`. `dp[1][3]` is `1` since `C != B` and we pick `max{dp[1][2], dp[0][3]}`. And `dp[1][4] = dp[0][3] + 1 = 1`.
+```
+# # A B C B
+# 0 0 0 0 0
+B 0 0 1 1 1
+B 0 ? ? ? ?
+C 0 ? ? ? ?
+B 0 ? ? ? ?
+```
+Now let's fill the other part of the table:
+```
+# # A B C B
+# 0 0 0 0 0
+B 0 0 1 1 1
+B 0 0 1 1 2
+C 0 0 1 2 2
+B 0 0 1 2 3
+```
+So the length of LCS is `dp[4][4] = 3`.
+
+#### Code Implementation Links
+
+- [Java](https://github.com/TheAlgorithms/Java/blob/master/Dynamic%20Programming/LongestCommonSubsequence.java)
+- [Python](https://raw.githubusercontent.com/TheAlgorithms/Python/master/dynamic_programming/longest_common_subsequence.py)
+- [C++](https://raw.githubusercontent.com/TheAlgorithms/C-Plus-Plus/master/Dynamic%20Programming/Longest%20Common%20Subsequence.cpp)
+
+#### Video Explanation
+
+[Video explanation by Tushar Roy](https://youtu.be/NnD96abizww)

From 923a1fb3788285b6b907799470507fe229ff6346 Mon Sep 17 00:00:00 2001
From: Matvii Strechen
Date: Fri, 26 Oct 2018 11:53:41 +0300
Subject: [PATCH 2/4] Delete last row

---
 Dynamic Programming/Longest Common Subseqence.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Dynamic Programming/Longest Common Subseqence.md b/Dynamic Programming/Longest Common Subseqence.md
index bcb28407..0eb1b120 100644
--- a/Dynamic Programming/Longest Common Subseqence.md
+++ b/Dynamic Programming/Longest Common Subseqence.md
@@ -77,4 +77,4 @@ So the length of LCS is `dp[4][4] = 3`.
 
 #### Video Explanation
 
-[Video explanation by Tushar Roy](https://youtu.be/NnD96abizww)
+[Video explanation by Tushar Roy](https://youtu.be/NnD96abizww)
\ No newline at end of file

From 4fb7ad65ad7977b82d7042d1b037899e2423b33e Mon Sep 17 00:00:00 2001
From: Ashwek Swamy <39827514+ashwek@users.noreply.github.com>
Date: Wed, 13 Feb 2019 20:14:20 +0530
Subject: [PATCH 3/4] Rename Longest Common Subseqence.md to Longest Common Subseqence.md

---
 ...ngest Common Subseqence.md => Longest Common Subseqence.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename Dynamic Programming/{Longest Common Subseqence.md => Longest Common Subseqence.md} (98%)

diff --git a/Dynamic Programming/Longest Common Subseqence.md b/Dynamic Programming/Longest Common Subseqence.md
similarity index 98%
rename from Dynamic Programming/Longest Common Subseqence.md
rename to Dynamic Programming/Longest Common Subseqence.md
index 0eb1b120..bcb28407 100644
--- a/Dynamic Programming/Longest Common Subseqence.md
+++ b/Dynamic Programming/Longest Common Subseqence.md
@@ -77,4 +77,4 @@ So the length of LCS is `dp[4][4] = 3`.
 
 #### Video Explanation
 
-[Video explanation by Tushar Roy](https://youtu.be/NnD96abizww)
\ No newline at end of file
+[Video explanation by Tushar Roy](https://youtu.be/NnD96abizww)

From ef39cc79528896ef2b0d4f8d3f41bff5aff45101 Mon Sep 17 00:00:00 2001
From: Ashwek Swamy <39827514+ashwek@users.noreply.github.com>
Date: Wed, 13 Feb 2019 20:16:58 +0530
Subject: [PATCH 4/4] Update Longest Common Subseqence.md

---
 Dynamic Programming/Longest Common Subseqence.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Dynamic Programming/Longest Common Subseqence.md b/Dynamic Programming/Longest Common Subseqence.md
index bcb28407..7dbc0627 100644
--- a/Dynamic Programming/Longest Common Subseqence.md
+++ b/Dynamic Programming/Longest Common Subseqence.md
@@ -72,8 +72,8 @@ So the length of LCS is `dp[4][4] = 3`.
 #### Code Implementation Links
 
 - [Java](https://github.com/TheAlgorithms/Java/blob/master/Dynamic%20Programming/LongestCommonSubsequence.java)
-- [Python](https://raw.githubusercontent.com/TheAlgorithms/Python/master/dynamic_programming/longest_common_subsequence.py)
-- [C++](https://raw.githubusercontent.com/TheAlgorithms/C-Plus-Plus/master/Dynamic%20Programming/Longest%20Common%20Subsequence.cpp)
+- [Python](https://github.com/TheAlgorithms/Python/blob/master/dynamic_programming/longest_common_subsequence.py)
+- [C++](https://github.com/TheAlgorithms/C-Plus-Plus/blob/master/Dynamic%20Programming/Longest%20Common%20Subsequence.cpp)
 
 #### Video Explanation
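
Review note on the series, not part of any patch: the recurrence described in the Approach section of the added file, and the two-layer variant mentioned under Space Complexity, could be sketched in Python roughly as follows. This is a minimal sketch and not the code behind the linked implementations. The function names `lcs_length` and `lcs_length_two_rows` are hypothetical, and because Python strings are 0-indexed, `s[i - 1]` plays the role of `S[i]` from the text.

```python
def lcs_length(s: str, t: str) -> int:
    """Length of the LCS of s and t using the full (N + 1) x (M + 1) table."""
    n, m = len(s), len(t)
    # Row 0 and column 0 stay 0: the LCS with an empty prefix is empty.
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s[i - 1] == t[j - 1]:
                # Last characters match: extend the LCS of the shorter prefixes.
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # Last characters differ: drop one of them and keep the better result.
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]


def lcs_length_two_rows(s: str, t: str) -> int:
    """Same recurrence, keeping only the previous and the current row."""
    if len(t) > len(s):
        s, t = t, s  # make t the shorter string so each row has min(N, M) + 1 cells
    prev = [0] * (len(t) + 1)
    for ch in s:
        curr = [0] * (len(t) + 1)
        for j, other in enumerate(t, start=1):
            if ch == other:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[len(t)]


print(lcs_length("ABCB", "BBCB"))           # 3, as in the worked example
print(lcs_length_two_rows("ABCB", "BBCB"))  # 3
```

Swapping the arguments so that the shorter string indexes the columns is what brings the extra memory down to `O(min {N, M})`; the result is unchanged because the LCS of two strings does not depend on their order.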