8000 Translate "LIS" article · cp-algorithms/cp-algorithms@6228359 · GitHub
[go: up one dir, main page]

Skip to content

Commit 6228359

Browse files
committed
Translate "LIS" article
1 parent 7f5483c commit 6228359

File tree

3 files changed

+325
-0
lines changed

3 files changed

+325
-0
lines changed

src/index.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -133,6 +133,7 @@ especially popular in field of competitive programming.*
133133

134134
### Sequences
135135
- [RMQ task (Range Minimum Query - the smallest element in an interval)](./sequences/rmq.html)
136+
- [Longest increasing subsequence](./sequences/longest_increasing_subsequence.html)
136137
- [K-th order statistic in O(N)](./sequences/k-th.html)
137138

138139
### Schedules
Lines changed: 290 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,290 @@
1+
<!--?title Longest increasing subsequence -->
2+
# Longest increasing subsequence
3+
4+
We are given an array with $n$ numbers: $a[0 \dots n-1]$.
5+
The task is to find the longest, strictly increasing, subsequence in $a$.
6+
7+
Formally we look for the longest sequence of indices $i_1, \dots i_k$ such that
8+
$$i_1 < i_2 < \dots < i_k,\\\\
9+
a[i_1] < a[i_2] < \dots < a[i_k]$$
10+
11+
In this article we discuss multiple algorithms for solving this task.
12+
Also we will discuss some other problems, that can be reduced to this problem.
13+
14+
## Solution in $O(n^2)$ with dynamic programming
15+
16+
Dynamic programming is a very general technique that allows to solve a huge class of problems.
17+
Here we apply the technique for our specific task.
18+
19+
First we will search only for the **length** of the longest increasing subsequence, and only later learn how to restore the subsequence itself.
20+
21+
### Finding the length
22+
23+
To accomplish this task, we define an array $d[0 \dots n-1]$, where $d[i]$ is the length of the longest increasing subsequence that ends in the element at index $i$.
24+
We will compute this array gradually: first $d[0]$, then $d[1]$, and so on.
25+
After this array is computed, the answer to the problem will be the maximum value in the array $d[]$.
26+
27+
So let the current index be $i$.
28+
I.e. we want to compute the value $d[i]$ and all previous values $d[0], \dots, d[i-1]$ are already known.
29+
Then there are two options:
30+
31+
- $d[i] = 1$: the required subsequence consists of only the element $a[i]$.
32+
- $d[i] > 1$: then in the required subsequence is another number before the number $a[i]$.
33+
Let's focus on that number:
34+
it can be any element $a[j]$ with $j = 0 \dots i-1$ and $a[j] < a[i]$.
35+
In this fashion we can compute $d[i]$ using the following formula:
36+
If we fixate the index $j$, than the longest increasing subsequence ending in the two elements $a[j]$ and $a[i]$ has the length $d[j] + 1$.
37+
All of these values $d[j]$ are already known, so we can directly compute $d[i]$ with:
38+
$$d[i] = \max_{\substack{j = 0 \dots i-1 \\\\ a[j] < a[i]}} \left(d[j] + 1\right)$$
39+
40+
If we combine these two cases we get the final answer for $d[i]$:
41+
42+
$$d[i] = \max\left(1, \max_{\substack{j = 0 \dots i-1 \\\\ a[j] < a[i]}} \left(d[j] + 1\right)\right)$$
43+
44+
### Implementation
45+
46+
Here is an implementation of the algorithm described above, which computes the length of the longest increasing subsequence.
47+
48+
```cpp lis_n2
49+
int lis(vector<int> const& a) {
50+
int n = a.size();
51+
vector<int> d(n, 1);
52+
for (int i = 0; i < n; i++) {
53+
for (int j = 0; j < i; j++) {
54+
if (a[j] < a[i])
55+
d[i] = max(d[i], d[j] + 1);
56+
}
57+
}
58+
59+
int ans = d[0];
60+
for (int i = 1; i < n; i++) {
61+
ans = max(ans, d[i]);
62+
}
63+
return ans;
64+
}
65+
```
66+
67+
### Restoring the subsequence
68+
69+
So far we only learned how to find the length of the subsequence, but not how to find the subsequence itself.
70+
71+
To be able to restore the subsequence we generate an additional auxiliary array $p[0 \dots n-1]$ that we will compute alongside the array $d[]$.
72+
$p[i]$ will be the index $j$ of the second last element in the longest increasing subsequence ending in $i$.
73+
In other words the index $p[i]$ is the same index $j$ at which the highest value $d[i]$ was obtained.
74+
This auxiliary array $p[]$ points in some sense to the ancestors.
75+
76+
Then to derive the subsequence, we just start at the index $i$ with the maximal $d[i]$, and follow the ancestors until we deduced the entire subsequence, i.e. until we reach the element with $d[i] = 1$.
77+
78+
### Implementation of restoring
79+
80+
We will change the code from the previous sections a little bit.
81+
We will compute the array $p[]$ alongside $d[]$, and afterwards compute the subsequence.
82+
83+
For convenience we originally assign the ancestors with $p[i] = -1$.
84+
For elements with $d[i] = 1$, the ancestors value will remain $-1$, which will be slightly more convenient for restoring the subsequence.
85+
86+
```cpp lis_n2_restore
87+
vector<int> lis(vector<int> const& a) {
88+
int n = a.size();
89+
vector<int> d(n, 1), p(n, -1);
90+
for (int i = 0; i < n; i++) {
91+
for (int j = 0; j < i; j++) {
92+
if (a[j] < a[i] && d[i] < d[j] + 1) {
93+
d[i] = d[j] + 1;
94+
p[i] = j;
95+
}
96+
}
97+
}
98+
99+
int ans = d[0], pos = 0;
100+
for (int i = 1; i < n; i++) {
101+
if (d[i] > ans) {
102+
ans = d[i];
103+
pos = i;
104+
}
105+
}
106+
107+
vector<int> subseq;
108+
while (pos != -1) {
109+
subseq.push_back(a[pos]);
110+
pos = p[pos];
111+
}
112+
reverse(subseq.begin(), subseq.end());
113+
return subseq;
114+
}
115+
```
116+
117+
### Alternative way of restoring the subsequence
118+
119+
It is also possible to restore the subsequence without the auxiliary array $p[]$.
120+
We can simply recalculate the current value of $d[i]$ and also see how the maximum was reached.
121+
122+
This method leads to a slightly longer code, but in return we save some memory.
123+
124+
## Solution in $O(n \log n)$ with dynamic programming and binary search
125+
126+
In order to obtain a faster solution for the problem, we construct a different dynamic programming solution that runs in $O(n^2)$, and then later improve it to $O(n \log n)$.
127+
128+
We will use the dynamic programming array $d[0 \dots n]$.
129+
This time $d[i]$ will be the element at which a subsequence of length $i$ terminates.
130+
If there are multiple such sequences, then we take the one that ends in the smallest element.
131+
132+
Initially we assume $d[0] = -\infty$ and for all other elements $d[i] = \infty$.
133+
134+
We will again gradually process the numbers, first $a[0]$, then $a[1]$, etc, and in each step maintain the array $d[]$ so that it is up to date.
135+
136+
After processing all the elements of $a[]$ the length of the desired subsequence is the largest $l$ with $d[l] < \infty$.
137+
138+
```cpp lis_method2_n2
139+
int lis(vector<int> const& a) {
140+
int n = a.size();
141+
const int INF = 1e9;
142+
vector<int> d(n+1, INF);
143+
d[0] = -INF;
144+
145+
for (int i = 0; i < n; i++) {
146+
for (int j = 1; j <= n; j++) {
147+
if (d[j-1] < a[i] && a[i] < d[j])
148+
d[j] = a[i];
149+
}
150+
}
151+
152+
int ans = 0;
153+
for (int i = 0; i <= n; i++) {
154+
if (d[i] < INF)
155+
ans = i;
156+
}
157+
return ans;
158+
}
159+
```
160+
161+
We now make two important observations.
162+
163+
The array $d$ will always be sorted:
164+
$d[i-1] \le d[i]$ for all $i = 1 \dots n$.
165+
And also the element $a[i]$ will only update at most one value $d[j]$.
166+
167+
Thus we can find this element in the array $d[]$ using binary search in $O(\log n)$.
168+
In fact we are simply looking in the array $d[]$ for the first number that is strictly greater than $a[i]$, and we try to update this element in the same way as the above implementation.
169+
170+
### Implementation
171+
172+
```cpp lis_method2_nlogn
173+
int lis(vector<int> const& a) {
174+
int n = a.size();
175+
const int INF = 1e9;
176+
vector<int> d(n+1, INF);
177+
d[0] = -INF;
178+
179+
for (int i = 0; i < n; i++) {
180+
int j = upper_bound(d.begin(), d.end(), a[i]) - d.begin();
181+
if (d[j-1] < a[i] && a[i] < d[j])
182+
d[j] = a[i];
183+
}
184+
185+
int ans = 0;
186+
for (int i = 0; i <= n; i++) {
187+
if (d[i] < INF)
188+
ans = i;
189+
}
190+
return ans;
191+
}
192+
```
193+
194+
### Restoring the subsequence
195+
196+
It is also possible to restore the subsequence using this approach.
197+
This time we have to maintain two auxiliary arrays.
198+
One that tells us the index of the elements in $d[]$.
199+
And again we have to create an array of "ancestors" $p[i]$.
200+
$p[i]$ will be the index of the previous element for the optimal subsequence ending in element $i$.
201+
202+
It's easy to maintain these two arrays in the course of iteration over the array $a[]$ alongside the computations of $d[]$.
203+
And at the end it in not difficult to restore the desired subsequence using these arrays.
204+
205+
## Solution in $O(n \log n)$ with data structures
206+
207+
Instead of the above method for computing the longest increasing subsequence in $O(n \log n)$ we can also solve the problem in a different way: using some simple data structures.
208+
209+
Let's go back to the first method.
210+
Remember that $d[i]$ is the value $d[j] + 1$ with $j < i$ and $a[j] < a[i]$.
211+
212+
Thus if we define an additional array $t[]$ such that
213+
$$t[a[i]] = d[i],$$
214+
then the problem of computing the value $d[i]$ is equivalent to finding the **maximum value in a prefix** of the array $t[]$:
215+
$$d[i] = \max\left(t[0 \dots a[i] - 1] + 1\right)$$
216+
217+
The problem of finding the maximum of a prefix of an array (which changes) is a standard problem that can be solved by many different data structures.
218+
For instance we can use a [Segment tree](./data_structures/segment_tree.html) or a [Fenwick tree](./data_structures/fenwick.html).
219+
220+
This method has obviously some **shortcomings**:
221+
in terms of length and complexity of the implementation this approach will be worse than the method using binary search.
222+
In addition if the input numbers $a[i]$ are especially large, then we would have to use some tricks, like compressing the numbers (i.e. renumber them from $0$ to $n-1$), or use an implicit Segment tree (only generate the branches of the tree that are important).
223+
Otherwise the memory consumption will be too high.
224+
225+
On the other hand this method has also some **advantages**:
226+
with this method you don't have to think about any tricky properties in the dynamic programming solution.
227+
And this approach allows us to generalize the problem very easily (see below).
228+
229+
## Related tasks
230+
231+
Here are several problems that are closely related to the problem of finding the longest increasing subsequence.
232+
233+
### Longest non-decreasing subsequence
234+
235+
This is in fact nearly the same problem.
236+
Only now it is allowed to use identical numbers in the subsequence.
237+
238+
The solution is essentially also nearly the same.
239+
We just have to change the inequality signs, and make a slightly modification to the binary search.
240+
241+
### Number of longest increasing subsequences
242+
243+
We can use the first discussed method, either the $O(n^2)$ version or the version using data structures.
244+
We only have to additionally store in how many ways we can obtain longest increasing subsequences ending in the values $d[i]$.
245+
246+
The number of ways to form a longest increasing subsequences ending in $a[i]$ is the sum of all ways for all longest increasing subsequences ending in $j$ where $d[j]$ is maximal.
247+
There can be multiple such $j$, so we need to sum all of them.
248+
249+
Using a Segment tree this approach can also be implemented in $O(n \log n)$.
250+
251+
It is not possible to use the binary search approach for this task.
252+
253+
### Smallest number of non-increasing subsequences covering a sequence
254+
255+
For a given array with $n$ numbers $a[0 \dots n - 1]$ we have to colorize the numbers in the smallest number of colors, so that each color forms a non-increasing subsequence.
256+
257+
To solve this, we notice that the minimum number of required colors is equal to the length of the longest increasing subsequence.
258+
259+
**Proof**:
260+
We need to prove the **duality** of these two problems.
261+
262+
Let's denote by $x$ the length of the longest increasing subsequence and by $y$ the least number of non-increasing subsequences that form a cover.
263+
We need to prove that $x = y$.
264+
265+
It is clear that $y < x$ is not possible, because if we have $x$ strictly increasing elements, than no two can be part of the same non-increasing subsequence.
266+
Therefore we have $y \ge x$.
267+
268+
We now show that $y > x$ is not possible by contradiction.
269+
Suppose that $y > x$.
270+
Then we consider any optimal set of $y$ non-increasing subsequences.
271+
We transform this in set in the following way:
272+
as long as there are two such subsequences such that the first begins before the second subsequence, and the first sequence start with a number greater than or equal to the second, then we unhook this starting number and attach it to the beginning of second.
273+
After a finite number of steps we have $y$ subsequences, and their starting numbers will form an increasing subsequence of length $y$.
274+
Since we assumed that $y > x$ we reached a contradiction.
275+
276+
Thus it follows that $y = x$.
277+
278+
**Restoring the sequences**:
279+
The desired partition of the sequence into subsequences can be done greedily.
280+
I.e. go from left to right and assign the current number or that subsequence ending with the minimal number which is greater than or equal to the current one.
281+
282+
## Practice Problems
283+
284+
- [Topcoder - IntegerSequence](https://community.topcoder.com/stat?c=problem_statement&pm=5922&rd=8075)
285+
- [Topcoder - AutoMarket](https://community.topcoder.com/stat?c=problem_statement&pm=3937&rd=6532)
286+
- [Codeforces - Tourist](http://codeforces.com/contest/76/problem/F)
287+
- [Codeforces - LCIS](http://codeforces.com/problemset/problem/10/D)
288+
- [SPOJ - SUPPER](http://www.spoj.com/problems/SUPPER/)
289+
- [Topcoder - BridgeArrangement](https://community.topcoder.com/stat?c=problem_statement&pm=2967&rd=5881)
290+
- [ACMSGURU - "North-East"](http://codeforces.com/problemsets/acmsguru/problem/99999/521)

test/test_lis.cpp

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
#include <cassert>
2+
#include <vector>
3+
#include <algorithm>
4+
5+
using namespace std;
6+
7+
namespace Method1 {
8+
#include "lis_n2.h"
9+
}
10+
11+
namespace Method1_Restore {
12+
#include "lis_n2_restore.h"
13+
}
14+
15+
namespace Method2 {
16+
#include "lis_method2_n2.h"
17+
}
18+
19+
namespace Method2_nlogn {
20+
#include "lis_method2_nlogn.h"
21+
}
22+
23+
int main() {
24+
vector<int> a = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15};
25+
26+
assert(Method1::lis(a) == 6);
27+
28+
vector<int> subseq = {0, 4, 6, 9, 13, 15};
29+
assert(Method1_Restore::lis(a) == subseq);
30+
31+
assert(Method2::lis(a) == 6);
32+
33+
assert(Method2_nlogn::lis(a) == 6);
34+
}

0 commit comments

Comments
 (0)
0