|
| 1 | +<!--?title Longest increasing subsequence --> |
| 2 | +# Longest increasing subsequence |
| 3 | + |
| 4 | +We are given an array with $n$ numbers: $a[0 \dots n-1]$. |
| 5 | +The task is to find the longest, strictly increasing, subsequence in $a$. |
| 6 | + |
| 7 | +Formally we look for the longest sequence of indices $i_1, \dots i_k$ such that |
| 8 | +$$i_1 < i_2 < \dots < i_k,\\\\ |
| 9 | +a[i_1] < a[i_2] < \dots < a[i_k]$$ |
| 10 | + |
| 11 | +In this article we discuss multiple algorithms for solving this task. |
| 12 | +Also we will discuss some other problems, that can be reduced to this problem. |
| 13 | + |
| 14 | +## Solution in $O(n^2)$ with dynamic programming |
| 15 | + |
| 16 | +Dynamic programming is a very general technique that allows to solve a huge class of problems. |
| 17 | +Here we apply the technique for our specific task. |
| 18 | + |
| 19 | +First we will search only for the **length** of the longest increasing subsequence, and only later learn how to restore the subsequence itself. |
| 20 | + |
| 21 | +### Finding the length |
| 22 | + |
| 23 | +To accomplish this task, we define an array $d[0 \dots n-1]$, where $d[i]$ is the length of the longest increasing subsequence that ends in the element at index $i$. |
| 24 | +We will compute this array gradually: first $d[0]$, then $d[1]$, and so on. |
| 25 | +After this array is computed, the answer to the problem will be the maximum value in the array $d[]$. |
| 26 | + |
| 27 | +So let the current index be $i$. |
| 28 | +I.e. we want to compute the value $d[i]$ and all previous values $d[0], \dots, d[i-1]$ are already known. |
| 29 | +Then there are two options: |
| 30 | + |
| 31 | +- $d[i] = 1$: the required subsequence consists of only the element $a[i]$. |
| 32 | +- $d[i] > 1$: then in the required subsequence is another number before the number $a[i]$. |
| 33 | + Let's focus on that number: |
| 34 | + it can be any element $a[j]$ with $j = 0 \dots i-1$ and $a[j] < a[i]$. |
| 35 | + In this fashion we can compute $d[i]$ using the following formula: |
| 36 | + If we fixate the index $j$, than the longest increasing subsequence ending in the two elements $a[j]$ and $a[i]$ has the length $d[j] + 1$. |
| 37 | + All of these values $d[j]$ are already known, so we can directly compute $d[i]$ with: |
| 38 | + $$d[i] = \max_{\substack{j = 0 \dots i-1 \\\\ a[j] < a[i]}} \left(d[j] + 1\right)$$ |
| 39 | + |
| 40 | +If we combine these two cases we get the final answer for $d[i]$: |
| 41 | + |
| 42 | +$$d[i] = \max\left(1, \max_{\substack{j = 0 \dots i-1 \\\\ a[j] < a[i]}} \left(d[j] + 1\right)\right)$$ |
| 43 | + |
| 44 | +### Implementation |
| 45 | + |
| 46 | +Here is an implementation of the algorithm described above, which computes the length of the longest increasing subsequence. |
| 47 | + |
| 48 | +```cpp lis_n2 |
| 49 | +int lis(vector<int> const& a) { |
| 50 | + int n = a.size(); |
| 51 | + vector<int> d(n, 1); |
| 52 | + for (int i = 0; i < n; i++) { |
| 53 | + for (int j = 0; j < i; j++) { |
| 54 | + if (a[j] < a[i]) |
| 55 | + d[i] = max(d[i], d[j] + 1); |
| 56 | + } |
| 57 | + } |
| 58 | + |
| 59 | + int ans = d[0]; |
| 60 | + for (int i = 1; i < n; i++) { |
| 61 | + ans = max(ans, d[i]); |
| 62 | + } |
| 63 | + return ans; |
| 64 | +} |
| 65 | +``` |
| 66 | +
|
| 67 | +### Restoring the subsequence |
| 68 | +
|
| 69 | +So far we only learned how to find the length of the subsequence, but not how to find the subsequence itself. |
| 70 | +
|
| 71 | +To be able to restore the subsequence we generate an additional auxiliary array $p[0 \dots n-1]$ that we will compute alongside the array $d[]$. |
| 72 | +$p[i]$ will be the index $j$ of the second last element in the longest increasing subsequence ending in $i$. |
| 73 | +In other words the index $p[i]$ is the same index $j$ at which the highest value $d[i]$ was obtained. |
| 74 | +This auxiliary array $p[]$ points in some sense to the ancestors. |
| 75 | +
|
| 76 | +Then to derive the subsequence, we just start at the index $i$ with the maximal $d[i]$, and follow the ancestors until we deduced the entire subsequence, i.e. until we reach the element with $d[i] = 1$. |
| 77 | +
|
| 78 | +### Implementation of restoring |
| 79 | +
|
| 80 | +We will change the code from the previous sections a little bit. |
| 81 | +We will compute the array $p[]$ alongside $d[]$, and afterwards compute the subsequence. |
| 82 | +
|
| 83 | +For convenience we originally assign the ancestors with $p[i] = -1$. |
| 84 | +For elements with $d[i] = 1$, the ancestors value will remain $-1$, which will be slightly more convenient for restoring the subsequence. |
| 85 | +
|
| 86 | +```cpp lis_n2_restore |
| 87 | +vector<int> lis(vector<int> const& a) { |
| 88 | + int n = a.size(); |
| 89 | + vector<int> d(n, 1), p(n, -1); |
| 90 | + for (int i = 0; i < n; i++) { |
| 91 | + for (int j = 0; j < i; j++) { |
| 92 | + if (a[j] < a[i] && d[i] < d[j] + 1) { |
| 93 | + d[i] = d[j] + 1; |
| 94 | + p[i] = j; |
| 95 | + } |
| 96 | + } |
| 97 | + } |
| 98 | +
|
| 99 | + int ans = d[0], pos = 0; |
| 100 | + for (int i = 1; i < n; i++) { |
| 101 | + if (d[i] > ans) { |
| 102 | + ans = d[i]; |
| 103 | + pos = i; |
| 104 | + } |
| 105 | + } |
| 106 | +
|
| 107 | + vector<int> subseq; |
| 108 | + while (pos != -1) { |
| 109 | + subseq.push_back(a[pos]); |
| 110 | + pos = p[pos]; |
| 111 | + } |
| 112 | + reverse(subseq.begin(), subseq.end()); |
| 113 | + return subseq; |
| 114 | +} |
| 115 | +``` |
| 116 | + |
| 117 | +### Alternative way of restoring the subsequence |
| 118 | + |
| 119 | +It is also possible to restore the subsequence without the auxiliary array $p[]$. |
| 120 | +We can simply recalculate the current value of $d[i]$ and also see how the maximum was reached. |
| 121 | + |
| 122 | +This method leads to a slightly longer code, but in return we save some memory. |
| 123 | + |
| 124 | +## Solution in $O(n \log n)$ with dynamic programming and binary search |
| 125 | + |
| 126 | +In order to obtain a faster solution for the problem, we construct a different dynamic programming solution that runs in $O(n^2)$, and then later improve it to $O(n \log n)$. |
| 127 | + |
| 128 | +We will use the dynamic programming array $d[0 \dots n]$. |
| 129 | +This time $d[i]$ will be the element at which a subsequence of length $i$ terminates. |
| 130 | +If there are multiple such sequences, then we take the one that ends in the smallest element. |
| 131 | + |
| 132 | +Initially we assume $d[0] = -\infty$ and for all other elements $d[i] = \infty$. |
| 133 | + |
| 134 | +We will again gradually process the numbers, first $a[0]$, then $a[1]$, etc, and in each step maintain the array $d[]$ so that it is up to date. |
| 135 | + |
| 136 | +After processing all the elements of $a[]$ the length of the desired subsequence is the largest $l$ with $d[l] < \infty$. |
| 137 | + |
| 138 | +```cpp lis_method2_n2 |
| 139 | +int lis(vector<int> const& a) { |
| 140 | + int n = a.size(); |
| 141 | + const int INF = 1e9; |
| 142 | + vector<int> d(n+1, INF); |
| 143 | + d[0] = -INF; |
| 144 | + |
| 145 | + for (int i = 0; i < n; i++) { |
| 146 | + for (int j = 1; j <= n; j++) { |
| 147 | + if (d[j-1] < a[i] && a[i] < d[j]) |
| 148 | + d[j] = a[i]; |
| 149 | + } |
| 150 | + } |
| 151 | + |
| 152 | + int ans = 0; |
| 153 | + for (int i = 0; i <= n; i++) { |
| 154 | + if (d[i] < INF) |
| 155 | + ans = i; |
| 156 | + } |
| 157 | + return ans; |
| 158 | +} |
| 159 | +``` |
| 160 | +
|
| 161 | +We now make two important observations. |
| 162 | +
|
| 163 | +The array $d$ will always be sorted: |
| 164 | +$d[i-1] \le d[i]$ for all $i = 1 \dots n$. |
| 165 | +And also the element $a[i]$ will only update at most one value $d[j]$. |
| 166 | +
|
| 167 | +Thus we can find this element in the array $d[]$ using binary search in $O(\log n)$. |
| 168 | +In fact we are simply looking in the array $d[]$ for the first number that is strictly greater than $a[i]$, and we try to update this element in the same way as the above implementation. |
| 169 | +
|
| 170 | +### Implementation |
| 171 | +
|
| 172 | +```cpp lis_method2_nlogn |
| 173 | +int lis(vector<int> const& a) { |
| 174 | + int n = a.size(); |
| 175 | + const int INF = 1e9; |
| 176 | + vector<int> d(n+1, INF); |
| 177 | + d[0] = -INF; |
| 178 | +
|
| 179 | + for (int i = 0; i < n; i++) { |
| 180 | + int j = upper_bound(d.begin(), d.end(), a[i]) - d.begin(); |
| 181 | + if (d[j-1] < a[i] && a[i] < d[j]) |
| 182 | + d[j] = a[i]; |
| 183 | + } |
| 184 | +
|
| 185 | + int ans = 0; |
| 186 | + for (int i = 0; i <= n; i++) { |
| 187 | + if (d[i] < INF) |
| 188 | + ans = i; |
| 189 | + } |
| 190 | + return ans; |
| 191 | +} |
| 192 | +``` |
| 193 | + |
| 194 | +### Restoring the subsequence |
| 195 | + |
| 196 | +It is also possible to restore the subsequence using this approach. |
| 197 | +This time we have to maintain two auxiliary arrays. |
| 198 | +One that tells us the index of the elements in $d[]$. |
| 199 | +And again we have to create an array of "ancestors" $p[i]$. |
| 200 | +$p[i]$ will be the index of the previous element for the optimal subsequence ending in element $i$. |
| 201 | + |
| 202 | +It's easy to maintain these two arrays in the course of iteration over the array $a[]$ alongside the computations of $d[]$. |
| 203 | +And at the end it in not difficult to restore the desired subsequence using these arrays. |
| 204 | + |
| 205 | +## Solution in $O(n \log n)$ with data structures |
| 206 | + |
| 207 | +Instead of the above method for computing the longest increasing subsequence in $O(n \log n)$ we can also solve the problem in a different way: using some simple data structures. |
| 208 | + |
| 209 | +Let's go back to the first method. |
| 210 | +Remember that $d[i]$ is the value $d[j] + 1$ with $j < i$ and $a[j] < a[i]$. |
| 211 | + |
| 212 | +Thus if we define an additional array $t[]$ such that |
| 213 | +$$t[a[i]] = d[i],$$ |
| 214 | +then the problem of computing the value $d[i]$ is equivalent to finding the **maximum value in a prefix** of the array $t[]$: |
| 215 | +$$d[i] = \max\left(t[0 \dots a[i] - 1] + 1\right)$$ |
| 216 | + |
| 217 | +The problem of finding the maximum of a prefix of an array (which changes) is a standard problem that can be solved by many different data structures. |
| 218 | +For instance we can use a [Segment tree](./data_structures/segment_tree.html) or a [Fenwick tree](./data_structures/fenwick.html). |
| 219 | + |
| 220 | +This method has obviously some **shortcomings**: |
| 221 | +in terms of length and complexity of the implementation this approach will be worse than the method using binary search. |
| 222 | +In addition if the input numbers $a[i]$ are especially large, then we would have to use some tricks, like compressing the numbers (i.e. renumber them from $0$ to $n-1$), or use an implicit Segment tree (only generate the branches of the tree that are important). |
| 223 | +Otherwise the memory consumption will be too high. |
| 224 | + |
| 225 | +On the other hand this method has also some **advantages**: |
| 226 | +with this method you don't have to think about any tricky properties in the dynamic programming solution. |
| 227 | +And this approach allows us to generalize the problem very easily (see below). |
| 228 | + |
| 229 | +## Related tasks |
| 230 | + |
| 231 | +Here are several problems that are closely related to the problem of finding the longest increasing subsequence. |
| 232 | + |
| 233 | +### Longest non-decreasing subsequence |
| 234 | + |
| 235 | +This is in fact nearly the same problem. |
| 236 | +Only now it is allowed to use identical numbers in the subsequence. |
| 237 | + |
| 238 | +The solution is essentially also nearly the same. |
| 239 | +We just have to change the inequality signs, and make a slightly modification to the binary search. |
| 240 | + |
| 241 | +### Number of longest increasing subsequences |
| 242 | + |
| 243 | +We can use the first discussed method, either the $O(n^2)$ version or the version using data structures. |
| 244 | +We only have to additionally store in how many ways we can obtain longest increasing subsequences ending in the values $d[i]$. |
| 245 | + |
| 246 | +The number of ways to form a longest increasing subsequences ending in $a[i]$ is the sum of all ways for all longest increasing subsequences ending in $j$ where $d[j]$ is maximal. |
| 247 | +There can be multiple such $j$, so we need to sum all of them. |
| 248 | + |
| 249 | +Using a Segment tree this approach can also be implemented in $O(n \log n)$. |
| 250 | + |
| 251 | +It is not possible to use the binary search approach for this task. |
| 252 | + |
| 253 | +### Smallest number of non-increasing subsequences covering a sequence |
| 254 | + |
| 255 | +For a given array with $n$ numbers $a[0 \dots n - 1]$ we have to colorize the numbers in the smallest number of colors, so that each color forms a non-increasing subsequence. |
| 256 | + |
| 257 | +To solve this, we notice that the minimum number of required colors is equal to the length of the longest increasing subsequence. |
| 258 | + |
| 259 | +**Proof**: |
| 260 | +We need to prove the **duality** of these two problems. |
| 261 | + |
| 262 | +Let's denote by $x$ the length of the longest increasing subsequence and by $y$ the least number of non-increasing subsequences that form a cover. |
| 263 | +We need to prove that $x = y$. |
| 264 | + |
| 265 | +It is clear that $y < x$ is not possible, because if we have $x$ strictly increasing elements, than no two can be part of the same non-increasing subsequence. |
| 266 | +Therefore we have $y \ge x$. |
| 267 | + |
| 268 | +We now show that $y > x$ is not possible by contradiction. |
| 269 | +Suppose that $y > x$. |
| 270 | +Then we consider any optimal set of $y$ non-increasing subsequences. |
| 271 | +We transform this in set in the following way: |
| 272 | +as long as there are two such subsequences such that the first begins before the second subsequence, and the first sequence start with a number greater than or equal to the second, then we unhook this starting number and attach it to the beginning of second. |
| 273 | +After a finite number of steps we have $y$ subsequences, and their starting numbers will form an increasing subsequence of length $y$. |
| 274 | +Since we assumed that $y > x$ we reached a contradiction. |
| 275 | + |
| 276 | +Thus it follows that $y = x$. |
| 277 | + |
| 278 | +**Restoring the sequences**: |
| 279 | +The desired partition of the sequence into subsequences can be done greedily. |
| 280 | +I.e. go from left to right and assign the current number or that subsequence ending with the minimal number which is greater than or equal to the current one. |
| 281 | + |
| 282 | +## Practice Problems |
| 283 | + |
| 284 | +- [Topcoder - IntegerSequence](https://community.topcoder.com/stat?c=problem_statement&pm=5922&rd=8075) |
| 285 | +- [Topcoder - AutoMarket](https://community.topcoder.com/stat?c=problem_statement&pm=3937&rd=6532) |
| 286 | +- [Codeforces - Tourist](http://codeforces.com/contest/76/problem/F) |
| 287 | +- [Codeforces - LCIS](http://codeforces.com/problemset/problem/10/D) |
| 288 | +- [SPOJ - SUPPER](http://www.spoj.com/problems/SUPPER/) |
| 289 | +- [Topcoder - BridgeArrangement](https://community.topcoder.com/stat?c=problem_statement&pm=2967&rd=5881) |
| 290 | +- [ACMSGURU - "North-East"](http://codeforces.com/problemsets/acmsguru/problem/99999/521) |
0 commit comments