Dynamic Programming - Longest Increasing Subsequence (LIS)

2 users say they have been asked this quesiton in interview.
Difficulty Level:

Problem Statement

Given a set of integers in random order, find out a sequence of elements which are in sorted in increasing order. The constraint here is that they are not strictly consecutive. Some examples would make things clear:

Input: {10, 22, 9, 33, 21, 50, 41, 60, 80, 5}   Output: 6 (10, 22, 33, 50, 60, 80)
Input: {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15}   Output: 6 (0, 4, 6, 9, 13, 15)

Building the Solution

As soon you digest the problem statement, you would be tempted to try out the most obvious solution - Start iterating from first element to last element and keep on picking the elements which are greater than the previously picked number. Let us apply it to Input #1 from above:

We pick 10, 22, and then skip 9 since it is lesser than last element picked 22, then pick 33, skip 21, pick 50, skip 41, pick 60, 80 and skip the last element 5. Works perfectly !!

Let us change the input sequence slightly - {15, 9, 10, 11, 12, 13, 14}. Now, if we pick 15 we can't pick any other since it is largest in this set. A shrewd mind will say - we will pick the smallest number first, and then build up the solution. Which isn't going to work either. So, what is that we are missing to get the correct answer! Well, the thing is - we can never be sure of the correctness of the solution until unless we have tried all the possible combinations.

So, how do we try out each possible combination? The idea is simple - pick an element and find the length of longest increasing subsequence for it. For Input #1 from above, for element - '21', we will check all the sequence {10, 22, 9, 33}, {10, 22, 9}, {10, 22} and {10} which it can be part of. And then find out the longest increasing subsequence out of these 4 subsequences (in this case it is - {10, 22, 9, 33} with length 3. However, we cannot simply include 21 in all of the sequences. We can only include it in subsequence #2 and #4. We can't include it in any other sequence since 21 is lesser than the last element in those subsequences. So, following is the set of probable solutions:

{10, 21}               => LIS is 2
{10, 22}               => LIS is 2
{10, 22, 9}           => LIS is 2
{10, 22, 9, 21}     => LIS is 3
{10, 22, 9, 33}     => LIS is 3

Which gives us the overall maximum(3) for Input - {10, 22, 9, 33, 21}. The recurrence relation can be written as:

F(i) = 1 + MAX(F(j)) where 0 < j < i and Inp[j] < Inp[i]i

Recursive Solution

#include <stdio.h>
#include <stdlib.h>

int lis(int *arr, int len, int *max);
int main()
{
int arr[] = {10, 22, 9, 33, 21, 50, 41, 60, 80, 5};
int len = sizeof(arr)/sizeof(arr[0]);
int max = 1;

(void)lis(arr, len-1, &max);
printf("The Longest Increasing Subsequence is: %d\n", max);
return 0;
}

int lis(int *arr, int len, int *max)
{
int i = 0;
int n = 1;

for (; i < len; i++) {
int temp = lis(arr, i, max);
if ((arr[len] > arr[i]) && (temp >= n)) {
n = temp + 1;
}
}

if (n > *max)
*max = n;

//printf("lis(%d) = %d\n", len, n);
return n;
}


Dynamic Programming Solution

Recursive solution looks correct, but is not efficient enough. In the worst case Time complexity will be exponential - O(2n). Can we do better than this ? If you observe closely, then you will realize that this problem exhibits both the properties of Dynamic Programming - optimal substructure and overlapping subproblem. Consider an input sequence with 5 elements and F(5) be the value of longest increasing subsequence ending at 5th element. If we go by above recursive routine, we will end up in a situation like this:

Dynamic Programming - Longest Increasing Subsequence | EduSagar


What we are doing here is that we are solving same subproblem multiple number of times. For example, we solve F(2) again and again while we evaluate F(5), F(4), F(3) as subproblems. To avoid resolving the same problem again and again, we use Dynamic Programming approach. We will use an array to store the results of each subproblem once it is evaluated and reuse it the next time instead of evaluating it again. This eats up O(n) space but Time complexity comes down to O(n2). Here is the program:


#include <stdio.h>
#include <stdlib.h>

void printoutput(int *arr, int *out, int curi);
int main()
{
int arr[] = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15};
int *res;
int *out;

int len = sizeof(arr)/sizeof(arr[0]);
int max = len > 0 ? 1 : 0;
int n = 1;
int maxi = 0;
int i,j;

res = (int*)malloc(sizeof(int) * len);
out = (int*)malloc(sizeof(int) * len);
for (i=0; i < len; i++) {
res[i] = 1;
out[i] = -1;
}

for (i = 1; i < len; i++) {
n = res[i];
for (j = 0; j < i; j++) {
if ((arr[i] > arr[j]) && (res[j] >= n)) {
n = res[j] + 1;
out[i] = j;
}
}
res[i] = n;
if (n > max) {
maxi = i;
max = n;
}
}

printf("The Longest Increasing Subsequence is: %d\n", max);
if (len)
printoutput(arr, out, maxi);
printf("\n");
return 0;
}

void printoutput(int *arr, int *out, int curi)
{
if (curi < 0)
return;

if (out[curi] != -1) {
printoutput(arr, out, out[curi]);
}
printf("%d ", arr[curi]);
}

The above program not only finds the length of the longest increasing subsequence, but also prints it out using a recursive routine. There is an O(nLogn) solution too, interested readers can check it out here and here.


comments powered by Disqus