The Longest Increasing Subsequence (LIS) problem is a classic interview question frequently asked by top companies like Google, Amazon, and Meta. It tests dynamic programming and algorithm optimization skills.
Problem Statement
Given an integer array, nums, the task is to find the length of the longest strictly increasing subsequence in the array.
You only need to return the length of such a subsequence, not the subsequence itself.
Constraints:
- $1 \leq \text{nums.length} \leq 2500$
- $-10^4 \leq \text{nums}[i] \leq 10^4$
Examples
| Input | Output | Explanation |
|---|---|---|
| [10, 9, 2, 5, 3, 7, 101, 18] | 4 | A classic example with multiple increasing subsequences. One valid LIS is [2, 3, 7, 101]. |
| [0, 1, 0, 3, 2, 3] | 4 | Demonstrates that the subsequence does not need to be contiguous. An LIS here is [0, 1, 2, 3]. |
| [7, 7, 7, 7, 7] | 1 | All elements are equal. Since the subsequence must be strictly increasing, the longest possible length is 1. |
| [5, 4, 3, 2, 1] | 1 | The array is strictly decreasing, so no increasing subsequence longer than a single element exists. |
Breaking Down the Problem
At first glance, this problem may look similar to finding the longest increasing subarray. However, the key difference lies in the word subsequence.
In a subsequence, elements do not need to be adjacent. We are allowed to skip elements as long as the relative order is preserved. This significantly increases the number of possible combinations we need to consider.
What the problem is really asking is this:
For each element in the array, how long of an increasing subsequence can we build if that element is the last one?
If we can answer this question for every position, then the longest increasing subsequence in the entire array is simply the maximum of those answers.
Naive Approach
A straightforward way to think about the problem is to generate all possible subsequences of the array and check which ones are strictly increasing. Among those, we could track the maximum length.
While this idea works conceptually, it is not practical.
An array of length $n$ has $2^{n}$ possible subsequences. Even for moderately sized inputs, this approach becomes infeasible due to exponential time complexity.
This limitation forces us to look for a more structured way to reuse previously computed results instead of recalculating them from scratch.
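For intuition, the brute-force idea can be sketched directly: enumerate every subset of indices, keep only the strictly increasing ones, and track the best length. This is a minimal sketch (the function name lis_brute_force is ours), useful only for tiny inputs.

```python
from itertools import combinations

def lis_brute_force(nums):
    """Exponential-time reference: try every subsequence of nums."""
    best = 0
    n = len(nums)
    for size in range(1, n + 1):
        # Each sorted tuple of indices defines one subsequence.
        for idx in combinations(range(n), size):
            sub = [nums[i] for i in idx]
            # Keep it only if it is strictly increasing.
            if all(a < b for a, b in zip(sub, sub[1:])):
                best = max(best, size)
    return best
```

Even at $n = 20$ this already examines over a million subsequences, which is exactly why a more structured approach is needed.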
Improving the Naive Approach with Dynamic Programming
The main limitation of the naive approach is that it repeatedly solves the same subproblems. Many increasing subsequences share common prefixes, yet the brute-force method treats each possibility independently.
The key observation is that increasing subsequences have overlapping subproblems. If we already know the length of the longest increasing subsequence ending at earlier positions, we can reuse that information instead of recomputing it from scratch.
Rather than asking for the longest increasing subsequence in the entire array, we reframe the problem:
- What is the longest increasing subsequence ending at index 0?
- What is the longest increasing subsequence ending at index 1?
- And so on.
Once we have the answer for every index, the overall result is simply the maximum among them.
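This reframing can be written as a recurrence. If $dp[i]$ denotes the length of the longest increasing subsequence ending at index $i$, then

$$
dp[i] = 1 + \max\bigl(\{0\} \cup \{\, dp[j] \mid 0 \leq j < i,\ \text{nums}[j] < \text{nums}[i] \,\}\bigr), \qquad \text{answer} = \max_{0 \leq i < n} dp[i]
$$

where the $\{0\}$ term covers the case in which no earlier element is smaller than $\text{nums}[i]$, so the subsequence consists of $\text{nums}[i]$ alone.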
This leads to a dynamic programming solution that:
- Builds results incrementally from left to right
- Reuses previously computed information
- Avoids the redundant work present in the naive approach
Compared to brute force, this approach reduces the time complexity from exponential to polynomial. However, since each element still needs to be compared with all previous elements, the time complexity remains $O(n^{2})$, with an additional $O(n)$ space requirement to store intermediate results.
While this is a significant improvement and works well for moderate input sizes, the solution can be optimized further by reducing the number of comparisons.
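The $O(n^{2})$ dynamic programming approach described above can be sketched as follows (the function name lis_dp is ours):

```python
def lis_dp(nums):
    """O(n^2) DP: dp[i] = length of the longest increasing
    subsequence that ends exactly at index i."""
    n = len(nums)
    if n == 0:
        return 0
    dp = [1] * n  # every element alone is a subsequence of length 1
    for i in range(n):
        for j in range(i):
            # nums[i] can extend any increasing subsequence
            # ending at an earlier, smaller element.
            if nums[j] < nums[i]:
                dp[i] = max(dp[i], dp[j] + 1)
    return max(dp)
```

The double loop is where the quadratic cost comes from: each element is compared against all elements before it.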
Optimal Solution Using Binary Search
To improve upon the quadratic time complexity of the dynamic programming approach, we need a different way to think about how increasing subsequences grow. Instead of tracking the best subsequence ending at every index, we focus on keeping the most promising candidates for subsequences of different lengths.
The core idea is to maintain an auxiliary array where each position represents the smallest possible ending value of an increasing subsequence of a given length. Since this array is always kept in sorted order, we can use binary search to efficiently update it as we process each element in the input.
Step-by-Step Algorithm
The following steps describe how to compute the length of the longest increasing subsequence using binary search and a greedy strategy.
- Initialize an empty list, tails, to store candidate subsequence endings.
- Iterate through each element, num, in the input array:
  - Use binary search on tails to find the leftmost index where num can be placed.
  - If such an index exists within tails, replace the value at that index with num.
  - Otherwise, append num to the end of tails.
- After processing all elements, return the length of tails.
To better understand the solution, let's trace the algorithm on the first example, [10, 9, 2, 5, 3, 7, 101, 18]. The tails list evolves as [10] → [9] → [2] → [2, 5] → [2, 3] → [2, 3, 7] → [2, 3, 7, 101] → [2, 3, 7, 18], so the answer is its final length, 4.
Executable Python Implementation of the LIS Algorithm
The code below implements the algorithm described above.
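This sketch uses the standard library's bisect module for the binary search step; the function name length_of_lis is our choice.

```python
from bisect import bisect_left

def length_of_lis(nums):
    """O(n log n) LIS length via binary search on tails."""
    # tails[k] holds the smallest possible ending value of an
    # increasing subsequence of length k + 1.
    tails = []
    for num in nums:
        # Leftmost position where num could be placed while
        # keeping tails sorted.
        i = bisect_left(tails, num)
        if i == len(tails):
            tails.append(num)  # num extends the longest subsequence seen so far
        else:
            tails[i] = num     # num is a smaller (better) ending for length i + 1
    return len(tails)
```

Using bisect_left (rather than bisect_right) is what enforces strict increase: an element equal to an existing tail replaces it instead of extending the subsequence.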
Time Complexity
Each element in the input array is processed once. For every element, we perform a binary search on the tails array, which takes $O(\log n)$ time in the worst case. Therefore, the overall time complexity of the algorithm is $O(n \log n)$.
Space Complexity
The algorithm uses an auxiliary array tails to store candidate subsequence endings. In the worst case, this array can grow to the size of the input array, resulting in a space complexity of $O(n)$.
Common Pitfalls to Avoid
A frequent source of confusion in this problem is the difference between a subsequence and a subarray. The elements of a subsequence do not need to be contiguous, and overlooking this detail often leads to incorrect approaches.
Another common pitfall appears in the optimal solution. The auxiliary tails array does not represent the actual longest increasing subsequence. Instead, it stores the smallest possible ending values for subsequences of different lengths. Replacing values in this array may seem counterintuitive, but it never affects the final answer and is essential for achieving the $O(n \log n)$ time complexity.
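A small sketch makes this pitfall concrete (the helper name lis_tails is ours): it returns the tails array itself rather than the answer, so we can inspect it.

```python
from bisect import bisect_left

def lis_tails(nums):
    """Return the final tails array (not the LIS itself)."""
    tails = []
    for num in nums:
        i = bisect_left(tails, num)
        if i == len(tails):
            tails.append(num)
        else:
            tails[i] = num
    return tails

# For [3, 4, 1], tails ends as [1, 4]: the 1 arrives after the 4,
# so [1, 4] is not even a subsequence of the input. Its length, 2,
# is still the correct LIS length (one valid LIS is [3, 4]).
```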
Keeping these points in mind helps avoid logical mistakes and makes the solution easier to reason about, especially in interview settings.