Removing duplicates from a sorted array is a fundamental array-manipulation problem frequently asked in interviews at companies like Google and Amazon. It tests your ability to work with in-place modifications, respect space constraints, and exploit the properties of sorted data rather than relying on extra memory.
Problem Overview
Given a sorted integer array nums, remove duplicates in place such that each element appears only once, and return the number of unique elements k. After the operation, the first k positions of nums should contain the unique values in their original sorted order. Elements beyond the index k – 1 can be ignored.
Constraints:
- $1 \leq$ nums.length $\leq 3 \times 10^{4}$
- $-100 \leq$ nums[i] $\leq 100$
- nums is sorted in non-decreasing order.
Examples
| Input | Output | Explanation |
|---|---|---|
| [7] | 1 | Only one element exists, so it’s already unique. The first k=1 element is [7]. |
| [-2,-2,-2,-1] | 2 | After in-place removal, the first two positions become [-2,-1]. Everything after the index 1 can be ignored. |
| [1,2,3,4] | 4 | No duplicates appear, so k equals the array length and the first four elements remain [1,2,3,4]. |
| [-100,-100,-50,-50,-50,0,100,100] | 4 | Unique values are [-100,-50,0,100], which occupy the first k=4 slots in sorted order. |
Why the naive approach fails
If extra space were allowed, the problem would be much simpler. A straightforward approach is to use a map to record each unique value and its frequency as we iterate through the array. Once the map is built, we can overwrite the input array with the map’s keys. While this correctly identifies all unique elements, it violates the problem’s requirement to modify the array in-place using constant extra space.
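As a contrast (not a valid submission), the extra-space idea might be sketched as follows; the helper name remove_duplicates_with_map is illustrative, not part of the problem statement:

```python
def remove_duplicates_with_map(nums):
    """Extra-space approach: collect unique values, then overwrite the prefix.
    Violates the O(1)-space requirement; shown only for contrast."""
    seen = {}
    for value in nums:
        seen[value] = seen.get(value, 0) + 1  # record each value and its frequency
    # For a sorted input, insertion order of the keys is already sorted order.
    unique = list(seen.keys())
    nums[:len(unique)] = unique  # overwrite the prefix with the unique values
    return len(unique)
```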
Trying to stay in place often leads candidates to a different brute-force idea: scan the array and, whenever a duplicate is found, remove it by shifting all subsequent elements one position to the left. Although this preserves order and avoids extra memory, it performs a large amount of repeated work. In cases with long runs of duplicates, the same elements are shifted repeatedly, resulting in $O(n^{2})$ time complexity in the worst case.
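A sketch of that shifting brute force is below (here the list is physically shrunk to make the repeated left-shifts explicit; the function name is illustrative):

```python
def remove_duplicates_shifting(nums):
    """Brute force: delete each duplicate by shifting every later element
    one slot to the left, giving O(n^2) time in the worst case."""
    i = 1
    while i < len(nums):
        if nums[i] == nums[i - 1]:
            # Shift all subsequent elements left by one position (the costly step).
            for j in range(i, len(nums) - 1):
                nums[j] = nums[j + 1]
            nums.pop()  # drop the now-stale last slot
        else:
            i += 1
    return len(nums)
```

On an input like [2, 2, 2, 2, 9], the long run of 2s forces the trailing elements to be shifted once per duplicate, which is exactly the repeated work the two-pointer approach eliminates.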
Because duplicates in a sorted array always appear in contiguous blocks, repeatedly shifting elements is unnecessary work. This inefficiency is what motivates a single-pass solution that avoids both extra space and repeated movement.
Optimized Approach: Two Pointers Technique
The problem gives us two crucial guarantees.
- The array is already sorted.
- Only the first k elements of the array matter after removing duplicates; anything beyond that can be ignored.
Together, these constraints strongly suggest that we should overwrite the array in place rather than actually removing elements.
In a sorted array, a value is unique if and only if it differs from the element immediately before it. This allows us to identify new, unique values in a single pass.
We can solve this problem using the Two Pointers pattern, where one pointer scans the array to read values, while the other pointer tracks the position where the next unique value should be written. The write pointer always marks the boundary of the “unique” prefix of the array. Whenever the scanning pointer encounters a value different from the previous one, that value must be a new unique element and is written at the current write position.
This approach does not require extra space and avoids the need for repeated element shifting. Each element is examined once, and each unique value is written exactly once, making the solution both efficient and easy to reason about.
Step-by-Step Algorithm
The following steps describe how to modify the array in-place to retain only unique elements while preserving their original order.
- Initialize write_index to 1 to mark the position where the next unique element should be written, since the first element of a sorted array is always unique.
- Iterate through the array using an index i, starting from position 1 up to len(nums) - 1.
- At each step, compare the current element nums[i] with the previous element nums[i - 1].
- If the two values are different, indicating a new unique element:
  - Assign nums[i] to nums[write_index] to record the new unique element.
  - Increment write_index to move to the next write position.
- After the iteration completes, return write_index, which represents the total number of unique elements in the array.
To better understand the solution, let’s walk through the algorithm using the illustration below:
Python Implementation
The code below implements the algorithm described above.
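A direct translation of the steps above might look like this (the function name remove_duplicates is a convention here; adapt it to whatever signature your platform expects):

```python
from typing import List

def remove_duplicates(nums: List[int]) -> int:
    """Remove duplicates from a sorted list in place.
    Returns k, the number of unique elements; nums[:k] holds them in order."""
    if not nums:
        return 0
    # The first element is always unique, so writing starts at index 1.
    write_index = 1
    for i in range(1, len(nums)):
        # A value that differs from its predecessor is a new unique element.
        if nums[i] != nums[i - 1]:
            nums[write_index] = nums[i]
            write_index += 1
    return write_index
```

Note that the function mutates nums and returns only the count, matching the problem's contract.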
Time Complexity
The time complexity is $O(n)$ because the algorithm scans the array once from left to right, doing constant work per element.
Space Complexity
The space complexity is $O(1)$ because it modifies nums in-place and uses only a few variables regardless of input size.
Edge Cases
The following edge cases highlight scenarios that often cause confusion:
- Array length is 1
  Example: nums = [7]
  Why it works: the loop never runs, and the function returns 1, keeping the single value.
- All elements are duplicates
  Example: nums = [3,3,3]
  Why it works: no comparison nums[i] != nums[i-1] succeeds, so write_index stays 1, leaving one representative.
- No duplicates at all
  Example: nums = [-3,-1,0,2]
  Why it works: every adjacent pair differs, so each element is written and k becomes n.
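These edge cases can be verified with a short script (the two-pointer implementation is redefined here so the snippet is self-contained):

```python
def remove_duplicates(nums):
    """Two-pointer in-place dedup of a sorted list; returns the unique count k."""
    write_index = 1
    for i in range(1, len(nums)):
        if nums[i] != nums[i - 1]:
            nums[write_index] = nums[i]
            write_index += 1
    return write_index

# Each pair is (input, expected unique prefix) for one edge case above.
cases = [([7], [7]), ([3, 3, 3], [3]), ([-3, -1, 0, 2], [-3, -1, 0, 2])]
for nums, expected in cases:
    k = remove_duplicates(nums)
    assert k == len(expected) and nums[:k] == expected
```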
Common Pitfalls
The following pitfalls represent frequent mistakes candidates make during interviews when implementing this solution.
- Returning the modified array instead of returning k: The problem explicitly asks for the count of unique elements. Returning the array itself results in an incorrect answer even if the in-place modification is correct.
- Incrementing write_index even when the current value is a duplicate: The write pointer should only advance when a new unique value is found. Moving it for duplicates introduces gaps and corrupts the unique prefix.
- Comparing against the wrong reference value: Checking against overwritten positions (such as nums[write_index - 1]) without careful reasoning can lead to incorrect duplicate detection. In a sorted array, comparing nums[i] with nums[i - 1] is sufficient and safe.
- Forgetting that only the first k elements matter: Elements beyond index k - 1 can contain any values and should be ignored. Attempting to clean or validate the rest of the array adds unnecessary complexity.