26. Remove Duplicates from Sorted Array

Removing duplicates from a sorted array is a fundamental array-manipulation problem frequently asked in interviews at companies like Google and Amazon. It tests your ability to work with in-place modifications, respect space constraints, and exploit the properties of sorted data rather than relying on extra memory.

Problem Overview

Given a sorted integer array nums, remove duplicates in place such that each element appears only once, and return the number of unique elements k. After the operation, the first k positions of nums should contain the unique values in their original sorted order. Elements beyond the index k – 1 can be ignored.

Constraints:

  • $1 \leq$ nums.length $\leq 3 \times 10^{4}$
  • $-100 \leq$ nums[i] $\leq 100$
  • nums is sorted in non-decreasing order.

Examples

| Input | Output | Explanation |
| --- | --- | --- |
| [7] | 1 | Only one element exists, so it’s already unique. The first k=1 element is [7]. |
| [-2,-2,-2,-1] | 2 | After in-place removal, the first two positions become [-2,-1]; everything after index 1 can be ignored. |
| [1,2,3,4] | 4 | No duplicates appear, so k equals the array length and the first four elements remain [1,2,3,4]. |
| [-100,-100,-50,-50,-50,0,100,100] | 4 | Unique values are [-100,-50,0,100], which occupy the first k=4 slots in sorted order. |

Why the naive approach fails

If extra space were allowed, the problem would be much simpler. A straightforward approach is to use a map to record each unique value and its frequency as we iterate through the array. Once the map is built, we can overwrite the input array with the map’s keys. While this correctly identifies all unique elements, it violates the problem’s requirement to modify the array in-place using constant extra space.
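As an illustration, the map-based idea might be sketched as follows (the function name is ours; the dictionary it builds is exactly the extra space the problem forbids):

```python
def remove_duplicates_with_map(nums):
    # Count each value. In Python 3.7+, dicts preserve insertion order,
    # so iterating the keys yields the unique values in sorted order.
    counts = {}
    for value in nums:
        counts[value] = counts.get(value, 0) + 1

    # Overwrite the front of the array with the unique keys.
    for i, value in enumerate(counts):
        nums[i] = value

    return len(counts)  # k, the number of unique elements
```

This produces the right answer, but the dictionary grows with the number of distinct values, so the space usage is O(n) rather than the required O(1).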

Trying to stay in place often leads candidates to a different brute-force idea: scan the array and, whenever a duplicate is found, remove it by shifting all subsequent elements one position to the left. Although this preserves order and avoids extra memory, it performs a large amount of repeated work. In cases with long runs of duplicates, the same elements are shifted repeatedly, resulting in $O(n^{2})$ time complexity in the worst case.

Because duplicates in a sorted array always appear in contiguous blocks, repeatedly shifting elements is unnecessary work. This inefficiency is what motivates a single-pass solution that avoids both extra space and repeated movement.
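To make the inefficiency concrete, here is one way the shifting brute force might look (a sketch for comparison, not the recommended solution):

```python
def remove_duplicates_by_shifting(nums):
    # In-place, but O(n^2) in the worst case: each removal shifts
    # the entire remaining tail one position to the left.
    i = 1
    length = len(nums)
    while i < length:
        if nums[i] == nums[i - 1]:
            # Shift everything after position i one slot left.
            for j in range(i, length - 1):
                nums[j] = nums[j + 1]
            length -= 1  # one duplicate removed
        else:
            i += 1
    return length
```

On an input like [3,3,3,...,3], almost every iteration triggers a full shift of the tail, which is where the quadratic behavior comes from.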

Optimized Approach: Two Pointers Technique

The problem gives us two crucial guarantees.

  1. The array is already sorted.
  2. Only the first k elements of the array matter after removing duplicates; anything beyond that can be ignored.

Together, these constraints strongly suggest that we should overwrite the array in place rather than actually removing elements.

In a sorted array, a value is unique if and only if it differs from the element immediately before it. This allows us to identify new, unique values in a single pass.

We can solve this problem using the Two Pointers pattern, where one pointer scans the array to read values, while the other pointer tracks the position where the next unique value should be written. The write pointer always marks the boundary of the “unique” prefix of the array. Whenever the scanning pointer encounters a value different from the previous one, that value must be a new unique element and is written at the current write position.

This approach does not require extra space and avoids the need for repeated element shifting. Each element is examined once, and each unique value is written exactly once, making the solution both efficient and easy to reason about.

Step-by-Step Algorithm

The following steps describe how to modify the array in-place to retain only unique elements while preserving their original order.

  1. Initialize write_index to $1$ to mark the position where the next unique element should be written, since the first element of a sorted array is always unique.
  2. Iterate through the array using an index i, starting from position 1 up to len(nums) - 1.
  3. At each step, compare the current element nums[i] with the previous element nums[i - 1].
  4. If the two values are different, indicating a new unique element:
    1. Assign nums[i] to nums[write_index] to record the new unique element.
    2. Increment write_index to move to the next write position.
  5. After the iteration completes, return write_index, which represents the total number of unique elements in the array.

To better understand the solution, let’s trace the algorithm on nums = [-2, -2, -2, -1]:

  • write_index starts at 1; nums[0] = -2 is already in place.
  • i = 1: nums[1] equals nums[0], so nothing is written.
  • i = 2: nums[2] equals nums[1], so nothing is written.
  • i = 3: nums[3] = -1 differs from nums[2], so it is written to nums[1] and write_index becomes 2.
  • The loop ends, and write_index = 2 is returned; the unique prefix is [-2, -1].

Python Implementation

The code below implements the algorithm described above.
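A concise implementation of the steps above might look like this (the standalone function name remove_duplicates is illustrative; LeetCode wraps the same logic in a Solution class method):

```python
def remove_duplicates(nums):
    if not nums:  # defensive guard; the constraints guarantee len(nums) >= 1
        return 0

    # The first element of a sorted array is always unique,
    # so the next unique value is written at index 1.
    write_index = 1
    for i in range(1, len(nums)):
        if nums[i] != nums[i - 1]:  # a new unique value
            nums[write_index] = nums[i]
            write_index += 1
    return write_index  # k, the count of unique elements
```

For example, calling remove_duplicates on [-100,-100,-50,-50,-50,0,100,100] returns 4 and leaves [-100,-50,0,100] in the first four positions.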

Time Complexity

The time complexity is $O(n)$ because the algorithm scans the array once from left to right, doing constant work per element.

Space Complexity

The space complexity is $O(1)$ because it modifies nums in-place and uses only a few variables regardless of input size.

Edge Cases

The following edge cases highlight scenarios that often cause confusion:

  • Array length is 1
    Example: nums = [7]
    Why it works: the loop never runs, and the function returns 1, keeping the single value.
  • All elements are duplicates
    Example: nums = [3,3,3]
    Why it works: no nums[i] != nums[i-1] occurs, so write_index stays 1, leaving one representative.
  • No duplicates at all
    Example: nums = [-3,-1,0,2]
    Why it works: every adjacent pair differs, so each element is written and k becomes n.

Common Pitfalls

The following pitfalls represent frequent mistakes candidates make during interviews when implementing this solution.

  • Returning the modified array instead of returning k: The problem explicitly asks for the count of unique elements. Returning the array itself results in an incorrect answer even if the in-place modification is correct.
  • Incrementing write_index even when the current value is a duplicate: The write pointer should only advance when a new unique value is found. Moving it for duplicates introduces gaps and corrupts the unique prefix.
  • Comparing against the wrong reference value: Checking against overwritten positions (such as nums[write_index - 1]) without careful reasoning can lead to incorrect duplicate detection. In a sorted array, comparing nums[i] with nums[i - 1] is sufficient and safe.
  • Forgetting that only the first k elements matter: Elements beyond index k - 1 can contain any values and should be ignored. Attempting to clean or validate the rest of the array adds unnecessary complexity.

Frequently Asked Questions

Why does sorting matter here?

Sorting guarantees that duplicates appear next to each other, allowing us to identify unique elements in a single pass without extra memory.

Why not “remove” elements by shifting left?

Repeated shifting causes the same elements to move multiple times, resulting in quadratic time complexity for larger inputs.

When should I use the Two Pointers pattern?

When you need to scan and rewrite an array in-place, especially if it’s sorted, Two Pointers is often the best fit.
