Detecting duplicates in an array is a fundamental problem that appears frequently in technical interviews at companies like Google, Amazon, and Microsoft. This problem tests your understanding of hash-based data structures and your ability to optimize from a quadratic solution to linear time.
Problem statement
You are given an integer array nums. Your task is to determine whether any value appears more than once in the array. Return True if it does; otherwise, return False.
Constraints:
- 1 <=
nums.length<= 105 - -109 <=
nums[i]<= 109
Examples
| Input | Output | Explanation |
| --- | --- | --- |
| [3, 7, 1, 3] | True | The number 3 appears at indices 0 and 3. |
| [5, 2, 8, 1] | False | Every element in the array is unique. |
| [2, 2, 2, 5, 5, 9, 5, 3, 9, 3] | True | Multiple elements repeat: 2, 5, 9, and 3 all appear more than once. |
Why checking all pairs becomes expensive
When we first encounter this problem, a common brute-force approach is to use a nested loop: compare each element with every element that comes after it. Specifically, for each index i, we scan indices i + 1 through n - 1 to check whether the same value appears again. This method correctly detects duplicates, but it becomes very inefficient as the array grows.
The issue is that for every element, we potentially perform a linear scan over the remaining portion of the array. If the array contains no duplicates, we end up checking almost every possible pair of elements. The total number of comparisons in this worst case is O(n²). With n = 10^5, this could mean ~10^10 comparisons, which is far too slow.
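The nested-loop approach described above can be sketched as follows (the function name contains_duplicate_brute is illustrative, not part of the problem statement):

```python
def contains_duplicate_brute(nums):
    n = len(nums)
    # Compare each element with every element that comes after it
    for i in range(n):
        for j in range(i + 1, n):
            if nums[i] == nums[j]:
                return True  # Found a repeated value
    return False  # No pair of equal elements matched

print(contains_duplicate_brute([3, 7, 1, 3]))  # True
print(contains_duplicate_brute([5, 2, 8, 1]))  # False
```

The inner loop is what makes this quadratic: for each of the n outer iterations, we may scan up to n - 1 remaining elements.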
Optimized approach using a set
The "check all pairs" method wastes work because it re-checks the same information again and again. When we move from index i to i + 1, we basically restart the search from scratch, even though we already learned a lot about the earlier elements.
So instead of repeatedly scanning the remaining part of the array for each number, we can flip the perspective:
- While traversing the array from left to right, anything we’ve already passed is known.
- For the current value, the only question we need to answer is: Have we seen this before?
If we can answer that question quickly, we can eliminate the expensive inner loop entirely. That’s exactly what a set gives us: we keep a running record of values we’ve encountered so far, and each time we read a new number, we do a constant-time lookup. If it’s already present, we’ve found a duplicate immediately; if not, we record it and continue.
Why not use sorting instead?
Sorting the array and then checking adjacent elements would work and give us O(n log n) time complexity, but it's slower than necessary and also requires either modifying the input array or creating a sorted copy. Our set-based approach achieves better time complexity while leaving the original array unchanged.
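For comparison, the sorting-based alternative might look like this (a sketch; contains_duplicate_sorted is an illustrative name):

```python
def contains_duplicate_sorted(nums):
    # sorted() returns a new list, so the original array is untouched.
    # After sorting, any duplicates sit next to each other.
    ordered = sorted(nums)
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            return True  # Two adjacent equal values means a duplicate
    return False
```

The sort dominates at O(n log n), and the adjacent-pair scan adds only O(n) on top of it.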
Step-by-step algorithm
- Initialize an empty set called records.
- Iterate through the nums array, and for each number num:
  - Check if num already exists in records.
    - If it does, return True, as we've encountered a duplicate.
    - Otherwise, add num to records.
- After processing all numbers without finding duplicates, return False.
Code implementation
Let’s look at the code for the solution we just discussed.
Code for the contains duplicate problem
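A Python implementation of the set-based algorithm described above might look like this:

```python
def contains_duplicate(nums):
    records = set()          # Values encountered so far
    for num in nums:
        if num in records:   # O(1) average-time membership check
            return True      # Duplicate found; stop immediately
        records.add(num)     # Record the value and keep scanning
    return False             # Every value was unique

# Running it on the example inputs:
print(contains_duplicate([3, 7, 1, 3]))                  # True
print(contains_duplicate([5, 2, 8, 1]))                  # False
print(contains_duplicate([2, 2, 2, 5, 5, 9, 5, 3, 9, 3]))  # True
```

Note that the function can return True as soon as the first duplicate is seen, without scanning the rest of the array.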
Time complexity
The time complexity of this algorithm is O(n), where n is the number of elements in the input array. We iterate through the array exactly once, and for each element, we perform a set lookup and insertion, both of which are O(1) operations on average. Even in the worst case, where no duplicates exist, we still only make a single pass through all n elements.
Space complexity
The space complexity is O(n) in the worst case. This occurs when all elements in the array are distinct, requiring us to store every element in our records set.