Why use a hash map instead of a set for tracking characters?

A hash map stores the index where each character was last seen, allowing you to jump window_start directly to the correct position in constant time. With a set, you would only know that a duplicate exists but not where, forcing you to incrementally remove characters from the left until the duplicate is cleared.

How do you decide when to update longest versus when to move the window?

You update longest right before moving the window start, because at that moment you know the current window is about to shrink. You also perform a final update after the loop to catch the case where the longest substring runs to the end of the string.

Can this approach be adapted for "at most K distinct characters" problems?

Yes. The same sliding window framework applies. Instead of requiring all characters to be unique, you maintain a frequency map and shrink the window whenever the number of distinct keys exceeds K. The core logic of expanding the right pointer and conditionally shrinking the left pointer remains the same.

Why is this considered a sliding window problem rather than a two-pointer problem?

Sliding window is a specific form of the two-pointer technique where both pointers move in the same direction and define a contiguous range. The distinction matters because the window represents a meaningful subarray or substring, and the algorithm maintains an invariant (uniqueness, in this case) over that range.

3. Longest Substring Without Repeating Characters

This problem trips up many candidates because it seems simple at first glance, but the subtle mechanics of when and how to shrink your window cause real confusion during interviews. Most people either over-complicate it with nested loops or fail to correctly handle the case where a repeated character appears outside the current window. At its core, this problem teaches the sliding window pattern, specifically a variable-width window that dynamically adjusts as you scan through the input. Mastering this pattern unlocks a whole family of substring and subarray problems that appear frequently in technical interviews.

Problem statement

Given a string s, find the length of the longest substring that contains no repeating characters. A substring is a contiguous sequence of characters within the string.

You need to return an integer representing the length of that longest substring. If the string is empty, return 0. Note that the answer must be a substring, not a subsequence, meaning the characters must be adjacent in the original string.

Constraints

0 ≤ s.length ≤ 5 * 10^4
s consists of English letters, digits, symbols, and spaces

Examples

Input	Output	Explanation
`s = ""`	`0`	The string is empty, so no substring exists.
`s = "a"`	`1`	The only character “a” forms the longest unique substring.
`s = "abcdefgh"`	`8`	Every character is distinct, so the entire string qualifies.
`s = "aab"`	`2`	“ab” starting at index 1 is the longest substring with all unique characters.
`s = "dvdf"`	`3`	“vdf” is the longest valid substring. “dvd” fails because ‘d’ repeats.
`s = "tmmzuxt"`	`5`	“mzuxt” starting at index 2 is the longest substring without duplicates.

Why enumerating all possibilities doesn’t scale

The most straightforward approach would be to examine every possible substring, check whether all its characters are unique, and track the maximum length found. For a string of length n, there are roughly n * (n - 1) / 2 substrings, and checking each one for uniqueness takes up to O(n) time. This leads to an overall O(n^3) time complexity, which becomes impractical when the string contains tens of thousands of characters.

Even if you optimize the uniqueness check with a set, you still end up with O(n^2) because you are re-examining overlapping portions of the string repeatedly. The fundamental issue is that when you encounter a duplicate, you throw away all the knowledge you already gathered about the current window and start fresh. An efficient approach should instead preserve that knowledge and only discard the minimum necessary portion.

Optimized approach

The sliding window technique is a natural fit here. You maintain two pointers, window_start and the current index, that define the boundaries of a window guaranteed to contain only unique characters. As you iterate through the string with the right pointer, you use a hash map to record the most recent index where each character was seen.

When you encounter a character that already exists inside the current window, instead of resetting everything, you simply slide the left boundary of the window to one position past where that character was last seen. This single adjustment restores the uniqueness invariant without any redundant scanning.

Why the optimized approach works well

The key insight is that you never need to move the left pointer backward. Every character is visited at most twice: once when the right pointer reaches it, and once when the left pointer skips past it. The hash map provides O(1) lookups to determine whether and where a character was last seen, so each step of the iteration is constant time.

This approach also handles a subtle edge case gracefully. If a repeated character’s last occurrence is before the current window_start, it means the duplicate is outside the active window and can be safely ignored. The algorithm only advances window_start when the duplicate genuinely lies within the window, preventing incorrect shrinkage.

Step-by-step algorithm

If the input string is empty, return 0 immediately.
Initialize window_start to 0, longest to 0, and an empty hash map last_seen_at that maps characters to their most recent index.
Iterate through the string using index as the right boundary of the window.
For each character at index, check if it exists in last_seen_at.
If the character is not in the map, store it with its current index and continue.
If the character is in the map and its last seen index is greater than or equal to window_start, it means the duplicate is inside the current window. Compute the current window length as index - window_start, update longest if this length is greater, and move window_start to one position past the duplicate’s last seen index.
Update the character’s entry in last_seen_at to the current index.
After the loop ends, perform one final check comparing longest with the length of the remaining window from window_start to the end of the string.
Return longest.

1 / 11

Python implementation

def find_longest_substring(input_str):
    # Handle the edge case of an empty string
    if len(input_str) == 0:
        return 0

    window_start = 0
    longest = 0
    last_seen_at = {}

    for index, val in enumerate(input_str):
        if val not in last_seen_at:
            # First time seeing this character, record its position
            last_seen_at[val] = index
        else:
            # Character seen before; check if it falls within the current window
            if last_seen_at[val] >= window_start:
                # Compute window length before shrinking
                window_length = index - window_start
                if longest < window_length:
                    longest = window_length
                # Move window start past the previous occurrence
                window_start = last_seen_at[val] + 1

            # Update the last seen position for this character
            last_seen_at[val] = index

    # Final window check after the loop ends
    if longest < (index + 1) - window_start:
        longest = (index + 1) - window_start

    return longest


# Driver code
def main():
    test_cases = ["", "a", "abcdefgh", "aab", "dvdf"]

    for i, s in enumerate(test_cases):
        result = find_longest_substring(s)
        print(f"{i + 1}. Input: \"{s}\"")
        print(f"   Result: {result}")
        print("-" * 60)


if __name__ == "__main__":
    main()

Code for Longest Substring Without Repeating Characters

Time complexity

The algorithm traverses the string exactly once with the right pointer, performing constant-time hash map operations at each step. The left pointer window_start only moves forward and never revisits an index. Therefore, the overall time complexity is O(n), where n is the length of the input string s.

Space complexity

The hash map last_seen_at stores at most one entry per distinct character in the string. Since the character set is bounded (English letters, digits, symbols, and spaces), the space used by the map is O(min(n, m)), where n is the string length and m is the size of the character set. In practice, this is O(1) because the character set is finite, but for very short strings it scales with n.

Edge cases

Empty string: The function returns 0 immediately since there are no characters to form a substring.
All identical characters (e.g., "aaaaa"): The window can never grow beyond length 1, so the result is 1. This tests that the window correctly shrinks every step.
All unique characters (e.g., "abcdefgh"): The window never needs to shrink, and the entire string is the answer. This confirms the final check after the loop captures the longest window.
Single character string: Returns 1, verifying that the algorithm handles minimal input correctly.
Duplicate at the very end (e.g., "abca"): The duplicate ‘a’ appears just as the loop finishes. This tests that the window update logic and the post-loop length comparison both work together.

Common pitfalls

Forgetting to check whether the duplicate is inside the current window: If you blindly move window_start whenever you see a repeated character, you may accidentally shrink the window past its current start, which produces incorrect results for inputs like "tmmzuxt".
Not performing the final length check after the loop: The longest valid substring might extend all the way to the end of the string. If you only update longest inside the loop when a duplicate is found, you will miss this case entirely.
Moving window_start backward: The left pointer must only move forward. If a stale entry in the hash map causes window_start to jump to an earlier position, the window invariant breaks.
Using a set instead of a map: A set can tell you whether a character is in the window, but it cannot tell you where it was last seen. Without that position information, you would need to shrink the window one character at a time, degrading performance.

3. Longest Substring Without Repeating Characters

Problem statement

Constraints

Examples

Why enumerating all possibilities doesn’t scale

Optimized approach

Why the optimized approach works well

Step-by-step algorithm

Python implementation

Time complexity

Space complexity

Edge cases

Common pitfalls

Frequently Asked Questions

Leave a Reply Cancel reply

30. Substring With Concatenation Of All Words

17. Letter Combinations Of A Phone Number

3. Longest Substring Without Repeating Characters

19. Remove Nth Node From End Of List

14. Longest Common Prefix

25. Reverse Nodes In K-group

15. 3sum

22. Generate Parentheses

23. Merge K Sorted Lists

136. Single Number