This problem trips up many candidates because it seems simple at first glance, but the subtle mechanics of when and how to shrink your window cause real confusion during interviews. Most people either over-complicate it with nested loops or fail to correctly handle the case where a repeated character appears outside the current window. At its core, this problem teaches the sliding window pattern, specifically a variable-width window that dynamically adjusts as you scan through the input. Mastering this pattern unlocks a whole family of substring and subarray problems that appear frequently in technical interviews.
Problem statement
Given a string s, find the length of the longest substring that contains no repeating characters. A substring is a contiguous sequence of characters within the string.
You need to return an integer representing the length of that longest substring. If the string is empty, return 0. Note that the answer must be a substring, not a subsequence, meaning the characters must be adjacent in the original string.
Constraints
0 ≤ s.length ≤ 5 * 10^4sconsists of English letters, digits, symbols, and spaces
Examples
| Input | Output | Explanation |
|---|---|---|
s = "" | 0 | The string is empty, so no substring exists. |
s = "a" | 1 | The only character “a” forms the longest unique substring. |
s = "abcdefgh" | 8 | Every character is distinct, so the entire string qualifies. |
s = "aab" | 2 | “ab” starting at index 1 is the longest substring with all unique characters. |
s = "dvdf" | 3 | “vdf” is the longest valid substring. “dvd” fails because ‘d’ repeats. |
s = "tmmzuxt" | 5 | “mzuxt” starting at index 2 is the longest substring without duplicates. |
Why enumerating all possibilities doesn’t scale
The most straightforward approach would be to examine every possible substring, check whether all its characters are unique, and track the maximum length found. For a string of length n, there are roughly n * (n - 1) / 2 substrings, and checking each one for uniqueness takes up to O(n) time. This leads to an overall O(n^3) time complexity, which becomes impractical when the string contains tens of thousands of characters.
Even if you optimize the uniqueness check with a set, you still end up with O(n^2) because you are re-examining overlapping portions of the string repeatedly. The fundamental issue is that when you encounter a duplicate, you throw away all the knowledge you already gathered about the current window and start fresh. An efficient approach should instead preserve that knowledge and only discard the minimum necessary portion.
Optimized approach
The sliding window technique is a natural fit here. You maintain two pointers, window_start and the current index, that define the boundaries of a window guaranteed to contain only unique characters. As you iterate through the string with the right pointer, you use a hash map to record the most recent index where each character was seen.
When you encounter a character that already exists inside the current window, instead of resetting everything, you simply slide the left boundary of the window to one position past where that character was last seen. This single adjustment restores the uniqueness invariant without any redundant scanning.
Why the optimized approach works well
The key insight is that you never need to move the left pointer backward. Every character is visited at most twice: once when the right pointer reaches it, and once when the left pointer skips past it. The hash map provides O(1) lookups to determine whether and where a character was last seen, so each step of the iteration is constant time.
This approach also handles a subtle edge case gracefully. If a repeated character’s last occurrence is before the current window_start, it means the duplicate is outside the active window and can be safely ignored. The algorithm only advances window_start when the duplicate genuinely lies within the window, preventing incorrect shrinkage.
Step-by-step algorithm
- If the input string is empty, return 0 immediately.
- Initialize
window_startto 0,longestto 0, and an empty hash maplast_seen_atthat maps characters to their most recent index. - Iterate through the string using
indexas the right boundary of the window. - For each character at
index, check if it exists inlast_seen_at. - If the character is not in the map, store it with its current index and continue.
- If the character is in the map and its last seen index is greater than or equal to
window_start, it means the duplicate is inside the current window. Compute the current window length asindex - window_start, updatelongestif this length is greater, and movewindow_startto one position past the duplicate’s last seen index. - Update the character’s entry in
last_seen_atto the current index. - After the loop ends, perform one final check comparing
longestwith the length of the remaining window fromwindow_startto the end of the string. - Return
longest.
Python implementation
Code for Longest Substring Without Repeating Characters
Time complexity
The algorithm traverses the string exactly once with the right pointer, performing constant-time hash map operations at each step. The left pointer window_start only moves forward and never revisits an index. Therefore, the overall time complexity is O(n), where n is the length of the input string s.
Space complexity
The hash map last_seen_at stores at most one entry per distinct character in the string. Since the character set is bounded (English letters, digits, symbols, and spaces), the space used by the map is O(min(n, m)), where n is the string length and m is the size of the character set. In practice, this is O(1) because the character set is finite, but for very short strings it scales with n.
Edge cases
- Empty string: The function returns 0 immediately since there are no characters to form a substring.
- All identical characters (e.g.,
"aaaaa"): The window can never grow beyond length 1, so the result is 1. This tests that the window correctly shrinks every step. - All unique characters (e.g.,
"abcdefgh"): The window never needs to shrink, and the entire string is the answer. This confirms the final check after the loop captures the longest window. - Single character string: Returns 1, verifying that the algorithm handles minimal input correctly.
- Duplicate at the very end (e.g.,
"abca"): The duplicate ‘a’ appears just as the loop finishes. This tests that the window update logic and the post-loop length comparison both work together.
Common pitfalls
- Forgetting to check whether the duplicate is inside the current window: If you blindly move
window_startwhenever you see a repeated character, you may accidentally shrink the window past its current start, which produces incorrect results for inputs like"tmmzuxt". - Not performing the final length check after the loop: The longest valid substring might extend all the way to the end of the string. If you only update
longestinside the loop when a duplicate is found, you will miss this case entirely. - Moving
window_startbackward: The left pointer must only move forward. If a stale entry in the hash map causeswindow_startto jump to an earlier position, the window invariant breaks. - Using a set instead of a map: A set can tell you whether a character is in the window, but it cannot tell you where it was last seen. Without that position information, you would need to shrink the window one character at a time, degrading performance.