Problem Statement
You’re given an integer array nums and an integer k. Your task is to count how many contiguous, non-empty subarrays have a sum exactly equal to k.
Constraints:
- $1 \leq$
nums.length$\leq 2 \times 10^4$ - $−1000 \leq$
nums[i]$\leq 1000$ - $-10^7 \leq$
k$ \leq 10^7$
Examples
| Input | Output | Explanation |
|---|---|---|
| [3, 4, 7], 7 | 2 | Valid subarrays are [3, 4] (sum = 7) and [7] (sum = 7). |
| [1, -1, 1], 1 | 3 | Valid subarrays are [1] (index 0), [1, -1, 1] (indexes 0 to 2), and [1] (index 2). |
| [0, 0, 0], 0 | 6 | All non-empty subarrays sum to 0. |
| [2, 2, 2], 1 | 0 | No contiguous subarray can sum to 1. |
Why Checking Every Subarray is Too Slow
The most direct way to solve this problem is to consider every possible subarray and compute its sum. For an array of length $n$, there are $O(n^2)$ such subarrays. Even if we optimize the sum calculation using prefix sums, we would still need to examine every $(start, end)$ pair explicitly.
With $n$ up to $20,000$, this is not feasible. So, to do better, we need a way to reason about subarray sums without enumerating all ranges explicitly.
More importantly, brute force treats each subarray as an independent object. It fails to take advantage of the fact that subarrays overlap heavily, and that much of their sum information is repeated. To solve this problem efficiently, we need a way to reuse work across overlapping subarrays instead of recomputing sums from scratch.
Optimized Approach Using Prefix Sum and Hashmap
A prefix sum represents the total sum of elements from the start of the array up to a given index. Once we view the array this way, every subarray sum can be expressed as the difference between two prefix sums.
If a subarray ending at index r has sum k, then the prefix sum just before the subarray starts must be exactly k less than the prefix sum at r. This reframes the problem: instead of searching for subarrays directly, we can search for pairs of prefix sums whose difference is k.
As we iterate through the array from left to right, we keep track of how many times each prefix sum has appeared so far. When we reach a new index, we can immediately determine how many valid subarrays end at that index by checking how often the required prefix sum has already occurred.
The hashmap is what makes this efficient. It allows us to answer the question “How many times have I seen this prefix sum before?” in constant time. Combined with prefix sums, this turns the problem into a single linear pass.
Why this works even with negatives
Unlike the sliding window (which relies on monotonic behavior), prefix sums and frequency counting using hashmap do not care whether numbers are positive, negative, or zero because it do not rely on sums increasing or decreasing in any predictable way.
Algorithm Steps
Let’s look at the algorithmic steps of the approach we just discussed.
- Initialize a hashmap,
prefixCount, to store the frequency of prefix sums. - Insert
{0: 1}into the hashmap to represent the empty prefix before the array starts. - Initialize,
prefixSum = 0, to track the running sum, andresult = 0, to store the total number of valid subarrays - Iterate through each element
numinnums:- Add
numtoprefixSum - If
(prefixSum - k)exists inprefixCount, add its frequency toresult - Increment the frequency of
prefixSumin the hashmap
- Add
- Return
resultafter processing all elements
To better understand the algorithm, let’s walk through an illustration example below:
Python Implementation
Let’s look at the code for the solution we just discussed.
Time Complexity
The time complexity of this solution is $O(n)$, where $n$ is the length of the input array, nums. The algorithm iterates through the array exactly once, and all hashmap lookups and updates are constant-time operations on average. As a result, the total amount of work performed grows linearly with the size of the input.
Space Complexity
The space complexity of this solution is $O(n)$. In the worst case, the hashmap may store up to $n$ distinct prefix sums if all cumulative sums are unique. So, the extra memory used by the algorithm scales linearly with the size of the input array.
Edge Cases to Consider
Let’s discuss important edge cases that should be considered while solving this problem:
- Subarrays starting at index 0
- Example:
nums = [3],k = 3→ should count[3] - That’s why we set
prefix_count[0] = 1at the start.
- Example:
- Negative numbers
- Example:
nums = [1, -1, 1],k = 1→ multiple overlapping answers
- Example:
- Many zeros
- Example:
nums = [0,0,0],k = 0→ output is6(all non-empty subarrays)
- Example:
Mistakes to Avoid
Following are the known mistakes to avoid:
- Using sliding window: fails when negatives exist.
- Forgetting
prefix_count[0] = 1: misses subarrays that start at index0. - Updating
prefix_count[prefix_sum]before counting: can accidentally count “empty-length” or wrong ranges.