What if the dictionary is extremely large? Can we optimize further?

If the dictionary contains millions of words, you could use a trie (prefix tree) data structure instead of a set. This would allow you to check all possible words starting at position j more efficiently, potentially reducing redundant substring operations. However, for most interview scenarios with reasonably sized dictionaries, the set-based approach is sufficient and easier to implement.

Can this problem be solved using memoization instead of tabulation?

Yes, you can solve this using top-down dynamic programming with memoization. Instead of building the dp array bottom-up, you would recursively check if a string can be segmented and cache the results for each starting position. The time and space complexity would remain the same, but some candidates find the recursive approach more intuitive.

How would you modify this solution to return the actual segmentation instead of just a boolean?

You would need to track not just whether segmentation is possible, but also which word was used at each position. This can be done by storing the index j where each successful split occurred, then backtracking from dp[n] to reconstruct the sequence of words. The time and space complexity would remain asymptotically the same.

Word Break Problem: Dynamic Programming Solution & Examples

The Word Break problem is a classic dynamic programming challenge frequently asked in technical interviews at companies like Google, Amazon, and Meta. It tests your ability to recognize overlapping subproblems and optimize recursive solutions, making it an excellent problem for mastering the fundamentals of dynamic programming and string manipulation.

Problem statement

You are given a string s and a dictionary of strings wordDict. Your task is to determine whether the string s can be segmented into a space-separated sequence of one or more dictionary words.

Note: All space-seperated words of s must exist in the dictionary, and the dictionary words can be reused multiple times during segmentation.

Constraints:

1 <= s.length <= 300
1 <= wordDict.length <= 1000
1 <= wordDict[i].length <= 20
s and wordDict[i] consist of only lowercase English letters.
All the strings of wordDict are unique.

Examples

Input (`s`)	Input (`wordDict`)	Output	Explanation
`"codingpro"`	`["coding", "pro"]`	`True`	The string can be broken as “coding pro” and both of these words are available in `wordDict`.
`"programmingproblem"`	`["programming", "problem", "gram"]`	`True`	The string segments as “programming problem”. Note that “gram” exists in `wordDict` But it isn’t needed as it is not present in `s`.
`"ilovedogs"`	`["dog", "cats", "cat", "loves", "and"]`	`False`	Even though individual substrings like “dog”, and “loves” are present in `wordDict` but as “i” is not present, so the output is `False`.

Why recursion fails

The most straightforward solution uses recursion. Starting from the beginning of the string, we try every possible prefix. Whenever a prefix is found in wordDict, we recursively check whether the remaining suffix can also be segmented into valid dictionary words. If any recursive path successfully consumes the entire string, we return True.

Although easy to understand, this naive recursive approach can be very inefficient. It may recompute the same suffix multiple times and ends up exploring a huge number of possible splits. In the worst case, the number of ways to partition a string of length n grows exponentially, so the time complexity is exponential (often written as O(2ⁿ), sometimes O(n·2ⁿ) depending on implementation details such as substring creation). The space complexity is O(n) due to the maximum recursion depth of the call stack.

Optimized approach using dynamic programming

The key optimization is to recognize that the recursive solution repeatedly solves the same subproblems. For example, whether the substring starting at position j can be segmented might be computed many times from different earlier splits. Dynamic programming avoids this redundancy by storing results in a lookup table.

We build a boolean array dp of size n + 1, where dp[i] indicates whether the prefix s[0:i] (the first i characters) can be segmented using words from the dictionary. The extra slot dp[0] represents the empty string, which is always segmentable, so we set:

dp[0] = True

Then for each i from 1 to n, we try all split points j from 0 to i - 1:

If dp[j] is True and s[j:i] exists in the dictionary, then dp[i] = True and we can stop checking further splits for that i.

This builds solutions for larger prefixes using already-solved smaller prefixes, eliminating repeated work.

Step-by-step algorithm

Store the length of the string s in a variable n.
Initialize a set wordSet using wordDict.
- Create a boolean array dp of size n + 1 and initialize all values to False.
Set dp[0] to True.
Iterate through each position i from 1 to n and for each i,
1. Iterate through all possible starting positions j from 0 to i – 1.
  1. If dp[j] is True and if substring s[j:i] exists in wordSet.
    1. Set dp[i] = True
    2. Break out of the inner loop.
After processing all positions, return dp[n].

Let’s look at the following illustration to get a better understanding of the solution:

1 / 6

Code implementation

Let’s look at the code for the solution we just discussed.

def word_break(s, wordDict):
    n = len(s)  # Length of the input string

    wordSet = set(wordDict)  # Convert list to set for O(1) average-case lookups

    # dp[i] represents whether the substring s[0:i] can be segmented into dictionary words
    dp = [False] * (n + 1)

    # Empty string is always a valid segmentation (base case)
    dp[0] = True

    # Iterate over each position i in the string
    for i in range(1, n + 1):
        # Try every possible split point j before position i
        for j in range(i):
            # If s[0:j] is breakable AND s[j:i] is a valid dictionary word,
            # then s[0:i] is also breakable
            if dp[j] and s[j:i] in wordSet:
                dp[i] = True
                break  # No need to check further split points for this i

    # dp[n] tells us whether the entire string s can be segmented
    return dp[n]

# Driver Code
def main():

    s = ["vegancookbook", "catsanddog", "highwaycrash",
         "pineapplepenapple", "screamicecream", "educativecourse"]
    wordDict = ["ncoo", "kboo", "inea", "icec", "ghway", "and", "anco", "hi", "way", "wa",
                 "amic", "ed", "cecre", "ena", "tsa", "ami", "lepen", "highway", "ples",
                 "ookb", "epe", "nea", "cra", "lepe", "ycras", "dog", "nddo", "hway",
                 "ecrea", "apple", "shp", "kbo", "yc", "cat", "tsan", "ganco", "lescr",
                 "ep", "penapple", "pine", "book", "cats", "andd", "vegan", "cookbook"]

    print("The list of words we can use to break down the strings are:\n\n", wordDict, "\n")
    print("-" * 100)
    for i in range(len(s)):
        print("Test Case #", i+1, "\n\nInput: '" + str(s[i])+"'\n")
        print("\nOutput: " + str(word_break(str(s[i]), wordDict)))
        print("-" * 100)

if __name__ == '__main__':
    main()

Time complexity

The time complexity of this algorithm is O(n³ + w x l), where n is the length of the input string s, w is the number of words in wordDict, and l is the average length of words in the dictionary.

Converting wordSet into a set takes O(w x l) time. The nested loops run O(n²) iterations in total since the outer loop runs n times and the inner loop runs up to i times for each iteration of the outer loop. However, the substring operation s[j:i] inside the inner loop creates a new string, which takes O(n) time in the worst case. Therefore, the dominant term is O(n³) from the nested loops with substring extraction, plus the O(w x l) preprocessing cost.

Space complexity

The space complexity of this algorithm is O(n + w x l), where n is the length of the input string.

The dp array requires O(n) space to store boolean values for each position. Additionally, converting wordSet to a set takes O(w x l) space to store all the dictionary words. The substring operations create temporary strings, but these are not stored persistently, so they don’t add to the overall space complexity. In the worst case, the total space needed is O(n + w x l).

Common pitfalls

Forgetting the base case: Many candidates forget to initialize dp[0] = True. Without this, the algorithm cannot recognize that an empty prefix is valid, and no segmentation will ever be marked as possible. Always remember that the empty string is your starting point for building up valid segmentations.
Using a list instead of a set for the dictionary: Checking if a substring exists in a list takes O(w) time for each lookup, where w is the number of words. This can increase your time complexity significantly. Always convert the word dictionary to a set first to achieve O(1) average-case lookup time.

Interview insights

The Word Break problem is a favorite among interviewers for testing:

Your understanding of dynamic programming
Ability to think in terms of prefix validity
Efficient lookup using hash sets
Recognition of overlapping subproblems
Clean reasoning about how segmentations propagate

Word Break

Problem statement

Examples

Why recursion fails

Optimized approach using dynamic programming

Step-by-step algorithm

Code implementation

Time complexity

Space complexity

Common pitfalls

Interview insights

Frequently Asked Questions

Leave a Reply Cancel reply

30. Substring With Concatenation Of All Words

17. Letter Combinations Of A Phone Number

3. Longest Substring Without Repeating Characters

19. Remove Nth Node From End Of List

14. Longest Common Prefix

25. Reverse Nodes In K-group

15. 3sum

22. Generate Parentheses

23. Merge K Sorted Lists

136. Single Number