Converting a sorted array into a height-balanced binary search tree is a classic problem frequently asked at companies like Google, and Microsoft. It tests your understanding of divide-and-conquer recursion and the properties of binary search trees, making it an essential pattern to master for technical interviews.
Problem statement
You are given an integer array nums containing elements sorted in ascending order. Your task is to construct a height-balanced binary search tree (BST) from this array.
A height-balanced binary search tree is a binary tree in which the difference in heights between the left and right subtrees of any node is at most 1.
Note that multiple valid balanced BSTs can exist for the same input array.
Constraints:
- 1 <=
nums.length<= 104 - -104 <=
nums[i]<= 104 numsis sorted in a strictly increasing order.
Examples
Intutition
The key intuition is that our input is already sorted. For a three-element sorted array, the middle element becomes the root, with the first as the left child and the third as the right child, automatically creating a balanced BST.
Optimized approach using recursion
We extend this through recursive subdivision: select the middle element as the current node, then recursively build the left subtree from the left half and the right subtree from the right half. Choosing the middle ensures that both subtrees have roughly equal numbers of nodes, guaranteeing balance.
As the array is sorted, elements to the left of the middle element are automatically smaller, and elements to the right are larger; the sorted order ensures the BST property, while middle-element selection ensures balance.
This approach works with any values (negative, positive, mixed) because we rely on positioning, not absolute values. The recursive divide-and-conquer naturally maintains balance by building symmetrically from the center outward.
Step-by-step algorithm
Helper Method sortedArrayToBstHelper(nums, low, high): A recursive helper function that builds the binary search tree from a segment of the sorted array. This function takes the nums array along with low and high indexes that define the current subarray range.
- If
lowis greater thanhigh, returnNoneas no elements remain in the current range representing an empty subtree. - Calculate the middle index by setting
mid=low + (high - low) // 2. - Initialize
rootas a newTreeNodewithnums[mid]as its data. - Build the left subtree by recursively calling
sortedArrayToBstHelperwith the range(low, mid - 1)and assign the returned subtree toroot.left. - Build the right subtree by recursively calling
sortedArrayToBstHelperwith range(mid + 1, high)and assign the returned subtree toroot.right. - Return the
rootnode of the constructed subtree, so it can be linked to its parent node in the recursion chain.
sortedArrayToBST(nums): This function initiates the BST construction process. This invokes the helper function sortedArrayToBstHelper with the complete array range by passing nums, 0 as the starting index, and len(nums) - 1 as the ending index, and returns the root node of the fully constructed height-balanced BST.
To better understand the solution, let’s walk through the algorithm using the illustration below:
Code implementation
Let’s look at the code for the solution we just discussed.
Code for the Convert Sorted Array to Binary Search Tree
Time complexity
The time complexity of this algorithm is O(n) where n represents the number of elements in the input array. We create exactly one tree node for each element in the array, visiting each element precisely once during the recursive construction process.
Space complexity
The space complexity is O(log n) for the recursion call stack in the average case where the tree is balanced. As we’re constructing a height-balanced BST, the maximum depth of our recursive calls equals the height of the tree, which is logarithmic (log n) for a balanced binary tree containing n nodes.
Common pitfalls and interview tips
- Integer overflow in middle calculation: Using
mid = (low + high) // 2can cause integer overflow in languages like Java or C++ whenlowandhighare both large values. Always use the safer formulamid = low + (high - low) // 2to avoid this issue. While Python handles large integers automatically, using the safer formula demonstrates best practices and awareness of potential edge cases. - Confusing node attribute names: The provided solution uses
TreeNode.datainstead of the more commonTreeNode.val. Make sure you’re consistent with your node class definition throughout your implementation. Mixing attribute names like usingnode.datain one place andnode.valin another will cause runtime errors that can be difficult to debug during an interview.