Top K Elements

The Top K Elements pattern is one of the most practical techniques for solving selection and ranking problems in coding interviews. It appears frequently in easy and medium-level interview problems at major tech companies.

If you’ve ever solved problems involving finding the largest numbers, most frequent elements, or closest points, you’ve already encountered scenarios where Top K Elements is the natural solution.

A real-world intuition

Imagine you’re organizing a competition with 10,000 participants, but you only need to award the top 3 winners.

You don’t need to rank all 10,000 people from first to last. That would be wasteful. Instead, you maintain a small podium of 3 spots. As each participant finishes, you check: “Is this person better than my current 3rd place?” If yes, they replace that spot. If no, you move on.

At the end, you have your top 3 without ever fully sorting 10,000 results.

That is exactly how the Top K Elements pattern works.

What is the Top K Elements pattern?

The Top K Elements pattern is an algorithmic technique for efficiently identifying k elements from a collection based on specific criteria. These criteria typically involve finding:

  • The k largest elements
  • The k smallest elements
  • The k most frequent elements
  • The k closest elements to a target value

The fundamental challenge this pattern addresses is efficiency. While sorting the entire dataset would certainly allow us to pick the top k elements, it’s often unnecessarily expensive for our needs. The Top K Elements pattern provides a more targeted approach that optimizes both time and space complexity.

The above diagram illustrates how these elements are stored in a heap.
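As one illustration of the criteria listed above, here is a minimal sketch of finding the k most frequent elements. The function name `k_most_frequent` is hypothetical, and the sketch assumes the items are hashable and mutually comparable (tuples are compared on the item when counts tie).

```python
import heapq
from collections import Counter


def k_most_frequent(items, k):
    """Return the k most frequent items, most frequent first."""
    if k <= 0:
        return []
    counts = Counter(items)  # item -> number of occurrences
    heap = []  # min heap of (count, item) pairs, never larger than k
    for item, count in counts.items():
        if len(heap) < k:
            heapq.heappush(heap, (count, item))
        elif count > heap[0][0]:
            # More frequent than the least frequent kept item: swap it in.
            heapq.heapreplace(heap, (count, item))
    # The heap is only partially ordered, so sort the k survivors.
    return [item for count, item in sorted(heap, reverse=True)]
```

Counting takes O(n) time, and the heap phase takes O(m log k), where m is the number of distinct items.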

Implementation of Top K Elements

The Top K Elements pattern is typically implemented using a heap (priority queue) of size k. A heap is a specialized tree-based data structure that satisfies the heap property. There are two types of heaps:

Min Heap

The parent node is always smaller than or equal to its children. The smallest element is at the root. In a fixed-size min heap of k elements, when you try to insert a new element: if the heap isn’t full yet, the element is added directly; if the heap is full and the new element is larger than the root (smallest element), the root gets removed and the new element is inserted, maintaining only the k largest elements seen so far.

Let’s have a look at the following illustration to understand how to use a min heap to find the top three largest elements.


Notice how at each step, we only compare against the root (the smallest of our top k). If the new element is smaller than our threshold, we don’t even need to touch the heap. This is what makes the algorithm efficient.

Python implementation

Let’s have a look at the code for the algorithm we just discussed.
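The original listing is not reproduced here, so the following is a minimal sketch of the min heap approach described above, not necessarily the author's exact code. It uses Python's standard `heapq` module; the function name `k_largest` is an assumption made for illustration.

```python
import heapq


def k_largest(nums, k):
    """Return the k largest values in nums, largest first."""
    if k <= 0:
        return []
    heap = []  # min heap; heap[0] is the smallest of the values kept so far
    for x in nums:
        if len(heap) < k:
            # Heap not full yet: add the element directly.
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # Larger than the current k-th largest: replace the root.
            heapq.heapreplace(heap, x)
        # Otherwise x cannot be among the k largest, so it is skipped.
    return sorted(heap, reverse=True)
```

If k exceeds the length of the input, this sketch simply returns every element in descending order.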

Time complexity

The time complexity is O(n log k), where n is the total number of elements in the array and k is the number of largest elements we want to find. We iterate through all n elements once, and for each element, we perform at most one heap operation (push or replace) that takes O(log k) time since the heap never holds more than k elements.

Space complexity

The space complexity is O(k) because we maintain a heap of at most k elements regardless of the input size n. No additional data structures are needed that scale with the input, making this approach very memory-efficient, especially when k is much smaller than n.

Max Heap

The parent node is always greater than or equal to its children. The largest element is at the root. In a fixed-size max heap of k elements, when you try to insert a new element: if the heap isn’t full yet, the element is added directly; if the heap is full and the new element is smaller than the root (largest element), the root gets removed and the new element is inserted, maintaining only the k smallest elements seen so far.

Let’s look at the following illustration to understand how to use a max heap to find the three smallest elements.


Notice how at each step, we only compare against the root (the largest of our k smallest). If the new element is larger than or equal to this threshold, we don’t even need to touch the heap. This is what makes the algorithm efficient.
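The max heap variant can be sketched in Python as follows. Python's `heapq` module only provides a min heap, so this sketch stores negated values to simulate a max heap; that trick assumes numeric input, and the function name `k_smallest` is hypothetical.

```python
import heapq


def k_smallest(nums, k):
    """Return the k smallest values in nums, smallest first."""
    if k <= 0:
        return []
    heap = []  # holds negated values, so -heap[0] is the largest value kept
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, -x)
        elif x < -heap[0]:
            # Smaller than the current k-th smallest: replace the root.
            heapq.heapreplace(heap, -x)
    # Undo the negation and sort the k survivors in ascending order.
    return sorted(-v for v in heap)
```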

Examples

The following examples illustrate some problems that can be solved with this approach:

  1. Third maximum number: Determine the third distinct maximum element in the array.
  2. Sort characters by frequency: Sort a string in decreasing order based on the frequency of its characters.
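Both examples can be sketched with `heapq`. The function names are hypothetical, and two behaviors are assumptions not stated above: `third_max` expects a non-empty list and falls back to the overall maximum when fewer than three distinct values exist, and `sort_by_frequency` breaks ties between equally frequent characters alphabetically.

```python
import heapq
from collections import Counter


def third_max(nums):
    """Return the third distinct maximum of a non-empty list."""
    heap = []     # min heap of at most 3 distinct values
    seen = set()  # values currently stored in the heap
    for x in nums:
        if x in seen:
            continue  # duplicates must not take a second slot
        if len(heap) < 3:
            heapq.heappush(heap, x)
            seen.add(x)
        elif x > heap[0]:
            removed = heapq.heapreplace(heap, x)
            seen.discard(removed)
            seen.add(x)
    if len(heap) == 3:
        return heap[0]
    return max(heap)  # assumption: fewer than 3 distinct values


def sort_by_frequency(s):
    """Return s with characters ordered by decreasing frequency."""
    counts = Counter(s)
    # Negate counts so the most frequent character sits at the root.
    heap = [(-count, ch) for ch, count in counts.items()]
    heapq.heapify(heap)
    parts = []
    while heap:
        neg_count, ch = heapq.heappop(heap)
        parts.append(ch * -neg_count)
    return "".join(parts)
```

Note that the second function keeps every distinct character in the heap rather than only k of them, because the whole string has to be emitted.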

When to use the Top K Elements pattern

Understanding when to apply this pattern and when you should avoid it is crucial for writing efficient solutions. You should consider applying this pattern if both of the following conditions are fulfilled:

  • Selection from unordered data: The problem requires finding a ranked subset of elements (largest, smallest, most frequent, closest to a value) from an unsorted collection, and sorting the entire dataset would be unnecessarily expensive. The subset identification may be the final answer or an intermediate step in a larger solution.
  • K elements needed, not just one: The problem explicitly asks for k elements where k > 1 and k is much smaller than n (total elements). Keywords like “top k”, “k largest”, “k most frequent”, “k closest points”, or “kth smallest” indicate this pattern, as maintaining a heap of size k is more efficient than sorting all n elements.

You should not apply this pattern if any of the following conditions is fulfilled:

  • Already sorted data: The input is pre-sorted according to the ranking criteria needed for the solution, making direct indexing or scanning sufficient without requiring heap operations.
  • Finding a single extreme: The problem asks for only one element (k = 1), such as the maximum or minimum value, which can be found in O(n) time with a single pass through the data without heap overhead.
  • Large k relative to n: When k is close to or greater than n/2, the space and time benefits diminish significantly. In these cases, sorting the entire array (O(n log n)) may be simpler and perform comparably to the heap approach.
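The conditions above can be sketched as a small dispatch function. The name `top_k` is hypothetical, and the n/2 cutoff is only the rule of thumb from this section, not a measured threshold. `heapq.nlargest` is the standard library's built-in top-k helper.

```python
import heapq


def top_k(nums, k):
    """Return the k largest values, largest first, picking a strategy by k."""
    if k <= 0 or not nums:
        return []
    if k == 1:
        # A single extreme needs only one O(n) pass, no heap.
        return [max(nums)]
    if k >= len(nums) // 2:
        # Large k relative to n: a full sort is simpler and comparable.
        return sorted(nums, reverse=True)[:k]
    # Small k relative to n: the heap-based selection pays off.
    return heapq.nlargest(k, nums)
```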

Real-world applications

Many real-world problems use the Top K Elements pattern. Let’s look at some examples.

Search engines and recommendation systems: Search engines like Google process millions of matching pages but maintain only a small heap of the top 10-20 most relevant results. Netflix uses the same pattern to show personalized recommendations without sorting all available content.

Real-time analytics and monitoring: Application monitoring tools track millions of metrics but display only “Top 10 slowest API endpoints” using heaps that update in real-time. Log aggregation systems use this to identify the most frequent errors without having to repeatedly process entire log histories.

Social media trending topics: Twitter processes millions of tweets per minute but shows only the top 10-20 trending hashtags by maintaining a heap of hashtag frequencies that updates as new tweets arrive.

E-commerce and pricing: Price comparison websites show “Top 5 cheapest options” from thousands of retailers using heaps. Amazon’s “Best Sellers” lists maintain category-specific heaps to show top performers with O(log k) updates.

Gaming leaderboards: Online games display “Top 100 Players” from millions of users by maintaining a heap that updates when scores change, avoiding expensive re-sorting after every match.

Network traffic analysis: Intrusion detection systems identify “Top K IP addresses by request volume” to detect DDoS attacks using heaps for real-time threat detection without sorting all IP addresses.

Financial trading systems: High-frequency trading platforms track “Top N stocks by volume” from thousands of instruments using heaps to provide the performance needed for split-second decisions.

Common pitfalls

When implementing the Top K Elements pattern, here are some of the common pitfalls to look out for:

  • Using the wrong heap type: Using a max heap to find the k largest elements forces you to maintain all n elements, defeating the purpose. To find the k largest, use a min heap; to find the k smallest, use a max heap.
  • Not handling edge cases: Always check for k = 0, k > array length, empty input arrays, and consider how duplicate elements should be treated in your specific problem.
  • Inefficient heap updates: In Python, use heapq.heapreplace() instead of separate heappop() and heappush() calls. It pops the root and pushes the new item in a single step, which is more efficient than performing the two operations separately.
  • Forgetting time complexity analysis: The pattern gives O(n log k), not O(n log n); when k is much smaller than n this is significantly faster than sorting, but when k ≈ n, sorting might be comparable or faster.
  • Comparison functions for complex objects: Heap operations fail if custom objects aren’t comparable; when storing tuples with non-comparable elements, add a tiebreaker index or define proper comparison methods.
  • Assuming heap order is sorted order: Heaps maintain partial ordering, not complete sorting; elements other than the root are not in any guaranteed order, so if you need fully sorted output, extract elements one by one using heappop.
  • Choosing heap when k is large: The pattern is most beneficial when k << n; when k approaches n/2 you’re storing a significant portion of the array anyway, so consider QuickSelect or sorting as alternatives.
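Several of these pitfalls can be seen in one small sketch. The `Player` class and `top_players` function are hypothetical. `Player` deliberately defines no ordering, so the insertion index from `itertools.count` acts as the tiebreaker that stops Python from ever comparing two `Player` objects.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Player:  # hypothetical type with no ordering defined
    name: str
    score: int


def top_players(players, k):
    """Return the k highest-scoring players, best first."""
    if k <= 0:
        return []
    tiebreak = count()  # unique index, so tuple comparison stops before Player
    heap = []  # min heap of (score, index, player)
    for p in players:
        entry = (p.score, next(tiebreak), p)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:
            heapq.heapreplace(heap, entry)  # one step instead of pop + push
    # The heap list itself is not sorted; pop one by one for ascending order.
    ordered = [heapq.heappop(heap)[2] for _ in range(len(heap))]
    ordered.reverse()
    return ordered
```

Without the index, two entries with equal scores would fall through to comparing the `Player` objects and raise a `TypeError`.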

Key takeaways

The Top K Elements pattern is powerful because it transforms O(n log n) sorting problems into O(n log k) selection problems. The key insights to remember:

  1. Heap size is k, not n: You only maintain k elements, making space O(k) instead of O(n)
  2. Counter-intuitive heap choice: Use min heap for k largest, max heap for k smallest
  3. When to use it: When k << n, this pattern shines. When k ≈ n, consider sorting instead
  4. Real-time updates: Heaps excel at dynamic scenarios where data streams in continuously
  5. Not just for numbers: The pattern works with any comparable objects: strings, custom objects, tuples
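To illustrate the real-time updates takeaway, here is a minimal sketch of a tracker that keeps the k largest values of a stream. The class name `TopKTracker` and its methods are assumptions made for illustration, not an API from any library.

```python
import heapq


class TopKTracker:
    """Keep the k largest values seen so far in a stream."""

    def __init__(self, k):
        if k <= 0:
            raise ValueError("k must be positive")
        self.k = k
        self._heap = []  # min heap of at most k values

    def add(self, value):
        """Process one new value in O(log k) time."""
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, value)
        elif value > self._heap[0]:
            heapq.heapreplace(self._heap, value)

    def top(self):
        """Return the current top values, largest first."""
        return sorted(self._heap, reverse=True)
```

Each call to `add` costs O(log k) regardless of how many values have already arrived, and memory stays at O(k).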
