Level Up Your Coding Skills & Crack Interviews — Save up to 50% or more on Educative.io Today! Claim Discount

Arrow
Table of contents

1086. High Five

The High Five problem is a classic introduction to combining hash maps with heaps for “top-K” style calculations. It is popular in interviews because it tests whether you can keep only the most relevant data while processing input in a single pass. While sorting every student’s full score list may seem straightforward, this problem shows why maintaining a fixed-size min-heap is the more efficient and scalable approach.

Problem statement

You’re given a list items where each entry is a pair [IDi, scorei], meaning student IDi received the score scorei. For every student, compute their top five average by:

  1. Taking their five highest scores
  2. Adding those five scores together
  3. Dividing the sum by 5 using integer (floor) division

Return the results as a list of pairs result, where each pair is [ID, topFiveAverage]. The final list must be sorted by ID in increasing order.

Constraints:

  • 1 <= items.length <= 1000
  • items[i].length == 2
  • 1 <= IDi <= 1000
  • 0 <= scorei <= 100
  • For each IDi, there will be at least five scores.

Examples

InputOutputExplanation
[[10,50],[10,40],[10,60],[10,70],[10,80],[10,90],[10,100]][[10,80]]Student 10 has 7 scores. Top five are 100, 90, 80, 70, 60 → sum 400 → 400 // 5 = 80. Lower scores 50, 40 are ignored.
[[5,100],[5,99],[5,98],[5,97],[5,96]][[5,98]]We have exactly five scores, so we average all: sum 490 → 490 // 5 = 98.
[[1,91], [2,88], [1,76], [2,100], [1,84], [2,93], [1,90], [2,85], [1,87], [2,92]][[1,85], [2,91]]We compute the top five averages per student, then return the results sorted by student ID.Student 1 has five scores: 91, 76, 84, 90, 87.
Since there are exactly five, all of them count as the “top five.” Their sum is 428, and the integer average is (428 // 5 = 85).Student 2 has five scores: 88, 100, 93, 85, 92.
Again, all five scores are included. Their sum is 458, and the integer average is (458 // 5 = 91).

Intuition

The main idea behind this intuition is: we don’t care about all scores; we only need the top 5 scores per student. So for each student, every time we see a new score, we ask:

  • Does this score belong in their top 5?
  • If we already have 5 scores, we may need to discard the smallest among the top 5.

Naive approach: Sorting 

One straightforward approach is to group all scores by student ID first. Then, process students in increasing ID order, and for each student, sort their scores in descending order. Finally, for each student, take their top five scores, compute their sum, and return the average.

As simple as this solution looks, it has a major drawback: Sorting every student’s full score list can be expensive. If a student has many scores, sorting will have a time complexity of O(n log n) for that student. This is a major overkill when we only need the top 5.

Optimal approach using Min-Heap

Instead of collecting every score and sorting the entire list for each student, it is more efficient to process the input in a single pass while keeping only the information you actually need. Since the final answer depends only on the five highest scores per student, any score that is not competitive with those top five can be discarded immediately. This reduces unnecessary work (especially repeated sorting) and keeps memory usage predictable, as each student’s stored data never exceeds a small, fixed size.

A practical way to enforce this is to maintain, for each student ID, a min-heap that holds at most five scores. When a new score arrives, you insert it into that student’s heap; if the heap now contains more than five elements, you remove the smallest one. This ensures the heap always contains the best five scores seen so far for that student, without ever sorting the full set. After processing all pairs, you compute each student’s average by summing the heap and dividing by 5 using integer division, then output the results sorted by student ID.

Step-by-step algorithm

  1. Create a hashmap scores_by_id mapping each studentId to a min-heap of that student’s top scores.
  2. Iterate through items and for each [studentId, score]:
    1. Push score into that student’s heap.
    2. If heap size becomes > 5, pop the smallest score (so only the top 5 remain).
  3. Create a list result that will store students and the average of their top five sum.
  4. Iterate through each studentId in sorted order and for each student:
    1. Sum the five scores currently in that student’s heap
    2. Compute the average scores using sum // 5, and append [student_id, average] to the result.
  5. Finally, return the result array.

Let’s look at the following illustration to get a better understanding of the solution:

1 / 5

Code implementation

Let’s look at the code for the solution we just discussed.

Code for the high five problem

Time complexity 

The time complexity of this solution is O(N + S log S), where N is the number of score entries in items and S is the number of unique student IDs.

  • For each of the N entries, we do a heappush (and sometimes a heappop) on a heap capped at size 5, so each heap operation is O(log 5) = O(1). This makes the first pass O(N).
  • We then sort the S student IDs, which costs O(S log S).
  • Summing each student’s heap is O(5) = O(1) per student, so this step is O(S).

Space complexity

The space complexity is O(S). The dictionary stores one heap per student, and each heap holds at most 5 scores, so total stored scores are at most 5S, which is O(S).

Common pitfalls and interview tips

  1. Forgetting integer division: The average must be sum // 5, not float division.
  2. Storing all scores unnecessarily: Sorting entire lists works, but interviewers often prefer the fixed-size heap since we only need the top 5.
  3. Assuming input is already grouped/sorted: It isn’t, always build your own grouping structure and sort ID only at the end.

Frequently Asked Questions

Can we solve this without a heap?

Yes, store lists per student and sort each list, but it’s typically slower when students have many scores.

What if scores arrive as a stream?

The heap approach is ideal for streaming since we can update each student’s top 5 without storing everything.

What if we need top k instead of top 5?

The solution would remain the same. The only difference would be to keep a min-heap of size k and pop when it exceeds k.

Share with others:

Leave a Reply

Your email address will not be published. Required fields are marked *