Frequently Asked Questions

Why store value -> list of indices?

Because the array is fixed, the indices for each value never change. Precomputing them lets pick(target) run in O(1) average time by choosing directly from the stored indices.

How do we know the randomness is fair?

If target occurs m times, idx_list has length m. random.choice(idx_list) picks each element with probability 1/m, so every valid index is equally likely.

What’s the time/space tradeoff?

Build once: O(n) time, O(n) space.
Each pick: O(1) time.
You trade memory for fast repeated queries.

What if memory is a concern?

Then you’d use reservoir sampling (scan on each pick, O(1) extra space). It’s slower per call but avoids storing all indices.
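The reservoir sampling idea mentioned above can be sketched roughly as follows. This is a minimal illustration, not the article's main solution: the class name ReservoirSolution and the variables seen and chosen are hypothetical names chosen for this example. The k-th match replaces the current choice with probability 1/k, which leaves each of the m matches selected with probability 1/m at the end of the scan.

```python
import random
from typing import List


class ReservoirSolution:
    """Hypothetical low-memory variant: keeps only a reference to nums."""

    def __init__(self, nums: List[int]):
        self.nums = nums  # no index map is built

    def pick(self, target: int) -> int:
        chosen = -1
        seen = 0  # number of matches encountered so far
        for i, val in enumerate(self.nums):
            if val != target:
                continue
            seen += 1
            # Replace the current choice with probability 1/seen.
            if random.randrange(seen) == 0:
                chosen = i
        return chosen
```

Each call costs O(n) time but only O(1) extra space, which is the tradeoff described in the answer above.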

Does calling pick many times “bias” results?

No. Each call is an independent uniform choice from the same list, so past results don’t affect future picks.

398. Random Pick Index | Codinginterview.com

This problem often appears in interview settings because it goes beyond “find an index” and asks you to prove fair randomness. The trick is making sure each valid index is chosen with exactly the same probability, even though you discover matches one by one as you scan the array. That’s a realistic skill for building systems that sample logs, events, or streams fairly while keeping memory usage small.

Problem statement

You are given an integer array, nums, that may contain duplicates. For a given integer, target, return a random index i such that nums[i] == target. If the target occurs multiple times, each matching index must be chosen with equal probability.

Design a Solution class that can:

Solution(int[] nums): Initializes the object with the array nums.
int pick(int target): Picks a random index i from nums where nums[i] == target. If there are multiple valid indices, each one should have an equal probability of being returned.

It is guaranteed that target exists in nums when pick(target) is called.

Constraints:

1 <= nums.length <= 2 x 10⁴
-2³¹ <= nums[i] <= 2³¹ – 1
target is an integer from nums.
At most 10⁴ calls will be made to pick.

Examples

nums	target	picks	Output	Explanation
[4, 9, 4, 1, 4, 7]	4	8	e.g. [0, 2, 4, 4, 0, 2, 0, 4]	Target 4 occurs at indices {0,2,4}. Each call should return one of these indices uniformly at random.
[-10, 5, -10, 3, 0, -10]	-10	10	e.g. [2, 0, 5, 5, 0, 2, 0, 5, 2, 0]	Works with negative numbers too. Valid indices are {0,2,5}.
[42]	42	6	[0, 0, 0, 0, 0, 0]	Array length is 1, so the only valid index is always returned.
[8, 8, 8, 8, 8]	8	7	e.g. [3, 1, 4, 0, 2, 2, 1]	All values are the target; any index 0..4 is valid with equal probability.
[0, 1, 2, 3, 4, 5]	3	5	[3, 3, 3, 3, 3]	Target appears once, so it always returns that single index.
[2³¹ – 1, -2³¹, 7, 7, 7]	7	9	e.g. [4, 2, 3, 4, 2, 2, 3, 4, 3]	Includes extreme 32-bit integer values; target indices are {2,3,4}.
[6, 1, 6, 2, 6, 3, 6, 4]	6	12	e.g. [6, 0, 4, 2, 0, 6, 4, 2, 6, 0, 2, 4]	Target appears many times, spaced out. Valid indices are {0,2,4,6}. Output should only be from this set.

Naive approach

A simple way to solve this problem is to handle each pick(target) request independently. Every time pick is called, we scan the entire array nums from left to right and collect all indices i where nums[i] == target into a temporary list. After we finish the scan, we choose one index from that list at random and return it.

This approach is correct because the list contains exactly the valid indices for the target, and selecting uniformly from it guarantees that each valid index has the same probability of being returned.

However, the downside is efficiency. If pick is called many times, we end up scanning the entire array repeatedly and rebuilding the same list again and again. That makes the time complexity O(n) per call, which becomes expensive when the number of calls is large.
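As a rough sketch of the naive approach described above (the class name NaiveSolution and the variable matches are illustrative names, not from the original), each call rescans the array and rebuilds the list of matching indices:

```python
import random
from typing import List


class NaiveSolution:
    def __init__(self, nums: List[int]):
        self.nums = nums  # nothing is precomputed

    def pick(self, target: int) -> int:
        # Rebuild the list of matching indices on every call: O(n) time.
        matches = [i for i, val in enumerate(self.nums) if val == target]
        # The problem guarantees target is present, so matches is non-empty.
        return random.choice(matches)
```

The result is uniform over the valid indices, but the scan is repeated for every pick.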

Key observation

The key observation is that the input array nums does not change after the object is initialized. Because the array is fixed, the set of indices where a particular value appears is also fixed.

That means we can do the “searching work” just once. If we precompute and store all indices for every distinct value in nums, then later, when pick(target) is called, we do not need to scan the array at all. We can immediately access the list of indices for target and randomly choose one index from that list.

This works especially well because the problem only requires that we pick a random index uniformly among the occurrences of target. If we have a list of all valid indices, uniform randomness is as simple as choosing a random element from the list.

Optimized solution

In the optimized solution, we build a hash map (dictionary) during initialization:

The key is a number from the array.
The value is a list of all indices where that number appears.

For example, if nums = [1, 2, 3, 3, 3], the map looks like this:

1 -> [0]
2 -> [1]
3 -> [2, 3, 4]

Now, when we call pick(3), we directly retrieve the list [2, 3, 4] and return one of those indices at random. Since the random choice is uniform over the list, every index in the list has an equal probability of being returned, which is exactly what the problem asks for.

This solution is efficient because we pay the cost of scanning the array only once during initialization. After that, each pick operation is constant time.

Python implementation for the optimal solution

Now, let’s take a look at the code that implements this solution:

import random
from collections import defaultdict
from typing import List

class Solution:
    def __init__(self, nums: List[int]):
        # 1) Create a map: value -> list of indices where that value appears
        self.indices = defaultdict(list)

        # 2) Walk through nums and store each index under its value
        for i, val in enumerate(nums):
            self.indices[val].append(i)

    def pick(self, target: int) -> int:
        # 3) Get all indices where nums[i] == target
        idx_list = self.indices[target]

        # 4) Randomly choose one of those indices (uniformly)
        return random.choice(idx_list)


# Driver code (no expected outputs)
def main():
    test_cases = [
        # nums, target, number_of_picks
        ([1], 1, 5),

        ([5, 1, 4, 4, 2], 4, 10),

        ([5, 5, 5,
          5, 1, 5,
          5, 5, 5], 5, 10),

        ([1, 2, 3, 4,
          2, 3, 4, 5,
          3, 4, 5, 6], 4, 10),

        ([9, 8, 7, 6,
          8, 7, 6, 5,
          7, 6, 5, 4], 7, 10),
    ]

    for idx, (nums, target, k) in enumerate(test_cases, 1):
        sol = Solution(nums)
        picks = [sol.pick(target) for _ in range(k)]

        print(f"{idx}.\tnums = {nums}")
        print(f"\tTarget = {target}")
        print(f"\tPicked indices ({k} times): {picks}")
        print("-" * 75)


if __name__ == "__main__":
    main()

Code for the Random Pick Index problem

Time complexity

Initialization (__init__): This takes O(n) because we iterate through the entire input array exactly once to populate the dictionary. Each dictionary insertion (appending to a list) takes O(1) on average.
Picking (pick): Retrieving the list of indices from the dictionary takes O(1) average time, and random.choice() selects an element from that list in O(1) time.

Space complexity

The space complexity of this solution is O(n): every index is stored exactly once across the lists, and the map holds at most n distinct keys.

Edge cases

Target appears once

Example: nums = [5, 1, 2], target = 1
Stored list: 1 -> [1]
random.choice([1]) always returns 1 (probability 1).

Target appears many times (duplicates)

Example: nums = [3, 3, 3, 3], target = 3
Stored list: 3 -> [0, 1, 2, 3]
List size m = 4, so each index is chosen with probability 1/4.

All values equal the target

Example: nums = [8, 8, 8, 8, 8], target = 8
Stored list contains every index [0..4], each returned with probability 1/5.

Negative and very large values

Example: nums = [-2**31, 7, -2**31], target = -2**31
Hash keys can be any int; comparison is exact, so indices are stored/retrieved normally.

Many repeated pick(target) calls

The map is built once and never changes.
Each call selects uniformly from the same index list, so the distribution stays uniform across calls.
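One informal way to sanity-check this claim is to count how often each index comes back over many calls. The snippet below is only an empirical sketch, not a proof: the trial count and the fixed seed are arbitrary choices made for this example, and the map is rebuilt inline so the snippet runs on its own.

```python
import random
from collections import Counter, defaultdict

random.seed(0)  # fixed seed only so the run is repeatable

nums = [6, 1, 6, 2, 6, 3, 6, 4]
target = 6

# Build the value -> list of indices map once.
indices = defaultdict(list)
for i, val in enumerate(nums):
    indices[val].append(i)

trials = 40_000
counts = Counter(random.choice(indices[target]) for _ in range(trials))

for idx in sorted(counts):
    # With four valid indices, each fraction should be close to 0.25.
    print(idx, counts[idx] / trials)
```

Small deviations from 1/4 are expected from sampling noise; a large, persistent skew would point to a bug in how the index is chosen.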

Common pitfalls

Accidentally returning a random VALUE instead of a random INDEX

Your map should store indices, not values.

Using a non-uniform selection

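Bad: random.randint(0, len(lst)). randint includes both endpoints, so it can return len(lst), which is one past the last valid position and raises IndexError when used to index lst. Clamping the result back into range is not a fix either, because it makes the last element more likely than the others.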
Good: random.randrange(len(lst)) or random.choice(lst)
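The difference between the two calls can be seen in a short experiment. This is just an illustrative sketch; lst and the draw count of 1000 are arbitrary values picked for the example.

```python
import random

lst = [2, 3, 4]  # indices where the target appears

# randint is inclusive on both ends, so len(lst) itself can come back.
randint_draws = {random.randint(0, len(lst)) for _ in range(1000)}

# randrange excludes the upper bound, so every draw is a valid position.
randrange_draws = {random.randrange(len(lst)) for _ in range(1000)}

print(sorted(randint_draws))    # can contain len(lst), which is out of range for lst
print(sorted(randrange_draws))  # never contains len(lst)
```

random.choice(lst) avoids the question entirely, since it never exposes the position arithmetic.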

Rebuilding the list on every pick

Works, but wastes time: turns each pick into O(n) instead of O(1).

Mutating the stored lists by mistake

Don’t sort/shuffle/pop from self.indices[target] during pick, or you’ll change future probabilities.
