Level Up Your Coding Skills & Crack Interviews — Save up to 50% or more on Educative.io Today! Claim Discount

389. Find the Difference

Identifying the extra character between two shuffled strings is a neat example of finding a tiny mismatch without getting distracted by order. Even though one string is just the other with one additional letter, repeated characters can make it surprisingly easy to miscount or “match” the wrong occurrence. This problem highlights how using a clean counting-based idea can avoid slow, error-prone scanning and pinpoint the extra character in a single efficient pass.

Problem statement

You’re given two lowercase strings s and t, where t contains all characters of s but shuffled in a random order, plus exactly one extra character inserted somewhere. Your task is to return that extra character.

This is trickier than it sounds because a naive approach often ends up repeatedly scanning for matches or removing characters, which becomes slow and error-prone when there are duplicates (like many repeated letters) and when order is irrelevant.

Constraints:

0 ≤ s.length ≤ 1000
t.length == s.length + 1
s and t consist of lowercase English letters

Examples

Input	Output	Explanation
`s` = “”, `t` = “y”	“y”	Since `s` is empty, the only character in `t` is the added one.
`s` = “a”, `t` = “aa”	“a”	`t` has one extra `"a"` compared to `s`.
`s` = “ab”, `t` = “bab”	“b”	Both have “a” and “b”, but t has one additional “b”.
`s` = “xyz”, `t` = “xzyw”	“w”	`t` is the same letters as `s` plus extra `"w"` (order doesn’t matter).
`s` = “hello”, `t` = “hleloo”	“o”	`t` has all letters of `s`, plus one extra `"o"`.

Why matching characters one-by-one is inefficient

A straightforward idea is: for each character in t, try to “match” it against a character in s (or remove matched characters from s). If implemented with repeated searches/removals, you can end up doing work proportional to scanning most of the string many times. That repeated work adds up quickly, especially with duplicates where “which occurrence” you remove matters.

Optimized approach using Bitwise Manipulation

Think of each character as a small number (its ASCII/Unicode code). XOR has two key properties that make this trick work:

Cancellation: x ^ x = 0
Order doesn’t matter (commutative/associative): you can XOR in any order.

Now, since t contains every character from s plus one extra, if you XOR all characters from s and t together, every matching character appears twice overall (once from s, once from t) and cancels out. What remains is only the extra character, because it appears one more time.

Why this works even with duplicates

Duplicates don’t break this at all. If a letter appears, say, 5 times in s and 5 times in t, it cancels out 10 times total across the combined XOR. The only character that doesn’t fully cancel is the one added to t, because it appears one more time than in s.

Solution steps

Initialize an integer accumulator xor_val to 0.
Iterate over each character in s:
1. Convert the character to its ASCII/Unicode code using ord(ch). XOR it into the accumulator: xor_val ^= ord(ch).
Iterate over each character in t:
1. Convert the character to its ASCII/Unicode code using ord(ch). XOR it into the accumulator: xor_val ^= ord(ch).
Convert the final xor_val back to a character using chr(xor_val) and return it.

Take a look at the illustration below to understand the solution more clearly.

1 / 10

Python Implementation

Let’s look at the code for the solution we just discussed.

def findTheDifference(s, t):
    # We'll use XOR to cancel out matching characters.
    # Since t is s with one extra character added, the final XOR will be that extra character.
    xor_val = 0

    # XOR all characters from s
    # Each character contributes its ASCII code using ord()
    for ch in s:
        xor_val ^= ord(ch)

    # XOR all characters from t
    # Characters that appear in both s and t cancel out (a ^ a = 0),
    # leaving only the ASCII code of the extra character.
    for ch in t:
        xor_val ^= ord(ch)

    # Convert the remaining ASCII code back to a character
    return chr(xor_val)
        
def main() -> None:

    test_cases = [
        ("abcd", "abcde"),
        ("", "y"),
        ("a", "aa"),
        ("ae", "aea"),
        ("xyz", "xzyw"),
    ]

    for i, (s, t) in enumerate(test_cases, start=1):
        ans = findTheDifference(s, t)
        print(f"Test {i}:\n\ts = {s!r}, t = {t!r}")
        print(f"\tdifference = {ans!r}")
        print("-"*70)


if __name__ == "__main__":
    main()
...

Time complexity

The time complexity is linear in the total number of characters processed. We scan s once and t once, so the work grows proportionally with s.length + t.length, which is O(n) where nnn is the length of t.

Space complexity

The space complexity is O(1) because we only keep a single integer accumulator regardless of input size, and we do not allocate any additional data structures that grow with nnn.

Edge cases

Scenario: s is empty
- Example: s = "", t = "y"
- Why it works: XOR over an empty string contributes nothing, so the result is just the XOR of t, which leaves "y".
Scenario: Added character is a duplicate of an existing one
- Example: s = "aab", t = "abaa"
- Why it works: Counts still cancel correctly; the duplicate has one extra occurrence in t, so it remains after XOR cancellation.
Scenario: Added character appears at any position (beginning/middle/end)
- Example: s = "abcd", t = "eabcd"
- Why it works: XOR is order-independent, so position doesn’t matter.
Scenario: All characters are the same
- Example: s = "zzz", t = "zzzz"
- Why it works: Three z values cancel with three from t, leaving one extra z.

Common pitfalls

Using string removal/search loops that repeatedly scan for matches, leading to unnecessarily slow solutions and tricky duplicate-handling bugs.
Sorting both strings works but is heavier than needed and is easy to overlook as suboptimal in interviews.
Forgetting to convert characters consistently (mixing characters and ASCII codes) when doing XOR logic.
Assuming unique letters and using a set difference approach, which breaks as soon as duplicates exist.

Frequently Asked Questions

A set ignores duplicates. If the added character is the same as an existing one (like adding another "a"), set-based logic can’t reliably detect the extra occurrence.

When two collections contain the same items except for one extra/missing item, and order is irrelevant, XOR is a strong candidate because it cancels pairs and doesn’t care about ordering.

It’s acceptable, but it’s not the best. Sorting typically costs more time than a single pass. XOR demonstrates stronger optimization instincts.

Yes. A frequency array of size 26 works well here. XOR is simpler and uses constant space without explicit counting.

No. You must at least read the input to know what character was added, so linear time is optimal.

Share with others:

Adeel Qayyum
December 22, 2025
5 min read

389. Find the Difference

Problem statement

Examples

Why matching characters one-by-one is inefficient

Optimized approach using Bitwise Manipulation

Solution steps

Python Implementation

Time complexity

Space complexity

Edge cases

Common pitfalls

Frequently Asked Questions

Leave a Reply Cancel reply

791. Custom Sort String

767. Reorganize String

529. Minesweeper

540. Single Element in a Sorted Array

504. Base 7

2081. Sum of k-Mirror Numbers

42. Trapping Rain Water

217. Contains Duplicate

58. Length of Last Word

288. Unique Word Abbreviation