HARD Lab · Strings · 17

49. Group Anagrams

Group a list of words so every anagram lands in the same bucket — the trick is hashing each word by a 26-length count signature instead of sorting, giving O(n·k) instead of O(n·k log k).

MediumHash MapFrequency SignatureTypeScript

PROBLEM What we're solving

Given an array of strings, group words that are anagrams of each other. Output order inside groups and order of groups both don't matter.

Example: ["eat","tea","tan","ate","nat","bat"] [["eat","tea","ate"],["tan","nat"],["bat"]]. eat, tea, and ate are all rearrangements of the letters a,e,t; tan and nat share a,n,t; bat is alone.

KEY IDEA Canonical key = anagram identity

Insight → two words are anagrams iff they produce the same canonical key. Any function that maps a word to something order-independent works — sorted letters, or a 26-length frequency count. Use the frequency count: it's O(k) per word vs O(k log k) for sorting, and makes a perfect hash-map bucket key.

RECIPE Build a key → group map

  • 1 · Create an empty map. Keys will be count signatures; values will be arrays of words.
  • 2 · For each word, compute its count key. Allocate a number[26] of zeros; for each character, increment freq[ch.charCodeAt(0) - 97]. Serialize with freq.join(',') — this gives a unique, collision-free string key.
  • 3 · Look up or create the bucket. map.get(key) ?? [], push the word, re-set.
  • 4 · Return all buckets. Array.from(map.values()).
Classic confusion → using a sorted string as the key is simpler to write but O(k log k) per word. The count-signature approach is both faster and equally short once you memorize the charCodeAt(0) - 97 idiom. Don't reach for .sort() in an interview without mentioning the faster alternative.

COST Complexity & alternatives

Sort each word as key
O(n·k log k)
Simple but sorting each word of length k costs k log k.
Count-signature key
O(n·k)
O(k) per word; O(n·k) total; O(n·k) space for all strings.

Both share the same O(n·k) space for storing n words of max length k. The count key is a constant factor win: for a fixed-size alphabet the inner loop runs exactly 26 iterations regardless of k, while sort is genuinely super-linear.

Pattern transfer → the same frequency-signature idea powers Valid Anagram (compare two signatures), Find All Anagrams in a String (sliding window that maintains a live frequency count), and Minimum Window Substring (shrink/expand window keyed on character frequencies).

RUN IT Hash each word by its count signature

step 0 / 19
STARTInput: eat, tea, tan, ate, nat, bat. We will compute a 26-length count signature for each word and bucket words by that key.
eatteatanatenatbat
current wordcomputed key / new entryexisting group member
slowfast

TYPESCRIPT The solution, annotated

groupAnagrams.ts
function groupAnagrams(strs: string[]): string[][] {
  const map = new Map<string, string[]>();

  for (const word of strs) {
    const key = countKey(word);          // O(k) — no sorting
    const bucket = map.get(key) ?? [];
    bucket.push(word);
    map.set(key, bucket);
  }

  return Array.from(map.values());
}

/** Build a 26-length frequency signature. Faster than sorting: O(k) vs O(k log k). */
function countKey(word: string): string {
  const freq = new Array<number>(26).fill(0);
  for (const ch of word) {
    freq[ch.charCodeAt(0) - 97]++;
  }
  return freq.join(',');
}

Reading it block by block

Line 2 — the map. We use a Map<string, string[]> because the canonical key is a string and we need to accumulate word arrays per key. A plain object would also work, but Map is idiomatic for non-literal keys.
Lines 4–8 — one pass over the input. For each word we compute its count key in O(k), then push it into the right bucket. map.get(key) ?? [] creates the bucket on first use — we must re-set it afterward so the new array reference is stored back in the map.
Line 10 — collect results. Array.from(map.values()) returns all groups. Iteration order matches insertion order in a JS Map, so the first word's group comes first — matching LeetCode's acceptance behavior, though order is not required.
Lines 14–20 — countKey helper. A fresh number[26] zeros array per word ensures no cross-word contamination. Subtracting 97 maps 'a'→0, 'z'→25. freq.join(',') produces a unique string like 1,0,0,...,1,0,1 that equals the sorted form of any permutation.
Complexity → O(n·k) time — n words, each costing O(k) for the count loop plus O(26) for join. O(n·k) space for the map (dominated by storing every word). If k is bounded (e.g. ≤ 100), the 26-slot array makes the inner loop a constant and join always produces a 26-segment string of fixed maximum length.

INTERVIEWFollow-ups they'll ask

  • "What if words contain uppercase or Unicode?" Normalize to lowercase first; for Unicode use a Map<string, number> keyed on the character instead of a fixed-size array.
  • "Can you avoid allocating a new array per word?" Yes — reuse a single Int32Array(26), sort the entries, and serialize. Or accept the tiny allocation: in practice the GC handles it fine.
  • "Return the groups sorted within each group?" Map over map.values() and call .sort() on each bucket — O(n·k log k) extra work.
  • "What's the brute force?" Sort the whole array and scan for consecutive anagram runs — O(n·k log k) time and O(1) extra space, but tricky to implement cleanly and slower.
  • "Could there be hash collisions with the count key?" No — the comma- separated 26-element vector is a perfect representation of the multiset. Two words map to the same key iff they are anagrams of each other (by definition of anagram).

MNEMONIC The one-liner

"Count to 26 for each word, join with commas, use that as the bucket address."

TRIGGERS When you see ___ → reach for ___

"group / cluster words that are anagrams"frequency-count key → hash map buckets
"same letters, different order" equalitycanonical signature (count or sort)
grouping strings by some invariantmap<invariant, string[]>
need faster than O(k log k) per wordcount array → O(k) key

SKELETON The reusable shape

skeleton.ts
function groupAnagrams(strs: string[]): string[][] {
  const map = new Map<string, string[]>();
  for (const word of strs) {
    const key = countKey(word);
    const bucket = map.get(key) ?? [];
    bucket.push(word);
    map.set(key, bucket);
  }
  return Array.from(map.values());
}
function countKey(word: string): string {
  const freq = new Array<number>(26).fill(0);
  for (const ch of word) freq[ch.charCodeAt(0) - 97]++;
  return freq.join(',');
}

FLASHCARDS Tap to flip

What is the canonical key for an anagram group?
A 26-length frequency count serialized as a comma-separated string, e.g. 1,0,0,...,1,0,1. Any permutation of the same letters produces the identical string.
tap to flip
1 / 7
Answer to reveal explanations. No penalty for retries.Score: 0 / 7
QUESTION 01
What is the time complexity of groupAnagrams using the count-signature key?
QUESTION 02
What is the count-signature key for "eat"?
QUESTION 03
After processing ["eat","tea","tan","ate","nat","bat"], how many buckets does the map contain?
QUESTION 04
Why must you re-call map.set(key, bucket) after pushing to the bucket array?
QUESTION 05
Which of the following is NOT a valid canonical key for an anagram group?
QUESTION 06
What space complexity does the algorithm use?
QUESTION 07
If you used a sorted string as the key instead of a count signature, what would change?