HARD Lab · Graphs · 45

269. Alien Dictionary

Given a sorted list of alien-language words, reconstruct the ordering of the alien alphabet by reading edges from the first differing character in every adjacent pair, then running Kahn's BFS topological sort to linearise those constraints — returning "" if a cycle or prefix violation makes the ordering impossible.

HardTopological SortBFS (Kahn)TypeScript

PROBLEM What we're solving

You receive a list of words sorted in an alien language. Use the sorted order to infer which letters come before others, then output the full alien alphabet. Example: ["wrt", "wrf", "er", "ett", "rftt"] "wertf". Return "" if the ordering is contradictory (cycle) or a longer word illegally precedes a prefix (e.g. "abc" before "ab").

KEY IDEA Sorted order ⇒ directed graph ⇒ topological sort

Insight → Every adjacent pair of words hides at most one ordering fact: the first position where they differ tells you word[i][j] < word[i+1][j] in the alien alphabet. All later characters are uninformative. Collect these constraints as directed edges, then run a topological sort. If there is a cycle, no valid ordering exists.

RECIPE Init → derive edges → Kahn's BFS → cycle check

  • 0 · Collect characters. Walk every character in every word to initialise the adj and indegree maps. Characters not in any word cannot appear in the answer.
  • 1 · Derive edges. For each adjacent pair (words[i], words[i+1]), scan positions left to right and stop at the first mismatch. That gives edge a[j] → b[j]. Ignore all subsequent characters — they carry no ordering information about this pair.
  • 2 · Prefix check. If words[i] is longer than words[i+1] and is a prefix of it, the input violates sorted order — return "" immediately.
  • 3 · Kahn's BFS.Enqueue all characters with indegree 0. Each dequeue emits the character into the result and decrements its neighbours' indegrees; newly-zero neighbours are enqueued.
  • 4 · Cycle check. If the output length equals the total number of unique characters, return it. Otherwise a cycle prevented full processing — return "".
Classic confusion → Beginners try to extract edges from every differing position in a pair, or compare non-adjacent words. Only adjacent pairs matter (the list is sorted, not arbitrarily ordered), and within a pair only the first differing character gives a valid edge. A second differing position is meaningless without knowing the first relationship.

COST Complexity & alternatives

DFS topo sort
O(V + E)
Same complexity but cycle detection via colouring is trickier to get right.
Kahn's BFS
O(C + W·L)
C = unique chars, W = words, L = avg length. Cycle falls out naturally.

Edge derivation is O(W·L) where W is the number of words and L is the average word length. The BFS itself is O(V + E) where V ≤ 26 and E ≤ V² ≤ 676. In practice the bottleneck is reading the words.

Pattern transfer → Kahn's BFS also solves Course Schedule I & II (direct graph with indegree), and Sequence Reconstruction(unique topo order check). DFS-with-colouring covers the same problems but Kahn's makes the "output the order" variant trivial.

RUN IT Derive edges, then Kahn's BFS topo sort

step 0 / 19
STARTWords: wrt, wrf, er, ett, rftt. Collecting all characters and initialising adjacency/indegree maps.
State
w, r, t, f, e
chars
(none yet)
adj
w:0 r:0 t:0 f:0 e:0
indegree
dequeued / comparingin queue / visitingedge added / output orderinvalid / cycle
slowfast

TYPESCRIPT The solution, annotated

alienOrder.ts
function alienOrder(words: string[]): string {
  // 1. Collect every character; initialise maps
  const adj = new Map<string, Set<string>>();
  const indegree = new Map<string, number>();
  for (const w of words)
    for (const c of w)
      if (!adj.has(c)) { adj.set(c, new Set()); indegree.set(c, 0); }

  // 2. Derive ordering edges from adjacent word pairs
  for (let i = 0; i < words.length - 1; i++) {
    const a = words[i], b = words[i + 1];
    const minLen = Math.min(a.length, b.length);
    let found = false;
    for (let j = 0; j < minLen; j++) {
      if (a[j] !== b[j]) {          // ONLY the first differing char gives an edge
        if (!adj.get(a[j])!.has(b[j])) {
          adj.get(a[j])!.add(b[j]);
          indegree.set(b[j], indegree.get(b[j])! + 1);
        }
        found = true;
        break;                       // stop — later chars tell us nothing
      }
    }
    // Prefix edge case: "abc" before "ab" is impossible
    if (!found && a.length > b.length) return '';
  }

  // 3. Kahn's BFS topological sort
  const queue: string[] = [];
  for (const [c, deg] of indegree)
    if (deg === 0) queue.push(c);

  let result = '';
  while (queue.length) {
    const cur = queue.shift()!;
    result += cur;
    for (const nb of adj.get(cur)!) {
      const newDeg = indegree.get(nb)! - 1;
      indegree.set(nb, newDeg);
      if (newDeg === 0) queue.push(nb);
    }
  }

  // 4. Cycle ⇒ not all chars reached
  return result.length === indegree.size ? result : '';
}

Reading it block by block

Lines 3–7 — initialise maps. Walk every character in every word to seed adj and indegree. Using if (!adj.has(c)) avoids reinitialising characters seen earlier.
Lines 10–27 — derive edges. Compare each adjacent pair. The inner for loop stops at the first mismatch and adds exactly one directed edge. The break is critical — it prevents false edges from later positions. The prefix check on line 25 catches the "abc" before "ab" case before it can silently corrupt the sort.
Lines 30–33 — seed the BFS queue.Every character with indegree 0 has no predecessor in the alien alphabet — it can appear first. There may be multiple such characters if the input doesn't fully constrain the order.
Lines 35–42 — Kahn's BFS loop. Dequeue a character, append to result, decrement every neighbour's indegree. A neighbour hitting zero means all its predecessors are already placed — safe to enqueue. The order within tied-indegree characters is arbitrary (any valid topo order is accepted).
Line 45 — cycle check. If result.length < indegree.size some characters were never dequeued — they form a cycle. Return "".
Complexity → O(W·L) to derive edges, O(V + E) for BFS (V = unique chars ≤ 26, E ≤ V²). Total O(W·L + V²) — dominated by reading the word list in practice.

INTERVIEWFollow-ups they'll ask

  • "What if multiple valid orderings exist?"Return any one. If the problem asks for the lexicographically smallest, use a min-heap (priority queue) instead of a plain queue in Kahn's BFS.
  • "Can you detect the cycle early?"Kahn's BFS naturally detects it at the end (count check). DFS with three-colour marking (white/grey/black) detects it the moment a back-edge is found, but the logic is harder to implement correctly under interview pressure.
  • "What are all invalid cases?" Two: (1) a longer word is a prefix of a shorter word that follows it; (2) the edge constraints form a cycle. Both return "".
  • "Why not compare all pairs, not just adjacent?" The list is sorted, so transitivity is already encoded. Comparing non-adjacent pairs would add redundant and potentially false edges.
  • "What if the input has only one word?" No edges can be derived; return the characters of that single word in any order (all are in the alphabet but their relative order is unconstrained).

MNEMONIC The one-liner

"Scan left-to-right, stop at the first diff — that's your edge. Then Kahn drains the graph."

TRIGGERS When you see ___ → reach for ___

"reconstruct order from sorted examples"edge-per-adjacent-pair + topo sort
"first differing position"break after adding edge — later positions ignored
"longer word before its own prefix"return "" immediately (invalid)
"cycle in ordering constraints"Kahn's BFS count < total chars → return ""

SKELETON The reusable shape

skeleton.ts
// 1. init adj + indegree for every unique char
// 2. compare adjacent pairs → first diff char → edge u→v
//    if a is prefix of b but longer → return ''
// 3. Kahn's BFS:
const queue = [...chars with indegree 0];
let result = '';
while (queue.length) {
  const cur = queue.shift()!;
  result += cur;
  for (const nb of adj.get(cur)!) {
    if (--indegree[nb] === 0) queue.push(nb);
  }
}
// 4. cycle check
return result.length === totalChars ? result : '';

FLASHCARDS Tap to flip

Where does each ordering edge come from?
The first differing character between two adjacent words in the sorted list — only that position, not any later ones.
tap to flip
1 / 7
Answer to reveal explanations. No penalty for retries.Score: 0 / 7
QUESTION 01
For words ["wrt", "wrf"], which edge is derived and why?
QUESTION 02
What happens when "abc" appears before "ab" in the sorted word list?
QUESTION 03
After building edges from ["wrt","wrf","er","ett","rftt"], how many unique characters are there?
QUESTION 04
Kahn's BFS reports fewer output characters than total unique characters. What does this mean?
QUESTION 05
What is the time complexity of the full solution (W words, average length L, C unique chars)?
QUESTION 06
Why must you stop edge-derivation after the FIRST differing character in a pair?
QUESTION 07
To output the lexicographically smallest valid alien alphabet, which change is needed?