HARD Lab · Heap / Priority Queue · 77

295. Find Median from Data Stream

Numbers arrive one at a time and you must report the running median on demand. Split the data across two heaps — a max-heap for the smaller half and a min-heap for the larger half — kept balanced so the median always sits right on their tops.

HardTwo HeapsBalanced PartitionTypeScript

PROBLEM What we're solving

Design a structure with two operations: addNum(x) feeds a number from an infinite stream, and findMedian() returns the median of everything seen so far. The median is the middle value of the sorted data, or the average of the two middle values when the count is even. Stream 5, 2, 8, 1, 9, 4 → after all six, sorted is [1,2,4,5,8,9] so the median is (4 + 5) / 2 = 4.5.

KEY IDEA Two heaps that meet at the median

Insight → you never need the whole sorted order, only the value(s) at the boundary. Keep the smaller half in a max-heap (low) and the larger half in a min-heap (high). If their sizes differ by at most one, the median is either the average of the two tops (sizes equal) or the top of the bigger heap. Both tops are O(1) to read.

RECIPE Route, rebalance, read

  • 1 · Route. If low is empty or x ≤ low.max, push x to low; otherwise push it to high. This keeps every number in low ≤ every number in high.
  • 2 · Rebalance. Sizes may now differ by 2. If |low| > |high| + 1, move low's max into high. If |high| > |low|, move high's min into low. Now the sizes differ by at most one (with low holding the extra).
  • 3 · Read. Equal sizes ⇒ median = (low.max + high.min) / 2. Otherwise low is the bigger heap ⇒ median = low.max.
Classic confusion → the median rule depends on which heap you let hold the extra element. Here low is always the one that grows by one on odd counts, so for an odd total the median is unambiguously low.max — never high.min. Pick a convention and keep the rebalance asymmetric to match it.

COST Complexity & alternatives

Sort on each query
O(n log n)
Per findMedian — dies on a long stream.
Two heaps
O(log n)
addNum is O(log n); findMedian is O(1).

Why not a sorted array?

A sorted array gives O(1) median but each addNum is O(n) for the shift. The two-heap structure trades that for O(log n) inserts while still reading the median in O(1). Space is O(n) to hold the stream. If values are bounded small integers, a Fenwick/BIT or bucket-count approach can also hit O(log n) or O(1) per query.

Pattern transfer → the "balance two heaps around a boundary" trick reappears in Sliding Window Median (add a lazy-deletion heap), IPO(a max-heap of affordable projects), and any "k-th from the middle while streaming" problem.

RUN IT Balance two heaps around the middle

step 0 / 18
STARTEmpty structure. low holds the smaller half (max on top), high holds the larger half (min on top).
stream
low (max-heap)high (min-heap)
State
low.max
high.min
0
|low|
0
|high|
median
number just addedactive heap topheap top (median source)heap sizes
slowfast

TYPESCRIPT The solution, annotated

medianFinder.ts
class MedianFinder {
  // low = max-heap (smaller half), high = min-heap (larger half).
  // We keep |low| === |high|, or |low| === |high| + 1.
  private low: MaxHeap = new MaxHeap();
  private high: MinHeap = new MinHeap();

  addNum(num: number): void {
    // 1. Route by value so low always holds the smaller numbers.
    if (this.low.size === 0 || num <= this.low.peek()) this.low.push(num);
    else this.high.push(num);

    // 2. Rebalance so the two sizes differ by at most one.
    if (this.low.size > this.high.size + 1) {
      this.high.push(this.low.pop());
    } else if (this.high.size > this.low.size) {
      this.low.push(this.high.pop());
    }
  }

  findMedian(): number {
    if (this.low.size === this.high.size) {
      // Even count: average the two middle elements (the heap tops).
      return (this.low.peek() + this.high.peek()) / 2;
    }
    // Odd count: low carries the extra element, so its top IS the median.
    return this.low.peek();
  }
}

Reading it block by block

The two heaps. low is a max-heap holding the smaller half, so its top is the largest of the small numbers. high is a min-heap holding the larger half, so its top is the smallest of the big numbers. The invariant is |low| === |high| or |low| === |high| + 1.
addNum — route. A number that is ≤ the current boundary (low.peek()) belongs to the smaller half, so it goes into low; everything else goes into high. The empty-low case sends the very first number to low.
addNum — rebalance. After routing, one heap may be 2 larger. Moving a single element across (the max of low or the min of high) restores the size invariant. The asymmetric checks (> high.size + 1 vs > low.size) ensure low is the heap that carries the extra element on odd counts.
findMedian. Equal sizes mean an even count, so the two middle elements are exactly the two tops — average them. Unequal sizes mean an odd count, and since low holds the extra, the single middle element is low.peek().
Complexity → addNum is O(log n) (one or two heap operations); findMedian is O(1) (read one or two tops). Space is O(n) to store the stream.

INTERVIEWFollow-ups they'll ask

  • "What if 99% of queries are findMedian?" Already O(1) per query; the two-heap structure is ideal because reads are constant-time.
  • "What if all numbers are in [0, 100]?" Use a count array (bucket each value) and walk to the middle, or a Fenwick tree — O(1) / O(log range) without heaps.
  • "Sliding-window median?"Keep the two heaps but add lazy deletion (a "to-remove" multiset) so out-of-window elements are skipped when they surface at a top.
  • "Need a percentile, not just the median?" Generalize to two heaps with a target size ratio, or switch to an order-statistics tree / Fenwick tree.
  • "No heap in the language?" Implement a binary heap on an array (sift-up on push, sift-down on pop) or maintain a balanced BST / skip list.

MNEMONIC The one-liner

"Small half maxes up, big half mins down — read the median off the tops."

TRIGGERS When you see ___ → reach for ___

running / streaming mediantwo heaps (max-heap + min-heap)
need the middle without full sortbalanced partition at the boundary
sizes off by 2 after insertmove one element across to rebalance
sliding-window mediantwo heaps + lazy deletion

SKELETON The reusable shape

skeleton.ts
// low = max-heap (smaller half), high = min-heap (larger half)
addNum(num) {
  if (low.size === 0 || num <= low.peek()) low.push(num);
  else high.push(num);
  if (low.size > high.size + 1) high.push(low.pop());
  else if (high.size > low.size) low.push(high.pop());
}
findMedian() {
  if (low.size === high.size) return (low.peek() + high.peek()) / 2;
  return low.peek();
}

FLASHCARDS Tap to flip

Which heap holds which half?
low = max-heap for the smaller half (top is its largest); high = min-heap for the larger half (top is its smallest).
tap to flip
1 / 7
Answer to reveal explanations. No penalty for retries.Score: 0 / 7
QUESTION 01
Which data structures hold the two halves?
QUESTION 02
Time complexity of addNum?
QUESTION 03
Time complexity of findMedian?
QUESTION 04
With the convention that low carries the extra element, the median for an odd count is:
QUESTION 05
After routing a new number, the size invariant can break by how much, and how is it fixed?
QUESTION 06
For the stream 5, 2, 8, 1, 9, 4, what is the median after all six numbers?
QUESTION 07
Why is a plain sorted array worse for a long stream?