Union by rank vs union by size. — Cracked Java
// Data Structures & Algorithms · Union-Find (Disjoint Set Union)
SeniorTheory

Union by rank vs union by size.

Both heuristics decide the same thing: when merging two sets, which root becomes the parent of the other. The goal is identical — keep trees shallow so find stays fast — they just measure "bigger" differently.

Union by size

Track the number of elements in each tree; attach the smaller set under the larger. Intuition: fewer nodes get their depth increased.

boolean union(int a, int b) {
    int ra = find(a), rb = find(b);
    if (ra == rb) return false;
    if (size[ra] < size[rb]) { int t = ra; ra = rb; rb = t; }
    parent[rb] = ra;
    size[ra] += size[rb];          // maintain the count
    return true;
}

Union by rank

Track rank — an upper bound on tree height, not an exact count; attach the lower-rank tree under the higher-rank one. Rank increments by 1 only when two equal-rank trees merge.

boolean union(int a, int b) {
    int ra = find(a), rb = find(b);
    if (ra == rb) return false;
    if (rank[ra] < rank[rb]) { int t = ra; ra = rb; rb = t; }
    parent[rb] = ra;
    if (rank[ra] == rank[rb]) rank[ra]++;   // tie -> height grew by 1
    return true;
}

How they differ

  • Size stays exact and is independently useful — you often need "how big is this component?" anyway, so it comes for free.
  • Rank is only an upper bound on height once path compression starts flattening trees: compression lowers actual height but never decreases stored rank. That is fine — rank only needs to bound the merge decision, not stay precise.

Both deliver the same O(log n) standalone tree height, and both combine with path compression to reach the O(α(n)) amortized bound. In interviews, either is fully acceptable.

OperationBestAverageWorstNote
Union (by rank or size)O(α(n))O(α(n))O(α(n))amortized, with path compression
FindO(α(n))O(α(n))O(α(n))amortized, with path compression
Tree height (heuristic alone)O(log n)no compression

Mark your status