Consistent hashing — what problem does it solve? Virtual… — Cracked Java
// High-Level Design (HLD / Distributed Systems) · Replication & Partitioning (Sharding)
SeniorSystem DesignBig TechAmazonGoogle

Consistent hashing — what problem does it solve? Virtual nodes.

Consistent hashing — the problem it solves, and virtual nodes

The problem with naive hash sharding

The obvious way to map a key to one of N shards is shard = hash(key) mod N. It distributes load evenly — until N changes. Adding or removing one node changes N, so almost every key remaps to a different shard. With mod 4 → mod 5, roughly 80% of keys move. For a distributed cache that means a near-total cache miss storm; for a database it means moving nearly all the data. Rebalancing should touch a small fraction of keys, not most of them.

The idea: a hash ring

Consistent hashing maps both keys and nodes onto the same circular hash space (say, 0 to 2³²−1). To find which node owns a key, hash the key and walk clockwise to the first node you hit.

Consistent hashing ring: each key is owned by the next node clockwise

The payoff: when a node is added, it only takes over the keys between it and the previous node on the ring; when a node is removed, only its keys move to the next node clockwise. On average just K/N keys move (K = total keys), instead of nearly all of them.

 Before:        After adding Node 4:

    N1                 N1
   /  \               /  \
 N3 -- N2           N3 -- N2
                      \
                       N4   <- only keys in (N3 .. N4] move to N4
Adding Node 4 only steals the arc between Node 3 and Node 4 — every other key stays put

The problem consistent hashing still has — and virtual nodes

Plain consistent hashing has two weaknesses: with few nodes, the ring positions are uneven, so one node may own a much larger arc (and thus more load) than another; and when a node fails, all its load dumps onto its single clockwise neighbor.

Virtual nodes (vnodes) fix both. Instead of placing each physical node once, place it at many positions on the ring (e.g., 100–256 virtual points per physical node, by hashing node-id#0, node-id#1, …).

 Ring with virtual nodes (A, B, C = physical nodes):

      A  C  B  A  B  C  A  C  B  A  B  C
     (each physical node appears many times, arcs interleaved)

 - Load evens out: many small arcs average to ~equal share per node.
 - On failure of A: A's many arcs are inherited by MANY different
   neighbors, spreading the recovery load instead of dumping it on one.
Each physical node owns many small arcs scattered around the ring

Benefits of vnodes:

  • Even load — many small arcs average out, so each physical node gets ~equal share.
  • Graceful failure — a failed node's arcs are scattered, so its load is redistributed across many survivors, not piled on one.
  • Heterogeneous capacity — give a bigger machine more virtual nodes to take proportionally more load.

Where it's used

Consistent hashing with virtual nodes underpins Amazon Dynamo / DynamoDB, Cassandra, Riak, memcached client sharding (ketama), and many load balancers for cache-aware routing. It's the standard answer for "how do you shard a distributed cache or KV store so resharding is cheap?"

Mark your status