Single-leader replication — why is it the default? Failur… — Cracked Java

Single-leader replication — why is it the default? Failure modes?

Single-leader replication — why it's the default, and its failure modes

Single-leader replication (also called leader-follower, primary-replica, or master-slave) is the replication model of nearly every relational database (Postgres, MySQL, SQL Server) and of many other systems. It's the default because it's the simplest model that gives strong consistency for writes: a single node imposes a total order on them, so there are never write conflicts to reconcile. Reads served by the leader see that order directly; reads from followers may lag behind it.

How it works

[Figure: single-leader replication. Writes go to the leader; reads scale across followers.]
  • All writes go to the leader. The leader applies them, then streams its change log (WAL shipping, MySQL binlog, or logical replication) to the followers.
  • Reads can go to any follower (or the leader). This is the main scaling lever: a read-heavy workload scales by adding followers.
  • Followers apply the log in the same order, so they converge to the leader's state — eventually.
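The three points above can be sketched as a tiny in-memory model. This is only an illustration under simplifying assumptions (one process, no network, no durability); `LogEntry`, `Replica`, `Leader`, and `entriesAfter` are hypothetical names, not the API of Postgres, MySQL, or any other database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One entry in the leader's change log. Positions start at 1 and never repeat.
final class LogEntry {
    final long position;
    final String key;
    final String value;

    LogEntry(long position, String key, String value) {
        this.position = position;
        this.key = key;
        this.value = value;
    }
}

// Any node that applies the log: a follower, or the leader itself.
class Replica {
    final Map<String, String> state = new HashMap<>();
    long lastApplied = 0; // position of the last log entry applied

    // Apply entries strictly in log order. Already-applied entries are skipped;
    // a gap means the stream is broken, so fail loudly instead of diverging.
    void apply(List<LogEntry> entries) {
        for (LogEntry e : entries) {
            if (e.position <= lastApplied) continue;
            if (e.position != lastApplied + 1) {
                throw new IllegalStateException("gap in log at position " + e.position);
            }
            state.put(e.key, e.value);
            lastApplied = e.position;
        }
    }
}

class Leader extends Replica {
    final List<LogEntry> log = new ArrayList<>();

    // Every write enters here, so the log position defines a total order.
    LogEntry write(String key, String value) {
        LogEntry e = new LogEntry(log.size() + 1, key, value);
        log.add(e);
        apply(List.of(e));
        return e;
    }

    // Everything after the given position: used both for streaming and for
    // catch-up after a follower restarts.
    List<LogEntry> entriesAfter(long position) {
        return List.copyOf(log.subList((int) position, log.size()));
    }
}

class ReplicationDemo {
    public static void main(String[] args) {
        Leader leader = new Leader();
        Replica follower = new Replica();
        leader.write("x", "1");
        leader.write("x", "2");
        // Until this call runs, the follower is behind: that window is replication lag.
        follower.apply(leader.entriesAfter(follower.lastApplied));
        System.out.println(follower.state.equals(leader.state)); // true
    }
}
```

Because a follower only ever asks for "everything after my last applied position", the same call serves steady-state streaming and recovery after a restart.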

Synchronous vs asynchronous replication

This is the central tunable:

Mode             | Leader waits for                 | Trade-off
-----------------|----------------------------------|----------------------------------------------------------------------------------
Synchronous      | At least one follower to confirm | No data loss on leader failure, but a slow/down follower stalls writes
Asynchronous     | Nothing; acks immediately        | Fast and available, but writes acked-but-not-replicated are lost if the leader dies
Semi-synchronous | One sync follower, rest async    | Common compromise: durability of one copy, speed of the rest

Pure synchronous replication to all followers is impractical (any single follower outage blocks all writes), so production systems use semi-synchronous or async.
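To make the trade-off concrete, here is a minimal sketch of the decision a leader makes before acknowledging a write. It is a deliberate simplification: `AckMode`, `AckPolicy`, and `canAck` are hypothetical names, `SYNCHRONOUS` is modelled as "all followers confirmed" (the pure form described above), and `SEMI_SYNCHRONOUS` as "at least one follower confirmed". Real databases expose this through their own configuration rather than code like this.

```java
import java.util.List;

enum AckMode { SYNCHRONOUS, SEMI_SYNCHRONOUS, ASYNCHRONOUS }

class AckPolicy {
    // followerConfirmed.get(i) is true once follower i has confirmed the write.
    static boolean canAck(AckMode mode, List<Boolean> followerConfirmed) {
        switch (mode) {
            case ASYNCHRONOUS:
                return true; // never waits for a follower
            case SEMI_SYNCHRONOUS:
                return followerConfirmed.contains(true); // one durable copy is enough
            case SYNCHRONOUS:
                return !followerConfirmed.contains(false); // every follower must confirm
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }
}

class AckDemo {
    public static void main(String[] args) {
        // Two followers; the second one is slow or down and has not confirmed.
        List<Boolean> oneFollowerDown = List.of(true, false);
        for (AckMode mode : AckMode.values()) {
            System.out.println(mode + " can ack: " + AckPolicy.canAck(mode, oneFollowerDown));
        }
    }
}
```

With one follower down, only the fully synchronous mode refuses to acknowledge, which is exactly why a single outage stalls writes in that mode.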

Failure modes

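  • Follower failure: catch-up recovery. A follower that crashes knows its last applied log position; on restart it requests every change since that position and catches up. Low impact: reads just route elsewhere in the meantime.
  • Leader failure: failover. The hard case. Steps: detect that the leader is down (usually by timeout, which risks false positives), elect a new leader (ideally the most up-to-date follower), and reconfigure clients and replicas to point at it. Most of the risk lives here: with asynchronous replication the new leader may be missing writes the old one already acknowledged, and if the old leader comes back still believing it is the leader, you get split brain unless it is fenced off.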
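Two pieces of the failover step lend themselves to a short sketch: picking the most up-to-date reachable follower, and rejecting writes from a stale leader. The second part uses an epoch (generation) number as a fencing token, which is a common technique but an addition to the steps listed above. All names here (`FollowerInfo`, `Failover`, `FencedStorage`) are hypothetical, and the sketch leaves out the genuinely hard parts: failure detection and getting the nodes to agree on the election.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// What the failover coordinator knows about each follower.
final class FollowerInfo {
    final String name;
    final long lastApplied;   // highest log position this follower has applied
    final boolean reachable;

    FollowerInfo(String name, long lastApplied, boolean reachable) {
        this.name = name;
        this.lastApplied = lastApplied;
        this.reachable = reachable;
    }
}

class Failover {
    // Promote the reachable follower that has applied the most of the log,
    // which minimises (but does not eliminate) lost writes under async replication.
    static Optional<FollowerInfo> chooseNewLeader(List<FollowerInfo> followers) {
        return followers.stream()
                .filter(f -> f.reachable)
                .max(Comparator.comparingLong((FollowerInfo f) -> f.lastApplied));
    }
}

// Shared resource that only accepts writes from the newest leader it has seen.
class FencedStorage {
    private long highestEpoch = 0;

    // Each newly elected leader gets a higher epoch. A write carrying an older
    // epoch comes from a deposed leader and is rejected.
    boolean accept(long epoch) {
        if (epoch < highestEpoch) return false;
        highestEpoch = epoch;
        return true;
    }
}

class FailoverDemo {
    public static void main(String[] args) {
        List<FollowerInfo> followers = List.of(
                new FollowerInfo("f1", 90, true),
                new FollowerInfo("f2", 120, false), // furthest ahead, but unreachable
                new FollowerInfo("f3", 100, true));
        System.out.println(Failover.chooseNewLeader(followers).map(f -> f.name).orElse("none")); // f3

        FencedStorage storage = new FencedStorage();
        storage.accept(1);                     // old leader, epoch 1
        storage.accept(2);                     // new leader, epoch 2
        System.out.println(storage.accept(1)); // false: old leader is fenced
    }
}
```

Note that in the demo the follower with the most data is unreachable, so the promoted node is behind it. That gap is where acknowledged writes can be lost.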

The fundamental limitation

Single-leader replication scales reads but not writes, because every write funnels through one node. When write throughput exceeds what one leader can handle, you must either partition (shard) the data so that each shard has its own leader, or move to multi-leader or leaderless replication. Reading from followers also has a cost: replication lag means those reads can be stale (covered in its own question).
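As a rough illustration of the sharding escape hatch, the sketch below routes each key to the leader of its shard, so writes for different keys land on different leaders. `ShardRouter` and the leader names are hypothetical, and hash-modulo placement is the simplest possible assumption: changing the shard count remaps most keys, which is why real systems tend to use range-based or consistent-hashing schemes instead.

```java
import java.util.List;

class ShardRouter {
    private final List<String> shardLeaders; // one leader address per shard

    ShardRouter(List<String> shardLeaders) {
        if (shardLeaders.isEmpty()) {
            throw new IllegalArgumentException("need at least one shard");
        }
        this.shardLeaders = List.copyOf(shardLeaders);
    }

    // Every write for a given key goes to the same shard leader, so each
    // shard is still a single-leader system with its own total order.
    String leaderFor(String key) {
        int shard = Math.floorMod(key.hashCode(), shardLeaders.size());
        return shardLeaders.get(shard);
    }
}

class ShardDemo {
    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(List.of("leader-0", "leader-1", "leader-2"));
        for (String key : List.of("user:1", "user:2", "order:42")) {
            System.out.println(key + " -> " + router.leaderFor(key));
        }
    }
}
```

`Math.floorMod` is used rather than `%` because `hashCode()` can be negative, and a negative index would throw.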

Style note

For EU/contracting and regional interviews, single-leader Postgres with read replicas is often the right, cost-justified answer up to large scale — don't over-reach for leaderless complexity. For FAANG, be ready to explain failover, split-brain fencing, and the write-scaling wall that pushes you toward sharding.
