What is transaction ID wraparound? Why is it catastrophic… — Cracked Java
SeniorTrickBig Tech

What is transaction ID wraparound? Why is it catastrophic if ignored?

Transaction ID wraparound is the failure mode where PostgreSQL's 32-bit transaction counter rolls over, causing old committed rows to suddenly appear to come from the future — and therefore become invisible. It's catastrophic because it can silently make committed data disappear; the fix is freezing, which VACUUM performs continuously.

Why a 32-bit counter is a problem

Every transaction gets a monotonically increasing XID, and each tuple's xmin records the XID that created it. The XID is only 32 bits — about 4 billion values. PostgreSQL treats XID space as a circle: for any current XID, half the space (~2 billion) is considered "in the past" and half "in the future." Visibility depends on a tuple's xmin being in the snapshot's past.

When the counter advances past ~2 billion transactions beyond a tuple's xmin, that old tuple flips from "past" to "future" relative to the current XID — and a tuple from the future is not yet visible. The committed row effectively vanishes. That is data loss, not corruption you can shrug off.

The fix: freezing

To prevent this, VACUUM freezes old tuples: it marks a sufficiently-old, still-live tuple as frozen, a special state meaning "always visible, regardless of the XID counter." A frozen tuple is exempt from the wraparound comparison entirely.

-- inspect how close a table is to forced action
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC LIMIT 10;

age(relfrozenxid) is how many transactions have elapsed since the table's oldest unfrozen XID. Watch it.

The thresholds that protect you

  • autovacuum_freeze_max_age (default 200 million) — when a table's XID age exceeds this, autovacuum launches an anti-wraparound vacuum even if autovacuum is "disabled" and even with no dead tuples. This is the safety net.
  • vacuum_freeze_min_age — how old a tuple must be before a normal vacuum freezes it.
  • At ~2 billion, if freezing still hasn't happened, PostgreSQL refuses new writes and demands a single-user-mode vacuum — a full outage.

What goes wrong in practice

Wraparound emergencies almost always trace to autovacuum being disabled, starved, or unable to keep up (long-running transactions hold back the "oldest XID," blocking freezing). The warning signs are WARNING: database must be vacuumed within N transactions messages in the log.

Mark your status