AtomicLong keeps one hot field that every thread updates atomically (a CAS retry loop or a hardware fetch-and-add, depending on the operation and platform); LongAdder spreads the count across multiple cells so threads rarely collide. Under high write contention LongAdder is typically much faster; for low contention, or when you need an always-current single value, AtomicLong is the better choice.
The contention problem with AtomicLong
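Conceptually, AtomicLong.incrementAndGet reads the field, adds one, and CASes the result back. When many threads do this concurrently, at most one CAS succeeds per round, the losers retry, and the single cache line holding the value bounces between cores. This is true sharing of one variable rather than false sharing, but the cache-coherence traffic is just as costly. (On x86, recent HotSpot versions typically compile the increment to a single locked fetch-and-add, so there are no retries, yet every update still needs exclusive ownership of that one cache line.) Throughput collapses as core count rises: the field is a serialization point.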
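To make the retry behaviour concrete, here is a minimal sketch that writes the increment as an explicit compareAndSet loop and counts how often the CAS fails. The class CasRetryDemo and its methods are hypothetical names for illustration; this is not how the JDK implements incrementAndGet internally, and the failure count will vary by machine and run.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasRetryDemo {
    // Hand-written increment built on compareAndSet.
    // Returns how many CAS attempts failed before this call succeeded.
    static int incrementWithCas(AtomicLong value) {
        int failures = 0;
        while (true) {
            long current = value.get();
            if (value.compareAndSet(current, current + 1)) {
                return failures;
            }
            failures++; // another thread won the race; reread and retry
        }
    }

    // Starts `threads` workers, each doing `perThread` increments.
    // Returns { final counter value, total failed CAS attempts }.
    static long[] run(int threads, int perThread) {
        AtomicLong value = new AtomicLong();
        AtomicLong failedCas = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                long localFailures = 0;
                for (int j = 0; j < perThread; j++) {
                    localFailures += incrementWithCas(value);
                }
                failedCas.addAndGet(localFailures);
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new long[] { value.get(), failedCas.get() };
    }

    public static void main(String[] args) {
        long[] result = run(8, 100_000);
        System.out.println("final value         = " + result[0]);
        System.out.println("failed CAS attempts = " + result[1]);
    }
}
```

The final value is always exact (threads times perThread); what grows with thread count is the number of wasted attempts, which is the cost the paragraph above describes.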
How LongAdder scales
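LongAdder maintains a base field plus a lazily created, dynamically sized array of cells. While there is no contention, updates go straight to base. Once a CAS fails, each thread picks a cell using a per-thread hash and CASes that cell instead; further collisions trigger rehashing and table growth, so contending threads end up on different cells, each padded to sit on its own cache line. The total is base + sum(cells), computed on demand by sum() rather than maintained in a single field.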
    AtomicLong:                    LongAdder:
    all threads -> [value]         T1 -> [cell0]
    (one hot cache line,           T2 -> [cell1]
     constant CAS failures)        T3 -> [cell2]
                                   sum() = base + cell0 + cell1 + cell2

    import java.util.concurrent.atomic.LongAdder;

    LongAdder counter = new LongAdder();
    counter.increment();          // updates base, or this thread's cell under contention
    counter.add(5);
    long total = counter.sum();   // adds base + all cells; NOT an atomic snapshot
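The striping idea can be sketched in a few lines. The following is a deliberately simplified toy, not the JDK's implementation: StripedCounter, its fixed stripe count, and the slot-assignment scheme are all assumptions made for illustration. It has no base field and never grows, and it approximates cache-line padding by spacing cells 8 longs apart, which assumes 64-byte cache lines.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounter {
    // 8 longs = 64 bytes, so used elements land on different cache lines
    // (assumes 64-byte lines; the real LongAdder pads its cells instead).
    private static final int STRIDE = 8;

    // Each thread gets a small integer slot the first time it touches a counter.
    private static final AtomicInteger NEXT_SLOT = new AtomicInteger();
    private static final ThreadLocal<Integer> SLOT =
            ThreadLocal.withInitial(NEXT_SLOT::getAndIncrement);

    private final int mask;
    private final AtomicLongArray cells;

    StripedCounter(int stripes) {
        int n = 1;
        while (n < stripes) {
            n <<= 1; // round up to a power of two so we can mask instead of mod
        }
        this.mask = n - 1;
        this.cells = new AtomicLongArray(n * STRIDE);
    }

    // Adds to the calling thread's stripe only.
    void add(long delta) {
        int index = (SLOT.get() & mask) * STRIDE;
        cells.addAndGet(index, delta);
    }

    // Sums all stripes; like LongAdder.sum(), this is not an atomic snapshot.
    long sum() {
        long total = 0;
        for (int i = 0; i < cells.length(); i += STRIDE) {
            total += cells.get(i);
        }
        return total;
    }

    // Runs `threads` workers doing `perThread` increments each and returns the sum.
    static long demo(int threads, int perThread) {
        StripedCounter counter = new StripedCounter(threads);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println("total = " + demo(8, 100_000)); // prints "total = 800000"
    }
}
```

Because all workers have been joined before sum() runs, the result here is exact. Calling sum() while writers are still active would return a value that may already be stale, which is the same caveat that applies to LongAdder.sum().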
The trade-offs
- Use LongAdder for write-heavy, read-rarely counters: request counts, hit/miss metrics, throughput stats.
- Use AtomicLong when contention is low, when you need an exact current value cheaply, or when you need atomic CAS semantics (compareAndSet), which LongAdder does not offer.
- Memory: LongAdder grows an array up to roughly the number of CPUs, so it costs more memory than a single long.
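As an illustration of the second point, here is a sketch of a counter that must never exceed a limit. The check and the increment have to happen as one atomic step, which compareAndSet provides and LongAdder cannot. BoundedCounter and its method names are hypothetical, chosen for this example only.

```java
import java.util.concurrent.atomic.AtomicLong;

class BoundedCounter {
    private final AtomicLong used = new AtomicLong();
    private final long limit;

    BoundedCounter(long limit) {
        this.limit = limit;
    }

    // Increments only if the result stays within the limit.
    // The CAS guarantees that the value we checked is the value we replace.
    boolean tryAcquire() {
        while (true) {
            long current = used.get();
            if (current >= limit) {
                return false; // quota exhausted
            }
            if (used.compareAndSet(current, current + 1)) {
                return true;
            }
            // Lost a race with another thread: reread and re-check the limit.
        }
    }

    long usedCount() {
        return used.get();
    }

    public static void main(String[] args) {
        BoundedCounter quota = new BoundedCounter(3);
        for (int i = 0; i < 5; i++) {
            System.out.println("attempt " + i + " -> " + quota.tryAcquire());
        }
        System.out.println("used = " + quota.usedCount());
    }
}
```

Running main prints true for attempts 0 to 2, false for attempts 3 and 4, and then used = 3. With a LongAdder, a "sum() then increment()" sequence could let several threads pass the check at once and overshoot the limit.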
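LongAccumulator generalizes the idea from addition to any function whose result does not depend on the order in which updates are applied, that is, one that is associative and commutative (e.g. max, multiply), combined with a suitable identity value.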
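A short sketch of the max case, assuming a hypothetical MaxLatencyTracker that records latencies in milliseconds. Long.MIN_VALUE serves as the identity for max, so get() returns it until something has been recorded.

```java
import java.util.concurrent.atomic.LongAccumulator;

class MaxLatencyTracker {
    // Long::max is associative and commutative, so update order does not matter.
    private final LongAccumulator maxMillis =
            new LongAccumulator(Long::max, Long.MIN_VALUE);

    void record(long millis) {
        maxMillis.accumulate(millis);
    }

    // Like LongAdder.sum(), this combines the cells on demand.
    long max() {
        return maxMillis.get();
    }

    public static void main(String[] args) {
        MaxLatencyTracker tracker = new MaxLatencyTracker();
        tracker.record(12);
        tracker.record(87);
        tracker.record(40);
        System.out.println("max = " + tracker.max()); // prints "max = 87"
    }
}
```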