Redis vs Memcached, and local (Caffeine) vs distributed. — Cracked Java

Two orthogonal decisions live here. The first is which distributed cache to use: Redis or Memcached. The second is whether the cache should be distributed at all or live in-process (Caffeine). A strong answer treats these as separate axes and often picks both layers (a local L1 in front of a distributed L2).

Redis vs Memcached

| Feature | Redis | Memcached |
|---|---|---|
| Data model | Rich: strings, hashes, lists, sets, sorted sets, streams, bitmaps, HyperLogLog | Strings only (opaque blobs) |
| Threading | Single-threaded per shard (event loop) | Multi-threaded; scales on a big box |
| Persistence | Optional RDB snapshots / AOF log | None (pure cache) |
| Replication / HA | Built-in replication, Sentinel, Cluster | None native; client-side sharding |
| Pub/sub, Lua, transactions | Yes | No |
| Eviction | maxmemory + policies (LRU/LFU/TTL/random) | LRU |
| Best fit | Default choice; needs structures, persistence, or HA | Simple, huge, multi-threaded LRU cache |

Takeaway: Memcached is a lean, multi-threaded, LRU-only cache that excels at pure "big bag of bytes" caching on a single large machine. Redis is the de facto default because its data structures (sorted sets for leaderboards and rate limits, hashes for objects), persistence, replication, Cluster mode, and pub/sub make it far more than a cache. Its single-threaded model rarely becomes the bottleneck, because you scale horizontally by sharding with Cluster. Pick Memcached only when you specifically want its simplicity and multi-threaded throughput and need none of Redis's features.
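The "client-side sharding" entry for Memcached means the client, not the server, decides which node owns a key. One common way to do that is a consistent hash ring. The following is a minimal sketch using only the JDK; the class name `HashRingSketch`, the node addresses, and the choice of MD5 with 100 virtual nodes per server are illustrative assumptions, not the behavior of any particular Memcached client library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side shard picker (not part of any Memcached client library).
final class HashRingSketch {
    // Ring position -> owning node. Each node appears at many positions (virtual nodes).
    private final TreeMap<Long, String> ring = new TreeMap<>();

    HashRingSketch(List<String> nodes, int virtualNodesPerServer) {
        for (String node : nodes) {
            for (int i = 0; i < virtualNodesPerServer; i++) {
                ring.put(hash(node + "#" + i), node);
            }
        }
    }

    // Walk clockwise from the key's position to the next node; wrap to the start if needed.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no cache nodes configured");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Uses the first 8 bytes of an MD5 digest as the ring position (an assumed choice).
    private static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashRingSketch ring = new HashRingSketch(
                List.of("cache-a:11211", "cache-b:11211", "cache-c:11211"), 100);
        System.out.println("user:42 -> " + ring.nodeFor("user:42"));
    }
}
```

With a ring like this, removing one node only remaps the keys that node owned; a plain `hash % nodeCount` scheme would remap most keys whenever the node count changes.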

Local (Caffeine) vs distributed

This is the more important architectural axis.

  • Local / in-process (Caffeine). Lives in the JVM heap of each app instance. Access takes on the order of nanoseconds, with no network hop and no serialization. The downsides: it is per-node (not shared, so each instance has its own copy and its own hit rate), bounded by heap, and incoherent: when one node updates the source, the other nodes' local copies go stale. Best for tiny, ultra-hot, read-mostly data that tolerates brief staleness (feature flags, config, the hottest keys).

  • Distributed (Redis). A shared cache reachable by all instances over the network (typically sub-millisecond within a data center). It is coherent (one copy, one place to invalidate), survives app instance restarts, and scales independently of the app tier. The cost is the network round-trip and serialization.
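The incoherence of per-node caches is easy to see in a small simulation. The sketch below is a JDK-only stand-in, not Caffeine itself: `LocalTtlCacheSketch` is a hypothetical class that models only per-node storage and TTL expiry, and the clock is injected so the staleness window can be shown deterministically. Two "nodes" each hold their own copy, and one keeps serving the old value until its entry expires.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Stand-in for an in-process cache such as Caffeine (hypothetical; models only TTL expiry).
final class LocalTtlCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so tests can control time

    LocalTtlCacheSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value, reloading from the source only when missing or expired.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            entry = new Entry<>(loader.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }

    public static void main(String[] args) {
        long[] now = {0};
        Map<String, String> source = new ConcurrentHashMap<>(Map.of("flag", "off"));
        LocalTtlCacheSketch<String, String> nodeA = new LocalTtlCacheSketch<>(1000, () -> now[0]);
        LocalTtlCacheSketch<String, String> nodeB = new LocalTtlCacheSketch<>(1000, () -> now[0]);

        nodeA.get("flag", source::get);
        nodeB.get("flag", source::get);
        source.put("flag", "on"); // the source changes; no node is told
        System.out.println("nodeB before expiry: " + nodeB.get("flag", source::get));
        now[0] = 1000;            // advance past the TTL
        System.out.println("nodeB after expiry:  " + nodeB.get("flag", source::get));
    }
}
```

The TTL bounds how long a node can serve a stale value, which is why short TTLs are the simplest mitigation for local caches holding data such as feature flags.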

The two-tier (L1/L2) combination

The strongest production answer uses both: Caffeine as L1 in front of Redis as L2. L1 absorbs the hottest keys at zero network cost; an L1 miss falls through to the shared L2, and only an L2 miss hits the database. The price is L1 coherence: an L1 entry can go stale when another node writes to L2. Handle this with short L1 TTLs, a Redis pub/sub channel that tells every node to evict a key on change, or both.
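The read path and the invalidation broadcast can be sketched as follows. Everything here is an in-memory stand-in so the example runs with only the JDK: `SharedTier` plays the role of Redis (an L2 map plus a subscriber list in place of a pub/sub channel), each `TwoTierNode` keeps a plain map where a real service would use Caffeine, and a map stands in for the database. All class and method names are hypothetical, and L1 TTLs are omitted for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Stand-in for Redis: a shared L2 store plus an invalidation channel.
final class SharedTier {
    final Map<String, String> l2 = new ConcurrentHashMap<>();
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> onInvalidate) {
        subscribers.add(onInvalidate);
    }

    void publishInvalidation(String key) {
        subscribers.forEach(s -> s.accept(key));
    }
}

// One app instance with its own L1 (a real service would use Caffeine here).
final class TwoTierNode {
    private final Map<String, String> l1 = new ConcurrentHashMap<>();
    private final SharedTier shared;
    private final Map<String, String> database; // stand-in for the source of truth

    TwoTierNode(SharedTier shared, Map<String, String> database) {
        this.shared = shared;
        this.database = database;
        shared.subscribe(key -> l1.remove(key)); // evict the local copy on change
    }

    String get(String key) {
        String value = l1.get(key);
        if (value != null) {
            return value;                  // L1 hit: no network hop
        }
        value = shared.l2.get(key);
        if (value == null) {               // L2 miss: go to the database
            value = database.get(key);
            if (value == null) {
                return null;
            }
            shared.l2.put(key, value);
        }
        l1.put(key, value);                // populate L1 for the next read
        return value;
    }

    void put(String key, String value) {
        database.put(key, value);          // write the source of truth first
        shared.l2.remove(key);             // drop the shared copy
        shared.publishInvalidation(key);   // tell every node to evict its L1 copy
    }

    public static void main(String[] args) {
        SharedTier shared = new SharedTier();
        Map<String, String> db = new ConcurrentHashMap<>(Map.of("user:1", "alice"));
        TwoTierNode a = new TwoTierNode(shared, db);
        TwoTierNode b = new TwoTierNode(shared, db);
        System.out.println(a.get("user:1"));
        b.put("user:1", "bob");
        System.out.println(a.get("user:1"));
    }
}
```

In a real deployment the invalidation message can be delayed or lost, so a short L1 TTL is still worth keeping as a backstop alongside the broadcast.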
