Cache Stampede

A cache stampede (or thundering herd) is the failure mode where many concurrent requests for the same hot key all miss the cache at the same time, hit the origin simultaneously, and overwhelm it. Stampedes typically happen the instant a popular key expires, when a cache is cold-started, or when a downstream system briefly fails and recovers.

How it happens

Consider a homepage feed cached with a 60-second TTL. Hundreds of requests per second hit the cache and get a fast cached response. The moment the entry expires, every in-flight request misses, each tries to regenerate the feed, and the database receives a burst of identical queries. The origin tips over, recovery takes longer than the TTL, the cycle repeats.

Common mitigations

  • Request coalescing. When a miss arrives, only one request goes to the origin; concurrent requests wait for and reuse that response.
  • Probabilistic early expiration. Requests have a small probability of triggering refresh before TTL expiry, so refreshes spread out instead of cliffing.
  • Stale-while-revalidate. Continue serving the stale value while one request refreshes in the background.
  • Locking. A distributed lock (Redis SETNX, Redlock) ensures only one regenerator runs.
  • Warmed caches. Periodically refresh hot keys before they expire so misses rarely happen on hot keys.

Subscribe to Sahil's Playbook

Clear thinking on product, engineering, and building at scale. No noise. One email when there's something worth sharing.
[email protected]
Subscribe
Mastodon