Backpressure: What It Is and How to Tame It
Where backpressure shows up, how to spot it, and practical relief valves for real systems. · 5 min read
Backpressure occurs when producers outpace consumers and work piles up. It is rarely a one-off bug; it emerges from architecture choices like fan-out depth, batch sizing, retry policy, and shared dependencies. It most often appears when bursty or sustained load exceeds the capacity of a constrained resource such as threads, DB IOPS, partition throughput, or an external API. With deliberate design you can keep flow steady instead of spiky: size buffers intentionally, cap concurrency, shape ingress, and degrade gracefully. Left unmanaged, queues grow without bound, latency becomes unpredictable, and users are left staring at spinners and timeouts.
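To make that concrete, here is a minimal Go sketch, with illustrative names and sizes: a bounded buffer forces the producer to make an explicit choice (accept, or shed) the moment the consumer falls behind, instead of letting memory grow silently.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// A small, bounded buffer: when the consumer falls behind, the
	// producer must decide what to do instead of queueing forever.
	jobs := make(chan int, 8)

	// Slow consumer.
	go func() {
		for j := range jobs {
			time.Sleep(50 * time.Millisecond) // simulated service time
			_ = j
		}
	}()

	// Fast producer: the select makes the overflow decision explicit.
	for i := 0; i < 100; i++ {
		select {
		case jobs <- i:
			// accepted
		default:
			fmt.Println("shedding job", i) // buffer full: shed rather than pile up
		}
	}
	close(jobs)
}
```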
Where It Shows Up
- APIs & gateways: client bursts overwhelm downstreams; 429s are common indicators.
- Streams & queues: hot partitions, unbalanced consumers, and small batch sizes starve throughput.
- Datastores: write amplification or index contention drives up per-request latency.
- Async workers: fan-out without a cap on the number of workers; one slow dependency stalls the whole pool.
Common Relief Valves
- Gate ingress: rate-limit by tenant so one noisy client cannot consume shared capacity (first sketch below).
- Bound concurrency: fixed worker pools with backoff instead of unbounded goroutines/promises (second sketch below).
- Shape work: coalesce, batch, and prefer idempotent retries; favor adaptive batching (third sketch below).
- Protect dependencies: circuit breakers plus timeouts, and shed load early when saturation signals appear (fourth sketch below).
- Surface signals: track queue depth, service-time histograms, and drop/timeout counts (also shown in the fourth sketch).
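Per-tenant gating can be as small as one token bucket per tenant key. A sketch using golang.org/x/time/rate; the header name, rate, and burst size here are assumptions for illustration, not a recommendation:

```go
package main

import (
	"net/http"
	"sync"

	"golang.org/x/time/rate"
)

// tenantLimiters hands out one token bucket per tenant so a single
// noisy tenant cannot consume the shared downstream capacity.
type tenantLimiters struct {
	mu       sync.Mutex
	limiters map[string]*rate.Limiter
}

func (t *tenantLimiters) get(tenant string) *rate.Limiter {
	t.mu.Lock()
	defer t.mu.Unlock()
	l, ok := t.limiters[tenant]
	if !ok {
		l = rate.NewLimiter(rate.Limit(50), 100) // 50 req/s steady, burst of 100 (illustrative)
		t.limiters[tenant] = l
	}
	return l
}

func main() {
	tl := &tenantLimiters{limiters: map[string]*rate.Limiter{}}

	http.HandleFunc("/work", func(w http.ResponseWriter, r *http.Request) {
		tenant := r.Header.Get("X-Tenant-ID") // assumed tenant header
		if !tl.get(tenant).Allow() {
			// 429 tells well-behaved clients to back off.
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	})
	http.ListenAndServe(":8080", nil)
}
```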
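A bounded worker pool is the counterpart on the consuming side. A minimal sketch, assuming a flaky downstream call and illustrative pool, buffer, and retry numbers:

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// process stands in for a call to a slow or flaky dependency.
func process(job int) error {
	time.Sleep(20 * time.Millisecond)
	if job%7 == 0 {
		return errors.New("transient failure")
	}
	return nil
}

func main() {
	jobs := make(chan int, 32) // bounded queue in front of the pool
	var wg sync.WaitGroup

	const poolSize = 4 // fixed concurrency cap, sized to the downstream, not to the offered load
	for i := 0; i < poolSize; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				// Retry with exponential backoff instead of hammering a struggling dependency.
				delay := 100 * time.Millisecond
				for attempt := 0; attempt < 3; attempt++ {
					if process(j) == nil {
						break
					}
					time.Sleep(delay)
					delay *= 2
				}
			}
		}()
	}

	for j := 0; j < 200; j++ {
		jobs <- j // blocks when the buffer is full: backpressure propagates to the producer
	}
	close(jobs)
	wg.Wait()
}
```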
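Batching by size or deadline, whichever comes first, is the simplest shaping tactic; a truly adaptive batcher would additionally tune the batch size from observed latency. A sketch with assumed sizes and timeouts:

```go
package main

import (
	"fmt"
	"time"
)

// flush stands in for one downstream write (e.g. a bulk insert).
func flush(batch []string) {
	fmt.Printf("flushing %d items\n", len(batch))
}

// batcher coalesces items and flushes on size or on a deadline, so
// throughput rises under load while latency stays bounded when traffic is light.
func batcher(in <-chan string, maxSize int, maxWait time.Duration) {
	for {
		batch := make([]string, 0, maxSize)
		deadline := time.After(maxWait)
	collect:
		for len(batch) < maxSize {
			select {
			case item, ok := <-in:
				if !ok {
					if len(batch) > 0 {
						flush(batch)
					}
					return
				}
				batch = append(batch, item)
			case <-deadline:
				break collect
			}
		}
		if len(batch) > 0 {
			flush(batch)
		}
	}
}

func main() {
	in := make(chan string)
	go func() {
		for i := 0; i < 25; i++ {
			in <- fmt.Sprintf("event-%d", i)
			time.Sleep(10 * time.Millisecond)
		}
		close(in)
	}()
	batcher(in, 10, 100*time.Millisecond)
}
```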
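Timeouts, early shedding, and the signals that drive them fit naturally together. The sketch below bounds the dependency call with a context deadline and rejects new work once the in-flight count crosses an illustrative threshold; a full circuit breaker would add error-rate tracking on top. Counter names, limits, and the handler path are assumptions:

```go
package main

import (
	"context"
	"expvar"
	"net/http"
	"time"
)

var (
	inFlight  = expvar.NewInt("in_flight")  // work currently admitted: a leading saturation signal
	shedTotal = expvar.NewInt("shed_total") // requests rejected early
)

// callDependency stands in for the downstream call being protected.
func callDependency(ctx context.Context) error {
	select {
	case <-time.After(200 * time.Millisecond): // simulated service time
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	// Shed early, while there is still headroom, rather than letting
	// every caller queue up and time out at the back of a long line.
	if inFlight.Value() > 48 { // illustrative threshold
		shedTotal.Add(1)
		http.Error(w, "overloaded, retry later", http.StatusServiceUnavailable)
		return
	}
	inFlight.Add(1)
	defer inFlight.Add(-1)

	// Bound how long this request will wait on the dependency.
	ctx, cancel := context.WithTimeout(r.Context(), 500*time.Millisecond)
	defer cancel()

	if err := callDependency(ctx); err != nil {
		http.Error(w, "dependency timed out", http.StatusGatewayTimeout)
		return
	}
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/work", handler)
	// Importing expvar exposes the counters at /debug/vars on the default mux.
	http.ListenAndServe(":8080", nil)
}
```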