Lock-Free Counters and When They're Enough

How x86-64 atomic instructions can replace mutexes for simple shared counters, and where the pattern breaks down. · 5 min read

When the shared mutable state is a single monotonic counter, x86-64 hardware offers a simpler option than a mutex: LOCK XADD. This atomic read-modify-write instruction completes in roughly 5-10 nanoseconds when the cache line is uncontended, requires no kernel involvement, and guarantees that no increment is lost without any software lock. For the right workload, one where readers only poll the value and never need to sleep until it changes, it replaces a mutex, a condition variable, and the futex underneath them all at once.

How It Works

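The producer increments with LOCK XADD, which adds to the counter in memory as one indivisible step. The consumer reads with a plain MOV, which x86-64 guarantees is atomic for a naturally aligned 64-bit value. Neither thread blocks.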
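A minimal sketch of this producer/consumer pattern in Rust, using the standard library's AtomicU64. The function name run_counter and its parameters are illustrative, not taken from any particular codebase. On x86-64, compilers typically lower fetch_add to a LOCK-prefixed instruction (LOCK XADD when the previous value is used, sometimes LOCK ADD or LOCK INC when it is discarded) and a relaxed load to a plain MOV, but the exact instruction selection is up to the compiler.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `producers` threads that each add 1 to a shared counter
/// `increments` times, while the calling thread polls it as the consumer.
/// Returns the final count and whether every polled value was >= the last.
fn run_counter(producers: u64, increments: u64) -> (u64, bool) {
    let counter = Arc::new(AtomicU64::new(0));

    let handles: Vec<_> = (0..producers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // Atomic read-modify-write: no lock, no syscall, no blocking.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Consumer: poll with plain atomic loads until all increments are visible.
    let target = producers * increments;
    let mut last = 0u64;
    let mut monotonic = true;
    loop {
        let seen = counter.load(Ordering::Relaxed);
        if seen < last {
            monotonic = false; // would indicate a broken counter
        }
        last = seen;
        if seen >= target {
            break;
        }
        std::hint::spin_loop(); // hint to the CPU that this is a busy-wait
    }

    for handle in handles {
        handle.join().unwrap();
    }
    (counter.load(Ordering::Relaxed), monotonic)
}
```

Calling run_counter(4, 10_000) should return a final count of 40,000 with no lost updates. Relaxed ordering is used here on the assumption that the counter is the only shared state; it says nothing about the visibility of any other memory.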

Where It Fits

Where It Breaks Down

The Practical Threshold

A mutex acquire-release cycle on Linux costs roughly 25-50 nanoseconds uncontended, plus the risk of blocking if another thread holds it. LOCK XADD costs 5-10 nanoseconds and never blocks. For a counter hit millions of times per second in a hot loop, that difference compounds: at 10 million updates per second, saving 20-40 nanoseconds per update recovers roughly 0.2-0.4 seconds of CPU time every second. For a counter hit a few hundred times per second, the saving amounts to microseconds per second and doesn’t matter. Pick based on your measured update frequency.
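The figures above vary by CPU, kernel, and standard library, so it is worth measuring on your own hardware. The following is a rough single-threaded sketch, not a rigorous benchmark: the helper names time_atomic, time_mutex, and ns_per_op are hypothetical, and it measures only the uncontended case. std::hint::black_box is used to discourage the compiler from simplifying the loops.

```rust
use std::hint::black_box;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Times `n` uncontended atomic increments; returns (final count, elapsed).
fn time_atomic(n: u64) -> (u64, Duration) {
    let counter = AtomicU64::new(0);
    let start = Instant::now();
    for _ in 0..n {
        black_box(&counter).fetch_add(1, Ordering::Relaxed);
    }
    let elapsed = start.elapsed();
    (counter.load(Ordering::Relaxed), elapsed)
}

/// Times `n` uncontended lock/increment/unlock cycles on a std Mutex.
fn time_mutex(n: u64) -> (u64, Duration) {
    let counter = Mutex::new(0u64);
    let start = Instant::now();
    for _ in 0..n {
        // The guard is dropped at the end of the statement, releasing the lock.
        *black_box(&counter).lock().unwrap() += 1;
    }
    let elapsed = start.elapsed();
    let total = *counter.lock().unwrap();
    (total, elapsed)
}

/// Average cost of one operation in nanoseconds.
fn ns_per_op(elapsed: Duration, n: u64) -> f64 {
    elapsed.as_nanos() as f64 / n as f64
}
```

Running both with a large n (say 10 million) in a release build and comparing ns_per_op gives a first estimate of the per-update gap on your machine. Multiply that gap by your real update rate to see whether it is worth acting on.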