
asmqdm: Python Progress Bars Created in x86-64 Assembly

Sole developer · Personal project

systems · assembly · performance · python

~5 ns per update in async mode via LOCK XADD

Render thread on a separate core via clone() and sched_setaffinity

Problem / Context

Progress bars in Python add per-iteration overhead, which can matter in tight loops. I wanted to see how low that overhead could go if the update path were a single atomic instruction written in assembly, with rendering handled by a separate thread pinned to a different CPU core.
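The split between a cheap update path and a separate render thread can be sketched in plain Python. This is only an illustrative stand-in, not the asmqdm implementation: the names `AsyncProgress` and `run_demo` are hypothetical, a plain integer increment takes the place of the `LOCK XADD` counter, and a `threading.Thread` takes the place of the `clone()`-created thread pinned with `sched_setaffinity`. It also assumes a single thread calls `update()`, since a Python `+=` is not an atomic read-modify-write.

```python
import threading


class AsyncProgress:
    """Hypothetical stand-in: update() only bumps a counter; a thread renders."""

    def __init__(self, total, interval=0.05, write=print):
        self.total = total
        self.count = 0      # written by the worker only, read by the renderer
        self.frames = []    # rendered lines, kept so the behavior can be inspected
        self._interval = interval
        self._write = write
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._render_loop, daemon=True)
        self._thread.start()

    def update(self, n=1):
        # Hot path: no formatting and no I/O, just an increment.
        # Assumes one writer thread; this is not an atomic fetch-and-add.
        self.count += n

    def _render(self):
        pct = 100 * self.count // self.total if self.total else 100
        line = f"{self.count}/{self.total} ({pct}%)"
        self.frames.append(line)
        self._write(line)

    def _render_loop(self):
        # Wake at a fixed interval, independent of how often update() is called.
        while not self._done.wait(self._interval):
            self._render()

    def close(self):
        self._done.set()
        self._thread.join()
        self._render()  # final frame reflects the finished count


def run_demo(total=100_000):
    # Silent sink so the demo does not write to the terminal.
    bar = AsyncProgress(total, write=lambda line: None)
    for _ in range(total):
        bar.update()
    bar.close()
    return bar
```

In this sketch the cost of `update()` does not depend on how expensive rendering is, because formatting and output only happen when the render thread wakes. Calling `run_demo()` returns the bar object, whose `frames` list holds every line the render thread produced.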

Approach

Results

What I’d Do Next