Austin Pivarnik Turns to CPU Busywork to Create the "World's Most Stable Raspberry Pi"
No, this isn't about crashes, but about getting timing drift down as low as possible by stabilizing the on-board crystal's temperature.
Engineer Austin Pivarnik has tweaked his Raspberry Pi single-board computer to become what he believes may be the most stable example in the world, doing away with heat-linked clock jitter by using some clever busywork to keep the processor at a stable temperature.
"Despite having a stable PPS [Pulse Per Second] reference, my NTP [Network Time Protocol] server's frequency drift was exhibiting significant variation over time," Pivarnik explains of exactly what problem he set out to solve. "After months (years) of monitoring the system with Grafana dashboards, I noticed something interesting: the frequency oscillations seemed to correlate with CPU temperature changes. The frequency would drift as the CPU heated up during the day and cooled down at night, even though the PPS reference remained rock-solid."
Most people don't really care about "jitter" (tiny changes in timing accuracy measurable only under precise observation) until it gets really, really bad. Pivarnik, though, was aiming for sub-microsecond time synchronization, and that wasn't really achievable for one simple reason: heat, or rather the constantly-changing amount thereof.
"Modern CPUs, including those in Raspberry Pis, use dynamic frequency scaling to save power and manage heat," Pivarnik explains. "When the CPU is idle, it runs at a lower frequency (and voltage). When load increases, it scales up. This is great for power efficiency, but terrible for precision timekeeping. But hereβs the key insight: the system clock is ultimately derived from a crystal oscillator, and crystal oscillator frequency is temperature-dependent. The oscillator sits on the board near the CPU, and as the CPU heats up and cools down throughout the day, so does the crystal. Even a few degrees of temperature change can shift the oscillator's frequency by parts per million β exactly what I was seeing in my frequency drift graphs."
Setting the CPU governor to a single fixed frequency (at the top or bottom end of its scale, it doesn't matter) wasn't enough to solve the problem, though. A bigger issue was the amount of heat generated as the processor chews through its workloads, cooling when it has little work to do and heating again when more is demanded. That alone was enough to cause around 86 nanoseconds of offset, "which isn't terrible," Pivarnik admits, "it's actually really, really good, but I knew it could be better."
The solution: busywork. Pivarnik modified the software on the Raspberry Pi to lock timing-critical tasks to a single CPU core, preventing them from being shuffled around the four cores on the chip, while running busywork on the remaining three cores to maintain a set temperature. This wasn't about just loading the CPU to its limit, though, but creating a proportional-integral-derivative (PID) controller that varied the workload according to temperature in order to remain as close to the target value as possible.
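The general idea can be sketched in Python as below. This is a minimal illustration under stated assumptions, not Pivarnik's code: the `PID` class, the `burn` and `run` helpers, the gains, the 55°C target, and the choice of core are all hypothetical, and the sketch drives a single heater core rather than three. It assumes a Linux system that exposes the CPU temperature in millidegrees Celsius at `/sys/class/thermal/thermal_zone0/temp`.

```python
import os
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PID:
    """Textbook PID controller with output clamped to [out_min, out_max]."""
    kp: float
    ki: float
    kd: float
    setpoint: float
    out_min: float = 0.0
    out_max: float = 1.0
    _integral: float = field(default=0.0, init=False, repr=False)
    _prev_error: Optional[float] = field(default=None, init=False, repr=False)

    def update(self, measurement: float, dt: float) -> float:
        error = self.setpoint - measurement  # positive when too cold
        derivative = (0.0 if self._prev_error is None
                      else (error - self._prev_error) / dt)
        self._prev_error = error
        candidate = self._integral + error * dt
        raw = self.kp * error + self.ki * candidate + self.kd * derivative
        clamped = min(self.out_max, max(self.out_min, raw))
        if raw == clamped:
            # Simple anti-windup: only accumulate while not saturated.
            self._integral = candidate
        return clamped


def read_cpu_temp_c(path: str = "/sys/class/thermal/thermal_zone0/temp") -> float:
    """Read the CPU temperature in degrees Celsius (assumed sysfs path)."""
    with open(path) as f:
        return int(f.read().strip()) / 1000.0


def burn(duty: float, period_s: float = 0.1) -> None:
    """Spin for duty * period_s seconds, then sleep for the rest of the period."""
    busy_until = time.perf_counter() + duty * period_s
    while time.perf_counter() < busy_until:
        pass
    time.sleep((1.0 - duty) * period_s)


def run(target_c: float = 55.0, heater_core: int = 1) -> None:
    """Hold the CPU near target_c by varying busywork on one core.

    Gains and target are placeholder values and would need tuning.
    """
    os.sched_setaffinity(0, {heater_core})  # Linux only: pin this process
    pid = PID(kp=0.2, ki=0.02, kd=0.0, setpoint=target_c)
    duty = 0.0
    last = time.monotonic()
    while True:
        for _ in range(10):  # roughly one second of work per control step
            burn(duty)
        now = time.monotonic()
        duty = pid.update(read_cpu_temp_c(), now - last)
        last = now
```

Calling `run()` starts the loop on the chosen core; the timing-critical daemon would be pinned to a different core by separate means. The controller output is treated as a duty cycle between 0 and 1, so the core does more work when the chip is below the target and less when it is above.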
"The improvement was immediately visible," Pivarnik says. "The RMS offset is chronyd's estimate of the timing uncertainty. Cutting this nearly in half means the system is maintaining significantly better time accuracy. Is it worth it? For 99.999% of use cases: absolutely not. Most applications don't need better than millisecond accuracy, let alone the 35-nanosecond RMS offset Iβm achieving. Even for distributed systems, microsecond-level accuracy is typically overkill. For me, this falls squarely in the 'because you can' category."
For those who would like to try for maximum timing accuracy themselves, Pivarnik has written up the project in full, including source code, on his website.
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.