The Raspberry Pi Gets NVIDIA Horsepower
The latest and greatest NVIDIA GPUs can now accelerate AI workloads on the Raspberry Pi, thanks to a new driver patch.
In the world of edge AI, the Raspberry Pi is something of a little beast. But that, of course, is only when it’s stacked up against extremely resource-constrained devices like microcontrollers. Step into the realm of cutting-edge generative AI algorithms, such as large language models, and a Raspberry Pi will quickly be brought to its knees. It simply doesn’t have the parallel processing capability or memory needed to run these models efficiently (or at all).
Or at least, that used to be the case. Jeff Geerling recently described some advances that made it possible for him to run virtually any NVIDIA GPU on a Raspberry Pi. Now, connecting an RTX A4000 GPU to a Raspberry Pi may not make a lot of sense. If you need that kind of horsepower, you are going to hook it up to a proper computer so that you don’t introduce any unnecessary bottlenecks that slow things down. But practicality has never stopped Geerling, so he did it anyway, just because he could.
A few months ago, the community saw AMD GPUs come to life on the tiny board thanks to a 15-line driver patch from GitHub user yanghaku. This week, GitHub user mariobalanica pushed the effort even further with a significantly larger kernel module patch enabling NVIDIA GPU support across ARM platforms, including the Raspberry Pi. With a Pi 5, a fresh install of Pi OS 13, and the NVIDIA 580.95.05 ARM64 driver, Geerling compiled the open-source kernel modules and booted the system. nvidia-smi properly recognized the RTX A4000 hanging off the Pi’s PCIe link, reporting power usage, temperature, and memory stats as if it were running in a workstation.
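The general workflow Geerling describes can be sketched roughly as follows. The driver version (580.95.05 ARM64) comes from the article, but the exact installer file name, flags, and prerequisite packages here are assumptions for illustration, not his documented steps:

```shell
# Sketch of installing NVIDIA's open-source kernel modules on ARM64.
# Assumes the patched kernel and the 580.95.05 ARM64 .run installer
# have already been obtained; file name is illustrative.

# Build prerequisites for compiling kernel modules
sudo apt update && sudo apt install -y build-essential

# Run the NVIDIA installer, selecting the open kernel modules
sudo sh NVIDIA-Linux-aarch64-580.95.05.run -m=kernel-open --ui=none --no-questions

# After a reboot, the card hanging off the Pi's PCIe link should enumerate,
# reporting power, temperature, and memory stats
nvidia-smi
```

The `-m=kernel-open` flag selects the open-source kernel module flavor of the driver, which is what the community patches target on ARM platforms.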
Despite this success, no image appeared over DisplayPort, even with the Pi’s onboard GPU disabled. But compute performance, which was the real reason to strap a workstation-class GPU to this tiny computer, is promising. Using llama.cpp with Vulkan acceleration, Geerling achieved impressive inference speeds for a 3B-parameter language model.
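A setup along these lines can reproduce that kind of test. The build flag and CLI options below are standard llama.cpp usage; the specific model file name is a placeholder, not the one from the article:

```shell
# Sketch: build llama.cpp with its Vulkan backend and run a ~3B model.
# Model file name is illustrative; substitute any GGUF model you have.

git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON      # enable the Vulkan compute backend
cmake --build build --config Release -j

# -ngl 99 offloads all model layers to the GPU rather than the Pi's CPU
./build/bin/llama-cli -m model-3b-q4_k_m.gguf -ngl 99 \
    -p "Hello from a Raspberry Pi"
```

Because Vulkan is a cross-vendor API, the same build works against both the AMD and NVIDIA cards that have now been coaxed into running on the Pi.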
The same basic approach should work on other ARM boards as well, including faster systems like those based on the RK3588. As the driver patches evolve and display support improves, the future may bring full GPU workflows to devices that fit in the palm of your hand — whether that makes any sense to do or not.