Getting Out of the Game

Hacker nocoffei showed that the GPU in your Nintendo Switch can do more than pump out polygons by running an LLM on the console.

Nick Bild
11 months ago • AI & Machine Learning
The world's smartest Nintendo Switch runs an LLM (📷: nocoffei)

It’s a real bummer being GPU-poor in this day and age. It seems like artificial intelligence (AI) algorithms are powering everything these days, and if you do not have the computational horsepower to run them, you are out of luck. You may have plenty of ideas about how you can leverage AI to make your life better, or make the world a better place, but without powerful GPUs, there is no way to bring those ideas to fruition.

Without GPUs, you might as well just sit around playing video games all day, wasting your life. Why not get ahead of the curve and move into your parents’ basement while you are at it, before AI eats the world and leaves you behind? Wait a minute… video games? Don’t modern game systems need powerful GPUs to pump out all of those polygons? Hmm…

Serial hacker nocoffei of the Insane Rambles About Technology blog was looking for an interesting AI project to work on, and realized that the Nintendo Switch is equipped with a stock NVIDIA Tegra X1 system on a chip, which has a Maxwell-architecture GPU with 256 cores. This platform may be a decade old and lacking power by today’s standards, but hey, it’s still a decent GPU. And unlike all of the fancy, shiny GPUs that you have no hope of ever owning, this one is sitting in your living room right now. So nocoffei attempted to get a modern large language model (LLM) running on a Switch.

As it turns out, most of the work has already been done. Linux4Tegra — which is exactly what it sounds like: Linux for NVIDIA devices such as the Tegra X1 — has already been ported to the Switch. This port includes a working copy of the CUDA platform, which is used for general-purpose computing on NVIDIA GPUs. Furthermore, the open source library llama.cpp packages up LLMs and makes it easy to run them on just about any platform (and it can take advantage of GPU acceleration via CUDA).
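As a rough sketch of how those pieces fit together (the repository URL and `llama-cli` are llama.cpp's current conventions; the model file, flags, and paths here are illustrative placeholders, not nocoffei's exact recipe), building llama.cpp with CUDA enabled looks something like:

```shell
# Build llama.cpp with CUDA acceleration enabled.
# CMAKE_CUDA_ARCHITECTURES=53 targets the Tegra X1's Maxwell GPU
# (compute capability 5.3); adjust to taste.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=53
cmake --build build --config Release -j4

# Run a quantized model, offloading all layers to the GPU
# ("model.gguf" is a placeholder for whatever model you download).
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello, Switch!"
```

On a stock desktop this is all it takes; the Switch-specific pain, as described below, comes from the Tegra X1's aging CUDA support.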

These components are all that are needed, yet ten years is a very long time in the world of technology, and support for the Tegra X1 is fading. It does not support the most recent versions of CUDA, which are leveraged by a library that llama.cpp relies on. No problem, thought nocoffei, who spent a weekend backporting an older version of CUDA into that library.

And with that, nocoffei’s Switch had entered the modern age. Well, sort of. To be sure, it ran the LLM without a problem, but at about four tokens per second, it was not exactly blazing fast. That is reasonable throughput, however, for a novel experimentation platform.
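To put four tokens per second in perspective, a quick back-of-the-envelope estimate (the token counts for "typical" replies are rough assumptions, not figures from the write-up):

```python
# Rough feel for what 4 tokens/second means in practice.
TOKENS_PER_SECOND = 4.0  # throughput nocoffei observed on the Switch

def generation_time(num_tokens: float, tps: float = TOKENS_PER_SECOND) -> float:
    """Seconds needed to generate num_tokens at a fixed throughput."""
    return num_tokens / tps

# A one-sentence reply (~25 tokens) takes a few seconds,
# while a long paragraph (~200 tokens) takes the better part of a minute.
print(generation_time(25))   # 6.25 seconds
print(generation_time(200))  # 50.0 seconds
```

Slow by datacenter standards, but perfectly usable for tinkering.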

Are you ready to jump in and void your warranty today? Before you do, take a close look at nocoffei’s detailed project write-up so that you are prepared for the work ahead of you. And once you get an LLM running, you might want to try running some other tools as well. With a Linux environment and CUDA, you could probably get a text-to-image generator, and some other interesting applications, up and running on a Switch without too much difficulty.
