Think Different
YouTuber KenDesigns used a few tricks to run an MNIST neural network on a 41-year-old Macintosh 512K, because why not?
Classifying handwritten digits with a simple neural network trained on the MNIST dataset is one of the first projects many people take on when they are getting started with machine learning. Typically, they follow a step-by-step tutorial on a desktop computer using a popular framework like TensorFlow or PyTorch. It is a fairly simple algorithm, so the bar is pretty low in terms of system requirements, and just about any modern machine can handle the computations.
But YouTuber KenDesigns wanted to find out just how low you can go. Could his 41-year-old Macintosh 512K, for instance, run this type of neural network? Maybe even the original Macintosh 128K? That is some seriously limited hardware by today's standards, but KenDesigns was determined to try.
There are two primary problems standing in the way of getting any neural network to run on old hardware. First and foremost, the entire network needs to be loaded into main memory before an inference can run, and with only 128K available, that is not easy. Second, these algorithms perform many floating-point operations, and the original Mac has only its ancient Motorola 68000 CPU to execute them, with no separate floating-point unit (FPU) to help with the load.
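To get a rough sense of the scale of the problem, here is a back-of-the-envelope calculation for a hypothetical 784-16-10 multilayer perceptron (the layer sizes are illustrative assumptions, not the exact network from the video):

```c
/* Back-of-the-envelope memory math for a hypothetical 784-16-10 MLP.
 * The layer sizes are illustrative, not the network from the video. */
#include <stdio.h>

int main(void)
{
    long params = 784L * 16 + 16     /* hidden layer weights + biases */
                + 16L  * 10 + 10;    /* output layer weights + biases */

    printf("parameters:       %ld\n", params);            /* 12,730      */
    printf("as 32-bit floats: %ld bytes\n", params * 4);  /* ~50 KB      */
    printf("as 8-bit ints:    %ld bytes\n", params * 1);  /* ~12 KB      */
    return 0;
}
```

Even this tiny configuration needs roughly 50 KB for 32-bit weights, a large slice of 128K before the screen buffer and the application itself are accounted for.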
Initially, KenDesigns considered a naive approach to the problem. The memory issue could be solved by simply reducing the neural network's parameter count, and as far as the floating-point math is concerned, there are well-known methods for emulating an FPU in software. There is no question that this approach would work, but the smaller model's classification accuracy suffers badly, and FPU emulation is extremely slow, making for a terrible experience all around.
To make for a much better experience, KenDesigns turned to a more sophisticated technique called quantization. In a nutshell, quantization converts the network's 32-bit floating-point weights into 8-bit integers. These integers convey virtually the same information as their floating-point counterparts, but occupy only a quarter of the memory. What's more, they can be operated on directly, almost eliminating the need to work with floating-point values at inference time.
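As a rough illustration of the idea (a minimal sketch with made-up values, not KenDesigns' actual implementation), here is how a single layer's weights and inputs might be quantized to 8-bit integers and multiplied entirely in integer arithmetic, with only the final result rescaled to a float:

```c
/* Minimal sketch of symmetric 8-bit quantization for one dot product.
 * Sizes and values are made up for illustration. */
#include <stdio.h>
#include <stdint.h>
#include <math.h>

#define N_IN 4  /* toy input size */

/* Quantize floats to int8 with a single scale: q = round(x / scale),
 * where scale = max(|x|) / 127. Returns the scale. */
static float quantize(const float *x, int8_t *q, int n)
{
    float max_abs = 0.0f;
    for (int i = 0; i < n; i++) {
        float a = fabsf(x[i]);
        if (a > max_abs) max_abs = a;
    }
    float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
    for (int i = 0; i < n; i++)
        q[i] = (int8_t)lroundf(x[i] / scale);
    return scale;
}

int main(void)
{
    float w[N_IN] = { 0.12f, -0.80f, 0.45f, 0.03f };  /* weights */
    float x[N_IN] = { 0.00f,  1.00f, 0.50f, 0.25f };  /* input   */
    int8_t wq[N_IN], xq[N_IN];

    float w_scale = quantize(w, wq, N_IN);
    float x_scale = quantize(x, xq, N_IN);

    /* The dot product stays in integer arithmetic; only the final
     * accumulator is rescaled back to a float. */
    int32_t acc = 0;
    for (int i = 0; i < N_IN; i++)
        acc += (int32_t)wq[i] * (int32_t)xq[i];

    float y = acc * w_scale * x_scale;

    /* Reference result computed entirely in floating point. */
    float y_ref = 0.0f;
    for (int i = 0; i < N_IN; i++)
        y_ref += w[i] * x[i];

    printf("quantized: %f   float reference: %f\n", y, y_ref);
    return 0;
}
```

The key point is that the inner loop uses only integer multiplies and adds, which the 68000 can execute natively, while floating-point math appears just once, outside the loop.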
With the core algorithm worked out, KenDesigns built an application called MacNIST68000 that runs on bare metal to avoid operating system overhead (see his recent project for details on how that works). Using MacNIST68000, a user can draw a digit on the screen with the mouse, then run the neural network to classify it. In general, it works quite well, although someone who knows what they are doing can fool it, for instance by drawing a small digit in a corner of the screen.
If you want to see some more projects that squeeze machine learning algorithms into old hardware platforms, you might be interested in TensorFlow Lite for the Commodore 64 or an AI-powered competitor in Combat for the Atari 2600.