Running a convolutional neural network (CNN) for artificial intelligence is a resource-intensive task, especially if it involves processing images or video. For that reason, most AI applications are run on powerful computers or cloud servers. The problem is that doing so requires either a bulky local computer or sending video out over the internet and waiting for a response. In many cases, it's more desirable to run the AI "on the edge." The NVIDIA Jetson Nano is a small single-board computer (SBC) designed to do exactly that, and Nick Bild is using one to play Doom using full-body gestures.
Doom is a video game that was originally released in 1993, but it has remained relevant for two reasons: it was one of the earliest first-person shooter (FPS) games and laid the groundwork for the entire genre, and it has become a popular benchmark. If you can make a gadget run Doom, you've proven the hardware is capable.
That's especially true if you can run Doom in a new and interesting way, which is exactly what Bild has done by using a CNN running on an NVIDIA Jetson Nano to recognize body gestures and use them as commands for the game. Before the neural network could recognize those body gestures, it first had to be trained on them. In total, the network was trained on 3,300 images corresponding to 11 different control commands, including walk forward, shoot, jump, crouch, and god mode.
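Bild's actual model architecture isn't detailed here, but a small image classifier with 11 output classes (one per control command) could be sketched in PyTorch roughly as follows. The layer sizes, input resolution, and training loop are all assumptions for illustration, and random tensors stand in for the 3,300 labeled gesture images.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 11  # one class per gesture/control command

class GestureCNN(nn.Module):
    """Minimal convolutional classifier sketch (architecture is assumed)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # 64x64 input halved twice by pooling -> 16x16 feature maps
        self.classifier = nn.Linear(32 * 16 * 16, NUM_CLASSES)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = GestureCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Placeholder batch: in the real project this would be labeled camera frames.
images = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, NUM_CLASSES, (8,))

logits = model(images)          # shape: (8, 11)
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()
```

In practice the whole training set would be iterated over for many epochs, and the trained weights would then be deployed to the Jetson Nano for inference.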
After training, the NVIDIA Jetson Nano watches Bild through an attached CSI (Camera Serial Interface) camera. Whenever it detects one of those gestures, it notifies a REST API running on a laptop, which in turn simulates a physical key press. A projector connected to the laptop puts the game screen in front of Bild while he plays. The result is an entirely new, if not particularly practical, way to play Doom.
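The laptop side of that pipeline could be sketched with Python's standard-library HTTP server: an endpoint receives the recognized gesture name and maps it to the key press to simulate. The endpoint shape, gesture names, and key bindings below are all assumptions, not Bild's actual implementation, and the key-injection step is left as a comment since it would depend on a keyboard-automation library.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical gesture-to-key bindings; the real project's names and
# mappings are not documented here.
GESTURE_KEYS = {
    "walk_forward": "up",
    "shoot": "ctrl",
    "jump": "space",
    "crouch": "c",
    "god_mode": "iddqd",  # classic Doom cheat, typed as a key macro
}

def gesture_to_key(gesture):
    """Resolve a recognized gesture name to the key (or macro) to press."""
    return GESTURE_KEYS.get(gesture)

class GestureHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        key = gesture_to_key(payload.get("gesture", ""))
        if key is None:
            self.send_response(404)
            self.end_headers()
            return
        # A real server would inject the key press here with a
        # keyboard-automation library; this sketch only acknowledges it.
        self.send_response(200)
        self.end_headers()
        self.wfile.write(key.encode())

# To run on the laptop:
# HTTPServer(("0.0.0.0", 8000), GestureHandler).serve_forever()
```

The Jetson Nano would then POST a small JSON body such as `{"gesture": "jump"}` to this server each time the classifier fires.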