Jetson Nano Mobile Object Tracking

Learn how to use machine learning and computer vision with your Jetson Nano to let it track recognized objects.

The Jetson Nano

The Jetson Nano has a Maxwell-based GPU with 128 CUDA cores, capable of roughly 0.5 teraflops (0.5 TFLOPS). With its quad-core Arm Cortex-A57 processor and 4GB of LPDDR4 RAM, the Jetson Nano is a powerful computer in a small package. It also supports numerous peripherals, including gigabit ethernet, HDMI 2.0, DisplayPort 1.4, 2 DSI connectors, an M.2 PCIe connector, 4 USB 3.0 ports, and two CSI camera connectors (for use with cameras like the Pi Camera). All of this connectivity makes it an ideal platform for AI-at-the-edge projects.

Necessary Hardware

Of course, this project relies on the Jetson Nano as its main board, but it also requires a few more items to work. First, there has to be a camera to gather image data, and for that I used a Raspberry Pi Camera Module V2. I had tried using a USB webcam, but the jetbot library is not compatible with it yet.

Next, I added a wireless keyboard/mouse combo and WiFi dongle to allow for communication with other devices.

Finally, I connected a mini HDMI monitor to the Nano in order to view the output of the camera.

Setting Up the Device

The Jetson Nano that I am using is set up for robotics applications and uses NVIDIA's JetBot platform for its libraries. You can download the JetBot OS image and flash it to an SD card using balenaEtcher. Then simply insert the SD card and boot. If you want to connect to WiFi, use the network menu in the Ubuntu GUI, select your network, and sign in.

Setting Up Software

Most of the Python libraries are already installed and ready to use, but two items are still needed: PyGame and the pre-trained model. Additionally, you can install the latest jetbot library by following the directions here. PyGame has to be built from source, as attempting to install it with pip3 leads to an error. Simply copy and paste the following commands in order:

sudo apt-get install mercurial
hg clone
cd pygame
sudo apt-get install python3-dev python3-numpy libsdl-dev libsdl-image1.2-dev \
libsdl-mixer1.2-dev libsdl-ttf2.0-dev libsmpeg-dev libportmidi-dev \
libavformat-dev libswscale-dev libjpeg-dev libfreetype6-dev
sudo apt-get install python3-setuptools
python3 setup.py build
sudo python3 setup.py install

The pre-trained model is based on the MobileNet V2 architecture and was trained on the COCO dataset. Download it and place it in the same folder as your .py file. This model is what detects objects in the camera's images.

You can view the code yourself here.

Getting and Processing Image Data

Image data is collected by reading the raw camera frame, which is an array, from the Camera instance. The frame is then passed to the model, which returns an array of the detected objects.
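To make that step a little more concrete, here is a minimal sketch of how the returned detections could be turned into pixel coordinates for drawing. It assumes each detection is a dictionary with a 'label', a 'confidence', and a normalized 'bbox' of [x1, y1, x2, y2], which is how the jetbot examples appear to return them. The to_pixel_boxes helper, the 0.5 confidence cutoff, and the sample data are all made up for illustration and are not taken from the project's code.

```python
def to_pixel_boxes(detections, width, height, min_confidence=0.5):
    """Keep confident detections and scale their boxes to pixel coordinates.

    Assumes each detection looks like
    {"label": int, "confidence": float, "bbox": [x1, y1, x2, y2]}
    with bbox values normalized to the 0..1 range.
    """
    boxes = []
    for det in detections:
        if det["confidence"] < min_confidence:
            continue  # skip weak guesses
        x1, y1, x2, y2 = det["bbox"]
        boxes.append({
            "label": det["label"],
            "confidence": det["confidence"],
            "rect": (int(x1 * width), int(y1 * height),
                     int(x2 * width), int(y2 * height)),
        })
    return boxes


# On the Nano, the detections would come from the camera and model, roughly
# like this (assumed from the jetbot examples, not copied from this project):
#   from jetbot import Camera, ObjectDetector
#   camera = Camera.instance(width=300, height=300)
#   model = ObjectDetector('ssd_mobilenet_v2_coco.engine')
#   detections = model(camera.value)[0]

# Hypothetical sample data so the helper can be tried without the hardware.
sample = [
    {"label": 1, "confidence": 0.9, "bbox": [0.25, 0.5, 0.75, 1.0]},
    {"label": 17, "confidence": 0.2, "bbox": [0.0, 0.0, 0.5, 0.5]},
]
boxes = to_pixel_boxes(sample, width=300, height=300)
print(boxes)
```

The low-confidence detection is dropped, and the remaining box is expressed in pixels for a 300x300 frame, ready to be handed to a drawing routine.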

This is all fine, and the project could now be considered completed, but what if we wanted to actually see what the camera sees?

Displaying with Pygame

After all of the objects have been detected, the first thing the program does is draw bounding boxes around each object with OpenCV. Since the OpenCV image is an array, it then needs to be converted to a pygame Surface object with pygame.surfarray.make_surface(). Next, a label is drawn showing what the model thinks each item is. This is done by rendering text and then using pygame's blit function to place it on top of the image. Finally, the whole frame is scaled and transformed so that it is oriented correctly, and then it is drawn to the screen.
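The orientation fix is needed because pygame.surfarray treats the first array axis as x (width), while an OpenCV image puts rows (height) first. The sketch below shows one way to handle that conversion up front. It assumes the frame is a uint8 array in OpenCV's usual BGR order, and both function names are hypothetical. This is an alternative to scaling and transforming the finished Surface, not necessarily what the project's code does.

```python
import numpy as np


def bgr_to_surface_array(frame_bgr):
    """Reorder a (height, width, 3) BGR image into the (width, height, 3)
    RGB layout that pygame.surfarray.make_surface() expects."""
    rgb = frame_bgr[:, :, ::-1]           # BGR -> RGB
    swapped = np.swapaxes(rgb, 0, 1)      # (H, W, 3) -> (W, H, 3)
    return np.ascontiguousarray(swapped)  # copy into a plain contiguous array


def to_surface(frame_bgr):
    """Build a pygame Surface from an OpenCV-style frame."""
    import pygame  # imported here so the helper above works without pygame
    return pygame.surfarray.make_surface(bgr_to_surface_array(frame_bgr))


# Tiny 2x3 test frame with a single pure-blue pixel at row 0, column 2.
frame = np.zeros((2, 3, 3), dtype=np.uint8)
frame[0, 2] = (255, 0, 0)  # blue in BGR order
converted = bgr_to_surface_array(frame)
print(converted.shape)
```

With the axes swapped first, the Surface should come out upright, so only scaling (for example with pygame.transform.scale) would be left before blitting it to the screen.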

If any key is pressed while the program is running, the program exits.

Arduino “having11” Guy
19 year-old IoT and embedded systems enthusiast. Also produce content for and love working on projects and sharing knowledge.