Face and Emotion Recognition with Just an NXP i.MX RT Microcontroller

Hands-on experience with NXP's i.MX RT, an Arm-based solution that does not need the cloud.

SLN-VIZN-IOT evaluation module, with camera

Mobile phones continue to teach us new ways to interact with computers, and those interactions make their way into our other devices. For example, face recognition is quickly becoming an alternative to usernames, passwords, and fingerprints. Unlike a passcode, or even a fingerprint, a face is challenging to recognize accurately. There can be sampling issues such as lighting or camera angle, and there can be human challenges such as aging. Machine learning algorithms help with these challenges. In the past, it took the full power of a high-end mobile phone or the help of the cloud to make face recognition work reliably and securely.

With NXP's i.MX RT106F, it is possible to get secure face recognition from a low-cost Arm Cortex-M7 processor that operates without access to the cloud. The SLN-VIZN-IOT evaluation kit is an example of a "turn-key face recognition" solution based on the i.MX RT microcontroller.

To understand what a "turn-key face recognition solution" means, we need to first look at the popular face recognition options readily available today and then do a hands-on with the IoT kit.

These are not the only computer vision tools available on the market. However, the list below covers the range of existing solutions: purpose-built libraries, cloud-based services, and machine learning frameworks.

Purpose-Built Library: OpenCV

OpenCV is an open-source computer vision and machine learning library. It runs on all major platforms, such as PCs, iOS, and Android, with C++ and Python being the most popular interfaces. Java and MATLAB interfaces increase OpenCV's flexibility for various applications, but all of these applications tend to run on high-end computing platforms. With that computing power behind it, OpenCV is well suited to real-time applications like detecting intrusions in a surveillance video feed, monitoring equipment, and helping robots navigate mazes or pick up objects. OpenCV generally runs without Internet access, but it does have relatively hefty hardware requirements.

Cloud-Based Service: Azure Computer Vision

Cloud-based services like the Microsoft Azure Computer Vision service can be used for digital asset management, such as detecting objects, brands, or people in images. A service like this one focuses on individual images or single frames from a video. In other words, it is not meant for video, or at least not for real-time video.

It is important to verify what happens to data uploaded to a cloud service. Azure Computer Vision, for example, commits to removing data and not using it to help train its underlying algorithms.

On the other hand, the upside to the cloud approach is that it scales quickly as an application grows.

Machine Learning Framework: TensorFlow

TensorFlow is a software framework for machine learning. Unlike OpenCV, TensorFlow serves many machine learning applications, with computer vision as just one of them. That said, there are some off-the-shelf options for an application like face recognition. A benefit of this framework is the flexibility it gives an application.

As with OpenCV, you could use it as a (self-hosted) cloud-based service. For hardware requirements, there is a version called TensorFlow Lite, which targets mobile and embedded devices, along with a TensorFlow Lite for Microcontrollers variant. As a framework, this enables broad adaptability in potential applications, but it may not offer an optimized solution for a specific application, like face recognition on an Arm-based processor.

NXP's EdgeVerse Is the Fourth Option

Consider those three existing options. If you need a turn-key solution that is optimized for energy-efficient hardware, has a low bill-of-material cost, and does not rely on servers in the cloud, none of those three options fits well.

Such a situation is ideal for NXP's EdgeReady products. They fall under NXP's EdgeVerse, which is a comprehensive edge computing and security platform. EdgeVerse is intended to bring together the building blocks for high-performance, energy-efficient, and scalable solutions for industrial and IoT applications. EdgeReady products fit into categories like face recognition, Alexa voice service, and local voice control.

i.MX RT106F Processor

An example of an EdgeReady solution is the combination of the i.MX RT106F processor and the Oasis software library. Both come in a self-contained evaluation kit named SLN-VIZN-IOT. It demonstrates a turn-key facial recognition system that runs without access to the Internet on an Arm Cortex-M7 running FreeRTOS. There is no need to dig into the complexity of machine learning libraries, APIs, or frameworks to add face recognition to an application.

At the core is an NXP i.MX RT106F processor based on an Arm Cortex-M7 running at 600 MHz. It has a built-in floating-point unit along with a wide array of peripherals. There are four watchdog timers, six general-purpose timers, and an IO multiplexer. There are also several peripherals dedicated to motor control: four quadrature encoders, four QuadTimers, and four FlexPWMs.

The SLN-VIZN-IOT reference design contains additional devices, some non-NXP, to support evaluating NXP's turn-key solution for face recognition applications. Here is a look at the extra hardware included.

SLN-VIZN-IOT Hardware Blocks

There are two boards in the SLN-VIZN-IOT kit. On the first board, you find the i.MX RT as a system on module (SoM). The other board is the application board. It has various sensors, including the vision sensor--aka camera.

On the i.MX RT processor module is an i.MX RT106F processor. The SLN-VIZN-IOT includes 256 Mbit (32 MB) of HyperFlash memory and 256 Mbit (32 MB) of SDRAM. A Wi-Fi and BT/BLE module connects over SDIO and UART and has a u.FL connector for an antenna. (There is no antenna included with the evaluation module.)
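The megabit and megabyte figures quoted for the module are easy to check by hand. As a minimal sketch of that arithmetic (the function name is only illustrative), dividing by eight bits per byte gives the sizes in parentheses:

```python
def mbit_to_mbyte(mbit: int) -> float:
    """Convert a memory size in megabits to megabytes (8 bits per byte)."""
    return mbit / 8


# Sizes quoted for the SLN-VIZN-IOT processor module, in megabits.
hyperflash_mbyte = mbit_to_mbyte(256)
sdram_mbyte = mbit_to_mbyte(256)

print(f"HyperFlash: {hyperflash_mbyte:.0f} MB, SDRAM: {sdram_mbyte:.0f} MB")
# prints "HyperFlash: 32 MB, SDRAM: 32 MB"
```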

Across general-purpose 40-pin board-to-board connectors are connections to a pair of pulse density modulated (PDM) digital microphones, an I2S audio amplifier, a serial port, a couple of status LEDs, and four surface-mount push buttons.

Another 40-pin board-to-board connector provides GPIO access to the PIR (motion) sensor, CSI bus to the camera connector, SPI to a display connector (not included with the evaluation module), and a USB-C connector.

For battery-based applications, NXP's own MC34671AEP battery manager solution provides charging to a user-supplied LiPo battery.

The camera module has an image sensor with an RGB pass filter. As discussed later in the hands-on section, this single RGB camera makes the kit suitable for evaluating non-security applications.

The other piece of the EdgeReady solution is the included software library, which is called Oasis.

Oasis

NXP's Oasis is a facial recognition run-time library. Its processing engine uses a neural network. Unlike the solutions mentioned earlier, it does not require a cloud connection or a PC. The SLN-VIZN-IOT evaluation kit is running an operating system, but one that is lightweight. An Arm-based i.MX RT106F processor running FreeRTOS manages the face recognition process.

From the sparse details available about Oasis, it does not appear that the user has direct access to the neural network used. Instead, it seems NXP intends you to use it as an off-the-shelf solution. As the evaluation module shows, the software library performs the initial face detection, can apply anti-spoofing (with two cameras), aligns the matrix, and then performs the face recognition and reports a confidence factor.
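To make the order of those stages concrete, here is a rough Python sketch of a detect, anti-spoof, align, and recognize flow. It is not the Oasis API, which is not documented in enough detail to reproduce here. Every name in it (run_pipeline, recognize, the stage callables, the 0.8 threshold) is hypothetical, and the cosine-similarity match is a common stand-in technique, not a description of what NXP's neural network actually does.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Vector = List[float]


@dataclass
class Result:
    user: Optional[str]  # None means a face was seen but no registered user matched
    confidence: float    # cosine similarity of the best match, higher is closer


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Similarity of two feature vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(features: Vector, enrolled: Dict[str, Vector],
              threshold: float = 0.8) -> Result:
    """Return the closest enrolled user, or user=None if the match is too weak."""
    best_user, best_score = None, -1.0
    for name, reference in enrolled.items():
        score = cosine_similarity(features, reference)
        if score > best_score:
            best_user, best_score = name, score
    if best_score < threshold:
        return Result(None, best_score)
    return Result(best_user, best_score)


def run_pipeline(frame: object,
                 detect: Callable[[object], Optional[object]],
                 is_live: Optional[Callable[[object], bool]],
                 align_and_embed: Callable[[object], Vector],
                 enrolled: Dict[str, Vector]) -> Optional[Result]:
    """Run the stages in order; return None when no usable face is found."""
    face = detect(frame)
    if face is None:
        return None  # no face in this frame
    if is_live is not None and not is_live(face):
        return None  # anti-spoofing rejected the face (needs the second camera)
    return recognize(align_and_embed(face), enrolled)


# Usage with stand-in stages: the "camera frame" is just a placeholder string,
# and anti-spoofing is skipped (is_live=None), as on a single-camera setup.
enrolled_users = {"user_1": [1.0, 0.0], "user_2": [0.0, 1.0]}
result = run_pipeline("frame", lambda f: f, None, lambda face: [0.9, 0.1],
                      enrolled_users)
print(result.user)  # prints "user_1"
```

Passing is_live=None mirrors the single-camera case, where a photograph of an enrolled face would pass straight through to the recognition step.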

To evaluate how accurate and fast the solution is, I performed some hands-on testing with the demonstration code provided by the kit.

Hands-On

The SLN-VIZN-IOT kit comes as a nice module with the evaluation board outfitted with an RGB camera. The on-board pushbuttons access a few of the functions that show off the Oasis face recognition capabilities. The USB-C connector provides both power and a connection to a computer.

Over on the host operating system, the evaluation board appears as a USB camera. For this example application, you can store "usernames" based on just a person's face.

My face is now registered as "user_1" (renamed through the command line interface)

As mentioned, the one caveat is that the evaluation board only comes with a single RGB camera. Without an IR camera, it is quite easy to fool the Oasis engine with a photograph of a stored face. By adding an IR camera, however, it should be possible to use Oasis's anti-spoofing technology.

Detecting my face (and user) from a selfie on my phone

At first, some people might write off the SLN-VIZN-IOT's demo as unimpressive. But that would be a mistake. It is easy to forget that everything happens on the i.MX RT106F processor. That point is what makes the demonstration so powerful! While the example code runs FreeRTOS, it does not run a "full" operating system like Linux, nor does it communicate in any way with the cloud. Plus, the demo has one more trick up its sleeve.

(The red square means I am not a registered user)

Not only can the Oasis software library detect a person's face, but it can also determine their mood! There are three emotion modes, which trade off accuracy against the number of emotions they can detect. The most basic mode can detect whether the face is expressing happiness. A middle mode adds anger and surprise, and the last mode adds sadness, fear, and disgust.
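Because each mode builds on the one before it, the three modes can be pictured as cumulative sets. The Python sketch below only illustrates that relationship. The mode names (BASIC, MIDDLE, FULL) and the function are hypothetical and are not identifiers from the Oasis library.

```python
from enum import IntEnum
from typing import List


class EmotionMode(IntEnum):
    """Hypothetical names for the three emotion modes described above."""
    BASIC = 1
    MIDDLE = 2
    FULL = 3


# Emotions each mode adds on top of the modes below it.
ADDED_BY_MODE = {
    EmotionMode.BASIC: ("happiness",),
    EmotionMode.MIDDLE: ("anger", "surprise"),
    EmotionMode.FULL: ("sadness", "fear", "disgust"),
}


def detectable_emotions(mode: EmotionMode) -> List[str]:
    """List every emotion a mode can report, including those of lower modes."""
    emotions: List[str] = []
    for level in EmotionMode:  # iterates in definition order: BASIC, MIDDLE, FULL
        if level <= mode:
            emotions.extend(ADDED_BY_MODE[level])
    return emotions


print(detectable_emotions(EmotionMode.MIDDLE))
# prints ['happiness', 'anger', 'surprise']
```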

Unfortunately, there does not appear to be a way to interact with the PIR motion sensor, PDM microphones, audio amplifier, or the display connector without modifying the code and re-programming the board with an external JTAG programmer. However, these additional devices would appear to be an excellent complement to the types of sensors found in an IoT device based around the i.MX RT106F.

NXP's MCUXpresso IDE, used together with the MCUXpresso SDK, is the primary development environment for this kit. The IDE is based on Eclipse. The documentation and getting started guide mention that the SLN-VIZN-IOT evaluation kit does require a SEGGER JTAG programmer. There are references to programming over-the-air (OTA) and over USB, but no clear examples of how to accomplish re-programming with those methods.

Not having the JTAG programmer limits the evaluation to the example code shipped on the board. For that reason, my hands-on evaluation did not extend to the additional hardware devices mentioned above.

Conclusion

The fundamental advantage of NXP's face recognition solution is that it runs entirely on a low-cost, energy-efficient processor like the i.MX RT106F. The addition of emotion detection expands the range of applications, from security to simple identification to reacting to a person's mood.

The SLN-VIZN-IOT evaluation board gives an excellent feel for what NXP's solution is capable of doing with a low-power microcontroller. It is worth checking out if your energy-efficient application requires face recognition but you do not want to rely on the cloud.

For additional information on this face recognition solution, check out the collaboration happening between NXP, Microsoft, and Avnet for cloud and edge artificial intelligence applications.

Electronics enthusiast, Bald Engineer, and freelance content creator. AddOhms on YouTube. KN6FGY.