Our world and our lives are dominated by patterns. With the right models and training data, low-power IoT devices operating on the edge can use machine learning to harness those patterns and unlock powerful use cases. Traditionally, planning a machine learning application meant facing a bewildering array of options. First, you had to understand how to build a machine learning algorithm. Then you needed to determine which combination of toolkits, platforms, connectivity models, and processor options would benefit your end application. Only at that point could you think about how to train the models, test them, and finally deploy. With so many steps, it was easy to get lost.
Fortunately, the Intel Neural Compute Stick 2 offers an ecosystem that simplifies that previously bewildering matrix and can be the focal point for a product that uses machine learning.
Please do not confuse the "Neural Compute Stick" with the "Compute Stick." While similarly named, they have very different applications. The NCS and NCS2 are not standalone computers; they are more like USB-based machine learning co-processors. In contrast, the older "Compute Stick" was a complete computer in the form of a small stick, with an HDMI port for video output.
Intel's Neural Compute Stick 2 (NCS2) looks like a beefy USB dongle. Inside is the Movidius Myriad X vision processing unit (VPU). At first glance, the device might seem to be a "machine learning accelerator," and depending on your host platform, it can be. Connected to a Raspberry Pi, for example, the NCS2 is a great fit. It does not add much to an Intel-based PC, though. Paired with a PC, it instead becomes a development platform: using the USB stick, you can see how a trained neural network would perform on an IoT device built around the Movidius processor.
In the quick start guide for the NCS2, Intel even suggests such a model: use a PC to train and optimize the neural network before deploying it to the VPU running on a Raspberry Pi. This is a reasonable and expected workflow because training a model is significantly more computationally intensive than inference. To understand why, let's take a deeper look at what the NCS2 does to help with machine learning.
Machine learning is the study of computer algorithms that improve automatically through experience. It works by building mathematical models based on known sample data, also sometimes called "training data." The intention is that the algorithm can make decisions about new data based on what it learned from its training.
The process starts with a blank neural network in need of training. Just as in a biological brain, training causes the neurons to form weighted connections. Each time the network is given new data, it makes a decision. Then it is told whether that decision was correct. With that simple feedback, different neurons acquire different weights and different connections. Over time, the neural network becomes more accurate in its predictions.
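That feedback loop can be sketched in a few lines of plain Python. The example below is purely illustrative and not tied to any framework: a single artificial neuron learns the logical AND function by guessing, receiving a right/wrong signal, and nudging its weights.

```python
def step(x):
    """Threshold activation: the neuron's yes/no decision."""
    return 1 if x >= 0 else 0

# Training data: input pairs and the "correct answer" for each.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0.0, 0.0]   # start with a "blank" network
bias = 0.0
rate = 0.1             # how strongly each piece of feedback adjusts the weights

for _ in range(20):                           # repeated exposure to the data
    for (x1, x2), target in samples:
        guess = step(weights[0] * x1 + weights[1] * x2 + bias)
        error = target - guess                # the simple yes/no feedback
        weights[0] += rate * error * x1       # strengthen or weaken
        weights[1] += rate * error * x2       # each connection
        bias += rate * error

# After training, the neuron reproduces AND for every input.
predictions = [step(weights[0] * a + weights[1] * b + bias)
               for (a, b), _ in samples]
```

Real networks have millions of weights and far richer feedback signals, but the mechanic is the same: guess, compare, adjust, repeat.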
This training process is why machine learning is considered a subset of the broader artificial intelligence field. In machine learning's case, the intelligence part is somewhat small. The neural network makes decisions based on data provided, but unlike a biological brain, no critical thinking occurs.
Many of us are already familiar with a prevalent machine learning training process. Google's reCAPTCHA may have asked you to identify elements like bicycles, crosswalks, or traffic lights in a series of low-quality pictures. While you might have realized you were proving to Google whether or not you are a human being, you were also training reCAPTCHA to identify those various objects—a win-win use of machine learning. For example, reCAPTCHA sometimes shows you pictures that it knows you should click, and sometimes it shows you images to see if you pick the same answer it did. With that simple action, it can determine if you are a person and it helps to train models for applications like self-driving cars.
There is a disparity between the computing power needed to train a model and the power needed to use that model to infer a result. Remember, the training process requires many iterations of taking in new data, making a guess, and then adjusting the network based on a simple "yes" or "no" response. Those adjustments amount to enormous numbers of independent arithmetic operations, which is why massively parallel platforms, such as modern server CPUs and GPUs, are so well suited to training.
At the end of training, there is a massive set of weights, effectively a database representing the connections made within the neural network. A desktop-class CPU or GPU could likely use this database as-is to analyze real-world data, and in applications with full access to cloud-connected computers or high-end processing platforms, that trained network might be ready for deployment. However, if you want the inference step to happen on the edge, without the cloud or a high-end processor platform, the dataset needs further processing.
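To see why inference is so much cheaper than training, note that once the weights are fixed, a prediction is just a forward pass: multiply, accumulate, squash. This illustrative sketch (with made-up weights for a two-input neuron) has no training loop at all, which is exactly the kind of fixed arithmetic a small, low-power chip can handle.

```python
import math

# Hypothetical weights from an already-trained two-input neuron.
WEIGHTS = (0.8, -0.4)
BIAS = 0.1

def infer(x1, x2):
    """One forward pass with frozen weights: no feedback, no updates."""
    total = WEIGHTS[0] * x1 + WEIGHTS[1] * x2 + BIAS
    return 1.0 / (1.0 + math.exp(-total))   # sigmoid activation

score = infer(1.0, 0.0)   # a single prediction, a handful of operations
```

Scale that up to millions of weights and the pattern holds: inference is a bounded, predictable stream of multiply-accumulate work, which dedicated silicon can execute very efficiently.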
How do you take a model trained on your desktop's GPU and convert it into one that a low-power processor, like the Intel Movidius, can use while running inside a battery-powered IoT device? That exact situation is where OpenVINO comes in.
OpenVINO is short for Open Visual Inference and Neural network Optimization. It is an open source toolkit that takes a trained model and optimizes it for a target platform. Supported targets include desktop-class CPUs, GPUs, FPGAs, and VPUs. A VPU, or Vision Processing Unit, is a processor that contains a dedicated neural compute engine. The most popular OpenVINO target is the Movidius VPU in the Neural Compute Stick 2. Since OpenVINO is licensed under Apache 2.0, third parties can add support for other hardware through plug-ins.
For clarity, OpenVINO is not a machine learning framework itself. As its name says, it is a model optimizer. You must use a framework, such as TensorFlow or Caffe, to generate the trained model; OpenVINO takes an inference model developed in one of those frameworks and optimizes it for the target processor. An upside to this approach is that, in most cases, you can start working with machine learning algorithms without any additional hardware.
However, for resource-constrained platforms, the Neural Compute Stick 2 serves two purposes. First, when plugged into the USB port of a low-power host, the NCS2 runs the optimized neural network far faster, and at far lower power, than the host could on its own. Second, the NCS2 supports development, training, and performance testing targeting an end product based on the Movidius VPU. Using the USB device on a PC helps you gauge how the neural network will perform when deployed to a dedicated VPU.
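As a rough sketch of that workflow, the snippet below uses the classic (circa-2020) OpenVINO Inference Engine Python API to load an already-converted model onto a device; "MYRIAD" is the device name the plugin uses for the NCS2, and swapping it for "CPU" runs the same network on the host for comparison. The file names are placeholders, and the import is guarded so the sketch degrades gracefully on machines without OpenVINO installed.

```python
# Hedged sketch of the classic OpenVINO Inference Engine Python API.
# File names here are placeholders for a model already converted to
# OpenVINO's IR format (an .xml topology plus a .bin weights file).
try:
    from openvino.inference_engine import IECore
except ImportError:          # OpenVINO not installed on this machine
    IECore = None

def load_model(xml_path="model.xml", bin_path="model.bin", device="MYRIAD"):
    """Read an optimized IR model and load it onto a device.

    device="CPU"    -> benchmark the network on the host processor
    device="MYRIAD" -> push it to the Neural Compute Stick 2
    """
    if IECore is None:
        raise RuntimeError("OpenVINO is not available")
    ie = IECore()
    net = ie.read_network(model=xml_path, weights=bin_path)
    return ie.load_network(network=net, device_name=device)
```

Running the same inference requests against both device targets gives the side-by-side performance picture described above: the PC stands in for your training rig, and the stick stands in for the eventual edge product.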
Once OpenVINO is installed and you can load the included examples onto the VPU, there is one more GitHub repository to visit. Intel calls it the Open Model Zoo. This rich repository contains a slew of examples ready for exploration, including inference models for object, person, and vehicle detection. When it comes to people, more specific models can detect a raised hand, a face, a pedestrian, or a physical pose. And these are just a few of the examples provided.
Machine learning has become a popular, and usable, technology for achieving artificial intelligence on the edge. As discussed in this article, with a toolkit like OpenVINO and a breadth of processor and platform technologies available, Intel has this application space covered. Models can be trained and tested on powerful desktop computers and then benchmarked on the Movidius VPU in the Neural Compute Stick 2. And in cases where you need a lower-powered system, such as a Raspberry Pi, the NCS2 is an off-the-shelf way to add machine learning to the edge.
Head over to this Avnet page for more information about Intel's AI on the Edge solutions. You can even register to win the Neural Compute Stick (until August 31, 2020).