In an effort to reduce reliance on centralized cloud servers for processing, researchers and developers have focused in recent years on improving the accuracy and efficiency of edge AI. The approach has gained prominence because it enables real-time, on-device inference, which enhances privacy, reduces latency, and lessens the need for constant internet connectivity. However, adopting edge AI presents a significant challenge: balancing the competing demands of model accuracy and energy efficiency.
High-accuracy models often come with increased size and complexity, demanding substantial memory and compute power. These resource-intensive models may strain the limited capabilities of edge devices, leading to slower inference times, increased energy consumption, and a greater burden on the device's battery life.
Balancing model accuracy and energy efficiency on edge devices requires innovative solutions. This involves developing lightweight models, optimizing model architectures, and implementing hardware acceleration tailored to the specific requirements of edge devices. Techniques like quantization, pruning, and model distillation can be employed to reduce the size and computational demands of models without significantly sacrificing accuracy. Additionally, advancements in hardware design, such as low-power processors and dedicated AI accelerators, contribute to improved energy efficiency.
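To make one of these techniques concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. The function names and the sample weights are hypothetical and chosen only for illustration; production toolchains typically quantize per layer or per channel and calibrate on real data.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale factor for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the quantized integers."""
    return [v * scale for v in q]


# Hypothetical weights, for illustration only.
weights = [0.82, -1.27, 0.003, 0.5, -0.64]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of four (for 32-bit floats), at the cost of a rounding error of at most half the scale factor per weight.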
On the hardware front, a notable advancement comes from Innatera Nanosystems BV. The company has developed an ultra-low-power neuromorphic microcontroller designed specifically for always-on sensing applications. Called the Spiking Neural Processor T1, this chip incorporates multiple processing units into a single package to enable versatility and to stretch battery life to its limits.
As the name of the chip implies, one of the processing units supports optimized spiking neural network inference. Spiking neural networks are important in edge AI because of their event-driven nature: computations are triggered only by spikes, which can yield energy efficiency gains. Furthermore, these networks have sparse activation patterns, where only a subset of neurons is active at any given time, which also reduces energy consumption. But these algorithms are not only about energy efficiency. They also model the biological behavior of neurons more closely than traditional artificial neural networks, which may result in enhanced performance in some applications.
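The event-driven idea can be sketched with a single leaky integrate-and-fire neuron, a common textbook spiking model. This is an illustrative software simulation under assumed parameters (the weight, leak, and threshold values are hypothetical), not a description of how the T1 implements its neurons.

```python
def run_event_driven_lif(spike_times, weight=0.6, leak=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron, doing work only on input spikes.

    spike_times: sorted integer time steps at which an input spike arrives.
    Returns the time steps at which the neuron emits an output spike.
    """
    potential = 0.0
    last_t = 0
    out_times = []
    for t in spike_times:
        # Apply the decay for all the silent steps at once; nothing was
        # computed while no spikes were arriving.
        potential *= leak ** (t - last_t)
        potential += weight
        last_t = t
        if potential >= threshold:
            out_times.append(t)
            potential = 0.0  # reset after firing
    return out_times


# Hypothetical input: 5 spikes spread over 10 time steps.
out = run_event_driven_lif([1, 2, 6, 8, 9])
print(out)  # [2, 8]
```

The loop runs five times for ten time steps of input, and the neuron fires only when spikes arrive close enough together to overcome the leak. With sparser input, the amount of work falls accordingly.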
The T1’s spiking neural network engine is implemented as an analog mixed-signal neuron-synapse array. It is complemented by a spike encoder/decoder circuit, and 384 KB of on-chip memory is available for computations. With this hardware configuration, Innatera claims that sub-1 mW pattern recognition is possible. A RISC-V processor core is also on the chip for more general tasks, like data post-processing or communication with other systems.
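To put a 1 mW power budget in perspective, the back-of-the-envelope calculation below estimates runtime from a small battery. The battery figures (225 mAh at 3 V, roughly a coin cell) are assumptions chosen for illustration, and the estimate ignores self-discharge, regulator losses, and the power drawn by sensors and the rest of the system.

```python
def battery_life_hours(capacity_mah, voltage_v, load_mw):
    """Idealized runtime: stored energy (mWh) divided by a constant load (mW)."""
    energy_mwh = capacity_mah * voltage_v
    return energy_mwh / load_mw


# Assumed battery: 225 mAh at 3.0 V, i.e. 675 mWh of stored energy.
hours_at_1mw = battery_life_hours(225, 3.0, 1.0)
print(f"{hours_at_1mw:.0f} hours, about {hours_at_1mw / 24:.0f} days")
```

Under these idealized assumptions, a continuous 1 mW load would run for about 675 hours, or roughly four weeks, which is why a sub-milliwatt budget matters for always-on sensing.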
To get started quickly with building applications or experimenting with the T1, an evaluation kit is available. It provides a platform for building device prototypes, along with extensive support for profiling performance and power dissipation in hardware, so you can evaluate just how much of a boost the T1 gives to your application. A number of standard interfaces are onboard the kit to connect a wide range of sensors, and it is compatible with the Talamo Software Development Kit. This development platform leverages PyTorch to optimize spiking neural networks for execution on the T1 processor.