Edge AI Is Learning to Adapt
Researchers have developed a compact AI system that mimics human eye movements to advance edge-based perception and planning applications.
The remarkable decision-making capabilities of the latest artificial intelligence algorithms make them very attractive to those building autonomous systems, such as self-driving cars, drones, and robots. What is not so attractive is the amount of computation required to run many of these algorithms. Because that processing typically has to be carried out in a remote data center, inference latency is compounded by network communication delays, and that combined latency is enough to sink any project that requires real-time decision-making.
The development of tinyML techniques has made it possible to deploy deep learning models on low-power edge devices, opening up new possibilities for real-time perception on resource-constrained hardware platforms. But despite all the recent advances in this area, these models still face challenges in adaptability, particularly in responding to dynamic changes and uncertainty in the environment. Problems like these are usually tackled by training larger models on larger datasets, but larger models are impractical for edge deployment, where constraints on memory, computation, and energy demand small model sizes.
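To give a concrete sense of how models are typically shrunk to fit this kind of hardware, the sketch below applies post-training dynamic-range quantization with the TensorFlow Lite converter. This is a generic tinyML workflow shown for illustration, not the workflow used in the research described here, and the model path and filenames are placeholders.

```python
import tensorflow as tf

# Load a trained model and apply post-training dynamic-range quantization,
# which stores weights as int8 and can shrink the model by up to about 4x.
converter = tf.lite.TFLiteConverter.from_saved_model("path/to/saved_model")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Write the compact model so it can be bundled onto an edge device.
with open("detector_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```

Shrinking the model this way trades a small amount of accuracy for a footprint that fits in edge memory and power budgets, which is exactly the constraint that rules out simply scaling models up.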
In an effort to address these limitations of tinyML, a trio of researchers at VERSES has developed a smart, agent-based system that performs perception and planning on-device and incorporates active inference to enhance adaptability. Their approach extends beyond standard deep learning to allow real-time planning in dynamic environments with a compact model size.
While advances in deep learning have improved sensing capabilities, the adaptability of these models remains limited, particularly when they are scaled down for resource-constrained edge devices. Active sensing, which requires tightly integrated perception and planning, is especially challenging under such constraints. Active inference, a paradigm rooted in probabilistic modeling and the free energy principle, offers a promising alternative. By explicitly modeling uncertainty and environmental dynamics, active inference enables smart systems to learn continuously and make adaptive decisions. And unlike cloud-dependent solutions, this approach supports real-time perception and planning directly on edge devices, enabling low-latency responses and better data privacy.
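As a rough sketch of the underlying idea (and not the researchers' implementation), active inference typically scores candidate actions by an expected free energy that trades off pragmatic value, meaning how well predicted observations match preferences, against epistemic value, meaning how much the action is expected to reduce uncertainty. The discrete distributions, shapes, and function names below are illustrative assumptions.

```python
import numpy as np

def expected_free_energy(q_state, likelihood, transition, log_pref_obs):
    """Score one candidate action; lower is better.

    q_state      : current belief over hidden states, shape (S,)
    likelihood   : p(observation | state), shape (O, S)
    transition   : p(next state | state) under this action, shape (S, S)
    log_pref_obs : log-preferences over observations, shape (O,)
    Shapes and names are illustrative assumptions, not a published API.
    """
    q_next = transition @ q_state          # predicted belief after acting
    q_obs = likelihood @ q_next            # predicted observation distribution

    # Pragmatic value: how well predicted observations match preferences.
    pragmatic = q_obs @ log_pref_obs

    # Epistemic value: expected information gain about the hidden state,
    # i.e., the mutual information between predicted observations and states.
    joint = likelihood * q_next            # (O, S): p(o | s) * q(next s)
    marginal = joint.sum(axis=1, keepdims=True)
    epistemic = np.sum(joint * (np.log(likelihood + 1e-16) - np.log(marginal + 1e-16)))

    return -(pragmatic + epistemic)

def select_action(q_state, likelihood, transitions, log_pref_obs):
    """Choose the action whose predicted future minimizes expected free energy."""
    scores = [expected_free_energy(q_state, likelihood, T, log_pref_obs)
              for T in transitions]
    return int(np.argmin(scores))
```

In a visual sensing setting, each candidate action could correspond to a camera movement and the observations to what the detector reports from that viewpoint, so uncertainty-reducing movements are favored automatically.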
The team demonstrated their system by creating a "saccade agent" and deploying it on a pan-and-tilt IoT camera powered by an NVIDIA Jetson Orin Nano NX. The system pairs an object detection module for perception with an active inference-based planning module that adapts to the environment dynamically, strategically steering the camera to gather the most useful information.
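To make that perceive-plan-act cycle concrete, here is a minimal, hypothetical sketch of a saccade-style control loop over a grid of pan-tilt poses. The camera interface, detector, pose grid, and belief-update rule are all stand-ins for illustration and are not the team's implementation.

```python
import numpy as np

# Hypothetical grid of camera poses the agent can saccade between (degrees).
PAN_ANGLES = np.linspace(-60.0, 60.0, 5)
TILT_ANGLES = np.linspace(-30.0, 30.0, 3)

def move_camera(pan, tilt):
    """Placeholder for the pan-tilt hardware interface."""
    pass

def capture_frame(rng):
    """Placeholder: a real system would grab a camera frame at the current pose."""
    return rng.random((8, 8))

def detect_objects(frame):
    """Placeholder for an on-device object detector (e.g., a small quantized CNN).
    Returns a detection confidence in [0, 1] for the current view."""
    return float(np.clip(frame.mean(), 0.0, 1.0))  # stand-in logic only

def run_saccade_loop(steps=10, seed=0):
    rng = np.random.default_rng(seed)
    shape = (len(PAN_ANGLES), len(TILT_ANGLES))

    # Belief over which pose contains the target, plus how unresolved each pose still is.
    belief = np.full(shape, 1.0 / np.prod(shape))
    uncertainty = np.ones(shape)

    for _ in range(steps):
        # Plan: saccade to the pose that promises the most new information
        # (a crude epistemic-value proxy, not the paper's planning rule).
        i, j = np.unravel_index(np.argmax(uncertainty * belief), shape)

        # Act and perceive.
        move_camera(PAN_ANGLES[i], TILT_ANGLES[j])
        confidence = detect_objects(capture_frame(rng))

        # Update: reweight the visited pose with the detector's evidence and
        # mark part of its uncertainty as resolved.
        belief[i, j] *= 0.5 + confidence
        belief /= belief.sum()
        uncertainty[i, j] *= 0.5

    return belief

if __name__ == "__main__":
    print(np.round(run_saccade_loop(), 3))
```

In a real deployment, the detector stub would wrap the on-device perception model and the planning step would use an expected free energy score like the one sketched earlier rather than this simple heuristic.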
Saccading, the rapid, targeted movements the human visual system uses to dynamically focus on key details, is a crucial component of active visual sensing. This innate capability enables organisms, and now artificial systems, to adapt to changing environments by selectively gathering the most relevant information.
The researchers’ work highlights the growing potential for edge-based adaptive systems in real-world applications that demand both efficiency and precision. By mimicking human-like saccadic motion, the agent demonstrated its ability to focus on the most important details in dynamic environments, paving the way for advancements in a number of fields, such as aerial search-and-rescue, sports event tracking, and smart city surveillance. This work marks a meaningful step forward in bridging the gap between AI-powered perception and practical, on-device decision-making.