Gazing Into the Future of AI
TinyTracker is an eye gaze estimator optimized for speed and energy efficiency on AI-in-sensor computer vision hardware.
Edge AI refers to the deployment of AI algorithms and models directly on edge devices such as smartphones, IoT devices, cameras, and wearable gadgets, rather than relying solely on cloud-based processing. This approach brings several advantages, including reduced latency, improved privacy and security, and enhanced real-time decision-making capabilities. Edge AI finds applications in various fields, ranging from industrial automation and healthcare to smart homes and autonomous vehicles.
Devices used for edge applications typically have very limited processing power, memory, and battery life, making it highly challenging to run demanding AI algorithms acceptably. Emerging AI-in-sensor technologies promise to take on some of that processing burden. These sensors, equipped with their own AI capabilities, can process and filter raw data as it is acquired, reducing the amount of work left for machine learning algorithms running on traditional general-purpose processors.
Even so, it is crucial for the AI algorithms running on these sensors to be lightweight, efficient, and optimized for the specific hardware. This constraint demands a delicate balance between the complexity of the AI model and the available resources, often forcing developers to trade accuracy for efficiency.
A team from ETH Zurich in Switzerland set out to create a model optimized for a popular AI-in-sensor system, and to compare the efficiency of this setup with other popular edge AI platforms. Their model, called TinyTracker, tackles the computationally intensive task of 2D gaze estimation. Since gaze estimation can be highly sensitive to latency, the researchers were especially interested in how much of a speed boost could be achieved by running the algorithm on-sensor.
With the goal of deployment to a Sony IMX500 vision sensor with an integrated AI processor, the team patterned their model after the popular iTracker eye tracking system. iTracker, however, is not well suited to tiny hardware platforms: it requires four inputs, namely images of each eye, an image of the face, and a face grid calculated by a face detection algorithm. Some modifications were in order.
That is a tall order for an edge device, so most of those inputs were removed. To make up for the lost information, the face coordinates are concatenated with the input. This data is processed by a MobileNetV3 backbone pretrained on the ImageNet dataset. After training, the network was quantized to 8-bit integers to reduce its size and complexity, resulting in a model 41 times smaller, weighing in at about 600 KB.
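To make that concrete, here is a minimal sketch in TensorFlow/Keras of a TinyTracker-style network and its post-training 8-bit quantization. It is not the authors' exact implementation: the regression head, the assumed format of the face coordinates, and the use of the TFLite converter (rather than the IMX500's own deployment toolchain) are all illustrative assumptions.

```python
# Sketch of a TinyTracker-style network (NOT the authors' exact architecture):
# a single face crop feeds a MobileNetV3 backbone, the detected face
# coordinates are concatenated with the pooled features, and a small
# regression head predicts the 2D gaze point. Layer sizes are assumptions.
import numpy as np
import tensorflow as tf

IMG_SIZE = 112  # all platforms in the comparison started from 112 x 112 images

def build_gaze_model():
    face = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3), name="face_crop")
    coords = tf.keras.Input(shape=(4,), name="face_coords")  # normalized bbox (assumed format)

    # ImageNet-pretrained MobileNetV3 backbone used purely as a feature extractor
    backbone = tf.keras.applications.MobileNetV3Small(
        input_shape=(IMG_SIZE, IMG_SIZE, 3),
        include_top=False,
        weights="imagenet",
        pooling="avg",
    )
    features = backbone(face)

    # Re-inject the face position lost when the eye and face-grid inputs were dropped
    x = tf.keras.layers.Concatenate()([features, coords])
    x = tf.keras.layers.Dense(128, activation="relu")(x)
    gaze_xy = tf.keras.layers.Dense(2, name="gaze_point")(x)  # 2D gaze estimate

    return tf.keras.Model(inputs=[face, coords], outputs=gaze_xy)

model = build_gaze_model()

# Post-training quantization to 8-bit integers, shown here with the TFLite
# converter; the IMX500 has its own toolchain, so this is only illustrative.
def representative_data():
    for _ in range(100):
        yield [
            np.random.rand(1, IMG_SIZE, IMG_SIZE, 3).astype(np.float32),
            np.random.rand(1, 4).astype(np.float32),
        ]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()
print(f"Quantized model size: {len(tflite_model) / 1024:.0f} KB")
```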
The trained TinyTracker model was tested on the IMX500 vision sensor, as well as on two other popular edge AI platforms, the Google Coral Dev Micro and the Sony Spresense. A set of metrics was captured while running the eye tracker on each device, all starting from 112 x 112 pixel images to ensure a level playing field.
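As a rough illustration of how such a latency benchmark can be run, the sketch below times the quantized model from the previous snippet with the TFLite interpreter on a host machine. The on-device figures reported next were of course measured with each platform's own tooling, so this only shows the general shape of the measurement.

```python
# Continues from the previous sketch: times the quantized `tflite_model` bytes
# with the TFLite interpreter, using a fixed 112 x 112 input.
import time
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_detail = interpreter.get_output_details()[0]

# Dummy inputs at the benchmark's 112 x 112 resolution; in practice, match each
# tensor by inspecting detail["name"] rather than assuming the input order.
face = np.random.rand(1, 112, 112, 3).astype(np.float32)
coords = np.random.rand(1, 4).astype(np.float32)
for detail, value in zip(input_details, [face, coords]):
    interpreter.set_tensor(detail["index"], value)

runs = 100
start = time.perf_counter()
for _ in range(runs):
    interpreter.invoke()
mean_ms = (time.perf_counter() - start) * 1000 / runs

gaze_xy = interpreter.get_tensor(output_detail["index"])
print(f"Predicted gaze point: {gaze_xy}, mean latency: {mean_ms:.2f} ms")
```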
In terms of latency, the AI-in-sensor chip far outperformed the other options, with a total inference time of 19 milliseconds. The Coral Dev Micro clocked in at 34 milliseconds, with the Spresense sitting at 522 milliseconds due to its reliance on a single core for inference. The all-in-one image sensor also wins the energy efficiency battle handily with an energy consumption of 0.06 mJ per inference. The Spresense and Coral Dev Micro come in at 31.97 mJ and 0.97 mJ, respectively.
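Putting those reported figures side by side makes the gap easy to see; the short snippet below simply computes the ratios from the numbers above.

```python
# Back-of-the-envelope ratios from the reported benchmark figures
# (latency in ms and energy in mJ per inference).
results = {
    "IMX500 (in-sensor)": {"latency_ms": 19, "energy_mJ": 0.06},
    "Coral Dev Micro":    {"latency_ms": 34, "energy_mJ": 0.97},
    "Sony Spresense":     {"latency_ms": 522, "energy_mJ": 31.97},
}
base = results["IMX500 (in-sensor)"]
for name, r in results.items():
    print(f"{name}: {r['latency_ms'] / base['latency_ms']:.1f}x latency, "
          f"{r['energy_mJ'] / base['energy_mJ']:.0f}x energy vs. IMX500")
```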
This dramatic boost in performance came without much compromise: TinyTracker gave up a paltry 0.16 centimeters of accuracy compared to iTracker. This work clearly demonstrates how much AI-in-sensor devices can contribute to edge AI efficiency. Further development of these technologies could lead to all manner of scalable computer vision systems in the near future.