Tremendous advances have been made in tinyML in recent years, with low-power, resource-constrained devices demonstrating impressive performance when running inferences against machine learning models. However, there are still many use cases for smart wearable devices that push well beyond the present limits of tinyML. The computational complexity of processing high-resolution images from multi-camera systems, for example, would bring nearly any microcontroller to its knees.
Simply increasing the onboard computational resources of a wearable is often not an option. Toting around a backpack with 50 pounds of hardware and batteries to handle the processing for a smartwatch just does not make a lot of sense. In such situations, offloading the processing to the cloud can be a great option. But that raises another question: how can data-dense sources, such as high-resolution video streams, be transferred to the cloud fast enough that the results are still relevant when they are sent back to the device? In trying to answer this question, a group of researchers at New York University recently explored the viability of using 5G wireless networks for computer vision applications in wearable devices.
The VIS4ION system, a wearable device designed to help blind and visually impaired people navigate their surroundings, was chosen as the platform for testing offloaded processing over 5G. The current design has high-resolution cameras, with onboard processing housed in a backpack. Because everything is onboard, the device is a bit cumbersome to use, and its frame rate for object detection is limited. For this study, VIS4ION was modified with additional cameras to widen the field of view, and wireless transceivers were added so that data could be offloaded to the cloud.
To optimize the system, the team first created a dataset, dubbed the NYU-NYC StreetScene dataset, for evaluating object detection in pedestrian commuting imagery. Using this dataset, they next performed an extensive study of how image resolution and bitrate affect the performance of object detection techniques. They then conducted detailed wireless network simulations to better understand network availability during typical pedestrian commutes. With this groundwork complete, they were finally able to compare local data processing against offloading via LTE alone and offloading via 5G and LTE together.
The results showed that by using both LTE and 5G links, it was possible to support one video stream over 75% of the time, and two streams over 73% of the time, with a maximum round-trip delay of 30 milliseconds. Four cameras could be supported simultaneously 65% of the time under the same delay constraint. Support for one and four cameras rose to 95% and 67% of the time, respectively, when the delay constraint was relaxed to 50 milliseconds. These results show that 5G and LTE are feasible for offloading large volumes of data to the cloud for processing. However, a lack of coverage in certain locations would present problems for some applications, and blockages and range limitations present further challenges when using 5G communications. This method is not appropriate for all use cases, but it is another tool to consider when it matches the design requirements.
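To make the "supported X% of the time" figures more concrete, here is a minimal Python sketch of how such a fraction could be computed from a network trace. It is not the researchers' method. The `LinkSample` type, the `supported_fraction` function, the delay model (transmission time plus base round-trip time plus server time), and every number in the example trace are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkSample:
    """One time step of a hypothetical network trace (illustrative only)."""
    uplink_mbps: float  # available uplink throughput; 0 means no coverage
    base_rtt_ms: float  # round-trip time excluding frame transmission


def supported_fraction(trace, n_streams, frame_kbits, budget_ms, server_ms=0.0):
    """Return the fraction of trace samples in which one frame from each of
    n_streams cameras completes a round trip within budget_ms.

    Simplified model (an assumption): all cameras share one uplink, and
    delay = transmission time + base RTT + server processing time.
    """
    if not trace:
        return 0.0
    ok = 0
    for sample in trace:
        if sample.uplink_mbps <= 0:
            continue  # no coverage, so offloading is impossible at this step
        # kbit divided by Mbit/s gives milliseconds.
        tx_ms = n_streams * frame_kbits / sample.uplink_mbps
        if tx_ms + sample.base_rtt_ms + server_ms <= budget_ms:
            ok += 1
    return ok / len(trace)


# Made-up trace: a fast link, a moderate link, a coverage gap, a slow link.
trace = [
    LinkSample(uplink_mbps=200, base_rtt_ms=10),
    LinkSample(uplink_mbps=50, base_rtt_ms=15),
    LinkSample(uplink_mbps=0, base_rtt_ms=0),
    LinkSample(uplink_mbps=20, base_rtt_ms=20),
]

for cameras in (1, 4):
    for budget in (30, 50):
        frac = supported_fraction(trace, cameras, frame_kbits=400, budget_ms=budget)
        print(f"{cameras} camera(s), {budget} ms budget: {frac:.0%} of samples")
```

Even in this toy model, the pattern matches the direction of the reported results: adding cameras lowers the supported fraction, and relaxing the delay budget raises it.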