Spec-tacular Body Pose Estimation

By analyzing the reflections of inaudible acoustic signals, PoseSonic glasses can estimate the positions of nine upper body joints.

Nick Bild
9 months ago • Wearables
A prototype of the PoseSonic body pose estimation system (📷: S. Mahmud et al.)

Body pose estimation is a computer vision technique that involves identifying and tracking the human body's position and orientation in three-dimensional space. It aims to capture the positions and orientations of key body parts such as the head, torso, arms, and legs. By accurately estimating body poses, it becomes possible to analyze human activities, improve physical therapy, enhance gaming experiences, enable virtual try-on for apparel, and facilitate more natural human-computer interactions in augmented reality and virtual reality applications.

The advancement of body pose estimation has been aided by the integration of various technologies, with cameras and machine learning playing key roles. Depth cameras, such as Microsoft's Kinect and Intel RealSense, have been critical in capturing 3D information, enabling precise and robust body pose estimation. These cameras use structured light or time-of-flight principles to capture depth information, allowing for the creation of detailed 3D models of the human body. Machine learning algorithms, particularly deep learning models, have significantly improved the accuracy and robustness of the pose estimations.

Despite the impressive performance of these technologies, practical implementation in portable and wearable devices, such as glasses, remains challenging. The primary barriers include the physical size and energy consumption requirements of the necessary hardware. Incorporating depth cameras and the requisite high-performance computing units into wearable devices poses significant design constraints, making it difficult to achieve a compact and energy-efficient solution.

In a break from conventional solutions, a team at Cornell University has developed a new body pose estimation technology called PoseSonic that can be practically deployed to small hardware platforms, like eyeglasses. No cameras or other bulky, energy-hungry hardware are involved. A few microphones and speakers embedded in the hinges of the frames were shown to be sufficient to estimate the positions of several key points in the upper body. This was made possible by analyzing the reflections of inaudible acoustic signals with a deep learning algorithm.

The PoseSonic prototype was built on top of an off-the-shelf pair of eyeglasses. Two pairs of MEMS microphones and speakers were attached to the hinges. The speakers emitted inaudible frequency-modulated continuous-wave (FMCW) signals. When these acoustic signals struck the arms, torso, shoulders, and other regions of the upper body, they were reflected back to the microphones. A custom convolutional neural network then analyzed the received signals to determine how they had been altered by the surfaces that reflected them, and the model translated this information into the positions of nine body joints in three-dimensional space.
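To make the idea of acoustic ranging with a frequency-modulated continuous-wave (FMCW) signal more concrete, here is a minimal sketch in plain Python. It is not the PoseSonic pipeline: the 18 to 21 kHz sweep, the 48 kHz sample rate, the single noise-free echo, and the function names are all assumptions chosen for illustration. The sketch recovers the echo delay by cross-correlation, whereas the real system feeds its received signals to a neural network.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C
SAMPLE_RATE = 48_000    # Hz; assumed, not taken from the article


def make_chirp(f_start=18_000.0, f_end=21_000.0, duration=0.01, fs=SAMPLE_RATE):
    """Return one linear frequency sweep (a single FMCW chirp) as a list."""
    n = round(duration * fs)
    sweep_rate = (f_end - f_start) / duration  # Hz per second
    chirp = []
    for i in range(n):
        t = i / fs
        phase = 2.0 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t)
        chirp.append(math.sin(phase))
    return chirp


def simulate_echo(chirp, delay_samples, attenuation=0.3):
    """Hypothetical received signal: one attenuated, delayed copy of the chirp."""
    received = [0.0] * (len(chirp) + delay_samples)
    for i, sample in enumerate(chirp):
        received[i + delay_samples] += attenuation * sample
    return received


def estimate_delay(chirp, received, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(c * r for c, r in zip(chirp, received[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_distance(delay_samples, fs=SAMPLE_RATE):
    """Convert a round-trip delay into a one-way distance in meters."""
    return SPEED_OF_SOUND * (delay_samples / fs) / 2.0


chirp = make_chirp()
received = simulate_echo(chirp, delay_samples=84)
lag = estimate_delay(chirp, received, max_lag=200)
print(f"delay: {lag} samples, reflector distance: {delay_to_distance(lag):.3f} m")
```

A real upper body returns many overlapping echoes from different surfaces at once, which is one reason a learned model is a plausible fit for interpreting the reflections rather than a single correlation peak.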

The device was evaluated in a pair of experiments. In the first, 12 participants were recruited to test PoseSonic in a controlled laboratory setting. In the second, 10 participants evaluated the device in a semi-in-the-wild study. Comparing the PoseSonic estimates against ground-truth measurements from a camera-based system showed that PoseSonic could estimate body joint positions with a respectable average error of less than 2.5 inches in the laboratory setting. The error was about twice that figure in the more natural setting, which indicates that more work is needed before the device can operate acceptably in real-world scenarios.
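An average joint position error of this kind is commonly computed as the mean Euclidean distance between each estimated joint and its ground-truth counterpart. The sketch below shows that calculation under stated assumptions: the function names are hypothetical, the nine coordinates are made-up placeholder values in centimeters rather than data from the study, and the article does not say exactly how the researchers aggregated their errors.

```python
import math


def mean_joint_error(predicted, ground_truth):
    """Mean Euclidean distance between matching 3D joint positions."""
    if len(predicted) != len(ground_truth):
        raise ValueError("both poses must contain the same number of joints")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


def cm_to_inches(value_cm):
    """Convert centimeters to inches."""
    return value_cm / 2.54


# Placeholder data: nine joints, each estimate offset by (3, 4, 0) cm.
ground_truth = [(10.0 * i, 0.0, 100.0) for i in range(9)]
predicted = [(x + 3.0, y + 4.0, z) for x, y, z in ground_truth]

error_cm = mean_joint_error(predicted, ground_truth)
print(f"mean error: {error_cm:.2f} cm ({cm_to_inches(error_cm):.2f} in)")
```

Averaging over every joint and every frame yields a single summary figure, which makes it easy to compare settings such as a laboratory and a more natural environment.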

This work is still in the early stages, but with some refinement, it is possible that the PoseSonic technology could eventually provide a more practical and lower-cost solution to the problem of body pose estimation in wearable devices.

Nick Bild
R&D, creativity, and building the next big thing you never knew you wanted are my specialties.