Why Not Wi-Fi?
Using Wi-Fi signals to estimate human body pose can save money, protect privacy, and overcome the limitations of current techniques.
Human pose estimation is the process of detecting and tracking a person's posture in real time using machine learning algorithms. The technology has numerous applications in fields such as robotics, virtual reality, and human-computer interaction. Recent advances have produced some very accurate systems, but these solutions tend to rely on sensors with significant limitations, or are otherwise impractical for real-world use. For these reasons, many potential applications of human pose estimation in everyday life remain unrealized.
The vast majority of pose estimation systems rely on cameras, LiDAR, or radar to collect data about the person of interest. While each of these approaches can work quite well under the right conditions, there are many situations for which they are poorly suited. Cameras, for example, fail when an occlusion, perhaps a piece of furniture or even another part of the body, blocks their view. They also struggle with lighting-related issues and come with a slew of privacy concerns. LiDAR- and radar-based systems sidestep many of these problems, but they tend to be quite expensive, putting them out of reach for most household and small-business users.
A trio of researchers at Carnegie Mellon University recently reported on work that takes a different tack on human pose estimation. They demonstrated that the existing issues of cost, occlusion, and privacy can be addressed by using Wi-Fi antennas as the sensing instrument, paired with a deep learning approach that estimates human body poses. Impressively, they showed that their novel approach rivals camera-based technologies in performance.
The team’s technique requires three Wi-Fi transmitters and three aligned Wi-Fi receivers; a pair of common, low-cost home Wi-Fi routers with three antennas each would fit the bill. In particular, the system leverages channel state information (CSI) signals, which represent the ratio of the received signal wave to the transmitted signal wave. As these signals travel from transmitter to receiver, interactions with any people in their path alter the signals' characteristics.
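To make the idea of CSI concrete, here is a minimal illustrative sketch in Python. The complex values are toy stand-ins (real CSI is reported per subcarrier by the Wi-Fi hardware), but the arithmetic shows how amplitude and phase fall out of the complex ratio of received to transmitted signal:

```python
import cmath

# Illustrative only: toy complex samples standing in for one subcarrier's
# transmitted and received waves. Real CSI comes from the Wi-Fi NIC itself.
transmitted = complex(1.0, 0.0)   # reference transmitted wave
received = complex(0.6, 0.8)      # wave after passing through the room

# CSI for this subcarrier: how the channel scaled and rotated the signal
csi = received / transmitted

amplitude = abs(csi)              # attenuation along the propagation path
phase = cmath.phase(csi)          # phase rotation induced by the path

print(f"amplitude={amplitude:.3f}, phase={phase:.3f} rad")
```

A person moving through the room changes the propagation paths, which shows up as changes in these amplitude and phase values over time; that variation is what the deep learning model consumes.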
Taking inspiration from computer vision models, the researchers constructed an analysis pipeline based on region-based convolutional neural networks that locates the parts of a human body, then maps the phase and amplitude of the Wi-Fi signals to coordinates within 24 human body regions. The final system was put through its paces in a series of experiments in which between one and five participants were observed in laboratory and classroom environments. The system's results were then compared with the image-based DensePose approach, and the team's novel method was found to perform similarly to the traditional, camera-based one.
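The overall data flow can be sketched with a toy NumPy example. The dimensions here are assumptions for illustration (three transmit and three receive antennas, 30 subcarriers, a 46x46 output resolution), and the random projection is a hypothetical stand-in for the learned network, not the researchers' actual architecture. What the sketch shows is the shape of the mapping: a CSI tensor of amplitudes and phases in, one spatial map per body region out:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions, not the paper's exact configuration:
TX, RX, SUBCARRIERS = 3, 3, 30   # 3 transmit antennas, 3 receive, 30 subcarriers
REGIONS = 24                     # DensePose-style body regions
H, W = 46, 46                    # assumed spatial resolution of the output maps

# Stand-in CSI tensor: channel 0 holds amplitudes, channel 1 holds phases,
# one value per (transmit antenna, receive antenna, subcarrier) triple.
csi = rng.standard_normal((2, TX, RX, SUBCARRIERS))

# Hypothetical random projection standing in for the trained network:
# flatten the CSI and map it into a stack of spatial maps.
weights = rng.standard_normal((csi.size, REGIONS * H * W)) * 0.01
features = csi.reshape(-1) @ weights

# One map per body region, which a detection head could then localize within
region_maps = features.reshape(REGIONS, H, W)

print(region_maps.shape)   # (24, 46, 46)
```

In the real system, the learned network replaces the random projection, so the output maps actually encode where each body region lies; the sketch only fixes the shapes of the problem.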
During testing, it was found that the system had difficulty accurately estimating body poses that rarely occurred in the training dataset. It was also observed that accuracy decreased as the number of people under observation increased. The researchers believe that both problems can be resolved by building a larger, more robust training dataset. If they can overcome the challenges of collecting a sufficiently large and varied Wi-Fi signal dataset, this solution may prove to be the ideal option for human body pose estimation: low cost, robust under varying conditions, and privacy preserving.