Researchers Combine Computer Vision with Wearable Sensors for High-Accuracy Gesture Recognition
Combining a traditional vision-based gesture recognition system with data from stretchable transparent sensors considerably boosts accuracy.
Researchers at Nanyang Technological University, Singapore (NTU Singapore) have published a paper describing a machine learning platform for high-accuracy gesture recognition, built on what they call a "bioinspired data fusion system" that combines computer vision with skin-like stretchable strain sensors.
Gesture recognition technology is for more than changing the volume of your TV with a wave of your hand: It can be found in everything from gaming to health monitoring, accessibility, and even the control systems of high-precision surgical robots. The problem, though, is in getting high-quality data on the user's gestures — something the NTU Singapore researchers claim to have resolved.
"Our data fusion architecture has its own unique bioinspired features which include a man-made system resembling the somatosensory-visual fusion hierarchy in the brain," explains lead author Professor Chen Xiaodong of the project. "We believe such features make our architecture unique to existing approaches."
"Compared to rigid wearable sensors that do not form an intimate enough contact with the user for accurate data collection, our innovation uses stretchable strain sensors that comfortably attaches onto the human skin. This allows for high-quality signal acquisition, which is vital to high-precision recognition tasks."
These wearable sensors, made from a transparent adhesive material, are combined with a more traditional computer vision approach, and the combination considerably boosts accuracy. In testing, a camera-only computer vision recognition system made six errors while a robot was guided through a maze by gesture control; switching to the combined vision and wearable sensor approach completely eliminated these errors, while boosting accuracy in dark conditions to 96.7 percent.
"The secret behind the high accuracy in our architecture lies in the fact that the visual and somatosensory information can interact and complement each other at an early stage before carrying out complex interpretation," notes first author Dr. Wang Min. "As a result, the system can rationally collect coherent information with less redundant data and less perceptual ambiguity, resulting in better accuracy."
The paper has been published under open access terms in the journal Nature Electronics.