Stream Decks and other similar reconfigurable input devices have found a more prominent place on many desks in recent years, driven by the growing trend toward working from home. But now that we have by and large reemerged from the confines of our home offices, and portability has again risen in importance, many of these devices are getting left behind, along with the productivity boost they can provide.
There may be a good compromise on the horizon, however, thanks to a clever device built by a team of engineers at the University of Waterloo. Called Typealike, their system takes advantage of the webcam already present on nearly all laptops to expand the input surface beyond the keys on the keyboard.
A small plastic housing containing an angled mirror is first installed in front of the laptop’s built-in camera. The mirror gives the camera a view of the user’s hands and the keyboard, rather than looking straight ahead. By performing configurable hand gestures, either on or near the keyboard, Typealike is able to trigger actions in software, such as scrolling through a document or controlling a car in a driving simulator.
Hand gesture recognition was accomplished by using transfer learning on a ResNet-50 convolutional neural network that had been pretrained on the ImageNet dataset of millions of everyday objects. Retraining was conducted with a dataset of 350,000 images of hand gestures from 30 participants who were instructed to perform 36 different gestures. Images of typing and non-typing actions were also included so that the system can recognize when no gesture is being performed. A variety of participants were selected to capture differences in hand size and type, and lighting conditions were varied to ensure that Typealike will operate smoothly under real-world conditions.
A study involving 20 additional participants was conducted to validate the accuracy and utility of the system. The team found low error rates and quick gesture formation times, indicating that Typealike can work well alongside normal typing on a keyboard. Across all 36 gestures, an average classification accuracy of 97% was observed. Some of the study's findings may also inform the best gesture choices for future iterations of the technique — for example, open-handed gestures were found to be significantly faster to perform than closed-handed gestures.
The researchers demonstrated Typealike controlling a word processor, switching tools in a graphics editor, triggering system commands, and playing video games. The simplicity of the device is appealing, as there is no need to tote around additional hardware or master any complex skills. One drawback, however, is that the laptop's webcam is unavailable for normal use while Typealike is running. Exactly how useful a tool like this is in practice is difficult to evaluate without testing it out firsthand, but it certainly looks very promising.