Giving Your Computer a Tongue Lashing

SilentSpeller is a fast, accurate, and hands-free alternative input method for your devices.

Nick Bild
2 years ago • Machine Learning & AI
SilentSpeller hardware prototype (📷: N. Kimura et al.)

Over the past decade, many slick and intuitive user interfaces have been introduced or refined. But of all the ways available to interact with our electronic devices, none is more natural than speech. Speaking our wishes aloud is how we interact with each other, so it makes sense that this would be our ideal mode of communication with computers. Sometimes voice control is impractical, however, as in a crowded, noisy environment, or when privacy is a concern. For these situations, a number of attempts have been made to develop so-called silent speech recognition, in which voice commands are "spoken" without making any audible sounds. This might seem like the ideal compromise between ease of use and privacy, but implementations of silent speech are difficult to develop, often recognize only a small vocabulary of words, and require the speaker to remain stationary.

A research group centered at the University of Tokyo has devised an alternative to silent speech that uses the mouth to silently spell words, rather than speak them. This setup, called SilentSpeller, allows users to inaudibly spell words with their tongue while wearing a special dental retainer. So, for example, rather than silently "saying" the word "start", a user of SilentSpeller would mouth the letter "s" followed by the letter "t" and so on. In this way, SilentSpeller allows commands to be issued to electronic devices rapidly and accurately on the go, while ensuring that privacy is protected. Because spelled-out letters are more easily recognized than whole words, much larger vocabularies are possible.

The dental retainer used by the system is equipped with 124 capacitive touch sensors, much like you would find in the screen of a smartphone. These sensors, located on the roof of the mouth, are capable of tracking the movement of the tongue. To recognize the 26 letters of the alphabet, the sensor data was analyzed by principal component analysis, and the top 16 principal components were selected. These components were fed into a hidden Markov model that then classified sensor inputs as a silently spelled character of the alphabet. An offline test of the processing pipeline showed the system to have a 97% character accuracy on a 1,164-word vocabulary. When tested on 100 words that the model had not previously seen, a 94% character accuracy was observed, demonstrating the potential utility of SilentSpeller in real-world scenarios.
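To make the dimensionality reduction step more concrete, here is a minimal sketch of principal component analysis applied to frames of 124 sensor readings, keeping the top 16 components. Only those two numbers come from the article. The function names (`fit_pca`, `project`), the use of NumPy, and the random stand-in data are assumptions for illustration, and this is not the researchers' actual code. The hidden Markov model stage that follows is not shown.

```python
import numpy as np

N_SENSORS = 124     # capacitive sensors on the retainer (from the article)
N_COMPONENTS = 16   # principal components kept (from the article)


def fit_pca(frames, n_components=N_COMPONENTS):
    """Fit PCA on frames of shape (n_frames, n_sensors).

    Returns the per-sensor mean and the top principal directions.
    Hypothetical helper, not part of any SilentSpeller code.
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are the principal directions, ordered by
    # decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(frames, mean, components):
    """Map raw sensor frames to low-dimensional feature vectors."""
    return (frames - mean) @ components.T


# Synthetic data standing in for real tongue-contact recordings.
rng = np.random.default_rng(0)
frames = rng.random((500, N_SENSORS))

mean, components = fit_pca(frames)
features = project(frames, mean, components)
print(features.shape)  # (500, 16)
```

In a pipeline like the one described, each 16-value row of `features` would be one observation in the sequence handed to the classifier.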

A major concern of the researchers was the speed at which users could spell out words with SilentSpeller. A series of validation tests was conducted to compare silent speech, silent spelling, and on-screen mini-QWERTY keyboards (such as those found on smartphones). As expected, silent speech fared the best, averaging 115 words per minute (wpm). Silent spelling came in at 38.7 wpm, just ahead of on-screen keyboards at 36.6 wpm. Naturally, SilentSpeller will not be able to compete with silent speech in terms of speed, but it was shown to be a viable alternative to on-screen typing.
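For a rough sense of what those averages mean in practice, the short sketch below compares the three reported speeds and estimates how long a command of a given length would take. The wpm figures are the ones quoted above. The helper names and the ten-word example are assumptions for illustration only.

```python
# Average entry speeds reported in the article, in words per minute.
speeds_wpm = {
    "silent speech": 115.0,
    "silent spelling": 38.7,
    "on-screen keyboard": 36.6,
}


def relative_speed(method, baseline):
    """How many times faster `method` is than `baseline`."""
    return speeds_wpm[method] / speeds_wpm[baseline]


def seconds_for_words(method, n_words):
    """Estimated time to enter n_words at the method's average speed."""
    return n_words / speeds_wpm[method] * 60.0


for method in speeds_wpm:
    ratio = relative_speed(method, "on-screen keyboard")
    secs = seconds_for_words(method, 10)  # hypothetical 10-word command
    print(f"{method}: {ratio:.2f}x keyboard speed, {secs:.1f} s per 10 words")
```

By this arithmetic, silent spelling is roughly 6% faster than the on-screen keyboard, while silent speech is about three times faster than either.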

While SilentSpeller works quite well for voice-like command inputs, the fact that it requires a custom-made dental retainer to be worn will most likely relegate it to special use cases. Those with certain types of disabilities may find SilentSpeller a big help, but it is unlikely to find mainstream acceptance. In any case, it is a great example of thinking outside the box to solve real problems.

Nick Bild
R&D, creativity, and building the next big thing you never knew you wanted are my specialties.