Derma Converts Subvocalizations Into Voice Input

With a little practice, anyone could use this technology to issue silent voice commands.

You probably don’t notice it, but you most likely subvocalize speech when you’re reading, and sometimes even when just thinking about what you’re saying. Subvocalization is inner speech that usually manifests as barely noticeable contractions of the larynx muscles. Essentially, you’re speaking so quietly that it is inaudible. There are many explanations for why we subvocalize, but the behavior is normally completely subconscious. It is possible, however, to train yourself to subvocalize consciously. Derma is a device developed by researchers from the University of Tokyo and Sony Computer Science Laboratories (Sony CSL) that can turn those subvocalizations into voice input.

Militaries around the world are investing in this kind of technology, as it would be extremely useful during stealth operations. Soldiers could, for example, speak among themselves without actually saying anything out loud. Those same benefits would apply in the civilian world. Imagine being able to issue a voice command to your smartphone without having to speak audibly. You could reply to a text message without disturbing a business meeting, or ask Siri a question in a crowded subway car without sacrificing your privacy. There is also potential to help people who aren’t capable of speaking, as long as their larynx muscles still function normally.

Derma recognizes subvocalizations by taking readings with small sensors and then using deep learning to convert those readings into speech. The sensors are microelectromechanical systems (MEMS) that contain accelerometers and angular velocity sensors (gyroscopes). They are adhered to the sides of the user’s throat, where they measure tiny muscle movements. A deep learning model determines the phonemes that the muscle movements represent. Phonemes are the basic sound building blocks that make up our speech. Connectionist temporal classification (CTC) is used to turn those phoneme sequences into speech that can be recognized by typical voice recognition-enabled digital assistants. With a little practice, anyone could use this technology to issue silent voice commands.
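To give a rough sense of what CTC does with per-frame predictions, here is a minimal Python sketch of greedy (best-path) CTC decoding. It is not Derma's actual code, and the article does not describe how the researchers decode their model's output. The blank symbol, the phoneme labels, and the scores are all made up for illustration. The idea is that the model emits one score vector per sensor frame, and the decoder picks the top label in each frame, merges consecutive repeats, and drops the blank label.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; CTC reserves one label for "no output"


def ctc_greedy_decode(frame_scores, labels):
    """Greedy (best-path) CTC decoding.

    frame_scores: one list of per-label scores for each time frame.
    labels: label names, in the same order as the scores.
    """
    # 1. Pick the highest-scoring label in every frame.
    best_path = [
        labels[max(range(len(scores)), key=scores.__getitem__)]
        for scores in frame_scores
    ]
    # 2. Merge consecutive repeats of the same label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop the blank label.
    return [label for label in collapsed if label != BLANK]


# Hypothetical example: five frames scored over three labels.
labels = [BLANK, "HH", "AY"]
frames = [
    [0.1, 0.8, 0.1],    # HH
    [0.2, 0.7, 0.1],    # HH (repeat, merged with the frame before)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],    # AY
    [0.1, 0.2, 0.7],    # AY (repeat, merged with the frame before)
]
phonemes = ctc_greedy_decode(frames, labels)
print(phonemes)
```

The blank label is what lets a genuinely repeated phoneme survive the merge step: two identical labels separated by a blank frame stay separate in the output.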

Cameron Coward
Writer for Hackster News. Proud husband and dog dad. Maker and serial hobbyist. Check out my YouTube channel: Serial Hobbyism