TinySpeech Delivers Vastly Reduced Complexity, Improves Performance for TinyML Speech Recognition

Attention condenser approach delivers 207 times simpler networks with 21 times fewer multiply-add operations.

Gareth Halfacree
4 years ago • Machine Learning & AI

Researchers from the University of Waterloo and DarwinAI Corp. have released a paper demonstrating TinySpeech, a new approach to speech recognition on edge devices aimed at those working on TinyML.

"Advances in deep learning have led to state-of-the-art performance across a multitude of speech recognition tasks," the team writes in the paper's abstract. "Nevertheless, the widespread deployment of deep neural networks for on-device speech recognition remains a challenge, particularly in edge scenarios where the memory and computing resources are highly constrained (e.g., low-power embedded devices) or where the memory and computing budget dedicated to speech recognition is low (e.g., mobile devices performing numerous tasks besides speech recognition)."

"In this study, we introduce the concept of attention condensers for building low-footprint, highly-efficient deep neural networks for on-device speech recognition on the edge. More specifically, an attention condenser is a self-attention mechanism that learns and produces a condensed embedding characterizing joint local and cross-channel activation relationships, and performs selective attention accordingly."

The team isn't the first to use self-attention mechanisms to augment machine learning algorithms, but it is among the first to use the technique as a stand-alone building block. The results are impressive: using Google's Speech Commands benchmark dataset, designed to test limited-vocabulary recognition, TinySpeech delivered up to 207x lower architectural complexity and up to 21x fewer multiply-add operations, both of which ease deployment on edge devices with constrained memory and compute, while delivering a slight uptick in accuracy.

"These results not only demonstrate the efficacy of attention condensers for building highly efficient deep neural networks for on-device speech recognition," the team concludes, "but also illuminate its potential for accelerating deep learning on the edge and empowering a wide range of TinyML applications."

The paper has been published under open access terms on arXiv.org, but the source code behind TinySpeech does not yet appear to be publicly available.
