Published December 4, 2021 © MIT

COMP 554 TinyML Final Project: Wake Word Detection

A wake word detection device built on a microcontroller, using a convolutional neural network for speech recognition.

Beginner • Full instructions provided • 10 hours • 341


Things used in this project

Story

Introduction

In today’s digital world, voice assistants are everywhere, from mobile phones and laptops to speakers and home appliances. Unlike the graphical user interfaces we are all familiar with, these assistants provide a voice interface that users operate simply by speaking, without touching the device. To make this possible, the device relies on wake word detection: it listens continuously for one short phrase and only starts handling commands after hearing it, which helps protect user privacy and save battery. Since wake word detection is so common in our everyday lives, it is exciting to investigate this speech recognition technique and deploy it on our own hardware device.

Goal

The main objective of the project is to build our own wake word detection system on an Arduino Nano 33 BLE Sense microcontroller. The device should detect the words “yes” and “no” and respond with its LED. A speech recognition model needs to be trained to distinguish the different voice commands. We also have to consider the resource constraints of the microcontroller: conventional, resource-intensive machine learning models need to be optimized before they can be deployed on the hardware.
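The response logic described in the goal can be sketched in a few lines. This is a minimal simulation in Python, not the firmware itself: the label order, the 0.8 score threshold, and the LED colors are all assumptions made for illustration, and the real device would run equivalent logic in its Arduino sketch.

```python
# Assumed label order for the four model outputs (hypothetical).
LABELS = ["silence", "unknown", "yes", "no"]

# Assumed LED color per detected word (hypothetical mapping).
LED_FOR_LABEL = {"yes": "green", "no": "red", "unknown": "blue"}


def respond(scores, threshold=0.8):
    """Pick the top-scoring category and decide how the LED should react.

    Returns (label, led). If no score reaches the threshold, nothing is
    reported and the LED stays off.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < threshold:
        return None, "off"
    label = LABELS[best]
    # Silence has no entry in the mapping, so the LED stays off for it.
    return label, LED_FOR_LABEL.get(label, "off")


print(respond([0.02, 0.03, 0.90, 0.05]))  # ('yes', 'green')
print(respond([0.30, 0.30, 0.20, 0.20]))  # (None, 'off')
```

The threshold keeps the device from reacting to uncertain predictions, which matters when it listens continuously.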

Approach

We follow the steps in Chapter 7, “Wake word detection: Building an application”, of the TinyML book. The first major piece of the system is training the machine learning model, for which we use the Speech Commands dataset. The model classifies inputs into four categories: “yes”, “no”, “unknown”, and silence. Its input is a spectrogram represented as a two-dimensional array. We use a Convolutional Neural Network (CNN) architecture because it works well with spectrogram data. A CNN is composed of convolutional layers and fully connected layers, so it is easy to leverage a pre-trained network by reusing its convolutional layers and adjusting only the fully connected layers, which saves a lot of time and resources. The next step is to quantize and convert the CNN model, deploy it to the microcontroller, and set up the input and output of the device. We are going to use TensorFlow Lite and the Arduino IDE to accomplish this task.
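To make the two data transformations in this approach concrete, here is a minimal NumPy sketch of turning audio into a two-dimensional spectrogram and then quantizing it to 8-bit integers. It is an illustration under stated assumptions, not the project's actual pipeline: the 16 kHz sample rate, 30 ms window, 20 ms stride, and the way `scale` and `zero_point` are chosen are assumptions, and the helper names are hypothetical. In practice TensorFlow Lite performs the model quantization itself.

```python
import numpy as np


def spectrogram(samples, window_size=480, stride=320):
    """Return a 2-D array: one row per time window, one column per frequency bin."""
    n_frames = 1 + (len(samples) - window_size) // stride
    window = np.hanning(window_size)
    frames = np.stack([
        samples[i * stride: i * stride + window_size] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT of each windowed frame.
    return np.abs(np.fft.rfft(frames, axis=1))


def quantize_int8(x, scale, zero_point):
    """Affine quantization: q = round(x / scale) + zero_point, clipped to int8."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)


def dequantize(q, scale, zero_point):
    """Approximate inverse of quantize_int8."""
    return (q.astype(np.float32) - zero_point) * scale


# One second of a synthetic 1 kHz tone, standing in for a spoken word.
sample_rate = 16000  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 1000 * t).astype(np.float32)

spec = spectrogram(audio)
print(spec.shape)  # (49, 241)

# Map the observed range [0, max] onto the 256 available int8 values.
scale = spec.max() / 255.0
zero_point = -128
spec_q = quantize_int8(spec, scale, zero_point)

# The round trip loses at most half a quantization step per value.
max_error = np.max(np.abs(dequantize(spec_q, scale, zero_point) - spec))
```

Each value now takes one byte instead of four, which is the kind of saving that makes a model and its inputs fit on a microcontroller, at the cost of a small rounding error.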

Code

Credits

Zheng Fang

