Team TinyML 9:

Weihao Huang (wh28)

Chih-Che Chung (cc128)

Wenbai Cheng (wc47)

Published December 2, 2020

COMP 554 Fall 20 TinyML Final Project: Wake Word Detection

Implemented wake word detection on the Arduino Nano 33 with TensorFlow Lite to recognize the words "stop" and "go".

Story

Team members

Chih-Che Chung (cc128)

Wenbai Cheng (wc47)

Weihao Huang (wh28)

Goal

We built an embedded voice recognition application on the Arduino Nano 33 that takes a one-second audio clip as input and classifies it. The device lights the red LED for 3 seconds when it hears "stop" and the green LED for 3 seconds when it hears "go".

Machine Learning Model

The embedded device converts the raw audio input into a spectrogram, a two-dimensional array, which lets the model work from the most useful information in the signal. We used a convolutional neural network (CNN), which works well with multidimensional tensors, to extract features. Our dataset is the Speech Commands dataset. The model trains for 12,000 steps with a learning rate of 0.001, and then 3,000 steps with a learning rate of 0.0001, for a total of 15,000 steps.
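To make the spectrogram idea concrete, here is a minimal sketch in plain Python. It slides a window over the samples and takes the magnitude of a discrete Fourier transform for each window, giving one row per time slice. The 16 kHz sample rate, 30 ms window (480 samples), and 20 ms stride (320 samples) used in the frame-count check are assumptions taken from the standard TinyML speech example, not values stated in this write-up. The sketch also omits the windowing function and frequency-bucket averaging that a real feature pipeline would apply.

```python
import cmath
import math


def frame_count(num_samples, window, stride):
    """Number of full windows that fit in a clip of num_samples samples."""
    if num_samples < window:
        return 0
    return (num_samples - window) // stride + 1


def spectrogram(samples, window, stride):
    """Return a 2-D list: one row of DFT magnitudes per window position."""
    rows = []
    for start in range(0, len(samples) - window + 1, stride):
        frame = samples[start:start + window]
        row = []
        for k in range(window // 2 + 1):  # non-negative frequencies only
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / window)
                      for n, x in enumerate(frame))
            row.append(abs(acc))
        rows.append(row)
    return rows


# Toy input: a pure tone with exactly 2 cycles per 16-sample window.
tone = [math.sin(2 * math.pi * 2 * n / 16) for n in range(64)]
spec = spectrogram(tone, window=16, stride=8)

# Under the assumed settings, a one-second clip yields this many time slices.
slices_per_clip = frame_count(16000, 480, 320)
```

In the toy input every row peaks at frequency bin 2, which is the kind of structure a CNN can pick up from the two-dimensional array.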

Before we start

To understand the project, learn how to set up the Arduino Nano 33, and get familiar with the Arduino IDE, we built a practice application before starting our final project. We followed the instructions in Chapter 7, "Wake-Word Detection: Building an Application", of "TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers". Using the pre-trained model, we successfully uploaded the practice application to the Arduino Nano 33 board. The application uses the on-board microphone to capture audio, transforms the input into suitable features, runs the features through the model, and then converts the model's result into a final output that it displays on the on-board LED. The following video shows the output of our practice application: a green light for the input word "Yes" and a red light for the input word "No".

Figure 1 Yes and No Video

Training our new model in Colab

Configuring parameters

The training scripts are configured through many command-line flags that control everything from the model’s architecture to the words it will be trained to classify.

To make it easier to run the scripts, the notebook’s first cell stores some important values in environment variables. These will be substituted into the scripts’ command line flags when they are run.

WANTED_WORDS allows us to select the words on which to train the model. By default, the selected words are “yes” and “no”. For our model, we selected “stop” and “go”.

TRAINING_STEPS refers to the number of times a batch of training data is run through the network and its weights and biases are updated. LEARNING_RATE sets the rate of adjustment. By default, the model trains for 15,000 steps with a learning rate of 0.001, and then 3,000 steps with a learning rate of 0.0001. Our model trained for 12,000 steps with a learning rate of 0.001, and then 3,000 steps with a learning rate of 0.0001.
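The configuration described above can be sketched as follows. The variable names WANTED_WORDS, TRAINING_STEPS, and LEARNING_RATE come from the text; the comma-separated value format and the helper functions parse_schedule and learning_rate_at are assumptions made for illustration, not code from the notebook.

```python
import os

# Values used for our model; the comma-separated format is an assumption.
os.environ["WANTED_WORDS"] = "stop,go"
os.environ["TRAINING_STEPS"] = "12000,3000"
os.environ["LEARNING_RATE"] = "0.001,0.0001"


def parse_schedule(steps_csv, rates_csv):
    """Pair each training phase length with its learning rate."""
    steps = [int(s) for s in steps_csv.split(",")]
    rates = [float(r) for r in rates_csv.split(",")]
    if len(steps) != len(rates):
        raise ValueError("each phase needs exactly one learning rate")
    return list(zip(steps, rates))


def learning_rate_at(step, schedule):
    """Learning rate in effect at a 1-based training step."""
    boundary = 0
    for phase_steps, rate in schedule:
        boundary += phase_steps
        if step <= boundary:
            return rate
    raise ValueError("step is beyond the end of the schedule")


schedule = parse_schedule(os.environ["TRAINING_STEPS"],
                          os.environ["LEARNING_RATE"])
total_steps = sum(phase_steps for phase_steps, _ in schedule)
```

With these values the schedule covers 15,000 steps in total, and the learning rate drops from 0.001 to 0.0001 after step 12,000.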

Installing the correct dependencies

Install a specific version of the TensorFlow pip package that includes the ops required for training.

Clone a corresponding version of the TensorFlow GitHub repository so that we can access the training scripts.

Monitoring training with TensorBoard

TensorBoard is a user interface that shows graphs, statistics, and other insights into how training is going.

For instance, TensorBoard shows two graphs, “accuracy” and “cross_entropy”, as shown in Figure 2 and Figure 3.

Figure 2 The "accuracy" graph

Figure 3 The "cross_entropy" graph

The “accuracy” graph shows the model’s accuracy on its y-axis, which signals how much of the time it is able to detect a word correctly. The “cross_entropy” graph shows the model’s loss, which quantifies how far from the correct values the model’s predictions are.
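As a small worked example of the two metrics, the sketch below computes accuracy and average cross-entropy for three made-up predictions. The four-class layout (silence, unknown, stop, go) and the probability values are invented for illustration and are not results from our training run.

```python
import math


def accuracy(probabilities, labels):
    """Fraction of examples whose highest-probability class is the true class."""
    correct = sum(1 for p, y in zip(probabilities, labels)
                  if p.index(max(p)) == y)
    return correct / len(labels)


def cross_entropy(probabilities, labels):
    """Average negative log-probability assigned to the true class."""
    total = -sum(math.log(p[y]) for p, y in zip(probabilities, labels))
    return total / len(labels)


# Hypothetical class order: 0 = silence, 1 = unknown, 2 = stop, 3 = go.
predictions = [
    [0.10, 0.10, 0.70, 0.10],  # true class: stop (confident and correct)
    [0.05, 0.05, 0.10, 0.80],  # true class: go   (confident and correct)
    [0.20, 0.20, 0.50, 0.10],  # true class: go   (wrong, low probability on go)
]
true_labels = [2, 3, 3]

acc = accuracy(predictions, true_labels)
loss = cross_entropy(predictions, true_labels)
```

The third prediction lowers the accuracy to two out of three, and it also dominates the loss because the model assigned only 0.10 to the correct class.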

Converting the training output into a model we can use

Firstly, convert the frozen graph file into a fully formed TensorFlow Lite model.

Secondly, convert the TensorFlow Lite model into a C array.
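The second step is often done with a command-line tool such as xxd -i. The sketch below shows the same idea in Python so the shape of the output is visible: every byte of the model file becomes one element of a C array, followed by a length constant. The function bytes_to_c_array is hypothetical, the output layout is only loosely modeled on what such tools produce, and the input here is placeholder bytes rather than a real model.

```python
def bytes_to_c_array(data, name, per_line=12):
    """Render raw bytes as C source: an array plus a matching length constant."""
    lines = [f"const unsigned char {name}[] = {{"]
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines.append("};")
    lines.append(f"const int {name}_len = {len(data)};")
    return "\n".join(lines)


# Placeholder bytes standing in for the contents of a .tflite file.
fake_model = bytes(range(20))
c_source = bytes_to_c_array(fake_model, "g_tiny_conv_micro_features_model_data")
```

For a real model, the bytes would be read from the converted file in binary mode instead of being generated.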

Using our newly trained model in our project

Replacing the model

Replace the contents of the array and the value of the constant g_tiny_conv_micro_features_model_data_len in the micro_features_model.cpp file with the newly generated values.

Updating the Labels

Swap “yes” and “no” for “go” and “stop” in the arduino_command_responder.cpp file.
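The reason the labels must be updated is that the model only outputs one score per class, and the application maps the index of the highest score back to a word. The sketch below models that lookup in Python. It assumes the label list starts with "silence" and "unknown" followed by the wanted words in training order, which is how the standard speech commands example arranges it; the helper names and the score values are hypothetical.

```python
def category_labels(wanted_words_csv):
    """Build the label list: two fixed classes, then the wanted words in order."""
    return ["silence", "unknown"] + wanted_words_csv.split(",")


def top_label(scores, labels):
    """Return the label whose position matches the highest score."""
    if len(scores) != len(labels):
        raise ValueError("the model must output one score per label")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]


labels = category_labels("stop,go")

# Made-up output scores, one per label.
heard = top_label([3, 10, 200, 42], labels)
```

If the labels on the device were left as "yes" and "no", or listed in a different order than the one used in training, the same scores would be reported under the wrong word.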

Updating arduino/command_responder.cpp

Swap 'y' and 'n' for 'g' and 's'.
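The on-device code is C++, so the following Python is only a model of the decision logic, not the code we uploaded. It shows the first-character check described above together with the 3-second LED hold from the project goal. All function names are hypothetical. One thing to watch for, assuming the label set also contains "silence": a first-character test on 's' matches that label as well as "stop", which a full-string comparison avoids.

```python
LED_HOLD_MS = 3000  # LED stays on for 3 seconds, as described in the goal


def led_by_first_char(found_command):
    """First-character check, mirroring the 'g' / 's' swap."""
    first = found_command[:1]
    if first == "g":
        return "green"
    if first == "s":
        return "red"
    return None


def led_by_full_label(found_command):
    """Stricter variant that compares the whole label."""
    return {"go": "green", "stop": "red"}.get(found_command)


def active_led(now_ms, command, command_time_ms):
    """LED that should be lit at now_ms for a command heard at command_time_ms."""
    led = led_by_full_label(command)
    if led is not None and 0 <= now_ms - command_time_ms < LED_HOLD_MS:
        return led
    return None
```

For example, active_led(1000, "go", 0) gives "green", while the same call at 3000 ms gives None because the hold period has ended.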

Compile and upload to the board

Figure 4 Compile and upload to the board

Test Functionality

Figure 5 Serial Monitor

Figure 6 Stop and Go Video
