I previously built an offline, LLM-based voice assistant (details here). It works quite well, but it has one problem: you have to press a button to ask it a question. That is inconvenient, and it runs counter to the whole concept of a voice assistant.
I do not want to bog the system down by continuously listening for a wakeword itself, so I decided to upgrade my voice assistant with an Infineon PSoC 6 Artificial Intelligence Evaluation Kit. This way, the PSoC 6 can continuously listen for a wakeword and only trigger the voice assistant into action when I am ready to make a request.
The build of the voice assistant is already documented here, so I won't cover that again. I will just note that the chatbot.py script needs to be updated to this version so that it responds to the PSoC 6 rather than a push button. Instead, this article focuses on how I got the PSoC 6 to do wakeword detection for the voice assistant. But first, here is a demo of the project in action:
First, make sure that your dev board has already been flashed with the streaming firmware. There is additional information about that here. Once that has been sorted out, you will need to install and launch DEEPCRAFT Studio. Create a new data collection project, then configure it to collect data from the microphone using the drag-and-drop graphical interface. Here is one of my recording sessions after I labeled the data:
Once you have the data prepared, create a new classification project, then add the data to it:
Be sure to click the Redistribute Sets... button to split the data between training, validation, and testing sets.
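That one click handles the split for you, but if you are curious what it amounts to, here is a rough Python sketch of the same idea using scikit-learn. The 60/20/20 ratio and the stratified split are my own assumptions for illustration, not necessarily what the Studio uses internally.

```python
# Conceptual sketch of what the Redistribute Sets... step does: dividing
# labeled clips into training, validation, and test sets. DEEPCRAFT Studio
# handles this internally; the ratios here are just an illustration.
from sklearn.model_selection import train_test_split

clips = [[0.0] * 16000 for _ in range(100)]   # placeholder 1 s clips at 16 kHz
labels = ["wakeword"] * 50 + ["background"] * 50

# Carve off 40% for validation + test, then split that portion in half,
# keeping the class balance the same in every set (stratify).
train_x, rest_x, train_y, rest_y = train_test_split(
    clips, labels, test_size=0.4, stratify=labels, random_state=42)
val_x, test_x, val_y, test_y = train_test_split(
    rest_x, rest_y, test_size=0.5, stratify=rest_y, random_state=42)
```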
You will also need to select a model to perform the classifications. I chose a 1D convolutional LSTM network.
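DEEPCRAFT Studio defines and trains the network for you, so there is no code to write here. But for intuition, here is a rough Keras sketch of the general shape of a 1D convolutional LSTM classifier. The layer sizes and the input shape (feature frames by mel bins) are illustrative assumptions, not the Studio's actual architecture.

```python
# Illustrative sketch of a 1D conv + LSTM wakeword classifier, roughly the
# shape of network chosen above. All dimensions here are assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(50, 40)),           # 50 feature frames, 40 mel bins
    tf.keras.layers.Conv1D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, 3, activation="relu"),
    tf.keras.layers.LSTM(64),                        # summarize the sequence
    tf.keras.layers.Dense(2, activation="softmax"),  # wakeword vs. background
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

The convolutions extract local acoustic features, while the LSTM tracks how those features evolve over time, which is what makes this architecture a good fit for spotting a short spoken phrase.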
When ready, click the Start New Training Job... button. This job will run in the cloud, so it will not be eating up resources on your own machine, and you do not need to install any machine learning frameworks.
After the training process finishes, you will be given metrics to help you evaluate which model works best for your application. Once you make your selection, download the .h5 model file.
Now use the Code Gen tab to create representations of the model that can be deployed to the hardware.
Finally, pivot over to ModusToolbox. Create a new project based on this example project. After that, replace the files in the models folder of your new project with the model.c and model.h files produced by the Code Gen step. Now you can click on Build Project (you can also make any necessary code edits first in the Eclipse IDE).
The last step is to click the button to deploy your project. The predictions will be output via serial over USB, so I read that via my voice assistant Python script to determine when the wakeword was spoken.
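Here is a minimal sketch of what that serial-watching logic can look like, using pyserial. The port name, baud rate, line format, and confidence threshold are all assumptions; check your board's actual output and adjust to match.

```python
# Minimal sketch of watching the PSoC 6 over serial for a wakeword hit.
# Port name, baud rate, line format, and threshold are assumptions.
import serial

def wait_for_wakeword(port="/dev/ttyACM0", baud=115200, threshold=0.9):
    """Block until the board reports the wakeword with high confidence."""
    with serial.Serial(port, baud, timeout=1) as ser:
        while True:
            line = ser.readline().decode(errors="ignore").strip()
            if not line:
                continue
            # Assumed output format: "<label>,<confidence>", e.g. "wakeword,0.97"
            parts = line.split(",")
            if len(parts) == 2 and parts[0] == "wakeword":
                try:
                    if float(parts[1]) >= threshold:
                        return
                except ValueError:
                    pass  # ignore malformed lines

wait_for_wakeword()
# ...then hand off to the rest of the assistant (record, transcribe, query the LLM)
```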
That's it! You've got your own offline LLM-based voice assistant, with this little board doing the tough job of continuous wakeword detection: