Published © MIT

Ultra96 - Speech Recognition at the Edge

A new attempt to create a real-time speech-to-text and text-to-speech system.

Expert | Full instructions provided | Over 2 days | 2,782

Things used in this project

Hardware components

Avnet Ultra96-V1
×1
Generic USB microphone
×1
Generic USB speakers
×1
Generic USB Hub
×1
Generic USB to Ethernet Adaptor
×1

Software apps and online services

Ubuntu Base
Yocto Project
Linux Kernel
AMD Vivado Design Suite
TensorFlow

Story


Code

Project DeepSpeech

Project DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques, based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow project to make the implementation easier.

TensorFlow

TensorFlow is an open source software library for numerical computation using data flow graphs. The graph nodes represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture enables you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow also includes TensorBoard, a data visualization toolkit.
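The data flow graph idea described above can be illustrated without TensorFlow itself. The following is a minimal pure-Python sketch, not TensorFlow's API: the `Node` class, the `evaluate` function, and the node names are hypothetical and exist only for this illustration, and plain floats stand in for tensors.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One graph node: an operation plus the names of the nodes feeding it.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    name: str
    op: Callable[..., float]
    inputs: List[str] = field(default_factory=list)


def evaluate(graph: Dict[str, Node], target: str, feeds: Dict[str, float]) -> float:
    """Evaluate `target` by pulling values along the graph edges."""
    cache: Dict[str, float] = dict(feeds)  # input values supplied by the caller

    def visit(name: str) -> float:
        if name not in cache:
            node = graph[name]
            # Each edge carries the value produced by an upstream node.
            args = [visit(dep) for dep in node.inputs]
            cache[name] = node.op(*args)
        return cache[name]

    return visit(target)


# y = (a * b) + c, expressed as a graph instead of direct arithmetic.
graph = {
    "mul": Node("mul", lambda x, y: x * y, ["a", "b"]),
    "add": Node("add", lambda x, y: x + y, ["mul", "c"]),
}

result = evaluate(graph, "add", {"a": 2.0, "b": 3.0, "c": 4.0})
print(result)
```

Because the graph only describes what to compute, a runtime is free to decide where each node runs, which is the property that lets the same description be deployed to different CPUs or GPUs.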

meta-xilinx

Yocto layer from Xilinx. The other layers can be found under the same organization on GitHub.

Mali-Lima

Open source GPU driver stack

CHaiDNN

CHaiDNN is a Xilinx Deep Neural Network library for acceleration of deep neural networks on Xilinx UltraScale MPSoCs. It is designed for maximum compute efficiency with the 6-bit integer data type, and it also supports the 8-bit integer data type.
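To give a feel for what 6-bit versus 8-bit integer data means for a network's weights, here is a minimal sketch of symmetric linear quantization in plain Python. This is an assumption made for illustration only: it is a generic textbook scheme, not CHaiDNN's actual quantizer, and the `quantize` and `dequantize` helpers and the sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize(values: List[float], bits: int) -> Tuple[List[int], float]:
    """Map floats to signed `bits`-bit integers with one shared scale factor.

    Generic symmetric scheme, assumed for illustration (not CHaiDNN's code).
    """
    qmax = 2 ** (bits - 1) - 1  # 31 for 6 bits, 127 for 8 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0
    # Round to the nearest integer code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Made-up sample weights, not taken from any real model.
weights = [0.82, -0.40, 0.05, -1.00, 0.33]

for bits in (6, 8):
    codes, scale = quantize(weights, bits)
    restored = dequantize(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"{bits}-bit codes: {codes}, worst-case error: {worst:.4f}")
```

The narrower type leaves fewer levels to represent each weight, so its rounding error is larger. That is the usual trade-off against the higher compute efficiency of narrow integer arithmetic.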

Credits
