Espressif's Latest ESP-SR Speech Recognition Library Boosts Accuracy and Optimizes Memory Usage
The library's latest release brings improved documentation, support for the ESP32-S3, ESP32-S2, and ESP32-C3, and custom keyword creation.
Espressif has launched a new version of its ESP-SR speech recognition library for the ESP32 microcontroller range, now offering better accuracy, reduced memory usage, and support for custom keywords in voice control applications.
"We're excited to announce the release of ESP-SR v1.2.0, an advanced version of Espressif's Speech Recognition Library," writes Espressif's Sun Xiangyu of the new library release. "This update brings five significant enhancements to the previous version and is designed specifically for the ESP32-S3 microcontroller platforms."
The first of these improvements is accuracy: Espressif claims that the accuracy of on-device speech recognition using the library has been "substantially increased," pointing to a word error rate of nine percent and a response accuracy rate for speech commands of up to 96.8 percent for spoken English.
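For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below assumes that conventional definition; the announcement does not say how Espressif measured its figure, and the function name and sample phrases are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: this is the textbook definition, not Espressif's
    published test procedure.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical command: one wrong word out of four.
print(word_error_rate("turn on the light", "turn off the light"))
```

On this definition, a nine percent word error rate corresponds to roughly one wrong, missing, or extra word for every eleven words spoken.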
Some of this improved accuracy comes from the second of the claimed improvements: enhanced noise reduction, which does a better job of separating speech from background noise than earlier releases. The third enhancement is support for custom keywords in voice command recognition, so users are no longer limited to the pre-set keywords offered by Espressif.
The final improvements are new and more comprehensive documentation, designed to better ease newcomers into the library's use, and optimizations which reduce the library's memory usage — leaving more resources available for other tasks running on the same hardware.
The new library supports the base ESP32 family along with the ESP32-S3, ESP32-S2, and ESP32-C3. Speech recognition is available in Chinese and English on all devices, but the library's support for text-to-speech was limited to Chinese at the time of writing.
The new ESP-SR release is now available on GitHub, with the source code published under a permissive variant of the MIT license.