Speak Up So I Can Hear You
The ChatBox smart speaker lets you ditch the keyboard and have a one-on-one conversation with ChatGPT.
Over the past several months, ChatGPT has become one of the most popular large language models in the world. With its ability to converse in natural language, ChatGPT has captured the attention of businesses, individuals, and popular culture alike. Its rise in popularity has had a profound impact on many aspects of our lives.
Businesses have been quick to recognize the potential benefits of ChatGPT. The technology has become an integral part of many customer service operations. Companies are using ChatGPT to interact with customers on their websites, social media platforms, and messaging applications. With its ability to provide quick and accurate responses, ChatGPT has helped businesses streamline their customer service operations and improve customer satisfaction. Furthermore, the technology has been used in various industries such as healthcare, finance, and marketing to automate repetitive tasks and increase efficiency.
On a personal level, ChatGPT has also become an essential tool for many individuals. With its conversational ability, ChatGPT can help people with various tasks such as writing, research, and even mental health issues.
To date, however, the primary way of interacting with this tool has been through a text-based web application. In recent years, we have grown accustomed to talking to smart speakers and other voice assistants instead of typing. This has left many people wondering when they will finally be able to interact with a language model in a more natural way, using only their voice.
There may not yet be any major commercial products that incorporate ChatGPT, or any of the other popular large language models, but a number of personal projects have done so in a variety of ways. The latest such effort is a device called ChatBox, developed by Hoani Bryson.
ChatBox is a nicely finished device for a personal project, with a wooden case, large speaker, RGB LED ring, and an LCD. A Raspberry Pi 4 provides the primary processing power, with a Teensy 4.0 development board included to handle the operation of the LEDs. A USB microphone and a slew of 3D-printed parts complete the hardware build.
The case of the device has a pushbutton that a user can press to start a new prompt. This triggers a custom application that records the user’s voice via the USB microphone, then sends the recording to OpenAI’s voice transcription API, which converts it to text. This text is then fed into OpenAI’s ChatGPT API, which returns a response, just as it would if one were using the web application. Finally, the response is passed to the free and open source eSpeak speech synthesizer, and the synthesized speech is played on the system’s speaker.
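The flow described above (record, transcribe, chat, speak) can be sketched in a few lines of Python. This is not Bryson's code, only a minimal illustration of the same four stages. The names run_turn, record, transcribe, and chat are hypothetical; the recording and API stages are passed in as plain functions so the sketch makes no assumptions about how ChatBox actually calls OpenAI's services. The only concrete stage is the last one, which assumes the espeak command-line tool is installed and uses its --stdin option.

```python
import subprocess
from typing import Callable


def speak_with_espeak(text: str) -> None:
    """Speak text aloud with the eSpeak CLI (assumes `espeak` is on PATH)."""
    # Feeding the text on stdin avoids it being parsed as a command-line option.
    subprocess.run(["espeak", "--stdin"], input=text, text=True, check=True)


def run_turn(
    record: Callable[[], bytes],          # hypothetical: capture audio from the USB microphone
    transcribe: Callable[[bytes], str],   # hypothetical: call a speech-to-text API
    chat: Callable[[str], str],           # hypothetical: call a chat-completion API
    speak: Callable[[str], None] = speak_with_espeak,
) -> str:
    """Run one button press worth of conversation and return the reply text."""
    audio = record()            # 1. record the user's voice
    prompt = transcribe(audio)  # 2. convert the recording to text
    reply = chat(prompt)        # 3. send the text to the language model
    speak(reply)                # 4. read the reply aloud
    return reply
```

Because each stage is just a function, the pipeline can be exercised on a desktop with stand-ins (for example, a transcribe function that returns a fixed string) before any microphone or API key is involved.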
The LED ring flashes in different patterns as the system works or speaks a response. The LCD shows the state of the device, such as whether it is waiting for a prompt or waiting for an API response.
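One simple way to keep the LED ring and the LCD in step is to drive both from a single device state. The sketch below is an assumption about how such a mapping might look, not a description of ChatBox's firmware: the State enum, the RECORDING state, the pattern names, and status_for are all hypothetical.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    # The enum value doubles as the text shown on the LCD.
    WAITING_FOR_PROMPT = "Waiting for prompt"
    RECORDING = "Listening"  # assumed state, not mentioned in the write-up
    WAITING_FOR_API = "Waiting for API response"
    SPEAKING = "Speaking"


# Hypothetical LED pattern names; real patterns would live in the LED controller.
LED_PATTERN = {
    State.WAITING_FOR_PROMPT: "idle",
    State.RECORDING: "pulse",
    State.WAITING_FOR_API: "spin",
    State.SPEAKING: "flash",
}


def status_for(state: State) -> Tuple[str, str]:
    """Return the (LCD text, LED pattern name) pair for a device state."""
    return state.value, LED_PATTERN[state]
```

With this arrangement, the main application only has to announce a state change, and both outputs follow from the same lookup.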
If you are technically inclined, the write-up provided by Bryson should be enough for you to build something that works like ChatBox, even if it does not completely resemble it. If not, you might have to wait until these groundbreaking language models are finally integrated into a commercial product.