Speak & Tell
This AI-powered magic picture knows you better than you know yourself. Will you like what it tells you?
Notifications, text messages, emails, flashing lights, jarring sounds, advertisements… look over here… now look there. Those old enough to remember the epic infomercials of the 1990s might be tempted to shout “Stop the insanity!” right about now. But that is how most of us live our digital lives these days. Phones, tablets, smartwatches, computers, and a whole host of other devices are constantly pestering us as they try to get our attention. Not a small part of it, but our whole, undivided attention.
Evan King is an engineer who spent the past four years working to earn a doctorate heavily focused on ubiquitous computing. For those unfamiliar with this area, ubiquitous computing (or ubicomp, for those in the know) is all about seamlessly blending technology into our everyday lives, rather than letting it dominate us and constantly steal away our full attention. With a shiny new degree hung on the wall — and a gig at Useful Sensors — King is now taking on the role of a Susan Powter for the digital era. In particular, he is developing calm technologies that subtly provide us with information, rather than beating us over the head with it and frying our nerves.
King’s first idea was to build a device that passively listens to all of the conversations that take place around it, then determines the most commonly used words before displaying them on a screen, with more commonly spoken words being shown in a larger font. Look at it, don’t look at it… either way is just fine with this easygoing device. But if you do happen to gaze at the screen — which looks no different from any other picture on an end table — you might learn some interesting things. What do you and those around you talk most about? Are things looking pretty healthy, or do you need to make some changes? Be careful with this one; it might reveal some things that you don’t want to hear.
I know what you’re thinking, but no, this device does not require that your every word be sent to a third-party, cloud-based service for processing. I’m sure that would be totally fine and would never be used against you in any way (ask me about the bridge in Brooklyn that I have up for sale, by the way), but since Useful Sensors is all about building artificial intelligence tools for edge computing systems, King ran all of the algorithms locally on a Raspberry Pi computer.
This was made possible by using Silero VAD to detect voice activity, then running the detected audio through one of Useful Sensors' own Moonshine models for transcription. To keep irrelevant words like "and" or "but" from filling up the screen, King supplied lists of excluded words. With the transcriptions and the excluded word list in hand, determining spoken word frequency becomes a very simple calculation, and the most common words of all are shown on a rather large ePaper display.
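The frequency-counting step can be sketched in a few lines of Python. This is a minimal illustration, not King's actual code: the `transcripts` list stands in for text that would really come from Moonshine transcribing speech segments flagged by Silero VAD, and the stopword list here is just a small example of the kind of excluded-word lists described above.

```python
from collections import Counter
import re

# Example excluded-word list: common function words that would
# otherwise dominate the display (the real lists are King's own).
STOPWORDS = {"and", "but", "the", "a", "an", "of", "to", "in", "it",
             "is", "that", "you", "i", "we", "so", "or", "for", "on"}

def top_words(transcripts, n=10, stopwords=STOPWORDS):
    """Count word frequency across transcript strings and return
    the n most common non-excluded words with their counts."""
    counter = Counter()
    for text in transcripts:
        # Lowercase and pull out word-like tokens (letters/apostrophes)
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stopwords:
                counter[word] += 1
    return counter.most_common(n)

# Stand-in transcripts; on the device these would come from
# Moonshine running on audio segments detected by Silero VAD.
transcripts = [
    "dinner plans and the weekend trip",
    "work again, always work and more work",
]
print(top_words(transcripts, n=3))
```

The returned word-count pairs would then drive the rendering step, with each word's font size scaled by its count before the result is pushed to the ePaper display.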
This device may not be in any way necessary, but it is incredibly thought provoking and makes me wonder what calm technologies could do in other areas of our lives. I also can’t help but wonder how much more interesting it would be to have a display like this for our thoughts, capturing the words that are left unsaid. But even if that did not require electrodes to be implanted in the brain, what it reveals would likely make it anything but a calm technology.