This "Proactive Hearing Assistant" Picks Your Conversational Partners' Voices Out of a Cacophony

Paired AI models, feeding into noise-canceling headphones, figure out who's involved in your conversation and mute everyone else.

Researchers from the University of Washington, working with Hearvana AI, have developed a pair of smart headphones that can isolate particular voices from a cacophony — letting you focus on the people you're actually talking to, even in a crowd.

"Existing approaches to identifying who the wearer is listening to predominantly involve electrodes implanted in the brain to track attention," says Shyam Gollakota, senior author on the paper detailing the team's work. "Our insight is that when we're conversing with a specific group of people, our speech naturally follows a turn-taking rhythm. And we can train AI to predict and track those rhythms using only audio, without the need for implanting electrodes."

Two chained AI models have given noise-canceling headphones a new trick: canceling out everyone except your conversation partners. (📹: Hu et al)

Dubbed a "proactive hearing assistant," the team's creation is built on modified off-the-shelf noise-canceling headphones. Once the headphones are donned, an artificial intelligence (AI) model tracks who speaks when, identifying the active participants in the wearer's conversation by their low levels of overlapping speech — a sign that the other parties are listening and waiting for their turn to talk.
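The paper's models are neural networks, but the underlying turn-taking cue can be illustrated with a toy calculation. The sketch below — entirely hypothetical, not the team's code — scores candidate speakers by how often their voice-activity frames overlap the wearer's: conversation partners who wait their turn overlap rarely, while unrelated bystanders talk straight over the wearer.

```python
# Toy illustration (not the team's actual model): spot likely conversation
# partners from voice-activity timelines. A partner in a turn-taking
# conversation should rarely speak at the same time as the wearer.

def overlap_ratio(wearer, candidate):
    """Fraction of the candidate's active frames that overlap the wearer's."""
    active = sum(candidate)
    if active == 0:
        return 1.0  # a silent track gives no turn-taking evidence
    overlap = sum(1 for w, c in zip(wearer, candidate) if w and c)
    return overlap / active

def find_partners(wearer, candidates, threshold=0.2):
    """Return indices of candidate tracks with low overlap against the wearer."""
    return [i for i, c in enumerate(candidates)
            if overlap_ratio(wearer, c) < threshold]

# 1 = speaking during that audio frame, 0 = silent
wearer    = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]
partner   = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]  # speaks only in the wearer's pauses
bystander = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # talks over the wearer freely

print(find_partners(wearer, [partner, bystander]))  # → [0]
```

In practice the real system infers these rhythms directly from raw audio rather than from clean voice-activity labels, which is what the trained model is for.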

Once the speakers are identified, the audio is fed to a second model that isolates only those speakers — canceling out all other speech and noise to provide improved clarity. It all happens quickly enough, the system's creators claim, that there's no confusing delay in the audio, and the prototype can handle up to four conversation partners in addition to the person wearing the headphones.
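The hand-off between the two stages can be pictured as a simple pipeline. This minimal sketch — structure and names assumed for illustration, not taken from the released code — shows the idea: the first stage tags which separated voice tracks belong to conversation partners, and the second mixes only those tracks into the output the wearer hears.

```python
# Assumed two-stage structure, for illustration only: keep the tracks
# flagged as conversation partners and mix them sample by sample;
# everything else is simply dropped from the output.

def render_output(tracks, partner_ids):
    """Mix only the partner tracks into a single output signal."""
    n = len(tracks[0])
    return [sum(tracks[i][t] for i in partner_ids) for t in range(n)]

tracks = [
    [0.2, 0.4, 0.1],  # a partner's voice, kept
    [0.5, 0.5, 0.5],  # a background talker, canceled
]
print(render_output(tracks, partner_ids=[0]))  # → [0.2, 0.4, 0.1]
```

The hard part, of course, is producing those per-speaker tracks from a single noisy recording fast enough for live playback — that is the job of the second neural model.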

"Everything we've done previously requires the user to manually select a specific speaker or a distance within which to listen, which is not great for user experience," says lead author Guilin Hu of the team's earlier efforts, which have included headphones capable of "semantic hearing" — picking out particular sound types to bypass the cancellation effect. "What we've demonstrated is a technology that's proactive — something that infers human intent non-invasively and automatically."

More information is available on the project's website, with a preprint of the paper available on Cornell's arXiv server; source code has been released on GitHub under the permissive MIT license.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.