VueBuds Put AI Cameras in Your Ears
VueBuds put AI vision in your ears, using tiny cameras to translate and ID objects in real-time without the bulk of awkward smart glasses.
Smart glasses have received a lot of attention in the press over the past few years, but how often do you actually see someone wearing a pair? Unless you live in Silicon Valley, seeing smart glasses in the wild is a rare occurrence. A major reason for this is that most people simply don’t want to wear glasses — especially not the bulky and awkward ones that have processors, cameras, and batteries stuffed into them.
However, people have no problem at all wearing earbuds. Wherever you go, you’ll find people using AirPods or other wireless earbuds around the clock. A group of researchers at the University of Washington noticed this and recognized that the platform might make for a better way to carry an always-on AI assistant around with us. The only question was how: earbuds are far smaller than glasses, so where would all of the extra hardware go?
Their answer is a prototype system called VueBuds: wireless earbuds equipped with tiny, low-power cameras that can “see” the world and answer questions about it in real-time. Instead of streaming high-resolution video like smart glasses, VueBuds capture low-resolution, black-and-white still images and send them over Bluetooth to a nearby device. There, an on-device AI model processes the images and responds in about a second, enabling interactions like translating text on packaging or identifying objects in view.
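The bandwidth reasoning behind stills-over-video is easy to sketch with some back-of-the-envelope arithmetic. The resolutions and link rate below are illustrative assumptions, not figures reported by the researchers:

```python
# Back-of-the-envelope comparison: one low-res grayscale still versus
# streaming uncompressed video. Resolutions and the Bluetooth rate are
# illustrative assumptions, not numbers from the VueBuds paper.

def frame_bytes(width, height, channels=1, bits=8):
    """Uncompressed size of a single frame, in bytes."""
    return width * height * channels * bits // 8

still = frame_bytes(160, 160)                # 25,600 bytes (~25 KB)
video = frame_bytes(1280, 720, channels=3)   # 2,764,800 bytes (~2.6 MB)

BT_BPS = 2_000_000  # ~2 Mbps, a rough practical Bluetooth Classic rate

print(f"grayscale still transfers in ~{still * 8 / BT_BPS:.2f} s")
print(f"uncompressed 720p at 30 fps needs ~{video * 8 * 30 / BT_BPS:.0f}x "
      f"that Bluetooth budget")
```

Under these assumptions a single still crosses the link in about a tenth of a second, while uncompressed video would exceed the link's capacity by two orders of magnitude, which is why on-demand stills are the more practical design.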
To make this possible, the team modified off-the-shelf earbuds — specifically the Sony WF-1000XM3 — and embedded camera modules roughly the size of a grain of rice. These cameras consume under 5 milliwatts of power and add only a modest battery overhead, even with frequent use. The system avoids continuous video capture, which would be too demanding for both power and Bluetooth bandwidth. Instead, it activates the cameras on demand.
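To get a feel for what a "modest battery overhead" from a sub-5-milliwatt, on-demand camera looks like, some duty-cycle arithmetic helps. The battery capacity and usage pattern below are assumptions chosen for illustration, not measurements from the study:

```python
# Duty-cycled energy cost of an on-demand camera drawing under 5 mW.
# The battery capacity and usage pattern are illustrative assumptions,
# not figures from the VueBuds paper.

CAMERA_MW = 5        # camera draw while active (upper bound from the article)
BATTERY_MWH = 85     # ~25 mAh at 3.4 V, a plausible earbud cell (assumed)

def overhead_pct(captures_per_hour, seconds_per_capture, hours):
    """Percent of the battery spent on camera captures over a session."""
    active_hours = captures_per_hour * seconds_per_capture * hours / 3600
    return 100 * CAMERA_MW * active_hours / BATTERY_MWH

# 60 one-second captures per hour across a 4-hour listening session
print(f"camera overhead: ~{overhead_pct(60, 1, 4):.2f}% of the battery")
```

Even with a capture every minute, the camera stays well under one percent of the assumed battery, which is what makes duty-cycled stills viable where continuous video would not be.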
Getting the cameras aligned correctly proved to be another challenge. Cameras positioned at ear level don’t align with the user’s eyes and can be partially blocked by the face. The researchers solved this by angling each camera slightly outward and combining images from both earbuds into a single stitched view. This binocular approach expands the field of view and reduces blind spots, while also improving processing speed compared to handling two separate images.
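As a toy illustration of the stitched binocular view, the sketch below simply places a left and a right frame side by side. Real stitching would also align and blend the overlapping regions between the two outward-angled cameras; the tiny frames here are stand-ins, not the system's actual geometry:

```python
# Toy sketch: combine frames from the left and right earbud cameras into
# one wide image, so the vision model processes a single input instead of
# two separate ones. Real stitching would align the overlap; this is a
# simplified stand-in.

def stitch(left, right):
    """Horizontally concatenate two equal-height grayscale frames,
    each represented as a list of pixel rows."""
    if len(left) != len(right):
        raise ValueError("frames must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]

left_frame  = [[10, 20], [30, 40]]            # 2x2 frame, left earbud
right_frame = [[50, 60], [70, 80]]            # 2x2 frame, right earbud
wide_view   = stitch(left_frame, right_frame) # one 2x4 stitched view
```

Handing the model one stitched image rather than two also matches the article's point about processing speed: a single wider input avoids running inference twice.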
In testing, VueBuds performed on par with high-end smart glasses like Ray-Ban Meta Glasses across a range of visual tasks. Participants even preferred VueBuds for translation tasks, though the glasses performed slightly better at counting objects. Overall accuracy reached over 80% for object recognition and translation, and over 90% for reading book titles and authors.
The system runs on modern vision-language models such as Qwen2.5-VL, demonstrating that even low-resolution grayscale images can support meaningful AI interactions. Importantly, all processing happens locally on the device, addressing privacy concerns associated with cloud-based systems. A visible indicator light also alerts users when images are being captured.
VueBuds are still a research prototype, so you can’t buy a pair just yet. If you’d like to learn more about this technology, the research paper is available online.