This Affordable DIY “Camera” Uses AI to Describe the Scene

Code Strong's ESP32-powered breadboard camera spits out text descriptions instead of photos.

Photography is a relatively recent invention in historical terms. For most of humanity’s existence, we had to rely on illustrations and descriptions to communicate imagery to each other. We are, if nothing else, storytellers. But now that everyone has a camera in their pocket at all times, some of that storytelling has been lost. Since we’re doing it with everything else, we might as well offload that creativity to AI. That’s exactly what Code Strong did with this affordable DIY “camera” that uses an AI to describe captured scenes.

This device does have a camera, but it doesn’t produce photos — at least not to present to the user. Instead, captured images go to an AI for analysis. The AI looks at the photos and then generates a text description of what it “sees.” Once complete, it will display that description on a small OLED screen for the user to read.

The camera has a few different modes intended to provide varying kinds of descriptions. One, for example, is what you’d expect: a general summary of the scene. Another describes the weather based on clues in the image. And another even provides an analysis of people, which is perfect for feedback on selfies that aren’t getting much traction on Instagram.

This isn’t the first time we’ve seen a device like this, but Code Strong’s implementation stands out because it is very affordable. The key component is an ESP32-CAM module, which is a camera-equipped shield for an ESP32 development board. The other components include an OLED screen, a buzzer for notifications, and a few buttons for snapping photos and switching modes. Power comes over USB from a portable battery bank. All of those components go on a small breadboard, so this doesn’t require a fancy PCB or even an enclosure.

The firmware, programmed in the Arduino IDE, takes advantages of OpenAI’s API. The ESP32 connects to a predefined Wi-Fi network and then, after capturing a photo, it makes a request to the API to generate a description of the image. Additional prompts, like “describe the weather,” give it some direction to fulfill the requirements of the selected mode.

It would only cost around $35 to purchase the parts necessary for this project, so it is perfect for people who want to experiment with AI and API requests on a budget.

cameroncoward

Writer for Hackster News. Proud husband and dog dad. Maker and serial hobbyist. Check out my YouTube channel: Serial Hobbyism

Latest Articles