Everyday life often presents small inconveniences that we tend to ignore—until they repeat themselves. One morning, after finishing breakfast, I realized the trash bin wasn’t nearby. That simple moment sparked an idea: what if the trash bin could come to me instead?
This project is the result of that thought—a gesture-controlled smart trash bin robot that responds to hand signals using embedded AI vision. By combining Seeed Studio’s Grove Vision AI Module V2 with the compact yet powerful XIAO ESP32S3, I created a mobile trash bin that can move and interact intelligently without physical contact.
In the following sections, I’ll explain the concept, hardware, software, and design process behind the project. If you’re interested in improving or expanding it, I’d love to hear your ideas.
## AI Vision Layer

Gesture detection is handled by the Grove Vision AI Module V2, which runs a ready-to-use AI model provided through SenseCraft AI.
Key Steps:

- Sign in to the SenseCraft AI platform
- Choose a compatible gesture recognition model
- Flash the model to Vision V2 via USB-C
- Monitor real-time detection results through the preview interface
The biggest advantage of this approach is accessibility—no complex training pipeline or deep AI knowledge is required. The visual feedback makes debugging fast and intuitive.
Once a gesture is detected, Vision V2 sends structured recognition data to the main controller over I²C communication.
## Embedded Control Layer

The XIAO ESP32S3 serves as the central processing unit of the robot. Its tasks include:

- Receiving recognition data from Vision V2
- Decoding the transmitted JSON information
- Determining the appropriate action
- Driving motors and controlling the servo
The Grove Shield simplifies wiring and ensures compatibility across the XIAO series. Development is done using the Arduino framework, which provides flexibility and rapid iteration.
## Development Environment

Software Stack:

- VS Code
- PlatformIO
- Board: Seeed Studio XIAO ESP32S3
- Framework: Arduino
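For reference, the matching PlatformIO configuration is short. This is a minimal sketch assuming the stock `seeed_xiao_esp32s3` board definition; the real `platformio.ini` in the repository may carry extra flags:

```ini
[env:seeed_xiao_esp32s3]
platform = espressif32
board = seeed_xiao_esp32s3
framework = arduino
monitor_speed = 115200
```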
To improve readability and maintainability, the firmware is divided into independent modules (tied together as shown in the sketch after this list):

- AI data handling
- Motion control
- Servo actuation
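The file and function names below are hypothetical (the real layout is in the repository linked next), but the main sketch amounts to little more than wiring the three modules together:

```cpp
#include "gesture_handler.h"  // AI data handling: reads Vision V2 over I2C
#include "motor_control.h"    // Motion control: forward/backward/stop helpers
#include "servo_control.h"    // Servo actuation: lid open/close

void setup() {
    motor_init();    // put the motor pins in a known stopped state
    servo_init();    // attach the lid servo's PWM channel
    gesture_init();  // bring up the link to the Vision AI module
}

void loop() {
    gesture_poll();  // fetch the latest detection and dispatch an action
}
```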
The complete project structure and source code are available in my GitHub repository: 🔗 Gesture_AI
## Gesture Logic Processing

The AI module outputs recognition results in JSON format. These results are parsed by a dedicated gesture-handling module, which maps gestures to actions.
Example gesture mapping:

- Rock → Robot advances forward
- Paper → Robot reverses direction
- Scissors → Trash bin lid opens or closes

In code, each gesture is routed to a dedicated handler:
```cpp
void handle_target_0(const boxes_t *box); // Forward motion
void handle_target_1(const boxes_t *box); // Reverse motion
void handle_target_2(const boxes_t *box); // Servo action
```
This design allows new gestures or behaviors to be added with minimal changes.
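A small dispatcher is enough to route detections to those handlers. This is a sketch rather than the repository's exact code (the `dispatch_gesture()` name is mine), assuming `boxes_t` reports the detected class in its `target` field:

```cpp
// Route a detection to its handler by class ID. Unknown IDs are ignored,
// so adding a gesture means adding one case plus one handler.
void dispatch_gesture(const boxes_t *box) {
    switch (box->target) {
        case 0: handle_target_0(box); break;  // Rock     -> drive forward
        case 1: handle_target_1(box); break;  // Paper    -> drive backward
        case 2: handle_target_2(box); break;  // Scissors -> toggle the lid
        default: break;
    }
}
```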
## Motion Control System

The robot uses four DC motors to move smoothly. Each motor is controlled through GPIO pins via motor driver modules.
```cpp
#define motor_1_A D10
#define motor_1_B D9
#define motor_2_A D8
#define motor_2_B D7
#define motor_3_A D0
#define motor_3_B D1
#define motor_4_A D2
#define motor_4_B D3

// Module interface; the bodies boil down to digitalWrite() patterns
void motor_init();
void motor_stop();
void motor_forward();
void motor_backward();
```
By switching pin states, the motors can be enabled, stopped, or reversed to control direction.
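As a concrete sketch of those function bodies (assuming active-high driver inputs where driving A HIGH and B LOW spins a motor forward; check your driver's truth table before copying this):

```cpp
static const int motor_pins[8] = {motor_1_A, motor_1_B, motor_2_A, motor_2_B,
                                  motor_3_A, motor_3_B, motor_4_A, motor_4_B};

// Put every driver input into a known, stopped state
void motor_init() {
    for (int pin : motor_pins) {
        pinMode(pin, OUTPUT);
        digitalWrite(pin, LOW);
    }
}

// Both inputs LOW on each channel lets the motors coast to a stop
void motor_stop() {
    for (int pin : motor_pins) digitalWrite(pin, LOW);
}

// Set each A/B pair to one polarity; flipping it reverses the robot
static void motor_drive(bool forward) {
    for (int i = 0; i < 8; i += 2) {
        digitalWrite(motor_pins[i], forward ? HIGH : LOW);
        digitalWrite(motor_pins[i + 1], forward ? LOW : HIGH);
    }
}

void motor_forward()  { motor_drive(true); }
void motor_backward() { motor_drive(false); }
```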
## Servo Actuation

A servo motor is responsible for opening and closing the trash bin lid. PWM output is used to define the rotation angle.
```cpp
void servo_set_angle(int angle)
{
    angle = constrain(angle, 0, 180);
    int duty = map(angle, 0, 180, 51, 102);
    ledcWrite(SERVO_CHANNEL, duty);
}
```
This method ensures stable and precise lid movement.
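Those duty limits hint at the timer configuration: with a 10-bit LEDC timer at 50 Hz, duties of 51/1024 and 102/1024 of the 20 ms period give roughly the standard 1 ms and 2 ms servo pulses. Under that assumption, and with `SERVO_PIN` and `SERVO_CHANNEL` as hypothetical names, the one-time setup using the same channel-based LEDC API would be:

```cpp
#define SERVO_CHANNEL 0   // hypothetical LEDC channel
#define SERVO_PIN     D6  // hypothetical pin: whichever pin the lid servo is on

// 50 Hz at 10-bit resolution, so servo_set_angle()'s duty range of
// 51..102 produces pulse widths of roughly 1..2 ms
void servo_init() {
    ledcSetup(SERVO_CHANNEL, 50, 10);  // channel, frequency (Hz), resolution (bits)
    ledcAttachPin(SERVO_PIN, SERVO_CHANNEL);
}
```

After that, `servo_set_angle(0)` and `servo_set_angle(180)` mark the two extremes of lid travel.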
## Vision Communication Library

The project relies on SSCMA, Seeed Studio’s official communication library, to manage data exchange between the microcontroller and the Vision AI module efficiently.
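For orientation, a minimal read loop with Seeed_Arduino_SSCMA looks roughly like the sketch below; field and method names can shift between library versions, so treat it as a starting point rather than the project's exact code:

```cpp
#include <Seeed_Arduino_SSCMA.h>

SSCMA AI;

void setup() {
    Serial.begin(115200);
    AI.begin();  // defaults to the I2C (Wire) transport used in this project
}

void loop() {
    // invoke() asks the module to run one inference and pulls back the results
    if (!AI.invoke()) {
        for (auto &box : AI.boxes()) {
            // Each box carries the class ID, confidence, and position
            Serial.printf("target=%d score=%d x=%d y=%d\n",
                          box.target, box.score, box.x, box.y);
        }
    }
}
```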
## Enclosure & Mechanical Design

All structural components and covers were designed using Onshape, a cloud-based CAD platform.
Benefits:

- Runs directly in the browser
- No installation required
- Easy to iterate and modify designs
Once printed, the enclosure sits on top of the trash bin, hiding cables and giving the robot a more finished look.
## Wiring & Assembly Notes

Due to the number of components involved, wiring requires extra care. Incorrect connections may damage the motor drivers or microcontroller.
Tips:

- Double-check polarity and pin assignments
- Test each subsystem independently
- Consider designing a custom PCB for future revisions
A proper enclosure significantly improves both reliability and appearance.
## Results & Reflections

After programming and testing, the robot successfully:
- Detects hand gestures in real time
- Moves based on user input
- Opens the trash bin lid without physical contact
This project is still a work in progress, but every experiment adds new insights. If you have ideas for improvements—better motion control, additional gestures, or smarter behavior—I’d be excited to hear them.