I recently built a fun, interactive web app that responds to facial movement with real-time animation. The project uses the Grove Vision AI Module V2 from Seeed Studio, which runs a pre-trained face detection model to track face positions in a live camera feed. The face data is sent to the browser through a XIAO ESP32S3 using the Web Serial API.
This project was created as part of MakerGram’s MakerChat0x2D – a month-long build series focused on Interactive Digital Signage.
On the frontend, I used Rive to create smooth, expressive animations that react to facial movement. Rive is a powerful tool for designing, rigging, and animating graphics—ideal for real-time, interactive UI elements.
🎥 Demo Video
🛠 Hardware
- Grove Vision AI V2
- XIAO ESP32S3
- USB-C
Rive supports State Machines, which let you attach logic to an animation: inputs go in, and the state machine decides how the animation responds.
In my project, the animation is driven by x and y values that represent the position of a person's face. Whenever x or y changes, the animation moves to match.
This makes the animation dynamic and interactive. It feels alive!
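As a rough sketch of how face coordinates could be turned into state machine inputs: the snippet below scales a pixel position into a fixed input range and writes it to two inputs. The frame size (240×240), the 0 to 100 input range, and the names `toInputValue` and `applyFacePosition` are assumptions for illustration, not values taken from my Rive file. In Rive's web runtime, number inputs expose a writable `value` property, which is all this sketch relies on.

```typescript
// Hypothetical frame size and input range; adjust to your model and Rive file.
const FRAME_WIDTH = 240;
const FRAME_HEIGHT = 240;
const INPUT_MAX = 100;

// Minimal shape of a Rive number input: only the writable `value` is used here.
interface NumberInputLike {
  value: number;
}

// Scale a pixel coordinate into [0, INPUT_MAX], clamping out-of-frame values.
function toInputValue(pixel: number, frameSize: number): number {
  const scaled = (pixel / frameSize) * INPUT_MAX;
  return Math.min(INPUT_MAX, Math.max(0, scaled));
}

// Write a face position into the two inputs that drive the animation.
function applyFacePosition(
  xInput: NumberInputLike,
  yInput: NumberInputLike,
  faceX: number,
  faceY: number,
): void {
  xInput.value = toInputValue(faceX, FRAME_WIDTH);
  yInput.value = toInputValue(faceY, FRAME_HEIGHT);
}
```

Clamping keeps the animation inside its designed range even if the detector briefly reports a box that extends past the edge of the frame.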
📷 Getting Face Data Using Grove Vision AI Module
To detect the position of a person's face, I used the Grove Vision AI Module V2, which runs a pre-trained face detection model on a live camera feed.
For each detected face, the module reports a bounding box, and the x and y coordinates of that box give the position of the face.
Now I had real-time data of where the person's face was!
🔁 Sending Data with ESP32-S3
I needed to send this face position data (x and y) to my web app, so I connected the Grove Vision AI module to a XIAO ESP32S3. The ESP32-S3 reads the x and y values from the module and sends them over a serial port.
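🧾 Code Snippet (ESP32-S3)
```cpp
#include <Seeed_Arduino_SSCMA.h>

SSCMA Infer;

void setup()
{
    Serial.begin(9600); // the web app must open the port at the same baud rate
    Infer.begin();      // start communication with the Grove Vision AI module
}

void loop()
{
    // invoke() returns 0 on success, so !invoke() means a result is available
    if (!Infer.invoke())
    {
        const auto &boxes = Infer.boxes();
        if (boxes.size() > 0)
        {
            // Use only the first detected face
            int x = boxes[0].x;
            int y = boxes[0].y;

            // Emit one line per detection, e.g. "X: 120 | Y: 85"
            Serial.print("X: ");
            Serial.print(x);
            Serial.print(" | Y: ");
            Serial.println(y);
        }
    }
    delayMicroseconds(10);
}
```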
This code runs on the ESP32-S3 and reads the face bounding box coordinates (x and y), then sends them over serial.
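On the browser side, each of those serial lines has to be turned back into numbers. Here is a minimal sketch of one way to do that, assuming the exact `X: <n> | Y: <n>` format printed by the sketch above. The `FacePosition` type and the `parseFaceLine` helper are hypothetical names used for illustration.

```typescript
// A parsed face position, in the same units the ESP32-S3 printed.
interface FacePosition {
  x: number;
  y: number;
}

// Matches lines such as "X: 120 | Y: 85".
const FACE_LINE = /^X:\s*(-?\d+)\s*\|\s*Y:\s*(-?\d+)$/;

// Returns the parsed position, or null for partial or unrelated lines.
function parseFaceLine(line: string): FacePosition | null {
  // trim() also removes the "\r" left over from Serial.println's "\r\n".
  const match = FACE_LINE.exec(line.trim());
  if (match === null) {
    return null;
  }
  return { x: Number(match[1]), y: Number(match[2]) };
}
```

Returning `null` instead of throwing lets the caller skip boot messages or half-received lines without interrupting the animation.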
🌐 Reading Data in the Web App: Web Serial API
Now the final part: getting the data into my browser.
I used the Web Serial API, a feature of Chromium-based browsers such as Chrome and Edge that lets web apps read data directly from devices connected via USB (like our ESP32-S3).
In the code:
- I opened the serial port
- Read the incoming x and y data
- Passed it into the Rive animation using its runtime API
Now, as your face moves in front of the camera, the animation moves in real time 🎯
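The reading steps above can be sketched roughly as follows. This is not my exact code, just an illustration under a few assumptions: `LineSplitter`, `SerialPortLike`, and `readSerialLines` are hypothetical names, and the port is passed in as a parameter so the sketch does not depend on Web Serial type definitions. Serial data arrives in arbitrary chunks, so the splitter buffers text until a full line is available.

```typescript
// Splits a stream of text chunks into complete lines, keeping any partial
// line buffered until its newline arrives.
class LineSplitter {
  private buffer = "";

  push(chunk: string): string[] {
    this.buffer += chunk;
    const parts = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on "\n").
    this.buffer = parts.pop() ?? "";
    return parts.map((line) => line.replace(/\r$/, ""));
  }
}

// Minimal shape of the Web Serial port object used below.
interface SerialPortLike {
  open(options: { baudRate: number }): Promise<void>;
  readable: ReadableStream<Uint8Array> | null;
}

// Open the port at the sketch's baud rate and hand each full line to onLine.
async function readSerialLines(
  port: SerialPortLike,
  onLine: (line: string) => void,
): Promise<void> {
  await port.open({ baudRate: 9600 });
  if (port.readable === null) {
    return;
  }
  const reader = port.readable.getReader();
  const decoder = new TextDecoder();
  const splitter = new LineSplitter();
  try {
    while (true) {
      const { value, done } = await reader.read();
      if (done) {
        break;
      }
      const text = decoder.decode(value, { stream: true });
      for (const line of splitter.push(text)) {
        onLine(line);
      }
    }
  } finally {
    reader.releaseLock();
  }
}
```

In the browser, the port would come from `navigator.serial.requestPort()`, which has to be called from a user gesture such as a button click on a page served over HTTPS or localhost. Each line passed to `onLine` can then be parsed into x and y and written to the Rive inputs.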
Huge thanks to MakerGram ❤️🔥