In a world where connected devices are increasingly present, we wanted to create an object that feels both technological and alive. This led to the development of the Smart Rabbit, an interactive companion designed to communicate in a simple and intuitive way.
Inspired by devices like Nabaztag, the Smart Rabbit introduces a more modern approach by using a camera to perceive its environment. It can recognize faces and hand gestures, making the interaction more natural and responsive.
A key feature of the Smart Rabbit is its moving ears, which act as a communication tool. The system has two modes: a face recognition mode, where the ears go up when a face is detected and down otherwise, and a hand detection mode, where each ear follows the user’s hand movements (right hand for right ear, left hand for left ear, both hands for both ears).
The goal of this project is to explore how physical movement and computer vision can improve human-object interaction, without relying on traditional interfaces like screens or buttons.
This project also reflects a practical learning experience, combining several fields such as 3D design, mechanical integration, electronics, and programming. All these components were developed to work together as a complete system.
We are Solène Cordier and Mattéo Dumont, two fourth-year engineering students at UniLaSalle Amiens, and we developed the Smart Rabbit as part of our studies in computer networks and connected objects.
How does it work?

The Smart Rabbit is based on the interaction between a Raspberry Pi, an Arduino Uno, a camera, and two servo motors.
The Raspberry Pi is connected to a dedicated Raspberry Pi camera module using a ribbon cable (CSI connector). This camera is placed in the nose of the rabbit to give it a natural point of view. A Python program running on the Raspberry Pi processes the video stream in real time to detect either faces or hand gestures, depending on the selected mode.
Once the detection is done, the Raspberry Pi sends simple commands (letters) to the Arduino Uno through a serial connection (USB). Each letter corresponds to a specific action.
The Arduino Uno is responsible for controlling the two servo motors, which are connected to pins 8 and 10 and move the rabbit's ears. Depending on the command received:
- In face detection mode, both ears go up when a face is detected and go down otherwise.
- In hand detection mode, each ear moves independently: the right ear reacts to the right hand, the left ear to the left hand, and both ears move when both hands are detected.
This architecture allows the Raspberry Pi to handle the complex image processing, while the Arduino manages precise and stable control of the motors.
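The six letters exchanged over the serial link can be summarized in a small lookup table. The table and helper below are our own illustrative summary, not part of the project's code:

```python
# One-byte commands sent from the Raspberry Pi to the Arduino.
# The letters match those used in the project's detection scripts;
# the descriptions are an illustrative summary.
COMMANDS = {
    b'H': "raise both ears (human presence confirmed)",
    b'L': "lower both ears (human no longer present)",
    b'D': "raise the ear mirroring the left hand",
    b'd': "lower the ear mirroring the left hand",
    b'G': "raise the ear mirroring the right hand",
    b'g': "lower the ear mirroring the right hand",
}

def describe(cmd: bytes) -> str:
    """Translate a one-byte serial command into a human-readable action."""
    return COMMANDS.get(cmd, "unknown command")
```

Keeping the protocol down to single bytes makes the Arduino side trivial: it only has to read one character at a time and switch on it.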
The Smart Rabbit can be controlled remotely through a web page, allowing users to manage its behavior via an internet connection.
The image below shows the video stream interface with three control buttons. Two buttons allow the user to choose the Smart Rabbit’s operating mode (face detection or hand detection), while the third button stops both the video stream and the Smart Rabbit.
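A minimal version of such a control endpoint can be sketched with Python's standard library. The URL paths, handler name, and shared `MODE` dictionary below are assumptions for illustration, not the project's actual web code:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

MODE = {"current": "face"}  # shared state a detection loop could read

class ControlHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Map the three buttons to URL paths (paths are our assumption).
        if self.path == "/mode/face":
            MODE["current"] = "face"
            body = b"face mode"
        elif self.path == "/mode/hand":
            MODE["current"] = "hand"
            body = b"hand mode"
        elif self.path == "/stop":
            MODE["current"] = "stopped"
            body = b"stopped"
        else:
            body = b"Smart Rabbit control page"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the console quiet
        pass

server = HTTPServer(("127.0.0.1", 0), ControlHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
reply = urllib.request.urlopen(f"http://127.0.0.1:{port}/mode/hand").read().decode()
server.shutdown()
```

In the real project the page also serves the video stream, for example by returning the latest `stream.jpg` written by the detection loop.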
```python
if compteur_frames >= SEUIL_CONFIRMATION and not presence_confirmee:
    if ser: ser.write(b'H')
    presence_confirmee = True
elif not humain_vu and presence_confirmee:
    if ser: ser.write(b'L')
    presence_confirmee = False

etat_txt = "STATUT: CONFIRME" if presence_confirmee else "STATUT: EN ATTENTE"
couleur_etat = (0, 255, 0) if presence_confirmee else (0, 0, 255)
cv2.putText(frame, etat_txt, (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, couleur_etat, 2)
cv2.imwrite("stream.jpg", frame)
cv2.imshow("Detection Presence Stable", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
    break
```

This part of the code is responsible for confirming whether a human is present in front of the Smart Rabbit and sending the corresponding command to the Arduino.
First, the program analyzes each frame captured by the camera and checks if a person is detected. When a human is detected (humain_vu = True), a frame counter (compteur_frames) is incremented. If no person is detected, this counter is reset to zero. This mechanism avoids false detections by requiring the presence to be stable over several consecutive frames.
A threshold (SEUIL_CONFIRMATION) is defined to validate the detection. Only when a person is detected for a sufficient number of consecutive frames (here, 10 frames) is the presence considered confirmed.
When the threshold is reached and the presence was not previously confirmed, the Raspberry Pi sends the letter 'H' to the Arduino via the serial connection. This indicates that a human is present, and the Arduino can react accordingly (for example, by raising the ears).
If no person is detected anymore and the presence was previously confirmed, the Raspberry Pi sends the letter 'L', indicating that the user is no longer present (for example, lowering the ears).
The variable presence_confirmee is used to store the current state and prevent sending repeated commands unnecessarily.
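The counter-and-threshold mechanism described above can be isolated as a small, hardware-free state machine. The variable names follow the project's code; the `step` function itself is our own sketch:

```python
SEUIL_CONFIRMATION = 10  # consecutive frames needed to confirm a presence

def step(compteur_frames, presence_confirmee, humain_vu):
    """One frame of the debouncing logic: update the counter and the state."""
    compteur_frames = compteur_frames + 1 if humain_vu else 0
    if compteur_frames >= SEUIL_CONFIRMATION and not presence_confirmee:
        presence_confirmee = True   # here the real program sends b'H'
    elif not humain_vu and presence_confirmee:
        presence_confirmee = False  # here the real program sends b'L'
    return compteur_frames, presence_confirmee

compteur, confirme = 0, False
for vu in [True] * 9:                                 # 9 frames: still waiting
    compteur, confirme = step(compteur, confirme, vu)
compteur, confirme = step(compteur, confirme, True)   # 10th frame: confirmed
confirmed_after_10 = confirme
compteur, confirme = step(compteur, confirme, False)  # person leaves: reset
```

Because the state only flips on the transition (counter reaching the threshold, or the person disappearing), each of `'H'` and `'L'` is sent exactly once per event.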
In addition, a status message is displayed on the video stream:
- “CONFIRME” (in green) when a presence is validated
- “EN ATTENTE” (in red) when the system is still checking
Finally, the processed image is saved (stream.jpg) and displayed in real time. The program continues running until the user presses the 'q' key, which stops the detection.
This approach ensures a more reliable interaction by filtering out brief or incorrect detections.
Hand Detection and Mirror Mode Logic

```python
if all(c > CONFIDENCE_THRESHOLD for c in [conf_eg, conf_ed, conf_pg, conf_pd]):
    eg_y, ed_y = kpts[5][1], kpts[6][1]
    pg_y, pd_y = kpts[9][1], kpts[10][1]
    if pg_y < (eg_y - 40):
        if ser: ser.write(b'D')
    else:
        if ser: ser.write(b'd')
    if pd_y < (ed_y - 40):
        if ser: ser.write(b'G')
    else:
        if ser: ser.write(b'g')
```

This part of the code implements the “mirror mode”, where the Smart Rabbit reacts to the user’s hand movements using pose detection.
The system uses a YOLOv8 (You Only Look Once) pose model (yolov8n-pose.pt) to detect human body key points in real time from the camera stream. For each frame, the model identifies people and extracts key points such as shoulders and wrists.
To simplify the interaction, the program only considers the closest person to the camera. This is done by selecting the person with the largest bounding box area.
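Selecting the closest person by bounding-box area is a one-line reduction. The helper name is ours, and boxes are assumed to be in (x1, y1, x2, y2) format:

```python
def closest_person(boxes):
    """Return the box with the largest area, i.e. presumably the closest person."""
    return max(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))

# Three detected people; the middle one fills most of the frame.
boxes = [(10, 10, 50, 50), (0, 0, 200, 300), (5, 5, 20, 20)]
nearest = closest_person(boxes)  # -> (0, 0, 200, 300)
```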
Once a person is selected, the code retrieves the coordinates and confidence scores of four key points:
- Left shoulder
- Right shoulder
- Left wrist
- Right wrist
Before using them, the program checks that all key points have a confidence score above a defined threshold (CONFIDENCE_THRESHOLD). This ensures that the detection is reliable.
Then, the system compares the vertical positions (Y coordinates) of the wrists and shoulders:
- If a wrist is at least 40 pixels above the corresponding shoulder (a smaller Y value, since image coordinates increase downward), the hand is considered raised.
- Otherwise, the hand is considered lowered.
Based on this logic:
- If the left hand is raised, the Raspberry Pi sends the command 'D' to the Arduino (raise one ear).
- If the left hand is down, it sends 'd' (lower the ear).
- If the right hand is raised, it sends 'G'.
- If the right hand is down, it sends 'g'.
- If no person is detected, both ears are lowered by sending 'g' and 'd'.
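The raise/lower decisions above can be condensed into a pure function. Variable names follow the project's code (`pg`/`pd` for left/right wrist, `eg`/`ed` for left/right shoulder); the function itself is our own sketch. Remember that a raised wrist has a *smaller* Y value:

```python
MARGIN = 40  # the wrist must be at least 40 px above the shoulder

def ear_commands(pg_y, pd_y, eg_y, ed_y):
    """Map wrist/shoulder heights to the two serial commands sent each frame."""
    left_cmd = b'D' if pg_y < (eg_y - MARGIN) else b'd'   # left hand
    right_cmd = b'G' if pd_y < (ed_y - MARGIN) else b'g'  # right hand
    return left_cmd, right_cmd

cmds = ear_commands(100, 400, 200, 200)  # left hand raised, right hand down
```

The 40-pixel margin acts as hysteresis: a hand hovering exactly at shoulder height does not make the ear flicker up and down.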
This creates a mirror effect, where each hand controls one ear in real time.
Finally, the detected skeleton is drawn on the image for visualization, and the processed frame is saved and displayed. This allows the user to see how the system interprets their movements.
This approach enables a natural and intuitive interaction, where the Smart Rabbit directly mimics the user’s gestures.
Easter Egg

```python
elif etat == "SCAN":
    cv2.putText(frame, "SCAN EN COURS...", (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
    small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
    rgb_small_frame = cv2.cvtColor(small_frame, cv2.COLOR_BGR2RGB)
    face_locations = face_recognition.face_locations(rgb_small_frame)
    face_encodings = face_recognition.face_encodings(rgb_small_frame, face_locations)
    for face_encoding in face_encodings:
        matches = face_recognition.compare_faces(known_face_encodings, face_encoding, tolerance=0.55)
        if True in matches:
            index = matches.index(True)
            name = known_face_names[index]
            print(f"C'est {name} !")
            pygame.mixer.music.load(Personnes[name]["sound"])
            pygame.mixer.music.play()
            etat = "PLAYING"
            break
```

When the system is in SCAN mode, it continuously analyzes the camera frames:
- The image is first resized (to 25% of its size) to improve performance.
- It is then converted from BGR to RGB format, which is required by the face recognition library.
- The program detects all faces in the frame and computes their encodings.
Each detected face is compared with the known faces using a tolerance value (0.55). This value defines how strict the recognition is:
- Lower values → more precise but less tolerant
- Higher values → more flexible but more risk of errors
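Conceptually, the tolerance is a distance threshold: two encodings "match" when the Euclidean distance between them is at or below it. The simplified re-implementation below illustrates the idea with toy 3-D vectors instead of real 128-dimensional face encodings:

```python
import math

def compare_faces(known_encodings, candidate, tolerance=0.55):
    """Simplified illustration: a face 'matches' when the Euclidean
    distance between encodings is at or below the tolerance."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [distance(k, candidate) <= tolerance for k in known_encodings]

known = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
matches = compare_faces(known, [0.1, 0.0, 0.0])  # -> [True, False]
```

Raising the tolerance widens the sphere around each known encoding, which is exactly why higher values accept more faces but also more look-alikes.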
If a match is found:
- The corresponding name is retrieved
- A message is displayed in the console
- The associated sound is loaded and played using pygame
- The system switches to the PLAYING state


