The Raspberry Pi AI Camera, launched in September 2024, is an edge AI camera module that works with any Raspberry Pi. For this project, we will implement and run the official Workout Monitoring sample application, and then customize and extend its functionality to create a brand new tailored application. - Project contributed by Takahashi from Sony Semiconductor Solutions.
Required equipment
- Raspberry Pi 5 (Raspberry Pi 4 Model B also works)
- Raspberry Pi OS (64-bit) installed
- Raspberry Pi AI Camera
- Standard accessories (monitor, keyboard, mouse, HDMI cable, and so on)
- Set up camera communication with the Raspberry Pi following the official documentation.
- Verify the installation by checking that the video stream and object detection results are displayed correctly with:
rpicam-hello -t 0s --post-process-file /usr/share/rpi-camera-assets/imx500_mobilenet_ssd.json

Install the Application Module Library
The Application Module Library from Sony Semiconductor Solutions provides tools for data visualization and application development.
- Clone the Application Module Library repository:
git clone --branch release/v1.0.0 https://github.com/SonySemiconductorSolutions/aitrios-rpi-application-module-library.git
- Navigate to the cloned repository:
cd aitrios-rpi-application-module-library
- Follow the steps outlined in the README file to complete the setup:
- Development environment setup
- Python wheel building
For this project, we will proceed with development using the Workout Monitoring sample app from the IMX500 Sample Applications released by Sony Semiconductor Solutions.
First, let's run the Workout Monitoring sample app as it is.
- Install the required package manager uv:
cd ~
curl -LsSf https://astral.sh/uv/install.sh | sh
Restart the terminal after installation.
- Clone IMX500 Sample Applications:
git clone https://github.com/SonySemiconductorSolutions/aitrios-rpi-sample-apps.git
- Navigate to the workout monitor sample app directory:
cd aitrios-rpi-sample-apps/examples/workout-monitor
- Set up and run workout monitoring:
uv venv --system-site-packages
source .venv/bin/activate
uv run app.py --exercise squat

See the README file for more details about the app.
Workout monitoring runs successfully when you see human skeletal tracking in action.
Note that the example image only shows the upper body; for full-body tracking, this application provides 17 skeletal keypoints (including facial and lower body landmarks).
This application also automatically counts exercises like squats. Try it out to monitor your progress and stay active!
Creating a virtual fitting application by modifying the workout monitoring sample app
Let’s transform the workout monitoring app for Raspberry Pi AI Camera into a fun virtual fitting experience!
In this step, we’ll add a simple yet effective feature: overlaying a T-shirt image onto users’ upper bodies in real time. By analyzing four key skeletal landmarks (right shoulder, left shoulder, right hip, and left hip), we’ll dynamically calculate the T-shirt’s position, scale, and size to ensure a natural fit.
Here’s a quick visual to help you understand the process:
Ready to give it a try? We’ll make this happen by updating the app.py file from the workout monitoring sample — no extra tools are needed, just a little code and creativity!
Add an image overlay process
We will update app.py so that the workflow includes:
- Real-time pose estimation by extracting and using skeletal keypoints
- Robust coordinate handling for partial detections
- Adaptive sizing based on detected body proportions
- Transparent image overlay with alpha blending
The application continuously processes camera frames, detects human poses, tracks individuals, and overlays a T-shirt image on each person's upper body in real time.
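Before wiring this into the app, the alpha-blending step can be tried in isolation. This is a standalone NumPy sketch (toy 4x4 image, no camera required) of the compositing formula used later, out = alpha * overlay + (1 - alpha) * background:

```python
import numpy as np

# Toy background "frame": a 4x4 mid-gray image (BGR order, as OpenCV uses)
bg = np.full((4, 4, 3), 100, dtype=np.uint8)

# Toy 2x2 overlay with an alpha channel: solid red at ~50% opacity
overlay = np.zeros((2, 2, 4), dtype=np.uint8)
overlay[:, :, 2] = 200   # red channel
overlay[:, :, 3] = 128   # ~50% opacity

alpha = overlay[:, :, 3] / 255.0
rgb = overlay[:, :, :3]

# Composite the overlay into the top-left corner of the background
y, x, h, w = 0, 0, 2, 2
region = bg[y:y+h, x:x+w, :].astype(float)
blended = alpha[..., None] * rgb + (1 - alpha[..., None]) * region
bg[y:y+h, x:x+w, :] = blended.astype(np.uint8)

print(bg[0, 0])  # blended pixel: part red, part gray
print(bg[3, 3])  # pixel outside the overlay region stays untouched
```

The `alpha[..., None]` broadcast blends all three color channels at once; the app code later does the same thing channel by channel in a loop.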
Extract keypoint coordinates
We can receive the coordinates of the required four skeletal keypoints from the AI model and convert them to output image coordinates.
scene = frame.image
height, width = scene.shape[:2]
# Convert normalized keypoint coordinates to actual image coordinates
left_shoulder_x = int(keypoints[5 * 2 + 1] * width)
left_shoulder_y = int(keypoints[5 * 2] * height)
right_shoulder_x = int(keypoints[6 * 2 + 1] * width)
right_shoulder_y = int(keypoints[6 * 2] * height)
left_hip_x = int(keypoints[11 * 2 + 1] * width)
left_hip_y = int(keypoints[11 * 2] * height)
right_hip_x = int(keypoints[12 * 2 + 1] * width)
right_hip_y = int(keypoints[12 * 2] * height)

For reference, the indices of each element in the keypoints array are as follows:
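The indices used above (5, 6, 11, 12) follow the standard 17-keypoint COCO ordering, which is what HigherHRNet-style pose models output; a small sketch of the full mapping (the helper function is ours, for illustration):

```python
# Standard COCO 17-keypoint ordering. Each keypoint occupies two slots
# in the flattened array: [y, x] at positions i*2 and i*2 + 1.
COCO_KEYPOINTS = [
    "nose",            # 0
    "left_eye",        # 1
    "right_eye",       # 2
    "left_ear",        # 3
    "right_ear",       # 4
    "left_shoulder",   # 5
    "right_shoulder",  # 6
    "left_elbow",      # 7
    "right_elbow",     # 8
    "left_wrist",      # 9
    "right_wrist",     # 10
    "left_hip",        # 11
    "right_hip",       # 12
    "left_knee",       # 13
    "right_knee",      # 14
    "left_ankle",      # 15
    "right_ankle",     # 16
]

def keypoint_slice(name):
    """Return the (y_index, x_index) pair for a named keypoint."""
    i = COCO_KEYPOINTS.index(name)
    return i * 2, i * 2 + 1

print(keypoint_slice("left_shoulder"))  # (10, 11)
print(keypoint_slice("right_hip"))      # (24, 25)
```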
We can obtain valid values for the four keypoints we are interested in and use them to determine the coordinates of the upper body's top right and bottom left corners.
def select_valid_coordinate(coord_a, coord_b):
    if coord_a == 0 and coord_b == 0:
        return 0
    elif coord_a == 0:
        return coord_b
    else:
        return coord_a

This function handles missing keypoint data by selecting the first non-zero coordinate, so we can still obtain values even when some body parts aren't detected.
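A quick check of that fallback behavior (the pixel values here are made up for illustration):

```python
def select_valid_coordinate(coord_a, coord_b):
    # Prefer the first coordinate; fall back to the second when it is 0
    # (a keypoint that was not detected comes through as 0)
    if coord_a == 0 and coord_b == 0:
        return 0
    elif coord_a == 0:
        return coord_b
    else:
        return coord_a

print(select_valid_coordinate(150, 220))  # 150: first value wins when present
print(select_valid_coordinate(0, 220))    # 220: falls back to the second value
print(select_valid_coordinate(0, 0))      # 0: neither keypoint detected
```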
Calculate the upper body bounding box coordinates
This determines the upper body region using validated keypoints.
# Determine upper body bounding box coordinates
right_top_x = select_valid_coordinate(right_shoulder_x, right_hip_x)
right_top_y = select_valid_coordinate(right_shoulder_y, left_shoulder_y)
left_bottom_x = select_valid_coordinate(left_shoulder_x, left_hip_x)
left_bottom_y = select_valid_coordinate(right_hip_y, left_hip_y)

Configure overlay positioning
Calculate the T-shirt placement with configurable offsets.
Using the calculated upper body bounding box coordinates (upper right and lower left points), the app applies configurable offsets and scaling factors to determine the overlay image's final position, height, and width.
Adjust these offset and scaling parameters based on your specific overlay image dimensions and desired fit.
# Constants definition
OVERLAY_IMG_X_OFFSET_RATIO = 2.0 # Decrease this to increase the leftward offset of the overlay image
OVERLAY_IMG_Y_OFFSET_RATIO = 3.5 # Decrease this to increase the upward offset of the overlay image
OVERLAY_IMG_HEIGHT_RATIO = 2.2 # Decrease this to reduce the overlay image height
OVERLAY_IMG_WIDTH_RATIO = 1.5 # Decrease this to reduce the overlay image width
CONFIDENCE_THRESHOLD = 0.3 # Detection confidence threshold
# Adjust overlay image position based on reference points
overlay_img_x = int(right_top_x - (left_bottom_x - right_top_x) // OVERLAY_IMG_X_OFFSET_RATIO)
overlay_img_y = int(right_top_y - (left_bottom_y - right_top_y) // OVERLAY_IMG_Y_OFFSET_RATIO)
# Resize overlay image based on reference points
overlay_img_h = int((left_bottom_y - right_top_y) * OVERLAY_IMG_HEIGHT_RATIO)
overlay_img_w = int((left_bottom_x - right_top_x) * OVERLAY_IMG_WIDTH_RATIO)

Apply a transparent overlay
This applies the T-shirt image as a transparent overlay at the position calculated above, using the alpha channel for a realistic appearance.
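To get a feel for what the ratios do, here is a worked example of the position and size calculation with made-up bounding-box corners (pixels):

```python
# Same constants as in the app
OVERLAY_IMG_X_OFFSET_RATIO = 2.0
OVERLAY_IMG_Y_OFFSET_RATIO = 3.5
OVERLAY_IMG_HEIGHT_RATIO = 2.2
OVERLAY_IMG_WIDTH_RATIO = 1.5

# Made-up upper body corners: upper-right at (100, 80), lower-left at (220, 260)
right_top_x, right_top_y = 100, 80
left_bottom_x, left_bottom_y = 220, 260

# Shift the overlay left and up relative to the body box...
overlay_img_x = int(right_top_x - (left_bottom_x - right_top_x) // OVERLAY_IMG_X_OFFSET_RATIO)
overlay_img_y = int(right_top_y - (left_bottom_y - right_top_y) // OVERLAY_IMG_Y_OFFSET_RATIO)
# ...and make it larger than the box so the shirt covers the torso
overlay_img_h = int((left_bottom_y - right_top_y) * OVERLAY_IMG_HEIGHT_RATIO)
overlay_img_w = int((left_bottom_x - right_top_x) * OVERLAY_IMG_WIDTH_RATIO)

print(overlay_img_x, overlay_img_y, overlay_img_w, overlay_img_h)  # 40 29 180 396
```

So a 120x180 pixel body box produces a 180x396 pixel T-shirt anchored 60 pixels left and 51 pixels above the shoulder corner, which is why decreasing a ratio moves or shrinks the shirt as the comments describe.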
scene = frame.image
overlay_image = cv2.imread("tshirt.png", cv2.IMREAD_UNCHANGED)
resized_overlay = cv2.resize(overlay_image, (overlay_img_w, overlay_img_h))
# Separate the alpha channel
alpha_channel = resized_overlay[:, :, 3] / 255.0
rgb_channels = resized_overlay[:, :, :3]
# Copy the overlay part to the scene
for c in range(0, 3):
    scene[overlay_img_y:overlay_img_y+overlay_img_h, overlay_img_x:overlay_img_x+overlay_img_w, c] = (
        alpha_channel * rgb_channels[:, :, c]
        + (1 - alpha_channel) * scene[overlay_img_y:overlay_img_y+overlay_img_h, overlay_img_x:overlay_img_x+overlay_img_w, c]
    )

Complete application code
Here is the complete code including the processing part for each frame. You can also download the complete code from GitHub.
import cv2

from modlib.devices import AiCamera
from modlib.models.zoo import Higherhrnet
from modlib.apps.tracker.byte_tracker import BYTETracker

# Constants definition
OVERLAY_IMG_X_OFFSET_RATIO = 2.0  # Decrease this to increase the leftward offset of the overlay image
OVERLAY_IMG_Y_OFFSET_RATIO = 3.5  # Decrease this to increase the upward offset of the overlay image
OVERLAY_IMG_HEIGHT_RATIO = 2.2    # Decrease this to reduce the overlay image height
OVERLAY_IMG_WIDTH_RATIO = 1.5     # Decrease this to reduce the overlay image width
CONFIDENCE_THRESHOLD = 0.3        # Detection confidence threshold

class BYTETrackerArgs:
    track_thresh: float = 0.25
    track_buffer: int = 30
    match_thresh: float = 0.8
    aspect_ratio_thresh: float = 3.0
    min_box_area: float = 1.0
    mot20: bool = False

def select_valid_coordinate(coord_a, coord_b):
    """
    Select a valid coordinate, prioritizing the first non-zero value
    """
    if coord_a == 0 and coord_b == 0:
        return 0
    elif coord_a == 0:
        return coord_b
    else:
        return coord_a

def overlay_image_on_upper_body(frame, keypoints, overlay_image):
    """
    Overlay an image on the upper body
    """
    scene = frame.image
    height, width = scene.shape[:2]

    # Convert normalized keypoint coordinates to actual image coordinates
    left_shoulder_x = int(keypoints[5 * 2 + 1] * width)
    left_shoulder_y = int(keypoints[5 * 2] * height)
    right_shoulder_x = int(keypoints[6 * 2 + 1] * width)
    right_shoulder_y = int(keypoints[6 * 2] * height)
    left_hip_x = int(keypoints[11 * 2 + 1] * width)
    left_hip_y = int(keypoints[11 * 2] * height)
    right_hip_x = int(keypoints[12 * 2 + 1] * width)
    right_hip_y = int(keypoints[12 * 2] * height)

    # Determine upper body bounding box coordinates
    right_top_x = select_valid_coordinate(right_shoulder_x, right_hip_x)
    right_top_y = select_valid_coordinate(right_shoulder_y, left_shoulder_y)
    left_bottom_x = select_valid_coordinate(left_shoulder_x, left_hip_x)
    left_bottom_y = select_valid_coordinate(right_hip_y, left_hip_y)

    # Adjust overlay image position based on reference points
    overlay_img_x = int(right_top_x - (left_bottom_x - right_top_x) // OVERLAY_IMG_X_OFFSET_RATIO)
    overlay_img_y = int(right_top_y - (left_bottom_y - right_top_y) // OVERLAY_IMG_Y_OFFSET_RATIO)

    # Resize overlay image based on reference points
    overlay_img_h = int((left_bottom_y - right_top_y) * OVERLAY_IMG_HEIGHT_RATIO)
    overlay_img_w = int((left_bottom_x - right_top_x) * OVERLAY_IMG_WIDTH_RATIO)

    try:
        resized_overlay = cv2.resize(overlay_image, (overlay_img_w, overlay_img_h))
        # Separate the alpha channel
        alpha_channel = resized_overlay[:, :, 3] / 255.0
        rgb_channels = resized_overlay[:, :, :3]
        # Copy the overlay part to the scene
        for c in range(0, 3):
            scene[overlay_img_y:overlay_img_y+overlay_img_h, overlay_img_x:overlay_img_x+overlay_img_w, c] = (
                alpha_channel * rgb_channels[:, :, c]
                + (1 - alpha_channel) * scene[overlay_img_y:overlay_img_y+overlay_img_h, overlay_img_x:overlay_img_x+overlay_img_w, c]
            )
    except (cv2.error, ValueError):
        # Skip the overlay when the region is invalid, for example when
        # keypoints are missing or the overlay extends outside the frame
        pass

    return scene

def start_workout_demo():
    device = AiCamera()
    model = Higherhrnet()
    device.deploy(model)

    # Load the overlay image
    overlay_image = cv2.imread("tshirt.png", cv2.IMREAD_UNCHANGED)

    tracker = BYTETracker(BYTETrackerArgs())

    with device as stream:
        for frame in stream:
            detections = frame.detections[frame.detections.confidence > CONFIDENCE_THRESHOLD]
            detections = tracker.update(frame, detections)
            for k, _, _, _, t in detections:
                frame.image = overlay_image_on_upper_body(frame, k, overlay_image)
            frame.display()

if __name__ == "__main__":
    start_workout_demo()
    exit()

Testing and results
To test the virtual fitting functionality, I downloaded a T-shirt image from irasutoya and saved it as tshirt.png in the same directory as the modified app.py file.
The results look pretty good! The application overlays the T-shirt with realistic positioning and scaling, and tracking stays smooth in real time, creating an effective virtual fitting experience.
The workout monitoring sample app used here detects 17 skeletal keypoints, including face and lower body, so you could modify it to allow trying on glasses using the facial keypoints, for example.
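As one hypothetical sketch of that glasses idea: the two eye keypoints (indices 1 and 2 in the COCO ordering) give enough information to place and size a glasses overlay. The function, the scale factor, and the 0.4 aspect ratio below are all illustrative guesses to tune against your own overlay image:

```python
def glasses_box(keypoints, width, height, scale=2.5):
    """Compute a (x, y, w, h) overlay region for glasses from the eye keypoints.

    keypoints uses the same flattened [y, x] layout as the app; scale and the
    0.4 frame aspect ratio are tuning guesses, not values from the sample app.
    """
    left_eye_x = int(keypoints[1 * 2 + 1] * width)
    left_eye_y = int(keypoints[1 * 2] * height)
    right_eye_x = int(keypoints[2 * 2 + 1] * width)
    right_eye_y = int(keypoints[2 * 2] * height)

    eye_span = abs(left_eye_x - right_eye_x)
    glasses_w = int(eye_span * scale)     # frames are wider than the eye span
    glasses_h = int(glasses_w * 0.4)      # rough glasses aspect ratio (guess)
    center_x = (left_eye_x + right_eye_x) // 2
    center_y = (left_eye_y + right_eye_y) // 2
    # Return the top-left corner plus size of the overlay region
    return center_x - glasses_w // 2, center_y - glasses_h // 2, glasses_w, glasses_h

# Dummy normalized keypoints on a 640x480 frame: eyes level at y=0.25
kps = [0.0] * 34
kps[1 * 2], kps[1 * 2 + 1] = 0.25, 0.5    # left eye
kps[2 * 2], kps[2 * 2 + 1] = 0.25, 0.375  # right eye
print(glasses_box(kps, 640, 480))
```

The box returned here would replace the upper-body box in `overlay_image_on_upper_body`, with a glasses PNG (with alpha channel) in place of tshirt.png.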
Sony Semiconductor Solutions has released additional sample applications on GitHub in IMX500 Sample Applications. Why not take a look at these and try modifying them to make new applications?
When in trouble
If you encounter any issues while following this article, please feel free to leave a comment. Please also check the support site below. Note that it may take some time to respond to comments.
If you have questions related to Raspberry Pi, please check and utilize the forum below.
Want to learn more
Experiment further with the Raspberry Pi AI Camera by following the Get Started guide on the AITRIOS developer site.