The Raspberry Pi AI Camera, launched in September 2024, is an edge AI camera module that works with any Raspberry Pi. For this project, we will implement and run the official Workout Monitoring sample application, and then customize and extend its functionality to create a brand new tailored application. - Project contributed by Takahashi from Sony Semiconductor Solutions.
Required equipment
- Raspberry Pi 5 (Raspberry Pi 4 Model B also works)
- Raspberry Pi OS (64-bit) installed
- Raspberry Pi AI Camera
- Standard accessories (monitor, keyboard, mouse, HDMI cable, and so on)
- Set up camera communication with the Raspberry Pi following the official documentation.
- Verify the installation by checking that the video stream and object detection results are displayed correctly with:
rpicam-hello -t 0s --post-process-file /usr/share/rpi-camera-assets/imx500_mobilenet_ssd.json

Install the Application Module Library
The Application Module Library from Sony Semiconductor Solutions provides tools for data visualization and application development.
- Clone the Application Module Library repository:
git clone --branch release/v1.0.0 https://github.com/SonySemiconductorSolutions/aitrios-rpi-application-module-library.git
- Navigate to the cloned repository:
cd aitrios-rpi-application-module-library
- Follow the steps outlined in the README file to complete the setup:
- Development environment setup
- Python wheel building
For this project, we will proceed with development using the Workout Monitoring sample app from the IMX500 Sample Applications released by Sony Semiconductor Solutions.
First, let's run the Workout Monitoring sample app as it is.
- Install the required package manager uv:
cd ~
curl -LsSf https://astral.sh/uv/install.sh | sh
Restart the terminal after installation.
- Clone IMX500 Sample Applications:
git clone https://github.com/SonySemiconductorSolutions/aitrios-rpi-sample-apps.git
- Navigate to the workout monitor sample app directory:
cd aitrios-rpi-sample-apps/examples/workout-monitor
- Set up and run workout monitoring:
uv venv --system-site-packages
source .venv/bin/activate
uv run app.py --exercise squat

See the README file for more details about the app.
Workout monitoring runs successfully when you see human skeletal tracking in action.
Note that the example image only shows the upper body; for full-body tracking, this application provides 17 skeletal keypoints (including facial and lower body landmarks).
This application also automatically counts exercises like squats. Try it out to monitor your progress and stay active!
Creating a virtual fitting application by modifying the workout monitoring sample app
Let’s transform the workout monitoring app for Raspberry Pi AI Camera into a fun virtual fitting experience!
In this step, we’ll add a simple yet effective feature: overlaying a T-shirt image onto users’ upper bodies in real time. By analyzing four key skeletal landmarks (right shoulder, left shoulder, right hip, and left hip), we’ll dynamically calculate the T-shirt’s position, scale, and size to ensure a natural fit.
Here’s a quick visual to help you understand the process:
Ready to give it a try? We’ll make this happen by updating the app.py file from the workout monitoring sample — no extra tools are needed, just a little code and creativity!
Add an image overlay process
We will update app.py so that the workflow includes:
- Real-time pose estimation by extracting and using skeletal keypoints
- Robust coordinate handling for partial detections
- Adaptive sizing based on detected body proportions
- Transparent image overlay with alpha blending
The application continuously processes camera frames, detects human poses, tracks individuals, and overlays a T-shirt image on each person's upper body in real time.
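Before wiring this into the app, the alpha-blending step can be tried in isolation. This is a standalone NumPy sketch (toy 4x4 image, no camera required) of the compositing formula used later, out = alpha * overlay + (1 - alpha) * background:

```python
import numpy as np

# Toy background "frame": a 4x4 mid-gray image (BGR order, as OpenCV uses)
bg = np.full((4, 4, 3), 100, dtype=np.uint8)

# Toy 2x2 overlay with an alpha channel: solid red at ~50% opacity
overlay = np.zeros((2, 2, 4), dtype=np.uint8)
overlay[:, :, 2] = 200   # red channel
overlay[:, :, 3] = 128   # ~50% opacity

alpha = overlay[:, :, 3] / 255.0
rgb = overlay[:, :, :3]

# Composite the overlay into the top-left corner of the background
y, x, h, w = 0, 0, 2, 2
region = bg[y:y+h, x:x+w, :].astype(float)
blended = alpha[..., None] * rgb + (1 - alpha[..., None]) * region
bg[y:y+h, x:x+w, :] = blended.astype(np.uint8)

print(bg[0, 0])  # blended pixel: part red, part gray
print(bg[3, 3])  # pixel outside the overlay region stays untouched
```

The `alpha[..., None]` broadcast blends all three color channels at once; the app code later does the same thing channel by channel in a loop.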
Extract keypoint coordinates
We can receive the coordinates of the required four skeletal keypoints from the AI model and convert them to output image coordinates.
scene = frame.image
height, width = scene.shape[:2]
# Convert normalized keypoint coordinates to actual image coordinates
left_shoulder_x = int(keypoints[5 * 2 + 1] * width)
left_shoulder_y = int(keypoints[5 * 2] * height)
right_shoulder_x = int(keypoints[6 * 2 + 1] * width)
right_shoulder_y = int(keypoints[6 * 2] * height)
left_hip_x = int(keypoints[11 * 2 + 1] * width)
left_hip_y = int(keypoints[11 * 2] * height)
right_hip_x = int(keypoints[12 * 2 + 1] * width)
right_hip_y = int(keypoints[12 * 2] * height)

For reference, the indices of each element in the keypoints array are as follows:
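The indices used above (5, 6, 11, 12) follow the standard 17-keypoint COCO ordering, which is what HigherHRNet-style pose models output; a small sketch of the full mapping (the helper function is ours, for illustration):

```python
# Standard COCO 17-keypoint ordering. Each keypoint occupies two slots
# in the flattened array: [y, x] at positions i*2 and i*2 + 1.
COCO_KEYPOINTS = [
    "nose",            # 0
    "left_eye",        # 1
    "right_eye",       # 2
    "left_ear",        # 3
    "right_ear",       # 4
    "left_shoulder",   # 5
    "right_shoulder",  # 6
    "left_elbow",      # 7
    "right_elbow",     # 8
    "left_wrist",      # 9
    "right_wrist",     # 10
    "left_hip",        # 11
    "right_hip",       # 12
    "left_knee",       # 13
    "right_knee",      # 14
    "left_ankle",      # 15
    "right_ankle",     # 16
]

def keypoint_slice(name):
    """Return the (y_index, x_index) pair for a named keypoint."""
    i = COCO_KEYPOINTS.index(name)
    return i * 2, i * 2 + 1

print(keypoint_slice("left_shoulder"))  # (10, 11)
print(keypoint_slice("right_hip"))      # (24, 25)
```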
We can obtain valid values for the four keypoints we are interested in and use them to determine the coordinates of the upper body's top right and bottom left corners.
def select_valid_coordinate(coord_a, coord_b):
    if coord_a == 0 and coord_b == 0:
        return 0
    elif coord_a == 0:
        return coord_b
    else:
        return coord_a

This function handles missing keypoint data by selecting the first non-zero coordinate, so we can still obtain values even when some body parts aren't detected.
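A quick check of that fallback behavior (the pixel values here are made up for illustration):

```python
def select_valid_coordinate(coord_a, coord_b):
    # Prefer the first coordinate; fall back to the second when it is 0
    # (a keypoint that was not detected comes through as 0)
    if coord_a == 0 and coord_b == 0:
        return 0
    elif coord_a == 0:
        return coord_b
    else:
        return coord_a

print(select_valid_coordinate(150, 220))  # 150: first value wins when present
print(select_valid_coordinate(0, 220))    # 220: falls back to the second value
print(select_valid_coordinate(0, 0))      # 0: neither keypoint detected
```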
Calculate the upper body bounding box coordinates
This determines the upper body region using validated keypoints.
# Determine upper body bounding box coordinates
right_top_x = select_valid_coordinate(right_shoulder_x, right_hip_x)
right_top_y = select_valid_coordinate(right_shoulder_y, left_shoulder_y)
left_bottom_x = select_valid_coordinate(left_shoulder_x, left_hip_x)
left_bottom_y = select_valid_coordinate(right_hip_y, left_hip_y)

Configure overlay positioning
Calculate the T-shirt placement with configurable offsets.
Using the calculated upper body bounding box coordinates (upper right and lower left points), the app applies configurable offsets and scaling factors to determine the overlay image's final position, height, and width.
Adjust these offset and scaling parameters based on your specific overlay image dimensions and desired fit.
# Constants definition
OVERLAY_IMG_X_OFFSET_RATIO = 2.0 # Decrease this to increase the leftward offset of the overlay image
OVERLAY_IMG_Y_OFFSET_RATIO = 3.5 # Decrease this to increase the upward offset of the overlay image
OVERLAY_IMG_HEIGHT_RATIO = 2.2 # Decrease this to reduce the overlay image height
OVERLAY_IMG_WIDTH_RATIO = 1.5 # Decrease this to reduce the overlay image width
CONFIDENCE_THRESHOLD = 0.3 # Detection confidence threshold
# Adjust overlay image position based on reference points
overlay_img_x = int(right_top_x - (left_bottom_x - right_top_x) // OVERLAY_IMG_X_OFFSET_RATIO)
overlay_img_y = int(right_top_y - (left_bottom_y - right_top_y) // OVERLAY_IMG_Y_OFFSET_RATIO)
# Resize overlay image based on reference points
overlay_img_h = int((left_bottom_y - right_top_y) * OVERLAY_IMG_HEIGHT_RATIO)
overlay_img_w = int((left_bottom_x - right_top_x) * OVERLAY_IMG_WIDTH_RATIO)

Apply a transparent overlay
This applies the T-shirt image as a transparent overlay at the position calculated above, using the alpha channel for a realistic appearance.
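To get a feel for what the ratios do, here is a worked example of the position and size calculation with made-up bounding-box corners (pixels):

```python
# Same constants as in the app
OVERLAY_IMG_X_OFFSET_RATIO = 2.0
OVERLAY_IMG_Y_OFFSET_RATIO = 3.5
OVERLAY_IMG_HEIGHT_RATIO = 2.2
OVERLAY_IMG_WIDTH_RATIO = 1.5

# Made-up upper body corners: upper-right at (100, 80), lower-left at (220, 260)
right_top_x, right_top_y = 100, 80
left_bottom_x, left_bottom_y = 220, 260

# Shift the overlay left and up relative to the body box...
overlay_img_x = int(right_top_x - (left_bottom_x - right_top_x) // OVERLAY_IMG_X_OFFSET_RATIO)
overlay_img_y = int(right_top_y - (left_bottom_y - right_top_y) // OVERLAY_IMG_Y_OFFSET_RATIO)
# ...and make it larger than the box so the shirt covers the torso
overlay_img_h = int((left_bottom_y - right_top_y) * OVERLAY_IMG_HEIGHT_RATIO)
overlay_img_w = int((left_bottom_x - right_top_x) * OVERLAY_IMG_WIDTH_RATIO)

print(overlay_img_x, overlay_img_y, overlay_img_w, overlay_img_h)  # 40 29 180 396
```

So a 120x180 pixel body box produces a 180x396 pixel T-shirt anchored 60 pixels left and 51 pixels above the shoulder corner, which is why decreasing a ratio moves or shrinks the shirt as the comments describe.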
scene = frame.image
overlay_image = cv2.imread("tshirt.png", cv2.IMREAD_UNCHANGED)
resized_overlay = cv2.resize(overlay_image, (overlay_img_w, overlay_img_h))
# Separate the alpha channel
alpha_channel = resized_overlay[:, :, 3] / 255.0
rgb_channels = resized_overlay[:, :, :3]
# Copy the overlay part to the scene
for c in range(0, 3):
    scene[overlay_img_y:overlay_img_y+overlay_img_h, overlay_img_x:overlay_img_x+overlay_img_w, c] = (
        alpha_channel * rgb_channels[:, :, c]
        + (1 - alpha_channel) * scene[overlay_img_y:overlay_img_y+overlay_img_h, overlay_img_x:overlay_img_x+overlay_img_w, c]
    )

Complete application code
Here is the complete code including the processing part for each frame. You can also download the complete code from GitHub.
import cv2

from modlib.devices import AiCamera
from modlib.models.zoo import Higherhrnet
from modlib.apps.tracker.byte_tracker import BYTETracker

# Constants definition
OVERLAY_IMG_X_OFFSET_RATIO = 2.0  # Decrease this to increase the leftward offset of the overlay image
OVERLAY_IMG_Y_OFFSET_RATIO = 3.5  # Decrease this to increase the upward offset of the overlay image
OVERLAY_IMG_HEIGHT_RATIO = 2.2    # Decrease this to reduce the overlay image height
OVERLAY_IMG_WIDTH_RATIO = 1.5     # Decrease this to reduce the overlay image width
CONFIDENCE_THRESHOLD = 0.3        # Detection confidence threshold

class BYTETrackerArgs:
    track_thresh: float = 0.25
    track_buffer: int = 30
    match_thresh: float = 0.8
    aspect_ratio_thresh: float = 3.0
    min_box_area: float = 1.0
    mot20: bool = False

def select_valid_coordinate(coord_a, coord_b):
    """
    Select a valid coordinate, prioritizing the first non-zero value
    """
    if coord_a == 0 and coord_b == 0:
        return 0
    elif coord_a == 0:
        return coord_b
    else:
        return coord_a

def overlay_image_on_upper_body(frame, keypoints, overlay_image):
    """
    Overlay an image on the upper body
    """
    scene = frame.image
    height, width = scene.shape[:2]

    # Convert normalized keypoint coordinates to actual image coordinates
    left_shoulder_x = int(keypoints[5 * 2 + 1] * width)
    left_shoulder_y = int(keypoints[5 * 2] * height)
    right_shoulder_x = int(keypoints[6 * 2 + 1] * width)
    right_shoulder_y = int(keypoints[6 * 2] * height)
    left_hip_x = int(keypoints[11 * 2 + 1] * width)
    left_hip_y = int(keypoints[11 * 2] * height)
    right_hip_x = int(keypoints[12 * 2 + 1] * width)
    right_hip_y = int(keypoints[12 * 2] * height)

    # Determine upper body bounding box coordinates
    right_top_x = select_valid_coordinate(right_shoulder_x, right_hip_x)
    right_top_y = select_valid_coordinate(right_shoulder_y, left_shoulder_y)
    left_bottom_x = select_valid_coordinate(left_shoulder_x, left_hip_x)
    left_bottom_y = select_valid_coordinate(right_hip_y, left_hip_y)

    # Adjust overlay image position based on reference points
    overlay_img_x = int(right_top_x - (left_bottom_x - right_top_x) // OVERLAY_IMG_X_OFFSET_RATIO)
    overlay_img_y = int(right_top_y - (left_bottom_y - right_top_y) // OVERLAY_IMG_Y_OFFSET_RATIO)

    # Resize overlay image based on reference points
    overlay_img_h = int((left_bottom_y - right_top_y) * OVERLAY_IMG_HEIGHT_RATIO)
    overlay_img_w = int((left_bottom_x - right_top_x) * OVERLAY_IMG_WIDTH_RATIO)

    try:
        resized_overlay = cv2.resize(overlay_image, (overlay_img_w, overlay_img_h))
        # Separate the alpha channel
        alpha_channel = resized_overlay[:, :, 3] / 255.0
        rgb_channels = resized_overlay[:, :, :3]
        # Copy the overlay part to the scene
        for c in range(0, 3):
            scene[overlay_img_y:overlay_img_y+overlay_img_h, overlay_img_x:overlay_img_x+overlay_img_w, c] = (
                alpha_channel * rgb_channels[:, :, c]
                + (1 - alpha_channel) * scene[overlay_img_y:overlay_img_y+overlay_img_h, overlay_img_x:overlay_img_x+overlay_img_w, c]
            )
    except (cv2.error, ValueError):
        # Skip the overlay when the region is invalid, for example when
        # keypoints are missing or the overlay extends outside the frame
        pass

    return scene

def start_workout_demo():
    device = AiCamera()
    model = Higherhrnet()
    device.deploy(model)

    # Load the overlay image
    overlay_image = cv2.imread("tshirt.png", cv2.IMREAD_UNCHANGED)

    tracker = BYTETracker(BYTETrackerArgs())

    with device as stream:
        for frame in stream:
            detections = frame.detections[frame.detections.confidence > CONFIDENCE_THRESHOLD]
            detections = tracker.update(frame, detections)
            for k, _, _, _, t in detections:
                frame.image = overlay_image_on_upper_body(frame, k, overlay_image)
            frame.display()

if __name__ == "__main__":
    start_workout_demo()
    exit()

Testing and results
To test the virtual fitting functionality, I downloaded a T-shirt image from irasutoya and saved it as tshirt.png in the same directory as the modified app.py file.
The results look pretty good! The application overlays the T-shirt with realistic positioning and scaling, and tracking stays smooth in real time, creating an effective virtual fitting experience.
The workout monitoring sample app used here detects 17 skeletal keypoints, including face and lower body, so you could modify it to allow trying on glasses using the facial keypoints, for example.
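As one hypothetical sketch of that glasses idea: the two eye keypoints (indices 1 and 2 in the COCO ordering) give enough information to place and size a glasses overlay. The function, the scale factor, and the 0.4 aspect ratio below are all illustrative guesses to tune against your own overlay image:

```python
def glasses_box(keypoints, width, height, scale=2.5):
    """Compute a (x, y, w, h) overlay region for glasses from the eye keypoints.

    keypoints uses the same flattened [y, x] layout as the app; scale and the
    0.4 frame aspect ratio are tuning guesses, not values from the sample app.
    """
    left_eye_x = int(keypoints[1 * 2 + 1] * width)
    left_eye_y = int(keypoints[1 * 2] * height)
    right_eye_x = int(keypoints[2 * 2 + 1] * width)
    right_eye_y = int(keypoints[2 * 2] * height)

    eye_span = abs(left_eye_x - right_eye_x)
    glasses_w = int(eye_span * scale)     # frames are wider than the eye span
    glasses_h = int(glasses_w * 0.4)      # rough glasses aspect ratio (guess)
    center_x = (left_eye_x + right_eye_x) // 2
    center_y = (left_eye_y + right_eye_y) // 2
    # Return the top-left corner plus size of the overlay region
    return center_x - glasses_w // 2, center_y - glasses_h // 2, glasses_w, glasses_h

# Dummy normalized keypoints on a 640x480 frame: eyes level at y=0.25
kps = [0.0] * 34
kps[1 * 2], kps[1 * 2 + 1] = 0.25, 0.5    # left eye
kps[2 * 2], kps[2 * 2 + 1] = 0.25, 0.375  # right eye
print(glasses_box(kps, 640, 480))
```

The box returned here would replace the upper-body box in `overlay_image_on_upper_body`, with a glasses PNG (with alpha channel) in place of tshirt.png.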
Sony Semiconductor Solutions has released additional sample applications on GitHub in IMX500 Sample Applications. Why not take a look at these and try modifying them to make new applications?
When in trouble
If you encounter any issues while following this article, please feel free to leave a comment. Please also check the support site below. Note that it may take some time to respond to comments.
If you have questions related to Raspberry Pi, please check and utilize the forum below.
Want to learn more
Experiment further with the Raspberry Pi AI Camera by following the Get Started guide on the AITRIOS developer site.