Welcome to this tutorial where you'll learn how to create and deploy your own custom keypoint detection AI model on the Raspberry Pi AI Camera.
Building AI models that run efficiently on edge devices like the Raspberry Pi AI Camera can be challenging. To simplify this process, we've developed sample code and tools that optimize the entire workflow, from training to deployment.
This article is Part 2 of our series on AI model development for the Raspberry Pi AI Camera. Part 1 covered creating and running a custom object detection AI model on the Raspberry Pi AI Camera, using Nanodet.
🎯 Overview of this article
- Create a custom keypoint detection model using our sample code and step-by-step instructions
- Generate optimized model files ready for deployment on the Raspberry Pi AI Camera
- Implement and run your trained model on the device
A keypoint AI model is an AI model that detects distinctive points of interest (keypoints) on humans or objects, and outputs their positions as coordinates.
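To make the idea of coordinate output concrete, here is an illustrative sketch of a typical result structure (the names and values are hypothetical, not a specific library's API):

# Hypothetical keypoint-detection output: one (x, y, score) entry per keypoint,
# with coordinates normalized to [0, 1] relative to image width and height.
detections = [
    {"keypoint": "head",       "x": 0.48, "y": 0.21, "score": 0.95},
    {"keypoint": "left_wrist", "x": 0.33, "y": 0.57, "score": 0.88},
]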
The Raspberry Pi AI Camera comes with ready-to-use pre-trained models specifically designed for human pose estimation. These models can detect skeletal keypoints including head, shoulders, elbows, wrists, hips, knees, and ankles.
These models work in real-time, making them ideal for applications like posture analysis, gesture recognition, and motion tracking.
Using pre-trained models
If you want to detect human poses without training your own model, you can use the pre-built models provided by Raspberry Pi. These are immediately ready for deployment without additional training.
You can find the official tutorial here.
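If you just want to try the built-in human pose model, the rpicam-apps post-processing pipeline provides a quick way to do so. For example (the asset path below comes from the Raspberry Pi AI Camera documentation at the time of writing and may change between releases):

rpicam-hello -t 0s --post-process-file /usr/share/rpi-camera-assets/imx500_posenet.json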
Expanding application scope through re-training
While the pre-trained models excel at human pose detection, you'll need to train your own model if you want to detect keypoints on other objects.
By retraining with your own dataset, you can detect keypoints on nearly any object or subject. This opens up numerous possibilities:
- Read analog gauges and dials on production lines
- Estimate joint positions for robot manipulation and control
- Track posture and movement patterns of pets or wild animals
- Detect alignment points on manufactured parts for quality control
You define which keypoints matter for your use case, making keypoint detection adaptable to any application that requires locating precise points on specific shapes.
In this tutorial, we'll start with a basic example: detecting the corner points of an arrow shape. This straightforward and practical use case will help you understand the fundamentals of edge AI keypoint detection before moving on to more complex applications.
We've provided a ready-to-use arrow dataset so you can quickly test the workflow. Once you understand the process, you can easily swap in your own datasets for any custom keypoint detection task.
🔧 System requirements

For deployment
- Raspberry Pi (any model compatible with the AI Camera)
- Raspberry Pi AI Camera

For training
- Computer with NVIDIA GPU (highly recommended for efficient training)
- Ubuntu 22.04 (or compatible Linux distribution)
- Python 3.10
Note: While a GPU significantly speeds up training, you can also train on a CPU; it will just take longer.
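Before committing to a long training run, it's worth confirming that TensorFlow (installed during the environment setup below) can actually see your GPU; an empty list means training will fall back to the CPU:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"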
Building the environment locally
Note: This section covers setting up the training environment locally on your machine. If you already set up the local environment in our custom object detection AI model tutorial, you can skip this section.
1. Clone the repository
git clone https://github.com/SonySemiconductorSolutions/aitrios-rpi-training-samples.git
cd aitrios-rpi-training-samples
2. Setup
Install the necessary packages.
sudo apt update
sudo apt -y install --no-install-recommends apt-utils build-essential libgl1 libgl1-mesa-glx libglib2.0-0 python3 python3-pip python3-setuptools git gnupg
3. Create and activate a Python 3.10 virtual environment
This tutorial assumes Python 3.10, so first, let's confirm that 3.10 is installed.
Note: Due to version dependencies for the required libraries, make sure to use Python 3.10.
python3.10 --version
If a version number such as Python 3.10.x is displayed, Python 3.10 is installed.
If you don't have it yet, you can install Python 3.10 with the following steps, for example:
sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.10 python3.10-venv python3.10-dev
Next, create a virtual environment.
python3.10 -m venv .venv
source .venv/bin/activate
4. Install the packages
pip install .
pip install -e third_party/nanodet/nanodet
Step 1: Train the model
The sample repository provides various settings related to AI model training, quantization, and evaluation in a number of .ini configuration files.
For this tutorial, we will train the model using posenet_arrow.ini.
A key feature of this approach is that you can flexibly adjust training conditions just by editing the settings, without touching the Python code. You can choose the dataset and task, and tune parameters. This customization method is explained in the section Training with your own dataset.
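Purely as an illustration of the shape of these configuration files (the section and key names follow the pattern shown later in Training with your own dataset, while the values here are placeholders; posenet_arrow.ini in the repository is the authoritative reference):

[DATASET]
NAME = PosenetArrow

[MODEL]
CLASS_NUM = 7

[TRAINER]
CONFIG = ./config/posenet_arrow.yaml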
Dataset used
The dataset used for this training is the arrow shape dataset.
You can go here to get the original images.
In the sample dataset, seven keypoints are assigned, one at each vertex of the arrow.
Move to the samples folder where the .ini files are stored, then run the following command to load the .ini file and start training. The command performs training and quantization of the model based on the specified .ini file.
cd samples
imx500_zoo posenet_arrow.ini
When run, this will download the dataset and start training. The default settings file specifies 50 epochs.
This generates the following model files:
./samples/model/posenet_arrow/
├── best_model_0001.h5
├── best_model_0002.h5
├── best_model_0003.h5
├── best_model_0005.h5
├── best_model_0006.h5
├── posenet_arrow.h5                  # keras model (HDF5 format)
├── posenet_arrow.keras               # keras model
├── posenet_arrow_quantized.keras     # quantized keras model
└── posenet_arrow_quantized.tflite    # quantized tflite model
Step 2: Quantize and Convert the Trained Model
Convert and package the quantized model into a format compatible with the Raspberry Pi AI Camera.
Note: Continue working in your venv environment. Use the same TensorFlow version for quantization that you used for training; version mismatches cause errors.
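A quick way to check which TensorFlow version is active in your venv:

python3 -c "import tensorflow as tf; print(tf.__version__)"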
This tutorial explains the process of converting the .keras model to a network.rpk file that we can upload to the IMX500.
1. Install Edge-MDT (Model Development Toolkit)
Edge-MDT (Model Development Toolkit) includes the tools required to quantize, compress, and convert AI models. There are versions available for PyTorch and TensorFlow; we will install the TensorFlow version:
pip install edge-mdt[tf]
2. Perform the model conversion
The following command converts the quantized model for the IMX500; the -o flag specifies the directory to create and store the AI model files. In the following example, they will be output to convert_result.
cd ./samples/model/posenet_arrow/
imxconv-tf -i posenet_arrow_quantized.keras -o convert_result
The conversion takes about 30 seconds.
This generates the following files:
./samples/model/posenet_arrow/convert_result/
├── dnnParams.xml                                # Network parameter definition
├── packerOut.zip                                # Packaged output model for IMX500
├── posenet_arrow_quantized_MemoryReport.json    # Memory usage report
└── posenet_arrow_quantized.pbtxt                # TensorFlow GraphDef textual representation
When packerOut.zip is generated, Step 2 is complete.
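As an optional sanity check, you can list the contents of the generated package (assuming the unzip utility is installed):

unzip -l convert_result/packerOut.zip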
Step 3: Package the model on the Raspberry Pi
Note: ⚠️ From here, the operations will be on the Raspberry Pi ⚠️
- Move the converted folder to the Raspberry Pi. In this tutorial, the folder name is convert_result.
- Move the file packerOut.zip generated in Step 2 to any location on the Raspberry Pi. You can use scp or any tool you prefer for file transfer.
- If you don't have them installed, install the model packaging tools on the Raspberry Pi:
sudo apt install imx500-tools
- Perform the conversion to package the model into an RPK file. The following command specifies packerOut.zip for input and saves the packaged output file to rpk_output_folder.
imx500-package -i packerOut.zip -o rpk_output_folder
This generates a packaged AI model with an .rpk extension.
./rpk_output_folder/
└── network.rpk    # The final output to be deployed to the AI Camera
When the network.rpk file is generated, Step 3 is complete.
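Before moving on, you can optionally verify that the AI Camera is detected, using a standard rpicam-apps preview (this assumes an up-to-date Raspberry Pi OS with rpicam-apps installed):

rpicam-hello -t 5s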
Step 4: Run the model on the Raspberry Pi AI Camera
Let's install the necessary libraries on the Raspberry Pi.
To visualize the keypoints, we'll use Modlib, a library for developing applications for the Raspberry Pi AI Camera. It ships with several sample detection models and makes it easy to work with everything from pre-trained models to custom models. In this tutorial, we'll use it to implement a simple visualization of the AI model's output.
1. Install Modlib:
pip install git+https://github.com/SonySemiconductorSolutions/aitrios-rpi-application-module-library.git
2. Prepare a script for visualization
Create a file visualize_arrow.py based on the following sample script.
This script detects the seven vertices of an arrow and visualizes each with a different color. Be sure to update the path to your .rpk AI model file in the script (the MODEL_PACKED constant near the top).
Example Visualization Script
from pathlib import Path
from typing import List
import argparse
import cv2
import numpy as np
from modlib.devices import AiCamera
from modlib.models import COLOR_FORMAT, Model, MODEL_TYPE
from modlib.models.results import Poses
from modlib.models.post_processors import pp_personlab
# TODO: Update this path to point to your actual model file
MODEL_PACKED = Path("./path/to/your/model.rpk")
class ArrowPosenet(Model):
def __init__(
self,
weights,
model_type=MODEL_TYPE.RPK_PACKAGED,
is_quantized: bool = True,
):
super().__init__(
model_file=weights,
model_type=model_type,
color_format=COLOR_FORMAT.RGB,
preserve_aspect_ratio=False,
)
        self.num_kp = 7          # number of keypoints (arrow vertices)
        self.peak_thresh = 0.1   # minimum peak score for a keypoint candidate
        self.nms_thresh = 0.05   # non-maximum suppression threshold
        self.kp_radius = 3       # keypoint radius used by the post-processor
        self.in_width = 481      # model input width
        self.in_height = 353     # model input height
        self.is_quantized = is_quantized
def pre_process(self, img):
img_resized = cv2.resize(
img,
(self.in_width, self.in_height),
interpolation=cv2.INTER_AREA,
)
        in_tensor = img_resized / 255.0  # normalize pixel values to [0, 1]
in_tensor = np.expand_dims(in_tensor, axis=0)
return img_resized, in_tensor
def post_process(self, output_tensors: List[np.ndarray]) -> Poses:
edges = [
(0, 1), (0, 6), (1, 2), (2, 3),
(3, 4), (4, 5), (5, 6),
]
return pp_personlab(
output_tensors,
num_keypoints=self.num_kp,
edges=edges,
peak_threshold=self.peak_thresh,
nms_threshold=self.nms_thresh,
kp_radius=self.kp_radius,
)
def visualize_arrow(
model_file,
threshold=0.1,
edge_thickness=4,
draw_edges=False,
draw_keypoints=True
):
"""Display arrow keypoints and skeleton edges on camera feed.
Args:
model_file: Path to the model file
threshold: Detection threshold
edge_thickness: Thickness of skeleton edges
draw_edges: Whether to draw skeleton edges (default: False)
draw_keypoints: Whether to draw keypoints (default: True)
"""
model_path = Path(model_file)
if not model_path.exists():
raise FileNotFoundError(f"Model file not found: {model_path}")
device = AiCamera()
model = ArrowPosenet(weights=model_path)
device.deploy(model)
try:
with device as stream:
for frame in stream:
if not hasattr(frame, "image") or frame.image is None:
continue
if (
not hasattr(frame, "detections")
or frame.detections.n_detections == 0
):
frame.display()
continue
try:
annotation_threshold = threshold
colors = [
(0, 0, 255),
(0, 255, 0),
(255, 0, 0),
(0, 255, 255),
(255, 0, 255),
(255, 255, 0),
(128, 128, 255),
]
radius = 8
h, w, _ = frame.image.shape
poses = frame.detections
keypoint_map = {}
for i in range(poses.n_detections):
for k in range(model.num_kp):
score = poses.keypoint_scores[i, k]
if score > annotation_threshold:
x = int(poses.keypoints[i, k, 0] * w)
y = int(poses.keypoints[i, k, 1] * h)
if x > 0 and x < w and y > 0 and y < h:
keypoint_map[k] = (x, y)
if draw_edges:
edges = [
(0, 1), (0, 6), (1, 2), (2, 3),
(3, 4), (4, 5), (5, 6),
]
edge_color = (255, 255, 255)
for start_kp, end_kp in edges:
if (start_kp in keypoint_map and
end_kp in keypoint_map):
x1, y1 = keypoint_map[start_kp]
x2, y2 = keypoint_map[end_kp]
cv2.line(frame.image, (x1, y1), (x2, y2),
edge_color, edge_thickness)
if draw_keypoints:
for k, (x, y) in keypoint_map.items():
col = colors[k % len(colors)]
cv2.circle(frame.image, (x, y), radius, col, -1)
frame.display()
except Exception:
continue
except KeyboardInterrupt:
pass
def main():
parser = argparse.ArgumentParser(
description="Arrow keypoint detection visualization"
)
parser.add_argument(
"--model_file",
type=str,
default=str(MODEL_PACKED),
help="Path to the model file"
)
parser.add_argument(
"--threshold",
type=float,
default=0.1,
help="Detection threshold (default: 0.1)"
)
parser.add_argument(
"--edge_thickness",
type=int,
default=4,
help="Thickness of skeleton edges (default: 4)"
)
parser.add_argument(
"--draw_edges",
action="store_true",
default=False,
help="Draw skeleton edges (default: False)"
)
args = parser.parse_args()
# Keypoints are drawn by default
draw_keypoints = True
visualize_arrow(
model_file=args.model_file,
threshold=args.threshold,
edge_thickness=args.edge_thickness,
draw_edges=args.draw_edges,
draw_keypoints=draw_keypoints
)
if __name__ == "__main__":
    main()
3. Execute visualize_arrow.py:
python3 visualize_arrow.py
Results
When you run the script, the AI model automatically deploys to your Raspberry Pi AI Camera, and a preview window appears showing real-time detection.
This is what you'll see:
The keypoints are accurately detected at each vertex, confirming that your custom-trained model is working correctly.
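The script also accepts a few command-line options, matching the argparse definitions in the sample above. For example, to point it at your packaged model and draw the skeleton edges as well:

python3 visualize_arrow.py --model_file ./rpk_output_folder/network.rpk --draw_edges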
Training with your own dataset
Ready to use your own dataset? This is what to change:
- Update the .ini configuration file:
  [DATASET] NAME = YourDatasetName
  [MODEL] CLASS_NUM = Number of classes
  [TRAINER] CONFIG = Your YAML configuration file
- Follow the instructions in the repository to replace the dataset and modify the configuration files.
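Keypoint datasets are commonly annotated in COCO format. Whether this repository's loaders consume COCO directly depends on the dataset you configure, so treat the following purely as an illustration of how seven arrow keypoints could be annotated (all names and values are hypothetical):

# One COCO-style annotation entry: keypoints are stored as flattened
# (x, y, visibility) triples -- here, 7 arrow vertices with visibility 2 (visible).
annotation = {
    "image_id": 1,
    "category_id": 1,
    "num_keypoints": 7,
    "keypoints": [120, 40, 2,  160, 80, 2,  140, 80, 2,  140, 200, 2,
                  100, 200, 2,  100, 80, 2,  80, 80, 2],
    "bbox": [80, 40, 80, 160],  # x, y, width, height
}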
If you encounter difficulties while working through this article, please feel free to comment on it. Please note that it may take some time to respond to comments.
If you have questions related to Raspberry Pi, you can also check and use the forum below:
Conclusion
In this tutorial, you've learned how to create and deploy custom keypoint detection models on the Raspberry Pi AI Camera using our retraining tools and sample code.
While we used arrow detection as a learning exercise, the techniques you've learned apply to countless real-world scenarios. Just by swapping datasets, you can build keypoint detection solutions for a wide range of edge AI applications.
Now it's your turn to explore! We encourage you to experiment with your own datasets and use cases. The flexibility of keypoint detection makes it a powerful tool for solving unique problems in edge AI.
Thank you for reading this article.
Found this tutorial helpful? Consider giving it a like and share it with others who might benefit from custom AI model development on Raspberry Pi!
Want to learn more?
Experiment further with the Raspberry Pi AI Camera by following the Get Started guide on the AITRIOS developer site.