- You can easily create a dozing detection application using the Raspberry Pi AI Camera
- This article explains the entire flow of application development using the Raspberry Pi AI Camera
- It is mainly aimed at those interested in or planning to use the Raspberry Pi AI Camera and AITRIOS
With the release of the Raspberry Pi AI Camera, I thought it would be fun to play around with it, so I developed an application! I tended to doze off during my student days, but as a teacher, seeing students dozing isn't exactly a pleasant experience... I sometimes notice people dozing off in crowded dining places too. In such situations, it would be easier to respond if you knew someone was dozing off! So, I decided to create a dozing detection application, partly to help keep myself in check...!
Main Text
No advanced programming or AI knowledge is required, but please understand that we will be using Git and Python to some extent.
1. Setting up the Raspberry Pi AI Camera
2. Implementing skeleton estimation from the Raspberry Pi AI Camera
3. Creating the application

1. Setting up the Raspberry Pi AI Camera
First, let's set up the Raspberry Pi AI Camera.
Follow this Raspberry Pi setup guide for the Raspberry Pi AI Camera to ensure everything is set up.
2. Implementing Skeleton Estimation from the Raspberry Pi AI Camera
Next, let's run the sample to see if we can estimate the skeleton with the Raspberry Pi AI Camera!
Implementing skeleton estimation and display from scratch would take considerable effort and time. However, our company recently released the Application Module Library, which allows you to easily create samples. Since it's a great opportunity, I will use it!
First, let's clone the Application Module Library (modlib) repository.
git clone https://github.com/SonySemiconductorSolutions/aitrios-rpi-application-module-library.git

Next, install uv by following the installation instructions provided in the uv documentation. Verify your uv installation by running:
uv --version

Then ensure that your Raspberry Pi runs the latest software:
sudo apt update && sudo apt full-upgrade
sudo apt install imx500-all
sudo apt install python3-opencv python3-munkres python3-picamera2

Next, we can run modlib's posenet example to test our system and see keypoint detections.
Finally, please run the following commands.
cd aitrios-rpi-application-module-library
uv run examples/aicam/posenet.py

How about that! It was a simple operation, but you can see that skeleton estimation is working properly! Even with the head down and wearing a mask, it recognizes the pose quite well! With this, we can certainly detect someone dozing off!
3. Creating the Application
We will modify part of the sample from Chapter 2 to build the actual dozing detection application! This time, we will judge that someone is dozing off when the coordinates of the KeyPoints from skeleton estimation hardly change between frames.
There are various methods, such as judging whether the eyes are closed; feel free to implement it in your preferred way! Let's proceed with the implementation as follows.
[3.1 Save skeleton data]
|
v
[3.2 Compute KeyPoint distance (prev vs current)]
|
v
{3.3 Distance > threshold?}
| No | Yes
v v
[motionless_count++] [motionless_count = 0]
\ /
v v
{3.4 motionless_count >= motionless_threshold?}
| Yes | No
v v
[Dozing] [Not dozing]
|
v
[3.5 UI display]

3.1. Save Skeleton Estimation Data
First, create a new Python script, SleepDetection.py. To start the logic of the script, we will save the skeleton estimation data, appending it to a list called poses_record. For comparison between the previous and current frames, the length of the list is limited to 2, but feel free to change it as needed. The additions to the earlier sample program are the lines that handle poses_record.
from modlib.apps import Annotator
from modlib.devices import AiCamera
from modlib.models.zoo import Posenet

device = AiCamera()
model = Posenet()
device.deploy(model)

annotator = Annotator()
poses_record = []

with device as stream:
    for frame in stream:
        detections = frame.detections[frame.detections.confidence > 0.5]
        annotator.annotate_keypoints(frame, detections)
        frame.display()

        poses = frame.detections
        poses_record.append(poses)
        if len(poses_record) > 2:
            poses_record.pop(0)

3.2. Calculate the Distance of Estimated KeyPoint Coordinates
Next, we will calculate the distance of the estimated KeyPoint coordinates between the previous and current frames. We will prepare a new function called track_motionless() for this purpose.
KEYPOINT_NAME = [
    "nose",           # 0
    "leftEye",        # 1
    "rightEye",       # 2
    "leftEar",        # 3
    "rightEar",       # 4
    "leftShoulder",   # 5
    "rightShoulder",  # 6
    "leftElbow",      # 7
    "rightElbow",     # 8
    "leftWrist",      # 9
    "rightWrist",     # 10
    "leftHip",        # 11
    "rightHip",       # 12
    "leftKnee",       # 13
    "rightKnee",      # 14
    "leftAnkle",      # 15
    "rightAnkle"      # 16
]

# Track motionless
def track_motionless(w, h, poses_record, motionless_count):
    for keypoint_idx in range(len(KEYPOINT_NAME)):
        x0 = int(poses_record[0].keypoints[0][keypoint_idx][0] * w)
        y0 = int(poses_record[0].keypoints[0][keypoint_idx][1] * h)
        x1 = int(poses_record[1].keypoints[0][keypoint_idx][0] * w)
        y1 = int(poses_record[1].keypoints[0][keypoint_idx][1] * h)
        distance = (x1 - x0)**2 + (y1 - y0)**2

We access the x, y coordinates from the previous data (poses_record[0]) and the current data (poses_record[1]) and calculate the distance. There are 17 types of KeyPoints that can be obtained! We don't specifically designate which KeyPoints to use this time, but it might be a good idea to impose restrictions, for example, only processing when leftEye and rightEye are recognized.
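As a quick sanity check of the formula above, here it is with made-up coordinates. Note that the value is a squared pixel distance (no square root is taken), which is why the DISTANCE_THRESHOLD of 100 used later corresponds to roughly 10 pixels of movement. The frame size and coordinates below are my own example values, not camera output.

```python
# Sanity check of the KeyPoint distance formula with made-up values.
w, h = 640, 480  # assumed frame size

prev_xy = (0.50, 0.40)  # normalized (x, y) in the previous frame (made up)
curr_xy = (0.51, 0.40)  # normalized (x, y) in the current frame (made up)

x0, y0 = int(prev_xy[0] * w), int(prev_xy[1] * h)
x1, y1 = int(curr_xy[0] * w), int(curr_xy[1] * h)

# Same formula as in track_motionless(): a *squared* pixel distance
distance = (x1 - x0)**2 + (y1 - y0)**2
print(distance)  # → 36: a ~6-pixel shift, well below a threshold of 100
```

So a small head wobble of a few pixels still counts as "motionless" under the threshold we use next.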
3.3. Judging Whether the Distance Exceeds the Threshold
Next, we will judge whether the distance exceeds the threshold. We will use a variable called motionless_count to count the number of consecutive frames in which the KeyPoint coordinates hardly move.
a. If the distance is below the threshold, increment motionless_count by 1
b. If the distance is above the threshold, reset motionless_count to 0

If someone is dozing off and the coordinates change little compared to before, motionless_count increases. We have prepared a threshold called DISTANCE_THRESHOLD to determine whether the coordinates are changing; please feel free to modify it as necessary. However, if only one KeyPoint is detected, the reliability of the data is not very high, so motionless_count is only increased when multiple KeyPoints are below the threshold. This also handles the case where no one is in the frame.
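Before looking at the real function, the counting rule can be illustrated in isolation with made-up per-frame squared distances (the distance values and the requirement of two still KeyPoints here are arbitrary, purely for illustration):

```python
# Toy illustration of rules a. and b. with made-up squared distances.
DISTANCE_THRESHOLD = 100  # squared-pixel threshold

frames = [
    [4, 9, 1],         # almost no movement -> count up
    [400, 900, 2500],  # large movement     -> reset
    [16, 25, 36],      # almost no movement -> count up again
]

counts = []
motionless_count = 0
for frame_distances in frames:
    # Number of KeyPoints that barely moved in this frame
    still = sum(1 for d in frame_distances if d < DISTANCE_THRESHOLD)
    if still >= 2:  # require several still KeyPoints before counting
        motionless_count += 1
    else:
        motionless_count = 0
    counts.append(motionless_count)

print(counts)  # → [1, 0, 1]
```

Any single frame with real movement resets the counter, so only sustained stillness accumulates.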
KEYPOINT_THRESHOLD = 5
DISTANCE_THRESHOLD = 100
KEYPOINT_SCORE_THRESHOLD = 0.5

# Track motionless
def track_motionless(w, h, poses_record, motionless_count):
    # Need two recorded frames before we can compare
    if len(poses_record) < 2:
        return 0
    keypoint_count = 0
    for keypoint_idx in range(len(KEYPOINT_NAME)):
        if (poses_record[0].keypoint_scores[0][keypoint_idx] >= KEYPOINT_SCORE_THRESHOLD and
                poses_record[1].keypoint_scores[0][keypoint_idx] >= KEYPOINT_SCORE_THRESHOLD):
            x0 = int(poses_record[0].keypoints[0][keypoint_idx][0] * w)
            y0 = int(poses_record[0].keypoints[0][keypoint_idx][1] * h)
            x1 = int(poses_record[1].keypoints[0][keypoint_idx][0] * w)
            y1 = int(poses_record[1].keypoints[0][keypoint_idx][1] * h)
            distance = (x1 - x0)**2 + (y1 - y0)**2
            if distance < DISTANCE_THRESHOLD:
                keypoint_count += 1
    if keypoint_count >= KEYPOINT_THRESHOLD:
        print("[INFO] Detected 5 or more KeyPoints.")
        motionless_count += 1
    else:
        motionless_count = 0
    return motionless_count

3.4. If motionless_count Exceeds the Threshold, Judge it as Dozing
Next, we process the dozing judgment when motionless_count exceeds the threshold. It is nothing complicated; we merely add the following:
if motionless_count > SLEEP_THRESHOLD:
    print("[INFO] Detect Sleep")

3.5. UI Display
Finally, let's clearly display on the UI that someone is dozing off! This time, we use OpenCV to draw "SLEEP" on the image.
You could also use other packages or approaches, such as playing a sound! Here, I've extracted only the part related to the UI display.
import cv2

TEXT = "SLEEP"

def display_text(image):
    print("[INFO] Display Text")
    cv2.putText(image,
                text=TEXT,
                org=(100, 50),
                fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                fontScale=2.0,
                color=(0, 0, 255),
                thickness=3)

Finally, here is the complete program. It is also publicly available on GitHub, so feel free to refer to it!
import time
import numpy as np
from modlib.apps import Annotator
from modlib.devices import AiCamera
from modlib.models.zoo import Posenet
import cv2

# Definition of constants
KEYPOINT_NAME = [
    "nose",           # 0
    "leftEye",        # 1
    "rightEye",       # 2
    "leftEar",        # 3
    "rightEar",       # 4
    "leftShoulder",   # 5
    "rightShoulder",  # 6
    "leftElbow",      # 7
    "rightElbow",     # 8
    "leftWrist",      # 9
    "rightWrist",     # 10
    "leftHip",        # 11
    "rightHip",       # 12
    "leftKnee",       # 13
    "rightKnee",      # 14
    "leftAnkle",      # 15
    "rightAnkle"      # 16
]

KEYPOINT_THRESHOLD = 5
DISTANCE_THRESHOLD = 100
KEYPOINT_SCORE_THRESHOLD = 0.5
SLEEP_THRESHOLD = 10
PAUSE_AFTER_DETECTION = 0.5  # (seconds)
TEXT = "SLEEP"

# Track motionless
def track_motionless(w, h, poses_record, motionless_count):
    # Need two recorded frames before we can compare
    if len(poses_record) < 2:
        return 0
    keypoint_count = 0
    # Calculate the coordinate difference between the previous frame and the current frame for each KeyPoint.
    for keypoint_idx in range(len(KEYPOINT_NAME)):
        if (poses_record[0].keypoint_scores[0][keypoint_idx] >= KEYPOINT_SCORE_THRESHOLD and
                poses_record[1].keypoint_scores[0][keypoint_idx] >= KEYPOINT_SCORE_THRESHOLD):
            x0 = int(poses_record[0].keypoints[0][keypoint_idx][0] * w)
            y0 = int(poses_record[0].keypoints[0][keypoint_idx][1] * h)
            x1 = int(poses_record[1].keypoints[0][keypoint_idx][0] * w)
            y1 = int(poses_record[1].keypoints[0][keypoint_idx][1] * h)
            print(f"{KEYPOINT_NAME[keypoint_idx]} x:{x0}, y:{y0}")
            distance = (x1 - x0)**2 + (y1 - y0)**2
            if distance < DISTANCE_THRESHOLD:
                keypoint_count += 1
    if keypoint_count >= KEYPOINT_THRESHOLD:
        print("[INFO] Detected 5 or more KeyPoints.")
        motionless_count += 1
    else:
        motionless_count = 0
    return motionless_count

# Display Text
def display_text(image):
    print("[INFO] Display Text")
    cv2.putText(image,
                text=TEXT,
                org=(100, 50),
                fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                fontScale=2.0,
                color=(0, 0, 255),
                thickness=3)

# Detection by modlib
def detect_sleep():
    print("[INFO] Starting sleep detection.")
    device = AiCamera()
    model = Posenet()
    device.deploy(model)

    annotator = Annotator()
    last_detection_time = 0
    motionless_count = 0
    poses_record = []

    with device as stream:
        for frame in stream:
            current_time = time.time()
            if current_time - last_detection_time > PAUSE_AFTER_DETECTION:
                last_detection_time = time.time()
                h, w, _ = frame.image.shape

                poses = frame.detections
                poses_record.append(poses)
                if len(poses_record) > 2:
                    poses_record.pop(0)

                motionless_count = track_motionless(w, h, poses_record, motionless_count)
                print(f"Motionless Count: {motionless_count}")
                if motionless_count > SLEEP_THRESHOLD:
                    print("[INFO] Detect Sleep")
                    display_text(frame.image)

            annotator.annotate_keypoints(frame, frame.detections)
            frame.display()

# Main
if __name__ == "__main__":
    detect_sleep()

You can run it with the following command. If you need a pyproject.toml file, one is provided on our GitHub, so please give it a try if you're interested!
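In case you don't have the GitHub file at hand, a minimal pyproject.toml sketch for uv might look like the following. The project name and dependency list are my assumptions, not the authoritative file; in particular, modlib itself comes from the cloned aitrios-rpi-application-module-library repository, so please check the repository's own pyproject.toml for the exact contents.

```toml
[project]
name = "sleep-detection"      # hypothetical name
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    # modlib is provided by the cloned
    # aitrios-rpi-application-module-library repository
    "opencv-python",          # used for the "SLEEP" overlay
    "numpy",
]
```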
uv run SleepDetection.py

When in Trouble
If you encounter any issues while reading the article, please feel free to comment on it. Also, please check the support site below. Please note that it may take some time to respond to comments.
If you have questions related to Raspberry Pi, please check and utilize the forum below.
Conclusion
In this article, I built a dozing detection application using the Raspberry Pi AI Camera.
While purchasing a dedicated device solely for dozing detection may feel unnecessary, the Raspberry Pi AI Camera is a versatile platform that can be repurposed for many other computer-vision projects, making it a practical choice beyond this single use case.
I hope this project serves as a useful reference and a starting point for your own experiments. If you build on it or have ideas for additional features or related applications, I’d love to hear your suggestions.







