- You can easily create a dozing detection application using the Raspberry Pi AI Camera
- This article explains the entire flow of application development using the Raspberry Pi AI Camera
- It is mainly aimed at those interested in or planning to use the Raspberry Pi AI Camera and AITRIOS
With the release of the Raspberry Pi AI Camera, I thought it would be fun to play around with it, so I developed an application! I tended to doze off during my student days, but as a teacher, seeing students dozing isn't exactly a pleasant experience... I sometimes notice people dozing off in crowded dining places too. In such situations, it would be easier to respond if you knew someone was dozing off! So, I decided to create a dozing detection application, partly to help keep myself in check...!
Main Text
No advanced programming or AI knowledge is required, but please understand that we will be using Git and Python to some extent.
1. Setting up the Raspberry Pi AI Camera
2. Implementing skeleton estimation from the Raspberry Pi AI Camera
3. Creating the application

1. Setting up the Raspberry Pi AI Camera
First, let's set up the Raspberry Pi AI Camera.
Follow this Raspberry Pi setup guide for the Raspberry Pi AI Camera to ensure everything is set up.
2. Implementing Skeleton Estimation from the Raspberry Pi AI Camera
Next, let's run the sample to see if we can estimate the skeleton with the Raspberry Pi AI Camera!
Implementing skeleton estimation and display from scratch would take considerable effort and time. However, our company recently released the Application Module Library, which allows you to easily create samples. Since it's a great opportunity, I will use it!
First, let's clone the Application Module Library (modlib) repository.
git clone https://github.com/SonySemiconductorSolutions/aitrios-rpi-application-module-library.git

Next, install uv by following the installation instructions provided in the uv documentation. Verify your uv installation by running:
uv --version

Then ensure that your Raspberry Pi runs the latest software:
sudo apt update && sudo apt full-upgrade
sudo apt install imx500-all
sudo apt install python3-opencv python3-munkres python3-picamera2

Next, we can run modlib's posenet example to test our system and see keypoint detections.
Finally, please run the following commands.
cd aitrios-rpi-application-module-library
uv run examples/aicam/posenet.py

How about that! It was a simple operation, but you can see that skeleton estimation is working properly! Even with the head down and wearing a mask, it recognizes the pose quite well! With this, we can certainly detect someone dozing off!
3. Creating the Application
We will modify part of the sample from Chapter 2 to build the actual dozing detection application! This time, we will judge that someone is dozing off when the coordinates of the KeyPoints from skeleton estimation hardly change between frames.
There are various methods, such as judging whether the eyes are closed; feel free to implement it in your preferred way! Let's proceed with the implementation as follows.
[3.1 Save skeleton data]
|
v
[3.2 Compute KeyPoint distance (prev vs current)]
|
v
{3.3 Distance > threshold?}
| No | Yes
v v
[motionless_count++] [motionless_count = 0]
\ /
v v
{3.4 motionless_count >= motionless_threshold?}
| Yes | No
v v
[Dozing] [Not dozing]
|
v
[3.5 UI display]

3.1. Save Skeleton Estimation Data
First, create a new Python script, SleepDetection.py. To start the logic of the script, we will save the skeleton estimation data, appending it to a list called poses_record. For comparison between the previous and current frames, the length of the list is limited to 2, but feel free to change it as needed. The additions to the earlier sample program are the lines that handle poses_record.
from modlib.apps import Annotator
from modlib.devices import AiCamera
from modlib.models.zoo import Posenet

device = AiCamera()
model = Posenet()
device.deploy(model)

annotator = Annotator()
poses_record = []

with device as stream:
    for frame in stream:
        detections = frame.detections[frame.detections.confidence > 0.5]
        annotator.annotate_keypoints(frame, detections)
        frame.display()

        poses = frame.detections
        poses_record.append(poses)
        if len(poses_record) > 2:
            poses_record.pop(0)

3.2. Calculate the Distance of Estimated KeyPoint Coordinates
Next, we will calculate the distance of the estimated KeyPoint coordinates between the previous and current frames. We will prepare a new function called track_motionless() for this purpose.
KEYPOINT_NAME = [
    "nose",           # 0
    "leftEye",        # 1
    "rightEye",       # 2
    "leftEar",        # 3
    "rightEar",       # 4
    "leftShoulder",   # 5
    "rightShoulder",  # 6
    "leftElbow",      # 7
    "rightElbow",     # 8
    "leftWrist",      # 9
    "rightWrist",     # 10
    "leftHip",        # 11
    "rightHip",       # 12
    "leftKnee",       # 13
    "rightKnee",      # 14
    "leftAnkle",      # 15
    "rightAnkle"      # 16
]

# Track motionless
def track_motionless(w, h, poses_record, motionless_count):
    for keypoint_idx in range(len(KEYPOINT_NAME)):
        x0 = int(poses_record[0].keypoints[0][keypoint_idx][0] * w)
        y0 = int(poses_record[0].keypoints[0][keypoint_idx][1] * h)
        x1 = int(poses_record[1].keypoints[0][keypoint_idx][0] * w)
        y1 = int(poses_record[1].keypoints[0][keypoint_idx][1] * h)
        distance = (x1 - x0)**2 + (y1 - y0)**2

We access the x, y coordinates from the previous data (poses_record[0]) and the current data (poses_record[1]) and calculate the distance. There are 17 types of KeyPoints that can be obtained! We don't specifically designate which KeyPoints to use this time, but it might be a good idea to impose restrictions, for example, only processing when leftEye and rightEye are recognized.
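As a quick sanity check of the formula above, here it is with made-up coordinates. Note that the value is a squared pixel distance (no square root is taken), which is why the DISTANCE_THRESHOLD of 100 used later corresponds to roughly 10 pixels of movement. The frame size and coordinates below are my own example values, not camera output.

```python
# Sanity check of the KeyPoint distance formula with made-up values.
w, h = 640, 480  # assumed frame size

prev_xy = (0.50, 0.40)  # normalized (x, y) in the previous frame (made up)
curr_xy = (0.51, 0.40)  # normalized (x, y) in the current frame (made up)

x0, y0 = int(prev_xy[0] * w), int(prev_xy[1] * h)
x1, y1 = int(curr_xy[0] * w), int(curr_xy[1] * h)

# Same formula as in track_motionless(): a *squared* pixel distance
distance = (x1 - x0)**2 + (y1 - y0)**2
print(distance)  # → 36: a ~6-pixel shift, well below a threshold of 100
```

So a small head wobble of a few pixels still counts as "motionless" under the threshold we use next.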
3.3. Judging Whether the Distance Exceeds the Threshold
Next, we will judge whether the distance exceeds the threshold. We will use a variable called motionless_count to count the number of consecutive frames in which the KeyPoint coordinates hardly move.
a. If the distance is below the threshold, increment motionless_count by 1
b. If the distance is above the threshold, reset motionless_count to 0

If someone is dozing off and the coordinates change little compared to before, motionless_count increases. We have prepared a threshold called DISTANCE_THRESHOLD to determine whether the coordinates are changing; please feel free to modify it as necessary. However, if only one KeyPoint is detected, the reliability of the data is not very high, so motionless_count is only increased when multiple KeyPoints are below the threshold. This also handles the case where no one is in the frame.
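Before looking at the real function, the counting rule can be illustrated in isolation with made-up per-frame squared distances (the distance values and the requirement of two still KeyPoints here are arbitrary, purely for illustration):

```python
# Toy illustration of rules a. and b. with made-up squared distances.
DISTANCE_THRESHOLD = 100  # squared-pixel threshold

frames = [
    [4, 9, 1],         # almost no movement -> count up
    [400, 900, 2500],  # large movement     -> reset
    [16, 25, 36],      # almost no movement -> count up again
]

counts = []
motionless_count = 0
for frame_distances in frames:
    # Number of KeyPoints that barely moved in this frame
    still = sum(1 for d in frame_distances if d < DISTANCE_THRESHOLD)
    if still >= 2:  # require several still KeyPoints before counting
        motionless_count += 1
    else:
        motionless_count = 0
    counts.append(motionless_count)

print(counts)  # → [1, 0, 1]
```

Any single frame with real movement resets the counter, so only sustained stillness accumulates.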
KEYPOINT_THRESHOLD = 5
DISTANCE_THRESHOLD = 100
KEYPOINT_SCORE_THRESHOLD = 0.5

# Track motionless
def track_motionless(w, h, poses_record, motionless_count):
    # Need two recorded frames before we can compare
    if len(poses_record) < 2:
        return 0
    keypoint_count = 0
    for keypoint_idx in range(len(KEYPOINT_NAME)):
        if (poses_record[0].keypoint_scores[0][keypoint_idx] >= KEYPOINT_SCORE_THRESHOLD and
                poses_record[1].keypoint_scores[0][keypoint_idx] >= KEYPOINT_SCORE_THRESHOLD):
            x0 = int(poses_record[0].keypoints[0][keypoint_idx][0] * w)
            y0 = int(poses_record[0].keypoints[0][keypoint_idx][1] * h)
            x1 = int(poses_record[1].keypoints[0][keypoint_idx][0] * w)
            y1 = int(poses_record[1].keypoints[0][keypoint_idx][1] * h)
            distance = (x1 - x0)**2 + (y1 - y0)**2
            if distance < DISTANCE_THRESHOLD:
                keypoint_count += 1
    if keypoint_count >= KEYPOINT_THRESHOLD:
        print("[INFO] Detected 5 or more KeyPoints.")
        motionless_count += 1
    else:
        motionless_count = 0
    return motionless_count

3.4. If motionless_count Exceeds the Threshold, Judge it as Dozing
Next, we process the dozing judgment when motionless_count exceeds the threshold. It is nothing complicated; we merely add the following:
if motionless_count > SLEEP_THRESHOLD:
    print("[INFO] Detect Sleep")

3.5. UI Display
Finally, let's clearly display on the UI that someone is dozing off! This time, we use OpenCV to draw "SLEEP" on the image.
You could also use other packages or approaches, such as playing a sound! Here, I've extracted only the part related to the UI display.
import cv2

TEXT = "SLEEP"

def display_text(image):
    print("[INFO] Display Text")
    cv2.putText(image,
                text=TEXT,
                org=(100, 50),
                fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                fontScale=2.0,
                color=(0, 0, 255),
                thickness=3)

Finally, here is the complete program. It is also publicly available on GitHub, so feel free to refer to it!
import time
import numpy as np
from modlib.apps import Annotator
from modlib.devices import AiCamera
from modlib.models.zoo import Posenet
import cv2

# Definition of constants
KEYPOINT_NAME = [
    "nose",           # 0
    "leftEye",        # 1
    "rightEye",       # 2
    "leftEar",        # 3
    "rightEar",       # 4
    "leftShoulder",   # 5
    "rightShoulder",  # 6
    "leftElbow",      # 7
    "rightElbow",     # 8
    "leftWrist",      # 9
    "rightWrist",     # 10
    "leftHip",        # 11
    "rightHip",       # 12
    "leftKnee",       # 13
    "rightKnee",      # 14
    "leftAnkle",      # 15
    "rightAnkle"      # 16
]

KEYPOINT_THRESHOLD = 5
DISTANCE_THRESHOLD = 100
KEYPOINT_SCORE_THRESHOLD = 0.5
SLEEP_THRESHOLD = 10
PAUSE_AFTER_DETECTION = 0.5  # (seconds)
TEXT = "SLEEP"

# Track motionless
def track_motionless(w, h, poses_record, motionless_count):
    # Need two recorded frames before we can compare
    if len(poses_record) < 2:
        return 0
    keypoint_count = 0
    # Calculate the coordinate difference between the previous frame and the current frame for each KeyPoint.
    for keypoint_idx in range(len(KEYPOINT_NAME)):
        if (poses_record[0].keypoint_scores[0][keypoint_idx] >= KEYPOINT_SCORE_THRESHOLD and
                poses_record[1].keypoint_scores[0][keypoint_idx] >= KEYPOINT_SCORE_THRESHOLD):
            x0 = int(poses_record[0].keypoints[0][keypoint_idx][0] * w)
            y0 = int(poses_record[0].keypoints[0][keypoint_idx][1] * h)
            x1 = int(poses_record[1].keypoints[0][keypoint_idx][0] * w)
            y1 = int(poses_record[1].keypoints[0][keypoint_idx][1] * h)
            print(f"{KEYPOINT_NAME[keypoint_idx]} x:{x0}, y:{y0}")
            distance = (x1 - x0)**2 + (y1 - y0)**2
            if distance < DISTANCE_THRESHOLD:
                keypoint_count += 1
    if keypoint_count >= KEYPOINT_THRESHOLD:
        print("[INFO] Detected 5 or more KeyPoints.")
        motionless_count += 1
    else:
        motionless_count = 0
    return motionless_count

# Display Text
def display_text(image):
    print("[INFO] Display Text")
    cv2.putText(image,
                text=TEXT,
                org=(100, 50),
                fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                fontScale=2.0,
                color=(0, 0, 255),
                thickness=3)

# Detection by modlib
def detect_sleep():
    print("[INFO] Starting sleep detection.")
    device = AiCamera()
    model = Posenet()
    device.deploy(model)

    annotator = Annotator()
    last_detection_time = 0
    motionless_count = 0
    poses_record = []

    with device as stream:
        for frame in stream:
            current_time = time.time()
            if current_time - last_detection_time > PAUSE_AFTER_DETECTION:
                last_detection_time = time.time()
                h, w, _ = frame.image.shape

                poses = frame.detections
                poses_record.append(poses)
                if len(poses_record) > 2:
                    poses_record.pop(0)

                motionless_count = track_motionless(w, h, poses_record, motionless_count)
                print(f"Motionless Count: {motionless_count}")
                if motionless_count > SLEEP_THRESHOLD:
                    print("[INFO] Detect Sleep")
                    display_text(frame.image)

            annotator.annotate_keypoints(frame, frame.detections)
            frame.display()

# Main
if __name__ == "__main__":
    detect_sleep()

You can run it with the following command. If you need a pyproject.toml file, one is provided on our GitHub, so please give it a try if you're interested!
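In case you don't have the GitHub file at hand, a minimal pyproject.toml sketch for uv might look like the following. The project name and dependency list are my assumptions, not the authoritative file; in particular, modlib itself comes from the cloned aitrios-rpi-application-module-library repository, so please check the repository's own pyproject.toml for the exact contents.

```toml
[project]
name = "sleep-detection"      # hypothetical name
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    # modlib is provided by the cloned
    # aitrios-rpi-application-module-library repository
    "opencv-python",          # used for the "SLEEP" overlay
    "numpy",
]
```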
uv run SleepDetection.py

When in Trouble
If you encounter any issues while reading the article, please feel free to comment on it. Also, please check the support site below. Please note that it may take some time to respond to comments.
If you have questions related to Raspberry Pi, please check and utilize the forum below.
Conclusion
In this article, I built a dozing detection application using the Raspberry Pi AI Camera.
While purchasing a dedicated device solely for dozing detection may feel unnecessary, the Raspberry Pi AI Camera is a versatile platform that can be repurposed for many other computer-vision projects, making it a practical choice beyond this single use case.
I hope this project serves as a useful reference and a starting point for your own experiments. If you build on it or have ideas for additional features or related applications, I’d love to hear your suggestions.







