This project demonstrates an end-to-end real-time edge vision pipeline that combines object detection, multi-object tracking, and pixel-based motion estimation to reliably capture images (or record video) only when something is actually moving—optionally only inside user-defined Areas.
Using a NanoDet object detection model on the Raspberry Pi AI Camera, BYTETracker for stable IDs, and a Motion module that computes motion bounding boxes from pixel changes over time, the app matches “detections” to “motion” to decide when to trigger recording. The result is a practical motion sensor that’s smarter than raw frame differencing and more efficient than recording continuously.
Classic motion sensors based on pixel differences are noisy: lighting changes, shadows, and camera noise can trigger false events. On the other hand, pure object detection can be too “always-on” and may save frames even when objects are static.
This project merges both approaches:
- Motion tells you where pixels are changing (something is moving).
- Detection + tracking tells you what the object is and keeps a stable ID over time.
- Matching logic confirms that a detected object is also in motion, and triggers capture only then.
- Areas let you restrict triggers to specific regions (e.g., your doorway, a corridor, a loading bay).
The outcome is a lightweight edge application that can capture evidence (images) or record clips (video) only when it matters—reducing storage, reducing false positives, and making events easier to review.
## Architecture Overview

The application consists of six main components (a simplified sketch of how they compose follows the list):
- Model Inference (Object Detection): Runs NanoDet on each frame to produce bounding boxes, class IDs, and confidence scores.
- Detection Filtering: Applies confidence thresholds and optional class filters (e.g., only “person”) to reduce noise.
- Object Tracking (BYTETracker): Assigns persistent IDs to detections across frames so the system can reason about “the same object” over time.
- Motion Estimation (Pixel Change → Motion Boxes): Computes motion bounding boxes by analyzing pixel changes across frames.
- Area Gating (Optional): Restricts detection/motion checks to user-defined polygon Areas loaded from a JSON file.
- Matching + Trigger Logic: Matches tracked detections to motion boxes and triggers image capture (or video recording) when conditions are met.
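Before diving into each component, here is a hypothetical glue-code sketch of the main loop. It assumes a stream-style device API and stand-in helpers (`check`, `save_frame`) for the trigger and capture code shown later; the real app.py wires things up with more care:

```python
# Hypothetical composition of the six components (not the project's app.py).
# `check` and `save_frame` stand in for the trigger/capture code below.
with device as stream:              # assumed stream-style device API
    for frame in stream:
        detections = frame.detections[frame.detections.confidence > 0.5]
        detections = tracker.update(frame, detections)      # stable IDs
        motion_bboxes = motion.detect(frame)                # pixel-change boxes
        for area in areas:                                  # optional Area gating
            area_detections = detections[area.contains(detections)]
            area_motion = motion_bboxes[area.contains(motion_bboxes)]
            moving = detections[matcher.match(area_detections, area_motion)]
            if check(moving, area_motion):                  # trigger rules
                save_frame(frame)                           # capture evidence
```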
## 1. Model Inference (Object Detection)

The app deploys a NanoDet model to the AI Camera and reads detections from each frame:

```python
# Load the model and deploy it to the AI Camera
model = NanoDetPlus416x416()
device = AiCamera(frame_rate=15)
device.deploy(model)
```

Detections are filtered by confidence:

```python
detections = frame.detections[frame.detections.confidence > 0.5]
```

And (in this sample) filtered to a single class:

```python
detections = detections[detections.class_id == 0]
```

You can remove or extend this filter to monitor multiple classes, depending on your use case.
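For instance, a sketch of a multi-class filter, assuming the detections object accepts NumPy boolean masks as in the confidence filter above (the class IDs below are placeholders; check your model's label map):

```python
import numpy as np

# Placeholder IDs; look up the real ones in your model's label map.
WANTED_CLASS_IDS = np.array([0, 2])   # e.g. person, car

# Keep any detection whose class is in the wanted set.
detections = detections[np.isin(detections.class_id, WANTED_CLASS_IDS)]
```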
## 2. Multi-Object Tracking with BYTETracker

The BYTETracker module stabilizes detections over time by assigning a persistent tracker ID:

```python
tracker = BYTETracker(BYTETrackerArgs())
detections = tracker.update(frame, detections)
```

This is important because motion triggers often need temporal logic (“moving for N frames”, “missing for N frames”), which is much easier when objects have stable IDs.
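As an illustration (not code from the project), stable IDs make counters like these trivial to maintain:

```python
from collections import defaultdict

uptime = defaultdict(int)   # consecutive frames an ID has been seen moving
missing = defaultdict(int)  # frames since an ID was last seen

def update_counters(moving_ids, all_known_ids, max_missing=10):
    """Toy temporal bookkeeping keyed by tracker ID."""
    for tid in list(all_known_ids):
        if tid in moving_ids:
            uptime[tid] += 1
            missing[tid] = 0
        else:
            missing[tid] += 1
            if missing[tid] > max_missing:   # forget IDs gone too long
                uptime.pop(tid, None)
                missing.pop(tid, None)
```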
## 3. Motion Detection from Pixel Changes

The Motion module computes motion bounding boxes by looking at pixel changes over time:

```python
motion = Motion()
motion_bboxes = motion.detect(frame)
```

This produces bounding boxes representing where motion is happening, independent of object class.
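The Motion module's internals aren't shown here, but classic pixel-change detection looks roughly like this OpenCV sketch (an illustration, not the project's actual implementation):

```python
import cv2

class SimpleMotion:
    """Toy frame-differencing motion detector (illustrative only)."""

    def __init__(self, threshold=25, min_area=500):
        self.prev_gray = None
        self.threshold = threshold
        self.min_area = min_area

    def detect(self, image):
        # Grayscale + blur to suppress sensor noise before differencing
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (21, 21), 0)
        if self.prev_gray is None:
            self.prev_gray = gray
            return []
        diff = cv2.absdiff(self.prev_gray, gray)
        self.prev_gray = gray
        _, mask = cv2.threshold(diff, self.threshold, 255, cv2.THRESH_BINARY)
        mask = cv2.dilate(mask, None, iterations=2)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        # One (x, y, w, h) box per sufficiently large changed region
        return [cv2.boundingRect(c) for c in contours
                if cv2.contourArea(c) >= self.min_area]
```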
## 4. Area-Based Filtering (Optional)

Areas are loaded from a JSON file (created/edited manually or via the points selector tool):

```python
motion_area = json_regions_extraction(args.json_file)
areas = [Area(area["points"]) for area in motion_area]
```

Then detections and motion boxes can be filtered to only those inside the Area:

```python
area_detections = detections[area.contains(detections)]
area_motion = motion_bboxes[area.contains(motion_bboxes)]
```

This is ideal for “only trigger near the door” or “only trigger inside the loading zone” scenarios.
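The schema isn't reproduced here, but given the `area["points"]` access above, example.json plausibly looks something like this (coordinates invented for illustration):

```json
[
  { "points": [[120, 80], [480, 80], [480, 360], [120, 360]] }
]
```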
## 5. Matching Detections to Motion (Who is moving?)

The key idea is to confirm that a detected/tracked object overlaps with a motion region. The Matcher module performs this association:

```python
matcher = Matcher(max_missing_overlap=10, max_missing_tracker=10)
motion_detections = detections[matcher.match(area_detections, area_motion)]
```

Now `motion_detections` represents detected objects that are also moving (and, optionally, moving inside the Area).
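Conceptually, matching is overlap-based association. A minimal IoU version (illustrative only; the real Matcher additionally tracks the max_missing_* tolerances) might look like:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def is_moving(det_box, motion_boxes, min_iou=0.1):
    # A detection counts as moving if it overlaps any motion box enough.
    return any(iou(det_box, m) >= min_iou for m in motion_boxes)
```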
## 6. Trigger Rules and Capture

The check() function implements the trigger rules. In short, it can trigger when:
- there are no detections but pixel motion is sustained past a threshold (a “motion-only” fallback), or
- a tracked object is matched as moving for long enough (e.g., uptime > 10 frames), and
- the same ID isn’t repeatedly triggering without leaving/resetting.
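A condensed sketch of such rules (the signature, state layout, and thresholds are assumptions; see check() in the source for the real logic):

```python
def check(motion_detections, motion_bboxes, state, uptime_threshold=10):
    """Simplified trigger rules; the project's check() keeps more state."""
    if not motion_detections:
        # Motion-only fallback: sustained pixel motion with no detections.
        state["streak"] = state["streak"] + 1 if motion_bboxes else 0
        return state["streak"] > uptime_threshold

    triggered = False
    for det in motion_detections:
        tid = det.tracker_id
        state["uptime"][tid] = state["uptime"].get(tid, 0) + 1
        # Fire once per ID per visit: moving long enough, not yet fired.
        if state["uptime"][tid] > uptime_threshold and tid not in state["fired"]:
            state["fired"].add(tid)
            triggered = True
    return triggered

# Initial state: {"streak": 0, "uptime": {}, "fired": set()}
```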
When the trigger fires, the app saves a frame:
```python
# Save a timestamped JPEG so events are easy to review later
output_path = directory / f"{frame.timestamp}.jpg"
cv2.imwrite(str(output_path), frame.image)
```

This produces a timestamped image evidence trail in ./images.
To get the full source code, clone the public GitHub repository; the code for this project lives in examples/motion-sensor.

At a glance, the pipeline is:
- Input: Live stream from Raspberry Pi AI Camera
- Detection: NanoDet bounding boxes + class IDs
- Tracking: BYTETracker assigns stable IDs
- Motion: Pixel-change motion boxes
- Area gating: Optional region restriction via JSON Areas
- Matching: Confirms detected objects are moving
- Output: Annotated live view + saved images (or recorded video in the alternate app)
To change monitored Areas, edit example.json directly or use one of the two tools to edit the points:
- The in-app Configurator tool
- The points selector tool
Using uv (installs dependencies from pyproject.toml and runs the app):
```bash
uv run app.py --json-file example.json
```

Args:

- `--json-file <path>` (required): JSON file containing Area polygons
- `--area` (optional): visualize Areas overlaid on the stream
If you prefer recording video instead of capturing images:
```bash
uv run app_video.py --json-file example.json
```

## When in Trouble

If you run into issues with Raspberry Pi setup, use the official forum:
https://forums.raspberrypi.com/
If you have questions about this Python project, share your error output and your example.json (with any sensitive coordinates removed), and I’ll help you debug it.