Published June 17, 2026 © Apache-2.0

WearEdge Pro: Jetson Edge AI Agent

A wearable industrial edge AI runtime benchmarked across five compact VLMs on Jetson with action-card guards and audit logs.

IntermediateWork in progress8 hours2

Story

## Story

WearEdge Pro is a prototype industrial edge AI system that connects wearable first-person capture to a local Jetson inference node. The goal is to support frontline workflows such as predictive maintenance, quality inspection, changeover, work-instruction assistance, and hazard review without sending sensitive factory images to a cloud model.

The prototype uses a Jetson-class edge node running local multimodal inference. A wearable or browser client captures an image and sends it to the gateway. The model response is not treated as a final uncontrolled answer. It is validated into route-specific fields and converted into an operator action card, follow-up plan, and audit record.

## Why This Project Exists

Industrial AI has a different risk profile from consumer AI. A factory operator needs more than a fluent answer. They need a traceable recommendation:

What image or evidence was used?
Which workflow route was selected?
Which fields were required?
Is this a maintenance, quality, changeover, WI, or EHS action?
Does the action require human confirmation?
Can the event be audited later?

WearEdge Pro is built around that idea.

## Hardware

Jetson-class 8GB edge node
NVMe model storage
Wearable first-person capture device or browser capture client
Local network connection between capture device and Jetson

## Software

llama.cpp multimodal endpoint
GGUF model artifacts and multimodal projectors
FastAPI gateway
route-specific prompt contracts
action-card and audit runtime
PowerShell benchmark harness for Windows-to-Jetson testing

## Five-Model Jetson Benchmark

To choose the current baseline, we ran five compact multimodal candidates through the same Jetson endpoint path:

| --- | ---: | ---: | --- |

| SmolVLM2-2.2B | 5/5 | 12.84s | Fast triage candidate |

| Qwen2.5-Omni-3B | 5/5 | 50.09s | Future audio/video branch |

## What We Learned

Qwen2.5-VL was the most compelling challenger. It read a changeover machine and SKU exactly as `LABELER-FL1` and `SKU-C500`, and it produced useful defect-score details for quality inspection.

SmolVLM2 was the fastest, but the answers were too generic for trusted industrial guidance.

InternVL3 needed a larger context window to complete the full matrix, and the resulting latency was too high for the current product baseline.

Qwen2.5-Omni ran successfully and is interesting for future speech/video workflows.

Gemma 4 E2B stayed the baseline because it best fit the complete runtime: local inference, multimodal evidence, structured contracts, action cards, guards, and audit logs.

## Build Notes

The practical architecture is:

```text

wearable image + operator context

> local Jetson gateway
> llama.cpp multimodal endpoint
> route-specific contract
> action card / follow-up plan
> audit log

```

The model comparison uses one VLM endpoint at a time to avoid memory pressure and to keep results comparable.

## Future Work

Put Qwen2.5-VL behind the same WearEdge guards for IQC/changeover A/B tests.
Add better runtime telemetry capture during each model run.
Evaluate native audio/video branches with Qwen2.5-Omni or future Gemma runtime support.
Reduce latency for high-detail maintenance workflows.
Prepare a fully public reproducibility package.

Public artifact note: the measured benchmark report and reproducible harness summary are ready locally; a public artifact mirror can be attached after the repository/artifact is opened.

Ryan Hsu

1 project • 0 followers

WearEdge Pro: Jetson Edge AI Agent

Story

Credits

Ryan Hsu

Comments

Embed the widget on your own site

WearEdge Pro: Jetson Edge AI Agent

WearEdge Pro: Jetson Edge AI Agent

Story

Credits

Ryan Hsu

Comments

Related channels and tags