You’ve trained a model. It classifies images accurately in the lab. You’ve deployed it to devices in the field. And then reality happens.
The lighting changes. The camera angle shifts. New object classes appear that weren’t in the training data. The model that scored 95% accuracy in testing now struggles at 70% in production. You need to push an improved model. Not next quarter. Not during a maintenance window. Now. To every device. Without sending an engineer to each one.
This is the model update problem, and it’s one of the biggest gaps in the edge AI lifecycle. The ML community has mature tools for training (Edge Impulse, TensorFlow, PyTorch), mature tools for inference (TFLite, ONNX Runtime), but model delivery to production devices is often still a manual process: SSH into the device, copy the file, restart the service. That works for a prototype. It does not work for ten devices. It certainly does not work for a thousand.
In this article, we’ll walk through a practical solution using Golioth, an IoT device management platform, to deliver over-the-air (OTA) model updates to Linux edge devices running image classification. We’ll cover the full workflow: setting up devices, uploading models, creating releases, and rolling out updates that take effect in seconds, while inference is still running.
2. The Challenges of Updating Models in Production
Deploying a model to an edge device is a one-time event. Keeping that model current is an ongoing operational challenge. Here’s what makes it hard:
Model Drift
Real-world data changes over time. A model trained on summer images degrades in winter. A factory inspection model trained on one product batch may miss defects in the next batch. The model doesn't break. It slowly becomes less accurate, often without anyone noticing until it’s too late.
Fleet Management
When you have multiple devices, each might be running a different model version. Some devices might have been offline during the last update. Some might be in locations with intermittent connectivity. You need a system that tracks which device has which version and can reconcile the differences.
Zero-Downtime Requirements
Production systems can’t go offline for updates. A security camera needs to keep classifying while a new model is being loaded. A quality inspection system can’t pause the production line for a firmware reflash. The update process needs to happen in the background, with the new model taking effect seamlessly.
Rollback Safety
Sometimes a new model performs worse than the old one. You need the ability to revert to the previous version quickly, without retraining or redeploying from scratch. This requires keeping versioned copies and having a mechanism to switch between them.
Security
Model files are intellectual property. They’re transmitted over the internet to devices that may be in untrusted physical locations. The delivery mechanism needs encrypted transport, authenticated devices, and integrity verification to ensure models aren’t tampered with in transit.
3. The Solution: Golioth for ML Model Delivery
Golioth is an IoT device management platform that provides secure, scalable OTA firmware updates. While it’s typically used for firmware delivery to microcontrollers, its artifact management system works perfectly for delivering ML models to Linux edge devices.
The architecture is straightforward:
- Edge Impulse (or your training tool of choice) produces a trained .tflite model and a labels file
- Golioth hosts the model as a versioned artifact and manages releases across your device fleet
- Your edge device runs a lightweight updater that pulls new models from Golioth and a Python script that performs inference
The key insight is separation of concerns. The model updater handles cloud communication, authentication, download, and file management. The inference script handles camera capture and classification. They run as independent processes. When a new model arrives, the updater writes it to disk and signals the inference script to reload — no restart, no downtime.
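The inference side of this handoff can be sketched in a few lines of Python. This is an illustrative sketch, not the article's actual camera_infer_tflite.py: a SIGUSR1 handler sets a flag, and the main loop checks that flag between frames so the reload never interrupts an in-flight inference.

```python
import signal

# Illustrative sketch; the real inference script's internals may differ.
reload_requested = False

def _on_sigusr1(signum, frame):
    # Only set a flag here: reloading a model inside a signal handler is unsafe.
    global reload_requested
    reload_requested = True

signal.signal(signal.SIGUSR1, _on_sigusr1)

def maybe_reload(load_model):
    """Call between frames in the main loop; reload if the updater signaled us."""
    global reload_requested
    if reload_requested:
        reload_requested = False
        load_model()
        return True
    return False
```

Because the handler only flips a flag, the expensive work (re-creating the TFLite interpreter) happens at a safe point chosen by the main loop.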
4. Device Onboarding and Authentication
Before a device can receive model updates, it needs to be registered with Golioth and given credentials to authenticate. This ensures that only authorized devices can pull your models, and that models are delivered over encrypted connections.
4.1 Creating a Project
Everything in Golioth lives inside a project. A project groups your devices, artifacts, and releases together. Create a project on the Golioth console that represents your deployment — for example, “factory-inspection” or “retail-analytics.”
4.2 Adding a Device
Each physical device gets a unique identity in Golioth. Navigate to Devices and create a new device. Give it a meaningful name that identifies its location or function, such as “camera-line-3” or “entry-classifier-01.”
4.3 Authentication with PSK Credentials
Golioth supports two ways to authenticate devices: Pre-Shared Key (PSK) and certificate-based (PKI).
PSK is the simpler option. Each device gets a PSK Identity and a PSK Secret from the Golioth console, which are set as environment variables on the device. This is a good choice for quick prototyping and small deployments.
For this project, we use certificate authentication (PKI), which is covered in the next section.
4.4 Certificate Authentication
For production deployments, certificate-based authentication (PKI) provides stronger security. Each device gets a unique X.509 certificate signed by your Certificate Authority. The device presents its certificate during the TLS handshake, and Golioth verifies it against the registered CA.
Certificate auth eliminates the need to manage individual PSK secrets and supports automated provisioning at scale. The model updater reads three PEM files from disk:
- Device certificate: Identifies this specific device
- Device private key: Proves ownership of the certificate
- Golioth root CA: Verifies the server is actually Golioth
Both PSK and certificate auth provide the same functionality — they differ in provisioning complexity and security posture. Use PSK for development and small deployments; use certificates for production fleets.
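The real updater is a C binary, but the role of the three PEM files is easy to illustrate in Python: the CA file verifies the server, and the device certificate plus private key form the client side of mutual TLS. The file names below are hypothetical; match them to wherever your updater stores its PEMs.

```python
import ssl
from pathlib import Path

# Hypothetical file names -- adjust to your deployment's layout.
CERT_FILES = {
    "cert": "device_cert.pem",    # identifies this specific device
    "key": "device_key.pem",      # proves ownership of the certificate
    "ca": "golioth_root_ca.pem",  # verifies the server is actually Golioth
}

def missing_certs(cert_dir):
    """Return the PEM files absent from cert_dir (useful sanity check at startup)."""
    return [f for f in CERT_FILES.values() if not (Path(cert_dir) / f).exists()]

def make_tls_context(cert_dir):
    """Build a mutual-TLS client context from the three PEM files."""
    d = Path(cert_dir)
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.load_verify_locations(cafile=str(d / CERT_FILES["ca"]))
    ctx.load_cert_chain(certfile=str(d / CERT_FILES["cert"]),
                        keyfile=str(d / CERT_FILES["key"]))
    return ctx
```

Checking for missing files before connecting turns a cryptic handshake failure into an actionable startup error.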
For this demo we use certificate authentication; follow the instructions here for more details - Link
5. Deploying Models Through Golioth
Once your device is registered and authenticated, you're ready to push models. Golioth's OTA system uses three concepts: packages (what you're delivering), cohorts (who receives it), and deployments (when they get it).
5.1 Creating Packages
A package in Golioth represents a single upgradeable component on your device. Each package has a name, a description, and a list of versioned artifacts. For image classification, we create two packages:
- ai-model: the trained .tflite model file
- labels: the labels.txt file listing the class names
If two different devices in your project use the same model, you only need one ai-model package. You can deploy it to both device types independently through cohorts.
To create a package:
- Navigate to Packages in the Golioth Web Console
- Click Create
- Enter the package name (e.g., ai-model)
- Optionally add a description and metadata properties
- Click Create
Repeat for the labels package.
To upload a version:
- Open your package in the package list
- Click New Version
- Set the version number (e.g., 1.0.0 or simply 1)
- Choose the binary file to upload (your .tflite model or labels.txt)
- Click Upload Artifact
The new version appears in the version list. Each version is immutable — once uploaded, it cannot be modified. To push an updated model, upload a new version (e.g., version 2). This gives you a complete history of every model that's been deployed, enabling rollback at any time.
5.2 Creating a Cohort
Before you can deploy updates to devices, you need to assign them to a cohort. A cohort is a group of devices that receive the same set of packages and firmware updates.
Typically, you create one cohort per device type or deployment stage. For example:
- production: all production devices in the field
- canary: a small subset that receives updates first for validation
- dev: internal test devices
For getting started, a single cohort is sufficient.
To create a cohort:
- Navigate to Cohorts in the Golioth Web Console
- Click Create Cohort
- Enter a name (e.g., default or develop)
- Click Create
There are three ways to assign devices to the cohort in the console:
- From the cohort page: Click Add Devices, find your device in the table, click Add
- From the Device Index: Select devices with checkboxes, click Bulk Actions → Assign to cohort, select the cohort
- From the Edit Device page: Click Edit, select a cohort from the dropdown, click Save
Once a device is assigned to a cohort and connected to Golioth, it immediately receives the cohort's active deployment manifest.
5.3 Deploying Updates
A deployment pushes a specific set of package versions to all devices in a cohort. Each cohort can only have one active deployment at a time. When you create a new deployment, it replaces the previous one and is immediately pushed to all connected devices in the cohort.
To create a deployment:
1: Navigate to Cohorts and select your cohort
2: Click Deploy in the top right corner
3: Select the packages and versions to include:
- ai-model → version 1
- labels → version 1
4: Click Next to review the changes
5: Click Start Deployment
The deployment is pushed to all devices in the cohort. On the device side, the model updater receives the manifest, sees the new packages, and begins downloading.
5.4 Updating to a New Model Version
When you have a retrained model ready to deploy:
1: Go to Packages → ai-model → New Version
2: Upload the new .tflite file as version 2
3: Go to Cohorts → your cohort → Deploy
4: Select ai-model v2 + labels v1 (or v2 if labels changed too)
5: Review and click Start Deployment
The model updater on each device in the cohort receives the new manifest, downloads only the changed artifacts, updates the symlinks, and signals the Python inference script to reload — all within seconds.
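The symlink-plus-signal step at the end of that sequence is what makes the update atomic. The helper below is a hypothetical Python equivalent of what the C updater's logs show: write the versioned file, swap the stable symlink in one rename, record the version, and signal the inference process.

```python
import os
import signal
from pathlib import Path

def activate_version(model_dir, package, version, ext, inference_pid):
    """Point the stable name (e.g. ai-model.tflite) at a versioned file,
    record the version, and tell the inference process to reload.
    Hypothetical sketch of the C updater's activation step."""
    d = Path(model_dir)
    target = f"{package}_{version}{ext}"       # e.g. ai-model_2.tflite
    link = d / f"{package}{ext}"               # e.g. ai-model.tflite
    tmp = d / f".{package}{ext}.tmp"
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    tmp.symlink_to(target)                     # build the new link under a temp name
    os.replace(tmp, link)                      # atomic swap: readers never see a gap
    (d / "version.txt").write_text(version)
    os.kill(inference_pid, signal.SIGUSR1)     # nudge the inference script to reload
```

Because `os.replace` is a single rename syscall, the inference script always resolves the stable name to either the old or the new model, never to a half-written file.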
6. Seeing It in Action
Let’s walk through a real deployment scenario. We have an image classification model trained in Edge Impulse to classify two types of fruit: orange and bell-pepper. The initial model (v1) was trained with limited data and performs inconsistently. We’ll push an improved model (v2) via Golioth and see the update take effect live.
6.1 Starting Fresh
The device starts with an empty model directory. The model updater connects to Golioth and subscribes to manifest updates:
ubuntu@ubuntu:$ ./build/ei-golioth-model-updater-linux
[model_updater] ========================================
[model_updater] Golioth OTA Model Updater for Linux
[model_updater] Model directory: /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models
[model_updater] ========================================
[model_updater] No model version found. Waiting for first release.
[model_updater] Certs loaded: CA=2737B cert=534B key=227B
[model_updater] Golioth client connected
[model_updater] Connected to Golioth.
[model_updater] Subscribed to OTA manifest. Waiting for releases...
Meanwhile, the Python inference script starts and waits for the model to arrive:
(edge-env) python3 camera_infer_tflite.py --model .golioth/models/ai-model.tflite --labels .golioth/models/labels.txt --camera 0 --top_k 3
[inference] Using tensorflow.lite.Interpreter
[inference] WARNING: Labels file not found: .golioth/models/labels.txt
[inference] WARNING: Model file not found: .golioth/models/ai-model.tflite
[inference] No model found. Waiting for Golioth updater to deliver one...
[inference] Watching: .golioth/models/ai-model.tflite
6.2 First Model Delivery (v1)
We roll out the first release containing ai-model v1 + labels v1 from the Golioth console. The updater receives the manifest and begins downloading:
[model_updater] Manifest received: 2 component(s), seqnum=1613245206
[model_updater] Component: package="labels" version="1" size=19
[model_updater] Queued for download: labels v1
[model_updater] Component: package="ai-model" version="1" size=419192
[model_updater] Queued for download: ai-model v1
[model_updater] Processing 2 pending download(s)
[model_updater] Downloading labels v1 (19 bytes) -> /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models/labels_1.txt
[model_updater] Downloading ai-model v1 (419192 bytes) -> /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models/ai-model_1.tflite
[model_updater] Downloaded 19 bytes...
[model_updater] Block download complete for labels v1 (19 bytes)
[model_updater] Download complete: labels v1 (19 bytes)
[model_updater] Saved: /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models/labels_1.txt
[model_updater] Updated symlink: /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models/labels.txt -> /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models/labels_1.txt
[model_updater] Updated version file: 1
[model_updater] Model update complete for labels v1. Signaling Python script.
The Python script receives the signal and loads the model:
[inference] SIGUSR1 received — model reload requested
[inference] === RELOADING MODEL AND LABELS ===
[inference] Loaded 2 labels: ['bell-pepper', 'oranges']
[inference] Loading model: .golioth/models/ai-model.tflite
[inference] -> resolved to: /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models/ai-model_1.tflite
[inference] Model input shape: [ 1 320 320 3], dtype: <class 'numpy.int8'>
[inference] Model output shape: [1 2], dtype: <class 'numpy.int8'>
[inference] === RELOAD COMPLETE ===
The camera feed starts showing predictions. But this initial model was trained with limited data. It frequently misclassifies orange as bell-pepper, and confidence scores hover around 20–30%.
6.3 Pushing an Improved Model (v2)
After retraining with additional data and more epochs in Edge Impulse, we have a v2 model with significantly better accuracy. We upload it to Golioth as ai-model v2, create a new release (ai-model v2 + labels v1), and roll it out.
The updater detects the new version and downloads it:
[model_updater] Manifest received: 2 component(s), seqnum=569015593
[model_updater] Component: package="ai-model" version="2" size=419192
[model_updater] Queued for download: ai-model v2
[model_updater] Component: package="labels" version="1" size=19
[model_updater] Already on disk, skipping: labels v1
[model_updater] Processing 1 pending download(s)
[model_updater] Downloading ai-model v2 (419192 bytes) -> /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models/ai-model_2.tflite
[model_updater] Downloaded 1024 bytes...
Notice that the labels file was not re-downloaded. The updater detected that labels v1 already exists on disk and skipped it. Only the new model was transferred.
The Python script reloads instantly:
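The skip-if-exists check that avoids re-downloading unchanged artifacts is simple to express. This hypothetical helper mirrors the idea: a manifest component is skipped when its versioned file is already on disk with the expected size.

```python
from pathlib import Path

def needs_download(model_dir, package, version, ext, expected_size):
    """Return False when a manifest component is already on disk with the
    expected size, so only changed artifacts are transferred."""
    f = Path(model_dir) / f"{package}_{version}{ext}"
    return not (f.is_file() and f.stat().st_size == expected_size)
```

A size check is a cheap first filter; a production updater could also compare a hash from the manifest for stronger integrity verification.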
[inference] === RELOADING MODEL AND LABELS ===
[inference] Loaded 2 labels: ['bell-pepper', 'oranges']
[inference] Loading model: .golioth/models/ai-model.tflite
[inference] -> resolved to: /home/ubuntu/Documents/ei-golioth-model-updater-linux/.golioth/models/ai-model_2.tflite
[inference] Model input shape: [ 1 320 320 3], dtype: <class 'numpy.int8'>
[inference] Model output shape: [1 2], dtype: <class 'numpy.int8'>
[inference] === RELOAD COMPLETE ===
The camera feed now shows accurate classifications with confidence scores in the 70%–85% range. The “MODEL RELOADED” banner appears briefly on the video feed, confirming the update took effect.
The entire update — from clicking “Roll Out” on the Golioth console to the new model running inference on the device — took a few seconds for a 419KB model on a standard broadband connection.
6.4 What’s on Disk
After both versions have been deployed, the model directory looks like this:
.golioth/models/
├── ai-model.tflite → ai-model_2.tflite (symlink to latest)
├── ai-model_1.tflite (v1, retained for rollback)
├── ai-model_2.tflite (v2, currently active)
├── labels.txt → labels_1.txt (symlink to latest)
├── labels_1.txt (v1)
└── version.txt (contains "2")
Both versions are retained on disk. If v2 performs worse in a specific environment, you can roll back by creating a new release that references ai-model v1, or by manually re-pointing the symlink to the older version.
7. Best Practices for Model Deployment
7.1 Version Everything
Every model uploaded to Golioth gets a version number. Use this to track exactly what’s running on each device. When a device reports an issue, the first question should be: “What model version is it running?” The version file on disk and the Golioth console both answer this instantly.
7.2 Always Bundle Labels with Models
If you add a new class to your model, the labels file must be updated to match. By including both in the same Golioth release, you ensure they arrive together as a consistent pair. A mismatch between model outputs and labels is a subtle bug that produces confident but completely wrong predictions.
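A cheap runtime guard catches this mismatch before it produces wrong predictions: compare the label count against the model's output width. With the TFLite interpreter, the class count is typically the last dimension of the output shape (e.g. `interpreter.get_output_details()[0]['shape'][-1]`, which is 2 for the `[1 2]` output in the logs above). The helper below is a sketch of that check.

```python
def labels_match_model(labels_path, num_model_classes):
    """Guard against the confident-but-wrong-predictions bug: the label
    count must equal the model's output width (e.g. shape [1, 2] -> 2)."""
    with open(labels_path) as f:
        labels = [line.strip() for line in f if line.strip()]
    return len(labels) == num_model_classes
```

Run this right after a reload and refuse to serve predictions when it fails; a loud error beats silently mislabeled output.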
7.3 Use Staged Rollouts
Don’t push a new model to your entire fleet at once. Tag a subset of devices as “canary” and roll out to them first. Monitor their accuracy for a day or two before rolling out to production. Golioth’s device tags and targeted releases make this straightforward.
7.4 Monitor After Rollout
The model updater can report its state back to Golioth using the state reporting API. Use this to confirm that every device has successfully downloaded and loaded the new model. Devices that failed to update will show a stale version on the console, making it easy to identify and troubleshoot.
8. Conclusion and Future Work
8.1 What We Built
We demonstrated a practical system for deploying and updating ML models on Linux edge devices using Golioth’s OTA infrastructure. The system supports multi-artifact releases (model + labels), automatic version management, zero-downtime hot-reload, and rollback capability. The entire workflow, from uploading a new model to seeing it run on the device, takes under a minute.
The approach separates model delivery from model inference, allowing each component to evolve independently. The device updater handles cloud communication and file management; the inference script handles camera capture and classification. They communicate through the filesystem and a simple Unix signal, keeping the system reliable and easy to debug.
8.2 What This Enables
The OTA model delivery infrastructure we’ve built is a foundation for more advanced edge AI workflows:
Active Learning Systems: Devices can monitor their own prediction confidence in real-time. When confidence drops below a threshold, the device flags those samples for review. These uncertain samples are often the most valuable training data. A pipeline can collect them, add them to the training set, retrain in Edge Impulse, and push the improved model back through Golioth. This creates a continuous learning loop where every deployed device contributes to making the model better.
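The device-side half of that loop is small. As a sketch (the threshold and output format are assumptions, not part of the article's system), each low-confidence frame is written out as a JSON record that a collection pipeline could later upload for labeling:

```python
import json
import time
from pathlib import Path

CONFIDENCE_THRESHOLD = 0.5  # an assumption; tune per deployment

def maybe_flag(frame_id, scores, out_dir):
    """Save low-confidence predictions for later labeling and retraining.
    Returns True when the frame was flagged."""
    top_label, top_score = max(scores.items(), key=lambda kv: kv[1])
    if top_score >= CONFIDENCE_THRESHOLD:
        return False
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    record = {"frame": frame_id, "top": top_label,
              "scores": scores, "timestamp": time.time()}
    (Path(out_dir) / f"{frame_id}.json").write_text(json.dumps(record))
    return True
```

The flagged records are exactly the samples the v1 model in section 6 struggled with (scores hovering around 20–30%), which is what makes them valuable retraining data.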
A/B Model Testing: With version management and device tags, you can run two different model versions on different device groups simultaneously. Compare their real-world accuracy to determine which model performs better before committing to a fleet-wide rollout.
Automated Retraining Pipelines: Edge Impulse's REST API can be integrated into a CI/CD pipeline that automatically retrains on new data, exports the model, uploads it to Golioth as a new artifact version, creates a deployment, and rolls out to canary devices. No human in the loop for routine model improvements.
Domain Adaptation: Different deployment sites may have different environmental conditions (lighting, angles, backgrounds). The same base model can be fine-tuned per-site and pushed to specific device groups through Golioth’s targeted releases, ensuring each device runs the model optimized for its environment.
Model Compression and Optimization: As better quantization techniques or model architectures become available, you can push optimized versions that run faster or use less memory. No changes needed to the device hardware or inference code.
The gap between training a model and maintaining it in production has long been the hardest part of edge AI. With the infrastructure described in this article, that gap shrinks to a console click and a few seconds. The model gets better. The devices update themselves. The system learns.
The hardest part of edge AI isn't building the model. It's keeping it current. Golioth makes that part invisible.
Public Edge Impulse Project: Link
GitHub Repository: Link