Anete Zepa

Published December 20, 2025

HappyBees: Edge ML Beehive Monitoring with RPi Pico 2 W

Your bees can't tell you they're about to swarm. This $50 Pico-based edge ML monitor can.

Intermediate · Full instructions provided · 6 hours · 284

Things used in this project

Hardware components

Raspberry Pi Pico 2 W

Adafruit SPW2430

Texas Instruments TLC272

Pimoroni SHT20

Resistor 10k ohm

Resistor 100k ohm

Through Hole Resistor, 5.1 kohm

Resistor 1k ohm

Resistor 4.75k ohm

Capacitor 100 µF

Capacitor 10 µF

Capacitor 47 µF

Hand tools and fabrication machines

Soldering iron (generic)

Solder Wire, Lead Free

Prototyping Kit, Breadboard

Story

HappyBees: Edge ML Beehive Monitoring with a Raspberry Pi Pico 2 W

Detect swarming events before you lose your colony. No cloud dependency. Under $50.

The Problem

Bees contribute $577 billion annually to the global economy. Yet most beekeepers (especially hobbyists) have no idea what's happening inside their hives until they open them. By then, the swarm has left or the colony has starved.

Commercial monitoring systems cost €500+ per hive. They're designed for industrial operations, not someone with eight hives in their backyard.

We needed something different: a system that could survive a winter, cost less than $50 in parts, and provide actual insights beyond just temperature readings uploaded hourly.

The Solution

HappyBees runs edge ML directly on a Raspberry Pi Pico 2 W. It listens to the colony's acoustic signature, distinguishes between normal hum and pre-swarm piping, and only alerts you when something matters.

The key insight: we don't stream raw audio to the server. Everything is processed locally: 6 seconds of audio becomes 20 features, which run through a TFLite model and produce a classification in under 200 ms. The local web server only sees results.

Technical specs:

16kHz audio capture via DMA (96,000 samples per inference)
Custom DSP: 2nd-order Butterworth highpass (100Hz), 3rd-order lowpass (6kHz)
512-point FFT, Hanning window, 187 windows averaged per capture
TensorFlow Lite Micro inference on RP2350
WiFi telemetry to FastAPI + TimescaleDB backend
Real-time Dash dashboard
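
The capture figures in this list follow directly from the sample rate and the capture length. A quick Python check of that arithmetic, assuming 16-bit samples (the sample width and the constant names are our assumptions, not taken from the firmware):

```python
SAMPLE_RATE_HZ = 16_000
CAPTURE_SECONDS = 6
FFT_SIZE = 512
BYTES_PER_SAMPLE = 2  # assumption: each ADC reading is stored as a 16-bit value

samples_per_capture = SAMPLE_RATE_HZ * CAPTURE_SECONDS
windows_per_capture = samples_per_capture // FFT_SIZE  # non-overlapping windows
bin_width_hz = SAMPLE_RATE_HZ / FFT_SIZE               # frequency resolution per FFT bin
buffer_bytes = samples_per_capture * BYTES_PER_SAMPLE

print(samples_per_capture, windows_per_capture, bin_width_hz, buffer_bytes)
# prints: 96000 187 31.25 192000
```

The last 256 samples of each capture do not fill a whole window, which is why the count is 187 rather than 187.5.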

Hardware

Components

Why Pico 2 W? The RP2350 has 520KB SRAM. Our audio buffer alone is 192KB. ESP32's 320KB wouldn't cut it. The Pico also has better documentation and community tooling for ML deployment than ESP32-S3.

Wiring

Pico Connections:

Microphone Preamp:

The SPW2430 outputs a weak signal. We feed it through a non-inverting amplifier:

Mic output → AC-coupled via 10µF → Op-amp (+) input
R1/R2 voltage divider sets 1.65V DC bias
Gain = 1 + (100k/5k) = 21x
Output filtered through 1kΩ → GP26

This gain is important. The model was trained on different microphone data. Our op-amp circuit produces signals 3-4x stronger. We compensate in software with a gain factor (default 0.35).
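
To make the gain numbers concrete, here is a small Python sketch of the ideal op-amp arithmetic. It assumes an ideal non-inverting stage, a 3.3V supply, and a bias divider built from two equal 10k resistors; the function names are illustrative only.

```python
def non_inverting_gain(r_feedback_ohm, r_ground_ohm):
    """Closed-loop gain of an ideal non-inverting op-amp stage."""
    return 1.0 + r_feedback_ohm / r_ground_ohm


def divider_voltage(v_supply, r_top_ohm, r_bottom_ohm):
    """Unloaded output voltage of a two-resistor divider."""
    return v_supply * r_bottom_ohm / (r_top_ohm + r_bottom_ohm)


hardware_gain = non_inverting_gain(100_000, 5_000)  # 1 + 100k/5k
bias_v = divider_voltage(3.3, 10_000, 10_000)       # assumption: equal resistors, 3.3V rail
software_gain = 0.35                                # default compensation factor

# Net scaling applied to the microphone signal after software compensation
effective_gain = hardware_gain * software_gain

print(hardware_gain, round(bias_v, 3), round(effective_gain, 2))
```

With these values the hardware stage gives 21x and the software factor brings the net scaling down to roughly 7x.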

Firmware

Building

git clone https://github.com/AneteZepa/HappyBees.git
cd HappyBees/firmware

# Get CMake helper
curl -o pico_sdk_import.cmake https://raw.githubusercontent.com/raspberrypi/pico-sdk/master/external/pico_sdk_import.cmake

# Install Pico SDK
pushd ~
git clone https://github.com/raspberrypi/pico-sdk.git
cd pico-sdk && git submodule update --init
popd

# Build
mkdir -p build && cd build
cmake .. -DPICO_BOARD=pico2_w -DPICO_SDK_PATH="$HOME/pico-sdk"
make -j4

Flashing

Hold BOOTSEL on Pico
Plug in USB
Copy beewatch_firmware.uf2 to the mounted RP2350 drive

Configuration

Connect via serial:

tio -b 115200 /dev/tty.usbmodem*

Configure WiFi and server:

> wifi YOUR_SSID YOUR_PASSWORD
> server 192.168.1.50
> p   # Ping to verify

Serial Commands

DSP Pipeline

The signal processing chain matters. Here's what happens to each audio sample:

Raw ADC → DC Removal → Gain Compensation → HP Filter → LP Filter → FFT

DC Removal: Subtract mean (typically ~2048 for 12-bit ADC centered at 1.65V)

Gain Compensation: Scale by 0.35 to match training data amplitude
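
The first two stages are simple enough to sketch in a few lines of Python. This is an illustration of the idea rather than the firmware code: it assumes the mean is computed over the whole capture and that samples stay in raw ADC counts.

```python
def remove_dc_and_scale(raw_samples, gain=0.35):
    """Subtract the capture mean, then apply the gain compensation factor."""
    if not raw_samples:
        raise ValueError("raw_samples must not be empty")
    mean = sum(raw_samples) / len(raw_samples)
    return [(sample - mean) * gain for sample in raw_samples]


# Synthetic 12-bit readings centered on mid-scale (2048)
raw = [2048 + offset for offset in (100, -100, 50, -50)]
conditioned = remove_dc_and_scale(raw)
```

After this step the signal is centered on zero, so the high-pass filter that follows only has to deal with slow drift rather than a large constant offset.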

High-Pass Filter (100Hz, 2nd Order Butterworth):

const float HP_B0 = 0.9726139f, HP_B1 = -1.9452278f, HP_B2 = 0.9726139f;
const float HP_A1 = -1.9444777f, HP_A2 = 0.9459779f;
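
These coefficients can be reproduced with the standard bilinear-transform formulas for a 2nd-order Butterworth section. The Python sketch below derives them for a 100Hz cutoff at 16kHz and applies them in Direct Form I. The sign convention (feedback terms subtracted) is an assumption about how the firmware uses HP_A1 and HP_A2, and the function names are illustrative.

```python
import math


def butterworth_highpass_biquad(cutoff_hz, sample_rate_hz):
    """2nd-order Butterworth high-pass via the bilinear transform.

    Returns (b0, b1, b2, a1, a2), normalized so that a0 == 1.
    """
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)
    q = 1.0 / math.sqrt(2.0)  # Butterworth Q for a single 2nd-order section
    norm = 1.0 / (1.0 + k / q + k * k)
    b0 = norm
    b1 = -2.0 * norm
    b2 = norm
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - k / q + k * k) * norm
    return b0, b1, b2, a1, a2


def biquad_filter(samples, coeffs):
    """Direct Form I: y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


coeffs = butterworth_highpass_biquad(100.0, 16_000.0)
print([round(c, 7) for c in coeffs])

# A constant (DC) input should decay towards zero at the output
dc_response = biquad_filter([1.0] * 4000, coeffs)
```

The derived values agree with the constants above to about six decimal places, which is a useful sanity check if you ever change the cutoff or sample rate.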

Low-Pass Filter (6kHz, 3rd Order Butterworth): Two cascaded biquads

FFT: 512-point with Hanning window, 187 non-overlapping windows per 6-second capture, averaged.

The frequency bins we care about: indices 4-19, corresponding to 125-594 Hz. This is where bee acoustic signatures live.
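
The bin-to-frequency mapping and the window averaging can be sketched in plain Python. To stay within the standard library, this version evaluates only bins 4-19 with a direct DFT instead of a full FFT; the magnitude normalization (dividing by the window length) is our assumption, and the names are illustrative.

```python
import cmath
import math

SAMPLE_RATE_HZ = 16_000
FFT_SIZE = 512
BAND_BINS = range(4, 20)  # indices 4..19 inclusive


def bin_to_hz(index):
    """Center frequency of an FFT bin."""
    return index * SAMPLE_RATE_HZ / FFT_SIZE


def averaged_band_magnitudes(samples):
    """Average Hann-windowed DFT magnitudes over non-overlapping windows."""
    n_windows = len(samples) // FFT_SIZE
    if n_windows == 0:
        raise ValueError("need at least one full window of samples")
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / (FFT_SIZE - 1))
            for n in range(FFT_SIZE)]
    totals = {k: 0.0 for k in BAND_BINS}
    for w in range(n_windows):
        frame = samples[w * FFT_SIZE:(w + 1) * FFT_SIZE]
        windowed = [s * h for s, h in zip(frame, hann)]
        for k in BAND_BINS:
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / FFT_SIZE)
                      for n, x in enumerate(windowed))
            totals[k] += abs(acc) / FFT_SIZE
    return {k: total / n_windows for k, total in totals.items()}


# A 250 Hz test tone sits exactly on bin 8 (8 * 31.25 Hz)
tone = [math.sin(2 * math.pi * 250.0 * n / SAMPLE_RATE_HZ)
        for n in range(2 * FFT_SIZE)]
band = averaged_band_magnitudes(tone)
peak_bin = max(band, key=band.get)
print(bin_to_hz(4), bin_to_hz(19), peak_bin)
```

On the device a real FFT is the right tool; the direct DFT here only keeps the example dependency-free.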

The ML Model

Anete developed both the Summer and Winter models using Edge Impulse's BYOM (Bring Your Own Model) workflow. This let us train in Python and automatically generate optimized C++ for the Pico.

Summer Model (Swarm Detection)

Input: 20 features

Temperature, humidity, hour
Spike ratio (key feature)
16 FFT frequency bins

Architecture: Dense 64 → Dense 32 → Softmax 2 (Normal/Event)
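
To show the shapes involved, here is a plain-Python forward pass through a network of this size. It is only a sketch: the feature order, the ReLU activations, the class order, and the random placeholder weights are all assumptions, and the real weights live in the generated Edge Impulse model.

```python
import math
import random


def build_features(temperature_c, humidity_pct, hour, spike_ratio, fft_bins):
    """Assemble the 20-element input vector (ordering is an assumption)."""
    if len(fft_bins) != 16:
        raise ValueError("expected 16 FFT bins")
    return [temperature_c, humidity_pct, float(hour), spike_ratio, *fft_bins]


def dense(inputs, weights, biases, relu=True):
    """Fully connected layer: one weight row per output unit."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, z) if relu else z)
    return outputs


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    """Placeholder weights, only here so the example runs end to end."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
w1, b1 = random_layer(64, 20, rng)
w2, b2 = random_layer(32, 64, rng)
w3, b3 = random_layer(2, 32, rng)

features = build_features(34.5, 60.0, 14, 1.0, [0.03] * 16)
hidden = dense(dense(features, w1, b1), w2, b2)
p_normal, p_event = softmax(dense(hidden, w3, b3, relu=False))
```

With random weights the probabilities are meaningless; the point is that 20 inputs go in and two class probabilities that sum to 1 come out.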

The critical insight: Through systematic testing, we discovered the model primarily uses the spike ratio—current audio energy divided by rolling average. The FFT bins have minimal impact.

spike_ratio = current_rms_density / rolling_average_density

spike < 0.7: Activity decreasing → Normal
spike ≈ 1.0: Steady state → Ambiguous (defaults to Event)
spike > 1.3: Activity increasing → Event

This explains why a fresh boot always predicts "Event": with no history, the spike ratio is 1.0. After 5-6 readings, the rolling average stabilizes.
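
The fresh-boot behaviour is easy to reproduce with a small rolling-average tracker. This Python sketch assumes a history of 6 readings and a rolling average that excludes the current reading; both are guesses at the firmware's behaviour, and the class name is illustrative.

```python
from collections import deque


class SpikeRatioTracker:
    """Tracks current audio density relative to a rolling average."""

    def __init__(self, history_len=6):
        self.history = deque(maxlen=history_len)

    def update(self, current_density):
        if self.history:
            rolling = sum(self.history) / len(self.history)
        else:
            rolling = current_density  # no history yet, so the ratio is 1.0
        ratio = current_density / rolling if rolling > 0 else 1.0
        self.history.append(current_density)
        return ratio


tracker = SpikeRatioTracker()
first = tracker.update(0.04)   # fresh boot: steady-state ratio
louder = tracker.update(0.08)  # activity increasing
quieter = tracker.update(0.02) # activity decreasing
print(first, louder, round(quieter, 3))
```

The third reading lands well below 0.7, which is the "make noise, then go quiet" trick from the troubleshooting section.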

Winter Model (Anomaly Detection)

An autoencoder trained on healthy winter cluster data:

Input: 5 features

Temperature, humidity
Temperature stability (variance over 12 readings)
Heater power (sum of 183Hz, 213Hz, 244Hz bins—the "cluster hum")
Heater ratio (heater power / total audio density)

Output: Reconstruction of input

Anomaly score: MSE between input and reconstruction. High MSE means the model doesn't recognize the pattern: potential colony death, starvation, or cluster instability.
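
The scoring step itself is just a mean squared error. A minimal Python sketch, with a hypothetical threshold parameter (the article does not state the cutoff the firmware uses) and made-up feature values:

```python
def anomaly_score(inputs, reconstruction):
    """Mean squared error between the input features and their reconstruction."""
    if len(inputs) != len(reconstruction):
        raise ValueError("inputs and reconstruction must have the same length")
    return sum((x - r) ** 2 for x, r in zip(inputs, reconstruction)) / len(inputs)


def is_anomalous(score, threshold):
    """threshold is hypothetical; it would be tuned on healthy-hive data."""
    return score > threshold


# Made-up, already-normalized feature vectors for illustration
healthy_input = [0.5, 0.6, 0.1, 0.3, 0.4]
good_reconstruction = [0.5, 0.6, 0.1, 0.3, 0.4]
poor_reconstruction = [0.5, 0.6, 0.1, 0.3, 0.9]

low_score = anomaly_score(healthy_input, good_reconstruction)
high_score = anomaly_score(healthy_input, poor_reconstruction)
```

A perfect reconstruction scores 0; the further the autoencoder's output drifts from the input, the higher the score.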

Backend

Stack

FastAPI for the API (async, auto-documented)
TimescaleDB for time-series storage (PostgreSQL with hypertables)
Dash/Plotly for real-time dashboard

Why not Grafana? We think it adds deployment complexity, while Dash keeps the whole stack simple and pure Python end-to-end.

Quick Start

1. Start database:

podman run -d --name happybees-db \
    -p 5432:5432 \
    -e POSTGRES_USER=postgres \
    -e POSTGRES_PASSWORD=happybees_dev \
    -e POSTGRES_DB=happybees \
    timescale/timescaledb:latest-pg16

2. Start backend:

uv venv && source .venv/bin/activate
uv pip install -r requirements.txt
uvicorn backend.app.main:app --host 0.0.0.0 --port 8000 --reload

3. Start dashboard:

python -m backend.dashboard.app --node pico-hive-001

4. Test with mock device:

python backend/scripts/mock_stream.py --node pico-hive-001

API Endpoints

The command queue uses polling (every 2s) instead of WebSockets. Simpler firmware, works through NAT, and 2-second latency is fine for our use case.
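
The polling pattern looks roughly like this in Python. It is a sketch of the idea only: fetch_pending and handle are hypothetical callables standing in for the HTTP request and the command dispatcher, and the actual endpoint is not shown here.

```python
import time


def poll_commands(fetch_pending, handle, interval_s=2.0, max_polls=None,
                  sleep=time.sleep):
    """Fetch and handle queued commands at a fixed interval.

    fetch_pending() returns a list of pending commands (possibly empty).
    max_polls=None polls forever; a number makes the loop easy to test.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        for command in fetch_pending():
            handle(command)
        polls += 1
        sleep(interval_s)


# In-memory stand-in for the server-side queue
queued = [["ping"], [], ["gain 0.25"]]
handled = []

poll_commands(
    fetch_pending=lambda: queued.pop(0) if queued else [],
    handle=handled.append,
    max_polls=3,
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(handled)
```

Because the device always initiates the request, nothing has to be reachable from outside the local network.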

Troubleshooting

"Always predicts Event"

Normal on fresh start. Run inference 5-6 times to build history.
Or: make noise during one capture, then go quiet. The spike ratio will drop.

"FFT bins too high"

Lower gain: g0.25
Target: bins 0.02-0.06 for quiet room

"Pico won't connect to WiFi"

Must be 2.4GHz (Pico doesn't support 5GHz)
Check SSID/password spelling

What's Next

Weight sensing: HX711-based load cells are in progress. Weight is the most reliable indicator of honey production and winter food reserves.
Better training data: Current model needs more "bad event" examples. We're collecting data from more hives.
Mesh networking: Multiple hives, one gateway.
Alerts: Email/SMS when something needs attention.
Hardening: The dashboard occasionally stalls, firmware restart handling needs work, UI polish.

The board is a "functional prototype", but our soldering skills are not great. We're looking into learning KiCad so we can turn it into a PCB that fits more neatly into a 3D-printed case.