What started as a simple idea — “Let’s make a door camera using ESP32-CAM” — quickly turned into a deep engineering journey across embedded systems, networking, and edge intelligence.
The goal was clear:
- A smart door camera
- Fully offline (no cloud)
- Capable of motion detection, OTA updates, and real-time alerts
But the path to get there was anything but straightforward.

The Vision
Most DIY camera systems rely heavily on cloud platforms. I wanted to build something:
- Private
- Lightweight
- Fully controllable
So I designed a system where:
- ESP32-CAM acts as the edge device
- An old Android phone runs a local server (via Termux)
- Everything communicates over local WiFi

          ┌──────────────────┐
          │    ESP32-CAM     │
          │  (Edge Device)   │
          └──────┬───────────┘
                 │ WiFi
                 ▼
          ┌──────────────────┐
          │  Android Phone   │
          │ (Termux Server)  │
          └──────┬───────────┘
                 │
     ┌───────────┼────────────┬──────────────┐
     ▼           ▼            ▼              ▼
OTA Server  Image Store   Doorbell      Log System
(firmware) (JPEG files)   Trigger       (Debugging)

Data Flow
[ESP32-CAM]
│
├── GET /version ───────────────▶ Check firmware version
│
├── GET /firmware ─────────────▶ Download update
│
├── POST /upload ──────────────▶ Send captured image
│
├── GET /ring ─────────────────▶ Trigger doorbell
│
└── POST /log ────────────────▶ Send debug logs

Features

OTA Firmware Updates
ESP32 → checks version → downloads firmware → flashes → reboots

Doorbell Trigger
ESP32 → HTTP GET /ring → Server → Plays sound (Termux API)

Image Capture & Upload
Camera Capture → HTTP POST → Server → Save image locally

Remote Logging
ESP32 logs → HTTP POST → Server → Terminal + file logging

Motion Detection (Custom Built)
This was the hardest, and most interesting, part of the project.
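
The five routes in the data flow above (/version, /firmware, /upload, /ring, /log) could be served by a very small local HTTP service. The sketch below is only an illustration under several assumptions: it uses Python's standard http.server module rather than the custom socket server mentioned later in this post, and the names make_server, DoorCamHandler, firmware.bin, uploads, and device.log are hypothetical. Playing the doorbell sound is reduced to a counter, since the exact Termux API call is not given here. It sends a Content-Length header on every response, which is the kind of header a firmware download typically depends on.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


class DoorCamHandler(BaseHTTPRequestHandler):
    """Hypothetical handler for the routes listed in the data flow."""

    def _send(self, status, body=b"", content_type="text/plain"):
        # Always send Content-Length so the client knows where the body ends.
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def _read_body(self):
        length = int(self.headers.get("Content-Length", 0))
        return self.rfile.read(length)

    def do_GET(self):
        root = self.server.root
        if self.path == "/version":
            self._send(200, self.server.version.encode())
        elif self.path == "/firmware":
            firmware = root / "firmware.bin"  # assumed file name
            if firmware.exists():
                self._send(200, firmware.read_bytes(), "application/octet-stream")
            else:
                self._send(404, b"no firmware")
        elif self.path == "/ring":
            # Placeholder: a real server would play a sound here.
            self.server.ring_count += 1
            self._send(200, b"ring")
        else:
            self._send(404, b"not found")

    def do_POST(self):
        root = self.server.root
        body = self._read_body()
        if self.path == "/upload":
            upload_dir = root / "uploads"  # assumed directory name
            upload_dir.mkdir(exist_ok=True)
            index = len(list(upload_dir.glob("*.jpg")))
            name = f"img_{index:05d}.jpg"
            (upload_dir / name).write_bytes(body)
            self._send(200, name.encode())
        elif self.path == "/log":
            with (root / "device.log").open("ab") as log_file:
                log_file.write(body + b"\n")
            self._send(200, b"ok")
        else:
            self._send(404, b"not found")

    def log_message(self, fmt, *args):
        pass  # keep the terminal free for device logs


def make_server(root, host="0.0.0.0", port=8080, version="1.0.0"):
    """Build (but do not start) the server; all defaults are assumptions."""
    server = ThreadingHTTPServer((host, port), DoorCamHandler)
    server.root = Path(root)
    server.version = version
    server.ring_count = 0
    return server
```

Calling make_server(".").serve_forever() would start it in the current directory. Keeping the storage root as a parameter makes it easy to point the server at a scratch directory while testing.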

Challenges Faced (And Lessons Learned)

1. Power Instability (Brownout Errors)
E BOD: Brownout detector was triggered
Root Cause
Weak power supply → voltage drop → ESP32 reset
Fix
Stable 5V supply → consistent operation

2. Camera Initialization Failures
Camera probe failed / not supported
Root Cause
Repeated camera init/deinit → driver crash
Fix
Single initialization → stable runtime

3. OTA & HTTP Failures
Issues:
- Broken pipe
- Timeouts
- OTA loops
Root Cause
Improper HTTP handling + missing headers
Fix
Custom socket server → correct headers → stable OTA

4. Motion Detection Nightmare
Evolution:
Stage 1: No detection
Stage 2: Always triggering
Stage 3: Never triggering
Stage 4: Stable + accurate

The Breakthrough: Custom Motion Detection

Final Detection Pipeline
Camera Frame (JPEG)
│
▼
Sparse Sampling (select bytes)
│
▼
Temporal Smoothing
│
▼
Compute Diff
│
▼
Adaptive Baseline
│
▼
Dynamic Threshold
│
▼
Multi-frame Validation
│
▼
Trigger Event

Sparse Sampling
Instead of full frame:
Pick ~200 points → faster + stable

Temporal Smoothing
new_value = (old * 3 + current) / 4

Adaptive Baseline
baseline = rolling average of diff
(only when scene is stable)

Dynamic Threshold
threshold = baseline + margin

Baseline Freeze
if (diff < baseline + margin)
    update baseline
else
    freeze baseline

Multi-frame Validation
motion must persist for N frames

Strong Motion Override
if diff >> baseline → trigger instantly

Detection Behavior
No Motion:
Diff ~ 90–110
Threshold ~ 100–110
→ No trigger
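
To make the detection steps above concrete, here is a minimal sketch of how sparse sampling, temporal smoothing, the adaptive baseline, the dynamic threshold, baseline freeze, multi-frame validation, and the strong-motion override could fit into one per-frame update. It is written in Python purely for illustration and is not the firmware code. Several details are assumptions that the post does not specify: diff is taken as the mean absolute difference between the sampled bytes and their smoothed values, "diff >> baseline" is read as "diff exceeds strong_factor times the threshold", and the values of margin, strong_factor, required_frames, and baseline_alpha are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


def sparse_sample(frame: bytes, points: int = 200) -> List[int]:
    """Pick up to `points` evenly spaced bytes from the frame buffer."""
    if not frame:
        return []
    step = max(1, len(frame) // points)
    return list(frame[::step][:points])


@dataclass
class MotionDetector:
    margin: float = 15.0         # assumed: threshold = baseline + margin
    strong_factor: float = 3.0   # assumed reading of "diff >> baseline"
    required_frames: int = 3     # assumed N for multi-frame validation
    baseline_alpha: float = 0.1  # assumed weight of the rolling average
    smoothed: Optional[List[float]] = None
    baseline: Optional[float] = None
    streak: int = 0

    def update(self, frame: bytes) -> bool:
        """Feed one frame; return True when a motion event should fire."""
        samples = sparse_sample(frame)
        if self.smoothed is None or len(self.smoothed) != len(samples):
            # First frame (or changed sample count) only seeds the reference.
            self.smoothed = [float(s) for s in samples]
            return False

        # Temporal smoothing: new_value = (old * 3 + current) / 4
        self.smoothed = [(m * 3 + s) / 4 for s, m in zip(samples, self.smoothed)]

        # Diff: mean absolute distance between samples and smoothed values.
        diff = sum(abs(s - m) for s, m in zip(samples, self.smoothed))
        diff /= max(1, len(samples))

        if self.baseline is None:
            self.baseline = diff
            return False

        threshold = self.baseline + self.margin  # dynamic threshold

        if diff < threshold:
            # Scene is stable: let the baseline follow the diff slowly.
            self.baseline += self.baseline_alpha * (diff - self.baseline)
            self.streak = 0
            return False

        # From here on the baseline is frozen (not updated).
        if diff > self.strong_factor * threshold:
            self.streak = 0
            return True  # strong motion override: trigger instantly

        self.streak += 1
        if self.streak >= self.required_frames:
            self.streak = 0
            return True  # motion persisted for N frames
        return False
```

A caller would create one MotionDetector and pass each captured frame buffer to update(), acting only when it returns True. Because the baseline is updated only on stable frames, a burst of movement cannot raise the threshold and mask itself.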
Movement:
Diff ~ 130–180
Threshold ~ 110
→ Trigger

Low-Light Handling
Problem
Low light → sensor noise ↑ → false motion
Solution
Estimate brightness → control flash
If brightness < threshold:
    Flash ON
Else:
    Flash OFF

Final System Behavior
Input: Environment + Movement
│
▼
ESP32 Motion Detection
│
▼
Decision Engine
│
├── No Motion → Do nothing
│
└── Motion Detected →
    ├── Trigger Doorbell
    ├── Capture Image
    └── Upload to Server

What’s Next
ESP32 (edge)
↓
Server AI (human detection)
↓
Smart alerts (phone)

- Human detection
- Notifications
- Video clips
- Docker backend

This project evolved from a simple ESP32 camera into an:
Edge-based Smart Surveillance System

Thanks for reading!
If you’re working on ESP32, IoT, or edge AI, let’s connect.










