Plumerai Brings Its TinyML People Detection Model to Espressif's Low-Cost ESP32-S3

Running at a usable 3.3 frames per second, the newly-shrunken model can track up to 20 people at distances over 65 feet.

TinyML specialist Plumerai has announced that its People Detection model, trained on 30 million images, is now compatible with the Espressif ESP32-S3 microcontroller — running at 3.3 frames per second in just 166kB of RAM.

"Running the Plumerai People Detection on Espressif’s MCU [Microcontroller Unit] enables new smart home, smart building, smart city, and smart health applications," the company claims of the port. "Tiny smart home cameras based on the ESP32-S3 can provide notifications when people are on your property or in your home. Lights can turn on when we get home and the AC can direct the cold airflow toward you. The elderly can stay independent longer with sensors that notice when they need help. Traffic lights notice automatically when you arrive."

TinyML specialist Plumerai has brought its compact People Detection model to the Espressif ESP32-S3. (📹: Plumerai)

The Plumerai People Detection model was originally designed for resource-constrained edge devices, though ones with more headroom than Espressif's ESP32-S3 provides. The company's initial release offered 55 frames per second running on a single-core Arm Cortex-A application-class processor with a memory footprint of 1MB, whereas the ESP32-S3 is a microcontroller with a 32-bit Tensilica Xtensa LX7 core running at up to 240MHz and just 512kB of static RAM (SRAM).

The ESP32-S3 port of Plumerai's model is surprisingly svelte, even by the company's standards, requiring just 166kB of RAM at its peak and running at a usable if not speedy 3.3 frames per second on a single Xtensa LX7 core. To prove it, the company has announced a demo designed for the ESP32-S3-EYE AI board, which pairs an ESP32-S3 with an Omnivision OV2640 camera and 1.3" 240×240 SPI display, plus an always-useful 8MB of external pseudo-static RAM (PSRAM).
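
To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python using only the numbers quoted above. The function names are purely illustrative and have nothing to do with Plumerai's software. The calculation assumes the 166kB peak is drawn entirely from the 512kB of on-chip SRAM, which the company's announcement does not spell out.

```python
# Back-of-the-envelope budget from the figures quoted in the article.
# Assumption: the 166kB peak comes out of the 512kB on-chip SRAM.

FRAMES_PER_SECOND = 3.3   # quoted ESP32-S3 throughput
PEAK_RAM_KB = 166         # quoted peak RAM use of the model
SRAM_KB = 512             # quoted on-chip SRAM of the ESP32-S3


def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def ram_share(peak_kb: float, total_kb: float) -> float:
    """Fraction of total RAM taken up at the model's peak."""
    return peak_kb / total_kb


def headroom_kb(peak_kb: int, total_kb: int) -> int:
    """RAM left over at the model's peak, in the same units."""
    return total_kb - peak_kb


if __name__ == "__main__":
    print(f"Per-frame time: {frame_time_ms(FRAMES_PER_SECOND):.0f} ms")
    print(f"Share of SRAM:  {ram_share(PEAK_RAM_KB, SRAM_KB):.0%}")
    print(f"SRAM headroom:  {headroom_kb(PEAK_RAM_KB, SRAM_KB)} kB")
```

By this estimate each frame takes roughly 300 milliseconds, and the model's peak occupies about a third of the on-chip SRAM. The headroom figure is an upper bound only, since the camera driver, frame buffers, and networking stack all need memory of their own.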

According to Plumerai, the newly-shrunken model is capable of detecting up to 20 people at distances over 65 feet, entirely on device, and works with indoor, outdoor, and low-light footage. Interested parties can contact the company to receive access to the ESP32-S3-EYE demo.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.