It started as a music player. I loaded a tiny **M5Cardputer** with some kids' songs I'd made for my 3-year-old, and he was instantly hooked — he carried it around for days telling everyone *"this is the little computer Daddy gave me."*
We have other small-screen gadgets he likes, but this one was different: it has a **real, physical keyboard**. That's what made him certain it was a *real* computer. So I asked myself one question: **how do I make the most of that keyboard for a 3-year-old?**
Everything below is my answer.
The design rule is simple: **every key makes a sound.** A toddler mashing the keyboard must never hit silence.
Two modes share one device.
**ABC mode — letter keys A–Z.** Press a letter and it:
- plays a short **phonics song** for that letter (letter name ×2, phonics ×3, then 3 example words);
- **writes the letter on screen, stroke by stroke** (uppercase, then lowercase);
- animates the matching word in time with the music — the apple gets bitten three times, the ambulance flashes its siren, the ostrich buries its head in the sand;
- and **any new letter interrupts instantly** (essential when a 3-year-old is hammering keys).
**Music-player mode — number keys 0–9.** Ten channels in your pocket: each digit plays a folder of your own audio (mp3/wav/m4a/…), with cover art, title, and a switchable visualizer — a tiny, offline, ad-free kids' MP3 player.
Every non-letter key gives a friendly *"ding"* — feedback without interrupting playback.
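The "no silent key" rule can be sketched as a small routing function. This is an illustration in Python, not the firmware's own code. The `Action` type and `route_key` name are hypothetical, and it assumes digit keys switch music channels while every remaining key only dings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str         # "letter_song", "music_channel", or "ding"
    target: str       # the letter or digit pressed, "" for a ding
    interrupts: bool  # whether current playback is stopped first


def route_key(key: str) -> Action:
    """Map one key press to an action so that no key is ever silent."""
    single = len(key) == 1 and key.isascii()
    if single and key.isalpha():
        # A new letter always cuts off whatever is playing.
        return Action("letter_song", key.upper(), True)
    if single and key.isdigit():
        # Assumption: a digit switches to that music channel.
        return Action("music_channel", key, True)
    # Everything else dings on top of the current playback.
    return Action("ding", "", False)
```

The useful property is the final `return`: there is no branch that produces nothing, so a new key on the keyboard can never fall through to silence.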
The result? He's **hooked — and actually learning.** With no instruction from us, he's taught himself several letters and can now find each of them on the keyboard in a flash (he still mashes **E** far more than any other key, for reasons nobody understands). I've half-started thinking about adding screen-time limits.
**Parent settings, kid-proofed.** All the knobs live behind a deliberate two-key combo — **Fn + Backspace** (two opposite corners of the keyboard, so a toddler mashing one-handed can't stumble into it). Inside, you can toggle the animations, switch the art style (retro-pixel or flat), pick the music-player visualizer, set the screen-off timeout (saved across reboots), choose the active content pack, and flip the device into USB-drive mode — so you can load new content straight over USB, no microSD card reader needed.
The Cardputer-ADV is built around a Stamp-S3A module: **ESP32-S3, 8 MB flash, and, importantly, no PSRAM.** That one fact shaped everything.
- **Audio streams from the SD card** through small internal-RAM ring buffers (there's no room to load whole songs into memory), with MP3 decoded on-device via libhelix.
- **Animations are pre-rendered flip-frames.** All the heavy pixel work happens on a computer (Python + Pillow); the device just flips through 240×135 JPEGs — sidestepping the no-PSRAM / alpha / tearing traps entirely.
- **The picture follows the song.** Each letter's song is force-aligned to its lyrics, so the screen switches to the exact word being sung and fires the "action" animation on the 3-beat refrain.
- Everything is **double-buffered** (drawn to an off-screen canvas, pushed once) so there's no flicker.
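To show the idea behind streaming through a small ring buffer, here is a minimal Python sketch. It is not the firmware (which runs on the ESP32-S3 and is not shown in this post). `RingBuffer`, `stream`, and the buffer sizes are hypothetical names and values chosen for illustration.

```python
import io


class RingBuffer:
    """Fixed-size byte FIFO: the SD reader fills it, the decoder drains it."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._buf = bytearray(capacity)
        self._capacity = capacity
        self._read = 0   # index of the next byte to hand out
        self._size = 0   # number of bytes currently stored

    def free(self) -> int:
        return self._capacity - self._size

    def write(self, data: bytes) -> int:
        """Store as much of data as fits; return how many bytes were accepted."""
        n = min(len(data), self.free())
        start = (self._read + self._size) % self._capacity
        first = min(n, self._capacity - start)      # bytes before the wrap
        self._buf[start:start + first] = data[:first]
        self._buf[0:n - first] = data[first:n]      # bytes after the wrap
        self._size += n
        return n

    def read(self, count: int) -> bytes:
        """Remove and return up to count bytes in arrival order."""
        n = min(count, self._size)
        first = min(n, self._capacity - self._read)
        out = bytes(self._buf[self._read:self._read + first])
        out += bytes(self._buf[0:n - first])
        self._read = (self._read + n) % self._capacity
        self._size -= n
        return out


def stream(source, ring, consume, chunk=512, frame=256):
    """Move a file through the ring without ever holding the whole file."""
    while True:
        want = min(chunk, ring.free())
        if want:
            ring.write(source.read(want))
        block = ring.read(frame)
        if not block:          # source exhausted and ring drained
            break
        consume(block)
```

For example, `stream(io.BytesIO(data), RingBuffer(1024), decoder_input.extend)` passes `data` through 1 KB of buffer regardless of its length. Memory use is set by the ring size, not the song size, which is the property that matters on a board with no PSRAM.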
The part I'm most proud of architecturally: **the firmware knows nothing about "letters" or "phonics."** It just plays whatever files a *content pack* provides.
A pack is a folder on the SD card (`/packs/<name>/`). Changing the voice, accent, or language is a **folder swap — zero code, no reflashing.** Switch packs from the on-device settings menu, or load new content over **USB drive mode** — plug the device into a computer and its SD card shows up as a USB drive, so you can drag songs and packs across without ever popping the card in and out.
That makes it a platform, not a single toy:
- **Swap the letter-key content — zero code.** Other voices and accents, other languages and phonics systems (e.g. Chinese pinyin), or different word/theme sets — anything that fits the A–Z engine.
- **Add new modes — a small firmware step beyond a folder swap.** Since every key already maps to a sound + a visual, **interactive quizzes** are a natural next step — "press the letter that says /æ/", or "find the animal that starts with B", with right/wrong feedback (on the roadmap).
And these packs aren't only mine to make: **anyone can author one** (a pack is just a folder of audio and image frames in a simple layout). I'm already building a **vehicles** pack, because my son is obsessed with them, and I plan to keep following his interests. A **Minecraft**-themed pack is next on my list.
Today it teaches ABCs. The architecture makes it a tiny, swappable **learning console.**
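To make the "firmware only sees files" idea concrete, here is a rough Python sketch of pack discovery and lookup. Only the `/packs/<name>/` root comes from the description above. The per-letter subfolder and the `song.mp3` filename are assumptions made for this example, not the project's documented layout.

```python
from pathlib import Path
from typing import List, Optional


def list_packs(sd_root: Path) -> List[str]:
    """Return the names of all pack folders found under <sd_root>/packs."""
    packs_dir = sd_root / "packs"
    if not packs_dir.is_dir():
        return []
    return sorted(p.name for p in packs_dir.iterdir() if p.is_dir())


def song_path(sd_root: Path, pack: str, letter: str) -> Optional[Path]:
    """Find the song a pack provides for one letter key, if it has one.

    Assumed (hypothetical) layout: /packs/<name>/<letter>/song.mp3
    """
    candidate = sd_root / "packs" / pack / letter.lower() / "song.mp3"
    return candidate if candidate.is_file() else None
```

Nothing in these functions mentions phonics or English. Swapping the active pack name changes what every key plays, and a pack that lacks a letter simply returns `None`, so the caller can fall back to a plain ding.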
## What was hard

The toy itself took a few hours. The next five days went into making it *feel right*, and that's where the real work was:
- **Pixel art that reads on a 1.14" screen.** Turning emoji into clean retro-pixel art took a dozen failed approaches (edge-rebuild wiped the animals' faces; pure-black outlines looked muddy). The fix: keep the original black pixels in place and recolor the outlines with a hue-shifted dark tone.
- **Syncing picture to song.** Landing the screen on the exact word being sung was the hardest part — speech recognition kept mis-hearing words and flashing the wrong picture. Forced alignment against the known lyrics finally nailed it.
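Once forced alignment has produced a start time for each word, picking the picture for the current playback position is a sorted lookup. The Python sketch below shows one way that lookup could work. The cue times, the frame-set names, and the word "ant" are invented for illustration, and `frame_set_at` is a hypothetical helper rather than a function from the project.

```python
from bisect import bisect_right

# Hypothetical cue list for one song: (start time in ms, frame set to show).
# In practice these times would come from the forced-alignment step.
CUES = [
    (0, "letter_a"),
    (2400, "apple"),
    (4100, "ambulance"),
    (5900, "ant"),
]


def frame_set_at(cues, position_ms):
    """Return the frame set whose cue most recently started at position_ms."""
    starts = [start for start, _ in cues]
    index = bisect_right(starts, position_ms) - 1
    # Before the first cue, fall back to the first frame set.
    return cues[max(index, 0)][1]
```

Because the lookup depends only on the playback position, an interrupted or restarted song always lands on the right picture without any extra state to keep in sync.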
1. **Flash** the firmware: grab `cardputer_abc_v1.0_merged.bin` from [Releases](https://github.com/fancyoung/cardputer-abc/releases/latest) and flash it (via **M5Burner** — search *Cardputer ABC* — or `esptool write_flash 0x0 <bin>`).
2. **SD card:** format it FAT32 and extract the ready-to-use content pack (also on Releases) to the root.
3. Insert the card, power on, and press **A–Z**. Number keys `0–9` are the music player.
Repo, docs, and bilingual README: **https://github.com/fancyoung/cardputer-abc**




