Alt+128077 for This Emoji Input Device

Computer vision and tinyML take away the pain of typing emojis with a physical keyboard. No more slightly frowning face!

Nick Bild
1 year ago • Machine Learning & AI
Testing out the gesture recognition capability (📷: S. Mistry)

When communicating via written text with a computer, the auditory and visual cues that normally convey expressions of emotion are missing. This removes important context from conversations and often leads to misinterpretations of the writer's original intent. Studies have shown that, by and large, we are much worse at expressing our emotions through the written word, and at interpreting the emotions of others, than we believe we are. So not only do we do a bad job of it, but we are fairly certain that our incorrect interpretations are spot on. This is a formula that frequently leads to misunderstandings, hurt feelings, or bad business decisions.

Being careful about the language one uses and proofreading documents before sending them can help, but that only goes so far. Emojis, in one form or another, have long been with us, but they really rose into common use with the growing popularity of smartphones. These little icons, embedded in text, have the power to change the perception of a sarcastic comment from a cruel dig into a good-natured joke between friends. On some devices, like smartphones, they are easy enough to insert into a conversation. But on laptops and desktop computers with standard keyboards, they are still a bit of a pain to use, requiring complex keyboard shortcuts to be remembered or on-screen selectors to be launched that break the normal flow of typing. A frictionless way to include emojis in text could go a long way toward resolving the miscommunications so often experienced through this medium.

Sandeep Mistry of Arm recently blogged about a project he completed that serves exactly this purpose. He used computer vision and tinyML to detect when the person using a computer performs a gesture, like giving a thumbs up, then automatically "types" the corresponding emoji code so that it appears on-screen. The project relies on an inexpensive, easy-to-work-with microcontroller development board with a built-in image sensor that serves as an all-in-one hardware solution. It was programmed with commonly used, open source tools like TensorFlow Lite for Microcontrollers, which makes it possible for nearly anyone to replicate this project on their own.
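At its core, the trick is a lookup from a predicted gesture label to the Unicode codepoint that gets "typed." Here is a minimal sketch in Python; the label names and the exact emoji set are illustrative assumptions, not details from Mistry's code:

    # Illustrative mapping from classifier labels to Unicode emoji.
    # Label names here are assumptions, not Mistry's actual identifiers.
    GESTURE_TO_EMOJI = {
        "thumbs_up": "\U0001F44D",    # 👍 (decimal 128077, as in the title)
        "thumbs_down": "\U0001F44E",  # 👎
        "hand_up": "\U0000270B",      # ✋ raised hand
        "fist": "\U0000270A",         # ✊ raised fist
    }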

The build is powered by the OpenMV Cam H7 R2, with an STM32H743VI Arm Cortex-M7 processor operating at 480 MHz and 1 MB of SRAM. That may not sound like a lot if you are used to running machine learning algorithms in the cloud, but it is more than sufficient for this application when TensorFlow Lite for Microcontrollers is paired with the CMSIS-NN software library, which maximizes the performance and minimizes the memory footprint of neural networks running on Arm processors. Mistry found that this setup could process 96 x 96 pixel grayscale images at a rate of 20 frames per second.
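On the device itself, the capture-and-classify loop can be sketched in a few lines of OpenMV MicroPython. This is a hedged reconstruction rather than Mistry's actual script: the model filename, label order, and confidence threshold are assumptions, and the sensor and tf calls may vary between OpenMV firmware versions:

    # Hypothetical OpenMV loop: grab 96 x 96 grayscale frames and classify them.
    import sensor, tf

    sensor.reset()
    sensor.set_pixformat(sensor.GRAYSCALE)  # the model expects grayscale input
    sensor.set_framesize(sensor.QVGA)
    sensor.set_windowing((96, 96))          # crop down to the 96 x 96 model input

    labels = ["fist", "hand_up", "no_gesture", "thumbs_down", "thumbs_up"]
    model = tf.load("gesture_model.tflite", load_to_fb=True)

    while True:
        img = sensor.snapshot()
        scores = tf.classify(model, img)[0].output()  # one score per label
        best = max(range(len(labels)), key=lambda i: scores[i])
        if labels[best] != "no_gesture" and scores[best] > 0.9:
            print(labels[best])  # in the real build, HID keystrokes go here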

Sample gesture images would be needed to train a machine learning classifier, so a gesture recognition dataset was downloaded from Kaggle, then the images were pared down to a set representing thumbs up, thumbs down, hand up, fist, and no gesture. In total, about 14,000 images comprised this training dataset. With data collection out of the way, Mistry built a MobileNetV1 model in TensorFlow, trained it, and converted it to TensorFlow Lite format. After the model was deployed to the OpenMV Cam H7 R2, Mistry demonstrated performing a hand gesture in front of his computer and having the emoji automatically appear on-screen in a text editor. This was accomplished by using the OpenMV Cam as a USB Human Interface Device to simulate typing on the keyboard when a particular gesture was recognized with a high degree of confidence.
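On the training side, a MobileNetV1 sized for a microcontroller, plus a TensorFlow Lite conversion pass, might look like the following. The width multiplier, dataset pipeline, and filenames are assumptions for illustration, not details from the blog post:

    # Sketch: train a small MobileNetV1, then export a quantized .tflite file.
    # Hyperparameters and dataset loading are assumed for illustration.
    import tensorflow as tf

    NUM_CLASSES = 5  # thumbs up, thumbs down, hand up, fist, no gesture

    model = tf.keras.applications.MobileNet(
        input_shape=(96, 96, 1),  # grayscale, so no pretrained weights apply
        alpha=0.25,               # width multiplier sized for tiny memory budgets
        weights=None,
        classes=NUM_CLASSES,
    )
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_ds, validation_data=val_ds, ...)  # the ~14,000 images

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Full int8 quantization would also set converter.representative_dataset.
    with open("gesture_model.tflite", "wb") as f:
        f.write(converter.convert())

Post-training quantization is what lets a model like this fit comfortably alongside the frame buffer in the H7's 1 MB of SRAM.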

At present, the device is limited to just four simple hand gesture emojis. As you might imagine, expanding this method to a much larger set might be difficult, or even undesirable. Who would want a smiley face typed every time they happened to crack a smile? And how useful would it be if you had to work up some fake tears to get a sad face? This device may not be the optimal solution for every case, but it is a great way to educate yourself about computer vision and tinyML, so be sure to check out the blog post and follow along at home.
