Stack-chan is an open-source (Apache-2.0 licensed) communication robot. Because it is open source, derivative development is active, and AI Stack-chan is one of the best-known derivatives. AI Stack-chan converses with users by calling the Web APIs of various Internet services: an LLM (ChatGPT), STT (speech recognition), and TTS (speech synthesis).
Later, Module-LLM was released as an expansion board for the M5Stack microcontroller module used in Stack-chan. Module-LLM carries a dedicated neural network inference chip (NPU) and can run an LLM (Qwen2.5-0.5b and others), STT, and TTS on its own. In other words, with Module-LLM the conversation functions of AI Stack-chan can be realized entirely on an edge device, without relying on Web APIs.
In this project, we first applied Module-LLM to Stack-chan. By default, Module-LLM supports STT and TTS only in English and Chinese, so it cannot be used in Japanese. We therefore replaced the default models so that it can be used in Japanese as well.
Furthermore, by replacing the LLM with other models published on Hugging Face, we also added support for Function Calling.
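To illustrate what Function Calling means here, the following is a minimal sketch and not this project's actual implementation. It assumes the model emits a tool call as a JSON object with `name` and `arguments` fields; that output format, the `dispatch` helper, and the tool names `set_expression` and `nod_head` are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical robot actions; the names and behavior are illustrative only.
def set_expression(name: str) -> str:
    return f"expression set to {name}"

def nod_head(times: int = 1) -> str:
    return f"nodded {times} time(s)"

# Registry mapping the tool names the model may emit to Python callables.
TOOLS = {"set_expression": set_expression, "nod_head": nod_head}

def dispatch(model_output: str) -> str:
    """Run the tool named in a JSON tool call, or return plain text unchanged."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        # Not JSON: treat it as an ordinary spoken reply.
        return model_output
    if not isinstance(call, dict) or call.get("name") not in TOOLS:
        # JSON, but not a call to a known tool: pass it through untouched.
        return model_output
    return TOOLS[call["name"]](**call.get("arguments", {}))
```

With this sketch, a reply that is plain text would be passed on to TTS as usual, while a reply that parses as a known tool call would trigger the matching robot action instead.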










