Dim Tass Gets a 31 Percent Speed Boost on STM32 Microcontrollers Using Core-Coupled Memory

Designed with zero wait-state penalties, CCM can have a major impact on code performance.

Gareth Halfacree
4 years ago • HW101
Some STM32 microcontrollers can enjoy a major speed boost with CCM. (📷: RobotDyn)

Developer Dim Tass has published a guide to using the Core-Coupled Memory (CCM) found in select STM32 microcontrollers as a means of dramatically improving performance for real-time and computationally intensive tasks.

"Vendors need to make themselves stand out from their competitors and this is done in many different ways. Of course, the most important is the price, but some times that’s not enough, because even the low price doesn’t mean that the controller fits your project," Tass explains. "Therefore, vendors come with different peripherals, clocks, power saving modes etc. Sometimes though, vendors provide some very interesting features in their cores and in this post I will get down to the Core-Coupled Memory (CCM) that you can find in some STM32 MCUs."

Tass builds his example on an STM32F303CC microcontroller development board, which has 256kB of flash storage, 40kB of static RAM (SRAM), and 8kB of the aforementioned CCM RAM. STMicro describes the CCM as being for "real-time and computation intensive routines [including] digital power conversion control loops (switch-mode power supplies, lighting), field-oriented 3-phase motor control, [and] real-time DSP (digital signal processing) tasks," and, unlike flash storage, it offers high performance with no wait-state penalty when executing code.
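For readers wondering what "running code from CCM" looks like in practice, the following is a minimal sketch using the GCC `section` attribute, not Tass' actual code. It assumes a linker script that defines an output section named `.ccmram` mapped to the CCM RAM region, and startup code that copies that section out of flash before `main()` runs; the section name, the `CCM_FUNC` macro, and the `checksum32` routine are all hypothetical and chosen purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the linker script provides a ".ccmram" output section placed
 * in CCM RAM, and the startup code copies it from flash at boot. The name
 * is illustrative only; check your own linker script. */
#if defined(__arm__) && defined(__GNUC__)
#define CCM_FUNC __attribute__((section(".ccmram"), noinline))
#else
#define CCM_FUNC /* non-ARM build: ordinary placement, for host testing */
#endif

/* Hypothetical time-critical routine: byte-wise sum of a buffer.
 * On an ARM target this function is emitted into the ".ccmram" section. */
CCM_FUNC uint32_t checksum32(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}
```

With only 8kB of CCM RAM on the STM32F303CC, the usual approach would be to tag only the hottest routines this way and leave everything else in flash.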

Using the LZ4 compression algorithm as a benchmark, and a custom CMake template which allows for flash, SRAM, and CCM RAM execution, Tass showcases how much of a difference the CCM can make. At the default 72MHz clock speed and with a block size of 8k, executing the LZ4 code from flash took between 279 and 304 milliseconds; switching to SRAM cut this to 251ms, and switching to CCM lowered it still further to 172ms. Switching to 128MHz through an "overclocking" setting cut execution time to 156-171ms for flash, 141ms for SRAM, and just 97ms for CCM at the same block size.
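The headline figure can be sanity-checked from the timings quoted above. The short sketch below computes the percentage of execution time saved between two measurements; the helper name `percent_saved` is hypothetical, and pairing the 31 percent claim with the SRAM-to-CCM comparison is an assumption based on the numbers rather than something the source states.

```c
/* Percentage reduction in execution time when going from base_ms to new_ms.
 * Hypothetical helper for checking the article's figures. */
double percent_saved(double base_ms, double new_ms)
{
    return (base_ms - new_ms) / base_ms * 100.0;
}

/* Timings quoted in the article for an 8k block size, in milliseconds. */
const double SRAM_72MHZ_MS  = 251.0;
const double CCM_72MHZ_MS   = 172.0;
const double SRAM_128MHZ_MS = 141.0;
const double CCM_128MHZ_MS  = 97.0;
```

Calling `percent_saved(SRAM_72MHZ_MS, CCM_72MHZ_MS)` and `percent_saved(SRAM_128MHZ_MS, CCM_128MHZ_MS)` both give a little over 31 percent, which lines up with the figure Tass quotes; measured against the flash timings, the saving is larger still.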

"I’ve spotted by chance this CCM RAM in the datasheet and I thought, meh, let’s try it. I was expecting that it would be a bit faster, but I didn’t expect that the difference would be that great," Tass notes. "31% faster is a lot of performance gain, you can’t ignore this, especially in time critical code."

The full write-up is available on Tass' website, Stupid Projects, along with a CMake template which enables overclocking and CCM RAM usage.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.