Whenever you compare two items, there are many features to look at. Some of them may be important to you, while others are completely irrelevant. This project looks at the performance of calculations using floating-point numbers.
An example where multiplication becomes crucial is the production of fractal images, which became famous in the late 1970s when Professor Benoit Mandelbrot created them; in those days they took long hours to compute. To produce such an image, a statement like this
z = z * z + c; where z and c are complex numbers, has to be evaluated again and again until z exceeds a given limit; in this case that adds up to about 5,000,000 evaluations. As can be seen in the picture above, an R3 and an R4 were started simultaneously. The R4 finished its job within 19 seconds, while the R3 is still running and will eventually finish after 519 seconds. Admittedly, this is not a typical Arduino application: most programs contain fewer floating-point operations, so the advantage of the R4 will shrink accordingly.
Side note concerning R4-compatibility of TFT libraries:
Up to now, providers of TFT-LCD libraries have not published updates compatible with the UNO R4. I finally found some useful hints (see credits).
----------------------------------------------------------------------------------------------------
Thanks to Christian for his answer; it showed me that I had not made my intention sufficiently clear: I just wanted to compare the floating-point arithmetic capabilities of the UNO versions R3 and R4. As a reminder: the Commodore computers at the end of the 1970s were equipped with 5-byte floating-point arithmetic (no integer arithmetic), and a little later the British Sinclair ZX80 came completely without floating-point numbers. The C compiler for Arduino can handle 4-byte arithmetic (float) easily, but the ATmega328 does not have enough registers for 8-byte arithmetic (double). The successor to the 328P, the 328PB, which was never officially supported by Arduino, was no better.
And now Arduino has decided to present the Renesas RA4M1 (which can do floating point arithmetic in hardware) as the successor to the Atmel chip.
The results in short: the R4 performs one double multiplication within 35 microseconds while the R3 takes 150 microseconds for the float multiplication.
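For readers who want to reproduce such a figure, the following is a minimal sketch of one way to time a multiplication. It is not the sketch used for the numbers above. The helper name microsPerMultiply is hypothetical, and the code uses std::chrono so that it compiles on a desktop compiler; on the boards themselves one would take the two timestamps with the Arduino function micros() instead. The result includes the loop overhead, so it is an upper bound rather than an exact value.

```cpp
#include <chrono>

// Hypothetical helper (not from the original project): returns the average
// time in microseconds for one multiplication of type T, loop overhead included.
template <typename T>
double microsPerMultiply(unsigned long count) {
  // volatile keeps the compiler from optimizing the multiplications away
  volatile T factor = static_cast<T>(1.0000001);
  volatile T value = static_cast<T>(1.0);

  auto start = std::chrono::steady_clock::now();
  for (unsigned long i = 0; i < count; ++i) {
    value = value * factor;  // the operation being measured
  }
  auto stop = std::chrono::steady_clock::now();

  std::chrono::duration<double, std::micro> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(count);
}
```

Calling microsPerMultiply<float>(100000) and microsPerMultiply<double>(100000) gives the two values to compare on one machine.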
The question remains: what is it good for?
Of course: you can easily calculate memory sizes in kilobytes (kilobytes=bytes*0.001;) so you might print: 1.024 kilobytes. However, most sensors deliver their data as integers. The advantage of faster processing of decimal numbers rarely comes into play.
Therefore I looked for an algorithm in which floating-point arithmetic makes up the lion's share. The calculation of fractals contains such an algorithm: when the statement z = z * z + c; is executed repeatedly, that is exactly the case. For readers who may not be very familiar with complex numbers, here is a short explanation:
The complex variable z consists of a component called the real part re and a component called the imaginary part im.
The multiplication z*z produces a new complex number whose real part is re*re - im*im and whose imaginary part is 2*re*im. So there are four floating-point multiplications to perform.
Not yet counted is the fact that the magnitude (absolute value) of the complex number has to be calculated in each iteration, which requires several more multiplications.
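The explanation above can be sketched in code as follows. This is only an illustration under stated assumptions, not the code used in this project: the function name mandelbrotIterations is hypothetical, the escape limit is assumed to be a magnitude of 2 (tested as squared magnitude greater than 4, which avoids a square root), and float is used as on the R3.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original project): counts how often
// z = z * z + c can be applied, starting at z = 0, before |z| exceeds 2.
uint16_t mandelbrotIterations(float cRe, float cIm, uint16_t maxIter) {
  float re = 0.0f;  // real part of z
  float im = 0.0f;  // imaginary part of z
  uint16_t n = 0;

  while (n < maxIter) {
    float re2 = re * re;
    float im2 = im * im;

    // Assumed limit: |z| > 2, checked as |z|^2 = re*re + im*im > 4
    if (re2 + im2 > 4.0f) {
      break;
    }

    // z * z + c: real part re*re - im*im, imaginary part 2*re*im
    float newRe = re2 - im2 + cRe;
    im = 2.0f * re * im + cIm;
    re = newRe;
    ++n;
  }
  return n;
}
```

In this arrangement the squares re*re and im*im are computed once and reused for the magnitude test, so the test costs no extra multiplication. A program that computes the magnitude separately would need more multiplications per iteration.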
In 1979, when Professor Benoit Mandelbrot developed the fractal theory, he had plenty of computing time available on mainframe computers for these calculations. In the period that followed, there were countless efforts to optimize the algorithms for smaller machines. But none of this was part of my research. If you really want calculation speed, you had better apply "distributed computing", using plenty of Arduinos as workers. Maybe I will write an article about that at a later time.
At the end of the day, I still ask myself: why did Arduino switch from Atmel to Renesas? A possible reason might be this: ten years ago, the British BBC developed the micro:bit board for education in schools, and some time later the German Calliope was introduced, which is completely unknown outside Germany. Both of them are equipped with an nRF51822, which has 32-bit registers and includes 256 kB ROM, 32 kB RAM and a floating-point unit, so the RA4M1 matches those specifications. Maybe the R4 was meant to surpass their performance …

