Arm vs RISC-V? Which One Is The Most Efficient?

115,011 views

Gary Explains

1 day ago

Arm has been making power-efficient processors for decades. RISC-V is relatively new and many parts of its specifications aren't even ratified, but that hasn't stopped chip designers from making RISC-V processors, including microcontrollers. Can RISC-V challenge Arm's power efficiency supremacy?
---
Let Me Explain T-shirt: teespring.com/gary-explains-l...
Twitter: / garyexplains
Instagram: / garyexplains
#garyexplains

COMMENTS: 407
@Matthigast 1 year ago
2010 does indeed seem 23 years ago
@GaryExplains 1 year ago
🤦‍♂️😜 Darn. That was a stupid mistake! But I think you are right it feels like soooo long ago!
@kurakuson 1 year ago
Apple's first iPad: April 2010
@ZDevelopers 1 year ago
@GaryExplains Future proofing the video, that's all
@TechPill_ 1 year ago
@ZDevelopers Yeah, that's what I was going to say
@NovasVilla 1 year ago
It's so sad to hear that 😢
@rajivpalayan7028 11 months ago
For measuring relative performance, it is wrong to do a per MHz calculation. The only metric that should matter is the total time needed to run the same application on both processors. A more complicated ISA means clock speeds will be reduced (which gives better per MHz performance), but that does not mean the processor is faster
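To illustrate the point above, here is a minimal sketch of how a per-MHz score and total run time can disagree. The cycle counts and clock speeds are made-up numbers chosen only for illustration; they are not measurements from the video.

```python
# Sketch: per-MHz performance vs. total run time.
# All figures are hypothetical, not taken from the video.

def run_time_seconds(cycles: float, clock_mhz: float) -> float:
    """Time to execute a fixed workload that needs `cycles` clock cycles."""
    return cycles / (clock_mhz * 1e6)

def per_mhz_score(cycles: float, clock_mhz: float) -> float:
    """Runs per second divided by clock speed in MHz."""
    return (1.0 / run_time_seconds(cycles, clock_mhz)) / clock_mhz

# Chip A: needs fewer cycles for the task, but runs at a lower clock.
time_a = run_time_seconds(2.0e9, 100.0)
score_a = per_mhz_score(2.0e9, 100.0)

# Chip B: needs more cycles for the task, but runs at a higher clock.
time_b = run_time_seconds(3.0e9, 240.0)
score_b = per_mhz_score(3.0e9, 240.0)

print(f"A: {time_a:.1f} s, per-MHz score {score_a:.2e}")
print(f"B: {time_b:.1f} s, per-MHz score {score_b:.2e}")
# A wins per MHz, yet B finishes the same workload sooner.
```

With these assumed numbers, chip A looks better clock for clock while chip B still completes the job first, which is the distinction the comment is drawing.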
@ralfbaechle 1 year ago
RISC-V, ARM and others like MIPS, which I have plenty of experience with, are just architectures; the chips you can buy are implementations of these architectures. The first thing to notice from a 30,000 ft perspective is that 64-bit ARM, MIPS and RISC-V are surprisingly similar. In the past CPU architects were more adventurous. These days there is no more bat-crazy shit like segments (x86) or register windows (SPARC, and totally batshit crazy on IA-64). MIPS is an early but well-designed RISC architecture; 64-bit ARM (which fortunately is rather dissimilar to the 32-bit ARM architecture) is surprisingly similar to it. Which is unsurprising, because one of the architects used to work for MIPS. And RISC-V was designed by the fathers of SPARC and MIPS. So are they all the same? Not quite, but 64-bit ARM and RISC-V benefit significantly from hindsight. Now, once you take things to the limit, things will be different. RISC-V's smaller footprint allows fitting more cores running at a higher clock rate on a die. It barely matters for the birdseed class of microcontrollers that's polluting most PCBs ;-) So for most uses architecture doesn't matter; software does. That's where ARM is very well supported, MIPS is well established and RISC-V is still catching up. That said, the folks behind RISC-V are smart and have impressed me with what they have achieved and in my discussions with them, so they're going to close that gap. Plus, higher-end implementations are going to show up. Being a truly open architecture, however, the RISC-V market can be as confusing as an ant pile - or open source in general ;-)
@markhaus 1 year ago
RISC-V having open specs should make it easier for software to catch up than it was for Arm, no?
@ralfbaechle 1 year ago
@markhaus Yes and no. Availability of documentation greatly simplifies a port to a new architecture, especially when there is already a port to a similar architecture. It still remains a major undertaking in terms of man-hours required. Been there, done that. Three times 🙂 As far as software development is concerned, x86, MIPS, RISC-V and ARM are open enough to allow development of decent software. IA-64 was special in that its performance characteristics are ... complex. Without an NDA, or possibly even without being Intel, it was not possible to develop certain software, including high-end compilers. The level beyond that is licensing of the architecture itself, so that anybody who wants to can develop a core. RISC-V didn't really innovate there; there have been other such public-domain or similarly unrestricted architectures before. But it was the first polished architecture with academic and industry acceptance, documentation and very liberal licensing on top. It's this mix (and probably a few more things on top) which made the rise of RISC-V possible.
@conorstewart2214 1 year ago
It's not just about the people who designed the RISC-V ISA; it is also about the people who implement it. They are the ones who actually determine the efficiency, clock speeds and similar.
@ralfbaechle 1 year ago
@markhaus Yes and no. Programming specs are open for most architectures, though the degree of detail varies. Taking Intel as an example, Intel CPUs accept only a signed blob as microcode and the microcode programming interface is not even documented. Probably few users care. More painful was Intel's attitude towards protecting IA-64. The Merced documentation covers four thick books printed on thin paper and is also available for download (1-click sign-away-your-soul acceptance of terms required ;-) but errata required an NDA, and certain very deep secrets were only available under the terms of a much stricter NDA, the most restrictive I've ever seen. Finally, some further aspects, such as deep details on the performance of the pipeline, which are essential for the implementation of top-notch code generators and compilers, were not available outside of Intel at all. I don't want to single out Intel; I just picked them as an illustrative example. Companies vary in their degree of paranoia, protectiveness and openness, and corporate history and experiences are part of that. RISC-V may be an open architecture. That means the architecture is open. It does not mean an actual implementation is open. It is possible to implement a fully RISC-V-compliant processor under the terms of the RISC-V licensing conditions - great. Yet I can keep the implementation as closed as a traditional microprocessor implementation from companies such as Motorola, IBM, Intel, AMD, Hitachi, MIPS, ARM etc. The result may be something that executes RISC-V code just fine, yet for certain aspects such as performance has to be treated as opaque, as a black box. As somebody who has ported Linux to MIPS, it's been occasionally helpful to have access to the folks who did all the mental heavy lifting and wrote the specs. With a RISC-V-compliant core one may or may not have the same kind of access for a particular project.
@ralfbaechle 1 year ago
@conorstewart2214 While this is correct, one should consider the architecture definition of any processor architecture as something that sets the absolute limits of what's possible. A good implementation can reach nearly 100% of that; a bad one will stay well below. With my MIPS experience I compared the size of early MIPS and RISC-V cores, which are somewhat comparable, and the RISC-V implementation is much smaller in terms of transistors / gates, which is an indication of how polished the architecture is. OK, RISC-V had the benefit of modern software tools to aid the implementation. Such comparisons across decades are bound to limp somewhat. An interesting aspect is how the RISC-V architecture is made up of several optional parts. Just to pick one example, an implementation does not need to have a multiplication or division instruction. They were looking at other architectures' pain points. MIPS was born as a super-fast RISC processor for super-minicomputers, later workstations and servers. Nobody early on thought of embedded computing. Such omissions are hard to rectify later on in a clean manner. One point where RISC-V is brutally efficient is cost, due to the absence of licensing fees for the architecture itself. To some users that's the #1 aspect that matters.
@marklewus5468 1 year ago
One more comment. Most processor manufacturers publish a spec called DMIPS/MHz, or Dhrystone millions of instructions per second per megahertz of clock speed. This allows you to do a clock-for-clock comparison between parts.
@ralfbaechle 1 year ago
Let me wind back the clock to the mid-80s to point you at the horrors of the Dhrystone benchmark, which back then was more or less the canonical benchmark for integer performance. Even in the best case Dhrystone results didn't represent real-world performance very well. Dhrystone wasn't only ignoring FP math entirely; its results also got more and more comically absurd as architectures got more sophisticated (caches and out-of-order execution made a giant difference) and as compilers improved and started to "optimize away" parts of Dhrystone. The peak was reached when certain compilers started to recognize Dhrystone and applied Dhrystone-specific optimizations for almost arbitrary benchmark results - whatever marketing orders ;-) It gives me headaches to see that parts of the industry are still using DMIPS decades after it's been thoroughly proven to be rubbish. (It seems many folks don't know these days - the D in DMIPS stands for Dhrystone.)
@marioprawirosudiro7301 1 year ago
@ralfbaechle Thank you for this informative comment. One learns something new every day.
@mnomadvfx 6 months ago
It's an artificial value though and not very helpful for real world performance comparisons. I know this because Qualcomm quoted quite a high DMIPS for their Krait CPU core back during the ARMv7-A generation, and it routinely got thrashed by the lower DMIPS rated Cortex-A9 based SoCs in actual performance.
@markwarburton8563 1 year ago
I was surprised that the now somewhat venerable Black Pill did so well in these tests against the newer upstarts, especially in power consumption and power efficiency. Thanks Gary!
@dekus80 1 year ago
Not just "Black Pill" but STM32F401 or F411. Today one MCU is on the black PCB, tomorrow another... And the F411 draws 12.7 mA at 100 MHz core clock with peripherals disabled. Not all peripherals need to be on. I have doubts about the 20 mA measured in this video's test. The Chinese have a lot of STM32 analogues. For example, the CH32V203 (an F103 clone with a RISC-V core) draws 8 mA at 144 MHz, and the CH32V30x (RISC-V core with FPU) 12 mA at 144 MHz. And they even come in a TSSOP20 package. As an F103 clone it has CAN on board, which the F411 doesn't have. And the 307 has Ethernet, the 208 has Bluetooth + Ethernet and costs $2.20 in my local store. I have not been interested in buying STM32 for a long time. Only Chinese parts; the STM32-like families include CH32, HK32, AT32, GD32 and so on.
@terrylyn 1 year ago
When I studied computer science, RISC-V was my favorite to program. Good to see they are now making a comeback.
@jacobrosen 1 year ago
A nice explanation as always. But I'm missing the sleep current for the different boards. It would be interesting to see how they perform compared to each other. It is more of a comparison between MCU brands than core architecture, but still! :D
@jamesmcintyre2747 1 year ago
I appreciate that Gary is right here that RISC-V is not yet *as* efficient as ARM, but I'm very impressed that RISC-V is already *almost* as efficient: for the same workload it used 1.36 mWh compared to 1.31 mWh for the equivalent ARM board, and even compared to the *much* more established Pi Pico, it's only 8% less efficient (than the Pico). Obviously being almost 89% less efficient than the Black Pill isn't ideal for RISC-V, but this is still early days for it compared to ARM, and with so many fewer RISC-V processors produced vs ARM, I don't think you can expect it to be beating the leaders of the pack in ARM just yet. Maybe when there are as many models of RISC-V processor as ARM processors the leader will be ARM. Maybe with more time for tuning, the leader of the RISC-V pack will beat the leader of the ARM pack, even with fewer models out there. Encouraging stuff. Stating my bias: I want RISC-V to succeed, as I think open source is the way forward and guarding "intellectual property" like dragons over gold is holding humanity back. Thanks for the interesting video Gary!
@kayakMike1000 1 year ago
You're in the realm of potential compiler optimizations... And which process node these chips are made of...
@xade8381 1 year ago
ARM and RISC-V are nearly the same age. Sadly, only ARM got attention at that time.
@TheWallReports 1 year ago
I agree. RISC-V is not there yet but made a very good showing, being the new kid on the block. ARM has been at this game for decades. It is unrealistic to expect the new kid to outperform the veteran. ARM has been optimized over decades. RISC-V has to pay its dues to take the crown. I am a strong RISC-V advocate. I look at this as there being plenty of room for RISC-V to improve. The ground to cover in some areas is not that great to close the gap.
@BruceHoult 1 year ago
@xade8381 that's not correct. ARM started to be designed in 1983 and the first chips and boards were in 1986. ARM the company started in 1991, when there were already 100,000 ARM-based Archimedes PCs in use. RISC-V started to be designed at Berkeley in 2010 (27 years after ARM), the initial frozen spec was published in 2014, and the first board you could buy commercially from the first RISC-V company was in 2016 (30 years after ARM).
@TAP7a 1 year ago
@xade8381 RISC-V was still an educational tool for years though, with zero plans for reaching any sort of market. Whereas ARM was made from the very beginning as a commercial ISA, and is 30 years older to boot. Not very comparable.
@ShirleyMarquezDulcey 1 year ago
One detail that you missed is that the Pico and Pico W do not have a linear regulator; they have an on-board buck-boost switching power supply. Current consumption will not be constant; it will go up as voltage decreases.
@derekchristenson5711 1 year ago
That was very interesting to see, and I liked the different ways you picked to look at the question.
@abdox86 1 year ago
I really love the Black Pill. I'm actually working on a project using it right now (STM32F411CE), so I'm glad to hear it did well in the benchmark. But Gary, I have a question: did you write the program for each board in assembly or C? If the answer is C, then what compiler did you use for each one? I hope I didn't throw up a lot of questions 😅😅. Amazing work man, thanks a lot for this benchmark and I hope to see more of them!!
@petermolnar6017 1 year ago
Thanks Gary for this wonderful comparison! Greatly appreciated!
@GaryExplains 1 year ago
My pleasure!
@bsheldon2000 1 year ago
I would love to see the XIAO nRF52840 board or equivalent put to the test, as it runs at 64 MHz. This is the microcontroller used in a lot of smartwatches. Plus it would also be interesting to see the boards already tested retested at lower clock speeds, if that option is available. I know some ESP32s can have the clock lowered. For pure power efficiency, I believe a lower clock speed tends to be more power efficient for the same work done, as power usage tends to go up on an exponential scale, whereas processing power for the same processor goes up linearly. If the amps are the same at 3.3 V and 5 V, then it is using an inefficient regulator to drop the voltage. Just curious, did you calculate the power efficiency using 3.3 or 5 volts? I am not a fan of any architecture, as I just use whatever is better suited to the task. Of course having one that does it all would be nice and save having to learn all the differences, but now that assembly language is rarely used, it is not like having to learn an entirely new instruction set. By the way, if anyone gets a XIAO nRF52840 and the instructions say to double-click the button beside the USB-C port, the double-click speed is a bit slower than I was used to. It took me a lot of tries to get it right. Luckily someone mentioned doing a slow double click somewhere.
@magfal 1 year ago
A huge factor for efficiency is compiler quality, which grows with age. The major design differences around efficiency are things like dark silicon for common tasks and SIMD engine implementation, plus caches.
@laci272 1 year ago
As I watch, I get questions, and as soon as they pop into my mind, Gary already responds to them. It's rare that a tech video is this well thought out and structured this well!
@gigihanmandarin 1 year ago
the one and only legendary Gary Explains.
@angeldude101 1 year ago
You mentioned that your encryption algorithm doesn't use floating point or integer division, but does use bit manipulation. I'll ask if it also uses integer multiplication, because multiplication by default comes in the same extension as division, but was also made available on its own as Zmmul. Bit manipulation instructions beyond basic bitwise logic are also their own extension, B, and its parts. Did the RISC-V processors used support these extensions, and if so did you tell the compiler to use them when compiling your code?
@marklewus5468 1 year ago
Great work as always. Benchmarking is always a can of worms because it is as dependent on the application as it is on the processor. Do you need fast integer? Fast interrupt response? Floating point? DMA? If you used newer M3 and M4 parts they would have performed much better even in this integer-only test both with regard to processing speed and power consumption given that they’re built on *much* newer process nodes. And a recent STM32 M7 would’ve blown everything else out of the water.
@BruceHoult 1 year ago
Why are so many of the commenters here so obsessed with process node? It strikes me that many (not aimed at you in particular Mark, sorry) may just be reciting jargon without understanding it. Even a very old node such as 180nm is good enough for making a 300+ MHz chip (e.g. the SiFive FE-310 on many RISC-V microcontroller boards) which is plenty for anything in this test. Smaller process nodes do allow higher clock speeds, but if you're not USING that ability then they are not just a waste of money in the much more expensive design and manufacturing process, but they may actively be WORSE because of things such as higher leakage current when operated at low clock speeds or in low power sleep modes. It's also a complete waste when you're making a simple stand-alone chip such as a microcontroller with a small core and a small amount of SRAM because even with the old nodes you end up with the actual processor&memory being a tiny little square inside a huge bit of silicon with the I/O pin pads taking up 90% or 99% of the extremely expensive small process node chip area. The default assumption unless you're a real expert should be that the manufacturer has chosen the best process node to optimise what they want to achieve with their chip.
@adymode 1 year ago
We are familiar with needing to sample the test code many times to generate benchmark results which are not misleading, but it is also essential to sample different kinds of test code, to not be misled even by random compiler differences on each bit of code tested. With the performance results between the esp-c and the black pill coming within 1% of each other, that suggests the test was entirely memory bound on those systems and the systems share very similar memory systems. Multiple programs need to be benchmarked for a picture to emerge.
@DFPercush 1 year ago
@BruceHoult In general I would think that a smaller feature size would mean less parasitic capacitance, but I didn't think about leakage current. Is that from quantum tunneling? I wonder where the sweet spot is for that. But there's also the matter of different transistor structures like FinFET and GAA, which might reduce the switching current. Mostly I think it's an economic decision. Everybody wants better speed and battery life, but how much are they willing to pay for it? For a computer that only runs a single program continuously, all you need is "good enough". Microcontrollers often have external power anyway. The main concern vis-à-vis power consumption is cooling.
@repostor 1 year ago
Very interesting video. I have always wondered how RISC-V would compare to ARM. Do you have similar comparisons for enterprise chips too? Comparing RISC-V with x86 (Intel/AMD) and perhaps also including ARM?
@autohmae 1 year ago
I think 'process node' probably has a huge influence
@vikaspoddar9456 1 year ago
I guess, Gary, you really should put out a video series explaining the differences between ISAs, microarchitecture, process node etc. to the general public, as I have seen many people disagreeing with you on various issues. I think this video series would work as a prelude to the ARM vs RISC-V video. BTW, I also felt that I need some more help 😅😅😅😅 on this. Thank you
@muha0644 1 year ago
1:10 I did, RV32I in fact! Although I had a hard drive failure, so now it's abandoned...
@adrianalanbennett 1 year ago
Hello from Tennessee, Mr. Sims. Love your channel. Thanks for the video.
@BrianKelsay 1 year ago
Not sure if this is a valid question, but here goes. Based on these clock speeds, could one of these chips act as a processor in a micro DOS or Windows environment? I'm thinking of a kiosk that runs a corporate webpage and allows customer data entry or order entry on-site. Or a tiny web book or a tablet just for web or ebook reading, where it's mostly text. I know that the Pi, which is more powerful and has a video decoder, is slow at video and graphics. Just thinking that if not much computing power was needed, you could pair it with a mid-power graphics chip for running the display and decoding video streams. Then maybe you get TVs with minor computing and networking power. Or is this how they are making smart TVs?
@alexseleni3314 1 year ago
Literally searched this a few days ago with all the news about RISC V Vs ARM. And there was no video. Thank you for this one.
@GaryExplains 1 year ago
What news are you referring to? Also, did you see this video of mine? ukposts.info/have/v-deo/f6mIrZ-ieWiZp6c.html
@alexseleni3314 1 year ago
@GaryExplains Talking about an efficiency-specific comparison.
@IamTheHolypumpkin 1 year ago
I just bought my first RISC-V chip, an esp32-c3 from adafruit. Mostly bought it to learn RISC-V Assembly. Generally want to learn AVR, ARM and RISC-V Assembly.
@BruceHoult 1 year ago
Sounds like a great plan. All are good ISAs. If you have any questions the Reddit /r/asm forum is pretty good for any ISA, and /r/avr and /r/riscv are helpful too. Sadly, /r/arm seems dead and/or non-technical.
@ReneDeGroot 1 year ago
Do you have thoughts about the potential of RISC-V? I was curious about any production differences, like the node size used.
@GaryExplains 1 year ago
I talk about RISC-V's potential in my RISC-V series.
@ReneDeGroot 1 year ago
@GaryExplains indeed you did! Quite a few as well ukposts.info/slow/PLxLxbi4e2mYFTkLsNYqWLrSQZtLB94wnY
@AbelShields 1 year ago
I'm glad you addressed the point about WiFi on/off not making a difference, although I'd like to ask about those mWh numbers - you said for the ESP32 that it's the same current draw whether you're supplying 3.3V directly or 5V, so which voltage are these energy numbers for?
@GaryExplains 1 year ago
They are for 5 V. But they are all 5 V (i.e. for all the boards). I have the 3.3 V numbers as well, but of course it changes nothing, just smaller numbers.
@AbelShields 1 year ago
@GaryExplains so all these chips run at 3.3 V natively? Fair enough
@volodumurkalunyak4651 1 year ago
@AbelShields some boards use a linear voltage regulator (5 V to 3.3 V), some a switching voltage regulator (at least the Raspberry Pi Pico with an RP2040). Switchers waste way less power (probably 92% efficiency for the regulator and 94% for the reverse voltage protection, so about 85% in total vs 64-66% in total with a linear one)
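As a rough check on the regulator figures in this thread, here is a minimal sketch. It assumes an ideal linear regulator (input current equals output current, quiescent current ignored) and takes the 92% and 94% stage efficiencies straight from the comment above; they are the commenter's estimates, not datasheet values.

```python
# Sketch: regulator efficiency under simplifying assumptions.

def linear_regulator_efficiency(v_in: float, v_out: float) -> float:
    """Ideal linear regulator: the voltage drop is burned as heat,
    so efficiency is simply v_out / v_in."""
    return v_out / v_in

def chain_efficiency(*stages: float) -> float:
    """Overall efficiency of stages connected in series."""
    total = 1.0
    for stage in stages:
        total *= stage
    return total

linear = linear_regulator_efficiency(5.0, 3.3)
switching = chain_efficiency(0.92, 0.94)  # commenter's estimated stage figures

print(f"linear 5 V -> 3.3 V: {linear:.0%}")
print(f"switching chain:     {switching:.0%}")
```

This also shows why a board with a linear regulator draws about the same current whether it is fed 3.3 V directly or 5 V through the regulator: the current is unchanged and only the wasted voltage drop differs.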
@kasperlhde7893 1 year ago
Interesting video :) I do not think it is enough to just power the 3.3 V rail, since there are other onboard electronics on the ESP32 board which also require power (the USB-to-serial converter). It would have been interesting to see it compared to the datasheet :)
@nateb1804 1 year ago
The silicon fab processor node tech used to make the chips plays a huge role in their efficiency. It would be good to include fab node info in the comparison data.
@GaryExplains 1 year ago
Indeed, it is something I will note for future videos. As for this video the key is that the Arm Cortex-M4 is using 90nm and the RISC-V ESP32-C3 is on 40nm, which makes the performance of the RISC-V processor even worse.
@nateb1804 1 year ago
@GaryExplains Wow that's very telling. Thanks Gary!
@geoemm 1 year ago
The area of the chip should also be a criterion
@BruceHoult 1 year ago
Crazy to use only a single RISC-V board as representing a whole ISA. Obviously not all ARM cores or boards are created equal, and neither are all RISC-V cores or boards. Espressif doesn't even say in their datasheet what RISC-V core it uses. Crazy also not to include Sipeed Longan Nano ($4.80, 108 MHz, been around for three years), some Bouffalo lab BL602 board (similar price to ESP32s, we know it uses a SiFive core) or even extend the price limit a fraction to include a K210 board (dual core 400 MHz 64 bit) such as Maix Bit. Still, it is interesting to see that from the same chip/board manufacturer the RISC-V does in fact give better performance per MHz and per Watt than what they were using before. A really interesting test would be the Longan Nano (GD32VF103 clone of an STM32 but with a RISC-V core) vs either a GD32F103 (same manufacturer STM32 clone with a real licensed ARM core) and/or a real STM32F103.
@aneeshprasobhan 1 year ago
Top notch work ! Thanks for the video :)
@GaryExplains 1 year ago
Glad you liked it!
@borbetomagus 1 year ago
Hopefully you look into purchasing the DeepComputing/Xcalibyte ROMA RISC-V laptop (or a related RISC-V laptop or desktop) for a future video, but much more refinement will probably be necessary for it to reach its full potential.
@gamerlucky 1 year ago
Waited for this for so long... thanks to Mr. Sims for finally making it happen. Thank you
@marcusk7855 1 year ago
Isn't the manufacturing process (how many nm) a major factor in power consumption?
@fakecubed 29 days ago
Gonna take a while for the RISC-V manufacturers to figure out how to design really great chips with it, but there's no reason not to expect it will be roughly the same as ARM in the long run, just with an open ISA which is an absolute win on its own. Hobbyists who aren't trying to squeeze every last bit of performance and efficiency out of their projects should support RISC-V to help it along and encourage faster development. It's already outpacing ARM's development, which was already quite rapid.
@kychemclass5850 1 year ago
V. Informative. Tq :)
@broccoloodle 1 year ago
Back in uni, as I remember it, the active power (total power minus leakage power) is proportional to the square of the frequency. Can we use that to extrapolate the power usage of the Pi to 160 or 240 MHz?
@GaryExplains 1 year ago
Or better still watch my previous video on this topic where I actually changed the clock speed of the Pico and measured the power usage.
@volodumurkalunyak4651 1 year ago
Active power is proportional to Vcore^2 * frequency. Not frequency squared, but just frequency multiplied by core voltage squared. You may get something close to frequency squared when cores are pushed harder than the above-mentioned microcontrollers (not as hard as the latest Intel or AMD chips at full boost; the frequency still has to be supported by raising the core voltage).
@leonardosabino2002 1 year ago
@volodumurkalunyak4651 The formula I remember from university is proportional to voltage and to frequency squared (P ∝ V * f^2).
@volodumurkalunyak4651 1 year ago
@leonardosabino2002 I literally wrote the very same formula: Vcore^2 * frequency. Power is proportional to frequency and to voltage squared. Power scaling does also resemble frequency squared over some part of the voltage-frequency curve (probably the 0.7 to 1 V region for the latest chips)
@leonardosabino2002 1 year ago
@volodumurkalunyak4651 Not the same formula. Look again, it's the -frequency that's squared.- EDIT: I just looked up the formula, looks like voltage squared is correct. Sorry about that.
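The relation this thread settles on is the standard CMOS dynamic power approximation, P ≈ C · V² · f. A minimal sketch follows; the effective switched capacitance, voltages and frequencies are made-up illustration values, not figures for any chip in the video.

```python
# Sketch: dynamic (switching) power, P = C_eff * V^2 * f.
# C_EFF and the operating points below are hypothetical.

def dynamic_power_watts(c_eff_farads: float, v_core: float, freq_hz: float) -> float:
    """Approximate switching power; leakage power is not included."""
    return c_eff_farads * v_core ** 2 * freq_hz

C_EFF = 1e-9  # assumed effective switched capacitance in farads

base = dynamic_power_watts(C_EFF, 1.1, 100e6)
double_freq = dynamic_power_watts(C_EFF, 1.1, 200e6)  # same voltage, 2x clock
double_volt = dynamic_power_watts(C_EFF, 2.2, 100e6)  # 2x voltage, same clock

print(f"2x frequency -> {double_freq / base:.1f}x power")
print(f"2x voltage   -> {double_volt / base:.1f}x power")
```

At a fixed core voltage the power grows linearly with clock speed; it only starts to look quadratic or worse in frequency when the voltage has to rise along with the clock.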
@TheElectronicDilettante 8 months ago
Were the connectors taken into account? USB-C has transfer rates close to 10 Gbps while Micro-USB is pushing over 450 Mbps. Then, as far as power goes, USB-C can handle nearly an order of magnitude more power than Micro-USB, at 100 W. Just curious.
@GaryExplains 8 months ago
The test didn't use the USB ports.
@JohnnieHougaardNielsen 1 year ago
I'd say that when it comes to efficiency, a number of major interest is how much power the chip/board burns while idle. Typical MCU systems are not to crunch numbers, but for control purposes. Numbers when ready to respond to Wifi may be the most interesting, but of course there are also applications not needing wifi while waiting to do a bit of work. As usual, comparing MHz across architectures is not useful, a more realistic yardstick could be a "maximally trivial" task like how fast it can count.
@peterschets1380 1 year ago
Thanks Gary, now I must think about an application that does a lot of calculations.
@IamTheHolypumpkin 1 year ago
Out of curiosity, how would the old ATmega328P fare in such a comparison? Max 20 MHz, very very old node (I think I once looked it up and it was still in the micrometer range).
@Nomoreidsleft 7 months ago
No comparison. The ATmega328 is an 8-bit processor.
@riscy00 1 month ago
I'm more interested in what the best IDE and libraries for RISC-V are. Are they up to the STM32 level of library support, however buggy that may be?
@riscy00 1 month ago
I mean in the context of CMSIS, which includes DSP and many other things
@GaryExplains 1 month ago
The difference is that STM32 is from a company, i.e. STMicroelectronics, to support the ARM chips they make. RISC-V is an architecture, so you need to pick a company and see what it provides. The Espressif chips seem to have a mature development system for all their processors, including Arduino support.
@GaryExplains 1 month ago
Also, CMSIS is an Arm thing, from Arm itself.
@riscy00 1 month ago
What about DSP math tools, QMath, and features like that: are they included with Espressif? Is there a nice YouTube session to give me an overview? I reviewed this last time (1-2 years ago) but am not confident with the IDE setups I have seen so far
@GaryExplains 1 month ago
You appear to have detailed requirements, and I'm not sure I can provide a thorough response that fully addresses all of your questions. It might be best for you to reassess these platforms to determine if they align with your needs.
@TheEulerID 1 year ago
I think it quite surprising that a 13-year-old design stands up so well. I would suspect that if the power-saving features of more modern ARM processor designs were to be exploited for a microcontroller SoC, then it might do better still. However, presumably the priority has switched to producing much more powerful, low-power architectures for use in servers, laptops and the like. Producing the ultimate in low-power microcontrollers is probably not a priority, as these things are rarely required to do heavy number crunching.
@pnachtwey 1 year ago
4K fits in the cache. What about external memory access speed?
@todayonthebench 1 year ago
A decent video. And yes, instruction set architectures don't largely impact power efficiency. Hardware implementation, however, impacts efficiency far more. But there are nuances at the ISA level that set limits for actual implementations of the ISA. Be it limits on minimum transistor count, power efficiency, peak clock speed, etc. Sometimes one has to trade one aspect for another. As an example, a resource-efficient architecture using few transistors will generally not offer all that great peak performance, while a more peak-performance-oriented ISA will tend to be hard to build with few resources. Power efficiency is meanwhile largely decoupled from this view of complexity, since power efficiency is more about how well a given piece of software can make use of the architecture provided. It is oftentimes better for efficiency to have dedicated instructions for complex tasks, but what tasks to choose is a debatable subject in itself. If one throws in everything but the kitchen sink, then it is often far from trivial to make an efficient hardware implementation of it in practice. In short, designing an ISA is all about compromises to reach a prespecified goal, and then making a good hardware implementation of that along the way. Then it is up to the market to find/make applicable software for it.
@ristekostadinov2820
@ristekostadinov2820 Рік тому
Are all these microcontrollers fabbed on the same process node (and by the same manufacturer)? For example, fabbing an M4 on 40nm versus 20nm will give different performance and power efficiency.
@GaryExplains
@GaryExplains Рік тому
The Arm one is on 90nm, the RISC-V on 40nm.
@rursus8354
@rursus8354 Рік тому
Good video. 13:59: Board A uses 20mA·26s = 0.52 Coulomb = 3.2448·10¹⁸ electrons to accomplish the task, and Board B uses 51mA·18s = 0.918 Coulomb = 5.72832·10¹⁸ electrons, so Board A uses only ~57% of the electrons that Board B uses. Therefore A is more efficient.
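The charge arithmetic in this comment can be checked with a short Python sketch. It assumes the currents and run times quoted in the comment (20 mA for 26 s and 51 mA for 18 s) and the standard figure of roughly 6.2415·10¹⁸ elementary charges per coulomb. The function and variable names are made up for illustration.

```python
# Charge moved through each board during the benchmark: Q = I * t.
ELECTRONS_PER_COULOMB = 6.2415e18  # approx. 1 / elementary charge


def charge_coulombs(current_ma: float, seconds: float) -> float:
    """Charge in coulombs for a constant current (mA) flowing for a time (s)."""
    return current_ma / 1000.0 * seconds


board_a = charge_coulombs(20, 26)  # figures quoted in the comment
board_b = charge_coulombs(51, 18)

print(f"Board A: {board_a:.3f} C, {board_a * ELECTRONS_PER_COULOMB:.3e} electrons")
print(f"Board B: {board_b:.3f} C, {board_b * ELECTRONS_PER_COULOMB:.3e} electrons")
print(f"A uses {board_a / board_b:.0%} of the charge B uses")
```

Comparing charge is only the same as comparing energy if both boards run from the same supply voltage, since energy is charge multiplied by voltage.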
@andrewsutton6640
@andrewsutton6640 Рік тому
How do these compare with x86 chips, specifically in running programs that are designed for x86?
@GaryExplains
@GaryExplains Рік тому
x86 chips can only run Arm and RISC-V programs using emulation. The opposite is also true.
@Zhaymoor
@Zhaymoor Рік тому
great video, thank you
@kayakMike1000
@kayakMike1000 2 місяці тому
Well... There are certain extras in your core implementation that will make a difference; stuff like the different caches and the coherency mechanism, the branch predictor, cpu internal bus and the bus arbiters, there's just so many extra internals that are all abstracted away in complex logic. Some of that complex logic is just more appropriate to implement in another program, i think some of the cpu caches are governed by a whole other "management engine" that runs its own firmware to keep track of the bits in the cache....
@adriancoanda9227
@adriancoanda9227 Рік тому
On the efficiency test, what data was used and where was it stored?
@darssmare915
@darssmare915 Рік тому
Nice. You mentioned design importance but, to reiterate, the designer of the microcontroller is important here. I think your results show STMicro expertise.
@ryan258147
@ryan258147 Рік тому
You also need to consider the code density. The firmware binary size is usually smaller using ARM Cortex compared to RISC-V or ESP32.
@mrrolandlawrence
@mrrolandlawrence Рік тому
There is also a compact version of arm called thumb which offers higher code density.
@JamesFraley
@JamesFraley Рік тому
Great video! I'd love to see one where you analyze just power efficiency. I use microcontrollers around my house to monitor just about everything. I'd love to know which would last the longest on a battery. They need WIFI so they can report in. But my requirements use very little processing. Just check the sensor and report in. Thanks!
@lepidoptera9337
@lepidoptera9337 10 місяців тому
You aren't doing anything around your house that requires more than a lemon battery's worth of power. What you would need, though, are low power drivers for your network, which are hard to get, it seems. Just use whatever works and plug it into the wall. Who cares about a couple of Watts of extra power consumption.
@gadlicht4627
@gadlicht4627 Рік тому
It might be better to run multiple types of programs, because different ones may draw different amounts of power.
@savejeff15
@savejeff15 Рік тому
About the ESP32 power consumption when powered by 3V3: cheap LDOs like the AMS11x consume quite a lot of power when voltage is only applied at the out pin. It is in the range of 3-10mA.
@GaryExplains
@GaryExplains Рік тому
And on the 3.3v pin?
@savejeff15
@savejeff15 Рік тому
@@GaryExplains Yes, cheap LDO voltage regulators like the AMS1117 consume power even when no voltage is converted by them. This is called "quiescent current" and can be found in all LDO datasheets. For the AMS it's between 3 and 10mA and is the main reason why cheap ESP32 boards consume above 1mA when in deep sleep. There are better ones available but they cost 60 cents, not 6 cents. I had to find this out the hard way when designing a battery powered ESP32-S3 board. It's enough to have 3V3 on the output pin of the LDO for this current to flow from the 3V3 output to GND through the LDO chip. It's a kind of leakage current. An easy way to fix this is to just desolder the LDO and power the board directly with 3.3V on the 3.3V pin.
@GaryExplains
@GaryExplains Рік тому
Thanks for the info, very helpful. 👍
@savejeff15
@savejeff15 Рік тому
@@GaryExplains no problem ;] power consumption is a bitch x]
@nitinj1234
@nitinj1234 Рік тому
Hi Gary, it would be really interesting to have you do an Intel Atom/E-core (Alder Lake/Gracemont) architectural deep dive video, and a comparison to Arm/RISC-V.
@MarquisDeSang
@MarquisDeSang Рік тому
They are too far apart to be compared.
@MatrixJockey
@MatrixJockey Рік тому
e-cores aren't efficient whatsoever
@adriancoanda9227
@adriancoanda9227 Рік тому
Ah, it can't be compared. Atom is an x86 CPU, and it depends on how the cache and FSB are set. Arm chips usually operate at max 0.5 volts, while Atom can go up to 2 volts on turbo, so it is definitely a different class of CPU. Ah, why not against an IA-64 CPU lol 😆 wanna see that race 😆
@MarquisDeSang
@MarquisDeSang Рік тому
@@adriancoanda9227 In the end, what makes the difference between slow and fast is 99% software. I would win that race if I am the programmer: I would use inline assembly, lookup tables with pre-computed values, would not miss the caches with visibility lists, local goto.... Software always wins.
@adriancoanda9227
@adriancoanda9227 Рік тому
@MarquisDeSang Not always. It still needs hardware to run on. I saw some remastered games meant to be played in the browser, targeting Chromebooks, and the launcher looped over 2 GB of data but targeted just one CPU core, so the loading screen took 10 minutes. I'd like to see you on a quantum PC. Your thinking won't apply there, because it is not a digital CPU; it is analog and capable of insane parallel computing. A portable one already exists, and it won't run any apps like you are used to. 😉
@darthrainbows
@darthrainbows Рік тому
May have already been mentioned, but Amps != power. When you change the input voltage to 3.3V and the current doesn't change, that indicates a change in power. I'm not familiar with these boards, so IDK what the initial input voltage was, but if we assume 5V, and the current doesn't change when switching to 3.3V, then that is a 34% decrease in power.
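A minimal sketch of that point, using P = V · I. The 20 mA reading is a hypothetical value chosen only for illustration; the 5 V starting voltage is the assumption stated in the comment.

```python
def power_mw(voltage_v: float, current_ma: float) -> float:
    """Electrical power in milliwatts: volts times milliamps."""
    return voltage_v * current_ma


current_ma = 20.0  # hypothetical reading, identical at both supply voltages

p_5v = power_mw(5.0, current_ma)
p_3v3 = power_mw(3.3, current_ma)
decrease = (p_5v - p_3v3) / p_5v

print(f"{p_5v:.1f} mW at 5 V, {p_3v3:.1f} mW at 3.3 V, {decrease:.0%} less power")
```

The percentage does not depend on the current chosen, because the current cancels out of the ratio.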
@GaryExplains
@GaryExplains Рік тому
Yes, of course, but that doesn't change the relative results, does it. What exactly is the point you are making?
@volodumurkalunyak4651
@volodumurkalunyak4651 Рік тому
@@GaryExplains Yes, it does change the relative results. The RPi Pico has a switching regulator, not a linear one like the other boards have.
@daniahmed
@daniahmed Рік тому
Gary, there was an article that I read about a week ago saying that Apple may be shifting away from ARM to RISC-V. Do you think Apple will switch to RISC-V or continue with ARM for the time being?
@GaryExplains
@GaryExplains Рік тому
If we read the same article it says that Apple is using RISC-V for some of its small co-processors, that is all. It is a good engineering choice, if it has to design bespoke hardware blocks then RISC-V is a workable solution.
@daniahmed
@daniahmed Рік тому
@@GaryExplains Maybe but that Article had some text about moving to RISC-V that Apple might be considering. Moving to RISC-V would benefit Apple in long-term as they wouldn't have to keep paying ARM for royalties or whatever deal they have with ARM. What's your take on this?
@GaryExplains
@GaryExplains Рік тому
No, that part was just pure speculation because otherwise it would be a boring article and no one would read it.
@daniahmed
@daniahmed Рік тому
@@GaryExplains ok, thanks for clarifying.
@psiah9889
@psiah9889 Рік тому
As I see it: Arm's been around for a while. It's had an awful lot of work put into its efficiency, power, etc. over the decades. RISC-V is new, and there isn't a lot of money in perfectly optimizing it (yet). The fact that it is at all competitive now is a good sign for things to come, but it's gonna need more time, work, and support to be fully realized in this regard.
@georgeh6856
@georgeh6856 Рік тому
This is good for RISC-V. It is comparing ARM which has been around (and refined) for decades with RISC-V which is quite new.
@bobweiram6321
@bobweiram6321 Рік тому
Can't you measure the current directly from the VCC and GND pins?
@toorero
@toorero Рік тому
I would have loved to see more different benchmarks hitting different areas of the MPUs, since concluding based on one very specific crypto-benchmark not even using floats seems quite off to me...
@GaryExplains
@GaryExplains Рік тому
LOL, other people complained when they thought I was using floats (as some MCU's don't have an FPU). I just can't win. UKposts comments for the victory! 🤪
@BruceHoult
@BruceHoult Рік тому
Outside of very specialised areas, almost no software uses floating point on desktop computers, let alone on microcontrollers! I've been programming professionally for 40 years and 99% of C programs I work on don't even have the word "float" or "double" in them. Gary's previous "Primes by division" benchmark was quite unrepresentative of normal programs, but this one sounds pretty good (I don't know if the actual source code is available?) so I for one applaud this change.
@Winnetou17
@Winnetou17 Рік тому
@@BruceHoult "almost no software uses floating point on desktop computers" u wot mate ? Browsers and games are "almost nothing" ? Though to be fair, I don't know much about other software, but I'd be surprised if these would be the only major ones. Still, I'd also say it's kind of irrelevant what desktop-level software use and then compare to what MCU-level software uses.
@BruceHoult
@BruceHoult Рік тому
@@Winnetou17 "outside of very specialised areas". Games and browsers are specialised. A lot of people run them, it's true, but they constitute a very small proportion of the lines of code written or programmers employed.
@GaryExplains
@GaryExplains Рік тому
Bruce, the code to Oceantoo is in my GitHub repo, there is also an accompanying video here on this channel.
@Shrek_Holmes
@Shrek_Holmes 4 місяці тому
frequency scaling with power usage isn't linear, it's exponential; it's better to have all of them at the same clock frequency
@GaryExplains
@GaryExplains 4 місяці тому
While I agree that it isn't necessarily linear, as far as I know that is only if the voltage changes with the frequency. In my testing I didn't only use extrapolation, I did clock them (where possible) at the same freq and the results correlated with my extrapolations.
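The usual first-order model for CMOS dynamic power, P ≈ C · V² · f, illustrates the point about voltage. The sketch below is not based on measurements from the video: the effective capacitance and the core voltages are hypothetical values picked only to show the shape of the relationship.

```python
def dynamic_power_mw(c_eff_nf: float, voltage_v: float, freq_mhz: float) -> float:
    """First-order CMOS dynamic power, P = C * V^2 * f.

    With C in nanofarads and f in megahertz the result is in milliwatts
    (1e-9 F * 1e6 Hz = 1e-3, i.e. watts scaled to milliwatts).
    """
    return c_eff_nf * voltage_v ** 2 * freq_mhz


C_EFF_NF = 0.1  # hypothetical effective switched capacitance

base = dynamic_power_mw(C_EFF_NF, 1.1, 100)        # 100 MHz at 1.1 V
same_voltage = dynamic_power_mw(C_EFF_NF, 1.1, 200)  # double f, same V
higher_voltage = dynamic_power_mw(C_EFF_NF, 1.3, 200)  # double f, V raised too

print(f"fixed voltage:  x{same_voltage / base:.2f} power for x2 frequency")
print(f"raised voltage: x{higher_voltage / base:.2f} power for x2 frequency")
```

Under this model, power grows in step with frequency when the voltage is fixed, and faster than that only when the voltage has to rise along with the clock.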
@chipcode5538
@chipcode5538 Рік тому
Gary, did you check the real clock speed of the RP2040? The maximum clock speed is 133 MHz but in the SDK it is set to 120 MHz because it is easier to get the correct clock for peripherals like the USB. Check SystemCoreClock in the SDK. Are you running the test from RAM or XIP? You will probably see a difference here.
@TheFerdi265
@TheFerdi265 Рік тому
There is actually even more fun stuff here: The chip has 2 PLLs; it sets one to 48MHz for USB, and one to 125MHz for CPU and bus clock. 125 is also much more manageable to get useful clocks for other peripherals as you said. You can also push the pico MUCH further than what it is specced for. I have run complex programs with PIO and PWM at 300MHz just fine running from RAM, and ~250MHz when running from XIP.
@AndersHass
@AndersHass Рік тому
I do wonder how much current ran through them at the same clock speed.
@drstrr
@drstrr Рік тому
It's pretty quiet on the Speedtest G channel. Any plans for new speed tests?
@GaryExplains
@GaryExplains Рік тому
Sadly, no.
@adriancoanda9227
@adriancoanda9227 Рік тому
Arm is a RISC chip. Also, RISC stands for reduced instruction set computer. Actually, you would need the same motherboard with a socket mount in order to exclude other factors in the testing. But even then, the fastest chip was at 240 MHz. I don't see where those can be of use, maybe in remote controls; elsewhere they are too slow. Or use them in an insane 999999999999x cluster, but then you will need damn fast cluster management running within the firmware.
@mecatronicsforeveryone9565
@mecatronicsforeveryone9565 Рік тому
I wish you included ESP32_S3 in the list.
@GaryExplains
@GaryExplains Рік тому
I wish that as well. I ordered an S3 board weeks ago, but it hasn't arrived yet. Having said that, I don't think the performance will be any different. I will include the S3 in my dual-core power/perf showdown video.
@FranzzInLove
@FranzzInLove Рік тому
Some feedback: - Current draw is not exactly directly proportional to clock frequency, for instance at lower frequencies, efficiency can be worse because there is some "idle current" that doesn't change much and becomes more important relative to the clock based current. So I think it would be better to set the clock frequency of the MCU at the same speed, and do the same tests at different clock speed (because they might have different sweet spots). - If the goal is to compare architecture and not simply the MCUs, I think this is only a fair comparison if the chips are manufactured using the same technology node, I do not know if it is the case. - I think measuring the board current instead of the MCU current is not great either, I don't know for those specific circuits, but there are many ICs which easily consume a few mA doing nothing, some of them even when they are "turned off" (shutdown current in datasheets is usually low, but not always). One way to measure just the MCU current would be to completely remove other circuits from the board (yes, it's more challenging, and destructive to the board).
@GaryExplains
@GaryExplains Рік тому
Some feedback on your feedback: - I did that in the previous video on MCU power efficiency. - The goal was to show the current state of RISC-V MCUs and to debunk the myth that just because a processor is RISC-V, it somehow means it is inherently better. - I covered that in the video and made the same point myself, did you miss that segment?
@FranzzInLove
@FranzzInLove Рік тому
@@GaryExplains Thanks for your reply, I had not seen the other video. Your graph at around 12 mins shows what I mean. For instance, at 240 MHz, rpico consumes 0.16 mA/MHz, while at 50MHz, it consumes 0.26 mA/MHz. Similar results are seen for ESP32. If it was linear, it would be the same number. That's actually a larger difference than I thought it would be. It is counterintuitive, but I believe MCUs tend to be more efficient at higher clock speeds (likely up to a certain threshold). Hence, comparing the energy usage at different clock speeds seems to favor the boards running at higher clock speeds. If the goal is simply to show that a RISC-V chip can be less efficient than an Arm processor, it is achieved, but then IMO, the title "Arm vs RISC-V? Which One Is The Most Efficient?" is a tad misleading. I was hoping to get a comparison of the efficiency of RISC-V compared to Arm, which would need to control the other parameters (especially the technology node, since it is likely a huge factor). Still an interesting video nonetheless. You did mention in the video that you measure the board current. Depending on what's on the board this may have a huge impact. I now had a quick look at some schematics and it looks like the boards are quite bare (though I'm not sure what exact board you use in some cases), so it may not be that important in the end. One thing I noted though is that most boards use an LDO while the Pico apparently uses a DC/DC converter. Boards that use an LDO should indeed have the same current going in at 5V as at 3.3V, however this should not be the case for the DC/DC converter. Efficiency of those LDOs is 3.3/5 = 66%, while efficiency of the DC/DC converter of the Pico is mentioned as "up to" 90% (though this varies with consumption). This is an advantage towards the Pico board, not related to architecture.
If you indeed measure the same current when supplying the Pico from 3.3V, it is either because the efficiency of the DC/DC converter is actually 66% as well, or because there is some leakage into the DC/DC converter when a voltage is applied to its output while its input is floating (which is possible since it is likely not an intended use case). Just to make it clear, I just wanted to provide some constructive feedback. I'm subscribed and enjoy watching some of your videos, I hope this doesn't come off as arrogant.
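The two mA/MHz figures quoted in this comment can be turned into a rough estimate of the fixed "idle" current the commenter describes. The sketch assumes a simple model, I(f) = I_fixed + k · f, which is an assumption made here for illustration and not something measured in the video. It also works out the ideal LDO efficiency when dropping 5 V to 3.3 V.

```python
# Figures quoted in the comment for the Pico: mA per MHz at two clock speeds.
points = {240: 0.16, 50: 0.26}  # MHz -> mA/MHz

# Convert back to total current at each frequency.
i_240 = points[240] * 240  # mA
i_50 = points[50] * 50     # mA

# Assumed model: I(f) = fixed_ma + slope * f, fitted through the two points.
slope = (i_240 - i_50) / (240 - 50)  # mA per MHz that really scales with clock
fixed_ma = i_50 - slope * 50         # frequency-independent current

# A linear regulator passes the same current in and out, so its best-case
# efficiency is simply Vout / Vin.
ldo_efficiency = 3.3 / 5.0

print(f"total current: {i_50:.1f} mA at 50 MHz, {i_240:.1f} mA at 240 MHz")
print(f"clock-dependent part: {slope:.3f} mA/MHz, fixed part: {fixed_ma:.1f} mA")
print(f"ideal LDO efficiency 5 V -> 3.3 V: {ldo_efficiency:.0%}")
```

With only two data points the fit is exact by construction, so the split between fixed and clock-dependent current should be read as an estimate and nothing more.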
@TorbjrnViemNess
@TorbjrnViemNess Рік тому
​@@FranzzInLove I agree; if the goal was indeed to compare the efficiency of Arm vs RISC-V, the best way to do it (aside from getting two different chips that are identical, apart from the CPU core - so same node, same class, same memory, same speeds etc.) would be to record the actual number of instructions executed for a given benchmark - i.e. the _dynamic instruction count_. This is the only meaningful number to look at when comparing one ISA vs another. Otherwise you're just comparing chip vs chip. And the direct comparison of cycle counts that was done in this video isn't realistic either, for the exact reason that Gary actually explained just before showing the comparison; memory systems are running slower than the cores themselves and often have a somewhat fixed latency when reading data (and instructions), so you'll typically waste more cycles waiting for memory when running the CPU at a higher frequency. So Gary: nice try and I really appreciate that you focus a bit on my field (MCUs) as well, but for this particular comparison it could've been a bit better - at least from a "comparing ISAs" point of view, from a "comparing MCUs" point of view it was great! :)
@samiam4039
@samiam4039 Рік тому
The comparison is not with new hardware. The VisionFive 2 board looks to be a 4 core RISC-V, and having a RISC instruction set allows for better parallel processing, making higher efficiency possible. The ability to boot from an NVMe and the concurrent processing will need better coding to achieve faster processing.
@GaryExplains
@GaryExplains Рік тому
What has booting from nvme got to do with the efficiency of RISC-V?
@samiam4039
@samiam4039 Рік тому
@@GaryExplains Just a big improvement in the VisionFive 2 board's efficiency. Not RISC-V specific. Currently no SoC boards have NVMe boot up and processing, not even Raspberry Pi.
@GaryExplains
@GaryExplains Рік тому
Nvme boot doesn't improve efficiency, it improves IO performance, which isn't related to RISC-V in any way.
@GaryExplains
@GaryExplains Рік тому
Also, I have a VisionFive 2 board, and looking at it there doesn't seem to be support to boot from NVME.
@tetraquark2402
@tetraquark2402 7 місяців тому
Just spent three months learning the wrong instruction set. I'm a bit miffed about it
@El.Duder-ino
@El.Duder-ino Рік тому
3:21 22/23 years ago? R u sure Gary?😂🤣🤣🤣 Anyway Gary, well done comparison, thx!
@Andrew-rc3vh
@Andrew-rc3vh 6 місяців тому
ESP32 also has an ultra low power processor.
@fjgaston
@fjgaston Рік тому
It would be interesting to know also the idle power consumption, it would give an idea of how the boards would behave when powered with a battery.
@justinhall7819
@justinhall7819 Рік тому
I was just thinking the current measurements aren't very useful because of all the extra stuff on a lot of those boards. Plus the esp32 are not known for low power. You would have to compare active current with the idle current of each board.
@tails4e
@tails4e Рік тому
Yes the delta power should show the true cpu energy used for the benchmark, maybe Gary can follow up?
@GaryExplains
@GaryExplains Рік тому
The tricky thing with a delta number is that a CPU can never actually be idle. Even doing nothing is still looping and reading instructions waiting to no longer be "idle". To help in this situation there are two general solutions. 1. Lower the clock frequency and the voltage. This is something that smartphones and laptops do. 2. Put the CPU to sleep, this is a feature MCUs tend to have and it is similar to 1 but not dynamic.
@tails4e
@tails4e Рік тому
@@GaryExplains Thanks for replying. The motivation for the delta is to see the difference between the dynamic power consumption of the CPU architectures. I take the point that the CPU is never really idle, but in the case of MCUs, at least the cores should be idle, or running no-ops. I think the data would be interesting nevertheless. Idle power in itself would be interesting, so all 3 data points tell a story: idle, full load, and 'full load - idle'. It's quite surprising that a 22 year old design/process can still beat a 2 year old one.
@GaryExplains
@GaryExplains Рік тому
I will look into this more and see if it is interesting enough for a follow up video...
@TheShorterboy
@TheShorterboy Рік тому
Your difference may be down to the compiler; you would need to check the assembler output with gcc -S
@fluiditynz
@fluiditynz Рік тому
Gary, M4 has an FPU speced on core. C3 has cryptographic modules. I'm very impressed with the quite new C3's placing on the list, but do you know if the C3 cryptographic processing components were used in your compiled code? This influences your results quite significantly.
@JustAnotherAlchemist
@JustAnotherAlchemist Рік тому
The cryptographic co-processor in the C3 accelerates very specific algos (SHA and AES), and needs to be expressly enabled in code through C headers as well as through the NVM configuration. His crypto algorithm is very custom(?), so I doubt it can even take advantage of the co-processor, let alone the fact that putting code forward that used the co-processor on the C3 would kill compilation for all the other chips, since the header would have definitions for C3 specifics.... unless of course Gary was a complete A-hole and put #if guards around that part of the code (which would absolutely give the C3 an advantage).
@stefandebruijn3167
@stefandebruijn3167 Рік тому
First off, nice that someone takes the time to do benchmarks; we can really use some more of that. However, I also think any benchmark that leaves out the different basic types is inherently flawed. An int32 benchmark is nice for pure int32 operations, but it still tells me nothing about int64, float32 and float64. For example, the ESP32 has an FPU for float32, but not for float64. It also leaves out any peripherals - but that's okay (if you need a certain peripheral you should just select on that)... For example, I have a few ESP-S2's here that use the TinyUSB stack. They are great, but whenever you feel like using the native USB instead of the hardware UART, it starts to eat up your CPU cycles like Cookie Monster... it'll be the same story for the RP, I suspect. Float especially can give very nasty surprises; I suspect it will be the same in terms of power consumption / efficiency.
@GaryExplains
@GaryExplains Рік тому
I think the general wisdom is that floating point code accounts for less than 1% of microcontroller code. So doing a test that focusses on floating point is inherently flawed.
@stefandebruijn3167
@stefandebruijn3167 Рік тому
​@@GaryExplains Where did you get that "general wisdom"? I know I've never seen it in my 30+ years of professional software engineering... Not saying it's incorrect, but in my experience it very much depends on the application how much floats are being used... Source? But even if it is correct, I don't think you understand how bad it really is. I actually did some benchmarks on the esp32 a while back, because I couldn't make heads or tails of the performance numbers. It has roughly 600 MIPS and just 1 MFLOPS (!) for common operations. That means that even if only 0.2% of your code is using floating point, it will consume 50% of your cpu power. It's that bad...
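The "0.2% of the code, about half the CPU time" claim can be sanity-checked with a few lines of Python. The sketch takes the commenter's own figures (roughly 600 MIPS for integer work and 1 MFLOPS for floating point) at face value; they are not verified here, and the function name is invented for this example.

```python
def float_time_share(float_fraction: float,
                     mips: float = 600.0,
                     mflops: float = 1.0) -> float:
    """Fraction of run time spent in floating point operations.

    float_fraction is the share of executed operations that are floating
    point. Time per operation is the reciprocal of the throughput.
    """
    int_time = (1.0 - float_fraction) / mips
    float_time = float_fraction / mflops
    return float_time / (int_time + float_time)


share = float_time_share(0.002)  # 0.2% of operations are floating point
print(f"share of run time spent on floats: {share:.0%}")
```

With those inputs the result comes out slightly above one half, which is in line with the "50% of your CPU power" estimate in the comment.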
@GaryExplains
@GaryExplains Рік тому
When I say general wisdom, I mean general wisdom, there isn't a particular source. However over the years I have seen multiple presentations that analyze real-world code and FP code is minimal, certainly on microcontrollers. That is why some microcontrollers don't even include an FPU, not needed really.
@stefandebruijn3167
@stefandebruijn3167 Рік тому
@@GaryExplains Right, and as I said, I'm no amateur, and I've seen a lot of issues with FP over the years. At the end of the day it doesn't matter what the exact percentage is: since FP is so much slower than integer operations (for obvious reasons), the effects on the application as a whole are still significant. Whether or not FP is required for applications at all is a totally different discussion. Again, such discussion is eventually irrelevant; the fact is that regardless if it's a good idea or not, people use it for everything from motion control to PID loops and from UI's to signal processing. That is why there's a tendency for vendors to add an FPU: because it is needed. ESP, STM32F4 seem to agree with me. The RP2040 does not have one.
@byteme6346
@byteme6346 5 місяців тому
I just created a NAS with a Raspberry Pi 4B and an external USB HDD. This would be a good application to verify an SBC is useful.
@canislupus616
@canislupus616 9 місяців тому
Is there a fully stable and official Python interpreter specifically tailored for RISC-V?
@GaryExplains
@GaryExplains 9 місяців тому
What do you mean by "specifically tailored for RISC-V"? What alterations do you want in this RISC-V specific version?
@canislupus616
@canislupus616 9 місяців тому
@@GaryExplains Thank you for your response. By "specifically tailored for RISC-V", I meant a version of the Python interpreter that's been optimized to run on RISC-V architectures, taking advantage of its specific features and instructions. Just as we have optimized versions or builds of software for different platforms or architectures (e.g., ARM, x86), I was wondering if there's an equivalent for RISC-V. Essentially, Any version of Python that might offer similar performance or other benefits when running on a RISC-V system.
@GaryExplains
@GaryExplains 9 місяців тому
Hmmm... I am not sure that Python has special optimizations for different architectures. I just downloaded the Python source code and I see very little code that is optimized for say SSE3 or SSE4 or AVX. There isn't much assembly language either. I see a little bit of x86 ASM code in one of the math libraries, but there isn't an equivalent for ARM64. It is just C code in general. 🤷‍♂️
@canislupus616
@canislupus616 9 місяців тому
@@GaryExplains Thanks Gary.
@zizlog_sound
@zizlog_sound 7 місяців тому
Languages such as Python defeat the purpose of efficiency.
@SlugCatLife
@SlugCatLife Рік тому
I don't feel like I got an answer. Also you did not list what the architecture (risc or arm) was in the graphs.
@GaryExplains
@GaryExplains Рік тому
The processor is shown along the x axis.
@michaelkaercher
@michaelkaercher Рік тому
In general, performance of RISC-V is not up to the standards of ARM. Full stop. But this battle does not stop today. ARM just announced that they will charge their customers in future based on device prices instead of for IP. That will drive up research in the area of RISC-V. I expect RISC-V to become a contender in the mobile phone space (low end) in about 3 years and in the high end market in 6-7 years.
@GaryExplains
@GaryExplains Рік тому
ARM has not announced anything of the sort. You are repeating a rumor published by the FT.
@michaelkaercher
@michaelkaercher Рік тому
@@GaryExplains It came from Softbank, the owner of ARM. Let us wait and drink tea. Maybe it is a hoax.
@GaryExplains
@GaryExplains Рік тому
Again, nothing official has been said by Softbank or Arm.
@michaelkaercher
@michaelkaercher Рік тому
Let us wait and drink tea. Btw. Enjoying most of your content. Great channel.
@lepidoptera9337
@lepidoptera9337 10 місяців тому
@@michaelkaercher I am waiting and drinking my tea while the attention trolls on UKposts keep asking me for all the love they didn't get from their Moms. :-)
@marcwagner3762
@marcwagner3762 Рік тому
What about some more RISC-V Boards...
@Chris-wf2lr
@Chris-wf2lr Рік тому
Why not use transistor count instead of energy used? There are too many variables. Assuming transistor numbers usually correlate with cost, that would ultimately show which architecture is more efficient for the theoretical cost of production (if they were on the same fab and the same node).
@GaryExplains
@GaryExplains Рік тому
Transistor count doesn't correlate in any meaningful way. It won't help you decide what size battery to use etc. Power usage is the most important thing, everything else is just statistics.
@LogioTek
@LogioTek Рік тому
Useful test, but not a good test on the topic of CPU core efficiency, for several reasons: 1. likely system bus speed differences between these (system bus interfaces to on-chip SRAM) obfuscate differences between true CPU core performance/MHz/Watt unless you downclocked all of them to the lowest common denominator system bus speed, 2. differences in flash memory/prefetchers further obfuscate CPU core performance unless you ran the benchmark from RAM, and even then some like the M3/M4 could use dual buses, 1 for data and 1 for instructions, making it unfair, 3. finally, at least some of these are probably manufactured on different process nodes.
@GaryExplains
@GaryExplains Рік тому
How would you suggest I resolve those issues?
@LogioTek
@LogioTek Рік тому
@@GaryExplains Actually I didn't finish watching when I commented. I see you ran all of them at 1MHz later to level the playing field, and I assume the system bus was dropped to 1MHz also, and that's a first important step. I would run all of these CPU cores at the lowest common denominator system bus speed. The second step is to link the code to run out of SRAM instead of Flash on all of them. That's probably the best you can do to isolate core performance efficiency.
@cheebadigga4092
@cheebadigga4092 6 місяців тому
Damn the M4 is really nice!
@GaryExplains
@GaryExplains 6 місяців тому
Indeed. I think it is my favorite Cortex-M processor!
@jpjude68
@jpjude68 Рік тому
Isn't power consumption also a function of speed though? I wouldn't be surprised if the microcontroller's power consumption is directly proportional to the speed.
@GaryExplains
@GaryExplains Рік тому
Of course it is proportional to clock frequency.
@hinz1
@hinz1 Місяць тому
Why no new 68k?? That was an actually nice CPU and now that silicon and transistors are cheap, it should be rather easy to make it go fast.
@TheLouKou
@TheLouKou Рік тому
Gary, please, you're killing me! It's ESPRESSIF, there is no X in there! XD
@happyatheists9361
@happyatheists9361 Рік тому
Dr gary
@TheFlashPod
@TheFlashPod Рік тому
I have to say that only the last plot (mWh to the task) makes at least some sense... But in general I would say that you cannot generalize these boards and compare them directly. MHz is not linear to power consumption. It's quite simple: the ESP32 boards can run at 240MHz and are therefore the fastest. It does not matter if the M4 can "compute more per MHz" if it is capped at 100MHz and therefore is still slower to do the task... If you are looking at power efficiency you probably do not need those high clock speeds anyway. You can power down the modem of the ESP and that will cut down the power substantially. If you want to compare the ESP32 to the M4, you should clock down the ESP to comparable levels and run the tests again.
@GaryExplains
@GaryExplains Рік тому
Hmmm... If you look at my previous video about microcontrollers you will see that I actually did change the clock speeds. While it isn't linear it is very close.
@PrivateSi
@PrivateSi Рік тому
The ISA does make a small difference, and the fetch-decode speed was a large factor up until high MHz and pipelined branch prediction. A clean(ish) slate approach to both the ISA and IMPLEMENTATION SPECIFICATIONS of RISC-V working in tandem is what gives RISC-VECTOR the edge. -- A proper Vector Processing specification instead of SIMD (an ISA DISASTER that should have stopped at SSE4 on the X86, and should never have been introduced into the ARM ISA)... A Vector processor would have been vastly preferable and the tech was well proven. -- A major benefit is to combine CPU + GPU programming into one (much more) bare metal ISA for both, eliminating a ton of API translations and JIT compilation. Short but quite efficient or very long and perhaps more efficient pipelines can be experimented with by LOTS MORE CHIP (PART) DESIGNERS, while developers get a STANDARDISED ISA.. -- Bare metal GPU Compute will be much EASIER. Integrated graphics and general purpose Vector processing complement each other, but software-only graphics systems using just the vector processor and a few CPU cores could be more efficient and good enough for web + office..
@GaryExplains
@GaryExplains 1 year ago
You think microcontrollers have branch prediction?
@PrivateSi
@PrivateSi 1 year ago
@@GaryExplains .. not yet, and hopefully never! I agree that on the microcontroller front RISC-V is no better than the Pi Pico spec. It's also less RISC than the Pico. The low-end RISC-V spec now includes basic, MMX-level integer SIMD, and probably FP SIMD once it's finalised and then extended, so it is quite bloated compared to the Pi Pico ISA. -- I'm an ARM fan but think the high-end RISC-V spec is a better idea (vectors vs fixed-size SIMD). RISC-V is an ARM killer; x86 never was... ARM is still the most likely x86 killer, but Intel and AMD will probably race to replace x86 with native RISC-V and emulated x86. 10s to 100s of smaller SoC designers and manufacturers will obviously also prefer RISC-V. -- Sadly, ARM's days are numbered. It may well have to abandon its ISA and many core implementation details when it too goes RISC-V. Open standards are very powerful forces. Look at the IBM PC, HTML + CSS, Unicode. For better or worse, these royalty-free technologies always dominate. -- I actually prefer 2-byte opcode ISAs using a few tricks, and vector processing over SIMD; it de-bloats the cache and pipeline. RISC-V is getting more bloated despite its lack of SIMD. Too many cooks spoiling the broth will be the reason RISC-V fails, if it does, which it probably won't. A (US) Big Boy could buy out the project I suppose, and ruin or bury it, but that's unlikely too.
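The "vectors vs fixed-size SIMD" distinction can be sketched as a strip-mined loop, where the code asks on each pass how many elements it may process instead of hard-coding a lane count. This is a simplified Python model only: `vsetvl` is a hypothetical helper loosely modeled on the vector-length-setting instructions of the RISC-V vector extension, and `vlmax` is an assumed stand-in for the hardware's maximum vector length.

```python
def vsetvl(remaining: int, vlmax: int) -> int:
    """Hypothetical helper: how many elements to process this pass.

    Simplified model: take as many as the hardware allows, up to what is left.
    """
    return min(remaining, vlmax)


def vector_add(a, b, vlmax: int):
    """Add two equal-length lists in chunks of at most `vlmax` elements."""
    if vlmax <= 0:
        raise ValueError("vlmax must be positive")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = []
    i = 0
    n = len(a)
    while i < n:
        vl = vsetvl(n - i, vlmax)  # chunk size chosen at run time
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
    return out


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5]
    b = [10, 20, 30, 40, 50]
    # The same loop works unchanged for any assumed hardware vector length.
    for width in (2, 4, 8):
        print(width, vector_add(a, b, width))
```

The point of the sketch is that the loop body never mentions a specific width, so the same code would run on a narrow or a wide implementation, whereas fixed-size SIMD code is typically written for one register width and needs a separate tail loop.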
@schrodingerscat1863
@schrodingerscat1863 1 year ago
Although the Arduino IDE hardware abstraction does a good job of providing a common programming interface, it is not really a good platform for performance comparisons. Some of these chips have a lot of functionality to improve performance per watt which isn't supported by Arduino HAL, and the HAL has to do a lot more work on some architectures, slowing down performance too. That said, it is clear that the now-ancient ARM architectures still hold up extremely well against the modern competition.
@GaryExplains
@GaryExplains 1 year ago
"Some of these chips have a lot of functionality to improve performance per watt which isn't supported by Arduino HAL" - Could you please give me some examples.
@schrodingerscat1863
@schrodingerscat1863 1 year ago
@@GaryExplains You can shut down the ESP32's entire radio circuitry if you have access to the low-level registers. This saves a lot of power even when the radio isn't being used. If you have access to the clock multipliers on the STM chips, you can tune them to give lower power consumption too. Your encryption algorithm may be able to take advantage of encryption hardware on some of the chips, which would make a big difference, but the HAL won't necessarily take advantage of it.
@GaryExplains
@GaryExplains 1 year ago
Well, you can shut down the entire radio circuitry using the Arduino HAL. In fact I tried that, and said so in the video. Switching on low-power idle modes isn't relevant to this test. Also, I used my encryption algorithm as an example of a heavy CPU load; it doesn't matter that it is about encryption. In my previous video I used finding primes, and in my next I might use nqueens. It isn't about using special HW encryption blocks, but about testing the CPU.
@schrodingerscat1863
@schrodingerscat1863 1 year ago
@@GaryExplains Turning off the radio is not a low-power idle mode; it is just turning off the Wi-Fi circuitry when the application doesn't require it. The rest of the chip runs at full speed and full power. It gives a better apples-to-apples comparison when testing, say, STM chips against the ESP. It's like when people compare the Pi Pico to others while ignoring the programmable IO, which is its most unique and powerful feature.
@GaryExplains
@GaryExplains 1 year ago
Hmmm... I seem to be repeating myself, one more go I guess: You can shut down the entire radio circuitry using the Arduino HAL. In fact I tried that, and said so in the video.
@AndersHass
@AndersHass 1 year ago
But which is more efficient in Minecraft, RISC-V or ARM? lol. Still, it's an important point that the implementation handling the instruction set matters way more than the instruction set itself.
@PeetHobby
@PeetHobby 1 year ago
That is a very slow M4. Is it a low-power version? Most M4s run at 168 MHz or 180 MHz or so. Edit: And the real power of the STM32 M4 is the FPU. Maybe you can do a floating-point test between the ESP32, RISC-V and the M4.
@riscy00
@riscy00 1 month ago
Low-power variants of the M4 have reduced performance, especially in the DMA and bus matrix, which have been simplified (to maximize battery life and reduce system power consumption, under some green initiative) compared to the workhorse F4 with its more parallel DMA and bus-matrix architecture. The reason for the move from M0 to M0+ is that interrupt handling on the M0 proved too limiting in the past, and the M0+ relieved some of this issue.
@alvinnorin8820
@alvinnorin8820 1 year ago
3:37 *reality checks myself*
@Schutti73
@Schutti73 1 year ago
I am waiting for a full-size PC with a RISC-V CPU.
@GaryExplains
@GaryExplains 1 year ago
Why? What will it give you that x86 or Arm don't do/have?
@Schutti73
@Schutti73 1 year ago
@@GaryExplains A useful PC, instead of a developer board that cannot do my everyday work, with an open ISA AND an open source OS like Linux. x86_64 and the ARM cores are not free.
@GaryExplains
@GaryExplains 1 year ago
@@Schutti73 When you say free, what do you mean?
@-Slade-
@-Slade- 5 months ago
It's kind of wrong to average out the performance. The ESP32, ESP32-S2 and ESP32-C3 have an adjustable clock (80 MHz, 160 MHz and 240 MHz). The newer ESP32-S3 can go as low as 10 MHz. You can set the ESPs to 160 MHz to compare them to each other. You can also average the time it takes for a fixed set of operations, etc.
@GaryExplains
@GaryExplains 5 months ago
But the point is the power efficiency per MHz, which is what I showed. I don't think you understood the video.
@kayakMike1000
@kayakMike1000 1 year ago
Hmmm ... Efficiency is largely dependent on the implementation and which extensions are used...
@GaryExplains
@GaryExplains 1 year ago
Did I not say that?
@kayakMike1000
@kayakMike1000 2 months ago
@@GaryExplains Yeah, sometimes I type out my thoughts before I watch the whole video. You did great.
@mementomori1868
@mementomori1868 1 year ago
It's not only about performance!!!! The biggest thing is that RISC-V is an OPEN SOURCE processor...
@GaryExplains
@GaryExplains 1 year ago
Really? You understand that only the document describing the instruction set is open source. What advantage does that give consumers?
@mementomori1868
@mementomori1868 11 months ago
@@GaryExplains Please read up (even just on Google) on why RISC-V and an open instruction set are so important.
@GaryExplains
@GaryExplains 11 months ago
@@mementomori1868 😂 Or please watch my videos as I have several about RISC-V and what it really is.
@ArniesTech
@ArniesTech 1 year ago
Both are amazing and exciting alternatives to X86 💪🙏