RISC vs CISC - Is it Still a Thing?

175,402 views

Gary Explains

1 day ago

People have often debated the pros and cons of CISC (Complex Instruction Set Computer) vs RISC (Reduced Instruction Set Computer), but is that debate still valid today?
Introduction to Android app development: www.dgitacademy.com
Let Me Explain T-shirt: teespring.com/gary-explains-l...
Twitter: @garyexplains
Instagram: @garyexplains
#garyexplains

COMMENTS: 523
@paulk314 5 years ago
I'm an engineer at ARM (actually just about to end my work day and clicked on this video) and this was a great explanation of all these concepts. I actually didn't know about delayed branch instructions, cool! I was also surprised to learn that branch prediction didn't become standard practice until a while after it was thought of. Neat!
@BruceHoult 5 years ago
Hi from SiFive :-) In CISC processors, branch prediction started with the 80486 and 68040, and heated up a bit in original Pentium and PowerPC, but really wasn't very good then -- maybe something like a 20% or 30% misprediction rate. Intel cracked the problem with the Pentium MMX and Pentium Pro (SUPER SECRET SAUCE back then) with essentially what we use today with 2% or 3% misprediction rate.
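Those percentages translate directly into cycles. A back-of-the-envelope sketch in Python; the 20% branch mix and 15-cycle flush penalty are my own assumptions, not figures from the comment:

    # Effective cycles-per-instruction for a pipelined CPU where a fraction
    # of instructions are branches and some of those are mispredicted.
    def effective_cpi(base_cpi, branch_frac, mispredict_rate, flush_penalty):
        # every mispredicted branch costs a full pipeline flush
        return base_cpi + branch_frac * mispredict_rate * flush_penalty

    for rate in (0.30, 0.02):  # roughly the early vs modern rates above
        cpi = effective_cpi(base_cpi=1.0, branch_frac=0.20,
                            mispredict_rate=rate, flush_penalty=15)
        print(f"{rate:.0%} mispredicted -> effective CPI ~ {cpi:.2f}")

On those made-up numbers, going from a 30% to a 2% miss rate takes you from roughly 1.9 down to about 1.06 cycles per instruction, which is why cracking prediction was such a big deal.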
@boriscat1999 3 years ago
SH had branch registers, you could manually load your branch destination in advance before jumping to it. Giving you some of the advantages of a delay slot and much shorter encodings (less duplication) for conditional branches.
@paulk314 3 years ago
@Dr ROLFCOPTER! Knowing about branch delay slots is rather arcane knowledge, given that it only existed on some RISC architectures and definitely isn't how modern processors work. The majority of my ARM knowledge was for AArch64, not 90s-era technology. Anyway, on second viewing, the concept does sound familiar, though I think what I am remembering is possibly the (closely related) concept of load delay slots. At Arm I was a verification engineer and I had to possess a detailed understanding of the architecture, including concepts like virtualization, multiple stages of address translation, exception handling, and about a thousand other things specified in an architectural reference manual that was over 7,000 pages, not to mention the extensions. I had to understand how all these features interact and to design tests that worked across a variety of implementations in order to stress the microarchitectural features including branch prediction, speculative execution, caching, etc. I had to carry around a lot of knowledge in my head, so I guess a few details like load delay slots that haven't been used for decades slipped my mind. And the reason I'm speaking in the past tense about ARM is because I decided to accept an offer at SiFive, and am now working on learning the ins and outs of yet another architecture.
3 years ago
@@paulk314 Nice one Paul, and extremely well done getting a job at SiFive - an incredibly innovative company that's, quite literally, "Leading the RISC-V revolution".
@pnachtwey 2 years ago
I programmed a TI DSP, the C30. It had delayed branches, where up to three instructions could be placed after a jump. This could get tricky if the jump was conditional. In some ways it was like the conditional execution part of the ARM machine codes. The C30 could use 3 registers in an instruction. For a CISC type of DSP I think it was pretty good for its time.
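For anyone who hasn't met delay slots, here is a toy Python simulation of a delayed branch with a single delay slot (the program is invented for illustration; on the real C3x a delayed branch exposes the three following instructions):

    # Each instruction is a tuple; "brd" is a delayed branch whose jump only
    # takes effect after the next instruction (the delay slot) has executed.
    program = [
        ("set", "r0", 1),    # 0
        ("brd", 4),          # 1: delayed branch to index 4 ...
        ("add", "r0", 10),   # 2: ... but this delay-slot add runs first
        ("add", "r0", 100),  # 3: skipped, the branch has landed by now
        ("halt",),           # 4
    ]

    regs, pc, pending = {"r0": 0}, 0, None
    while True:
        op, *args = program[pc]
        if op == "halt":
            break
        if op == "brd":
            pending = (1, args[0])          # jump after 1 more instruction
        elif op == "set":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] += args[1]
        if pending is not None:
            count, target = pending
            if count == 0:
                pc, pending = target, None  # the delayed jump lands here
                continue
            pending = (count - 1, target)
        pc += 1

    print(regs)  # {'r0': 11}: the slot ran, the instruction after it didn't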
@laustudie 5 years ago
First time I actually understood the difference between CISC and RISC, thanks mate
@shirshanyaroy287 5 years ago
@Z3U5 Off-topic but I feel like I've seen you on Quora XD
@kpsayyed84 3 years ago
Same
@abstractapproach634 2 years ago
@Z3U5 That's sad, but common. It's because the students dream of leaving academia (thus the brightest aren't teaching). It may have been part of why I studied Mathematics: the students' passion for the subject is reflected more brightly in the instructors, at least when you get higher up. (Pro tip for kids: go to a community college first, then to uni, and you end up with very few student teachers.)
@green4free 5 years ago
As you said, x86 is moving more towards RISC with things like micro-ops. But it goes the other way too: with things like vector instructions (NEON) and other more complex instructions, ARM is moving towards CISC as well. I think everyone is just aiming for that sweet spot
@Waccoon 5 years ago
Microcode and nanocode have been used since the beginning, and that's not what was introduced with the Pentium Pro or Pentium M. What's changed in modern processors is the idea that nanocode can be reordered and cached independently of the main ISA, so the processor is effectively translating the main ISA into a new ISA before execution. It's way more advanced than what any RISC processor is doing, and the idea was probably inspired by (or stolen from) the now defunct Transmeta line of processors.
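The translation idea is easy to sketch. A minimal illustration, assuming an invented "add_mem" instruction and micro-op format (nothing here is Intel's real encoding):

    # Crack a memory-to-memory CISC-style instruction into RISC-like
    # micro-ops that a scheduler could then reorder and cache independently.
    def crack(instr):
        op, dst, src = instr
        if op == "add_mem":  # e.g. an "add [dst], [src]" style instruction
            return [
                ("load",  "t0", src),          # t0 <- mem[src]
                ("load",  "t1", dst),          # t1 <- mem[dst]
                ("add",   "t1", "t1", "t0"),   # t1 <- t1 + t0
                ("store", dst, "t1"),          # mem[dst] <- t1
            ]
        return [(op, dst, src)]                # simple ops pass through

    for uop in crack(("add_mem", 0x1000, 0x2000)):
        print(uop)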
@nextlifeonearth 5 years ago
Vector instructions aren't necessarily CISC though. If it has to be divided into individual instructions (fetch, op, store etc.) it is not really CISC. SIMD is RISC compatible.
@pwnmeisterage 5 years ago
Most instruction extensions are primarily intended to expand the advertised feature sets which sell more processors ... they might technically be categorized as RISC (or RISC-based or RISC-compatible, whatever) ... but their use in non-synthetic applications is infrequent and specialized enough that most of the time they're little more than inert silicon and inflated transistor counts ... which basically undermines all the advantages offered by RISC philosophy.
@gazlink1 4 years ago
@@nextlifeonearth Yup.. vector operations make the CPU GPU-like, not CISC-like. And no-one is going to call GPUs outdated, or inefficient, quite the opposite. Vectorisation of instructions is great.. for parallelisable operations. "Parallelisable" is just the same benefit multi-core CPUs have, but this is even more efficient; it's just a x4 or x8 of floating point or integer units, to be better (faster/more efficient) at processing larger numbers. If anything it's somewhat more RISC-like than Intel's "just make one core faster no matter the power needed" approach.
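To make the x4/x8 point concrete, here is a pure-Python stand-in for a 4-lane vector add; the lists and the vadd4 helper are illustrative, not a real SIMD API:

    a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    b = [10.0] * 8

    # Scalar: eight separate adds, one per "instruction".
    scalar = [x + y for x, y in zip(a, b)]

    def vadd4(xs, ys):
        # Stands in for a single 4-lane vector add instruction.
        return [x + y for x, y in zip(xs, ys)]

    # Vector: the same work in two 4-wide "instructions".
    vector = vadd4(a[:4], b[:4]) + vadd4(a[4:], b[4:])
    assert vector == scalar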
@user78405 4 years ago
But everyone is going to do both... Intel's cores are wider for the full instruction set, while AMD's are not wide enough to do every RISC-style instruction in one clock cycle; it can do it in 2 clock cycles per core. That's going to hurt AMD's future when AR and next-gen Windows arrive around 2021... that will be the end of AMD's dominance on x86 again, because Intel played this right from the beginning by making sure RISC-style instructions are the way forward. With ARM competing hugely, Intel will end up winning the war from both sides, while AMD will be the victim of its own small, selfish mistake. 128-bit is very important, not for memory but for programs; future instructions will no longer depend on the GPU, and that's when Nvidia is going to see this as a huge threat
@RafaelKarosuo 4 years ago
Thank you for referencing the "RISC I: A REDUCED INSTRUCTION SET VLSI COMPUTER" paper here; your quotes and sharp descriptions also really helped. Great work on distilling this comparison.
@Sunshrine2 3 years ago
Well, this just aged like wine.
@hotamohit 3 years ago
just like Gary
@jcdentonunatco 3 years ago
Why? Everything he said is still pretty relevant. The RISC vs CISC battle will continue for decades
@Sunshrine2 3 years ago
@@jcdentonunatco Well, that is precisely what “aged like wine” means: good became better. The other expression would be “aged like milk”.
@jcdentonunatco 3 years ago
@@Sunshrine2 lol sorry thought you were being sarcastic
@Sunshrine2 3 years ago
@@jcdentonunatco No, no, I insert the needed /s if I do that on the internet :D
@RonnieBeck 5 years ago
Concise, informative and well spoken. Thanks for the awesome explanation!
@JeremyChone 3 years ago
Wow, what a great explanation. Love the last bits about the first instruction splitters and how it relates to heat/power.
@amiralavi6599 5 years ago
Only you can explain such complex stuff in such a simplified manner.
@davejoubert3349 5 years ago
I appreciate that you are hinting to your viewers the beautiful layers that sit between the instruction set and the silicon.
@nimrodlevy 5 years ago
Your lectures are always a delight! Many thanks super interesting!!!
@JayanandSupali 5 years ago
I just felt like my brain was fed with very soft baby food of Info. My dear friend Gary, you did an exceptionally good job at simplifying this for anyone to understand. Again, ThanQ so much :-) #BigFan
@NexuJin 5 years ago
Interesting video. Kinda brings me back to when the Pentium came out; back then lots of computer magazines were writing about "Is RISC dead?". 20 years later, the ARM processor is what made the majority of the population actually use a computer without calling it a computer!
@jasonknight1085 5 years ago
Yes, but with SIMD extensions, NEON, virtualization instructions, the new execution level, MMU, LSE, VFP, etc, etc, can ARM really be called RISC anymore? Of course it still pisses me off that they keep adding all that stuff, but still won't provide a simple flipping set of string operations. 12 clocks even with NEON to move 32 bits from one memory address to another, or 18 clocks just to send from memory to a port (when looping) is ridiculous; hence something like a 180MHz M4 being barely on par with a DX2/66 in actual delivered computing power unless you sit there clock-counting at the ASM level (don't expect GCC to produce anything worth a damn...). Just like how a 1GHz A7 is about on par with a 450MHz P2 in compute per clock. Wouldn't even be useful if it didn't run circles around CISC in compute per watt... though that is the real point of it. But what's the old joke? RISC is for people who write compilers, CISC is for people who write programs...
@Waccoon 5 years ago
PowerPC was a major disappointment, and is what really sealed the fate of RISC on the desktop. It wasn't nearly as fast as promised, IBM and Motorola kept fighting each other with incompatible extensions, and I remember just how damn HOT they ran, too. ARM is okay for mobile stuff, but not really competitive on the desktop (and by that, I mean workstation). There's a reason why ARM has helped to take the "computer" out of computer. But, hey, ARM can do what they always do... make a new alternate ISA and continue to make their processors even more complicated and less RISC-y.
@lookoutforchris 1 year ago
@@Waccoon The distinctions you're using are 20+ years out of date. RISC v CISC is a meaningless phrase today. Read the Ars Technica article on this from 1999.
@Waccoon 1 year ago
@@lookoutforchris It's not meaningless. The differences between RISC and CISC are almost entirely related to instruction encoding, not microarchitecture, so at the low level most RISC and CISC processors are designed and perform much the same way these days. However, the differences do matter, as the way compilers generate code for each design has to be fundamentally different if you want good performance and efficient use of the caches. I've read whitepapers and studied ISA encodings for more than 20 CPUs, so trust me, I'm not going to learn anything new from some watered-down article from 20 years ago.
@hoberdansilva2894 2 years ago
I used to program microcontrollers in the 90s in assembly language; RISC architecture was really fast and excellent for simple applications. But when things were a bit more complicated I preferred the x86 family, as the instruction set really simplified things for me. I'm really happy to see the evolution of those architectures through the years....
@cedartop 3 years ago
Your explanation of RISC reminded me of my student days, when we had to program assembler on an 80C31. There you really do all that stuff, including writing the exact address to write to or read from in RAM.
@UncleRichie101 5 years ago
Got to be honest, I knew nothing about this before today. 😊 Now much more informed, thanks professor. 😁 Found this fascinating and you did an amazing job of breaking it down in a way that would be easily understood. 😊 P.s. truly this is something every computer fan should be aware of. Thanks so much for making me much more informed than I was before. I really learn a lot from your channel. 😁
@NexuJin 5 years ago
People from the Macintosh/IBM-compatible era should still know the differences... or at least be aware that there is a difference.
@sennabullet 3 years ago
Awesome!!! Thank you Gary Sims!!!
@MarsorryIckuatuna 3 years ago
Wow, that was a lot of information right there. I lived through that period and only had the basics covered. Awesome video.
@GaryExplains 3 years ago
Glad you enjoyed it!
@robinbanerjee3829 5 years ago
Excellent video! Thanks a lot. Keep it up!
@JJDShrimpton 5 years ago
An excellent video, thank you Gary.
@44r0n-9 3 years ago
What a great video! So easy to follow and explains stuff I didn't even know I wanted to know.
@feedmyintellect 3 years ago
Thank You!!!! You are great at explaining complex things!!!
@Luredreier 5 years ago
Nice video. =) While I knew all of this already, I don't think I'd be able to express myself as well and explain this as clearly as you did in this video.
@Handskemager 2 years ago
So refreshing that someone actually explains it and explains the x86 splitting CISC instructions down to RISC instructions to be put down the pipeline.. ty! Will be referring to this when one of my friends doesn’t get it.
@sureshkhanal3801 5 years ago
Knowledgeable ❤.
@pixannaai 3 years ago
The best explanation ever. Thanks! keep going!
@antonnym214 3 years ago
I designed a Minimal Instruction Set architecture with only 16 instructions (4-bit opcode): ADD, AND, NOT, OR, SHR, SUB, XOR, LDA, PSH, POP, STA, RDM, JC, JN, JV, JZ. If you are familiar with an 8080 or Z-80, it's like a cut-down version of that. All ALU operations leave their result on the stack, which is convenient because PSH and POP to/from the stack are great for transferring between registers. No CMP (compare) is needed because a compare is nothing but a SUB where you don't care about the result, only the flags. All good wishes!
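The compare-is-a-subtract trick is easy to show. A minimal sketch, assuming 8-bit two's complement and the C, N, V, Z flags the jump instructions above test (function names are mine, not from the design described):

    def sub_flags(a, b, width=8):
        mask = (1 << width) - 1
        sign = 1 << (width - 1)
        result = (a - b) & mask
        flags = {
            "Z": result == 0,                          # zero
            "N": bool(result & sign),                  # negative (sign bit)
            "C": a < b,                                # borrow out
            "V": bool((a ^ b) & (a ^ result) & sign),  # signed overflow
        }
        return result, flags

    def cmp_(a, b):
        _, flags = sub_flags(a, b)  # throw the result away, keep the flags
        return flags

    print(cmp_(5, 5))  # Z set: operands equal
    print(cmp_(3, 7))  # N and C set: 3 < 7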
@richo13 5 years ago
Great video Gary, I learnt a lot
@lastmiles 5 years ago
Always a pleasure to listen to someone that knows what they are talking about. All the way down to the wires.
@BILLYZWB 2 years ago
Cheers man, really helpful description!
@BobDiaz123 5 years ago
I like how the very simple 8-bit PICs deal with a jump. Most instructions take 1 cycle, but any branch or jump clears the pipeline and takes 2 cycles. Microchip is going for very low-cost chips, so the delay of an extra cycle for a jump helps to keep the chip cost down. PICs are embedded in many products and are used a lot.
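A quick sketch of what that extra cycle costs over a loop; only the 1-cycle/2-cycle figures come from the comment, the loop size is made up:

    # Cycles for a tiny loop on a PIC-style core: ordinary instructions
    # take 1 cycle, any taken jump takes 2 (pipeline refill).
    body_instructions = 5   # straight-line work per iteration (assumed)
    iterations = 100        # (assumed)

    with_refill = iterations * (body_instructions + 2)
    if_jump_were_free = iterations * (body_instructions + 1)
    print(with_refill, if_jump_were_free)   # 700 vs 600 cycles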
@kshitijvengurlekar1192 5 years ago
Hey there Gary! Young as always
@takshpatel8109 1 year ago
One of the best teachers on hardware stuff.
@DaywaIker 4 years ago
Thank you Gary!
@shikhanshu 5 years ago
such an amazing video!
@thekakan 5 years ago
I love the fact that you included the microcode part. A lot of people talking about this topic entirely forget that fact. Thanks :)
@Arthur-qv8np 5 years ago
ARM processors (which use a RISC ISA) also use micro-ops. Micro-ops are not related to RISC vs CISC ISAs, but to whether or not the architecture is superscalar.
@thekakan 5 years ago
@@Arthur-qv8np IIRC, the only microcode that ARM processors have is related to THUMB instructions. And yeah, I was talking about micro codes :x
@thekakan 5 years ago
Interesting. Well, if RISC processors start including microcode to "simplify" instructions, wouldn't that make them the same as CISC ones? Anyway, I hope it doesn't happen. Compilers can do amazing stuff, and the smaller the instruction set is, the easier it is for the compiler to optimize, or so I think it should be.
@Arthur-qv8np 5 years ago
@@thekakan From my point of view, RISC and CISC only define the philosophy of the instruction set. This does not define the characteristics of the architecture. Obviously, the instruction set and the architecture are closely related, but it's orthogonal. You can design both CISC or RISC processor with the same kind of architecture. For example with Out of Order, speculative execution, branch prediction, register renaming, simultaneous multithreading, vector instructions, micro ops, microcoded cpu, pipelined cpu, ..
@dlwatib 5 years ago
Excellent explanation. RISC was a very elegant solution, but I think CISC has inherent advantages that will win out in the end. CISC programs can be shorter because the instructions can more closely express the programmer's (or at least the compiler's) intentions. That's valuable information. Techniques like branch prediction depend on having that information available. A shorter program also means fewer instruction fetches from memory, so less load on the memory bandwidth. In other words, CISC programs have more dense and less distorted information content than RISC. RISC represents a premature optimization technique. It distorts and bloats the information content of the program for the purpose of making it easier for a specific machine implementation to understand and process. But machine instruction architectures last longer than a specific machine implementation. Historical artifacts like delayed branch instructions become needless complexity later on. Of course, the x86_64 architecture, a CISC architecture, has plenty of historical artifacts of its own that also add needless complexity, even bugs. But there's no reason to encourage the accumulation of cruft.
@sheelpriyagautam8333 5 years ago
CISC programs have fewer instruction fetches but require extra logic to parse them, as they have variable length. Plus, these variable-length instructions are split into multiple instructions and there is no optimal way of doing it except brute force; this requires even more extra logic. This makes CISC machines spend time on doing something which is not calculation, which makes them more power hungry and less efficient. At this point, I really can't see any benefit to x86 except compatibility. I wonder how ARM processors became so fast in the last couple of years. Perhaps it's because the hardware limitations for which RISC was designed are no longer there.
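The "brute force" point is that with variable-length encodings you can't know where instruction i+1 starts until instruction i has been at least length-decoded. A sketch with invented encodings (not real x86):

    LENGTHS = {0x01: 1, 0x02: 3, 0x03: 2}   # opcode byte -> total length

    code = bytes([0x02, 0xAA, 0xBB, 0x01, 0x03, 0xCC])

    def decode_variable(stream):
        pc, out = 0, []
        while pc < len(stream):
            n = LENGTHS[stream[pc]]
            out.append(stream[pc:pc + n])
            pc += n          # the next start depends on this decode
        return out

    def decode_fixed(stream, width=4):
        # Fixed width: every boundary is known up front, so a wide decoder
        # can attack all of them in parallel.
        return [stream[i:i + width] for i in range(0, len(stream), width)]

    print(decode_variable(code))   # [b'\x02\xaa\xbb', b'\x01', b'\x03\xcc']

Real x86 front ends work around the serial dependency by speculatively length-decoding at many byte offsets in parallel and discarding most of the work, which is exactly the extra logic being described.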
@erikengheim1106 3 years ago
If CISC will win out in the end, then why are no new CPU designs CISC? Why is RISC taking over supercomputers and servers, and now entering the desktop market? It seems to me CISC is gradually painting itself into a corner.
> CISC programs can be shorter because the instructions can more closely express the programmer's (or at least the compiler's) intentions. That's valuable information.
The experience seems to be the opposite. One of the reasons RISC began to rise was that they discovered compiler writers were not very good at picking and utilizing these more complex instructions. Meanwhile, juggling registers, of which the RISC CPUs have many, is something compilers have gotten a lot better at. Contrary to your representation of reality, CISC was to a larger degree made to facilitate people hand-writing assembly code. RISC OTOH is designed with the assumption that code will be compiled.
> RISC represents a premature optimization technique.
I would say it is the opposite. CISC prematurely optimizes instructions in hardware, an area not easily or quickly changed. RISC programs in contrast can be optimized simply by getting better compilers that arrange the instructions in a more optimal fashion.
> A shorter program also means fewer instruction fetches from memory, so less load on the memory bandwidth.
RISC has tricks to get around this. E.g. ARM uses the Thumb format, which makes most common 32-bit instructions take 16 bits. That means you double the number of instructions you can fit in memory. In addition, the large number of registers in RISC processors allows them to reduce the number of load and store instructions, further reducing required memory for instructions.
> In other words, CISC programs have more dense and less distorted information content than RISC.
Very odd way of putting it. RISC instructions are not distorted information. They tend to be simple, orthogonal instructions. CISC introduces the bloat by creating complex instructions requiring complex silicon to decode. For a compiler writer it is easier to compose things out of simple building blocks than to hunt down specialized instructions. I think the CISC approach really only benefits people doing compilation by hand, not real compilers. If I was writing assembly code by hand I would likely prefer CISC. Having played with different assembly code, I would say the Motorola 68000 was hands down the easiest one for me to ever use. I kind of like AVR, but it being RISC is of course a bit cumbersome, having to do so many things with multiple instructions. But there is a certain beauty in knowing every single instruction takes 1 cycle. I can easily count how long a particular segment of code will take to execute. No such luck with CISC. For real-time systems that is pretty nice.
@BillCipher1337 4 years ago
Wow, you have explained it perfectly; now I finally understand the whole thing about CISC and RISC :)
@alliejr 5 years ago
Well stated. 👍🏽
@Ko_kB 2 years ago
Great explanation.
@profounddevices 5 months ago
I like the explanation of RISC vs CISC in this video. RISC has come a long way, in part from faster RAM and greater cache, or even tightly coupling RAM on the SoC. I know first hand that Intel CISC specifically handles the loading of registers for AVX and SIMD faster. Intel might be doing a wrapper to convert CISC to RISC in microcode, but the SIMD steps and AVX loading are streamlined. Loading registers on ARM RISC is complex and slow. When this is solved for RISC, CISC may not be needed anymore. It is these specific high-performance operations keeping it alive.
@dogman2387 5 years ago
Great video!
@patrickdaxboeck4056 5 years ago
In fact modern x86 CPUs are internally RISC machines with a translating layer to the outside. AMD was the first to do so for its 64-bit instructions and later Intel followed the same path. On the other hand, the classic RISC CPUs have so many instructions now, and special additions for e.g. math, that they are not really RISC anymore.
@GaryExplains 5 years ago
The way you say "in fact" makes me wonder if you watched the video because I talk about this subject in the video.
@patrickdaxboeck4056 5 years ago
Dear Gary, you are right; just before the end of the video you talked about the micro-ops of Intel CPUs, and that was just after I sent the comment.
@PEGuyMadison 4 years ago
Actually... Intel was... sort of.. they purchased this technology from Digital Equipment which was used on the DEC Alpha chips. This was introduced into the 2nd generation of Pentium Processors which outperformed the P52C in so many ways. So there is the history... long before AMD there was DEC Alpha... which is now owned by Intel.
@gazlink1 4 years ago
.. and they still have CISC instructions going into them. If they have a RISC-like inner core, surrounded by a (pointless?) CISC-to-RISC converter, then that just changed the definition of what a modern CISC architecture does with CISC instructions, redefining what CISC architectures.. are. ... But it's still a CISC architecture. And all that stuff surrounding the RISC-like core - that's why iPads are so damn capable and smooth at such low power.
@PEGuyMadison 4 years ago
@@gazlink1 CISC instructions are far more packed and lower power compared to RISC instructions... plus with CISC you get a much higher dispatch rate of instructions achieving higher parallelism and usage of independent functional units within a CPU.
@eterusilvers3919 3 years ago
Wow you are really good at explaining complex stuff! :)
@CommandLineCowboy 5 years ago
No mention of micro-code and larger register sets. When RISC designs were first envisioned in the early 80s these two things were often talked of in the RISC vs CISC conversation. Micro-code was using a bank of ROM memory in the processor whose stored bits matched gates that controlled the flow of bits, bytes and words between the various registers, logic, address and data busses. The add-one-to-a-memory-location example might have several stages: connect the two bytes of the location from the current instruction register to the address bus, strobe the address bus read, connect the data bus to the accumulator, connect the accumulator to the increment logic, connect the accumulator to the data bus, strobe the write to store. The micro-code ROM could have an arbitrarily long sequence, and this enabled complex instructions. The first ARM processor eschewed micro-code; its decoding was all logic. By not spending transistors on micro-code they could spend the transistors on more registers. The 32-bit x86 is a register-poor device. x86 code is full of instructions that push and pop data off the stack, because there are few registers to store intermediate results of calculations. The ARM had 16 32-bit registers, of which up to 14 were available to store intermediate results. Much less accessing of the, by then, slower memory. The first microprocessors were only as fast as memory; by the time of the 80286 and certainly the 80386, PC motherboards could contain fast static RAM to be used as processor cache because the CPUs were losing performance waiting for slower memory. Keeping data in registers is the ultimate cache. Avoiding the need to load data from external memory saves at least a couple of clock cycles, more if the memory isn't cached.
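That add-one-to-memory walk-through reads naturally as a microcode listing; here it is as a toy Python table (step wording mine, following the comment above):

    # One ROM word per step; complex instructions simply get longer
    # sequences, which is what made rich CISC instruction sets cheap to add.
    MICROCODE = {
        "INC_MEM": [
            "drive address bus from the instruction's address field",
            "strobe memory read",
            "latch data bus into accumulator",
            "route accumulator through the increment logic",
            "drive data bus from accumulator",
            "strobe memory write",
        ],
        "MOV_REG_REG": [
            "route source register onto destination register",
        ],
    }

    for name, steps in MICROCODE.items():
        print(f"{name}: {len(steps)} micro-steps")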
@dlwatib 5 years ago
Machines with 16 registers were the norm for CISC machines even before RISC was invented. Even machines that didn't have a 32-bit word size still usually had 16 general purpose registers of whatever word size they did have (though register 0 was often hardwired to hold a 0 value). The register-poor x86 architecture was the exception, not the rule.
@CommandLineCowboy 5 years ago
@@dlwatib Been doing a bit of Google research. Probably the most common 16-register processor was the 68000. Also the IBM 360, VAX and NS32016. I've only worked as a programmer on 68000 and x86 machines. Any other CISC processor types with 16 registers I've missed? I would argue "the norm" for most people was an x86 machine or an 8-bit micro at the time of RISC's introduction. Having a little trouble finding processor production numbers to justify my assumption. Any graph of the number of processors built would skew to 8-bit types because of the huge number of embedded processors. By the time the Macintosh, Amiga and Atari ST had popularised the 68000, the PC was dominant. A list of CISC processor types might show many 16-register types, but in the actual computers people used, 9 out of 10 would be register-poor x86, Z80 and 6502.
@pallavprabhakar 3 years ago
Amazing video!!!
@MegaLazygamer 5 years ago
7:49 x86 (Intel specifically) has an overwhelming dominance in the server market. The POWER architecture used by IBM in servers and such is an edge case.
@denvera1g1 5 years ago
IA64/Itanium might be more common than PowerPC (now); I know many banks, financial institutions, and even EDU (where I work) used those, even though the only things they would support were a special version of Server '08 and some specially kerneled versions of Linux/BSD. It isn't directly related to CISC or RISC, and during development it was thought it would eventually displace both RISC and CISC for servers, workstations and desktops. The main feature of this ISA was that it could execute multiple instructions per cycle, per thread. From what I understand, the Intel-HP EPIC/IA64/Itanium/VLIW was sort of like hyperthreading on top of hyperthreading, which is what started in the Itanium 9500 series and later, where the processors came with hyperthreading so up to 4 instructions could be run on each core in a single cycle, and in theory you could have even more instructions per thread. Imagine this architecture being used on that demonstration-purposes Intel processor that had, what was it, 8 threads per core on a 4-core daughter board. I think they canned it around the time Xeon Phi started getting into the 50-core mark, probably a year or two before the launch of the x100 series, because more cores is always going to outperform more threads per core.
@skilletpan5674 5 years ago
@@denvera1g1 Yes. A CPU core needs to dump whatever is being done by the other threads when it executes a jmp. So unless your code is highly optimized, it'll mean that 2 (or however many) threads need to be stopped and restarted (as I understand hyperthreading) every time a jmp needs to happen. This is why jump prediction etc. has been so heavily worked on by Intel and AMD over the last few decades. I wonder how much the neural network thing AMD use now really helps?
@boriscat1999 3 years ago
Two of the top 3 supercomputers are POWER, and x86 only just makes it into the top 5 of supercomputers. I think x86's dominance in the server market doesn't translate to dominance in all markets, especially if the requirements become very specialized, like a supercomputer or a mobile device. Ultimately it's not the architecture of x86 that drives this, it's the licensing and available software. Building an x86 server means you can optionally sell a Windows license with it, have a broader market, and make more profit. It's hard for a company to make a modern x86 without stepping on patents, and there is no way to license it from Intel (or AMD). ARM can be licensed by anyone and adapted to the special needs of a product. And IBM works very closely with vendors to adapt their POWER chips to meet the specialized requirements of supercomputers.
@squirlmy 3 years ago
Are you responding to 7:07? His point was that it is silly to even base "CISC won the war vs RISC" on the server market. So, you are defending this in spite of it just being an indicator of an ignorant viewpoint? Are you trying to confirm: "I really was making a really dumb argument but I was right about that particular point"? Is that what you're trying to say?
@squirlmy 3 years ago
@@denvera1g1 also many years ago PowerPC dominated the automobile computer market. And that's several chips per auto. I learned this at the time Apple was leaving PowerPC for Intel, and they may still be dominant in autos. If your argument (whatever is being argued!) doesn't include cars, trucks, airplanes, large appliances, etc. you're not making a good argument about the CPU "multiprocessor" market.
@Dsnsnssnsnsjej 6 months ago
Thank you. 👍🏻
@GaryExplains 6 months ago
You're welcome! 👍
@Raul_Gajadhar 4 years ago
Towards the end you are right, because in 2000 the Pentium 4 had 42,000,000 transistors, but something changed somewhere, because the next year, in 2001, Intel started to make the Itanium processor with 25,000,000 transistors. I really enjoyed this presentation.
@louiscouture9139 4 years ago
Good explanations, I like it.
@baghdadiabdellatif1581 1 year ago
Thank you
@DemiImp 5 years ago
"ARM v8 consists of 3 ISAs: 64-bit AArch64 and 32-bit ARM, which is further divided up into A32 and Thumb (16 and 32). 32-bit ARM is clearly CISC-y: variable-length instructions, instructions that read/write multiple registers (push/pop), and a variety of odd instructions in Neon (floating point), just to name a few. These complex instructions crack into a variable number of ops, which is a no-no in RISC. AArch64 cleaned up much of the ISA, but left in plenty of things that are CISC-y: load/store pair, load/store with auto increment, arithmetic/logic with shifts, vector ld/st instructions in Neon to do strided reads/writes, etc. Again, fairly CISC-y. ARM instructions encode more information than, say, your typical DEC Alpha instruction; it's closer to x86 than Alpha/SPARC in that sense. I think the RISC vs CISC lines have been blurred for over a decade, ever since out-of-order execution went mainstream. The advantage of RISC is clear in in-order machines. In OoO machines, not so much, with the sole exception of fixed-length instructions. ARM came from the embedded-system world where lots of assembly code is handwritten. Accordingly, their ISA reflects the common usage patterns. At any rate, CISC-y instructions are preferable to some of the oddball ISA choices made by early RISC ISAs (register windows, branch delay slots, load delay slots, reciprocal step instructions, etc)."
@erikengheim1106 3 years ago
Interesting take, but based on my reading this also seems a bit pedantic. E.g. Thumb follows a very RISC-like philosophy in reducing cache usage. Rather than adding complex instructions to reduce memory usage, they came up with Thumb, which is just a compressed version of a subset of their 32-bit instruction set. It is not a new instruction set as such. As far as I understand, RISC is based on the 80/20 kind of rule: 20% of the instructions are used 80% of the time. Hence they try to keep those 20% of instructions equal in length and execution time to make the pipelines work effectively. Maybe I am wrong about this, but I would bet that the specialized longer-clock-cycle ARM instructions are used in special contexts, where one might use a lot of them. As far as I understand, you want to keep feeding as many equal-sized instructions in a row as possible. They may pull that off even with variable-length instructions if these instructions are not usually mingled a lot with other instructions. Obviously there must be something very RISC-like about a lot of the ARM instruction sets when their ARM Cortex-M0 can be implemented in a mere 12,000 transistors, which is HALF that of an Intel 8086, despite being a 32-bit processor with a way higher clock frequency.
@DemiImp 3 years ago
@@erikengheim1106 The quote I pasted was talking about how ARM really isn't RISC anymore. It once was, but it has bloated a lot in the past decade.
@erikengheim1106 3 years ago
@@DemiImp Let me rephrase my point, since I don't think it got across. If you add a bunch of functional programming constructs to an object-oriented programming language, does that language then become functional? Or if you add a bunch of object-oriented features to a functional programming language, does it become object-oriented? There are many ways of answering such a question. You have the pedants who operate on one-drop rules: e.g. if there is only a hint of object-oriented features, the whole thing must be categorized as object-oriented. Then there are the pedants who say that now both languages are multi-paradigm, so the OOP and functional divide no longer makes sense! Yet at this point the pedant has lost track of why humans engage in taxonomy in the first place. If you end up categorizing every single programming language as, say, multi-paradigm, then your categorization criteria are worthless. Your taxonomy adds no value to people looking at a landscape of different technologies and trying to reason about them. Hence I believe taxonomies should be pragmatic. They should be based on heuristics rather than hard rules. Just because you add a bunch of non-RISC-like instructions doesn't mean that the core of the instruction set wasn't designed around the RISC philosophy. It is kind of like adding OO features to a functional language: it does not change the fact that the core has been centered around functional thinking.
@DemiImp 3 years ago
@@erikengheim1106 I would disagree. C++ is C but with more OO design. Are you suggesting that C++ is not actually an OO language? The defining characteristic of a RISC architecture is that it has close to the minimum set of instructions needed to be Turing complete. The moment you start adding in more instructions, the less RISC it becomes and the more CISC it is. Modern ARM ISAs are not very RISC-y.
@erikengheim1106 3 years ago
@DemiImp Definitions should be useful. Defining RISC as being about as few instructions as possible isn't a useful definition. The PowerPC G3 had more instructions than the Pentium Pro, e.g., yet it was a very RISC-like architecture because it used instructions of equal size, requiring the same number of clock cycles etc., all to make pipelining easier. It was designed around reducing load and store instructions by using many registers. This is a very RISC-like philosophy.
@rafaelangelopiad8919 3 years ago
thank you
@jimreynolds2399 3 years ago
Worth mentioning that while modern processors have billions of transistors, compared to back in the 80s, the vast majority of those are for L2 and L3 cache rather than CPU implementation. I think they also include graphics functions now as well. I remember the 6502 - it had only about 3,500 transistors.
@cmilkau 5 years ago
It is probably worth noting that IA64 (not x86_64) is actually kind-of a modern VLIW arch. Kind-of as it still hides implementation details that a pure VLIW would not, like the real number of ALU units.
@TeenyPort 3 years ago
Gary, did your UKposts channel name/idea actually come from that post on the "Instructions Per Cycle" Android Authority video you did 4 years ago?
@nyambelilonga555 4 years ago
Very informative
@tunahankaratay1523 3 years ago
The thing is that nowadays you're probably better off focusing on instruction-level parallelism rather than crazy complex instructions. Most complex stuff can be offloaded to a dedicated coprocessor anyway (things like cryptography, video encode/decode, AI etc.). And those coprocessors can be completely powered down when not in use to save tons of power.
@onisarb 5 years ago
We find new information yet again!
@edwardpkps 3 years ago
Great video Gary! It seems like Arm-based chips still have a very low market share in the data centre and PC markets. I keep hearing people say x86 is more powerful; I wonder what your thoughts are on that? Does the instruction set have a direct relation to computing power?
@GaryExplains 3 years ago
At the moment the majority of Arm-based processors are designed for smartphones. This is just the beginning of a shift towards laptops, desktops and servers. You can't compare the performance of a smartphone processor running from a battery with passive cooling to a desktop processor running from mains power with a huge fan on it. Probably the best example we have at the moment is Amazon's Graviton 2 processor. It has shown better perf/watt than many Intel chips.
@nawmeerahman8574 5 years ago
I like watching your videos. They are very good and informative.
@BruceHoult 5 years ago
The x86 "instruction decode tax" hasn't mattered on the desktop for a long time with just one or even four or six cores. It's very noticeable (as you allude to) as you go smaller. Small 32 bit microcontrollers such as ARM Cortex M0 and SiFive E20 and PULP Zero RISCY have a similar number of transistors to a 16 bit 8086 (29000), as did the first few generations of ARM chips. The smallest modern x86, the Atom, apparently has 17,000,000. This matters both when you want to put some teeny tiny CPU in the corner of another chip, and also when you want to have thousands or tens of thousands of CPUs and need to supply them with electricity and cooling.
@allmycircuits8850 3 years ago
I'm pretty sure most of those 17,000,000 transistors are used for cache of various levels. But the instruction decode tax exists nonetheless, just not as huge as one could think :)
@BruceHoult 3 years ago
@@allmycircuits8850 No, 17 million is just the core. Including cache etc., the Silverthorne CPUs had 47 million, and Lincroft 140 million after the GPU and DDR controller were moved on-die.
@alexlimion2624 3 years ago
brilliant!
@valenrn8657 5 years ago
Modern x86 CPUs have instruction compression for their RISC cores. Many x86 instructions in AMD Jaguar CPUs are single-cycle throughput.
@gazlink1 5 years ago
From my understanding there's another downside to CISC, sort of second-order effects from the ones mentioned. Having to turn CISC into RISC on the fly, to get at the RISC-like micro-ops, is also the reason x86 has a much harder time (complexity of design, number of transistors, amount of silicon, and power consumption) with branch prediction and out-of-order execution. Each incoming CISC instruction can take any amount of time to execute once decoded, and can jump anywhere else in the list of CISC instructions that make up the program, each of which again needs to be decoded into micro-ops, whereas RISC knows what all the "micro-ops" will be; they're already written as RISC instructions. I guess with an equally complex compiler you can create programs that are just as friendly to a modern x86 core as to a RISC core, but you can get more unoptimised "dogs" of programs on CISC than on RISC, where optimisation will be easier and more dependable. Microcode may change with updates on CISC, making the needed optimisations change over time. This is probably part of the reason that Intel (and AMD to some extent) suffers from so many security leaks in regard to their branch prediction and speculative execution; it's a much more complex job to implement than with RISC architectures. All that spare microcode silicon will always be a power hog that ARM doesn't need, even when some of it is sitting there doing nothing most of the time because it's for legacy operations that are used infrequently.
@AykevanLaethem 5 years ago
ARM also suffers from these processor security issues. It's just that only their very high end processors are affected, because only those processors match a typical x86 processor in complexity and performance. So in a sense, it's the complexity and performance optimizations that led to the security issues, not ARM vs x86 (or RISC vs CISC).
@Arthur-qv8np 5 years ago
"This is probably part of the reason that Intel (and AMD to some extent) suffers from so many security leaks in regard to their branch prediction and speculative execution; it's a much more complex job to implement than with RISC architectures." Not really: speculative & out-of-order execution and branch prediction are performed on the micro-ops (which live in the out-of-order world). CISC instructions are issued in order. And ARM is also affected by vulnerabilities that exploit speculative and out-of-order execution.
@avid0g 5 years ago
The vulnerability that this thread is referring to is the lack of housekeeping on abandoned register and cache data: an error in thinking that hidden or obscure is the same as secure. The patches applied to this problem are at the software level. The insecurity has been built into the hardware for years...
@squirlmy 3 years ago
@@Arthur-qv8np I think you may have written that without reading the comment above. They're probably vulnerable at the level of smartphones and some Chromebooks, but not much else. Really, it seems like none of this would be an issue today except that Intel has so much IP built on old decisions about CISC vs RISC "approaches". And ARM vice versa.
@Arthur-qv8np 3 years ago
@@squirlmy I wrote this comment a while ago, and I haven't developed it much. These vulnerabilities (like Spectre/Meltdown) are really not a problem of CISC vs RISC; it's a problem in the concept of superscalar processors (processors that read a scalar instruction flow but manage to execute these instructions in parallel by exploiting instruction-level parallelism (ILP)). The problem is that this sort of processor, which uses OoO & speculative execution, makes the isolation between threads more complex than we thought. All superscalar processors have been affected to some degree, depending on their microarchitecture design. The parts of the micro-architecture that are responsible for the issues are not in the instruction decoder or even in the execution units of the processor, but rather in the memory system (in a very broad sense) and in how the transient instructions affect this memory system. So it's really not because of CISC. The CISC only makes the instruction decoder more complex, not the rest of the micro-architecture. The more optimization the processor contains in this memory system, the more it is exposed to potential vulnerabilities. Intel processors contained a lot of this optimization (especially bypasses to access data faster). We can blame Intel's security section for not figuring out these problems, but we can't really blame the designers who created these optimizations for not having thought of security flaws that didn't exist in the first place (the Meltdown flaw actually exploits a behavior described in an Intel patent). Superscalar processors from ARM, IBM or AMD are much less advanced in this type of optimization and have therefore been less affected. The processors have been affected at their "optimization" level (can we still talk about optimization when we talk about a behavior that adds a flaw?). For these reasons Intel's CISC ISA is not involved; a RISC ISA would have caused Intel the same problem. It is important to understand that the ISA is only a very small part of a processor; its memory system is much larger and causes many more problems (processors have a computing power limited by the memory system - memory is the bottleneck).
@Jorge-xf9gs 2 years ago
What do you think of ZISC as a general purpose "architecture"?
@IslandHermit 3 years ago
Another motivation behind RISC was that the CPU real estate freed up by using a less complex instruction set could be used to add more registers - a LOT more registers - which would speed up computation by reducing memory accesses and would also allow compilers to do much heavier optimization.
@fredrikbergquist5734 1 year ago
That is, in my opinion, the real reason RISC was so successful: it was in some way implementing a cache with very few logic elements! Implementing a cache algorithm in hardware is difficult; here the cache is effectively managed by the compiler, which can optimize it in a better way and might analyze how the program will most likely run. Today, with billions of transistors and a cache in three layers, that might not be so important, but still, giving the compiler a lot of control is a good idea.
@BruceHoult 5 years ago
A pretty fair video. One interesting point is that x86 with instructions from 1 to 15 bytes long does not actually have more compact code than modern RISC such as ARM Thumb or RISC-V which have both gone radically away from having a single 4 byte instruction length by ... having both 2 and 4 byte instructions. Radical! That's enough to let both of those have more compact code than i686. Interestingly, the very first RISCs, the IBM 801 project and RISC-I also had both 2 and 4 byte instructions.
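For reference, RISC-V length decoding really is this simple: if the two lowest bits of the first 16-bit parcel are 0b11 the instruction is 32 bits, anything else is a 16-bit compressed instruction. In Python:

    def rv_instruction_length(parcel16):
        # parcel16 is the first (lowest-addressed) 16 bits of an instruction
        return 4 if (parcel16 & 0b11) == 0b11 else 2

    print(rv_instruction_length(0x4501))  # c.li a0, 0            -> 2 bytes
    print(rv_instruction_length(0x0513))  # low half of addi a0, x0, 0 -> 4 bytes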
@Arthur-qv8np 5 years ago
"does not actually have more compact code than modern RISC such as ARM Thumb or RISC-V" you mean "RISC-V C extension", right?
@BruceHoult 5 years ago
@@Arthur-qv8np Yes. It's pretty much only student projects or tiny FPGA cores that don't implement the C extension. Once you get to even a couple of KB of code you get overall savings by implementing C.
@erikengheim1106 3 years ago
Bruce, what is the average length of executed x86 instructions though? I have been trying to understand how much Thumb matters, but it is hard to assess without knowing the length of the typical x86-64 instruction. And also, how much does a 15-byte-long instruction matter if it does the job of 30 RISC instructions? I am a RISC fan, but I want to make sure I understand the tradeoffs properly.
@BruceHoult 3 years ago
@@erikengheim1106 This talk (and the referenced paper) discusses this: ukposts.info/have/v-deo/gZmQpHuPgoGKtps.html
@erikengheim1106 3 years ago
@@BruceHoult Thanks, interesting video. From what I could gather, x86 would have 3.7 bytes per instruction on average. Their RISC-V compressed instruction set got down to 3 bytes per instruction on average.
@boriscat1999 3 years ago
To me, the critical difference between CISC and RISC is that on CISC you have a fairly complex set of possible state transitions for bus access (on the 8080/Z80 this was formalized as T-states). The point that RISC doesn't access memory to do operations is how this plays out. In CISC you might have a complex operation that loads a value, adds a number to it, and writes it back. That means the bus access will have to follow that read/wait/write sequence. Versus another instruction that simply reads a value and stores it in a register: that only has a read cycle. Two different sequences of state. As you get more complicated instructions you end up with even more possible sequences. RISC generally doesn't need to pick from a broad set of possible sequences and does every operation (roughly) the same, or in a more asynchronous fashion.
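A toy rendering of that idea; the sequences are invented for illustration, in the spirit of the T-states mentioned above:

    # Each CISC instruction shape implies its own bus-cycle sequence, while
    # a load/store RISC sticks to a couple of fixed patterns.
    BUS_SEQUENCES = {
        "add [addr], imm": ["read", "modify", "write"],  # read-modify-write
        "mov reg, [addr]": ["read"],                     # simple load
        "add reg, reg":    [],                           # no bus traffic
    }

    for instr, seq in BUS_SEQUENCES.items():
        print(f"{instr:16} -> {seq if seq else '(internal only)'}")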
@sin3r6y98 5 years ago
x86 today is largely moving towards more and more RISC concepts, with complex instructions decoded in microcode. If Intel really wanted to make a low-power phone chip they could, by simply removing all the backwards-compatible CISC instructions they emulate from the 80s and 90s. AMD did this a long time ago; if anyone was more poised to make low-power chips it'd be them. But the reality is that ARM has largely already won this front. There's no reason for Intel or AMD to try to compete in that market alongside ARM, as it would require a lot of porting effort, and for it to be convincing x86 would not only have to be power-comparable but also provide significant advantages. With how long ARM has spent focusing on performance-per-watt, I'm not entirely sure that's really even possible.
@RonLaws 5 years ago
People forget though that ARM don't manufacture the processors. They design the specifications for other companies to make and implement as they see fit. Intel have in the past produced ARM CPUs; all HP PocketPCs from around 2003, for example, shipped with an ARMv5TE part which was an Intel chip (the Intel XScale) using the ARM licensed specifications.
@mrrolandlawrence 5 years ago
As soon as Apple get their laptops / desktops onto ARM processors and Windows 10 support for ARM improves... x64 will be a dying breed.
@PsychoticusRex 5 years ago
Is operable memory still a thing? Where instead of reading the memory first, you just send the number to add or subtract to the memory and it does the simplest maths on it, since the CPU doesn't care what the result is at that point. It'd streamline CPU-memory communication and inter-thread communication, preventing unnecessary blocking.
@rndompersn3426 5 years ago
I remember years ago reading on a tech website that the Atom CPUs in mobile phones were actually more power efficient than the ARM CPUs. This was a while ago.
@GaryExplains 5 years ago
Since you mentioned it I went looking and found this: www.tomshardware.com/reviews/atom-z2760-power-consumption-arm,3387.html from 2012. I would take such a report with about 3 metric tons of salt.
@mas921 5 years ago
I still have to deal with low-level optimizations from time to time, but on GPUs. Which made me think, Professor: in mobile, how dominant (or not) are SIMD instructions vs the usual "logic" RISC instructions? Because when I was watching the video on my Note 8 I thought "RISC instructions are showing the professor on my screen right now lol... oh wait..." ...then I realized there is actually a SIMD video decoder taking care of that! And then there is the GPU for the UI, the DSP for the camera... etc. So I was really intrigued by how much SIMD is actually "running our multimedia-rich mobile world"; hence it might not be RISC vs CISC in 2019 as much as the dominance of SIMD/co-processors ;)
@GaryExplains 5 years ago
Yes I agree, a smartphone has a processor with not only a RISC CPU, but also an FPU, SIMD, cryptography extensions, a DSP and an NPU, plus the GPU. But there is also a separate video processor (decoder) and a separate display processor!
@sir.sonnyprof.ridwandscdba227 3 years ago
What kind of chip architecture do they use for a quantum computer? For example, Samsung will release the Samsung A Quantum phone... what kind of chip architecture is in that quantum CPU? Thx
@GaryExplains 3 years ago
The Samsung A Quantum isn't a smartphone with a quantum computer, it just has a built-in hardware based random number generator: www.androidauthority.com/samsung-galaxy-a-quantum-1118992/
@JNCressey 5 years ago
9:15 Although, when the prediction isn't done exactly right, you get vulnerabilities like Spectre and Meltdown.
@glitchysoup6322 5 years ago
Can you cover RISC-V CPUs?
@GaryExplains 5 years ago
Yes, I plan to cover RISC-V soon.
@centuriomacro9787 5 years ago
I have a question regarding the transistors of a CPU. You said that there are billions on a die. What I want to know is the following: are the transistors already wired up to provide specific numbers of AND, NAND, OR, XOR gates, or can they be put together to perform a specific logic operation at any time? And how would they do that?
@Arthur-qv8np 5 years ago
In a conventional CPU, the transistors are fixed and their interconnections are also fixed. In contrast, in an FPGA you can configure the connections between logic blocks, called Look-Up Tables (LUTs). But once configured, it's set for the run.
@centuriomacro9787 5 years ago
@@Arthur-qv8np Thx for your answer. What does fixed in their interconnections mean? That they are printed together as logic gates?
@Arthur-qv8np 5 years ago
@@centuriomacro9787 Exactly :p
@insoft_uk 5 years ago
I wonder if Intel have plans to put their CPUs into a micro-instruction mode, a RISC mode where the CPU fetches the actual microcode from memory, so they could use it as a CISC or RISC CPU for the mobile industry. Perhaps not; I've not really looked into x86 for years and so much has changed. One time I came across a claim that AMD converted the x86 instructions into AMD's own internal instructions before executing them; I don't know if that's true or if I misread something at the time, but it would be interesting to know how AMD are doing their CPUs
@datasailor8132 5 years ago
Typically RISC computers are hard-coded whereas CISC computers are micro-coded. As the presenter alluded to, the individual steps of a CISC instruction are read from the microcode memory. There was one aside that I have a quibble with: he kept referring to completing one instruction per clock cycle, whereas the real objective is to start one instruction per cycle. You just need to add more pipelines. Part of Patterson's rationale for all this was that very few compilers used the complex addressing modes. Well, he was using Dennis Ritchie's C compiler, and Dennis didn't use a lot of the instructions in the VAX. Remember Berkeley was a big UNIX shop with its BSD, Berkeley Software Distribution, operation. I was a developer at Bell Labs in New Jersey in those days and a lot of people were aghast at the amount of resources we'd throw at a project. These included people like Dennis, Ken Thompson, Brian Kernighan, and especially John "Small is Beautiful" Mashey. We'd talk in staffing levels of man-millennia. Old Bell Labs joke: "Why does Dennis Ritchie use a text editor?" Answer: "Because the compiler won't accept code from the standard input." Interesting fact: in the very early days the UNIX source code was found in /usr/dmr.
@soylentgreenb
@soylentgreenb 5 років тому
CISC + RISC-like micro-ops won single-threaded performance on the desktop. CISC is like memory compression and saves a lot of external bandwidth. The failure of Dennard scaling makes this choice less obvious now; sometimes you'd rather have more cores than power-wasting front ends, especially for portable stuff.
@1MarkKeller
@1MarkKeller 5 років тому
*GARY!!!* *Good Morning Professor!* *Good Morning Fellow Classmates!*
@GaryExplains
@GaryExplains 5 років тому
MARK!!!
@ZamanSiddiqui
@ZamanSiddiqui 5 років тому
MARK!
@KaushalNiraula
@KaushalNiraula 5 років тому
Gary!!! Mark!!! Zaman!!!
@1MarkKeller
@1MarkKeller 4 роки тому
@@ZamanSiddiqui *ZAMAN SIDIQUI!*
@1MarkKeller
@1MarkKeller 4 роки тому
@@KaushalNiraula *KAUSHAL NIRAULA!*
@kristeinsalmath1959
@kristeinsalmath1959 5 років тому
If I understood correctly, the heat generated on CISC chips is caused by the complexity of the microcode, while RISC chips take less effort to perform one simple instruction?
@Waccoon
@Waccoon 5 років тому
Microcode and nanocode have been around forever. Modern CISC can re-order the micro-ops before caching and execution, while RISC tends to still be hardwired. Ultimately, RISC vs CISC is a checklist of design features, and there's no hard line of separation between the two; hence why there's so much confusion. Both styles of chips have huge amounts of redundancy and parallelism. The only "real" difference between CISC and RISC these days is that CISC can combine memory access with a computation (orthogonal instructions), while RISC strictly separates the two (load/store instructions).
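A sketch of that last distinction (the assembly in the comments is indicative x86 and RISC-V, not taken from the video):

```c
#include <stdint.h>

/* The same source line compiles differently on the two styles.
 *
 *   CISC (x86): one orthogonal instruction touches memory and computes
 *   (counter in rdi, value in eax):
 *       add [rdi], eax
 *
 *   RISC (load/store): the memory access is split out explicitly
 *   (counter in a0, value in a1):
 *       lw  t0, 0(a0)
 *       add t0, t0, a1
 *       sw  t0, 0(a0)
 */
void add_to_counter(uint32_t *counter, uint32_t value) {
    uint32_t t0 = *counter; /* load  (its own instruction on RISC) */
    t0 += value;            /* compute */
    *counter = t0;          /* store (folded into the add on x86) */
}
```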
@sybaseguru
@sybaseguru 3 роки тому
Great explanation, thanks. Not sure about your conclusion though. The Ryzen 3600 proves that you can get fantastic performance without massive power requirements. RISC keeps sticking its head up and gets it chopped off very quickly. It's probably cheaper and quicker to design and fabricate, so it uses the latest technology, whilst CISC is a generation behind but so much quicker. DEC, IBM, Intel, MIPS, Sun, HP all tried in their heyday and got trashed. The only time RISC wins is when very low power is the first, second and third priority.
@RayDrouillard
@RayDrouillard 5 років тому
Since modern processors have an internal speed that is several times quicker than their ability to fetch from memory, the CISC method of performing several operations per instruction has a distinct advantage.
@Arthur-qv8np
@Arthur-qv8np 5 років тому
"running several processes per instruction has a distinct" I think you're discussing about superscalar architecture, not CISC ISA. Some ARM processor are superscalar too. (with a RISC ISA)
@saisrikargollamudi7892
@saisrikargollamudi7892 5 років тому
I know what Moore's Law is, but can you explain how it was responsible for propelling growth in the semiconductor industry? And does transistor count really correspond to the power of a chip, or are there other parameters to it? I would like to see a video from you on these topics, sir. Great video as always.
@avid0g
@avid0g 5 років тому
Moore's Law was always a result of industry advances, not a cause. The current emphasis on core count and parallel processing is a reflection of the difficulty of fabricating smaller features in silicon, i.e. yield vs scrap.
@saisrikargollamudi7892
@saisrikargollamudi7892 5 років тому
@@avid0g thanks David.
@gordonlawrence4749
@gordonlawrence4749 5 років тому
One other advantage of RISC is how few gates it takes up on an FPGA. OK, there are FPGAs with a bazillion gates, but there are still FPGAs with fewer than 50k gates too. For an SoC on something that small you have to go RISC and stuff a bit of RAM and ROM around the outside.
@schmutztimo8952
@schmutztimo8952 2 роки тому
Damn🔥🔥🔥🔥
@virtualinfinity6280
@virtualinfinity6280 4 роки тому
Quite a good video. However, the micro-op sequencing done today in all modern x86 chips has far more dire consequences than the video explains. Conceptually it is explained right: you take a complex CISC instruction, break it down into simple, RISC-like instructions and send those down your internal pipelines to the various execution units. However, when those have been executed, you have to "fuse back" the results leaving the CPU, as if it were just one CISC instruction that executed. For simple CPUs, that is not a problem. However, all modern CPUs are multiple-issue, out-of-order execution designs. That means the CPU fetches multiple CISC instructions from cache, breaks all of them down into RISC-like instructions (micro-op sequencing), sends those down to the execution units to be executed in parallel, and after execution has to keep track of which RISC micro-op "belongs" to which CISC instruction, fuse them all together in the right way, then reorder the results into the exact original CISC instruction sequence and have them written to cache in order (micro-op fusion).

Sounds complex? Just wait, it gets worse. Interrupts, which cause the CPU to stop whatever it is doing and jump to some different code (the interrupt handler), must be delivered precisely. That means an interrupt can even cause an instruction to be halted WHILE it is executing, have the CPU do something else (the interrupt handler), and then resume the instruction after the interrupt has been serviced. Keeping track of all of this is very hard. Doing it with the CISC->RISC micro-op translation is even harder. Today, modern x86 CPUs have a hard time keeping more than 6 CISC instructions "in flight" internally. And with micro-ops, you pay a hefty price anyway: they add at least two pipeline stages to the whole CPU design, which essentially means you need two more clock cycles to execute an instruction. Going above the current "6 instructions in flight" would mean adding more pipeline stages to sort out the added complexity of micro-op sequencing and micro-op fusion.

This is, by the way, one of the key reasons why CPUs haven't got significantly faster per core at the same clock frequency. If one applied all the manufacturing bells & whistles of the Intels and TSMCs to a massive out-of-order RISC design, it would still arguably be faster than the current high-end CISC designs. Finally, the fewer instructions of a RISC architecture allow for more internal registers: RISC typically has 32 (one of which is hard-wired to 0 for good reasons), while x86-64 has 16.
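To make the bookkeeping above a bit more concrete, here is a toy sketch in C (invented data structures, not any real design): each micro-op carries the ID of the CISC instruction it came from, and the reorder buffer only retires the oldest instruction once every one of its micro-ops has completed.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy reorder buffer: micro-ops execute out of order but retire in
 * program order, grouped back under their parent CISC instruction. */
typedef struct {
    int  parent; /* which CISC instruction this micro-op belongs to */
    bool done;   /* set (possibly out of order) by an execution unit */
} micro_op;

/* Retire the oldest instruction only if ALL of its micro-ops are done;
 * returns the number of micro-ops retired. */
static int try_retire(micro_op rob[], int head, int count) {
    int oldest = rob[head].parent, n = 0;
    for (int i = head; i < count && rob[i].parent == oldest; i++, n++)
        if (!rob[i].done)
            return 0; /* younger micro-ops may be done, but we must wait */
    printf("retired instruction %d (%d micro-ops)\n", oldest, n);
    return n;
}

int main(void) {
    /* Instruction 0 decodes to 2 micro-ops, instruction 1 to 1. */
    micro_op rob[] = { {0, false}, {0, true}, {1, true} };
    printf("%d\n", try_retire(rob, 0, 3)); /* 0: instr 0 not finished */
    rob[0].done = true;
    printf("%d\n", try_retire(rob, 0, 3)); /* 2: instr 0 retires whole */
    return 0;
}
```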
@fk319fk
@fk319fk 5 років тому
For RISC, the best example is video cards... My question is: what happened to VLIW? You compile once and have plenty of time to do it, so that seems like the best time to optimize.
@Arthur-qv8np
@Arthur-qv8np 5 років тому
VLIW (Very Long Instruction Word) just groups together a fixed number of "small" instructions into a "big" one (a "very long" one), also called a bundle. When you execute that "big" instruction, you execute all the "small" instructions in parallel, each of them assigned to a different compute unit. The unit assignment must be done at compile time; this process is called "static scheduling". That's the opposite of "dynamic scheduling", where the units are assigned at runtime within the CPU, which is what we call a superscalar processor (like x86 CPUs, ARM, RISC-V, ...). Static scheduling reduces CPU complexity by giving the compiler the workload of scheduling, but building an efficient static-scheduling compiler is really challenging for a general-purpose processor (like the one you use to browse YouTube). This is why most general-purpose processors are superscalar and not VLIW. But the VLIW architecture is still used in DSPs (Digital Signal Processors), for example.
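A sketch of the bundle idea in C, for a hypothetical 3-issue machine (the slot layout is invented for illustration):

```c
#include <stdint.h>
#include <stdio.h>

/* A hypothetical 3-issue VLIW machine: every cycle the CPU fetches one
 * bundle and dispatches each slot to its own execution unit with no
 * runtime scheduling. Empty slots are explicit NOPs chosen by the
 * compiler (static scheduling), one reason VLIW code can bloat. */
enum { OP_NOP, OP_ADD, OP_LOAD, OP_BRANCH };

typedef struct {
    uint32_t alu_slot;    /* integer unit */
    uint32_t mem_slot;    /* load/store unit */
    uint32_t branch_slot; /* branch unit */
} vliw_bundle;

/* Two cycles of a toy program: the compiler has already decided what
 * runs in parallel; the hardware just executes the slots blindly. */
static const vliw_bundle program[] = {
    { OP_ADD, OP_LOAD, OP_NOP    }, /* cycle 0: add + load in parallel */
    { OP_NOP, OP_NOP,  OP_BRANCH }, /* cycle 1: nothing to pair with it */
};

int main(void) {
    for (unsigned i = 0; i < sizeof program / sizeof program[0]; i++)
        printf("cycle %u: alu=%u mem=%u branch=%u\n", i,
               (unsigned)program[i].alu_slot,
               (unsigned)program[i].mem_slot,
               (unsigned)program[i].branch_slot);
    return 0;
}
```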
@23wjam
@23wjam 5 років тому
A lot of so-called RISC chips have multi-cycle instructions. I think RISC is more of a concept now, and a lot of real-world implementations aren't so pure, as the lines blur between RISC and CISC from the RISC side. Soon comp sci will redefine what RISC is, like they did with minicomputer. IMO.
@AlexEvans1
@AlexEvans1 5 років тому
Yep, pretty much no such thing as a pure RISC or pure CISC architecture. When did they redefine what a minicomputer is? I am aware of a lot of idiots that call things smaller than microcomputers minicomputers, but in the field 1) minicomputer was a term only used by part of the field even in the days of, say, the PDP-11, and 2) it has essentially been abandoned.
@23wjam
@23wjam 5 років тому
@@AlexEvans1 Well, now minicomputer means something between mainframe and micro, but in the past a minicomputer was a computer with minimal features. The closest thing to the old definition of minicomputer is probably MISC; however, it's not a perfect replacement because apparently a MISC ISA can't have microcode. DEC's PDP-8, a minicomputer with like 8 instructions, isn't MISC because it has microcode.
@AlexEvans1
@AlexEvans1 5 років тому
@@23wjam That seems a difference that isn't a difference, since you are talking about a time when there simply weren't microcomputers: a definition that *may* have existed before 1970. Most of DEC's PDP series (though not the PDP-10) were considered minicomputers when they came out. I know of RISC and CISC implementations that didn't use microcode.
@23wjam
@23wjam 5 років тому
@@AlexEvans1 The difference is that they aren't minimal anymore, which was previously a fundamental facet of the criteria. I guess the issue, and obviously how and why it changed, is that it was probably more of a marketing term (maybe? I'm not sure, because it was before my time). As regards the no-microcode rule, there are obviously other criteria involved, but in my example the PDP-8 is disqualified from the MISC classification due to microcode, which is idiotic as it's definitely MISC in spirit.
@AlexEvans1
@AlexEvans1 5 років тому
@@23wjam That, and MISC is a term that wasn't in use at the time, just like CISC architectures weren't referred to that way until the introduction of the term RISC. MISC has generally referred to certain academic architectures, like the one-instruction architectures (for example, subtract and branch if not zero). In any case, RISC and CISC are really abstract notions that few computers fit perfectly.
@kshitijvengurlekar1192
@kshitijvengurlekar1192 5 років тому
You should talk about von Neumann and Harvard architectures sometime
@GaryExplains
@GaryExplains 5 років тому
I cover von Neumann here: ukposts.info/have/v-deo/qp-WjqlknIJezas.html
@chigga76
@chigga76 5 років тому
You've come along light-years in presentation style since that video!!!! Great job and love your work, sir!!! Thank you for all the insights, and the SoC fights of course :)
@GaryExplains
@GaryExplains 5 років тому
@@chigga76 Thx :-)
@ToTheGAMES
@ToTheGAMES 5 років тому
Every time you look away from the camera (to a prompter?) I imagine you shift gears with your eyes.
@wujasmarecki
@wujasmarecki 5 років тому
Super!
@TheRojo387
@TheRojo387 Рік тому
I heard that CISC still has a use: compressing functions into a format that can be inflated back into a RISC format for simpler hardware to execute. I believe VLIW would do this even more so.
@CZac2k12
@CZac2k12 2 роки тому
I'm glad that RISC processing is now being used in tablets and smartphones. What will happen when desktop and laptop PCs use RISC processing?
@rtype4930
@rtype4930 5 років тому
long live The Commodore... !
@Blue.star1
@Blue.star1 4 роки тому
Gary, do you know what sort of RAM and L1/L2 cache these RISC processors consume? I don't think CISC will wait a few clock cycles to execute an x86 operand; there are methods to bypass waiting a few clocks...
@GaryExplains
@GaryExplains 4 роки тому
I am not clear on exactly what you are asking. The problem of fetching instructions from memory and the use of caches is the same for RISC and CISC, but there is a disadvantage to systems that use variable-length instructions.
@Blue.star1
@Blue.star1 4 роки тому
@@GaryExplains I meant multiple instructions for RISC instead of using one...
@GaryExplains
@GaryExplains 4 роки тому
So the idea is this: on RISC the instructions are guaranteed to be of a certain size, which means the fetch and decode phase is simpler, compared to variable instruction lengths, where the first part has to be fetched and partially decoded to see if more needs to be fetched. As for execution time (in cycles) vs time to fetch the next instruction, these mechanisms are basically decoupled nowadays on both CISC and RISC due to caching, wide pipelines, instruction-level parallelism, etc.
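A rough sketch of the difference in C (the variable-length encoding below is invented, not real x86):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* RISC-style fetch: every instruction is exactly 4 bytes, so the next
 * PC is known before decoding even starts, and several instructions
 * can be decoded in parallel from one fetched block. */
static size_t next_pc_fixed(size_t pc) {
    return pc + 4;
}

/* CISC-style fetch (hypothetical encoding): the first byte must be
 * partially decoded just to learn how long the instruction is, so
 * finding instruction boundaries is inherently serial. */
static size_t next_pc_variable(const uint8_t *code, size_t pc) {
    uint8_t opcode = code[pc];
    if (opcode < 0x40) return pc + 1;      /* one-byte instruction */
    if (opcode < 0x80) return pc + 2;      /* opcode + immediate */
    if (opcode < 0xC0) return pc + 3;      /* opcode + 16-bit operand */
    return pc + 1 + (code[pc + 1] & 0x07); /* prefix byte decides length */
}

int main(void) {
    const uint8_t code[] = { 0x10, 0x50, 0x00, 0x90, 0x00, 0x00 };
    /* Each boundary depends on decoding the previous instruction first. */
    for (size_t pc = 0; pc < sizeof code; pc = next_pc_variable(code, pc))
        printf("variable-length instruction at offset %zu\n", pc);
    printf("fixed-length: next PC after 0 is always %zu\n", next_pc_fixed(0));
    return 0;
}
```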
@Blue.star1
@Blue.star1 4 роки тому
@@GaryExplains Instead of running code line by line, we should use multiple FPGAs and CPUs to decode instructions and run basic OS files. We've got maturity in hardware and software; we have to implement common OS system files inside the chip... the problem is it heats up
@GaryExplains
@GaryExplains 4 роки тому
I don't really know what you mean by "instead of running code line by line" or what you mean by using multiple CPUs to decode instructions. Of course, heat (which equals energy spent) is THE problem.
@keiyakins
@keiyakins 3 роки тому
I honestly kind of hate the thing modern x86 processors do. It being impossible to know what the computer is *actually* doing... it just bothers me. And then there's the security problems introduced by stuff like branch prediction... Then again I'm a huge fan of the approach Microsoft research was taking in Singularity and Midori of keeping code in a level where it can be reasoned about reliably until delivery to the computer it'll run on, which does the same thing but is more explicit about it because it's at the OS layer.
@centuriomacro9787
@centuriomacro9787 5 років тому
RISC vs CISC is so exciting. I would never have thought that there are different types of instruction sets and that they have such a big impact on the device. I'm especially looking forward to Apple's ARM designs and how Intel is able to respond.
@hermanstokbrood
@hermanstokbrood 5 років тому
ARM is designed by...... ARM. Apple only customizes it, like Qualcomm, Huawei, and Samsung. Qualcomm and Huawei have closed the gap since they also build on 7nm like Apple already did.
@wanmaziah9835
@wanmaziah9835 4 роки тому
@@hermanstokbrood lol Huawei did not customize their ARM CPU...
@hermanstokbrood
@hermanstokbrood 4 роки тому
@@wanmaziah9835 ROFL Yes they are. ARM delivers the building blocks. They all do their own stuff with it.
@wanmaziah9835
@wanmaziah9835 4 роки тому
@@hermanstokbrood What did you mean by ROFL?
@wanmaziah9835
@wanmaziah9835 4 роки тому
@@hermanstokbrood What is the difference between semi-custom and full custom? As we can see with Qualcomm: the SD 835 rather than the SD 820/821, because Qualcomm made a fully custom core in the SD 820 and 821. And I think Mongoose was also a fully custom core, not semi, like Samsung does... Is that true?? 🤔🤔
@WizardNumberNext
@WizardNumberNext 5 років тому
There was a single architecture which did every single instruction in a single clock: MIPS (but it lacked division and possibly multiplication too). All the other RISCs did not do all instructions in a single clock.
@AlexEvans1
@AlexEvans1 5 років тому
Early versions of SPARC also did this. They multiplied using the mulscc (multiply step) instruction: for a 32 x n multiply, you would execute the instruction n times.
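The idea behind a multiply-step instruction is easy to model in C. This is a rough model of shift-and-add, one multiplier bit per step, not the exact mulscc semantics (which also thread state through the Y register and condition codes):

```c
#include <stdint.h>
#include <stdio.h>

/* Rough model of a multiply-step loop: examine one bit of the
 * multiplier per step, conditionally add the multiplicand, then shift.
 * A full 32-bit multiply is 32 of these simple single-cycle steps. */
static uint64_t multiply_by_steps(uint32_t a, uint32_t b) {
    uint64_t acc = 0;
    uint64_t addend = a;
    for (int step = 0; step < 32; step++) { /* one mulscc-like step each */
        if (b & 1)
            acc += addend;
        addend <<= 1;
        b >>= 1;
    }
    return acc;
}

int main(void) {
    printf("%llu\n", (unsigned long long)multiply_by_steps(123456, 7890)); /* 974067840 */
    return 0;
}
```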
@alvinmjensen
@alvinmjensen 2 роки тому
You forget that all the little microcontrollers in, e.g., washing machines are also RISC.