
I'm confused - I thought the reason it's hard for Intel to add more decoders is that the x86 ISA doesn't have fixed-length instructions. As a result you can't trivially scale things up.

From that linked article:

--

Why can’t Intel and AMD add more instruction decoders?

This is where we finally see the revenge of RISC, and where the fact that the M1 Firestorm core has an ARM RISC architecture begins to matter.

You see, an x86 instruction can be anywhere from 1–15 bytes long. RISC instructions have fixed length. Every ARM instruction is 4 bytes long. Why is that relevant in this case?

Because splitting up a stream of bytes into instructions to feed into eight different decoders in parallel becomes trivial if every instruction has the same length.

However, on an x86 CPU, the decoders have no clue where the next instruction starts. They have to actually analyze each instruction in order to see how long it is.

The brute force way Intel and AMD deal with this is by simply attempting to decode instructions at every possible starting point. That means x86 chips have to deal with lots of wrong guesses and mistakes which have to be discarded. This creates such a convoluted and complicated decoder stage that it is really hard to add more decoders. But for Apple, it is trivial in comparison to keep adding more.

--

Maybe you and astrange don't consider fixed length instruction guarantees to be necessarily tied to 'RISC' vs. 'CISC', but that's just disputing definitions. It seems to be an important difference that they can't easily address.
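
A rough sketch of the boundary-finding problem the quoted article describes. The variable-length encoding here is a made-up toy (the low two bits of an instruction's first byte give its length), not real x86, and all function names are hypothetical:

```python
def fixed_boundaries(code: bytes, width: int = 4) -> list:
    # Fixed-length ISA: every start offset is known up front,
    # so each decoder can be pointed at its own slot independently.
    return list(range(0, len(code), width))


def toy_length(code: bytes, offset: int) -> int:
    # Toy rule (an assumption, NOT real x86): length is 1..4 bytes,
    # taken from the low two bits of the first byte.
    return (code[offset] & 0x03) + 1


def variable_boundaries_sequential(code: bytes) -> list:
    # Each start depends on the length of the previous instruction.
    starts, offset = [], 0
    while offset < len(code):
        starts.append(offset)
        offset += toy_length(code, offset)
    return starts


def variable_boundaries_speculative(code: bytes) -> list:
    # Step 1: guess a length at EVERY byte offset (hardware would do this in parallel).
    lengths = [toy_length(code, i) for i in range(len(code))]
    # Step 2: follow the chain from offset 0; guesses at other offsets are thrown away.
    starts, offset = [], 0
    while offset < len(code):
        starts.append(offset)
        offset += lengths[offset]
    return starts


code = bytes([0x02, 0xAA, 0xBB, 0x00, 0x01, 0xCC, 0x03, 0x01, 0x02, 0x03])
real_starts = variable_boundaries_speculative(code)
discarded_guesses = len(code) - len(real_starts)
```

In this toy stream, 10 length guesses are made and only 4 correspond to real instruction starts; the other 6 are the discarded work the article is talking about.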




People are rehashing the same myths about ISA written 25 years ago.

Variable-length instructions are not a significant impediment in high-wattage CPUs (>5W?). The first few bytes of an instruction are enough to determine how long it is, and hardware can look at the stream in parallel. It's a minor penalty with arguably a couple of benefits. The larger issue for CISC is that more instructions access memory in more ways, so decoding requires breaking those down into micro-ops that are more RISC-like, so that the dependencies can be worked out.
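
To make the micro-op point concrete, here is a minimal sketch of splitting a two-operand instruction with a memory operand into load / compute / store steps. The `MicroOp` type, the `crack` function, and the `tmp0`/`tmp1` temporaries are hypothetical names for illustration; real decoders are far more involved:

```python
from typing import NamedTuple


class MicroOp(NamedTuple):
    op: str      # "load", "store", or an ALU operation such as "add"
    dst: str     # destination register ("mem" for a store)
    srcs: tuple  # source registers (for load/store the first is the address)


def is_mem(operand: str) -> bool:
    # Memory operands are written "[reg]" in this toy syntax.
    return operand.startswith("[") and operand.endswith("]")


def crack(op: str, dst: str, src: str) -> list:
    """Split 'op dst, src' (meaning dst = dst op src) into micro-ops."""
    uops = []
    rhs = src
    if is_mem(src):
        uops.append(MicroOp("load", "tmp1", (src[1:-1],)))
        rhs = "tmp1"
    if is_mem(dst):
        addr = dst[1:-1]
        uops.append(MicroOp("load", "tmp0", (addr,)))
        uops.append(MicroOp(op, "tmp0", ("tmp0", rhs)))
        uops.append(MicroOp("store", "mem", (addr, "tmp0")))
    else:
        uops.append(MicroOp(op, dst, (dst, rhs)))
    return uops


rmw = crack("add", "[rbx]", "rax")      # read-modify-write on memory
reg = crack("add", "rax", "rbx")        # register-only form
load_op = crack("add", "rax", "[rbx]")  # memory source operand
```

Once the instruction is in this form, each step names its inputs and outputs explicitly, which is what lets the dependencies between them be tracked.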

RISC already won where ISA matters -- like AVR and ARM Thumb. You have a handful of them in a typical laptop plus like a hundred throughout your house and car, with some PIC thrown in for good measure. So it won. CISC is inferior. Where ISA matters, it loses. Nobody actually advocates for CISC design because you're going to have to decode it into smaller ops anyway.

Also, variable-length instructions are not so much a RISC vs. CISC thing as a pre- vs. post-1980 thing. Memory was so scarce in the 70s that wasting a few bits for simplicity's sake was anathema and would not be allowed.

System performance is a lot more than ISA, as computers have become very complicated with many, many I/Os. Think about why American automakers lost market share at the end of last century. Was it because their engineering was that bad? Maybe a bit. But really it was total system performance and cost of ownership that they got killed on, not any particular commitment to a solely inferior technical framework.


I agree that's a real difference and M1 makes good use of it. It's just that, to me, RISC ("everything MIPS did") vs. CISC ("everything x86 did") implies a lot of other stuff that's just coincidence. Specifically, RISC means all of: simple fixed-length instructions, 3-operand instructions (a=b+c, not a+=b), and few addressing modes. Some of these are the wrong tradeoff when you have the transistor budget of a modern CPU.

x86 has complicated variable-length instructions, but the advantage is they're compressed - they fit in less memory. I would've expected this to still be good because cache size is so important, but ARM64 got rid of its compressed encoding (Thumb) and they know better than me, so apparently not. (Variable-length instructions have other problems too, like being a security risk, because attackers can jump into the middle of an instruction and create new programs…)
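
A small sketch of the "jump into the middle of an instruction" problem. The opcode table below is a made-up toy ISA for illustration (it is not a real x86 decode table), and `disassemble` is a hypothetical helper:

```python
# Toy opcode table (an assumption): opcode byte -> (mnemonic, total length in bytes)
TOY_ISA = {
    0x01: ("push_imm8", 2),
    0x02: ("mov_imm16", 3),
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
}


def disassemble(code: bytes, start: int) -> list:
    # Decode linearly from `start`, returning (offset, mnemonic) pairs.
    out, offset = [], start
    while offset < len(code):
        mnemonic, length = TOY_ISA[code[offset]]
        out.append((offset, mnemonic))
        offset += length
    return out


code = bytes([0x02, 0xC3, 0x90, 0x01, 0xC3])

intended = disassemble(code, 0)    # what the compiler emitted
misaligned = disassemble(code, 1)  # same bytes, entered one byte late
```

Starting at offset 0 the bytes are a `mov_imm16` followed by a `push_imm8`; starting at offset 1, the immediate bytes of the `mov` get decoded as a `ret` and a `nop` that were never in the intended program. With fixed-length, aligned instructions there is only one way to split the byte stream, so that reinterpretation is not available.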

One thing you can do is have a cache after the decoder, so recently decoded instructions can be issued again without going back through the decoder, which frees it up to do something else. That helps with loops at least.
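
A minimal sketch of the idea of a cache sitting after the decoder, keyed by instruction address. The `DecodedCache` class, its FIFO eviction policy, and the stand-in `decode` function are all assumptions for illustration, not a description of any specific CPU:

```python
class DecodedCache:
    def __init__(self, decode, capacity: int = 64):
        self.decode = decode      # the expensive decode step
        self.capacity = capacity
        self.entries = {}         # address -> decoded form (dicts keep insertion order)
        self.hits = 0
        self.misses = 0

    def fetch(self, address: int, raw: str):
        if address in self.entries:
            self.hits += 1        # decoder is bypassed entirely
            return self.entries[address]
        self.misses += 1
        decoded = self.decode(raw)
        if len(self.entries) >= self.capacity:
            # Evict the oldest entry (simple FIFO, chosen only for brevity).
            self.entries.pop(next(iter(self.entries)))
        self.entries[address] = decoded
        return decoded


# A three-instruction loop body executed ten times.
loop_body = {0x100: "add", 0x104: "cmp", 0x108: "jne"}
cache = DecodedCache(decode=lambda raw: raw.upper())

for _ in range(10):
    for address, raw in loop_body.items():
        cache.fetch(address, raw)
```

Here the decoder runs only on the first pass through the loop (3 misses); the remaining 27 fetches are served from the cache.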



