And by the way, I used x86 as an example here, but don’t believe for a second that the same thing doesn’t apply to, say, the ARM chip in your phone. Modern ARM chips support multiple encodings and also have well over 1000 instructions if you count them at the same level of granularity as XED’s “iforms”.
Indeed, those who think x86 is complex should also take a detailed look at the ARM64 instruction set, particularly its instruction encoding. If you thought making sense of x86 instruction encoding was hard, and assumed a RISC would be simpler, AArch64 will puzzle you even more.
To use the MOV example, the closest ARM equivalent might be the 40 variants of LD, which the reference manual (5000+ pages) enumerates as: LDAR, LDARB, LDARH, LDAXP, LDAXR, LDAXRB, LDAXRH, LDNP, LDP, LDPSW, LDR (immediate), LDR (literal), LDR (register), LDRB (immediate), LDRB (register), LDRH (immediate), LDRH (register), LDRSB (immediate), LDRSB (register), LDRSH (immediate), LDRSH (register), LDRSW (immediate), LDRSW (literal), LDRSW (register), LDTR, LDTRB, LDTRH, LDTRSB, LDTRSH, LDTRSW, LDUR, LDURB, LDURH, LDURSB, LDURSH, LDURSW, LDXP, LDXR, LDXRB, LDXRH. Some, like LDP, are then further split into different encodings depending on the addressing mode.
My suspicion is that, to achieve acceptable code density with a fixed-length instruction encoding, the designers simply made the individual instructions more complex. For example, the ADD instruction can also shift one of its operands, which would generally require a second instruction on x86 (LEA can fold in a left shift of 1, 2, or 3 bits, but nothing beyond that).
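To make the "add with a built-in shift" point concrete, here is a minimal sketch of how the A64 ADD (shifted register) form packs the shift into the same 32-bit word as the add itself. The field layout follows my reading of the ARM reference manual; the function name and its parameters are made up for illustration and are not part of any real assembler API.

```python
def encode_add_shifted(rd, rn, rm, shift_amount=0, shift_type=0, is_64bit=True):
    """Encode A64 ADD (shifted register): Rd = Rn + shift(Rm, shift_amount).

    Hypothetical helper for illustration only.
    shift_type: 0 = LSL, 1 = LSR, 2 = ASR (3 is reserved for this instruction).
    """
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 31:
            raise ValueError("register numbers must be in 0..31")
    if shift_type not in (0, 1, 2):
        raise ValueError("shift_type must be 0 (LSL), 1 (LSR) or 2 (ASR)")
    max_shift = 63 if is_64bit else 31
    if not 0 <= shift_amount <= max_shift:
        raise ValueError(f"shift_amount must be in 0..{max_shift}")

    return (
        (int(is_64bit) << 31)   # sf: 1 selects 64-bit X registers
        | (0b01011 << 24)       # fixed opcode bits for ADD (shifted register)
        | (shift_type << 22)    # shift: LSL / LSR / ASR
        | (rm << 16)            # Rm: the operand that gets shifted
        | (shift_amount << 10)  # imm6: shift distance
        | (rn << 5)             # Rn: first source operand
        | rd                    # Rd: destination
    )


# add x0, x1, x2
print(hex(encode_add_shifted(0, 1, 2)))                  # 0x8b020020
# add x0, x1, x2, lsl #3
print(hex(encode_add_shifted(0, 1, 2, shift_amount=3)))  # 0x8b020c20
```

Both forms cost exactly four bytes: the shift type and distance live in bits that are simply zero for a plain add, so the "extra" work rides along in the same fixed-size instruction.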