OTOH, you have x86 using what is effectively a compressed instruction encoding, and a trace cache (apparently advantageous enough that ARM designs are now using them too), which reduces the size of the icache for a given hit rate. So the arch loses a bit here and gains a bit elsewhere. It's the same thing with regard to TSO: a more relaxed memory model buys you a bit in single-threaded contexts, but frequently TSO allows you to completely avoid locks/fencing in threaded workloads, which are far more expensive.
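To make the TSO point concrete, here's a minimal C++ sketch (my own illustration, not from the thread) of the classic message-passing pattern. Under x86's TSO the release store and acquire load compile down to plain MOVs with no explicit fences, while a weakly-ordered ARM core needs STLR/LDAR (or DMB barriers) for the same source:

    // Message passing: producer publishes data, consumer waits for the flag.
    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<int> data{0};
    std::atomic<bool> ready{false};

    void producer() {
        data.store(42, std::memory_order_relaxed);
        ready.store(true, std::memory_order_release);   // plain MOV on x86; STLR/DMB on weak ARM
    }

    void consumer() {
        while (!ready.load(std::memory_order_acquire)) {}  // plain MOV on x86; LDAR/DMB on weak ARM
        assert(data.load(std::memory_order_relaxed) == 42);
    }

    int main() {
        std::thread t1(producer), t2(consumer);
        t1.join();
        t2.join();
    }

Same portable source either way; the difference is just how many ordering instructions the hardware forces the compiler to emit.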
So people have been making these arguments for years, frequently with myopic views. These days what seems to be consuming power on x86 are fatter vector units, higher clock rates, and more IO/Memory lanes/channels/etc. Those are things that can't be waved away with alternative ISAs.
If it were vectors, clocks, and memory, then Atom would have been a success, but even stripping out everything resulted in a chip (Medfield) that under-performed while using way too much power.
Either the engineers at Intel and AMD are bad at their job (not likely) or the ISA actually does matter.
Atom is a success, just not where you think it is. The latest ones are quite nice for their power profile and fit into a number of low-end edge/embedded devices in the Denverton product lines. Similarly, the Gemini Lake cores are not only in a lot of fairly decent low-end products (pretty much all of Chuwi's product line uses the N4100, https://www.chuwi.com/), they also make perfectly capable, very low-cost digital signage devices/etc.
So not as sexy as phones, but the power/perf profiles are very competitive with similar ARM devices (A72). If you compare the power/perf profile of a Denverton with a part like the SolidRun MACCHIATObin, the Atom is way ahead.
Check out https://www.dfi.com/ for ideas about where Intel might be doing quite well with those Atom/etc. devices.
Conversely, if the instruction set were the main factor, you'd expect Qualcomm and Samsung to also have ARM processors with a similar power-to-performance advantage over Intel chips.
The reality is just that Apple is ahead in chip design at the moment.
They are 2 years behind Apple and slowly catching up.
When Medfield came out, Apple didn't have its own chip and x86 still lost. It was an entire 1.5 nodes smaller and only a bit faster than the A9 chips of the time (and only in single-core benches). The A15 released not too long after absolutely trounced it.
>When Medfield came out, Apple didn't have it's own chip
>It was an entire 1.5 nodes smaller and only a bit faster than the A9 chips of the time
You seem to have the chronology all mixed up here. Medfield came out in 2012. The A9 came out in 2015. Apple was already designing its own chips in 2012. (The A4 came out in 2010.)