There's little benefit in going substantially wider than Cyclone, but there's still a ton of room to improve performance.
Didn't AMD and Intel more or less say the same thing about 3-wide, that there was no real benefit from going wider? Is that because of differences in microarchitecture, or is it more about getting a little more performance without having to ramp the clock speed? What makes 6-wide good for ARM but not for x86 or x64?
This is just a guess and may well be completely wrong, but the Thumb [1] instructions (a small, limited subset of shorter instructions in ARM) might allow for a wider setup. Thumb instructions make a bunch of simplifying assumptions that might remove some of the issues with going wider. Not that I have any idea whether Cyclone even supports them in the first place.
The biggest wart on the ARM instruction set was the predicate attached to basically every instruction. That's gone in the 64-bit ARM ISA, so there's no need for a 64-bit Thumb (and there isn't one).
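For anyone who hasn't looked at the encoding: in the 32-bit ARM (A32) instruction format, the top four bits of the instruction word hold that condition predicate. Here is a rough Python sketch of pulling the field out. The names `COND_NAMES` and `condition_of` are made up for illustration, and the sketch only reads the field; it says nothing about how any real decoder is built.

```python
# Mnemonics for the 4-bit condition field in bits [31:28] of a 32-bit ARM (A32)
# instruction word. Index 0b1111 was historically "never" (NV); later
# architecture versions reuse that space for unconditional encodings.
COND_NAMES = [
    "EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
    "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV",
]


def condition_of(instruction_word: int) -> str:
    """Return the condition mnemonic encoded in the top four bits (illustrative helper)."""
    return COND_NAMES[(instruction_word >> 28) & 0xF]


if __name__ == "__main__":
    # 0xE1A00000 is MOV r0, r0 (the classic A32 no-op), which always executes.
    print(condition_of(0xE1A00000))
    # Same instruction with the condition field set to 0b0000: MOVEQ r0, r0.
    print(condition_of(0x01A00000))
    # A branch-to-self taken only when the Z flag is clear: BNE.
    print(condition_of(0x1AFFFFFE))
```

The point is that every instruction carries this field, so each one has an implicit dependency on the flags. That is the cost the 64-bit ISA avoids by dropping general predication.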