They could have started using CR1 instead of introducing CR4; that, to me, is the mystery. A likely answer is that some software may have been relying on CR1 faulting, though that’s kind of weird.
AMD could have used CR5 instead of CR8 too. Both could (and probably should) have used an MSR for those bits. Numbers are just numbers. I really don't think there's a need for much speculation here.
I’m not sure why I have to defend my curiosity, but the engineering decision to add CR4, instead of using the existing CR1, which was literally all bits “reserved for future use”, is just interesting to me. Especially because MSRs were also an option. There must have been a reason, however minor, leading to that, and I’d like to know.
Though it’s mostly just a corollary to the existence of the completely reserved CR1 itself, which was added in the 386 at the same time as CR2 and CR3 (so I don’t think compatibility concerns apply). CR1 is entirely a forbidden zone between the well-defined CR0 and CR2. At the time, we thought that maybe CR1 was supposed to be the next register that new control bits would go into once the adjacent CR0 was full, but then Intel introduced CR4 for that instead, which made the whole CR1 gap more mysterious.
That Intel decided to skip CR1 entirely suggests there is a story behind it. Before I knew of the plans for an on-chip cache, this was a giant mystery. Now that I know it might have been related to the late-scrapped plans for an on-chip cache on the 386, I think that’s a pretty interesting potential reason already, but I won’t be satisfied until it’s confirmed.
But if CR1 actually exists in the 386 in some form other than the encoding, then that's massive news.
I always assumed CR8 was used because (a) it prevents access outside 64-bit mode, and (b) with AMD being the creator of 64-bit mode, there was a guarantee that Intel had no plans to use such a register for something else.
To be clear, sure, that's surely the case. The REX encoding added an extra bit to the register field used by the MOV CRn instructions, giving you 16 control register encodings. So... might as well.
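For anyone who wants to see the REX point concretely, here is a rough sketch of how the 64-bit-mode encoding works out. The function name `encode_mov_cr` is made up for illustration; the opcodes (0F 20 for reading a CR, 0F 22 for writing one) and the REX.R/REX.B bit positions are the standard ones, but treat this as a toy encoder, not an assembler.

```python
def encode_mov_cr(cr: int, gpr: int, to_cr: bool) -> bytes:
    """Encode MOV between control register `cr` and 64-bit GPR number `gpr`.

    Hypothetical helper for illustration only. `gpr` uses the hardware
    numbering (0 = RAX, 3 = RBX, 8 = R8, ...).
    """
    if not (0 <= cr <= 15 and 0 <= gpr <= 15):
        raise ValueError("register numbers must be in 0..15")

    opcode = 0x22 if to_cr else 0x20  # 0F 22 /r = MOV CRn, r64; 0F 20 /r = MOV r64, CRn

    # ModRM: mod=11, reg = low 3 bits of the CR number, rm = low 3 bits of the GPR.
    modrm = 0xC0 | ((cr & 7) << 3) | (gpr & 7)

    # REX.R (bit 2) supplies the 4th bit of the CR number,
    # REX.B (bit 0) supplies the 4th bit of the GPR number.
    rex = 0x40 | ((cr >> 3) << 2) | (gpr >> 3)

    # Only emit the REX prefix when one of the extension bits is actually needed.
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0x0F, opcode, modrm])


print(encode_mov_cr(8, 0, to_cr=False).hex())  # mov rax, cr8
print(encode_mov_cr(3, 0, to_cr=True).hex())   # mov cr3, rax
```

Without the REX prefix the reg field is only 3 bits wide, so CR8 cannot be named at all, which lines up with point (a) above about it being inaccessible outside 64-bit mode.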
Nonetheless the "official" means for adding state to an x86 is an MSR. And in particular there's no good reason for putting this in legacy spaces like CRs: CR8 is an interrupt control register, unrelated to the MMU control in CR0/2/3.
(In fact the proof of this is that within the decade, x86 interrupt handling needed to evolve in the direction of MSIs and x2APIC, something that even the newly expanded control registers were completely inadequate for. So we do that stuff in MSRs too.)
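To make the overlap concrete: the priority value that CR8 exposes is the same one the local APIC keeps in its task priority register, and in x2APIC mode that register is reached through an MSR. A minimal sketch of the two mappings, assuming the usual layout (CR8 bits 3:0 mirror TPR bits 7:4, and x2APIC MSRs start at 0x800 with one MSR per 16-byte MMIO register slot). The helper names are made up for illustration.

```python
X2APIC_MSR_BASE = 0x800   # first MSR of the x2APIC register range
APIC_TPR_OFFSET = 0x80    # TPR offset in the legacy memory-mapped APIC page


def x2apic_msr(mmio_offset: int) -> int:
    """Map a legacy APIC MMIO register offset to its x2APIC MSR number.

    Hypothetical helper: each 16-byte MMIO slot becomes one MSR.
    """
    return X2APIC_MSR_BASE + (mmio_offset >> 4)


def cr8_to_tpr(cr8: int) -> int:
    """Expand a CR8 value into the TPR byte it corresponds to.

    CR8 carries only the 4-bit priority class, which sits in TPR bits 7:4.
    """
    return (cr8 & 0xF) << 4


print(hex(x2apic_msr(APIC_TPR_OFFSET)))  # MSR number used for the TPR in x2APIC mode
print(hex(cr8_to_tpr(0xF)))              # highest priority class as a TPR byte
```

So the same piece of state ends up reachable through a control register, an MMIO page, and an MSR, which is more or less the point being made about MSRs being the general-purpose mechanism.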