Gathering and exploiting in-depth knowledge of a CPU's internals has become more difficult over time, too, I think.
At least for x86/amd64 - with out-of-order execution, branch prediction and whatnot, one has to know not only the architecture but also the specific implementation the code will run on. And knowledge of the deep internals of CPUs made by Intel or AMD (or VIA? are they still around?) is not easy to come by.