I'm talking specifically about the workloads I was interested in at the time, where "cache optimised" means the code took pains to take advantage of larger L1s and to get L1 hits. But this was a general problem too, and was noted at the time by many people.
AMD's smaller L1 was a definite negative at the time. This was back when hyperthreading could be a net negative because of the reduced L1 cache per thread, so we would turn that off too.