µ-ops stands for micro-operations. Behind their CISC front end, modern [0] x86 (and x64) processors actually work like RISC machines internally: x86 opcodes are translated into these µops on the fly and then executed.
I'm not familiar with the caching mechanism, but here's an educated guess: the CISC -> RISC translation opens up optimization opportunities that depend on the x86 opcode sequence (reordering operations so some of them can run in parallel, for example). Caching the translated µops would spare cycles, since the processor wouldn't have to re-analyze the same code each time to perform those optimizations.
Edit: apparently, my guess was correct :-) Thanks to Symmetry for the confirmation.
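To make "not having to analyze the code each time" concrete, here's a toy model in C of what a decoded-µop cache does conceptually: on a hit the decode step is skipped, on a miss the instruction is decoded and the result stored. Everything here (names, sizes, the fake decoder) is invented for illustration and is not how any real front end is built.

```c
/* Illustrative sketch only -- a toy model of a decoded-µop cache.
 * All structure names, sizes and the "decoder" are made up. */
#include <stdio.h>
#include <string.h>

#define UOP_CACHE_ENTRIES 64
#define MAX_UOPS_PER_INSN 4

typedef struct {
    unsigned long tag;                      /* x86 instruction address */
    int           valid;
    int           uop_count;
    int           uops[MAX_UOPS_PER_INSN];  /* decoded micro-ops (opaque ids here) */
} uop_cache_entry;

static uop_cache_entry uop_cache[UOP_CACHE_ENTRIES];

/* Stand-in for the "slow" CISC -> RISC decode step done by hardware decoders. */
static int decode_x86(unsigned long addr, int uops[MAX_UOPS_PER_INSN]) {
    uops[0] = (int)(addr & 0xff);           /* pretend we produced one µop */
    return 1;
}

/* Fetch µops for an instruction: hit the cache if possible, decode otherwise. */
static int fetch_uops(unsigned long addr, int uops[MAX_UOPS_PER_INSN]) {
    uop_cache_entry *e = &uop_cache[addr % UOP_CACHE_ENTRIES];
    if (e->valid && e->tag == addr) {       /* hit: skip the decode stage */
        memcpy(uops, e->uops, sizeof e->uops);
        return e->uop_count;
    }
    int n = decode_x86(addr, uops);         /* miss: decode and fill the entry */
    e->tag = addr;
    e->valid = 1;
    e->uop_count = n;
    memcpy(e->uops, uops, sizeof e->uops);
    return n;
}

int main(void) {
    int uops[MAX_UOPS_PER_INSN];
    /* A hot loop keeps hitting the same addresses, so after the first
     * pass the decode step is skipped entirely. */
    for (int pass = 0; pass < 3; pass++)
        for (unsigned long addr = 0x1000; addr < 0x1010; addr += 4)
            fetch_uops(addr, uops);
    printf("done\n");
    return 0;
}
```

The point of the sketch is just the hit/miss split: repeated execution of the same instructions (a hot loop) pays the decode/optimization cost once instead of on every iteration.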
--
[0] By modern, I mean "not ancient". The first processor to do that was the Pentium Pro.