The way I understand it, JEDEC had standardized KB/MB/GB as power-of-2-based units in memory specs before the IEC standardized KiB/MiB/GiB for the same purpose. As a result it kind of just stuck in that space, since the idea of a power-of-10-sized cache is a bit silly anyway, so there is no risk of confusion.
Because no other interpretation of M makes sense here (neither megabits nor base-10 megabytes), you can shorten it further without losing information, so naturally people have; why not? A similar story applies to network interface speeds like "10G", which makes its way even into standards names like "10GBASE-SR".
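To put rough numbers on the binary versus decimal readings of the same prefix letters, here is a minimal sketch. The table names and the `to_bytes` helper are made up for illustration and do not come from any standard or library.

```python
# Binary (JEDEC memory convention) vs. decimal (SI) meaning of the same letters.
# BINARY, DECIMAL, and to_bytes are hypothetical names for this illustration only.
BINARY = {"K": 2**10, "M": 2**20, "G": 2**30}
DECIMAL = {"K": 10**3, "M": 10**6, "G": 10**9}


def to_bytes(size: int, prefix: str, binary: bool = True) -> int:
    """Convert a size such as 32 "M" to bytes under either convention."""
    table = BINARY if binary else DECIMAL
    return size * table[prefix]


if __name__ == "__main__":
    for p in "KMG":
        # Relative amount by which the binary unit exceeds the decimal one.
        gap = BINARY[p] / DECIMAL[p] - 1
        print(f"{p}: binary unit is {gap:.1%} larger than decimal")
    print(to_bytes(32, "M"), "vs", to_bytes(32, "M", binary=False))
```

The gap grows with each prefix step (roughly 2.4% at K, 4.9% at M, 7.4% at G), which is why the ambiguity matters for disk sizes but not for caches, where only the binary reading is plausible.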