MySQL runs a huge amount of code for each query -- networking, parsing, IO, locking, etc. -- and each of those can easily have significant, hard-to-predict latencies.
Benchmarking that needs special care and planning around whatever it is you actually want to measure. A million trivial queries and a dozen very heavy queries exercise significantly different code paths, with different tradeoffs and performance characteristics.
The benchmark was specifically testing the hot path of a cached query in their MySQL caching proxy. MySQL wasn’t involved at all.
I agree completely that benchmarks need care, hence my point that the article is disappointing.
The author missed the chance to investigate why removing bounds checks seemed to regress performance by 15%, and instead wrote it off as “close enough.”
It would have been really interesting to find out why, even if it did end up being measurement noise.
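One way to sanity-check a result like that 15% gap is to repeat the benchmark many times and compare the difference in means against the run-to-run spread. This is a minimal sketch, not the article's harness; the two workload functions are hypothetical stand-ins for the builds with and without bounds checks:

```python
import statistics
import time

def bench(fn, runs=20, iters=100_000):
    """Time `fn` repeatedly and return a list of per-run durations in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        for _ in range(iters):
            fn()
        samples.append(time.perf_counter() - start)
    return samples

# Hypothetical stand-ins for the two builds under comparison.
def with_checks():
    data = [1, 2, 3, 4]
    return data[2]

def without_checks():
    data = [1, 2, 3, 4]
    return data[2]

a = bench(with_checks)
b = bench(without_checks)

# A 15% gap is only meaningful if it clearly exceeds the run-to-run spread.
gap = abs(statistics.mean(a) - statistics.mean(b))
spread = statistics.stdev(a) + statistics.stdev(b)
print(f"gap={gap:.4f}s  spread={spread:.4f}s  likely noise: {gap < spread}")
```

If the gap sits inside the spread, "measurement noise" is a defensible conclusion; if it sits well outside it, there's a real effect worth digging into.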
"Just a cached query" isn't the same as just a hash lookup. You're still doing IO, network protocol decoding, multithreaded synchronization, and so on. It's certainly not a CPU-bound program.