The fastest animal on earth is the fastest not because it has huge muscles, but because it falls. Is it fair to compare the top speed of a falcon to that of a cheetah? Yeah but... x y z.
Every benchmark can be set up in a way that gives an edge to one party or another. Especially if there is a vested interest.
And even in your example, the cheetah can hit 65 mph for a whopping 330 feet before risking total exhaustion. Meanwhile pronghorns, impala and antelope can reach 55-60 mph and maintain it for about half a mile, and they can vary their speed to increase distance.
So which would you consider faster? The one that can give you a few seconds of effort really quickly, or the one that gives you 90% of that effort but for hours?
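To make that concrete, here is a rough sketch of how the "winner" flips depending on the distance you choose to measure. The burst speeds and distances are the figures quoted above; the `after_mph` values (how fast each animal moves once its burst is spent) are made-up numbers purely for illustration, and the `Runner` class is hypothetical.

```python
from dataclasses import dataclass

FEET_PER_MILE = 5280
MPH_TO_FPS = FEET_PER_MILE / 3600  # miles per hour -> feet per second


@dataclass
class Runner:
    name: str
    burst_mph: float   # top speed, from the comment above
    burst_feet: float  # how far that speed can be held, from the comment above
    after_mph: float   # ASSUMPTION: invented post-burst speed, not real data

    def seconds_to_cover(self, feet: float) -> float:
        """Time to cover `feet`: burst pace first, then the slower pace."""
        fast_part = min(feet, self.burst_feet)
        slow_part = feet - fast_part
        return (fast_part / (self.burst_mph * MPH_TO_FPS)
                + slow_part / (self.after_mph * MPH_TO_FPS))


RUNNERS = [
    Runner("cheetah", burst_mph=65, burst_feet=330, after_mph=20),
    Runner("pronghorn", burst_mph=55, burst_feet=FEET_PER_MILE / 2, after_mph=30),
]


def winner(feet: float) -> str:
    """Name of the runner with the lowest time over the given distance."""
    return min(RUNNERS, key=lambda r: r.seconds_to_cover(feet)).name


for distance in (330, FEET_PER_MILE / 2):
    print(f"{distance:>6.0f} ft: {winner(distance)}")
```

Same two animals, same model, and the ranking depends entirely on which distance the person running the "benchmark" picked.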
The search space has many, many dimensions--CPU, MMU, OS implementation details, how well the compiler translated the source to machine code for that particular CPU, how well the benchmark implementation caters to the target architecture, and so on.
Immense opportunities to fool others, and even yourself.
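One modest guard against fooling yourself is to record at least some of those dimensions next to the number you report, and to report a spread rather than a single best run. A minimal sketch using only the Python standard library follows; `run_benchmark` and the result keys are names invented for this example, and it captures only a small slice of the dimensions listed above (nothing about compiler flags or the MMU, for instance).

```python
import platform
import statistics
import timeit


def run_benchmark(func, repeat=5, number=1000):
    """Time `func` and return the timings together with their context."""
    # timeit.repeat returns one total time per round of `number` calls.
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [t / number for t in totals]
    return {
        # Report a spread, not just the most flattering figure.
        "best_s": min(per_call),
        "median_s": statistics.median(per_call),
        "worst_s": max(per_call),
        # Context that a reader needs in order to interpret the numbers.
        "machine": platform.machine(),
        "os": f"{platform.system()} {platform.release()}",
        "runtime": f"{platform.python_implementation()} {platform.python_version()}",
        "repeat": repeat,
        "number": number,
    }


if __name__ == "__main__":
    result = run_benchmark(lambda: sorted(range(1000, 0, -1)))
    for key, value in result.items():
        print(f"{key:>9}: {value}")
```

It does not make the benchmark honest by itself, but it makes it harder to quote a number without the conditions it was measured under.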
All benchmarks are chosen to show precisely what the person presenting the benchmark wants them to, no more, no less. If the benchmark didn't tell a story that they want to show you, they wouldn't be showing it to you (or they'd be showing you a different benchmark).
This doesn't make them wrong, but benchmarks are never the full story.
That's true, but as with many human endeavors, we can reach a point that is good enough, and sometimes the process of implementing the benchmark itself leads to useful discoveries. It is not going to be perfect for all situations, but that doesn't mean it's not a valuable tool in some of them.