IIRC the branch predictor hint is useless on modern CPUs

Someone · 2024-12-18T18:22:09 1734546129

It was useless on modern CPUs for a while, but has become somewhat useful again on some recent Intel ones, starting with Redwood Cove. https://www.phoronix.com/news/GCC-Clang-Intel-x86-Branch-Hin...:

“Starting with the Redwood Cove microarchitecture, if the predictor has no stored information about a branch, the branch has the Intel SSE2 branch taken hint (i.e., instruction prefix 3EH), When the codec decodes the branch, it flips the branch’s prediction from not-taken to taken. It then flushes the pipeline in front of it and steers this pipeline to fetch the taken path of the branch.

...

The hint is only used when the predictor does not have stored information about the branch. To avoid code bloat and reducing the instruction fetch bandwidth, don’t add the hint to a branch in hot code—for example, a branch inside a loop with a high iteration count—because the predictor will likely have stored information about that branch. Ideally, the hint should only be added to infrequently executed branches that are mostly taken, but identifying those branches may be difficult. Compilers are advised to add the hints as part of profile-guided optimization, where the one-sided execution path cannot be laid out as a fall-through. The Redwood Cove microarchitecture introduces new performance monitoring events to guide hint placement.”
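For context, a minimal sketch of how source code usually tells GCC or Clang which way a branch is expected to go. `__builtin_expect` is a real builtin in both compilers; the `LIKELY` macro and `parse_digit` are hypothetical names made up for illustration. The builtin only passes the expectation to the compiler, which will typically just lay the likely path out as the fall-through. Whether a particular compiler version ever emits the 3EH prefix for such a branch is an assumption you would have to check in the disassembly.

```c
/* Hypothetical wrapper around the GCC/Clang builtin.
 * It tells the compiler the condition is expected to be true. */
#define LIKELY(x) __builtin_expect(!!(x), 1)

/* Hypothetical cold helper: called rarely, but when it is called
 * the input is almost always a decimal digit. */
static int parse_digit(char c)
{
    if (LIKELY(c >= '0' && c <= '9')) {
        return c - '0';   /* expected path */
    }
    return -1;            /* unexpected path: not a digit */
}
```

Calling `parse_digit('7')` returns 7 and `parse_digit('x')` returns -1; the annotation changes only what the compiler assumes about the branch, not the result.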

kristianp · 2024-12-18T23:12:31 1734563551

What about Macs? The perf report in the release notes[1] mentions [arm64-darwin24], so they're likely on a Mac.

[1] https://github.com/ruby/json/releases/tag/v2.8.0