It’s considerably more onerous than just compiling for one or more microarchitectures, though. Plus, when you do this, you need to split this code out so it is conditionally compiled, so that you can still support other architectures like ARM.
With Highway, you write the code only once and do not have to worry about any #pragma directives or conditional compilation. Just copy-paste about a dozen lines of boilerplate, link with the Highway library, and you are done.
Disclosure: I am the main author; happy to discuss.