I was under the impression that the parts of performance-oriented programs which typically get converted to assembly are, in essence, small profiled hotspots like very tight loops. As such, I doubt there's any real performance to be had from the high-level optimizations that intrinsics/extensions make possible in conjunction with that code.
But I'm certainly no expert in this area, so take my opinion with a large grain of salt.