Unfortunately those SWAR optimizations are only useful for strings that are alig...

Findecanor · 2024-07-29T09:44:15 1722246255

Masked SIMD operations, which are available in AVX-512 and ARM SVE, were intended to solve that problem. Memory operations can then always be aligned and a full vector wide, with a mask selecting only the elements that are valid.
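To make the idea concrete, here is a minimal scalar sketch in portable C. It is a model of the semantics, not real AVX-512 or SVE code: `LANES`, `masked_load` and `count_byte` are hypothetical names, and the 16-byte width is an assumption. Every block is aligned and full width, and the mask switches off the lanes before the start and after the end of the buffer, so those bytes are never read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* assumed vector width in bytes (hypothetical) */

/* Scalar model of a masked load: lane i is read only if bit i of mask is set.
   Masked-off lanes are zeroed and their memory is never touched. */
static void masked_load(uint8_t out[LANES], const uint8_t *buf,
                        ptrdiff_t off, uint32_t mask)
{
    for (int i = 0; i < LANES; i++)
        out[i] = ((mask >> i) & 1u) ? buf[off + i] : 0;
}

/* Count occurrences of c in buf[0..n) using only aligned, full-width blocks. */
size_t count_byte(const uint8_t *buf, size_t n, uint8_t c)
{
    assert(n <= PTRDIFF_MAX);

    /* Offset of the aligned block that contains buf[0]; zero or negative. */
    ptrdiff_t off = -(ptrdiff_t)((uintptr_t)buf & (LANES - 1));
    ptrdiff_t end = (ptrdiff_t)n;
    size_t count = 0;

    for (; off < end; off += LANES) {
        uint32_t mask = (1u << LANES) - 1u;        /* all lanes valid */
        if (off < 0)
            mask &= ~((1u << -off) - 1u);          /* drop lanes before buf[0] */
        if (end - off < LANES)
            mask &= (1u << (end - off)) - 1u;      /* drop lanes past buf[n-1] */

        uint8_t v[LANES];
        masked_load(v, buf, off, mask);
        for (int i = 0; i < LANES; i++)
            if (((mask >> i) & 1u) && v[i] == c)
                count++;
    }
    return count;
}
```

On real hardware the inner loops would each be a single masked instruction; the head and tail mask computation is the part that carries over.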

Even if a masked vector memory operation is unaligned and crosses into an unmapped or protected page, it will not fault as long as the lanes that fall in that page are masked off. There are even special "first-faulting" load instructions (LDFF1 in SVE) that effectively shorten the vector to end just before the first element that would have caused a fault, for operations such as strlen() where the length is not known beforehand.
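A rough scalar model of how such a load lets strlen() run right up to a page boundary, again in portable C rather than real intrinsics. The names `ff_load` and `strlen_ff` are hypothetical, and `mapped_end` is an assumption that stands in for the first unmapped address; real hardware discovers that boundary through the MMU instead of taking it as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* assumed vector width in bytes (hypothetical) */

/* Scalar model of a first-faulting load: copies up to LANES bytes from p,
   stops before the first byte at or past mapped_end, and returns how many
   lanes were actually loaded. */
static size_t ff_load(uint8_t out[LANES], const uint8_t *p,
                      const uint8_t *mapped_end)
{
    size_t avail = (p < mapped_end) ? (size_t)(mapped_end - p) : 0;
    size_t vl = avail < LANES ? avail : LANES;
    for (size_t i = 0; i < vl; i++)
        out[i] = p[i];
    return vl;
}

/* strlen built on the model above; never reads at or past mapped_end. */
size_t strlen_ff(const uint8_t *s, const uint8_t *mapped_end)
{
    size_t len = 0;
    for (;;) {
        uint8_t v[LANES];
        size_t vl = ff_load(v, s + len, mapped_end);
        /* If even the first lane is unmapped, real hardware would take a
           genuine fault; the model treats that as a caller bug. */
        assert(vl > 0);
        for (size_t i = 0; i < vl; i++)
            if (v[i] == 0)
                return len + i;
        len += vl; /* advance by the lanes that were really loaded */
    }
}
```

The loop advances by the number of lanes that were actually loaded, so a terminator sitting in the last mapped byte is still found without touching the next page.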