Hacker News

Nobody has mentioned PCMPISTRI (and the other SSE4.2 string extensions), but they deserve a benchmark here. Some of them appear in my regular glibc now, and they're hard to beat.

I found this Intel article for XML (but not JSON): https://software.intel.com/en-us/articles/xml-parsing-accele...




Intel employee here, working in string and regex processing. SSE4.2 has its uses, but it is not always the fastest thing you can use. We don't use it anywhere in Hyperscan. SSE4.2 instructions are not particularly hard to beat with other sequences, and it's worth noting that these instructions have not been promoted to AVX2, much less AVX-512.


PCMPISTRI and friends are great, but they're not very friendly to the problem described in the article, since they can only check membership in a 16-element set.


Eh? PCMPISTRI has a few different modes of operation, including full substring search and character classes. For example, in ranges mode the needle is read as pairs of range bounds, so `azAZ09` checks whether any byte in the search string falls in one of the ranges a-z, A-Z, or 0-9.
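For readers who have not used the character-class form, here is a minimal sketch of it through the `_mm_cmpistri` intrinsic from `<nmmintrin.h>`. The helper name `first_alnum` is hypothetical, and the `__attribute__((target("sse4.2")))` line assumes GCC or Clang on x86 (otherwise build with `-msse4.2`). The loop copies each 16-byte block into a zero-padded buffer so it never reads past the end of the string, which favors safety over speed; a tuned version would load directly.

```c
#include <stddef.h>
#include <string.h>
#include <nmmintrin.h>  /* SSE4.2 intrinsics: _mm_cmpistri and the _SIDD_* flags */

/* Range bounds come in pairs: a-z, A-Z, 0-9. The unused bytes stay zero, and
   the implicit-length form stops reading the needle at the first zero byte. */
static const char ALNUM_RANGES[16] = "azAZ09";

/* Unsigned bytes, "ranges" aggregation, report the lowest matching index. */
#define RANGES_MODE (_SIDD_UBYTE_OPS | _SIDD_CMP_RANGES | _SIDD_LEAST_SIGNIFICANT)

/* Hypothetical helper: index of the first ASCII alphanumeric byte in the
   NUL-terminated string s, or -1 if there is none. */
__attribute__((target("sse4.2")))  /* assumption: GCC/Clang on x86 */
long first_alnum(const char *s)
{
    const __m128i ranges = _mm_loadu_si128((const __m128i *)ALNUM_RANGES);
    size_t len = strlen(s);

    for (size_t off = 0; off < len; off += 16) {
        /* Zero-padded copy of the next block, at most 16 bytes. */
        char block[16] = {0};
        size_t n = (len - off < 16) ? (len - off) : 16;
        memcpy(block, s + off, n);

        __m128i hay = _mm_loadu_si128((const __m128i *)block);
        /* Returns the index of the first byte of hay inside any range,
           or 16 when no byte in this block matches. */
        int idx = _mm_cmpistri(ranges, hay, RANGES_MODE);
        if (idx < 16)
            return (long)(off + (size_t)idx);
    }
    return -1;
}
```

Swapping `_SIDD_CMP_RANGES` for `_SIDD_CMP_EQUAL_ANY` turns the same needle register into a plain set of up to 16 bytes, which is the membership test mentioned upthread.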

Regardless, in the OP, they're specifically looking for one of a small number of bytes, which is exactly what PCMPISTRI is supposed to be good for.

With that said, my experience mirrors glangdale's. Every time I've tried to use PCMPISTRI, it's either been slower than other methods or not enough of an improvement to justify it.
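As an illustration of the kind of "other method" that can compete when the byte set is small, here is a rough SSE2 sketch: one compare per target byte, OR the results, then take a movemask. The two bytes searched for (`"` and `\`) and the helper name are assumptions chosen for the example, not something taken from the article or from Hyperscan, and `__builtin_ctz` assumes GCC or Clang. No claim is made here about which approach is faster on a given CPU; that needs a benchmark.

```c
#include <stddef.h>
#include <string.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: index of the first '"' or '\\' in buf[0..len),
   or -1 if neither byte occurs. */
long first_quote_or_backslash(const char *buf, size_t len)
{
    const __m128i quote  = _mm_set1_epi8('"');
    const __m128i bslash = _mm_set1_epi8('\\');
    size_t off = 0;

    /* Full 16-byte blocks: two byte-wise compares, combined with OR. */
    for (; off + 16 <= len; off += 16) {
        __m128i v   = _mm_loadu_si128((const __m128i *)(buf + off));
        __m128i hit = _mm_or_si128(_mm_cmpeq_epi8(v, quote),
                                   _mm_cmpeq_epi8(v, bslash));
        int mask = _mm_movemask_epi8(hit);  /* one bit per matching byte */
        if (mask)
            return (long)(off + (size_t)__builtin_ctz((unsigned)mask));
    }

    /* Scalar tail for the last len % 16 bytes. */
    for (; off < len; off++)
        if (buf[off] == '"' || buf[off] == '\\')
            return (long)off;

    return -1;
}
```

The cost grows by one compare and one OR per extra byte in the set, so this shape only makes sense for a handful of targets.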


Oh, didn't know that! Thanks.



