I'll check your new aesni hash, but would recommend to try a simple fast 32bit h...

rurban · on March 3, 2020

So I tested your aesnihash with smhasher. It's really bad. A much better 64bit aesni variant would be falkhash, which is 4x faster, supports a seed and passes most tests.

https://github.com/gamozolabs/falkhash

nasm -f elf64 -o falkhash-elf64.o falkhash.asm

majke · on March 3, 2020

<update> Crap. That gist contained some draft version of aesnihash. SORRY. See updated version https://gist.github.com/majek/96dd615ed6c8aa64f60aac14e3f6ab... </update>

Thanks for spending time on this. I would like to understand what "really bad" and "fails most of the tests" means.

For the record, the commit: https://github.com/rurban/smhasher/commit/10f56385f3e9abb018...

The main point of this hash, in this context, is to do a streaming hash and find \n in one loop. The intention is to reduce _mm_loadu_si128 data loads (I already have user data in xmm0, so why not do some AES-NI work on it already?). Because it's streaming, I can't, for example, derive the initial seed from the chunk length, since the length is unknown at the time the hash is called. See:

https://github.com/cloudflare/cloudflare-blog/blob/master/20...

I don't need a full AES hash, but maybe that could be an option as well.

In other words, in my case I don't care just about hash() speed. I care about memchr() + hash() speed. I would like to understand/measure the hash quality itself. Maybe adding another aesenc round would be sufficient to fix it.
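To make the "memchr() + hash() in one loop" idea concrete, here is a rough sketch. It is not the aesnihash from the gist. The function name hash_line, the key constants, the zero-padding of the last block, and the two finishing rounds are all made up for illustration, and nothing here says anything about hash quality. It assumes GCC or Clang on x86-64 with AES-NI. Each 16-byte load is used twice: once to look for \n and once as input to an aesenc round. The length is only mixed in at the end, because it isn't known earlier.

```c
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: hashes bytes up to (not including) the first '\n',
 * or all of buf if there is none, and stores that byte count in *line_len.
 * The target attribute (GCC/Clang) enables the AES-NI intrinsics here. */
__attribute__((target("aes,sse2")))
static uint64_t hash_line(const uint8_t *buf, size_t len, size_t *line_len)
{
    const __m128i newline = _mm_set1_epi8('\n');
    /* Arbitrary illustrative constants standing in for a seed / round key. */
    const __m128i key = _mm_set_epi64x((long long)0x9e3779b97f4a7c15ULL,
                                       (long long)0x243f6a8885a308d3ULL);
    __m128i state = key;
    size_t off = 0;

    while (off < len) {
        size_t n = (len - off < 16) ? (len - off) : 16;
        uint8_t tmp[16] = {0};
        __m128i chunk;

        if (n == 16) {
            chunk = _mm_loadu_si128((const __m128i *)(buf + off));
        } else {
            /* Short tail: zero-pad so we never read past the buffer. */
            memcpy(tmp, buf + off, n);
            chunk = _mm_loadu_si128((const __m128i *)tmp);
        }

        /* Same register serves the newline search and the hash. */
        unsigned mask =
            (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, newline));
        if (mask != 0) {
            size_t p = (size_t)__builtin_ctz(mask); /* index of first '\n' */
            /* Keep only the bytes before the newline, zero-padded. */
            memset(tmp, 0, sizeof tmp);
            memcpy(tmp, buf + off, p);
            chunk = _mm_loadu_si128((const __m128i *)tmp);
            state = _mm_aesenc_si128(_mm_xor_si128(state, chunk), key);
            off += p;
            break;
        }

        state = _mm_aesenc_si128(_mm_xor_si128(state, chunk), key);
        off += n;
    }

    /* The length is only known now, so it is mixed in at the end. */
    state = _mm_xor_si128(state, _mm_cvtsi64_si128((long long)off));
    state = _mm_aesenc_si128(state, key);
    state = _mm_aesenc_si128(state, key);

    *line_len = off;
    return (uint64_t)_mm_cvtsi128_si64(state);
}
```

Calling hash_line on "hello\nworld" (11 bytes) sets the line length to 5 and returns the same value as hashing "hello" alone, since everything from the newline on is ignored. Adding or removing aesenc rounds in the loop is a one-line change, which makes it easy to rerun smhasher and see what each extra round buys.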

telendram · on March 10, 2020

Even falkhash is not that great: it features an abnormal number of collisions as the number of hashes increases. Basically, all "naive" AES implementations share this design weakness.