Hacker News new | past | comments | ask | show | jobs | submit login

Typically the hash functions that you are familiar with due to cryptographic or data consistency use (SHA family, MD family, etc) do not make for good hash table choices because they produce hashes that are much larger than needed and are slow to compute so that they have better cryptographic properties (extremely low collision, no information leakage about inputs, difficulty of guessing inputs). When picking a hash function for a hash table, you want a function that makes a hash just big enough and with low enough collisions while still being fast and easily dealing with variable length keys. This could he something as simple as byte-wise XOR or addition with some shifting as you iterate the key followed by a mod or even bitwise AND mask to pick an index.



However, collision resistance must be still quite good for use in a general-purpose hash table or a HT that is possibly exposed to attackers, otherwise denial-of-service attacks become very easy.

Many "modern" implementations (Python, Ruby, Perl, Rust, Redis, ...) use SipHash with a random seed for this very reason.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: