Hacker News new | past | comments | ask | show | jobs | submit login

If you use a secure hash on the banned passwords they will be random (as good as arbitrary), then just truncate them to 7-8 bits and sort by prefix, searching them using binary search. It will be more efficient than a Bloom filter because there can be zero empty space between entries. It creates a structure similar to a Bloom filter but with no wasted space (no empty entries)

A Bloom filter is great when you need a probabilistic hash table that is fairly efficient and can have records added efficiently.

With a fixed list as I described above, the insertion cost is enormous (avg cost of moving 1/2 of the array elements), but the space/speed efficiency reaches the theoretical limit




Your math is bogus. 7-8 bits? If there's one entry, that would give false positives of 1/256, or 0.3%. Even given 1k entries would likely hit the vast majority of entries.

Or, flipping this: every time you add a password, you check that it hasn't been used before. You can only set 256 passwords, total, before the space of truncated hashes is filled.


Bloom filters don't do anything special to change FP rates, a compact list of random hashes is always more efficient than using a Bloom filter until the filter is 100% full


thanks for the clarification, with a fixed list that makes sense.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: