Hacker News new | past | comments | ask | show | jobs | submit login

Lenore’s approach uses rejection sampling, which translates to needing an uncertain amount of input — a uint128 may or may not be enough.

But there is one related algorithm that doesn’t: https://github.com/apple/swift/pull/39143. This method only requires (output word size + 64) bits, which is really convenient for us since we probably don’t have 2^64 buckets. Same ballpark as (hash >> 64) * n, just with bias correction.

With this method the output value will be very not-scrambled relative to the high bits of the hash. Now whether that’s a problem is… someone else’s question. One can always do some shifts and XORs.




I wrote about this at https://dotat.at/@/2022-04-20-really-divisionless.html

You don’t need to worry about this for mapping hashes to buckets, because a hash has a limited width, whereas unbiased random numbers are based on an unbounded stream of random bits.

In this article they are working with 128 bit hashes so they already have plenty of bits for bias not to be a problem. A single straightforward multiplication is fine.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: