> It is easy to generate a random sequence of integers in the range 1..N, so long as we don’t mind duplicates
Is it? I was surprised to learn of modulo bias [1]. Does the "Randint(1, J)" implementation take it into account? It's a very easy mistake to make and can have an impact on the uniformity of the shuffle.
If you haven't heard of it, it has to do with the relationship between the range spanned by your system's RAND_MAX and J. If RAND_MAX+1 is not a multiple of J, the topmost partial block of raw values can't cover all J outcomes, so some results in 1..J come up slightly more often than others.
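For concreteness, here is a minimal sketch of the usual fix, assuming a rand()-style source that is uniform over 0..RAND_MAX: reject the raw values in the incomplete top block and draw again, so every result in 1..J is equally likely (the rand_int name is just for illustration, not anyone's actual API):

    #include <stdlib.h>

    /* Sketch only: an unbiased RandInt(1, j) built on rand(), assuming
     * rand() is uniform over 0..RAND_MAX and 1 <= j <= RAND_MAX. Raw
     * values from the incomplete top block are rejected and redrawn,
     * which removes the modulo bias at the cost of an occasional retry. */
    int rand_int(int j)
    {
        unsigned long span  = (unsigned long)RAND_MAX + 1UL;
        unsigned long limit = span - (span % (unsigned long)j); /* largest multiple of j <= span */
        unsigned long r;

        do {
            r = (unsigned long)rand();
        } while (r >= limit);              /* reject the biased tail */

        return (int)(r % (unsigned long)j) + 1;   /* uniform in 1..j */
    }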
Any random number generator library worth its salt should have a properly implemented RandInt(1, J) function available, instead of just a RandInt(1, RAND_MAX) primitive.
Behind the scenes, mind you, it's easy to implement such things from random number generators of a fixed range. For example, it could be done as follows (for convenience, I'll speak as though our fundamental randomness primitive just produces random bits, though all sorts of things would work just as well):
We keep track, internally to our random number generator library, of a persistent interval [Low, High) within [0, 1), initialized to all of [0, 1). Any time a RandInt(1, J) is required, we partition [0, 1) into J equal subintervals, find which one [Low, High) lies entirely within, return the corresponding number, and also rescale [Low, High) by the affine transformation that maps that subinterval onto all of [0, 1). If, while doing this, we find that [Low, High) spans multiple subintervals, we first repeat the following until it does not: generate a random bit and replace [Low, High) with either its lower or higher half accordingly.
Essentially, we are using [Low, High) to keep track of an underspecified value uniformly distributed over [0, 1), pulling out leading "digits" of that value, in whatever base the user requires, and zooming our window in accordingly after each request. Random bits are used to specify the value further and further, so no randomness ever goes to waste: at any point, we will have drawn the minimum number of random bits needed to produce the random integers, of whatever varying sizes, requested so far.
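A rough sketch of that scheme, keeping the window as a pair of doubles and using rand() as a stand-in source of random bits (all names here are made up for illustration; a real implementation would track the window with integer arithmetic so it doesn't lose precision as it shrinks):

    #include <stdlib.h>

    /* Persistent window [low, high) within [0, 1), standing for the
     * not-yet-fully-specified uniform value described above. */
    static double low = 0.0, high = 1.0;

    static int random_bit(void)        /* assumption: one fair bit per call */
    {
        return rand() & 1;
    }

    /* Uniform integer in 1..j; random bits are consumed only while the
     * window straddles a boundary between the j equal subintervals. */
    int rand_int_from_bits(int j)
    {
        for (;;) {
            int    bucket = (int)(low * j);          /* subinterval containing low */
            double b_lo   = (double)bucket / j;
            double b_hi   = (double)(bucket + 1) / j;

            if (high <= b_hi) {
                /* [low, high) fits inside one subinterval: that's the answer.
                 * Zoom back out via the affine map taking [b_lo, b_hi) to [0, 1). */
                low  = (low  - b_lo) * j;
                high = (high - b_lo) * j;
                return bucket + 1;
            }

            /* The window spans a boundary: spend one random bit to halve it. */
            double mid = low + (high - low) / 2.0;
            if (random_bit())
                low = mid;
            else
                high = mid;
        }
    }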
Sure, but in practice I think we'd be hard pressed to find someone with such stringent requirements.
For all practical purposes, using your entropy stream to seed ChaCha is more than good enough. Want something with a proof? Seed a Blum Blum Shub generator.
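For what it's worth, the Blum Blum Shub recurrence itself is tiny; here is a toy sketch with a deliberately small modulus (the security argument only applies when p and q are large secret primes, both congruent to 3 mod 4):

    #include <stdint.h>

    /* Toy Blum Blum Shub: x_{n+1} = x_n^2 mod M with M = p*q, emitting the
     * low bit of each state. p = 11 and q = 23 are for illustration only;
     * real use needs cryptographically large primes and a seed coprime to M. */
    static const uint64_t BBS_M = 11u * 23u;   /* M = 253 */
    static uint64_t bbs_state   = 3;           /* seed, coprime to M */

    int bbs_next_bit(void)
    {
        bbs_state = (bbs_state * bbs_state) % BBS_M;
        return (int)(bbs_state & 1u);
    }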
That article is very interesting, in a strange way. It points out so many simple, obvious things about card shuffling that aren't visible to a programmer from a high-level view.
[1] https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#M...