If you can live with [0, 1), there are simpler ways of doing this. Generate a number in [1, 2) by filling the 52-bit fraction field with a random integer. Then subtract 1. And you're done.
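For the curious, here's a minimal sketch of that trick in C, assuming a placeholder rand52() that returns 52 uniform random bits:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the [1,2)-minus-one trick; rand52() is a placeholder
 * for any source of 52 uniform random bits. */
double uniform_open(uint64_t (*rand52)(void))
{
    /* Exponent field 0x3FF encodes 2^0, so the assembled value lies in [1, 2). */
    uint64_t bits = (UINT64_C(0x3FF) << 52)
                  | (rand52() & ((UINT64_C(1) << 52) - 1));
    double d;
    memcpy(&d, &bits, sizeof d);   /* reinterpret the bits as an IEEE 754 double */
    return d - 1.0;                /* shift down to [0, 1); only multiples of 2^-52 */
}
```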
What you propose is similar to generating an integer in {0, 1, ..., 2^k - 1} and dividing by 2^k. This is not the same as drawing a uniformly random real number and rounding to the nearest floating-point number. The original link addresses exactly this problem.
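Concretely, that construction is something like the following (k = 53 chosen for illustration; rand64() is a placeholder for whatever 64-bit source you have):

```c
#include <stdint.h>
#include <math.h>

/* Integer in {0, ..., 2^k - 1} divided by 2^k, here with k = 53;
 * rand64() is a placeholder for a 64-bit random source. */
double divide_by_2k(uint64_t (*rand64)(void))
{
    uint64_t r = rand64() >> 11;    /* keep 53 bits */
    return ldexp((double)r, -53);   /* exact, but only multiples of 2^-53 ever appear */
}
```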
If you generate a random double-precision floating-point number in [1,2) and subtract 1, you will never generate 2^-100. A correct algorithm will be able to generate 2^-100 (with very small probability!)
Sorry--chockablock here. I misunderstood your argument and see that you were correct and also that your follow-up was on-subject (see: not a whiz). I'd delete but have noprocrast set.
I'm not making claims about the correctness of the approach (I'm not enough of an FP math whiz); I was just pointing out that your original critique appeared to be based on a misunderstanding.
Neat read. They note that the technique has been around for a while.
"This method goes back to at least 1997: Agner Fog used this method in his open source library [4], and others seemed to have invented it independently, too."
[4] is a link to www.agner.org/random , which appears to contain much detailed information on efficient random number generation.
Well, technically the probability of getting exactly 1 should be 0. Thanks to rounding, it isn't literally a 0 probability, but for any practical purpose the open and closed intervals should be equivalent.
No. No no. No no no! Stop that. Stop that right now. You are a bad person and should feel bad.
I've dealt with more than a few crashes that came from open vs closed set differences. Yeah, the odds of it happening on a given roll of the dice is low. About one in 8 million for a 32-bit float. But if you have 100,000 users and they each roll that dice even a single time per use of your application then it soon becomes less about the odds of it happening and more about how many times it happens per day. And if it happening can cause a crash, well you're gonna have a bad time. And since you're dealing with floating point numbers they're interacting with gods know what so all it takes to cause a crash is "oh this operation can never result in a divide by zero because this other value can never be exactly one". Oops.
Heh. You're very right about the generation of 1. Even odds of around one in 2^54 are not rare enough to ignore. Put it in GPU code and you could find it breaking in seconds.
But would you be comfortable ignoring double precision 0, with a proper algorithm that makes the probability somewhere around 2^-1070 or 2^-1020?
I'd always prefer an algorithm that is correct 100% of the time over one that is correct at any rate less than 100%. This is the type of thing where there is no fundamental reason you can't simply do it right. The more things you can do right, the less you have to worry about! It's much simpler.
That said, I'd certainly listen to the tradeoffs of being correct 100% of the time vs being wrong a mere 1 in 2^1020 times. I wonder if 2^-1020 is greater than or less than the chance of a cosmic ray flipping a bit...
Since I was curious: it appears that IBM did research in the 90s and came up with roughly one bit flip per 256 MB of RAM per month. That's 3.7 × 10^-9 per byte per month, or 1.4 × 10^-15 per byte per second.
Confusing floats with reals is a mortal sin. The probability of a random real in [0, 1] being 1 is zero, but the probability for a float definitely is not.
I appreciate the effort to write this up, but I'm not seeing that this has applicability to any real problem. If you put 53 random bits in the mantissa with a fixed exponent, you get a smooth distribution over [0, 1 - 2^-53] in increments of 2^-53. He seems to be concerned that you will never get a number in (0, 2^-53). True, but you'll also never get any other number in (n * 2^-53, (n+1) * 2^-53) for the rest of the range.
[wrong statement about denormals deleted]
His talk about "dividing" to produce the desired range seems misguided. Floating point division is ~10x the cost of any other floating point operations. If you are trying to produce precise floating point numbers, you almost certainly want to be working at the level of bits rather than hoping that division will do the work for you.
Numbers less than 2^-53 aren't denormals. Denormals occur when the exponent field is all zeros, i.e. for numbers less than 2^-1022. The floating-point representation of a double (ignoring sign) is normally:
1.mantissa * 2^(exponent - 1023)
A denormal occurs when the exponent field is all zeros, and the representation switches to
0.mantissa * 2^(-1022)
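If it helps, here's a rough sketch in C that checks the exponent field directly, just to illustrate where the boundary sits:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Rough illustration of the exponent-field rule above. */
static int is_denormal(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    uint64_t exponent = (bits >> 52) & 0x7FF;             /* 11-bit biased exponent */
    uint64_t fraction = bits & ((UINT64_C(1) << 52) - 1); /* 52-bit mantissa field */
    return exponent == 0 && fraction != 0;                /* all-zero exponent, nonzero fraction */
}

int main(void)
{
    printf("%d\n", is_denormal(0x1p-60));    /* 0: far smaller than 2^-53, still normal */
    printf("%d\n", is_denormal(0x1p-1022));  /* 0: smallest normal double */
    printf("%d\n", is_denormal(0x1p-1030));  /* 1: denormal */
    return 0;
}
```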
First off, this seems like unordered notes, and I spent 15 minutes trying to figure out what the actual requirements were, because it kept stating things it needed and then talking about solutions that didn't have those features.
Anyway, the main goal seems to be even density of chosen numbers, while also not skipping floats. In other words the probability of any specific float being chosen is proportional to its width.
This is easy to do, and needs no rounding, at least for [0, 1).
Observe that the floats in [0.5, 1) cover half of the probability space. Use one random bit to choose between them and the smaller values.
Then the floats in [0.25, 0.5) cover half of the remaining space. Use another random bit, and so on.
Once you've chosen the exponent, fill in the mantissa with 52 random bits. All floats in the range are the same distance apart, so random bits work fine.
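Something like this, assuming placeholder sources rand_bit() (one fair random bit) and rand52() (52 uniform random bits):

```c
#include <stdint.h>
#include <math.h>

/* Sketch of the scheme described above; rand_bit() and rand52()
 * are placeholders for whatever bit source you use. */
double uniform_01(int (*rand_bit)(void), uint64_t (*rand52)(void))
{
    int exponent = -1;                            /* start with [0.5, 1) */
    while (rand_bit() == 0 && exponent > -1022)   /* each miss halves the range */
        exponent--;                               /* crude stop at the subnormal boundary */

    /* Every float in [2^exponent, 2^(exponent+1)) is the same distance apart,
     * so 52 random mantissa bits pick one of them uniformly. */
    uint64_t frac = rand52() & ((UINT64_C(1) << 52) - 1);
    return ldexp(1.0 + ldexp((double)frac, -52), exponent);
}
```

The loop terminates after two flips on average, so the expected cost is tiny; the crude cutoff at exponent -1022 only matters with probability around 2^-1021.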
If you really want [0,1], then you must first define how wide of a range of real numbers you want to assign to "1". Rounding toward zero makes more sense. Alternatively, note how nonzero double precision floats go down below 2^-1000. You will never ever actually generate a zero, no matter what you do. Why should "1" be more likely than "0"?
But if you insist on rounding to nearest then you can put in a special clause where a mantissa of 0 has a 50% chance of being treated as 2^52.
If you really want [0,1], it might be simpler to include a check at the start:
if (1-in-number-of-floats-in-[0,1)) return 1;
Your version of 50/50 split with the mantissa will mean that 0 and 1 will be less often chosen than others.
However, your version won't generate a (quite) uniform distribution anyway, due to subnormals. I think a simple fix for that would be: if your exponent reaches the subnormals, just return a random mantissa with the subnormal exponent (sketched below).
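Something like this, say (rand52() again a placeholder for 52 uniform random bits):

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the suggested fix: once the coin flips walk the exponent down
 * into the subnormal range, emit 52 random fraction bits with the
 * all-zero exponent field. */
double subnormal_tail(uint64_t (*rand52)(void))
{
    uint64_t bits = rand52() & ((UINT64_C(1) << 52) - 1);  /* exponent field = 0 */
    double d;
    memcpy(&d, &bits, sizeof d);   /* a uniformly chosen subnormal (or zero) */
    return d;
}
```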
I suppose my version might be wrong when it hits a denormal. It would depend on the rounding mode. Also it would never actually get to a denormal.
But you're right that to be obviously correct it should stop when it reaches denormals.
>Your version of 50/50 split with the mantissa will mean that 0 and 1 will be less often chosen than others.
That's intentional, because rounding a real number to the nearest float has the same behavior. The spec is wrong and I advise against following it, but that is how you do it.
I like the idea of a random number generator that exploits the full precision of the floats. However, the issue with the naive method is loss of accuracy, not bias. When the article states
> If k is 64, a natural choice for the source of bits, then because the set {0, 1, ..., 2^53 - 1} is represented exactly, whereas many subsets of {2^53, 2^53 + 1, ..., 2^64 - 1} are rounded to common floating-point numbers, outputs in [0, 2^-11) are underrepresented.
this is very misleading, because the points that are "underrepresented" are closer together. The discrete probability density created by the naive method is a perfectly good approximation to the uniform distribution.
Furthermore, although we can do better, as the article shows, this can, and must, come at a cost. Since the small numbers occur with much lower probability, the only way to generate them (without importance sampling) is by generating much more randomness (1088 bits in the article). Generating random bits is expensive, and this uses 17 times more randomness than the naive method.
EDIT: it would be possible to avoid generating all the extra bits in most cases, and so an efficient average time algorithm should be possible, but the worst case still requires 1088 bits to be generated.
For [0, 1), would the naive approach of generating a string of, say, 16 digits, sticking a "0." in front of it, and calling strtod / atof yield similarly uniform results?
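For concreteness, I mean something like this (rand_digit() being a placeholder for a uniform decimal-digit source):

```c
#include <stdlib.h>

/* Build "0." followed by 16 random decimal digits and parse it;
 * rand_digit() is a placeholder returning a uniform digit 0-9. */
double decimal_uniform(int (*rand_digit)(void))
{
    char buf[19] = "0.";                 /* "0." + 16 digits + '\0' */
    for (int i = 0; i < 16; i++)
        buf[2 + i] = (char)('0' + rand_digit());
    buf[18] = '\0';
    return strtod(buf, NULL);
}
```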
Wouldn't the best (and coincidentally simplest) way be to synthesize the double-precision word by combining a random integer exponent, a random integer mantissa, and a random sign bit? You'd need to account for the logarithmic distribution of the exponent, but that seems like the biggest complication.
I don't understand the debate about rounding. What does it mean to round a random number? If the rounding process is biased, the number will be less random as a result, so you shouldn't have rounded it. If the process isn't biased, the number will be no more or less randomly-distributed than it was before, so there was no point in rounding it.
> What is the `uniform' distribution we want, anyway? It is obviously not the uniform discrete distribution on the finite set of floating-point numbers in [0, 1] -- that would be silly. For our `uniform' distribution, we would like to imagine[*] drawing a real number in [0, 1] uniformly at random, and then choosing the nearest floating-point number to it.
Because of the logarithmic representation, there are as many floating point numbers between .25 and .5 as there are between .5 and 1. If you uniformly sample numbers with an exact, finite floating point representation then you don't get something that 'looks like' a uniform distribution of real numbers in [0, 1] -- which is more likely what was wanted.
Right, which is why I said you'd have to account for the log bias when you pick a random integer exponent. Otherwise most of your results would be really tiny.
The method bhickey suggests above sounds like a nifty workaround for the problem as well.
I'm sure that there are valid reasons to want to do this, but if you find yourself needing to do this, I think you should at least consider whether there might be some other way to get what you want.
The problems in the article have nothing to do with the source; it assumes that a source of randomness already exists. Mersenne Twister provides you with some random numbers (generally sequences of random, uniformly distributed, 32 bit integers). It does not solve the problem of picking a random double between 0 and 1.
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/dSF...