Behind Intel's New Random-Number Generator (ieee.org)
170 points by synacksynack on Aug 31, 2011 | 39 comments



It seems like it's (ab)using the metastability[1] of the dual-inverter circuit as the input source. Since metastable states can persist for arbitrarily long periods (with asymptotically vanishing probability), the bias-testing and reset mechanisms are needed.

I assume they're controlling the system to ensure that thermal noise dominates, and that the de-bias feedback loop and signal conditioner can strip out any low-frequency thermal changes (like temperature forcing by heavily loading the CPU).

There are some really neat attacks that use thermal system properties to leak or force information. A submission from a while back: http://news.ycombinator.com/item?id=2872274 struck me as particularly sneaky.

[1] https://secure.wikimedia.org/wikipedia/en/wiki/Metastability...


Is there a risk that, after running for a long time, the dual inverter circuit (hardware) could degrade into a "stuck" or severely biased state? At some point, the conditioner probably can't compensate. Is there a method for software to query the RdRand's "health"?


Yes, the instruction that returns random bits returns a flag indicating that the bits were successfully obtained. I guarantee you somebody somewhere is going to forget to check that flag.
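For anyone curious, here's roughly what checking it looks like in C (a minimal sketch, assuming a compiler that exposes the _rdrand64_step intrinsic from <immintrin.h>; Intel's guidance is to retry a bounded number of times before giving up):

    #include <immintrin.h>
    #include <stdint.h>

    /* Returns 1 and writes a random value on success, or 0 if the DRNG
       keeps reporting "not ready" (i.e. the carry flag stays clear). */
    int get_hw_random(uint64_t *out) {
        unsigned long long val;
        for (int tries = 0; tries < 10; tries++) {
            if (_rdrand64_step(&val)) {   /* carry flag set -> valid data */
                *out = val;
                return 1;
            }
        }
        return 0;   /* the return value somebody will forget to check */
    }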


Sounds like that is built into the RdRand instruction. From the article:

"We really like to sleep well at night, so we've built in additional circuitry that tests to make sure that the concentrating machinery is always working with bit streams that aren't too biased. If it detects a biased output, we tag it as unhealthy and refine it to meet our standards. This way, only healthy pairs of bit streams are combined."


Not unless the hardware fails. The conditioner is a simple, bulletproof feedback loop. (I think it can be done with two resistors, a transistor, and a capacitor, with one internal state: moving towards 0/1 balance.)


It's easy to see that {process-id, parent-id, timestamp} can lead to a pretty predictable random-number seed. But Amazon also had another easily available source of randomness: the timestamps of other customers' orders.

Why not generate a random number, index to a random customer, and re-seed? One extra DB lookup and you've got a whole lot more randomness (like a private lava lamp) to access.


Both sides have to generate secure random numbers in order to perform an ephemeral Diffie-Hellman exchange securely; if the client's random number is insecure, a man-in-the-middle attack becomes possible.

Non-DH-based exchanges are even worse - the entire exchange hinges on a random number generated by the _client_. The server effectively doesn't even have a chance to generate a random number at all.


But surely the client could use human input as its source of randomness too: after all, in this case (browser purchases) the user is going to be hitting keys and/or moving the mouse.


It could, and today it does. But back then it didn't.


Using a physical process is generally going to be harder to exploit than a software one. There's the possibility of uneven distribution (customer order timestamps may cluster at certain parts of the day, rather than being evenly distributed), as well as the (granted, small) possibility that on a small site without Amazon scale, the attacker can generate enough entries to reasonably assume one of them will be picked by the random index.

One interesting story about exploiting a poor RNG in Online Poker: http://www.cigital.com/papers/download/developer_gambling.ph...

https://webcache.googleusercontent.com/search?q=cache:AdC18Y... is another one, where the PRNG seed for a shared Nethack server wasn't random enough, and could be discovered and synchronised for fun and cheaty profit.


I thought the 40-bit keys were used due to cryptographic export restrictions, and not lack of randomness?

http://en.wikipedia.org/wiki/40-bit_encryption

Seems to agree with me.


The authors didn't cite Netscape because the 40-bit key is too short; they cited it because the PRNG in the SSL implementation was faulty.

They link to this page with more details: http://www.cs.berkeley.edu/~daw/papers/ddj-netscape.html


Now that's a perfect example of something that seems obvious in hindsight, but obviously wasn't since the problem has existed for so long without a solution.

It's also nice to see a more EE oriented post on hacker news now and again.


No. There has been no shortage of perfectly workable designs for hardware random number generators patented over the years. If anything, it's something that engineering types with a fondness for crypto obsess about too much.

The reality is that it's just not that hard to generate quality random numbers once a few seconds have passed after bootup on a busy system. Nothing needs gigabit rates of random numbers.

The cases where applications have had lousy random numbers have all been boneheaded implementation bugs (like the early-90s Netscape one they reference).

There are two reasons why this had not been implemented until now:

1. Low demand. It's entirely practical for applications to generate random numbers in software, especially with help from the kernel. With a hardware RNG on some chips, a fallback implementation would still have to be available and people who care about their CSPRNGs in software wouldn't fully trust Intel's numbers anyway.

2. Additional design, documentation, and testing burden. Every feature has a cost. The cryptographic qualities of random numbers are nearly impossible to test as a black box.


Speaking as a graphics programmer, I hope to see this hardware-level randomness eventually incorporated into GPUs. Randomness on a GPU is hard, and there are so many ways that a RNG can improve visual quality.

For example, adding a tiny amount of film grain to the final rendered image would most easily be done with RNG.

Another example: it's common for renderers to accumulate the scene's illumination into a HDR texture, and then radially blur it. The goal is to approximate the halo you observe around light sources at night time. The problem is, a Gaussian blur is perfectly smooth --- so the resulting glow typically ends up being a perfectly smooth blurry circle of light. But a real photograph of the same scene rarely has smooth glows. The image exhibits all kinds of subtle noise in the light's corona. After all, a photograph results from interaction of light with the camera lens, multiple scattering in outdoor scenes, chromatic dispersion, interaction between polarized light and certain materials, ... and many, many more phenomena.

So rather than try to model each of those physical lighting effects in realtime, it would be wonderful to have a "RNG()" shader function to simply add a smidgen of unpredictability to our realtime rendering techniques. The closer your final image resembles nature, the better it looks --- and nature is many things, but she is not a perfectly smooth BRDF lighting model!

So yeah... It would be frickin' sweet to have this RNG on a GPU.


In what way is randomness on a GPU hard? The nVidia SDK apparently comes with a few of them, and you can implement an LCG in a handful of arithmetic operations. If you need a seed, feed it in from the CPU.
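For instance, a bare-bones 32-bit LCG (a sketch using the Numerical Recipes constants) really is just one multiply and one add per output; seed the state from the CPU before launching the kernel:

    #include <stdint.h>

    /* Minimal 32-bit linear congruential generator (Numerical Recipes
       constants). One multiply and one add per 32 bits of output. */
    static uint32_t lcg_state = 1;   /* seed this from the CPU */

    uint32_t lcg_next(void) {
        lcg_state = lcg_state * 1664525u + 1013904223u;
        return lcg_state;
    }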

I don't see the benefit of having a high-cryptographic-quality RNG for rendering.


You'll have to take my word for it, or try it yourself: there are visually noticeable repetitive patterns that are a direct result of the naive PRNG.

For example I once modeled raindrops as falling billboards that struck the ground at some random (x,y) in your field of view, followed by a "splash" animation billboard where it struck the ground. I started by using C rand() to determine the random (x,y) of the raindrop. And it looked terrible! The raindrops would mysteriously clump together in sinewave-like patterns, and seemingly avoid some spots altogether.

I dropped in a Mersenne Twister RNG and the problem instantly went away. I didn't change a thing except the RNG.

So yeah, a few arithmetic ops gets you rand() quality. But humans are unfortunately good at noticing the overall bias in the resulting patterns.

And you'll run into similar problems when e.g. trying to implement a non-repeating Perlin noise function on a GPU. I tried it with a weighted combination of several different Perlin noise textures... but I had to be very careful about aliasing artifacts in some of the more interesting effects, while also being careful to avoid visually repetitive patterns (which almost always occur).

The value of Intel's RNG, I'd imagine, is that it's uniformly random and also unbiased -- meaning repetitive moire patterns won't show up in your results.

Of course it's not deterministic, which is tricky, but I can still think of several use cases for something like this.


So what you need is a high-quality Pseudorandom Number Generator (PRNG). You do not need a Cryptographically Secure PRNG.

This is good: what you need can be implemented significantly more efficiently than the CS version. As you mentioned, it doesn't help your application to be truly nondeterministic.

C rand() is about the worst one you can find. Mersenne Twister is great statistically, but it's quite expensive in terms of memory consumption.

As I understand GPU programming, you want to maximize parallelism. Ideally, you would have something small enough to have an independent generator for each processor. You should be able to generate perfectly random-looking numbers using 64 or 128 bits of state storage and a handful of instructions per 32-bits of output.

For example, the Skein hash function and CSPRNG can generate random data at 5 cycles per byte of output. But that's cryptographically secure; you can probably cut its state size and computational expense each in half to get statistical randomness.
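Or skip the hash entirely: something like Marsaglia's xorshift64 already fits the budget above, with 64 bits of state per generator and a handful of shift/xor ops per output (a plain-C sketch, but it maps directly to per-thread GPU code):

    #include <stdint.h>

    /* Marsaglia xorshift64: 64 bits of state, three shift/xor pairs per
       output. Give each thread its own nonzero state and it runs
       independently of all the others. */
    uint64_t xorshift64(uint64_t *state) {
        uint64_t x = *state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        *state = x;
        return x;
    }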


I've been reading A New Kind of Science recently. Intrigued by some of Wolfram's grandiose claims, I ran some tests, and it does indeed turn out that the Rule 30 cellular automaton is a very high-quality PRNG (as long as you pick the right bits). It's a very simple algorithm, and very parallel too.

It's just that my version picked one bit per generation (the middle one), and the period of the PRNG is determined by the size of the buffer, so it might be a bit memory-intensive for a GPU if you need several random bits per pixel. I couldn't find any patterns within the period, though. It might also be possible to take more than one random bit per generation, though I expect bits right next to each other are too correlated.

The formula is really simple: if the previous state and its two neighbours are labeled L, M, and R, Rule 30 boils down to: the new state is L xor (M or R).
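In C, one generation over a circular buffer looks roughly like this (a sketch; sample the middle cell each step, as described above):

    /* One Rule 30 generation over a circular buffer of 0/1 cells:
       next[i] = left XOR (middle OR right). */
    void rule30_step(const unsigned char *cur, unsigned char *next, int n) {
        for (int i = 0; i < n; i++) {
            unsigned char l = cur[(i + n - 1) % n];
            unsigned char m = cur[i];
            unsigned char r = cur[(i + 1) % n];
            next[i] = l ^ (m | r);
        }
    }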


Whoa, cool. Thanks!

I don't suppose you have the code lying around...? Even if it's not pretty, it would still be a big help! =)


As I understand GPU programming, you want to maximize parallelism. Ideally, you would have something small enough to have an independent generator for each processor. You should be able to generate perfectly random-looking numbers using 64 or 128 bits of state storage and a handful of instructions per 32-bits of output.

A perfect assessment.

It's why I keep learning about every field --- there are so many areas that turn out to be applicable in different domains.

For example, "variance shadow maps" create soft shadows using statistical analysis. (I wish they were practical. It's just an example.)

I'll check out the Skein function, thanks.


> But humans are unfortunately good at noticing the overall bias in the resulting patterns.

Humans are also, of course, good at noticing bias in a distribution even when there isn't any. Ask any online poker player about shufflers.


Do bloom effects generally use Gaussian filters? Is anybody doing full convolution with something more natural looking, like a flat circle or a regular polygon (e.g. like Gimp's Focus Blur plugin)?


I've never really looked into integrated hardware random number generators, but I got the impression from the article that the previous hardware random number generators were based on analog circuits.


Nothing needs gigabit rates of random numbers.

Consider a Monte Carlo algorithm on a 2 GHz processor, with a 50 instruction cycle inner loop, using a 64 bit random value per loop: it needs 2.5 Gb/sec of random bits.
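(That's 2×10^9 cycles/s divided by 50 cycles per iteration = 4×10^7 iterations/s, times 64 bits per iteration, which comes to roughly 2.5×10^9 bits/s.)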


There's no need for the bits to be truly random. In fact, if you ever want anyone to replicate your simulation, you'll need to provide them with the PRNG algorithm and seed used.


Perhaps this should be better written as "Nothing needs gigabit rates of _cryptographically strong_ random numbers."

Using megabit rates of cryptographically strong random numbers to seed a PRNG would be fine for most purposes.

Can anyone think of a reason why you would want/use gigabit rates of truly random numbers instead of using a slower rate to seed a PRNG?
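For comparison, the usual pattern looks something like this (a sketch; true_random_u64, fast_prng_seed, and fast_prng_next are hypothetical stand-ins for whatever slow hardware source and fast software PRNG you actually have):

    #include <stdint.h>

    /* Hypothetical interfaces: a slow true-RNG source and a fast PRNG. */
    uint64_t true_random_u64(void);          /* slow, truly random       */
    void     fast_prng_seed(uint64_t seed);  /* any decent software PRNG */
    uint64_t fast_prng_next(void);

    /* Re-seed the fast generator from the slow source once in a while;
       everything in between comes out at whatever rate the PRNG manages. */
    uint64_t random_u64(void) {
        static uint64_t count;
        if (count++ % (1u << 20) == 0)       /* re-seed every ~1M outputs */
            fast_prng_seed(true_random_u64());
        return fast_prng_next();
    }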


Perhaps this should be better written as "Nothing needs gigabit rates of _cryptographically strong_ random numbers."

Yes that's what I meant to say. I probably didn't realize the discussion had widened to include insecure number generation.


>Using megabit rates of cryptographically strong random numbers to seed a PRNG would be fine for most purposes.

And that's exactly what the hardware described in the article does for you.


True. Although the new Intel system is not cryptographically secure.

Gigabit true random numbers could be used to modulate a jam-resistant radar signal. That makes it difficult for an opponent to use active electronics to cloak their target, even in principle. However, actual systems would use dedicated hardware, not an Intel chip.

The profoundly paranoid could interleave true random bits with their data bits before encrypting (discard them on receipt). Pattern analysis would become much more difficult.


VIA's "Nehemiah" core C3 CPUs had hardware random number generation, as well as hardware AES assist, way back in 2003. (And, of course, VIA's "PadLock" instructions and Intel's RdRand instruction + AES-NI instructions are totally different. Hooray for continued x86 instruction set fragmentation!)

Edit: It just occurred to me that you're probably referring to the implementation, not the existence of a hardware RNG in x86. Doh!


Yeah, IIRC VIA is using the same ring oscillator style that Intel used to use. This article is about a new, lower-power RNG design.


The idea that a hardware RNG would ever need to consume a noticeable amount of power on a 20W CPU seems strange to me. Lots of low power chips have them. A quick search returns a paper titled "A 2.92μW Hardware Random Number Generator".


Having a look at that paper, the actual rate at which it produces bits is very low: for 2.92 μW, they're producing only 500 bits per second.

For an embedded device that might be suitable, but for a consumer or server machine you need a much higher bitrate.


You only need a little "grade-A" random to seed an appropriate high bitrate algorithm from time to time.


No one has ever broken a properly designed CSPRNG properly seeded with more than 100 or 200 bits of entropy, total.


"The nominally random number Netscape [1.x] used to form a secret key was based on just three values—time of day, process identification number, and parent-process identification number—all of them predictable. This allowed the attackers to reduce the number of keys that they needed to try, and to find the right one much sooner than Netscape had anticipated."

Another reason to disable SSLv2 on servers, BTW.


SSLv2 is horribly broken and should be disabled, but technically, that's a different bug of Netscape's. :-)


It seems the IEEE server is down (code 500) for me.




