Rice professor sees probabilistic method revolutionizing computing (chron.com)
48 points by peter123 on Feb 8, 2009 | 32 comments



“Numbers that fool the Fermat test are called Carmichael numbers, and little is known about them other than that they are extremely rare. There are 255 Carmichael numbers below 100,000,000. The smallest few are 561, 1105, 1729, 2465, 2821, and 6601. In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that cosmic radiation will cause the computer to make an error in carrying out a “correct” algorithm. Considering an algorithm to be inadequate for the first reason but not for the second illustrates the differences between mathematics and engineering.” footnote pg. 53 SICP
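For reference, here is a minimal sketch of the Fermat test the footnote is describing (Python; the 20 random bases are an arbitrary choice of mine, not something from SICP):

    import random

    def fermat_test(n, trials=20):
        """Probabilistic primality check: declare n "probably prime" if
        a^(n-1) == 1 (mod n) for several randomly chosen bases a."""
        if n < 4:
            return n in (2, 3)
        for _ in range(trials):
            a = random.randrange(2, n - 1)
            if pow(a, n - 1, n) != 1:
                return False    # definitely composite
        return True             # probably prime (Carmichael numbers can fool this)

    # 561 = 3 * 11 * 17 is the smallest Carmichael number: composite,
    # yet it satisfies the Fermat condition for every base coprime to it.
    print(pow(2, 560, 561))   # 1, so the base-2 test alone is fooled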


"In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that cosmic radiation will cause the computer to make an error in carrying out a “correct” algorithm."

Where are the numbers for that?

If the Fermat test is already fooled by 255 numbers out of 100 million input values, then that would make it about one in 400K numbers (quick cross-check below).

I would really be surprised if computers were that susceptible to cosmic radiation. Or should I start shielding my machines with mu-metal?
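For what it's worth, the figures in the quote can be cross-checked on a smaller range using Korselt's criterion; a rough sketch (the 100,000 cutoff is just to keep the brute-force factoring quick):

    def prime_factors(n):
        """Trial-division factorisation; fine for small n."""
        factors, d = {}, 2
        while d * d <= n:
            while n % d == 0:
                factors[d] = factors.get(d, 0) + 1
                n //= d
            d += 1
        if n > 1:
            factors[n] = factors.get(n, 0) + 1
        return factors

    def is_carmichael(n):
        """Korselt's criterion: n is Carmichael iff it is composite,
        squarefree, and (p - 1) divides (n - 1) for every prime p | n."""
        f = prime_factors(n)
        if len(f) < 2 or any(e > 1 for e in f.values()):
            return False
        return all((n - 1) % (p - 1) == 0 for p in f)

    # Carmichael numbers are odd, so odd n only
    carmichaels = [n for n in range(3, 100000, 2) if is_carmichael(n)]
    print(carmichaels[:6])   # [561, 1105, 1729, 2465, 2821, 6601]
    print(len(carmichaels))  # 16 of them below 100,000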


It says for very large numbers. Presumably it's a reflection of the amount of computational power required: if a bit of your memory gets flipped every year by cosmic radiation it's not a serious problem, unless you're doing a calculation that takes 10 computer-years (e.g. 1000 machines over 3 days).

Kind of like Google's terabyte sort test where each time they ran it at least one hard drive died.


I have a result which shows that a squared number minus 2, if put through the Fermat test (modular exponentiation), is always prime. I got some 30k-digit primes this way, but I couldn't get mathematicians interested in it. They'd just ignore it and say it was obvious.

Is anyone interested in me writing a little document and posting it to Hacker News?


Do you mean (1) n^2-2 always passes the Fermat test (to what base(s)?) or (2) if n^2-2 passes the Fermat test (to what base(s)?) then it is actually prime?


Yes, that's exactly what I'm saying. I'll post about this to HN soon.


Um, I was asking which; #1 and #2 are very different. For instance: A result of the form "n^2-2 is always a Fermat pseudoprime to base SOME-FUNCTION-OF-N" would probably not be interesting. A result of the form "n^2-2 is always a prime provided it's a Fermat pseudoprime to base SOME-FUNCTION-OF-N" would be much more so.

(Where "interesting" means "interesting to me"; no one else need share my interests.)


I just know I've never found a prime with this method, that PFGW didn't say was a prime.

n^2-2 doesn't always pass the Fermat test, but if it does, it is always a prime. I think it's base 2.

Is that an okay answer? I'll clarify in this thread if you need.

(Posted Lisp code to: http://www.decompiler.org/r2primes.htm)
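Roughly the same experiment in Python, for anyone who'd rather not read the Lisp (my own restatement, with sympy.isprime standing in for PFGW and the 20000 bound picked arbitrarily):

    # Sketch of the experiment: for which n does n^2 - 2 pass a base-2
    # Fermat test, and is it then actually prime?
    from sympy import isprime   # stand-in for PFGW here

    for n in range(2, 20000):
        m = n * n - 2
        if pow(2, m - 1, m) == 1:      # base-2 Fermat probable prime
            if not isprime(m):         # this would be a counterexample
                print("counterexample:", n, m)
                break
    else:
        print("no base-2 pseudoprimes of the form n^2 - 2 below 20000^2")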


On that page, you conjecture that no Carmichael number has 2 as a quadratic residue. This conjecture is incorrect; 304^2 is congruent to 2 mod 6601, and 6601 is a Carmichael number.
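Quick check, for anyone following along:

    # 6601 = 7 * 23 * 41, and 2 is a quadratic residue mod 6601:
    print(304 * 304 % 6601)    # 2
    print(7 * 23 * 41)         # 6601
    # Korselt: 6, 22 and 40 all divide 6600, so 6601 is Carmichael.
    print(all(6600 % (p - 1) == 0 for p in (7, 23, 41)))   # True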

Even if it were true that no Carmichael number has 2 as a quadratic residue, that wouldn't give an efficient way of finding very large prime numbers, because to prove something prime given that it isn't Carmichael you still have to run the Fermat test to every base, which will take ages.

Now, that isn't the same as saying that n^2-2 is never a Carmichael number. I've no idea whether that's true, but it wouldn't be very surprising; hand-wavily, the "probability" that m is a square is on the order of 1/m, and the "probability" that m is a Carmichael number is something complicated-ish that's less than 1/(log m)^2, and so the probability that m is both is less than 1/(m (log m)^2), whose integral is finite; so there should only be finitely many Carmichael numbers of the form n^2-2, and if there aren't any small ones then probably there aren't any at all. But, again, even if it's true that n^2-2 is never a Carmichael number, that doesn't give you an efficient way of generating primes.

It's also not the same thing as saying that n^2-2 never passes the Fermat test to base 2 without being prime. That also might be true, for the same hand-wavy reason as above (the number of Fermat pseudoprimes to a given base satisfies an upper bound of the same general form as the number of Carmichael numbers). Most likely it's either very easy or very difficult to prove, if it happens to be true.

In any case, for practical purposes, if you generate a very large number at random and it passes the Fermat test to any one base, then it's almost certainly prime. (For your numbers with thousands of digits, if a single Fermat test says the number is prime then it's more likely to be wrong because your computer has malfunctioned than because you happened to pick a non-prime that satisfies the Fermat test.)


Thanks so much for taking the trouble to tell me why it's impractical, hard to prove and not interesting. I'll probably remove the webpage. I'll move on to my other (software) projects.


You're going to have trouble getting mathematicians interested in just a hunch. Earlier, you used the phrase "I have a result which shows...", which, in maths parlance, would imply you have a proof. Which, if correct, would be rather interesting.

Though, without a proof, it's uninteresting. It's probably not true.


removed.


"Let C(n) denote the number of Carmichael numbers less than n. ... The upper bound C(n) < n * exp(-(ln n * ln ln ln n) / (ln ln n)) has also been proved (R. G. E. Pinch)."

So that ratio isn't constant.

Key is "...of very large numbers...". Clearly larger than 100,000,000.
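Plugging numbers into that inequality shows how fast the density falls off (a rough sketch; note this is only an upper bound on the density, not the actual count):

    from math import exp, log

    def carmichael_density_bound(n):
        """Upper bound on C(n)/n from the quoted Pinch inequality:
        C(n) < n * exp(-(ln n * ln ln ln n) / (ln ln n))."""
        return exp(-(log(n) * log(log(log(n)))) / log(log(n)))

    for k in (8, 18, 100):
        print(k, carmichael_density_bound(10.0 ** k))
    # The bound on the density keeps shrinking as n grows, so the
    # one-in-400K figure from 10^8 doesn't carry over to huge numbers.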


Aye... I missed that. Thank you!


Rough estimate: 10^(-11) errors per bit per hour.

http://catless.ncl.ac.uk/Risks/23.47.html#subj7
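Back-of-the-envelope with that rate (the 4 GB of RAM is just an assumed machine size, not a figure from the link):

    bits = 4 * 2**30 * 8            # assumed 4 GB of RAM
    errors_per_hour = 1e-11 * bits  # using the 1e-11 per bit per hour figure
    print(errors_per_hour)          # ~0.34
    print(1 / errors_per_hour)      # roughly one expected bit flip every ~3 hours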


That is one very interesting link. I'm going to do some experimenting and I'll report back in a bit.

thank you!


According to the Wikipedia entry on this, for very large numbers -- up to 1e18 -- about 1 in 700 billion is a Carmichael number.

http://en.wikipedia.org/wiki/Carmichael_number
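Unpacking that ratio (the ~1.4 million figure below is my recollection of Pinch's tabulated count, so check the article):

    # "about 1 in 700 billion" below 1e18 implies roughly this many
    # Carmichael numbers below 1e18 (Pinch's exhaustive count is ~1.4 million):
    print(1e18 / 700e9)   # ~1.4e6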



The big problem I see is that if you ever want to do something which has to be precise, you'll have to have two processors. Perhaps in the future we'll see chips with a "probability drive" in addition to the normal cores.

Now we've just got to develop an improbability drive...


Or just do everything twice - if it really runs seven times faster than "today's best technology" then it shouldn't be an issue.

It's a very interesting idea...wish they had more details in the article.


You might have to do it far more than 7 times to converge on a suitably reliable answer... but perhaps precise recomputation could be deferred to off-hours, or lagged batch processing, or after disputes, etc.


On mainframes where it really, really matters that the answers be right, three processors will do the same calculation in lockstep and any one processor deviating will have its results thrown out.
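A toy version of the lockstep-and-vote idea, just to illustrate (real mainframes do this in hardware, not in software like this):

    def voted(compute, x):
        """Run the same computation three times and take the majority;
        any single deviating result gets thrown out."""
        a, b, c = compute(x), compute(x), compute(x)
        if a == b or a == c:
            return a
        if b == c:
            return b
        raise RuntimeError("no two results agree")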


If I read the article right, it's not that it sometimes fails; rather, it almost always fails, but the amount of error is predictable. You can converge on the exact value by repeating and averaging the calculation, but it would be simpler to have an exact co-processor.
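A toy illustration of the converge-by-averaging point, with made-up Gaussian noise standing in for whatever error model the real hardware actually has:

    import random

    def noisy_compute(exact, sigma=10.0):
        """Stand-in for a probabilistic ALU: right on average, wrong each time."""
        return exact + random.gauss(0, sigma)

    exact = 42.0
    for repeats in (1, 10, 100, 10000):
        estimate = sum(noisy_compute(exact) for _ in range(repeats)) / repeats
        print(repeats, estimate)
    # The error shrinks roughly like 1/sqrt(repeats), which is why an exact
    # co-processor for the parts that must be exact looks simpler.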


Does anyone have any technical info on how this is accomplished?


Google Scholar has picked up on Krishna Palem's previous publications. There should be a law that internet articles about scientific research have to back everything up with links to a journal article or something.

http://scholar.google.com/scholar?q=Krishna+Palem+probabilis...

For example, here's an interesting paper that talks about saving power by making AND/NOT gates probabilistic.

http://www2.computer.org/portal/web/csdl/doi/10.1109/TC.2005...

I guess there's an inherent assumption that the reduced power consumption will scale up to the level of "checking email and browsing the internet".


The power consumption angle is quite interesting - but given that the power saving they demonstrated wasn't really an order of magnitude or more, it might be hard to scale it up to practical tasks and still work the errors out. Perhaps there are more savings to be made, but I would expect to see orders of magnitude less power before this stuff is practical.


Maybe. I think it may work in computer graphics applications. Most calculation is done for only a single frame and then discarded.

If it were a little off there would not really be any bad effect.


libavcodec has an option that does exactly that. It skips steps in the filter loop that only affect a single frame. It makes things quite a bit faster.


"To compensate, engineers increase the voltage applied to computer circuits to overpower the noise and ensure precise calculations."

"Precise"? How about "correct"?


Most floating point calculations are only correct according to a standard.


Yes, but who said anything about floating point numbers? I guess I should have explained, but my point was that in "normal" microprocessors the voltage level has nothing to do with the precision of the computation... if it's low enough that errors are introduced the results will be wildly incorrect, not merely imprecise as that statement implies.


Not an option.

http://en.wikipedia.org/wiki/Floating_point#Accuracy_problem...

An annoying problem for distributed/grid systems where the compilers are different across nodes/sites.
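A small example of the kind of thing that bites: evaluation order alone changes the bits, before compiler differences across nodes even come into it (Python here, but the same holds for any IEEE 754 doubles):

    a, b, c = 0.1, 0.2, 0.3

    print((a + b) + c)                  # 0.6000000000000001
    print(a + (b + c))                  # 0.6
    print((a + b) + c == a + (b + c))   # False: addition isn't associative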



