It's fun to see this. I investigated these on my own as an undergrad, but I called them solid primes. Originally I wrote a Visual Basic program (it was a long time ago) to find all the solid primes in base ten, which obviously involved writing a bignum implementation. It ran for three days, only half solved the problem and then ran out of memory. Recoding it in Mathematica took 6 lines, ran in 20 minutes and found all of them.
Plotting the distribution gives a nice curve which you can model quite well with the prime number theorem. From memory it looks like x^2*e^-x. Then plotting in different bases is also interesting.
I first learned about truncatable primes from Project Euler problem 37: "Find the sum of the only eleven primes that are both truncatable from left to right and right to left."
As always with this kind of problem (no input, small output, no limit on the amount of mathematical ‘preprocessing’ allowed), the fastest program will just call write to output the sum.
That’s why I don’t think speeding up such solutions is worthwhile beyond making them reasonably fast through algorithmic changes/specializations.
I used a branch and prune algorithm, starting with the 1 digit primes and then adding digits to the right, pruning using a non-memoising prime test and testing for left truncatability to see if the new numbers should be added to the sum. The tree came to an end fairly quickly, which made the algorithm fast.
I also avoided any string manipulation, instead truncating with integer division for right truncation and modulus for left truncation.
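Not the actual code from back then, but a rough Python sketch of that approach (grow candidates digit by digit on the right, prune with a primality check, and use modulus to test the left truncations) could look like this:

def is_prime(n):
    # Simple trial division; plenty fast at this scale.
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True

def is_left_truncatable(n):
    # The left truncations of n are n mod 10, n mod 100, ...; all must be prime.
    p = 10
    while p < n:
        if not is_prime(n % p):
            return False
        p *= 10
    return True

def sum_two_sided_truncatable():
    total = 0
    frontier = [2, 3, 5, 7]              # one-digit primes (not counted by the problem)
    while frontier:
        next_frontier = []
        for n in frontier:
            for d in range(10):
                m = n * 10 + d           # append a digit on the right
                if is_prime(m):          # prune: every prefix must stay prime
                    next_frontier.append(m)
                    if is_left_truncatable(m):
                        total += m
        frontier = next_frontier
    return total

print(sum_two_sided_truncatable())       # 748317

The frontier of right-extendable primes dies out quickly (there are only finitely many of them in base ten), which is why the search finishes almost instantly.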
I wonder why base 24 is such an extreme outlier. I had noticed it already when I was playing with this concept in Mathematica, it's the only one I couldn't exhaustively search.
These primes can be built up right to left, and the last "digit" of a prime, mod 24, can only take on phi(24) = 8 different values. (phi is Euler's phi function here.) So when you're considering adding a digit (base 24) to the left of a prime, the chances of getting a prime are relatively high, since the last digit is already stacked in your favor. Notice, for example, that the sequence is really small when n is prime, because you don't get that same advantage. To get a lot of left-truncatable primes you want your base n to be large (so you have a lot of choices at each step of the search) and you want phi(n)/n to be small (so you can stack the deck in your favor on the last digit).
And why does the sequence stop at 29? phi(30) = 8, setting a new minimum for phi(n)/n.
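To make that concrete (my own few lines, not the parent's), computing phi(n)/n for small bases shows why 24 stands out and why 30 changes things:

from math import gcd

def phi(n):
    # Euler's totient by direct counting; fine for tiny n.
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

for n in range(2, 31):
    print(n, phi(n), round(phi(n) / n, 3))

# phi(n)/n = 1/3 is the minimum for n <= 29 (attained at 6, 12, 18 and 24,
# with 24 the largest such base), and phi(30)/30 = 8/30 ~ 0.267 sets a new minimum.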
It's not perfect, but it's fast, easy to code, and the first false positive for this version is a long way out. Early Carmichael numbers have small factors, and although I'm only testing against prime bases up to 19, I'm not sure where the first false positive is. But it's a long way further out than the numbers we're dealing with here.
Thanks for that. As it happens it's not a problem because the first 30 all have even digits and so can't be a final segment of a left-truncatable prime, except 7519441, and 441 isn't prime.
I could write a more comprehensive test - didn't seem worth it for this case, but it's an interesting exercise. It's easy to extend this test so the list of primes used goes as far as the smallest factor in each of these factorisations, say, 109. Then these would all be correctly identified as composite. It would also be easy to combine it with Pollard Rho or PPP tests, to extend the range before we need to do something more serious.
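For concreteness, the kind of test being discussed (a Fermat check against a fixed list of prime bases; this is a sketch along those lines, not the exact function from the post) is only a few lines:

def fermat_probably_prime(n, bases=(2, 3, 5, 7, 11, 13, 17, 19)):
    # Fermat test against a fixed list of prime bases (here up to 19).
    # Extending `bases` pushes the first false positive further out,
    # as discussed above.
    if n < 2:
        return False
    for a in bases:
        if n == a:
            return True
        if n % a == 0:
            return False
        if pow(a, n - 1, n) != 1:
            return False
    return True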
Yes the full program for left-truncatable primes happens to be fine (no false positives in its output for any length, as I verified with a different primality test).
The primality test in the post (Fermat's test) is interesting to analyze though. BTW how did you generate your list of false positives? I didn't mention it earlier, but as a shortcut I used the list from https://oeis.org/A001567 (the base-2 pseudoprimes up to 10^12), and found that the function in the post has 2690 of these false positives less than 10^12. I wrote out only the first 10 above (after piping through `factor`). Of these 2690 false positives, these have the 5 largest smallest prime factors:
And indeed the `small_primes` used by the test needs to be extended up to the prime 6917, if we want to get no false positives at least for numbers up to 10^12. And even if we went that far, we'd encounter the next false positive soon after, at 1025414056801 (factors 7561 8101 16741) and then (if we included primes up to 7561) at 1196508858409 (factors 8059 11833 12547). (See https://oeis.org/A083876)
I think compared to the Fermat test used in the post, the Miller-Rabin test is almost as easy to write (only a few lines longer), and it comes with stronger guarantees. Not only is it a test used in the real world (using it with just a few random bases gives a high degree of confidence), but it also comes with guarantees when a deterministic set of primes (like your idea) is used: for example with the first 13 primes (2 to 41), the first false positive is over 3.3 × 10^24 (https://en.wikipedia.org/w/index.php?title=Miller%E2%80%93Ra...). See also https://miller-rabin.appspot.com/
For what it's worth, here is an implementation of the Miller-Rabin primality test that first tries the smallest 13 primes and then tries some random bases:
import random

def is_probable_prime(n, num_random_trials=5):
    """
    An implementation of the Miller-Rabin primality test.
    If this function returns False, then n is definitely not prime.
    If this function returns True, then n is probably prime:
    the probability of n being composite is less than 1/4^(num_random_trials), and
    nonzero only if n >= 3317044064679887385961981 > 3.3*10^24 > 1.3*2^81 (by https://arxiv.org/abs/1509.00864).
    """
    assert n == abs(int(n)), 'Need a nonnegative integer, got %s' % n
    if n < 2: return False
    if n % 2 == 0: return n == 2
    # Write n - 1 as (2**s) * d with d odd.
    (s, d) = (0, n - 1)
    while d % 2 == 0: (s, d) = (s + 1, d // 2)
    assert n - 1 == (2**s) * d and d % 2 == 1
    small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
    for i in range(len(small_primes) + num_random_trials):
        # Try the 13 smallest primes as bases first, then random bases.
        a = small_primes[i] if i < len(small_primes) else random.randrange(small_primes[-1], n - 1)
        if a >= n: break
        # a witnesses that n is composite unless a^d = 1 (mod n)
        # or a^(2^j * d) = -1 (mod n) for some 0 <= j < s.
        if not (pow(a, d, n) == 1 or any(pow(a, 2**j * d, n) == n - 1 for j in range(s))):
            return False
    return True
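For example, the pseudoprimes mentioned above are rejected (here by the fixed bases alone, since they are far below the 3.3 × 10^24 bound):

print(is_probable_prime(1025414056801))   # False: composite, 7561 * 8101 * 16741
print(is_probable_prime(1196508858409))   # False: composite, 8059 * 11833 * 12547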
Hi - thanks for the response. This will be brief, as I'm very short of time.
> Yes the full program for left-truncatable primes happens to be fine ...
Yes - although it's a little lucky, but easy enough to run an extra confirmation.
> how did you generate your list of false positives?
Nothing clever - I literally wrote a slow trial division program, then ran through all integers up to some limit - 10^8 I think - and when the "is_prime" routine said it was prime I called the slow version to see if it was right. Machines are fast.
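Not the actual program, but a cross-check like that is only a few lines of Python, something along these lines:

def slow_is_prime(n):
    # Plain trial division: slow but unquestionably correct.
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True

def find_disagreements(fast_is_prime, limit=10**8):
    # Run whatever fast test is being vetted against trial division
    # and report every place they disagree.
    for n in range(2, limit):
        if fast_is_prime(n) != slow_is_prime(n):
            print('disagreement at', n)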
> the function in the post has 2690 of these false positives less than 10^12. ... Of these 2690 false positives, these have the 5 largest smallest prime factors:
> And indeed the `small_primes` used by the test needs to be extended up to the prime 6917, if we want to get no false positives at least for numbers up to 10^12.
We can augment this with the Perrin pseudoprime test, but Miller-Rabin is still probably the best way to go to do it "properly". If I rewrote the material on the Perrin tests[0] I'd probably use M-R.
So we're pulling in the same direction, although making different trade-offs. And it's fun to play and see what happens.
It can produce false positives but not false negatives, so it's good enough, because the goal was only to prove that the number on the pencil is the largest left-truncatable prime. Had it produced a positive result, the author would have checked that one separately or added more primes to the check.
It's good enough for smallish primes, especially when you use multiple bases. There aren't many Carmichael numbers, and the early ones all have a small factor. So the test is certainly adequate in this range.
I wonder how long one could be if we only require it to stay prime when it has more than K digits? Take K=0 to get the original problem.
In this form of the problem, it is not the length, L, of the initial prime that we want to maximize but rather L-K.
My guess is that these looser requirements won't make much difference. Estimating based on the prime number theorem and pretending that primes are essentially random numbers (an assumption that works remarkably well a large fraction of the time), I get, if I didn't totally botch it, that given a k digit integer that is prime, there is about a 1/(2k) chance that you can add a digit on the left and still have a prime.
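That estimate is easy to poke at empirically. Here's a rough sketch (mine, reusing the is_probable_prime function from earlier in the thread) that samples k-digit primes and measures how often prepending one random digit keeps them prime, for comparison with the back-of-the-envelope 1/(2k):

import random

def left_extend_rate(k, trials=2000):
    # Fraction of sampled k-digit primes that stay prime after a single
    # random digit (1-9) is prepended.
    hits = 0
    done = 0
    while done < trials:
        n = random.randrange(10**(k - 1), 10**k)
        if not is_probable_prime(n):
            continue
        a = random.randrange(1, 10)
        if is_probable_prime(a * 10**k + n):
            hits += 1
        done += 1
    return hits / trials

for k in (5, 10, 20):
    print(k, left_extend_rate(k), 1 / (2 * k))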
Another generalization comes to mind. Suppose that instead of a pencil, we are dealing with some item with break-away segments. Each segment has d digits printed on it, so when we have L segments we have a dL digit number. As we break away segments we left-truncate this number by d digits.
I don't think that would make much of a difference. We can treat it as an L digit number in base 10^d, which we truncate a digit at a time. Another hand wavy application of prime number theorem gives 1/(2kd) as the chance that you can left-extend a k digit base 10^d prime by one digit and get a prime.
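In code the reduction is just: treat the number as base 10^d and take it mod powers of 10^d. A tiny sketch (d being the hypothetical segment width):

def segment_truncations(n, d):
    # The left truncations of n when digits are removed d at a time,
    # shortest first: n mod 10^d, n mod 10^(2d), ...
    base = 10**d
    p = base
    while p < n:
        yield n % p
        p *= base

# e.g. list(segment_truncations(123456, 2)) == [56, 3456]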
Here is a heuristic argument for why we should expect there to be infinitely many.
Suppose we start with a prime p which is d digits long (say p=3 so that d=1), and consider the sequence {a10^k + p : 1≤a≤9, k ≥ d}. So for p=3, we would consider 13, 23, … 93, then 103, 203, … 903, then 1003, 2003, … 9003, etc. By a standard heuristic argument (based on the prime number theorem), the number n is prime with "probability" about 1/log n, so each a10^k + p will be prime with probability about 1/k (ignoring some constant factors). The sum of these probabilities diverges, so one can be very confident that there will be a prime number in this sequence. In other words, we can be extremely confident of the following lemma: given any prime number p, there is a way of extending it to another prime number by adding some digit (followed by 0s) to its left.
Repeated applications of this lemma (with any starting prime p, or even different primes p if you wish) will give the result. For example starting with the prime p=3 and taking the first choice each time, you get primes like 30007020009000000000000000600000000900906001600500006000001026001050612113.
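A small Python sketch of that greedy construction (reusing the is_probable_prime function from earlier in the thread; by the heuristic above, the inner loop should essentially always terminate quickly):

def extend_left(p):
    # Return the first a*10^k + p (1 <= a <= 9, k at least the number of
    # digits of p) that passes the primality test, scanning k upward.
    k = len(str(p))
    while True:
        for a in range(1, 10):
            q = a * 10**k + p
            if is_probable_prime(q):
                return q
        k += 1

p = 3
for _ in range(20):
    p = extend_left(p)
print(p)   # a long (probable) prime built by repeatedly extending on the left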
Of course, turning this heuristic argument from "extremely confident" to an actual proof is a challenge. (There may be ways to prove this, I don't know: I know this heuristic is often used by number theorists though.)
Actually, I believe this is an open problem at the moment. Primes of the form a^n + 1 are called Generalized Fermat Primes and the consensus seems to be that there are probably only finitely many for every base a.
I don't think it does. Are you aware that "range(1,10)" in Python is the list [1, 2, 3, 4, 5, 6, 7, 8, 9] ??
Ranges in Python don't include their endpoint. A few people have expressed confusion as to this choice in the design, but it means that the expression:
range(a,b) + range(b,c) + range(c,d)
includes everything from a to d-1 once and only once. Knuth said that intervals should include the initial point and not the endpoint for this reason.
The article mentions a "DON'T DO DRUGS" pencil. The actual pencil, distributed in 1998 by the Bureau for At-Risk Youth, was even funnier. It said "TOO COOL TO DO DRUGS", and apparently the company had no clue what would happen as you sharpened it:
◄ TOO COOL TO DO DRUGS
◄ COOL TO DO DRUGS
◄ DO DRUGS
◄ DRUGS
The link to the front page is the reference to Hacker News in general, not to this specific item. I never intended to link to the specific reference, although I can see why people might think it's relevant. I mentioned the person with a link to their profile, I linked to the forum in general, and I quoted/paraphrased the relevant information. I wouldn't think anything more was needed.