Hacker News new | past | comments | ask | show | jobs | submit login

More interesting than the definition of primes, it's almost the definition of multiplication that is embedded in this regex. We have two numbers(of occurrences) being multiplied:

- the first one is represented by the group (..+) it represents the number of occurrences n between 2 and +∞

- the second one is represented by (\1)+. We will repeat the first number m times, between 1 and +∞ times.

So the result of the multiplication is n*(m+1), which cannot be a prime. We just have to take the opposite with negative lookahead. It's very beautiful indeed.

See http://regex101.com/r/qN2fQ8 or http://www.regexper.com/#^%28%3F!%28..%2B%29\1%2B%24%29 to follow the above explanations.




Thanks for the web site links! Both are pretty interesting and I actually learned something from the detailed description(s) that regex101 provides.

(I learned that for (\1)+, "Note: A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data")


Later, I realized that in a regular program / on the command line, the negative lookahead can be avoided by using the !~ (doesn't match) operator,i.e., we just check that the number is not a non-prime using a simpler regex:

  DB<63> print "Matches!" if (("x" x 31) !~ /^(..+)\1+$/)
Matches!

  DB<64> print "Matches!" if (("x" x 18) !~ /^(..+)\1+$/)
The Regex Golf site only asserts matches, i.e., it's using =~. That's why the negation using negative lookahead was needed.

(The simpler regex merely looks for non-primes by matching any number of characters which are a multiple of two numbers, n x m, i.e., those which can be factorized. n comes from (..+), m comes from \1+).




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: