
Let's face it, the Perl 5/PCRE regex syntax is atrocious. The only reason it exists is that (? was a syntax error in earlier regex syntaxes, so it could be redefined to mean anything.

Raku is an attempt to design a sane regular expression language from first principles, now that we know what we want them to be able to express. The alternative is being stuck with (?:this|(?>or that)) for the next 30 years.




Awful? It is inscrutable black magic and it is wonderful once you get it.

I haven't touched perl in years but I still find myself writing regex often!


Yes, awful. It takes a crack in the wall and drives a bus through it, at a significant penalty to readability. The compatibility advantage doesn't matter when you're evaluating syntax in a vacuum.


Not everything has to be super easy-to-use.


Nobody is suggesting super easy to use.

But sometimes interfaces could be easier with no loss of ability, because they grew over time and nobody ever fixed them.


It is terse for a reason. You're ordering a parser around in 10 characters. Which makes them fit everywhere! A longer more readable version starts taking up multiple lines and looking like shitty react code. Or you're just using your language's string primitives. But there is a reason that nobody does that any more -- who wants to read 50 lines of string parsing code when a couple dozen characters of regex will do the same thing?

Let us wizards have our magic!


The particular syntax being used as an example is more verbose than it needs to be, so I don't know why you're making this argument.


Also, various languages allow formatting and commenting regexes, and that's quite useful. Named groups as well.
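For anyone who hasn't used these features, here is a minimal sketch in Python, assuming the standard `re` module. The date pattern and the group names are made up for illustration. `re.VERBOSE` makes the engine ignore whitespace and `#` comments inside the pattern, and `(?P<name>...)` names a group.

```python
import re

# Hypothetical example pattern: an ISO-style date, spread over several
# lines with comments thanks to re.VERBOSE.
DATE = re.compile(
    r"""
    (?P<year>  \d{4} )   # four-digit year
    -
    (?P<month> \d{2} )   # two-digit month
    -
    (?P<day>   \d{2} )   # two-digit day
    """,
    re.VERBOSE,
)

m = DATE.fullmatch("2024-11-12")
# Named groups come back as a dict instead of positional indices.
parts = m.groupdict() if m else {}
```

It is still the same Perl-style syntax underneath, but the layout and the names carry most of the readability.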


>Let's face it, the Perl 5/PCRE regex syntax is atrocious.

Agreed, but god-damn is it useful.


Raku’s new grammar syntax is much less awful and just as amazingly useful.


That's cool that's cool, but the problem is getting people to use Raku.


Or even just Raku’s grammars. Perhaps we need an RCRE library :)


Perhaps. But that will become very difficult indeed.

Because in Raku, grammars are just a different way to write code. Grammars are really just specialized classes. And tokens / rules / regexes are just specialized methods. It all compiles down to bytecode, rather than something you can feed a state machine.

This has several advantages: if a grammar doesn't provide functionality you need, you can write it in Raku code as part of the grammar.
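To make the "grammar is a class, token is a method" idea concrete for people who don't know Raku, here is a rough analogy in Python. It is not how Raku implements grammars. The class name, the method names and the key=value format are all hypothetical. Each "token" is an ordinary method that takes a position and returns the new position or `None`, so dropping down to plain code in the middle of the grammar is trivial.

```python
import re

class KeyValueGrammar:
    """Hypothetical toy grammar: every 'token' is an ordinary method."""

    def __init__(self, text):
        self.text = text

    def _match(self, pattern, pos):
        # Helper: try a regex at pos, return the end offset or None.
        m = re.compile(pattern).match(self.text, pos)
        return m.end() if m else None

    def key(self, pos):
        return self._match(r"[A-Za-z_]\w*", pos)

    def value(self, pos):
        # Plain code inside the grammar: digits only, and at most 255.
        end = self._match(r"\d+", pos)
        if end is None or int(self.text[pos:end]) > 255:
            return None
        return end

    def top(self, pos=0):
        # key '=' value, and the whole input must be consumed.
        end = self.key(pos)
        if end is None:
            return None
        end = self._match(r"=", end)
        if end is None:
            return None
        end = self.value(end)
        return end if end == len(self.text) else None


def parses(text):
    return KeyValueGrammar(text).top() is not None
```

Because it is just a class, a subclass could override `value` to change what counts as a valid value without touching the rest.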

It also means that when you improve execution of the bytecode, you will also improve the performance of grammars and regexes.

Finally: Raku grammars are very powerful. They are used to parse the Raku language itself, which is a testament to their power, but it also brings a whole set of challenges for the core developers :-)


I know all of that :)

I admit that I was partly being facetious, but I also don’t think it is entirely impossible. For example, a hypothetical RCRE could provide hash tables for storing named regexes, or could take function pointers for looking up names so that the library user could implement their own storage for them. And so on, and so forth.
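A very rough sketch of the lookup idea, in Python rather than C to keep it short. Everything here is hypothetical: there is no RCRE, and the `<name>` reference syntax, `expand` and the `named` table are invented for illustration. The caller supplies a lookup function (here just `dict.get`, standing in for a hash table or a function pointer), and named regexes are expanded textually into an ordinary pattern. Textual expansion cannot express recursive rules, so this is far weaker than real Raku grammars.

```python
import re
from typing import Callable, Optional

# Hypothetical reference syntax: <name> inside a pattern refers to a
# named regex. This naive scan would also hit Python's own (?P<name>...)
# syntax, which the sketch does not try to handle.
REF = re.compile(r"<([A-Za-z_]\w*)>")

def expand(pattern: str,
           lookup: Callable[[str], Optional[str]],
           depth: int = 0) -> str:
    """Replace every <name> with the pattern that lookup(name) returns."""
    if depth > 20:
        # Crude guard against self-referencing names.
        raise ValueError("named regex references nested too deeply")

    def repl(m):
        body = lookup(m.group(1))
        if body is None:
            raise KeyError(m.group(1))
        # Wrap in a non-capturing group so quantifiers apply to the whole body.
        return "(?:" + expand(body, lookup, depth + 1) + ")"

    return REF.sub(repl, pattern)

# The storage is the caller's business; a plain dict plays the hash table.
named = {
    "digit": r"[0-9]",
    "octet": r"<digit>{1,3}",
    "ipv4": r"<octet>(?:\.<octet>){3}",
}
ipv4 = re.compile(expand("<ipv4>", named.get))
```

Swapping `named.get` for any other callable is the "library user implements their own storage" half of the idea.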

I think that a hypothetical RCRE could be as influential as PCRE was, if someone could find the time to do it.

On the other hand, I am very weary of C these days. A Rust crate with procedural macros to provide compile-time grammar compilation would be a lot more fun. If I had some funding I could easily see spending a year or three on that.



