Let's face it, the Perl 5/PCRE regex syntax is atrocious. The only reason it exi...

pdntspa · on Aug 7, 2023

Awful? It is inscrutable black magic and it is wonderful once you get it.

I haven't touched perl in years but I still find myself writing regex often!

Dylan16807 · on Aug 8, 2023

Yes, awful. It takes a crack in the wall and drives a bus through it, at a significant penalty to readability. The compatibility advantage doesn't matter when you're evaluating syntax in a vacuum.

pdntspa · on Aug 8, 2023

Not everything has to be super easy-to-use.

Dylan16807 · on Aug 8, 2023

Nobody is suggesting super easy to use.

But sometimes interfaces could be easier with no loss of ability, because they grew over time and nobody ever fixed them.

pdntspa · on Aug 8, 2023

It is terse for a reason. You're ordering a parser around in 10 characters. Which makes them fit everywhere! A longer more readable version starts taking up multiple lines and looking like shitty react code. Or you're just using your language's string primitives. But there is a reason that nobody does that any more -- who wants to read 50 lines of string parsing code when a couple dozen characters of regex will do the same thing?

Let us wizards have our magic!

Dylan16807 · on Aug 8, 2023

The particular syntax being used as an example is more verbose than it needs to be, so I don't know why you're making this argument.

masklinn · on Aug 8, 2023

Also various languages allow formatting & commenting regexes and that’s quite useful. Named groups as well.
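As a rough illustration of both points, here is what that can look like in Python, using `re.VERBOSE` for layout and comments and `(?P<name>...)` for named groups. The date pattern and the name `DATE` are made up for the example, not taken from anything in this thread.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows # comments,
# so each piece of the regex can sit on its own annotated line.
DATE = re.compile(
    r"""
    (?P<year>\d{4})   # four-digit year
    -
    (?P<month>\d{2})  # two-digit month
    -
    (?P<day>\d{2})    # two-digit day
    """,
    re.VERBOSE,
)

m = DATE.fullmatch("2023-08-08")
# Named groups come back as a dict instead of positional indexes.
parts = m.groupdict() if m else {}
```

The compiled pattern is equivalent to the one-line `(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`; only the source layout changes.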

justinator · on Aug 7, 2023

>Let's face it, the Perl 5/PCRE regex syntax is atrocious.

Agreed, but god-damn is it useful.

db48x · on Aug 7, 2023

Raku’s new grammar syntax is much less awful and just as amazingly useful.

justinator · on Aug 8, 2023

That's cool that's cool, but the problem is getting people to use Raku.

db48x · on Aug 9, 2023

Or even just Raku’s grammars. Perhaps we need an RCRE library :)

lizmat · on Aug 9, 2023

Perhaps. But that will become very difficult indeed.

Because in Raku, grammars are just a different way to write code. Grammars are really just specialized classes. And tokens / rules / regexes are just specialized methods. It all compiles down to bytecode, rather than to something you can feed to a state machine.

This has several advantages: if a grammar doesn't provide functionality you need, you can write it in Raku code as part of the grammar.
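For readers who don't know Raku, a very loose Python analogy of "grammar as a class, rules as methods" might look like the sketch below. This is not how Raku implements grammars; the class `KeyValue`, its method names, and the "even numbers only" check are all invented for illustration. The point is only that a rule can drop into ordinary code when a pattern alone is not enough.

```python
import re
from typing import Optional


class KeyValue:
    """Hypothetical 'grammar as a class': each rule is a method that takes a
    start position and returns the end position, or None if it fails."""

    def __init__(self, text: str) -> None:
        self.text = text

    def _regex(self, pattern: str, pos: int) -> Optional[int]:
        # Helper: match a plain regex anchored at pos.
        m = re.compile(pattern).match(self.text, pos)
        return m.end() if m else None

    def key(self, pos: int) -> Optional[int]:
        return self._regex(r"[A-Za-z_]\w*", pos)

    def value(self, pos: int) -> Optional[int]:
        # Ordinary code inside a rule: accept digits, but only even numbers.
        end = self._regex(r"\d+", pos)
        if end is None or int(self.text[pos:end]) % 2 != 0:
            return None
        return end

    def top(self) -> bool:
        # key '=' value, and the whole input must be consumed.
        pos = self.key(0)
        if pos is None:
            return False
        pos = self._regex(r"=", pos)
        if pos is None:
            return False
        return self.value(pos) == len(self.text)
```

With this sketch, `KeyValue("x=42").top()` succeeds and `KeyValue("x=43").top()` fails, because the `value` rule rejects odd numbers in plain code rather than in the pattern.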

It also means that when you improve execution of the bytecode, you will also improve the performance of grammars and regexes.

Finally: Raku grammars are very powerful. They are used to parse the Raku language itself, which is a testament to their power. But that also brings a whole set of challenges for the core developers :-)

db48x · on Aug 9, 2023

I know all of that :)

I admit that I was partly being facetious, but I also don’t think it is entirely impossible. For example, a hypothetical RCRE could provide hash tables for storing named regexes, or could take function pointers for looking up names so that the library user could implement their own storage for them. And so on, and so forth.
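To make the "hash table or lookup callback" idea a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `<name>` reference syntax, the `expand` function, and the `rules` table are assumptions for illustration, not an existing library. It simply inlines named rules into one ordinary regex, so unlike a real Raku grammar it cannot handle recursive rules, and it would misread the `<name>` part of a Python named group as a rule reference.

```python
import re
from typing import Callable, Dict

# Hypothetical convention: a rule body refers to another rule as <name>.
RULE_REF = re.compile(r"<(\w+)>")


def expand(body: str, lookup: Callable[[str], str], depth: int = 0) -> str:
    """Inline every <name> reference using the caller-supplied lookup."""
    if depth > 20:
        raise RecursionError("rule references nest too deeply (recursive rule?)")
    return RULE_REF.sub(
        lambda m: "(?:" + expand(lookup(m.group(1)), lookup, depth + 1) + ")",
        body,
    )


# The "hash table" storage option: a plain dict of named rules.
rules: Dict[str, str] = {
    "digit": r"[0-9]",
    "number": r"<digit>+",
    "pair": r"<number>,<number>",
}

# The "function pointer" option: any callable from name to rule body works,
# so the caller decides where rules are actually stored.
pair = re.compile(expand(rules["pair"], rules.__getitem__))
```

Here `pair.fullmatch("12,345")` matches, and swapping `rules.__getitem__` for a database or file lookup would not change `expand` at all.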

I think that a hypothetical RCRE could be as influential as PCRE was, if someone could find the time to do it.

On the other hand, I am very weary of C these days. A Rust crate with procedural macros to provide compile-time grammar compilation would be a lot more fun. If I had some funding I could easily see spending a year or three on that.