
Why regex still exists? It is unintuitive, requires mastering an obscure syntax, is very hard to debug, and is very difficult to explain to others. It feels like we are writing intermediate code by hand, when we should have a human-readable language that generates regex.



You might be interested in “Eggex”, which aims to be a human-readable language that generates regexes. It’s currently written as a feature of the Oil shell, but in theory any tool could support them. Eggex docs: https://www.oilshell.org/release/latest/doc/eggex.html. Recent blog post about their development: https://www.oilshell.org/blog/2019/12/22.html.

However, Eggexes are a thin, mostly-syntactic layer over regexes. You still have to understand the regex engine to use them. If this sounds useless to you because you don’t currently understand any flavor of regex or parsing, I encourage you not to give up on learning regexes. (https://www.regular-expressions.info/ was how I learned; it’s a great tutorial.) Text-parsing engines, including regex engines, are a powerful concept that can be used in many situations, and I think it’s worth spending the effort learning them until, to paraphrase another commenter, regexes become the human-readable language you were searching for. Or Eggexes, at least.
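To illustrate why understanding the engine still matters whatever the surface syntax, here is a minimal sketch in Python's `re` module (my choice of language and sample text, not something from the Eggex docs). It shows greedy versus lazy matching, a behavior that a friendlier syntax would not change:

```python
import re

text = "<b>one</b> and <b>two</b>"

# Greedy: .* consumes as much as it can, then backtracks to the LAST </b>.
greedy = re.findall(r"<b>(.*)</b>", text)

# Lazy: .*? consumes as little as it can, stopping at the FIRST </b>.
lazy = re.findall(r"<b>(.*?)</b>", text)

print(greedy)  # ['one</b> and <b>two']
print(lazy)    # ['one', 'two']
```

The two patterns differ by one character, and the difference only makes sense once you know how the engine backtracks.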


The investment into learning regexes is worth it if you write or read enough of them. They become the human readable language you speak of, eventually. The question is where the threshold lies.


Do it! You will find that it's very easy, but the result will be either extremely verbose or just like regex. Since most regexes (at least for me) are meant for one-time use, the extra verbosity adds no benefit. If you have complex needs, you should probably be using something other than regex anyway.
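As a rough sketch of what "extremely verbose" tends to look like, here is a tiny fluent builder in Python. `PatternBuilder` and its methods are hypothetical names made up for this example, not any real library's API:

```python
import re


class PatternBuilder:
    """Hypothetical fluent builder that emits an ordinary regex string."""

    def __init__(self):
        self._parts = []

    def literal(self, text):
        # Escape the text so it matches itself exactly.
        self._parts.append(re.escape(text))
        return self

    def digits(self, count):
        # Exactly `count` decimal digits.
        self._parts.append(r"\d{%d}" % count)
        return self

    def build(self):
        return "".join(self._parts)


# Readable version: an ISO-style date such as 2019-12-22.
date_pattern = (
    PatternBuilder()
    .digits(4).literal("-")
    .digits(2).literal("-")
    .digits(2)
    .build()
)

# Terse version of the same idea, written directly as a regex.
terse_pattern = r"\d{4}-\d{2}-\d{2}"

print(bool(re.fullmatch(date_pattern, "2019-12-22")))
print(bool(re.fullmatch(terse_pattern, "2019-12-22")))
```

Seven chained calls against one short string: whether that trade is worth it probably depends on how often the pattern gets read again.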


Extremely verbose is right. Here's one such approach in Java that I found last year: https://github.com/sgreben/regex-builder.

Yeah, regexes can be a bit clunky at times and have a steep learning curve, but they're pretty much an industry standard at this point, and portable across languages with a few caveats.


"Why regex still exists?"

Is there an alternative that is clearly superior?


Your mileage may vary, but to my taste, the lpeg flavor of Parsing Expression Grammars is clearly superior.

It uses operator overloading to build patterns from component parts. I don't think anything can replace the terseness of regex for command-line use, vim searching, and cases like that.

But for a program, give me lpeg every time.
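For anyone who hasn't seen the style, here is a loose sketch of building patterns from named parts with operator overloading. It is written in Python and composes plain regex fragments, so it is not lpeg's API and not a PEG implementation; `Pat` and its methods are hypothetical, and the operator choice (`*` for sequence, `+` for choice) only borrows lpeg's convention:

```python
import re


class Pat:
    """Hypothetical wrapper around a regex fragment (not lpeg, not a PEG)."""

    def __init__(self, source):
        self.source = source

    def __mul__(self, other):
        # a * b: match a, then b.
        return Pat(self.source + other.source)

    def __add__(self, other):
        # a + b: try a, otherwise b.
        return Pat("(?:%s|%s)" % (self.source, other.source))

    def repeat(self, minimum=1):
        # At least `minimum` repetitions of this pattern.
        return Pat("(?:%s){%d,}" % (self.source, minimum))

    def fullmatch(self, text):
        return re.fullmatch(self.source, text) is not None


# Small named parts, combined into larger ones.
digit = Pat(r"[0-9]")
sign = Pat(r"[+-]") + Pat("")          # an optional sign
integer = sign * digit.repeat(1)
decimal = integer * Pat(r"\.") * digit.repeat(1)
number = decimal + integer

print(number.fullmatch("-3.14"))  # True
print(number.fullmatch("42"))     # True
print(number.fullmatch("4."))     # False
```

The appeal is that `integer` and `decimal` are ordinary values you can name, reuse, and test on their own, which a single regex string doesn't give you.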


Because it's really powerful, and some people actually like it (I'm one of them).

I can understand that a complex pattern might look scary if you're unfamiliar, but if you work with it long enough, you can put patterns together with relative ease.



