I feel like this is very underspecified. The very first example: $ echo 'cat' | ...

danielparks · 2025-02-07T17:33:22 1738949602

This is a matter of operator precedence and tokenization. Tokens are single characters in this language, and there is an invisible operator between them.

If the operator were explicit (let’s call it ~), the example would look like this:

    $ echo 'cat' | trre 'c:d~a:o~t:g'
    dog

With unnecessary parentheses:

    $ echo 'cat' | trre '(c:d)~(a:o)~(t:g)'
    dog
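To make the invisible operator concrete, here is a minimal Python sketch that inserts an explicit `~` between adjacent single-character tokens. It assumes the pattern under discussion is `c:da:ot:g` (what you get by removing `~` from the example above) and handles only literal characters and `:`. The function name `make_concat_explicit` is made up for illustration and is not part of trre.

```python
def make_concat_explicit(pattern: str, op: str = "~") -> str:
    """Insert an explicit concatenation operator between adjacent tokens.

    Assumption: every token is a single literal character and ':' is the
    only operator, so concatenation sits between any two neighbouring
    characters when neither of them is ':'.
    """
    out = []
    prev = None
    for ch in pattern:
        if prev is not None and prev != ":" and ch != ":":
            out.append(op)
        out.append(ch)
        prev = ch
    return "".join(out)


print(make_concat_explicit("c:da:ot:g"))  # c:d~a:o~t:g
```

This only shows where the concatenation operator sits. It does not model escapes, brackets, groups, or repetition.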

c0nstantine · 2025-02-08T07:32:41 1738999961

That's true. Thank you for elaborating.

There is a hidden concatenation operator, as in ordinary regular expressions. In the code I denote it with a dot '.' (as in Thompson's old implementation).

c0nstantine · 2025-02-08T09:59:49 1739008789

The grammar is underspecified. The full grammar is more complex. I guess I should just remove the current version from the docs. Right now it is indeed confusing.

> Why is "c" not being replaced with "da"?

It is all about precedence. Judging by this discussion, I think I chose the wrong one, and it causes confusion. The current precedence table is this:

| Precedence | Construct | Syntax |
|---|---|---|
| 1 | Escaped characters | \<special character> |
| 2 | Bracket expression | [] |
| 3 | Grouping | () |
| 4 | Single-character-ERE duplication | * + ? {m,n} |
| 5 | Transduction | : |
| 6 | Concatenation | . (implicit) |
| 8 | Alternation | \| |

So ':' binds more tightly than '.' (implicit concatenation).
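As a rough illustration of that precedence, the Python sketch below groups each `x:y` first and only then concatenates the pairs. It is not how trre is implemented: `parse_pairs` and `transduce` are hypothetical names, only literal characters and `:` are handled, an empty right-hand side (deletion) is rejected, and matching is approximated with a plain substring replacement.

```python
def parse_pairs(pattern: str) -> list[tuple[str, str]]:
    """Split a pattern into (source, target) pairs, with ':' binding first.

    A character not followed by ':' maps to itself.
    """
    pairs = []
    i = 0
    while i < len(pattern):
        src = pattern[i]
        if src == ":":
            raise ValueError("':' needs a character on its left")
        if i + 1 < len(pattern) and pattern[i + 1] == ":":
            if i + 2 >= len(pattern) or pattern[i + 2] == ":":
                raise ValueError("empty right-hand side is not handled here")
            pairs.append((src, pattern[i + 2]))
            i += 3
        else:
            pairs.append((src, src))
            i += 1
    return pairs


def transduce(pattern: str, text: str) -> str:
    """Concatenate the pairs, then rewrite each match of the source side."""
    pairs = parse_pairs(pattern)
    source = "".join(s for s, _ in pairs)
    target = "".join(t for _, t in pairs)
    return text.replace(source, target)


print(parse_pairs("c:da:ot:g"))       # [('c', 'd'), ('a', 'o'), ('t', 'g')]
print(transduce("c:da:ot:g", "cat"))  # dog
```

Under this grouping the `d` after `c:` is consumed as the target of `c`, and `a` starts a new pair, which is why `c` is not replaced with `da`.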

kccqzy · 2025-02-07T18:23:43 1738952623

Yes, it is underspecified. The deletion example shows that the empty string can be a REGEX, so you can treat any position as containing as many empty-string regexes as you want. There are therefore infinitely many parses.

If we instead require REGEX to be non-empty (breaking the deletion examples), then the only remaining ambiguity is how concatenation groups: whether it's '(((c:d)(a:o))(t:g))' or '((c:d)((a:o)(t:g)))'. Assuming concatenation is associative, this would not matter.
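A tiny Python check of the associativity point, under the simplifying assumption that each sub-expression can be modelled as a fixed (source, target) string pair and that concatenation joins the two sides independently. The `concat` helper is hypothetical, not something from trre.

```python
def concat(left: tuple[str, str], right: tuple[str, str]) -> tuple[str, str]:
    """Concatenate two (source, target) pairs side by side."""
    return (left[0] + right[0], left[1] + right[1])


c_d, a_o, t_g = ("c", "d"), ("a", "o"), ("t", "g")

left_grouped = concat(concat(c_d, a_o), t_g)   # ((c:d)(a:o))(t:g)
right_grouped = concat(c_d, concat(a_o, t_g))  # (c:d)((a:o)(t:g))

print(left_grouped)                   # ('cat', 'dog')
print(left_grouped == right_grouped)  # True
```

Both groupings reduce to the same pair because string concatenation is associative on each side.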

Imustaskforhelp · 2025-02-07T17:20:36 1738948836

From how it seems to work, it looks like it is read as c:d, then a: (nothing), then ot:g

But yes, now that I read it, it is confusing too. In theory your point is valid: after reading the repo, I also believe that c should be replaced by da. I am not sure ...