I don't understand why this still looks scary to me. A while back, after learning regex (though only at a basic level), I thought maybe I could write my own custom C preprocessor as an exercise. Perfect, right? I get to choose the syntax AND the rules. But somehow I just can't write it. Nested ifdefs, defines, and undefs all mixing together are too OP.
Reading through this, it seems like a huge amount of work goes into real compilers. Front end, optimizer, backend, etc. I do appreciate it more, but as useful as compilers are, maybe I'll just leave it to the pros (:
Don't ever do that! It really isn't that complicated at a basic level. Regexes aren't the right tool for parsing: a classical regular expression can't keep track of arbitrarily deep nesting, which is exactly what nested ifdefs need. What you're missing is the recursive nature of "context free languages"[1]. Formal language theory [2] was one of the big, early wins of Computer Science. There's a lot to it, but writing a simple recursive descent parser doesn't require any of it.
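To make the recursion concrete, here is a minimal sketch of how nested ifdefs fall out of two functions that call each other. Everything in it is an assumption for illustration: the names `preprocess`, `parse_block`, and `parse_conditional` are made up, directives must start the line with no space after the `#`, and macros are only "defined or not" (no values, no expansion).

```python
def preprocess(text, defines=None):
    """Return the lines that survive #ifdef/#ifndef/#else/#endif filtering."""
    lines = text.splitlines()
    defines = set() if defines is None else set(defines)
    out = []
    pos = 0  # index of the next unread line, shared by both helpers

    def parse_block(active):
        # Consume lines until an #else/#endif that belongs to the caller.
        nonlocal pos
        while pos < len(lines):
            line = lines[pos]
            words = line.split()
            head = words[0] if words else ""
            if head in ("#else", "#endif"):
                return head  # not ours: leave it for parse_conditional
            pos += 1
            if head in ("#ifdef", "#ifndef"):
                parse_conditional(active, head, words[1])  # recursion
            elif head == "#define":
                if active:
                    defines.add(words[1])
            elif head == "#undef":
                if active:
                    defines.discard(words[1])
            elif active:
                out.append(line)
        return None  # ran out of input

    def parse_conditional(active, head, name):
        # Called just after the #ifdef/#ifndef line has been consumed.
        nonlocal pos
        cond = (name in defines) == (head == "#ifdef")
        stop = parse_block(active and cond)
        if stop == "#else":
            pos += 1
            stop = parse_block(active and not cond)
        if stop != "#endif":
            raise SyntaxError("missing #endif")
        pos += 1

    if parse_block(True) is not None:
        raise SyntaxError("unmatched #else or #endif")
    return out


SOURCE = """#define A
#ifdef A
a
#ifdef B
ab
#else
a_not_b
#endif
#undef A
#else
not_a
#endif
#ifndef A
after
#endif"""

print(preprocess(SOURCE))
```

The nesting is never tracked explicitly: each `#ifdef` simply starts a new `parse_conditional` call, and the call stack remembers how deep you are. That is the part a single regex can't do.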
The big caveat is that parsing is only about 1/3 of the problem; the other parts, as described in this article, are optimization and code generation. Of these, optimization can be skipped entirely, leaving only code generation, which can be done naively by walking the AST. If this all seems foreign to you, there's a lot of really good info online nowadays, i.e. people love writing about this subject.
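As a rough sketch of what "walking the AST" can look like, here is a toy pipeline for integer arithmetic: a recursive descent parser builds a tuple-based tree, and `codegen` walks it to emit instructions for a made-up stack machine. The grammar, the instruction names (`push`, `add`, ...), and the little `run` interpreter are all assumptions for illustration, not anything from the article.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")
OPS = {"+": "add", "-": "sub", "*": "mul", "/": "div"}
FUNCS = {"add": operator.add, "sub": operator.sub,
         "mul": operator.mul, "div": operator.floordiv}


def tokenize(src):
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():  # expr := term (("+" | "-") term)*
        node = term()
        while peek() in ("+", "-"):
            node = (take(), node, term())
        return node

    def term():  # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in ("*", "/"):
            node = (take(), node, factor())
        return node

    def factor():  # factor := NUMBER | "(" expr ")"
        tok = take()
        if tok == "(":
            node = expr()  # the recursion that handles nesting
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is not None and tok.isdigit():
            return ("num", int(tok))
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def codegen(node):
    # Post-order walk: emit code for both operands, then the operator.
    if node[0] == "num":
        return [("push", node[1])]
    op, left, right = node
    return codegen(left) + codegen(right) + [(OPS[op],)]


def run(code):
    # Tiny interpreter for the made-up stack machine, to check the output.
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(FUNCS[instr[0]](left, right))
    return stack.pop()


code = codegen(parse(tokenize("1 + 2 * (3 - 1)")))
print(code)
print(run(code))
```

There is no optimizer anywhere in this; the generated code is naive but correct, which is the point being made above. Emitting real assembly instead of tuples changes the strings you produce, not the shape of the walk.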
Understanding this subject is very important in my opinion; this is one of the foundational things we do. Attaining a comfort level with compiler engineering is one of the two or three things anyone can do to really "level up" as a software developer. Some other things are writing multithreaded servers in C and 3D software rasterization. Again, my opinion here.
>Understanding this subject is very important in my opinion; this is one of the foundational things we do. Attaining a comfort level with compiler engineering is one of the two or three things anyone can do to really "level up" as a software developer.
I think it depends on the level of abstraction.
Then again, I'm not really a software developer; most of the work I do is scripting for data analysis, which is where I learned regex. I did learn C in college and somehow got fascinated with it, but I never really do any serious work with it. Still, over time, reading about lower-level stuff (compared to what I do) does seem to help take the mystery out of things like the unintuitive behavior of multiple references to a Python list (I suspect it was pointers all along). Taking it one level lower by studying compilers and how assembly gets generated doesn't seem like it would benefit me much more.
Still, I do like C enough to possibly consider doing something in it for fun if I get an idea, so maybe one day I'll come back to this.
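For anyone who hasn't hit the Python list thing mentioned above, a small sketch of it (the variable names are just for illustration). Assignment binds a second name to the same list object rather than copying it, which is roughly the "pointers all along" intuition:

```python
a = [1, 2, 3]
b = a            # b is another name for the same list object, not a copy
b.append(4)      # so this change is visible through a as well

c = list(a)      # an explicit (shallow) copy is a separate object
c.append(5)      # a and b are unaffected

# The same effect hides in list multiplication: both rows are one object.
grid = [[0] * 2] * 2
grid[0][0] = 9

print(a, b is a)
print(c)
print(grid)
```

If you have written C, `b = a` behaves like copying a pointer, and `list(a)` like allocating a new array and copying the elements into it.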