
IIRC, the C preprocessor is not very hard to implement according to the specification if you don't worry too much about performance.



The C preprocessor is hilariously underspecified in the standard, so implementing the standard doesn't guarantee that you'll be able to handle real-world C programs (even ones that don't use GNU or clang extensions).


The K&R preprocessor was indeed underspecified and allowed lots of variation (many of those issues can be seen in the GCC manual [1]), but current ISO C is much better at that job AFAIK. I think `## __VA_ARGS__` is the only popular preprocessor extension [2] at this moment, as the standard replacement (`__VA_OPT__`) is still very new.

[1] https://gcc.gnu.org/onlinedocs/gcc/Incompatibilities.html

[2] Assuming that we don't count things like `#pragma` or `#include_next`, which can be added without affecting other preprocessing jobs.
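
For anyone who hasn't run into the two spellings, here is a minimal sketch of the difference. `FORMAT_INTO` and `format_into_demo` are made-up names for illustration, and the version check is only an assumption about how one might pick between the two forms; compilers that accept `__VA_OPT__` in older language modes will simply take the GNU branch here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* FORMAT_INTO is a hypothetical helper, not taken from any real project. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
/* Standard form: __VA_OPT__(,) emits the comma only when variadic
 * arguments are actually present. */
#define FORMAT_INTO(buf, size, fmt, ...) \
    snprintf((buf), (size), fmt __VA_OPT__(,) __VA_ARGS__)
#else
/* GNU extension: `##` before __VA_ARGS__ deletes the preceding comma
 * when the variadic part is empty. */
#define FORMAT_INTO(buf, size, fmt, ...) \
    snprintf((buf), (size), fmt, ## __VA_ARGS__)
#endif

/* Small self-check: exercises the macro with and without extra arguments. */
void format_into_demo(void)
{
    char buf[32];
    int n;

    /* No variadic arguments: the trailing comma must disappear. */
    n = FORMAT_INTO(buf, sizeof buf, "hello");
    assert(n == 5 && strcmp(buf, "hello") == 0);

    /* With variadic arguments, both forms expand identically. */
    n = FORMAT_INTO(buf, sizeof buf, "%d-%d", 1, 2);
    assert(n == 3 && strcmp(buf, "1-2") == 0);
}
```

Calling `format_into_demo()` from `main` is enough to try it. Without either mechanism, the zero-argument call would expand to `snprintf(buf, size, "hello", )`, which does not compile.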


Yes, consider the case of shecc. It needs only a handful of lines of C to interpret preprocessor directives. Rather than relying on existing tools like cpp, as, or ld, shecc stands alone as a minimalist cross-compiler. This design could be particularly useful for students studying compiler construction. See https://github.com/sysprog21/shecc/blob/master/src/lexer.c#L...


I largely meant a standard-compliant implementation though, which shecc doesn't claim to be. ;-) In comparison, I can easily see that this lexer is not suitable for a preprocessor, because C requires a superset of numeric tokens [1] during the preprocessing phase.

[1] https://en.cppreference.com/w/c/language/translation_phases#...
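
To make the "superset" point concrete, here is a rough sketch of a scanner for what the standard calls a pp-number, following the C11 grammar as I understand it. `pp_number_len` and `pp_number_demo` are made-up names, and the sketch deliberately ignores universal character names and the C23 digit separators.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns the length of the pp-number that starts
 * at s, or 0 if s does not start with one. */
size_t pp_number_len(const char *s)
{
    size_t i;

    /* A pp-number starts with a digit, or with '.' followed by a digit. */
    if (isdigit((unsigned char)s[0]))
        i = 1;
    else if (s[0] == '.' && isdigit((unsigned char)s[1]))
        i = 2;
    else
        return 0;

    for (;;) {
        unsigned char c = (unsigned char)s[i];

        if ((c == 'e' || c == 'E' || c == 'p' || c == 'P') &&
            (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2;   /* exponent letter followed by a sign */
        else if (isalnum(c) || c == '_' || c == '.')
            i += 1;   /* digit, identifier character, or '.' */
        else
            break;
    }
    return i;
}

/* Small self-check showing inputs that are not valid numeric constants
 * but still form a single preprocessing token. */
void pp_number_demo(void)
{
    assert(pp_number_len("0x1p-3") == 6); /* valid hex float */
    assert(pp_number_len("1.2.3") == 5);  /* not a valid constant */
    assert(pp_number_len("0xe+1") == 5);  /* one token, not 0xe + 1 */
    assert(pp_number_len("1+2") == 1);    /* '+' only follows e/E/p/P */
    assert(pp_number_len("abc") == 0);    /* an identifier instead */
}
```

The `0xe+1` case is the usual surprise: it is scanned as one pp-number and only rejected later when it is converted to a real token, so a lexer that accepts only valid integer and floating constants splits the input differently from a conforming preprocessor.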



