The C preprocessor is hilariously underspecified in the standard, so implementing the standard doesn't guarantee that you'll be able to handle real-world C programs (even ones that don't use GNU or clang extensions).
The K&R preprocessor was indeed underspecified and allowed lots of variation (many of those issues can be seen in the GCC manual [1]), but current ISO C does a much better job of that AFAIK. I think `## __VA_ARGS__` is the only popular preprocessor extension [2] at this moment, as the standard replacement (`__VA_OPT__`) is still very new.
Yes, consider the case of shecc. It takes just a handful of lines of C to interpret C preprocessor directives. Rather than relying on existing tools like cpp, as, or ld, shecc stands alone as a minimalist cross-compiler. This design could be particularly beneficial for students studying compiler construction. See https://github.com/sysprog21/shecc/blob/master/src/lexer.c#L...
I largely meant a standard-compliant implementation though, which shecc doesn't claim to be. ;-) In comparison I can easily see that this lexer is not suitable for a preprocessor, because C requires a superset of numeric tokens (pp-numbers) [1] during the preprocessing phase.
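To make the "superset of numeric tokens" point concrete, here is a rough sketch of how a preprocessor might scan a pp-number as I understand the ISO C grammar: start with a digit or `.` followed by a digit, then greedily take digits, letters, `_`, `.`, and an `e`/`E`/`p`/`P` immediately followed by a sign. The function name `pp_number_len` is made up for illustration and is not from shecc; the sketch also assumes plain ASCII input and ignores universal character names and C23 digit separators.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of shecc): returns the length of the
 * pp-number that starts at s, or 0 if no pp-number starts there.
 * Assumes a NUL-terminated ASCII string. */
size_t pp_number_len(const char *s)
{
    assert(s != NULL);
    size_t i;

    /* A pp-number begins with a digit, or with '.' followed by a digit. */
    if (isdigit((unsigned char)s[0]))
        i = 1;
    else if (s[0] == '.' && isdigit((unsigned char)s[1]))
        i = 2;
    else
        return 0;

    for (;;) {
        unsigned char c = (unsigned char)s[i];
        if ((c == 'e' || c == 'E' || c == 'p' || c == 'P') &&
            (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2; /* exponent letter immediately followed by a sign */
        else if (isalnum(c) || c == '_' || c == '.')
            i += 1; /* digit, identifier character, or another '.' */
        else
            break;
    }
    return i;
}
```

Under this sketch, `0xe+1` scans as a single five-character token rather than `0xe`, `+`, `1`, and `1.2.3` is one token too, even though neither is a valid number in later phases. A lexer that only recognizes well-formed integer and floating literals would split these differently, which is the mismatch I had in mind.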