A preprocessor "trick" not mentioned in the article is the use of GNU statement expressions.
The article describes the do {} while(0) trick to make macros look like statements. They can then basically be used like inline void function calls. But what if you want to write a macro to replace an inline function that returns a result?
Statement expressions [1] solve that problem. Now you can finally do this:
> A preprocessor "trick" not mentioned in the article is the use of GNU statement expressions.
That's probably because IAR sells compilers (and IDEs) for embedded software. IAR and Keil are the popular compilers in the "bare bones microcontroller C/Assembly" world.
It's a good introduction to the preprocessor, but I would dispute that these are advanced tricks.
The content is definitely interesting, but also very standard practice.
There are many wonderful and horrible things that can be done with macros, and some people get very creative, sometimes to a fault.
It's always a special feeling to see macros taking other macros as arguments, all of it spread through a deep hierarchy over several files!
I follow the advice of Scott Meyers to prefer the compiler to the preprocessor. Preprocessor code can inject bugs that can't be easily traced with debugging tools.
The pre-processor is pure evil and should be avoided when ever possible.
If you want to debug preprocessor-generated code, -gdwarf-4 -g3 usually does the trick with GCC and Clang. If it doesn't, you can just gcc -E it and run it through clang-format or something, and compile and debug the expanded version. It's really not a big deal. You can have a Makefile rule to do this automatically, for example https://github.com/alpha123/yu/blob/master/Makefile#L70-L103
Saying the C preprocessor can "inject bugs" and make code that "can't be easily traced" is deceptive and wrong. It is, for all intents and purposes, a solved problem. That Makefile rule could be 3x shorter if I didn't go for some extra niceties like not expanding system headers. (As usual, 20% of the result takes 80% of the work). The preprocessor doesn't "inject bugs" anyway, unless you write very careless macros. Modern C, with things like __typeof, _Generic, and GCC/Clang's statement expressions means your macros can be totally safe against overwriting variables, double-evaluation, type errors, and more. You can, of course, still do silly things, but C isn't about protecting the programmer anyway, and that's totally fine if you're prepared for it.
Perhaps in C++ the preprocessor is ‘pure evil’, but in C it's a very useful part of the language.
I work in the embedded space and use a compiler from Green Hills.
The preprocessor in either C or C++ is evil for the simple reason that you can't debug it when the thing you're debugging is in production, symbols are stripped, and the disassembly resembles nothing you can marry up to the actual C or C++ code. In simple cases, maybe, but in not-so-simple cases it's just evil.
A modern compiler will optimize out the global consts anyway, too...
If you have a modern compiler then it supports outputting the pre-processor result.
If you have a modern toolchain, stack unwinding is already friendly to macro expansion.
A production build does not mean you don't have symbols, it just means you don't have symbols in the executable. Stash your build symbols somewhere so you can symbolize the stacks offline. Macros don't change this. They are consistent output for the same input.
Macros are not evil. Abusable, yes, but they also solve various sets of problems just better than any other solution.
Outputting the pre-processor result is not that helpful, you're still stuck with deciphering things.
Macro expansion is useless, again, for the same reason: you can't marry it to the actual code. If you can do this now, I would be at most curious (not interested), as it's evil to begin with.
A production build in the embedded world typically means you don't have symbols. Often they are stripped to save space, and unless you have a USB stick or something you carry around with the map files, you're still toast.
And there are a whole host of other reasons why they are just pure evil:
1. Global scope only.
2. Unexpected results rather than error messages - ISO25259 won't even allow you to do this if that tells you anything.
3. Can't use sizeof
4. No type checking. Every compiler I've used, including GCC, won't warn if an int is compared to an unsigned.
5. can't take the address
6. Worst of all, the substituted value need not even be legal in the context where the #define is written, because it is evaluated at each point of reference, which enables another evil problem: referencing objects that are not declared yet.
I could go on... and I haven't even touched on strings yet.
With so many problems, I have a hard time justifying or championing their use.
So are functions, by that rationale. If you want to control scope of variables, pass in macro parameters.
> 2. Unexpected results rather than error messages
When have min() or max() produced unexpected results rather than error messages? They're macros, you know.
> Every compiler I've used, including GCC, won't warn if an int is compared to an unsigned.
-Wconversion
> 5. can't take the address
You can't take the address of a struct definition or a typedef, either. Macros are not replacements for functions; they're code generators.
> 6. Worst of all, the substituted value need not even be legal in the context where the #define is written, because it is evaluated at each point of reference, which enables another evil problem: referencing objects that are not declared yet.
I don't even... You do know that, following the preprocessing phase, the compiler actually checks the generated code for legality, right? And since when is forward referencing evil?
What does the comment about stack unwinding refer to?
I've always found macros pretty horrid for debugging because you can't step through them properly. I just tried with gcc 6.0 and gdb 7.11 and it still seems to be unsupported, 20-odd years after the first time I rued the lack of this functionality. At this rate I strongly suspect this will never, ever work. (Another black mark against macros that expand to anything significant, in my view.)
DWARF 3 and 4 can contain information about macros, and GDB can expand them (macro exp foo(bar,...)). Compile with the somewhat under-mentioned flags -gdwarf-4 -g3 to retain macro definitions. This lets you debug expansions easily, but doesn't map already-compiled macro expansions back to their macro definitions.
If you want to step through execution of a function defined by a macro, preprocess with -E -P and compile that. For more advanced (and pleasant) macro debugging, you can do a little trick:
1. Create a directory, e.g. include/dummy, and create empty versions of whatever headers you don't want to expand.
2. Run your files through $CC -E -P -Iinclude/dummy
3. clang-format is your friend for making the output much nicer.
4. Compile with explicit -include of the real versions of files in include/dummy.
Which turns out to be useful for more than just error messages (which is still useful in C++ if you're avoiding exceptions). Recently, for example, I did something like that to map a small fixed set of object properties to database columns.
If you want to see some actual code, I use X-macros, as well as other macros, heavily in the test suite for my (very) WIP programming language: https://github.com/alpha123/yu/tree/master/test
I use preprocessor macros for what I guess I'll call "hyper-local extraction" in an attempt to be as DRY as possible. I don't know how modern C++ would handle this better. It looks a bit like this:
auto a = [] {frob(-1,0,0); grob("a"); blah(...);};
auto b = [] {frob(1,0,0); grob("b"); blah(...);};
auto c = [] {frob(0,-1,0); grob("c"); blah(...);};
...
if (normal_order) {
a(); b(); c(); ...
} else if (...) {
b(); c(); a(); ...
} ...
Or for things like compile-time injection of wrappers that you need to be completely gone in production versions. An example would be wrapping all OpenGL calls with glGetError() checks in debug builds.
Modern C++ did not replace macros, nor did it even try. Some uses of macros were made obsolete (like min/max), but most were not.
Bit off topic, but I also work in embedded (automotive) with Green Hills compiler & Multi2000 debugger. After using GCC and GDB, Green Hills tools are so much inferior that it has become the biggest issue on project.
On top of it all, they require a constant(!) connection to the license server and will throw annoying popups if disconnected for a few minutes. It's so frustrating to be constantly punished by DRM even though I'm a paying customer.
At my last job we used the Green Hills MULTI toolchain as well. We had USB dongle licensing; no license server was needed, just the little USB stick.
And yeah, I hated using it too. Awful stuff. We went to great lengths to allow a testing version of the firmware to be built using Visual C++ and executed locally, just so we could avoid using the Green Hills stuff whenever possible.
Oh, I so agree with you. Their tools do SUCK! I constantly lose connectivity with their probes, their paid support is atrocious, and they can't even answer simple questions about their own products. Horrible, horrible experience. If you can avoid Green Hills, take it from a 20-year veteran: you should do so!
It's not pure evil. It handles `#include` for one. And I can't see how you can write cross-platform code without platform-specific macros or conditionally processed sections.
statement
statement
statement
// remember to comment this code out
statement
statement
statement
// --- end of debugging code
statement
statement
statement
nor do I want a runtime check
Of course ultimately I just wish C/C++ and every other language let you use the entire language at compile time like lisp (and a few others).
As for more advanced macro-fu, macro lists are my favorite.
I know some people don't like them. Typically those people would solve the issue by using some other language, e.g. Python, to generate C/C++. I'd prefer not to have yet another dependency.
There's no need to worry about a runtime check. Let DEBUG be a compile time constant, and the optimizer will remove the check.
C++ seems to be (very) slowly moving in the direction of using the whole language at compile time: see how constexpr has been getting better. I doubt it will ever get anywhere near what lisp lets you do though, and in those cases the preprocessor sure does help at least partially fill the gap. This is my favorite explanation of the technique you linked to:
> Let DEBUG be a compile time constant, and the optimizer will remove the check.
What do you feel is gained with "const int DEBUG" over "#define DEBUG"? While most compilers will do what you say most of the time, I haven't encountered a significant downside to the macro approach.
I also like that I can have a makefile where "make debug" adds "-DDEBUG" to the compiler command line and appends "_debug" to the executable name. With the const approach, is there an easy way to automatically generate both debug and production executables?
I guess the argument is orthogonality. It is a desirable property in programming languages to not have overlapping features, because people will always be confused about when to use which (the classic example being pointers vs. references in C++).
preprocessor tricks are cute, but there's definitely something to be said for staying completely w/in the language -- I've really come to appreciate this using Go, where I get incredible auto-formatting, rename refactor, and source introspection tools that would not be possible if a preprocessor were standard-issue.
(Of course, Java people have been saying something similar for a long time, but I never could get into it, 'cuz of a bunch of other stuff, like the IDE-centric culture & the language itself)
> but there's definitely something to be said for staying completely w/in the language
The C preprocessor is defined in the ISO C standard. For any implementation to call itself a conforming C implementation, it has to include a preprocessor to spec. The preprocessor is part of the ISO C language.
All of those things are had quite easily with C. Either as IDE/Editor features or with separate tools such as Clang Complete, Valgrind and many others.
spent a long time w/ c and c++, never found tools anywhere near as good as what i've got now for go.
maybe/probably not impossible, but a lot harder in practice. i have no personal experience, but people that seem to know what they're talking about say a big reason is the preprocessor makes life a lot harder. (and other stuff too, of course, like not having build semantics built into the language, and, for c++, just insane complexity in general.)
obviously the preprocessor is well-defined and standardized, but it still: operates on a totally different semantic level. it really is like two entirely different languages pasted together.
Would you mind listing a few tools you use with go?
I do not have much exposure to go, but from what I have seen there is a nice set of introspection libraries. Are you talking about tools that use these?
A final note: real-time semantic checking of multiple-expansion macros has been solved for quite a few years now and is in most IDEs and even in Vim and Emacs. Semantic analysis is done in the compiler; the LLVM/Clang libraries have given rise to a substantial set of tools that are now quite widely used and solve most of the problems.
And I believe C++17 with concepts is also going to have a dramatic effect on complexity and readability of template error messages.
these days my only c use is via obj-c in xcode, so i may well be unaware of the state of the art for straight c or straight c++ / other tools. xcode's refactorings and jump-to-defn (presumably powered by clang?) are still very flaky for me.
for go, i use:
goimports -- which is gofmt (which is awesome) + auto-adding/removing imports (just a heuristic so not always right, but usually right and quite handy)
godef -- for jumping to definitions
gorename -- for rename refactorings. i have not once gotten this to break code.
oracle -- for showing other things about code, like 'which types implement this interface?'. as far as i can tell, it hasn't lied to me yet.
i use emacs as my editor, which integrates easily w/ all of the above tools.
i'm not sure what libraries power the tools above, but i think it's true that packages like go/types do the heavy lifting and make them relatively easy to write.
> preprocessor tricks are cute, but there's definitely something to be said for staying completely w/in the language -- I've really come to appreciate this using Go,
Wait, doesn't Go use code generation as well? I thought their answer to "we don't have generics" is basically to run a generator as a preprocessor and generate a bunch of code?
definitely code generation breaks similar things to the preprocessor, but it's got a different set of trade-offs. basically the generated code is there to look at, so rename refactors or jump-to-definition or what have you will work reliably, until you regenerate the code. depending on how the code regeneration works (is it just annotations of existing code?), maybe the regenerated code will work and match the code definition.
from past experience, i view code generation with a healthy amount of skepticism, for the same reasons i'm skeptical of the preprocessor! it's cool, it's powerful, you lose out on a lot.
my impression/experience of go is that codegen is still not widespread. (i haven't seen it much, anyway.)
> #ifdefs don't protect you from misspelled words, #ifs do.
That's not (typically) correct. If an undefined identifier is used in a preprocessor expression, it expands (contracts?) to a constant 0. gcc does not warn about using undefined identifiers by default, and if there's an option to make it do so I'm not familiar with it. (EDIT: I am now, it's "-Wundef". Thanks, sigjuice.) For example, this program:
#include <stdio.h>
#define MISPELLED
int main(void) {
#if MISPELED
puts("MISPELED is true");
#else
puts("MISPELED is false");
#endif
#ifdef MISPELED
puts("MISPELED is defined");
#else
puts("MISPELED is not defined");
#endif
}
produces no compile-time diagnostics and prints this output:
MISPELED is false
MISPELED is not defined
If you have a C compiler that warns about this, that's great -- but you're likely to get spurious warnings for some code.
I'm a fan of compiling with -Wall -Wextra and manually disabling any irrelevant warnings (which is rare, and preferable to do on a per-file basis in the Makefile).
I'm desperately looking for a C preprocessor that dumps the AST (preferably in JSON format) instead of expanding the macros, and treats the C/C++ code as just data.
I plan to take the rest of the code and send it through clang to get C/C++ AST dump. Currently I'm getting clang's AST dump which "includes" all the headers and the size of the AST just blows up (e.g., 300k lines of AST for a 4k line C++ file that has about 10 include files).
My favorite "feature" of the pre processor is when I include Windows.h and then get a super strange compile error in my code because a macro Windows defines (with a super generic name) matches part of a name in my code (which is in a namespace btw) and expands into some unholy thing that is invisible at the source level... Fun times.
Deferred expansion is a common (well, not that common) trick this misses that I would have included. Generally the C preprocessor is pretty terrible though, you should seriously consider generating code with an external tool before doing anything nontrivial with it.
Reading this makes me a little sad. I know the preprocessor can be misused, but I really miss it in C#. A lot of highly repetitive code could be written much more elegantly with a few well-written macros.
There's always M4, which will preprocess just about anything you throw at it.
The syntax starts off pretty godawful, but eventually you learn to love it. C# and Java are much nicer with a little M4.
Everybody loves to hate on text-substitution macros (and hey, I write parsers and such, I'm a Smug Lisp Weenie, and generally I know how easy it is to break something and how much better real macros are), but sometimes they actually are the right tool for the job. Just because they're sometimes very much the wrong tool doesn't mean they should be kept out of your toolbox. Until all languages are either Lisp-ish or Forth-ish, I will maintain that there is a place for CPP/M4/etc. :-)
M4 is seriously awesome, just remember: with great text substitution comes great attention to detail. And heed the warning from the manual[1]:
> Some people find m4 to be fairly addictive. They first use m4 for simple problems, then take bigger and bigger challenges, learning how to write complex sets of m4 macros along the way. Once really addicted, users pursue writing of sophisticated m4 applications even to solve simple problems, devoting more time debugging their m4 scripts than doing real work. Beware that m4 may be dangerous for the health of compulsive programmers.
M4 is a great tool. A good chunk of programmers now seem to have never been exposed to preprocessors and what you can do with them; it's a shame really.
Not always possible though. For instance, I don't think it's possible to write "assert" as a function; it has to be a macro. But things like assert are admittedly pretty rare.
Why is this preferable over a preprocessor define? Is it because it can't be redefined easily? A max or count is certainly useful when there are other enum values but I don't see the huge advantage it has here.
That's the very basics of the C preprocessor, and the last part (#ifdef vs #if) didn't convince me at all; it's just a matter of programming style and use case.
In C++, probably never. In C, they can be necessary for doing type orthogonal things (like intrusive data structures). Type generic expressions in C11 can be a bit of a safer alternative for some stuff, but you generally need to give explicit implementations for each type.
For example,
#define sqrt(x) _Generic((x), long double: sqrtl, float: sqrtf, default: sqrt)(x)
Even in C++, they are good when you want to encapsulate a statement that may return from the enclosing function. Every time I write a parser or interpreter in C++, I end up writing a macro like
You have to use preprocessor in any non-trivial C++ project. For an idiomatic and very nice and clean example of a heavy preprocessor use see LLVM and Clang code.
You don't have to use the preprocessor, but it is definitely practical for certain things. It's probably not clear from my comment, but macros were what I assumed the above post was talking about though (e.g. min/max). FYI, some projects actually eschew the C/C++ preprocessor for other languages.
I tend to avoid macros, but stuff like generating things at compile time can sometimes be pretty useful. It can make it harder for other people to read the code if they're not familiar with it though (also any tools that need to parse), which is why I try to prefer using UltiSnips for most stuff I normally might do with a macro.
> Any kind of repeating lists (enum + a switch over all the values, for example) are best handled by define-include-undefine sequences.
> See how it is done in LLVM, see all the .def files there. I cannot think of a single viable alternative to this method.
I've seen x-macros. I'm not sure I've seen them used to good effect, but I'd definitely be curious how the llvm code uses them.
> If you want a dense, readable, maintainable code - you're out of options, you have to resort to the preprocessor.
I guess it wasn't entirely obvious from my comment, but I wasn't arguing against using the preprocessor. It's self-contained, mostly usable with common tools, etc.. Anything that's supported on more than one platform or has compile time options generally depends on it. Dense, readable, maintainable code is definitely a good thing, which is one of the reasons I'd personally like to see a usable reflection land in C++.
> which is one of the reasons I'd personally like to see a usable reflection land in C++.
Of course. Reflection and proper macros. At least something like mbeddr would have been great.
I never said preprocessor is the best possible option. I'm only saying that at the moment it is the only practically available option, and therefore it's unavoidable, like it or not. I personally hate it with a passion, but I have to use it extensively.
Statement expressions are supported by most popular compilers that are not made by Microsoft, including GCC and Clang. [1] https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html