C preprocessor tips and tricks (iar.com)
129 points by ingve on Feb 29, 2016 | 71 comments



A preprocessor "trick" not mentioned in the article is the use of GNU statement expressions.

The article describes the do {} while(0) trick to make macros look like statements. They can then basically be used like inline void function calls. But what if you want to write a macro to replace an inline function that returns a result?

Statement expressions [1] solve that problem. Now you can finally do this:

    int _foo(int x) { return 2 * x; }

    #ifdef __DEBUG__
    #   define foo(x) ({                      \
            int r = _foo(x);                  \
            printf(                           \
                "foo returned: %d (%s:%d)\n", \
                r, __FILE__, __LINE__         \
            );                                \
            r;                                \
        })
    #else
    #   define foo _foo
    #endif
They are supported by most popular compilers which are not made by Microsoft (including GCC and clang).

[1] https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html


The canonical use of which is safe min/max:

    #define min(a, b) ({ \
      __typeof__(a) _a = (a); \
      __typeof__(b) _b = (b); \
      _a > _b ? _b : _a; \
    })

    #define max(a, b) ({ \
      __typeof__(a) _a = (a); \
      __typeof__(b) _b = (b); \
      _b > _a ? _b : _a; \
    })
which avoid evaluating their arguments twice.
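
For contrast, a minimal sketch of the naive version (BAD_MIN is a made-up name) and the double evaluation it invites:

    #define BAD_MIN(a, b) ((a) < (b) ? (a) : (b))

    int x = 1, y = 10;
    int m = BAD_MIN(x++, y);  /* x++ runs twice: m == 2 and x == 3, not 2 */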


> A preprocessor "trick" not mentioned in the article is the use of GNU statement expressions.

That's probably because IAR sells compilers (and IDEs) for embedded software. IAR and Keil are the popular compilers in the "bare bones microcontroller C/Assembly" world.


It's a good introduction to the preprocessor, but I would dispute that these are advanced tricks. The content is definitely interesting, but it's also very standard practice.

There are many wonderful and horrible things that can be done with macros, and some people get very creative, sometimes to a fault. It's always a special feeling to see macros taking other macros as arguments, all of it spread through a deep hierarchy over several files!


> It's a good introduction to the preprocessor, but I would dispute that these are advanced tricks.

Yes. The X macro is missing, for example.

https://en.wikipedia.org/wiki/X_Macro

I would also add #error, #line... and command-line options to get the preprocessed output.
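
For reference, a minimal sketch of both directives (CONFIG_PLATFORM is a hypothetical build flag); the preprocessed output itself can be inspected with e.g. gcc -E file.c:

    #ifndef CONFIG_PLATFORM
    #error "CONFIG_PLATFORM must be defined; check your build flags"
    #endif

    #line 1 "generated.c"  /* diagnostics and __FILE__/__LINE__ now report this */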


> I would dispute that these are advanced tricks

Ok, we took out "advanced" and put in "C".

HN readers are like sandpaper for titles. My thought of the day.


I follow the advice of Scott Meyers to prefer the compiler to the preprocessor. Preprocessor code can inject bugs that can't be easily traced with debugging tools.

The preprocessor is pure evil and should be avoided whenever possible.


If you want to debug preprocessor-generated code, -gdwarf-4 -g3 usually does the trick with GCC and Clang. If it doesn't, you can just gcc -E it and run it through clang-format or something, and compile and debug the expanded version. It's really not a big deal. You can have a Makefile rule to do this automatically, for example https://github.com/alpha123/yu/blob/master/Makefile#L70-L103

Saying the C preprocessor can "inject bugs" and make code that "can't be easily traced" is deceptive and wrong. It is, for all intents and purposes, a solved problem. That Makefile rule could be 3x shorter if I didn't go for some extra niceties like not expanding system headers. (As usual, 20% of the result takes 80% of the work.) The preprocessor doesn't "inject bugs" anyway, unless you write very careless macros. Modern C, with things like __typeof, _Generic, and GCC/Clang's statement expressions, means your macros can be totally safe against overwriting variables, double-evaluation, type errors, and more. You can, of course, still do silly things, but C isn't about protecting the programmer anyway, and that's totally fine if you're prepared for it.

Perhaps in C++ the preprocessor is ‘pure evil’, but in C it's a very useful part of the language.


I work in the embedded space and use a compiler from Green Hills.

The preprocessor in either C or C++ is evil for the simple reason that you can't debug it when the thing you're debugging is in production, symbols are stripped, and the disassembly resembles nothing you can marry up to actual C or C++ code. In simple cases, maybe, but in not-so-simple cases it's just evil.

A modern compiler will optimize out the global consts anyway too...


If you have a modern compiler, it supports outputting the preprocessed result.

If you have a modern toolchain, stack unwinding is already friendly to macro expansion.

A production build does not mean you don't have symbols; it just means you don't have symbols in the executable. Stash your build symbols somewhere so you can symbolize the stacks offline. Macros don't change this: they produce consistent output for the same input.

Macros are not evil. Abusable, yes, but they also solve various classes of problems better than any other solution.


Outputting the preprocessed result is not that helpful; you're still stuck deciphering things.

Macro expansion is useless, again, for the same reason: you can't marry it up to actual code. If you can do this now, I would be at most curious (not interested), as it's evil to begin with.

A production build in the embedded world typically means you don't have symbols. Often they are stripped to save space, and unless you carry around a USB stick with the map files, you're still toast.

And there are a whole host of other reasons why they are just pure evil:

1. Global scope only.

2. Unexpected results rather than error messages. ISO 26262 won't even allow you to do this, if that tells you anything.

3. Can't use sizeof.

4. No type checking. Every compiler I've used, including GCC, won't warn if an int is compared to an unsigned.

5. Can't take the address.

6. Worst of all, the substituted value need not even be legal in the context where the #define is created, because it is evaluated at each point it is referenced, which leads to another evil problem: being able to reference objects that are not declared yet.
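
A minimal illustration of point 6 (names made up):

    #define NEXT_ID (counter + 1)  /* 'counter' doesn't have to exist yet */

    int counter = 0;
    int n = NEXT_ID;  /* only legal here, after counter is declared */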

I could go on... and I haven't even touched on strings yet.

With so many problems, I have a hard time justifying or championing their use.


> 1. Global scope only.

So are functions, by that rationale. If you want to control scope of variables, pass in macro parameters.

> 2. Unexpected results rather than error messages

When have min() or max() produced unexpected results rather than error messages? They're macros, you know.

> Every compiler I've used, including GCC, won't warn if an int is compared to an unsigned.

-Wconversion

> 5. Can't take the address.

You can't take the address of a struct definition or a typedef, either. Macros are not replacements for functions; they're code generators.

> 6. Worst of all, the substituted value need not even be legal in the context where the #define is created, because it is evaluated at each point it is referenced, which leads to another evil problem: being able to reference objects that are not declared yet.

I don't even... You do know that, following the preprocessing phase, the compiler actually checks the generated code for legality, right? And since when is forward referencing evil?


The vast, vast majority of your "evil" list is either just wrong or unrelated. I don't think you really know what macros are or how/when to use them.


What does the comment about stack unwinding refer to?

I've always found macros pretty horrid for debugging because you can't step through them properly. I just tried with gcc 6.0 and gdb 7.11 and it still seems to be unsupported, 20-odd years after the first time I rued the lack of this functionality. At this rate I strongly suspect this will never, ever work. (Another black mark against macros that expand to anything significant, in my view.)


DWARF 3 and 4 can contain information about macros, and GDB can expand them (macro exp foo(bar,...)). Compile with the somewhat under-mentioned flags -gdwarf-4 -g3 to retain macro definitions. This lets you debug expansions easily, but doesn't map already-compiled macro expansions back to their macro definitions.

If you want to step through execution of a function defined by a macro, preprocess with -E -P and compile that. For more advanced (and pleasant) macro debugging, you can do a little trick:

1. Create an e.g. include/dummy directory and create empty versions of whatever headers you don't want to expand.

2. Run your files through $CC -E -P -Iinclude/dummy

3. clang-format is your friend for making the output much nicer.

4. Compile with explicit -include of the real versions of files in include/dummy.

5. Debug normally.

It takes a little bit of effort to get right, but after that you can just throw it in a Makefile and forget about it (e.g. https://github.com/alpha123/yu/blob/master/Makefile#L96-L126).


Can you give an example of a problem which the preprocessor solves better than modern C++?


The X-macro pattern comes to mind:

  #define LIST_ERROR_TYPES(X) \
    X(OK, "") \
    X(ERR_OOM, "out of memory") \
    X(ERR_BAD_STRING_ENCODING, "string was invalid utf-8") \
    ...

  #define DEF_ENUM(ident,_) ident,
  typedef enum { LIST_ERROR_TYPES(DEF_ENUM) } err_t;

  const char *get_err_msg(err_t err) {
  #define IDENT2MSG(ident,msg) case ident: return msg;
    switch (err) {
    LIST_ERROR_TYPES(IDENT2MSG)
    default: return "unknown error";
    }
  }
This turns out to be useful for more than just error messages (which are still useful in C++ if you're avoiding exceptions). Recently, for example, I did something like that to map a small fixed set of object properties to database columns.

If you want to see some actual code, I use X-macros, as well as other macros, heavily in the test suite for my (very) WIP programming language: https://github.com/alpha123/yu/tree/master/test

There are some other interesting and generic macros that I frequently use in there too: https://github.com/alpha123/yu/blob/master/src/yu_common.h


I use preprocessor macros for what I guess I'll call "hyper-local extraction" in an attempt to be as DRY as possible. I don't know how modern C++ would handle this better. It looks a bit like this:

    #define a_ frob(-1,0,0); grob("a"); blah(...);
    #define b_ frob(1,0,0); grob("b"); blah(...);
    #define c_ frob(0,-1,0); grob("c"); blah(...);
    #define d_ frob(0,-1,1); grob("d"); blah(...);
    #define e_ frob(0,0,-1); grob("e"); blah(...);
    #define f_ frob(0,1,1); grob("f"); blah(...);
    
    if (normal_order) {
      a_ b_ c_ d_ e_ f_;
      c_ b_ a_ f_ e_ d_;
    } else if (special_order_1) {
      b_ c_ d_ e_ a_;
      d_ e_ c_ b_ a_ f_;
    } else if (...) {
      // ... and so on ...
    }
    
    #undef a_
    #undef b_
    #undef c_
    #undef d_
    #undef e_
    #undef f_


Modern C++ could use lambdas:

    auto a = [] {frob(-1,0,0); grob("a"); blah(...);};
    auto b = [] {frob(1,0,0); grob("b"); blah(...);};
    auto c = [] {frob(0,-1,0); grob("c"); blah(...);};
    ...

    if (normal_order) {
      a(); b(); c(); ...
    } else if (...) {
      b(); c(); a(); ...
    } ...


I guess that's pretty good.


Code generation, for example:

   #define DISALLOW_COPY_AND_ASSIGN(TypeName) \
     TypeName(const TypeName&) = delete;      \
     void operator=(const TypeName&) = delete
Or for things like compile-time injection of wrappers that you need to be completely gone in production versions. An example would be wrapping all OpenGL calls with glGetError() checks in debug builds.
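
A minimal sketch of such a wrapper (GL_CHECK is a made-up name; glGetError and GL_NO_ERROR are the real GL API):

    #ifdef DEBUG
    #  define GL_CHECK(call) do {                                  \
           call;                                                   \
           GLenum err_ = glGetError();                             \
           if (err_ != GL_NO_ERROR)                                \
               fprintf(stderr, "GL error 0x%x at %s:%d: %s\n",     \
                       (unsigned)err_, __FILE__, __LINE__, #call); \
       } while (0)
    #else
    #  define GL_CHECK(call) call
    #endif

    /* usage (assumes the GL headers and <stdio.h> are included): */
    GL_CHECK(glBindTexture(GL_TEXTURE_2D, texture));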

Modern C++ did not replace macros, nor did it even try. Some uses of macros were made obsolete (like min/max), but most were not.


Bit off topic, but I also work in embedded (automotive) with the Green Hills compiler & Multi2000 debugger. After using GCC and GDB, the Green Hills tools are so much inferior that they have become the biggest issue on the project.

On top of it all, they require a constant(!) connection to the license server and will throw annoying popups if disconnected for a few minutes. It's so frustrating to be constantly punished by DRM even though I'm a paying customer.

Sorry for rant, I just needed to vent a bit.


At my last job we used the Green Hills MULTI toolchain as well. We had USB dongle licensing; no license server was needed, just the little USB stick.

And yeah, I hated using it too. Awful stuff. We went to great lengths to allow a testing version of the firmware to be built using Visual C++ and executed locally, just so we could avoid using the Green Hills stuff whenever possible.


Oh, I so agree with you. Their tools do SUCK! I constantly lose connectivity with their probes, their paid support is atrocious, and they can't even answer simple questions about their own products. Horrible, horrible experience. If you can avoid Green Hills, take it from a 20-year veteran: you should do so!


It's not pure evil. It handles `#include` for one. And I can't see how you can write cross-platform code without platform-specific macros or conditionally processed sections.
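
For instance, the usual shape of it (a minimal sketch; _WIN32 is predefined by Windows compilers):

    #ifdef _WIN32
    #  include <windows.h>
    #  define PATH_SEP '\\'
    #else
    #  include <unistd.h>
    #  define PATH_SEP '/'
    #endif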


Some people hate the preprocessor. I love it. I miss it in every language that doesn't have one. I'd much rather be able to write

    statement
    statement
    statement
    #if DEBUG_FEATURE_X
        statement
        statement
        statement
    #endif
    statement
    statement
    statement
than

    statement
    statement
    statement
    // remember to comment this code out
    statement
    statement
    statement
    // --- end of debugging code
    statement
    statement
    statement
nor do I want a runtime check

Of course ultimately I just wish C/C++ and every other language let you use the entire language at compile time like lisp (and a few others).

As for more advanced macro foo, macro lists are my favorite:

http://games.greggman.com/game/keeping_lists_in_sync_in_c__/

I know some people don't like them. Typically those people would solve the issue by using some other language, like Python, to generate C/C++. I'd prefer not to have yet another dependency.


There's no need to worry about a runtime check. Let DEBUG be a compile time constant, and the optimizer will remove the check.
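
A minimal sketch of that approach (frobnicate and log_state are made-up names):

    enum { DEBUG = 0 };  /* 1 for debug builds */

    void log_state(void);

    void frobnicate(void) {
        if (DEBUG) {
            /* still parsed and type-checked when DEBUG is 0,
               but removed by the optimizer as dead code */
            log_state();
        }
    }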

C++ seems to be (very) slowly moving in the direction of using the whole language at compile time: see how constexpr has been getting better. I doubt it will ever get anywhere near what lisp lets you do though, and in those cases the preprocessor sure does help at least partially fill the gap. This is my favorite explanation of the technique you linked to:

http://journal.stuffwithstuff.com/2012/01/24/higher-order-ma...


> Let DEBUG be a compile time constant, and the optimizer will remove the check.

What do you feel is gained with "const int DEBUG" over "#define DEBUG"? While most compilers will do what you say most of the time, I haven't encountered a significant downside to the macro approach.

I also like that I can have a makefile where "make debug" adds "-DDEBUG" to the compiler command line and appends "_debug" to the executable name. With the const approach, is there an easy way to automatically generate both debug and production executables?


One advantage is that the protected code will be syntax- and type-checked even if it is "compiled out". That makes it less likely to bitrot.


Yeah, it's hard to keep exponentially many code configurations all compiling together.

It's also hard to support in an IDE, whereas a variable works nicely.


I guess the argument is orthogonality. It is a desirable property in programming languages to not have overlapping features, because people will always be confused about when to use which (the classic example being pointers vs. references in C++).


Nothing usually prevents you from using the preprocessor and then running your code through cpp before compiling/running in most languages.


preprocessor tricks are cute, but there's definitely something to be said for staying completely w/in the language -- I've really come to appreciate this using Go, where I get incredible auto-formatting, rename refactor, and source introspection tools that would not be possible if a preprocessor were standard-issue.

(Of course, Java people have been saying something similar for a long time, but I never could get into it, 'cuz of a bunch of other stuff, like the IDE-centric culture & the language itself)


> but there's definitely something to be said for staying completely w/in the language

The C preprocessor is defined in the ISO C standard. In order for any implementation to call itself a conforming C implementation, it has to include a preprocessor to spec. The preprocessor is part of the ISO C language.

> auto-formatting, rename refactor, source introspection tools

All of those things are quite easily had with C, either as IDE/editor features or with separate tools such as Clang Complete, Valgrind, and many others.


spent a long time w/ c and c++, never found tools anywhere near as good as what i've got now for go.

maybe/probably not impossible, but a lot harder in practice. i have no personal experience, but people that seem to know what they're talking about say a big reason is the preprocessor makes life a lot harder. (and other stuff too, of course, like not having build semantics built into the language, and, for c++, just insane complexity in general.)

obviously the preprocessor is well-defined and standardized, but it still operates on a totally different semantic level. it really is like two entirely different languages pasted together.


Would you mind listing a few tools you use with go?

I do not have much exposure to go, but from what I have seen there is a nice set of introspection libraries. Are you talking about tools that use these?

A final note: real-time semantic checking of multiple-expansion macros has been solved for quite a few years now and is in most IDEs and even in Vim and Emacs. Semantic analysis is done in the compiler; the LLVM/Clang libraries have given rise to a substantial set of tools that are now quite widely used and solve most of the problems.

And I believe C++17 with concepts is also going to have a dramatic effect on complexity and readability of template error messages.


these days my only c use is via obj-c in xcode, so i may well be unaware of the state of the art for straight c or straight c++ / other tools. xcode's refactorings and jump-to-defn (presumably powered by clang?) are still very flaky for me.

for go, i use:

goimports -- which is gofmt (which is awesome) + auto-adding/removing imports (just a heuristic so not always right, but usually right and quite handy)

godef -- for jumping to definitions

gorename -- for rename refactorings. i have not once gotten this to break code.

oracle -- for showing other things about code, like 'which types implement this interface?'. as far as i can tell, it hasn't lied to me yet.

i use emacs as my editor, which integrates easily w/ all of the above tools.

i'm not sure what libraries power the tools above, but i think it's true that packages like go/types do the heavy lifting and make them relatively easy to write.


> preprocessor tricks are cute, but there's definitely something to be said for staying completely w/in the language -- I've really come to appreciate this using Go,

Wait, doesn't Go use code generation as well? I thought their answer to "we don't have generics" was basically to run gen as a preprocessor and generate a bunch of code?

https://clipperhouse.github.io/gen/

Or is that something that is not used anymore?


definitely code generation breaks similar things to the preprocessor, but it's got a different set of trade-offs. basically the generated code is there to look at, so rename refactors or jump-to-definition or what have you will work reliably, until you regenerate the code. depending on how the code regeneration works (is it just annotations of existing code?), maybe the regenerated code will work and match the code definition.

from past experience, i view code generation with a healthy amount of skepticism, for the same reasons i'm skeptical of the preprocessor! it's cool, it's powerful, you lose out on a lot.

my impression/experience of go is that codegen is still not widespread. (i haven't seen it much, anyway.)


> #ifdefs don't protect you from misspelled words, #ifs do.

That's not (typically) correct. If an undefined identifier is used in a preprocessor expression, it expands (contracts?) to a constant 0. gcc does not warn about using undefined identifiers by default, and if there's an option to make it do so I'm not familiar with it. (EDIT: I am now, it's "-Wundef". Thanks, sigjuice.) For example, this program:

    #include <stdio.h>
    
    #define MISPELLED
    
    int main(void) {
    #if MISPELED
        puts("MISPELED is true");
    #else
        puts("MISPELED is false");
    #endif
    
    #ifdef MISPELED
        puts("MISPELED is defined");
    #else
        puts("MISPELED is not defined");
    #endif
    }
produces no compile-time diagnostics and prints this output:

    MISPELED is false
    MISPELED is not defined
If you have a C compiler that warns about this, that's great -- but you're likely to get spurious warnings for some code.


gcc has -Wundef (Warn if an undefined identifier is evaluated in an #if directive.)


Note that it is not enabled by -Wall or -Wextra, so it has to be passed explicitly.

I'm a fan of compiling with -Wall -Wextra and manually disabling any irrelevant warnings (which is rare, and preferable to do on a per-file basis in the Makefile).


I'm desperately looking for a C preprocessor that dumps the AST (preferably in JSON format) instead of expanding the macros, and that treats the C/C++ code as just data.

I plan to take the rest of the code and send it through clang to get a C/C++ AST dump. Currently I'm getting clang's AST dump, which "includes" all the headers, and the size of the AST just blows up (e.g., 300k lines of AST for a 4k-line C++ file that has about 10 include files).



Try CPIP in Python: http://cpip.sourceforge.net/


My favorite "feature" of the preprocessor is when I include Windows.h and then get a super strange compile error in my code because a macro Windows defines (with a super generic name) matches part of a name in my code (which is in a namespace, btw) and expands into some unholy thing that is invisible at the source level... Fun times.


#undef


The rationale for #if vs #ifdef was interesting, though.


Deferred expansion is a common (well, not that common) trick this misses that I would have included. Generally the C preprocessor is pretty terrible, though; you should seriously consider generating code with an external tool before doing anything nontrivial with it.

Regardless, here's a link that seems to cover this and other somewhat more obscure techniques: https://github.com/pfultz2/Cloak/wiki/C-Preprocessor-tricks,...
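
A minimal sketch of the deferral idea, following the pattern from that wiki:

    #define EMPTY()
    #define DEFER(id) id EMPTY()
    #define EXPAND(...) __VA_ARGS__
    #define A() 123

    A ()                /* expands to 123 */
    DEFER(A)()          /* expands to A (), one scan short of 123 */
    EXPAND(DEFER(A)())  /* the extra scan: expands to 123 */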


Reading this makes me a little sad. I know the preprocessor can be misused, but I really miss it in C#. A lot of highly repetitive code could be written much more elegantly with a few well-written macros.


There's always M4, which will preprocess just about anything you throw at it.

The syntax starts off pretty godawful, but eventually you learn to love it. C# and Java are much nicer with a little M4.

Everybody loves to hate on text-substitution macros (and hey, I write parsers and such, I'm a Smug Lisp Weenie, and generally I know how easy it is to break something and how much better real macros are), but sometimes they actually are the right tool for the job. Just because they're sometimes very much the wrong tool doesn't mean they should be kept out of your toolbox. Until all languages are either Lisp-ish or Forth-ish, I will maintain that there is a place for CPP/M4/etc. :-)

M4 is seriously awesome, just remember: with great text substitution comes great attention to detail. And heed the warning from the manual[1]:

> Some people find m4 to be fairly addictive. They first use m4 for simple problems, then take bigger and bigger challenges, learning how to write complex sets of m4 macros along the way. Once really addicted, users pursue writing of sophisticated m4 applications even to solve simple problems, devoting more time debugging their m4 scripts than doing real work. Beware that m4 may be dangerous for the health of compulsive programmers.

1. https://www.gnu.org/software/m4/manual/m4.html


For C# based environments, you should use T4[1] actually. It comes with Mono[2] as well as the normal CLR. It has an API and a command-line tool.

[1]: https://msdn.microsoft.com/en-us/library/bb126445.aspx [2]: https://github.com/mono/monodevelop/tree/master/main/src/add...


M4 is a great tool. A good chunk of programmers now seem to have never been exposed to preprocessors and what you can do with them; it's a shame, really.


Best to avoid macros if possible. Use inline functions. Use global variables instead of MAX_BUF, etc.


Not always possible, though. For instance, I don't think it's possible to write "assert" as a function; it has to be a macro. But things like assert are admittedly pretty rare.
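
A sketch of why (my_assert and my_assert_fail are made-up names): the macro stringizes its argument and captures the call site, which no function can do:

    void my_assert_fail(const char *expr, const char *file, int line);

    #define my_assert(expr) \
        ((expr) ? (void)0 : my_assert_fail(#expr, __FILE__, __LINE__))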


The name of a variable, global or otherwise, is not a constant expression, so there are contexts in which it can't be used (case labels or file-scope array sizes, for example).


Enums can be used for that purpose.


True, but only for type int.

    enum { MAX = 1000 };


Why is this preferable over a preprocessor define? Is it because it can't be redefined easily? A max or count is certainly useful when there are other enum values, but I don't see the huge advantage it has here.


Enumeration constants have scope.


That's the very basics of the C preprocessor, and the last part (#ifdef vs. #if) didn't convince me at all; it's just a matter of programming style and use case.


Just because you can doesn't mean that you should


Protip #1: don't use them.


In C++, probably never. In C, they can be necessary for doing type-orthogonal things (like intrusive data structures). Type-generic expressions in C11 can be a bit of a safer alternative for some things, but you generally need to give explicit implementations for each type.

For example,

    #define sqrt(x) _Generic((x), long double: sqrtl, float: sqrtf, default: sqrt)(x)


Even in C++, they are good when you want to encapsulate a statement that may return from the enclosing function. Every time I write a parser or interpreter in C++, I end up writing a macro like

    #define EXPECT_TYPE(node, Type) if ((node)->type != Type) { return FAIL; }
and using it extensively.
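
A usage sketch (the node layout, OK/FAIL, and the NODE_* constants are made up):

    int check_binop(const struct node *n) {
        EXPECT_TYPE(n, NODE_BINOP);
        EXPECT_TYPE(n->lhs, NODE_EXPR);
        EXPECT_TYPE(n->rhs, NODE_EXPR);
        return OK;
    }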


Also things like generating code at compile time, but a lot of people could argue it both ways.


You have to use the preprocessor in any non-trivial C++ project. For an idiomatic, very nice and clean example of heavy preprocessor use, see the LLVM and Clang code.


You don't have to use the preprocessor, but it is definitely practical for certain things. It's probably not clear from my comment, but macros (e.g. min/max) were what I assumed the above post was talking about. FYI, some projects actually eschew the C/C++ preprocessor for other languages.

I tend to avoid macros, but stuff like generating things at compile time can sometimes be pretty useful. It can make the code harder to read for other people who aren't familiar with it, though (and harder for any tools that need to parse it), which is why I prefer using UltiSnips for most stuff I might normally do with a macro.


If you want dense, readable, maintainable code, you're out of options: you have to resort to the preprocessor.

Any kind of repeating list (an enum plus a switch over all the values, for example) is best handled by define-include-undefine sequences.

See how it is done in LLVM, see all the .def files there. I cannot think of a single viable alternative to this method.

And it is far from "code generation"; it's merely listing things in a consistent manner.
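
A minimal sketch of the pattern (file and macro names made up):

    /* tokens.def */
    TOK(PLUS,  "+")
    TOK(MINUS, "-")

    /* in the consumer: */
    enum Token {
    #define TOK(name, str) TOK_##name,
    #include "tokens.def"
    #undef TOK
    };

    const char *token_str(enum Token t) {
        switch (t) {
    #define TOK(name, str) case TOK_##name: return str;
    #include "tokens.def"
    #undef TOK
        }
        return "?";
    }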


> Any kind of repeating list (an enum plus a switch over all the values, for example) is best handled by define-include-undefine sequences.

> See how it is done in LLVM, see all the .def files there. I cannot think of a single viable alternative to this method.

I've seen x-macros. I'm not sure I've seen them used to good effect, but I'd definitely be curious how the LLVM code uses them.

> If you want dense, readable, maintainable code, you're out of options: you have to resort to the preprocessor.

I guess it wasn't entirely obvious from my comment, but I wasn't arguing against using the preprocessor. It's self-contained, mostly usable with common tools, etc. Anything that's supported on more than one platform or has compile-time options generally depends on it. Dense, readable, maintainable code is definitely a good thing, which is one of the reasons I'd personally like to see usable reflection land in C++.


> which is one of the reasons I'd personally like to see usable reflection land in C++.

Of course. Reflection and proper macros. At least something like mbeddr would have been great.

I never said the preprocessor is the best possible option. I'm only saying that at the moment it is the only practically available option, and therefore it's unavoidable, like it or not. I personally hate it with a passion, but I have to use it extensively.



