Metaprogramming custom control structures in C (2012)

glouwbug · on July 10, 2021

I once back ported the core of the C++ STL to C to introduce templated containers to C with some macro magic. I learned in the end that I’m better off using C++

kleton · on July 10, 2021

klib? https://github.com/attractivechaos/klib

glouwbug · on July 10, 2021

CTL. https://github.com/glouw/ctl

It's functionally blackbox compatible with the STL for the major containers. Unless you're writing heap spaced firmware where C++ with an STL isn't available I recommend you just use modern day C++.

Unless, of course, you want blazing fast compile times!

naasking · on July 11, 2021

Neat, I actually prototyped something like this a few years ago to see how closely I could reproduce parametric polymorphism in C using macros, and it ends up looking a lot like Ada generic packages. The only difference from your style is that I have a "clever" hack to support generic-functions as well and the error messages preserve those names, so your example from github would look something like this:

    #include <stdio.h>

    #define P
    #define T int
    #include <vec.h>

    int compare(int* a, int* b) { return *b < *a; }

    int main(void)
    {
        vec(int) a = vec_init(int)();
        vec_push_back(int)(&a, 9);
        vec_push_back(int)(&a, 1);
        vec_push_back(int)(&a, 8);
        vec_push_back(int)(&a, 3);
        vec_push_back(int)(&a, 4);
        vec_sort(int)(&a, compare);
        foreach(vec(int), &a, it)
            printf("%d\n", *it.ref);
        vec_free(int)(&a);
    }

I really should finish that writeup some day.

ludamad · on July 10, 2021

I'm curious about how much easier it is to optimize compiler time for macros rather than templates here. In theory they wouldn't be all that different, but in practice it doesn't seem so

glouwbug · on July 10, 2021

CTL just copy pastes a bunch of code via an include for each new type. The following two tidbits are basically the same thing:

CTL

    #define P 
    #define T int
    #include <vec.h>

STL

    #include <vector>
    template class std::vector<int>;

C++ with its STL is just dramatically slower at compiling it all. C loves to chew through its basic syntax and O(1) lookup since every symbol is essentially unique (take that C++ and your function overloading!)

l- · on July 11, 2021

Strange to focus on compile times given libraries can be precompiled to object files over not having to deal with Turing complete templates.

eps · on July 10, 2021

Intrusive containers are a way more natural fit for a C codebase. They are also a superior choice for C++ code. The one and only plus of STL containers is that they come standard.

naasking · on July 11, 2021

I can sort of see some arguments for this conclusion, but I've never read any comprehensive argument or guideline explaining why intrusive containers are a good idea, and how best to use them. Do you have something like this?

eps · on July 12, 2021

Once you try both, it's really just some common sense:

1. Operating on intrusive containers requires no heap operations, all control structures are preallocated. This is golden, and in more ways than one.

2. Keeping an item in multiple containers has the exact same semantics as storing it in just one. With STL it requires switching from storing items to storing pointers to them. Meaning that you can't throw foo from an existing list into some extra map without reworking all list-related code.

naasking · on July 12, 2021

> Operating on intrusive containers requires no heap operations, all control structures are preallocated. This is golden, and in more ways than one.

I get the advantages in abstract, and intrusive containers are common in kernel programming where allocation is strictly controlled and objects have limited membership, I'm just curious about the applicability to more general programming and domain modelling, and whether it scales in terms of developer productivity.

For instance, how common are programs that store objects in multiple dictionaries and/or multiple lists simultaneously? You say this has the same semantics as storing it one container, but I'm not clear what you mean by this.

Also, if you want to extend an object's membership to another container, what sorts of changes are required compared to non-intrusive containers [1]? Adding an object to a non-intrusive container is a simple local change, ie. container.Add(item), but with an intrusive container you need to actually extend the type definition itself, an intrusive non-local change; this should inhibit some forms of extension, so I want a better understanding of that impact.

Finally, do intrusive containers retain meaningful advantages in languages with garbage collection? Certainly less allocation is one obvious benefit, but are there downsides? Functional languages in particular emphasize composing small scale, orthogonal data types to build programs, which intrusive containers basically turns inside out.

Intrusive containers almost seem like something a clever functional compiler should do for you, ie. it's a program transformation kind of like array of structs can be functionally transformed to a more efficient struct of arrays. A flow sensitive analysis identifies the collections to which an object might be added, and adds the requisite bookeeping info to the data type during compilation. Would be an interesting research topic at least.

[1] For instance, in a kernel, a process could be waiting on multiple file descriptors, or timers, or any number of other things, all of which might have their own queues to which the process might be added.

jstimpfle · on July 10, 2021

Maybe you're just better off not porting the STL :P

glouwbug · on July 10, 2021

It was a learning experience. And you're right, the real learning experience was to not port the STL

wudangmonk · on July 10, 2021

If you want to use metaprogramming in C you are better off doing the parsing/tokenizing yourself and creating your own macros than trying to use the C preprocessor, its a lot less work.

ludocode · on July 10, 2021

But then your code can't be used by anyone without your preprocessor. There is a lot of value in plain C metaprogramming because it can be compiled with an ordinary C compiler.

You can, however, use a script to generate some of the preprocessor boilerplate while still having the templates configurable and instantiable with an ordinary C compiler. This is how my metaprogramming library Pottery works:

https://github.com/ludocode/pottery

It uses #include for templates rather than code block macros, something the article doesn't really go into. It's more powerful this way and the templates are far more readable; aside from the generated metaprogramming boilerplate, the templated code looks like (and preprocesses to) ordinary C.

colejohnson66 · on July 11, 2021

Let me preface this with: I’m not a Rust fanatic (I still like C#). But Rust’s “procedural macros” are pretty interesting. They’re not just regular typed macros (which Rust has), but a function that takes a token stream and outputs a new token stream. This allows pretty cool macros like hex! for example: https://docs.rs/hex/0.4.2/hex/

bruce343434 · on July 10, 2021

So basically create your own language + compiler that compiles to C?

auxym · on July 10, 2021

Or use Nim, which compiles to C :)

bruce343434 · on July 10, 2021

To each their own :)

wudangmonk · on July 10, 2021

Yes, its less work

dmateos · on July 10, 2021

Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should.

adamnemecek · on July 10, 2021

So much time has been spent on making C something that it’s not. Sure. It’s interesting, but then some people start thinking that this is the way you should be writing C.

MaxBarraclough · on July 10, 2021

> So much time has been spent on making C something that it’s not.

GObject is a great example of this. It's a library for doing OOP with C, rather than switching to C++. It's clunky enough that they ended up making a whole new language which compiles to C/GObject, called Vala.

Vala rarely gets attention these days, but it's worth reading about. https://en.wikipedia.org/wiki/Vala_(programming_language)

pjmlp · on July 10, 2021

Here is another example, COM or as it originally started OLE.

Yes, basically it is C OO ABI, however only masochists would use it as such.

Everyone else just reaches for VB 6, C++ (MFC, ATL, WRL, WIL, C++/CX, C++/WinRT, yeah I know, so is WinDev....), .NET, Delphi, C++ Builder, ...

MaxBarraclough · on July 10, 2021

I have to admit ignorance on exactly what COM does. Doesn't Direct3D (all versions including the latest) use COM despite being essentially a C++-only API?

pjmlp · on July 10, 2021

Yes, they are COM based APIs.

All modern Windows APIs use COM to great extent, anyone that uses pure Win32 APIs is basically frozen in Windows XP view of the world.

Mini-history lesson, initially there was DDE, which allowed for basic IPC on Windows 3.x, then came OLE 1.0, both required endless amount of C boiler plate.

Then someone realized that from all concepts, there was a set of central ideas and Windows OO ABI was born, which basically maps to structure with function pointers, naturally with the same layout as Windows C++ compilers organize their vtables when using single inheritance.

So then we had OLE 2.0, COM, OCX, ActiveX, mini-COM, which are basically all the same, IUnknown interface with three methods QueryInterface(), AddRef() and Release(), everything else builds on top of that.

WinRT/UWP is COM updated with some of the ideas that were actually behind .NET birth, so alongside IUnknown there is IInspectable.

So on Windows any language that understands COM ABI (not only C++) can make use of an OS wide OO ABI.

Then there are lots of things one can do with it, regarding in which memory space those components run, how many threads they use, security,...

Going back to the initial question, it is designed to be called from C as well, but almost no one does it.

veltas · on July 10, 2021

I played with this kind of C, and it seems to be attractive to HN. Applying C's own (albeit limited) type-safety and encapsulation features where I can, but not going too exotic or trying too hard to be another language, seems to be the right compromise for me.

danlugo92 · on July 10, 2021

9 out of 10 times good old `doSomethingWith(foo: Foo)` style C is better than c++/whatever `foo.doSomething()`.

BenFrantzDale · on July 10, 2021

Contemporary C++ encourages this too. There’s a place for member functions—even convenience ones—but algorithms operate on data structures; they needn’t be part of the data structures.

pjmlp · on July 10, 2021

Since STL was introduced actually, even if functors were a bit painful to write, however if you look at codebases like Android, there is some catch up to do with modern times.

_meqs · on July 10, 2021

A while back I saw someone implement pattrn matching over Rust/Haskell-like algebraic data types with macros in C99. Think it was called something like Datatype99

Zababa · on July 10, 2021

Here's a link: https://github.com/Hirrolot/datatype99

atom3 · on July 10, 2021

When I found out about this, I wrote some macros to replicate some of the semantic of ISPC [1] in C++ as a fun experiment [2].

Of course it has no practical value but it was really cool to see it was possible to do so.

[1] https://ispc.github.io/

[2] https://github.com/aTom3333/ispc-in-cpp-poc

veltas · on July 10, 2021

If you introduce stuff like this into C it's major code smell. If I see a 'cool macro to do a custom loop' in code I immediately have to go look up what it does, and if it's as complicated as this I'm going to want to read it all to make sure it's actually right, I'll probably rediscover all the caveats he has at the end of the article, and I'll wonder what the original programmer was smoking.

jacoblambda · on July 10, 2021

I think what OP did was taking it a bit far but xacros definitely have their place in C. Most notably they are extremely useful for instantiating hardware interfaces that often come with large amounts of boilerplate.

I've also found use in them in combination with `_Generic` for implementing generic containers/data structures. Of course I don't use these all the time by any means but if I'm going to be using a complex data structure I might as well just use an xacro to do a glorified copy-paste for the structs and accessors. It's all type safe, doesn't make the code any less readable IMHO, and it's surprisingly very debugger friendly.

The xacros used for this are all together only about 10 lines of code but they've saved me countless hours of work/headache over the years and I've never once seen them blow up in a way that isn't immediately diagnosable and fixable.

I understand that macros are by no means to be used everywhere but I do find that macros/xacros provide an incredible amount of utility when putting together "library" or "HAL" code where there's a well defined interface but the internals can largely be hidden from the user/developer.

Of course I'd generally just prefer to use C++ but when that's not an option or would add undue friction, I find macros/xacros to be a useful tool for a developer.

veltas · on July 10, 2021

I don't see any mention of xacros in the article.

jacoblambda · on July 10, 2021

Ah I don't believe it does but they are normally bundled under the "complicated macros that add features or significantly change how C code is written" category which is what I thought you were referring to.

veltas · on July 11, 2021

Using xacros to generate a lot of boring data or simplistic init code is definitely worth the potential confusion/complexity for the reader, and they will no doubt appreciate the reduced effort in maintaining it despite potentially having to come out of their C comfort zone a bit. I've used macros to add or change control structures myself. I will sometimes use a TRY macro that runs an expression and returns when its value indicates an error (which reduces line count significantly in some files), but there are almost no caveats with that one and you can look at the macro definition and understand it immediately. Most of us use ARRAY_LENGTH or DIM macro to get the length of a fully typed array, this is highly conventional so there's no confusion.

But this article is adding tweaks to C control flow that really I could live without and are just about complicated enough to scare and waste the time of anyone stuck maintaining it, that's my concern. It's a real concern, based on my own experience and watching the many C programmers around me tackle fixing or upgrading such code. Macros that try to be "too smart" and try to simplify or make some control structure in C more elegant with hidden complexity and caveats under its preprocessor hood are harder to maintain, I don't think the 'nicer' code you get is worth the extra work maintaining it. It's probably applied with the least potential cost in a code base that will be written and maintained by only one person, but most professional C you need to assume will be maintained by other people as well as yourself. That's the angle I'm coming from.

jacoblambda · on July 11, 2021

OK that makes sense. I think then that we are in agreement. Sorry for the misunderstanding.

veltas · on July 12, 2021

No need to apologise, I just didn't understand your comment.

pwdisswordfish8 · on July 10, 2021

Indeed, better to just use C++.

veltas · on July 10, 2021

So I can implement code smell without ever leaving the comfort of C++? I've seen much scarier footguns in C++ than I've ever seen done in the C preprocessor.