Inheritance in C using structure composition

ChrisSD · on July 19, 2020

> Structures in C are not padded and they do not even hold any meta information, not even for the member names; hence during allocation, they are allocated the space just enough to hold the actual data.

I get that this is a simplification and that the point is there's no hidden metadata at runtime but this is dangerously badly worded. Structures in C can be padded, they just happen not to be in this particular case. All the fields are 4 bytes and they're on a 32bit platform so these particular structures will be packed.

That's not always going to be true however. For example, if you add something that isn't a pointer or an int. Or compile on a platform with 64bit integers and 32bit ints.
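
To make the padding point concrete, here is a minimal sketch. The struct names are made up for illustration, and the actual amount of padding depends on the platform's ABI, so the figures in the comments are typical values rather than guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structs, not taken from the article. */
struct all_ints {      /* every member has the same size and alignment */
    int a;
    int b;
};

struct mixed {         /* a char followed by an int */
    char tag;
    int  value;        /* most ABIs insert padding before this member */
};

/* The only layout guarantee used here: no padding before the first member. */
static_assert(offsetof(struct mixed, tag) == 0, "no leading padding");

/* Padding bytes in struct all_ints on this platform (typically 0). */
size_t all_ints_padding(void)
{
    return sizeof(struct all_ints) - 2 * sizeof(int);
}

/* Padding bytes in struct mixed on this platform (typically 3 with 4-byte ints). */
size_t mixed_padding(void)
{
    return sizeof(struct mixed) - (sizeof(char) + sizeof(int));
}
```

Printing both values with `%zu` on the target platform shows whether the compiler padded either struct.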

See also: http://www.catb.org/esr/structure-packing/

quelsolaar · on July 19, 2020

The first member of a struct is guaranteed not to have padding before it according to the C spec, so for this use case this is always true. Implementations are free to have padding in other places, but pretty much all implementations adhere to an ABI, and that forces the padding to be well defined.

astrobe_ · on July 19, 2020

You mean 64-bit pointers and 32-bit integers, I guess.

ChrisSD · on July 19, 2020

You're right, thanks. I'd edit but it's too late now.

boomlinde · on July 20, 2020

You're thinking of struct members, not structs

twoodfin · on July 19, 2020

This is exactly how C++ structures single-inheritance types without virtual member functions.

Supporting virtual member functions (i.e. runtime polymorphism) only requires adding a vtable pointer—which IIRC Linux also does for some of its own structural subtyping, at least in its Virtual File System component.
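
A rough sketch of the vtable-pointer idea in plain C, assuming a made-up `shape`/`rect` pair. It mirrors the concept only; it is not the layout any particular C++ compiler or the Linux VFS actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical. */
struct shape;

struct shape_vtable {
    double (*area)(const struct shape *self);
};

struct shape {
    const struct shape_vtable *vt;   /* the "vtable pointer" */
};

struct rect {
    struct shape base;               /* must be the first member */
    double w, h;
};

static_assert(offsetof(struct rect, base) == 0, "base must come first");

static double rect_area(const struct shape *self)
{
    /* Valid downcast because base is the first member of struct rect. */
    const struct rect *r = (const struct rect *)self;
    return r->w * r->h;
}

static const struct shape_vtable rect_vtable = { rect_area };

struct rect rect_make(double w, double h)
{
    struct rect r = { { &rect_vtable }, w, h };
    return r;
}

/* "Virtual call": dispatch through whatever table the object carries. */
double shape_area(const struct shape *s)
{
    return s->vt->area(s);
}
```

A caller only needs a `struct shape *`: `shape_area(&r.base)` ends up in `rect_area` without the caller knowing the concrete type.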

Multiple inheritance requires some more bookkeeping by the compiler of appropriate offsets, but structurally it doesn't change very much.

It's not surprising: C++ was famously originally implemented as a preprocess transformation into C. Inside the C++ Object Model[1] is a fascinating deep dive on how C++ semantics map to C constructs.

[1] https://www.oreilly.com/library/view/inside-the-c/0201834545...

_28jh · on July 19, 2020

I don’t know if this is actually how C++ compilers do inheritance, but reading about how GTK does object-oriented in C[1] taught me a lot about what (might be) happening behind the scenes in C++.

1: https://www.linuxtopia.org/online_books/gui_toolkit_guides/g...

ChrisMarshallNY · on July 19, 2020

It's a bit wild, seeing this. I was doing it with C in the mid-1990s. In fact, I designed an entire SDK around it, that is still used, to this day.

I called it my "faux object" pattern, and I used a lot of function pointers and property pointers (like the OP mentioned). That allowed a "poor man's polymorphism." I may have actually published it as a pattern, but I'm not sure anyone ever read it.

The reason that I did it, was that C++ was still clambering out of the bassinet, back then, and we needed a cross-platform way to translate stuff that required an object model, across an opaque binary interface. We used C to transfer the state and data, and built object frameworks on either side of the coupling, with C++ or Object Pascal.

On the Mac platform, we also used Pascal types, which ensured a predictable stack frame.

It was pretty klunky, but it worked.

User23 · on July 19, 2020

I believe that the C standard now explicitly supports casting a struct to the type of its first member. I learned the pattern in the late 90s from looking at the gtk project's glib library.
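
For reference, a minimal sketch of that cast, with hypothetical `animal` and `dog` structs. It relies on the rule that a pointer to a struct, suitably converted, points to its first member.

```c
#include <assert.h>
#include <stddef.h>

struct animal {            /* hypothetical "base" */
    int legs;
};

struct dog {               /* hypothetical "derived" */
    struct animal base;    /* first member, so it sits at offset 0 */
    int good_boy;
};

static_assert(offsetof(struct dog, base) == 0, "base is the first member");

/* Works on any struct that starts with a struct animal. */
int animal_legs(const struct animal *a)
{
    return a->legs;
}

int dog_legs(const struct dog *d)
{
    /* Upcast: the struct's address is also the address of its first member. */
    return animal_legs((const struct animal *)d);
}
```

Writing `&d->base` instead of the cast gives the same pointer and keeps the compiler's type checking, which is usually the safer spelling.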

Then I realized I sometimes needed to downcast, so the base struct ended up with a type enum. And then I realized sometimes I wanted to call specialized functions, so I reinvented vtables. To this day I consider C++ an unwieldy over-complicated beast with a specification that is literally too complex for a verifiable implementation, but that brief exercise gave me a sufficient appreciation for the designers of C++ to respect their creation.
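
The type-enum step might look something like the sketch below. The node types and the `as_int_node` helper are invented for illustration, not taken from glib.

```c
#include <assert.h>
#include <stddef.h>

enum node_type { NODE_INT, NODE_STR };   /* hypothetical tags */

struct node {                /* base: carries the runtime type tag */
    enum node_type type;
};

struct int_node {
    struct node base;        /* first member */
    int value;
};

struct str_node {
    struct node base;        /* first member */
    const char *text;
};

static_assert(offsetof(struct int_node, base) == 0, "base must come first");
static_assert(offsetof(struct str_node, base) == 0, "base must come first");

/* Checked downcast: returns NULL when the tag does not match. */
struct int_node *as_int_node(struct node *n)
{
    if (n == NULL || n->type != NODE_INT)
        return NULL;
    return (struct int_node *)n;
}
```

The tag has to be set correctly when each object is created; nothing in the language enforces that, which is part of why this gets messy as it grows.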

marmaduke · on July 19, 2020

When I wrote a C library, I also implemented this sort of pattern. I think it's a mark of a "less is (exponentially) more" tool that allows for convergent patterns among disconnected users. C++ is a "more is more" tool that results in divergent patterns outside particular app or lib specific communities.

icedchai · on July 19, 2020

The Amiga OS used this pattern everywhere. I first learned C on an Amiga in the late 80’s.

ChrisMarshallNY · on July 19, 2020

I believe that. I know that I learned it somewhere, but the origin is lost in the mists of antiquity.

I did see it somewhat formalized in the late 1990s, though, with Apple's QuickDraw GX.

m4r35n357 · on July 19, 2020

Dumb but genuine question from a non-expert programmer:

Most of the "OO" world seems to have abandoned inheritance as an anti-pattern in most cases (apart from genuine "is a" relationships).

I'm guessing the two case studies are in the list of exemptions from this rule.

So, is this considered a generally good thing to do in c?

astrobe_ · on July 19, 2020

I believe this is because of Liskov's substitution principle (LSP), which is the hidden gotcha that awaits around the corner beyond the "manager is-a person" [1] example. The "square is-a rectangle... or maybe not" discussions that pop up from time to time are an illustration of the problem. In practice, when you produce a lot of classes and derive a lot of classes (because of the open-closed principle), it is easy to blunder.

[1] Yes, they have feelings too.

vidarh · on July 20, 2020

The problem tends to be that "is-a" is conveniently short but it confounds several different types of inheritance.

Inheritance in most OO languages really means "api is a superset of, and implementation is a specialization (and possibly superset) of" while in natural language "is-a" implies a relationship where some facets might be expanded, while others may be more restricted.

This is really partially a weakness in the expressiveness of our languages, partially a problem of our frequent insistence on mutability. Often the problem goes away with immutability: if your "square" is immutable, inheriting from a "rectangle" class that allows setting width and height to different values isn't a problem, because the result would be a new object and that result can be a rectangle.
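
A small sketch of the immutable variant, written in C to match the article. The `rect`/`square` structs and `rect_with_height` are hypothetical, and "immutable" here is only a convention: the function never writes through its argument.

```c
#include <assert.h>
#include <stddef.h>

struct rect   { double w, h; };
struct square { struct rect base; };   /* "square is-a rect" via the first member */

static_assert(offsetof(struct square, base) == 0, "base must be the first member");

struct square square_make(double side)
{
    struct square s = { { side, side } };
    return s;
}

/* Non-mutating "setter": leaves *r alone and returns a new rect. */
struct rect rect_with_height(const struct rect *r, double h)
{
    struct rect out = { r->w, h };
    return out;
}
```

Calling `rect_with_height(&s.base, 5.0)` on a square yields a plain `struct rect`, and the original square keeps equal sides.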

But sometimes we also genuinely would be best served by inheriting implementation and public API separately without having to create cumbersome facades etc. - it's just that making an API of a subclass a subset of the API of the superclass violates a lot of expectations people tend to have about OO systems, and so few systems outside of prototype-based ones seem to allow it without resorting to ugliness like overriding methods to make them throw exceptions etc.

vidarh · on July 19, 2020

It's not that the OO world has abandoned inheritance, but that inheritance used to be overused in a lot of cases where composition provides the same flexibility.

So the typical advice would be to consider composition first, and to limit inheritance to genuine "is a" relationships as you say. But in some domains there are a lot of genuine "is a" relationships, so that does not in any way remove the use of inheritance.

rabidrat · on July 19, 2020

The feature article is "inheritance using composition"; how/why could you consider strict composition first?

Isn't the problem almost entirely multiple inheritance, then?

vidarh · on July 20, 2020

The article is using composition for inheritance as an implementation detail. The effect is the same as "full" inheritance in that the "subclass" is fully inheriting the aspects of the "superclass" unless you introduce a vtable to allow overriding effects.

In terms of why to avoid inheritance, there are many reasons not tied to multiple inheritance, but the foremost one is to avoid exposing implementation details that may not make sense.

One thing to remember is that in OO public APIs "A is a B" really translates to "A's API is a superset of B's". Which is really not what we often think of as "is a" in daily speech. E.g. we might consider a Circle "is a" Ellipse to be a reasonable statement, because in day to day speech "is a" tends to denote a specialization that may be a superset and subset of different aspects of something at the same time, but if these objects are mutable, then allowing you to set all the parameters of an Ellipse for your Circle object will allow you to create a Circle object that is not actually a Circle.
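
The mutable Circle/Ellipse trap can be sketched in C with the same first-member trick. The names below are hypothetical and the invariant check is only there to make the breakage visible.

```c
#include <assert.h>
#include <stddef.h>

struct ellipse { double rx, ry; };
struct circle  { struct ellipse base; };   /* "circle is-a ellipse" */

static_assert(offsetof(struct circle, base) == 0, "base must be the first member");

/* Part of the ellipse API, so every "subclass" inherits it too. */
void ellipse_set_radii(struct ellipse *e, double rx, double ry)
{
    e->rx = rx;
    e->ry = ry;
}

struct circle circle_make(double r)
{
    struct circle c = { { r, r } };
    return c;
}

/* A circle is only a circle while both radii agree. */
int circle_invariant_holds(const struct circle *c)
{
    return c->base.rx == c->base.ry;
}
```

After `ellipse_set_radii(&c.base, 1.0, 2.0)` the object is still typed as a circle, but `circle_invariant_holds(&c)` returns 0.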

So really, the point is to avoid inheritance unless you actually want to inherit and possibly expand on the full API of the class you're inheriting from. If I want a mutable Circle object, I shouldn't be inheriting from a mutable Ellipse - in fact in terms of API it may well make more sense for my Ellipse to inherit the API of a Circle.

You'll often find those kinds of inverted relationships in OO. Some types of OO can handle this - e.g. prototype-based inheritance would allow you to simply remove methods that make no sense in the subclass; alternatively you can trigger exceptions etc., but this violates what is very often a central contract of inheritance that users tend to rely on: that you can treat an object as if it is an object of its superclass in most contexts. It's better then to aim for subclassing to strictly superset the API.

sergeykish · on July 19, 2020

This is a function operating on values with slightly relaxed type checks. It's the default mode in dynamic languages:

    // Each node is an array: [prev, next, value]
    let list_add = (n, prev, next) => {
      prev[1] = n
      n[1] = next
      n[0] = prev
      if (next) next[0] = n
    }
    let foo = [null, null, 3]
    let bar = [null, null, 5]
    list_add(bar, foo, null)  // links bar after foo

Looks a bit cooler with C struct names as offsets.
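
For comparison, a rough C translation of the snippet above, with named members in place of the numeric indices. The `item` struct is a made-up container for the value.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *prev;   /* index 0 in the array version */
    struct list_head *next;   /* index 1 in the array version */
};

struct item {                 /* hypothetical container */
    struct list_head link;    /* first member, so an item pointer doubles as a link pointer */
    int value;                /* index 2 in the array version */
};

static_assert(offsetof(struct item, link) == 0, "link must be the first member");

/* Insert n between prev and next; next may be NULL at the tail. */
void list_add(struct list_head *n, struct list_head *prev, struct list_head *next)
{
    prev->next = n;
    n->next = next;
    n->prev = prev;
    if (next != NULL)
        next->prev = n;
}
```

`list_add(&bar.link, &foo.link, NULL)` corresponds to the `list_add(bar, foo, null)` call above.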

Inheritance would be building container hierarchy on top of that. Could be brittle in any language.

quelsolaar · on July 19, 2020

There are some good reasons to use this. A typical thing is to have a descriptor header followed by multiple types of data. The descriptor header says what kind of data is following. You can obviously just use a void pointer in the descriptor that points to a separate data struct, but having them both in the same memory block means fewer allocations, and fewer cache misses, giving better performance.
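
One way to sketch the single-allocation layout is with a C99 flexible array member, so the payload sits directly after the descriptor. The `block` struct and `block_new` below are hypothetical, and this is only one of several ways to lay it out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum payload_kind { PAYLOAD_BYTES, PAYLOAD_TEXT };  /* hypothetical kinds */

struct block {
    enum payload_kind kind;   /* descriptor: says what kind of data follows */
    size_t len;               /* payload size in bytes */
    unsigned char data[];     /* flexible array member: payload follows the header */
};

/* One malloc covers header and payload; release it with a single free(). */
struct block *block_new(enum payload_kind kind, const void *src, size_t len)
{
    assert(src != NULL || len == 0);

    struct block *b = malloc(sizeof *b + len);
    if (b == NULL)
        return NULL;
    b->kind = kind;
    b->len = len;
    if (len > 0)
        memcpy(b->data, src, len);
    return b;
}
```

The void-pointer alternative would need a second `malloc` for the payload and a second pointer chase on every access.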

megous · on July 19, 2020

Struct composition is used pretty much everywhere in the Linux kernel. It's a pretty nice pattern, along with container_of() macro.
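
A simplified stand-in for `container_of()`, to show the idea. The kernel's real macro adds type checking, and the `task` struct here is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified version: step back from the member to the start of the enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node {            /* hypothetical intrusive list link */
    struct list_node *next;
};

struct task {
    int id;
    struct list_node link;    /* embedded, and deliberately not the first member */
};

static_assert(offsetof(struct task, link) != 0, "link is not at offset 0 here");

/* Recover the enclosing task from a pointer to its embedded link. */
struct task *task_from_link(struct list_node *n)
{
    return container_of(n, struct task, link);
}
```

Unlike the first-member cast, this works wherever the member sits inside the struct, so one object can be on several lists at once.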

chris_wot · on July 19, 2020

LibreOffice does this to power its typelib library, which is part of UNO.

I’m trying to figure out how this stuff works right now by building unit tests so I can add to my work in progress book.

https://github.com/chrissherlock/libreoffice-experimental/bl...

The chapter I’m writing deals with types:

https://chris-sherlock.gitbook.io/inside-libreoffice/univers...

unixguy1337 · on July 19, 2020

Check https://www.cs.rit.edu/~ats/books/ooc.pdf from 1993. It has been mentioned and discussed earlier on HN, like https://news.ycombinator.com/item?id=7011540

sd314 · on July 19, 2020

C is not designed for OOP. You will write much nicer C code without such patterns.

haberman · on July 19, 2020

I agree. While it can occasionally work alright (as with the examples in the article), I recommend a very light touch with this stuff. If you get to the point where you are implementing casts, RTTI, etc., stop and rethink. I say this as someone who has gone down this path and later regretted it.

User23 · on July 19, 2020

In a professional context I couldn't agree more. However for a hobby project I couldn't disagree more. Implementing safe down-casting (up is trivial), dynamic dispatch, encapsulation, and other OO(ish) features is an absolutely wonderful learning experience. Like most first attempts, it will probably be a mess as you say, but there can be a joy in making a mess of things so long as you're not inflicting it on the unsuspecting.

haberman · on July 19, 2020

Yes I agree that for learning purposes it can be very illuminating. :)

quelsolaar · on July 19, 2020

No, actually C was designed with this pattern in mind. That is why the C standard guarantees no padding before the first struct member.