> Structures in C are not padded and they do not even hold any meta information, not even for the member names; hence during allocation, they are allocated the space just enough to hold the actual data.
I get that this is a simplification and that the point is there's no hidden metadata at runtime, but this is dangerously badly worded. Structures in C can be padded; they just happen not to be in this particular case. All the fields are 4 bytes and they're on a 32-bit platform, so these particular structures will be packed.
That's not always going to be true, however. For example, it breaks if you add something that isn't a pointer or an int, or compile on a platform with 64-bit pointers and 32-bit ints.
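To make the padding point concrete, here is a minimal sketch. The struct names are made up for illustration and are not from the article, and the exact sizes depend on the compiler and ABI, so the code prints them rather than assuming them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structs, not taken from the article. */
struct all_words {      /* every member has the same size and alignment */
    int a;
    int b;
};

struct mixed {          /* a char followed by an int usually leaves a gap */
    char tag;
    int  value;
};

/* Bytes in struct mixed that hold no member data. ABI dependent. */
size_t mixed_padding(void)
{
    return sizeof(struct mixed) - (sizeof(char) + sizeof(int));
}

void print_layout(void)
{
    /* The one layout rule that holds everywhere: nothing precedes the first member. */
    assert(offsetof(struct mixed, tag) == 0);

    printf("sizeof(struct all_words) = %zu\n", sizeof(struct all_words));
    printf("sizeof(struct mixed)     = %zu\n", sizeof(struct mixed));
    printf("offsetof(mixed, value)   = %zu\n", offsetof(struct mixed, value));
    printf("padding bytes in mixed   = %zu\n", mixed_padding());
}
```

Calling `print_layout()` from `main` on a mainstream desktop ABI will typically report a few padding bytes for `struct mixed` and none for `struct all_words`, but that is an observation about common platforms, not a guarantee.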
The first member of a struct is guaranteed not to have padding before it according to the C spec, so for this use case this is always true. Implementations are free to have padding in other places, but pretty much all implementations adhere to an ABI, and that forces the padding to be well defined.
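A small sketch of what that guarantee buys you. The `person` and `manager` types and the helper names are hypothetical; the only thing relied on is that the embedded base struct is the first member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "base" and "derived" types for illustration. */
struct person {
    const char *name;
    int age;
};

struct manager {
    struct person base;   /* must be the first member */
    int reports;
};

/* "Upcast": a pointer to the struct also points at its first member. */
struct person *manager_as_person(struct manager *m)
{
    assert(m != NULL);
    return &m->base;      /* same address as (struct person *)m */
}

/* "Downcast": only valid when p really is the base of a struct manager. */
struct manager *person_as_manager(struct person *p)
{
    assert(p != NULL);
    return (struct manager *)p;
}

/* Code written against the base type works on any "derived" object. */
int person_age(const struct person *p)
{
    return p->age;
}
```

Nothing here checks that the downcast is legitimate; the caller has to know what the pointer really refers to.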
This is exactly how C++ structures single-inheritance types without virtual member functions.
Supporting virtual member functions (i.e. runtime polymorphism) only requires adding a vtable pointer—which IIRC Linux also does for some of its own structural subtyping, at least in its Virtual File System component.
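For anyone who hasn't seen it spelled out, a hand-rolled vtable in C might look roughly like this. It is a sketch with invented names (`shape`, `rect`, `shape_vtable`), not what any particular C++ compiler or the Linux kernel actually emits.

```c
#include <assert.h>
#include <stddef.h>

struct shape;                       /* hypothetical base "class" */

struct shape_vtable {               /* one table per concrete type */
    double (*area)(const struct shape *self);
    const char *(*name)(const struct shape *self);
};

struct shape {
    const struct shape_vtable *vt;  /* the vtable pointer */
};

struct rect {
    struct shape base;              /* first member, so a rect * converts to a shape * */
    double w, h;
};

static double rect_area(const struct shape *self)
{
    const struct rect *r = (const struct rect *)self;
    return r->w * r->h;
}

static const char *rect_name(const struct shape *self)
{
    (void)self;
    return "rect";
}

static const struct shape_vtable rect_vtable = { rect_area, rect_name };

void rect_init(struct rect *r, double w, double h)
{
    r->base.vt = &rect_vtable;
    r->w = w;
    r->h = h;
}

/* Dynamic dispatch: the caller only knows it has a shape. */
double shape_area(const struct shape *s)
{
    assert(s != NULL && s->vt != NULL);
    return s->vt->area(s);
}

const char *shape_name(const struct shape *s)
{
    assert(s != NULL && s->vt != NULL);
    return s->vt->name(s);
}
```

Adding another "subclass" means writing another struct with `struct shape` first, its own functions, and its own table; `shape_area` never changes.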
Multiple inheritance requires some more bookkeeping by the compiler of appropriate offsets, but structurally it doesn't change very much.
It's not surprising: C++ was famously originally implemented as a preprocessor that translated it into C. Inside the C++ Object Model[1] is a fascinating deep dive on how C++ semantics map to C constructs.
I don’t know if this is actually how C++ compilers do inheritance, but reading about how GTK does object-oriented programming in C[1] taught me a lot about what (might be) happening behind the scenes in C++.
It's a bit wild, seeing this. I was doing it with C in the mid-1990s. In fact, I designed an entire SDK around it, that is still used, to this day.
I called it my "faux object" pattern, and I used a lot of function pointers and property pointers (like the OP mentioned). That allowed a "poor man's polymorphism." I may have actually published it as a pattern, but I'm not sure anyone ever read it.
The reason I did it was that C++ was still clambering out of the bassinet back then, and we needed a cross-platform way to translate stuff that required an object model across an opaque binary interface. We used C to transfer the state and data, and built object frameworks on either side of the coupling, with C++ or Object Pascal.
On the Mac platform, we also used Pascal types, which ensured a predictable stack frame.
I believe that the C standard now explicitly supports casting a pointer to a struct to a pointer to the type of its first member. I learned the pattern in the late 90s from looking at the GTK project's GLib library.
Then I realized I sometimes needed to downcast, so the base struct ended up with a type enum. And then I realized sometimes I wanted to call specialized functions, so I reinvented vtables. To this day I consider C++ an unwieldy over-complicated beast with a specification that is literally too complex for a verifiable implementation, but that brief exercise gave me a sufficient appreciation for the designers of C++ to respect their creation.
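The type-enum step described above can be sketched like this. All the names (`node`, `number_node`, `node_as_number`, and so on) are invented for the example; it is one way to do a checked downcast, not a reconstruction of the original code.

```c
#include <assert.h>
#include <stddef.h>

enum node_type { NODE_NUMBER, NODE_STRING };   /* hypothetical type tags */

struct node {
    enum node_type type;    /* set once by the "constructor" */
};

struct number_node {
    struct node base;       /* first member */
    double value;
};

struct string_node {
    struct node base;       /* first member */
    const char *text;
};

struct number_node number_node_make(double value)
{
    struct number_node n;
    n.base.type = NODE_NUMBER;
    n.value = value;
    return n;
}

struct string_node string_node_make(const char *text)
{
    struct string_node s;
    s.base.type = NODE_STRING;
    s.text = text;
    return s;
}

/* Checked downcasts: return NULL when the tag does not match. */
struct number_node *node_as_number(struct node *n)
{
    assert(n != NULL);
    return n->type == NODE_NUMBER ? (struct number_node *)n : NULL;
}

struct string_node *node_as_string(struct node *n)
{
    assert(n != NULL);
    return n->type == NODE_STRING ? (struct string_node *)n : NULL;
}
```

Replace the enum with a pointer to a per-type table of function pointers and you have arrived at the vtable stage of the same journey.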
When I wrote a C library, I also implemented this sort of pattern. I think it's a mark of a "less is (exponentially) more" tool that allows for convergent patterns among disconnected users. C++ is a "more is more" tool that results in divergent patterns outside particular app or lib specific communities.
I believe this is because of Liskov's substitution principle (LSP), which is the hidden gotcha that awaits around the corner beyond the "manager is-a person" [1] example. The "square is-a rectangle... or maybe not" discussions that pop up from time to time are an illustration of the problem. In practice, when you produce a lot of classes and derive a lot of classes (because of the open-closed principle), it is easy to blunder.
The problem tends to be that "is-a" is conveniently short but it confounds several different types of inheritance.
Inheritance in most OO languages really means "api is a superset of, and implementation is a specialization (and possibly superset) of" while in natural language "is-a" implies a relationship where some facets might be expanded, while others may be more restricted.
This is partly a weakness in the expressiveness of our languages and partly a problem of our frequent insistence on mutability. Often the problem goes away with immutability: if your "square" is immutable, inheriting from a "rectangle" class that allows setting width and height to different values isn't a problem, because the result would be a new object and that result can be a rectangle.
But sometimes we also genuinely would be best served by inheriting implementation and public API separately without having to create cumbersome facades etc. - it's just that making an API of a subclass a subset of the API of the superclass violates a lot of expectations people tend to have about OO systems, and so few systems outside of prototype based ones seem to allow it without resorting to ugliness like overriding methods to make them throw exceptions etc.
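A rough illustration of the immutability point, kept in C to match the thread. The names are hypothetical and "immutable" is only a convention here (values are passed and returned by copy and never modified in place), since C won't enforce it for you.

```c
#include <assert.h>

/* Hypothetical value type: by convention it is never modified after creation. */
struct rect {
    int w;
    int h;
};

struct rect rect_make(int w, int h)
{
    struct rect r;
    assert(w >= 0 && h >= 0);
    r.w = w;
    r.h = h;
    return r;
}

/* A square is just a rect whose sides happen to match. */
struct rect square_make(int side)
{
    return rect_make(side, side);
}

int rect_is_square(struct rect r)
{
    return r.w == r.h;
}

/* "Setter" that returns a new value instead of mutating the old one. */
struct rect rect_with_width(struct rect r, int w)
{
    return rect_make(w, r.h);
}
```

Applying `rect_with_width` to a square leaves the original square untouched and hands back a plain rectangle, so there is never an object that claims to be a square while having unequal sides.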
It's not that the OO world has abandoned inheritance, but that inheritance used to be overused in a lot of cases where composition provides the same flexibility.
So the typical advice would be to consider composition first, and to limit inheritance to genuine "is a" relationships as you say. But in some domains there are a lot of genuine "is a" relationships, so that does not in any way remove the use of inheritance.
The article is using composition for inheritance as an implementation detail. The effect is the same as "full" inheritance in that the "subclass" is fully inheriting the aspects of the "superclass" unless you introduce a vtable to allow overriding effects.
In terms of why to avoid inheritance, there are many reasons not tied to multiple inheritance, but the foremost one is to avoid exposing implementation details that may not make sense.
One thing to remember is that in OO public APIs "A is a B" really translates to "A's API is a superset of B's". Which is really not what we often think of as "is a" in daily speech. E.g. we might consider a Circle "is a" Ellipse to be a reasonable statement, because in day to day speech "is a" tends to denote a specialization that may be a superset and subset of different aspects of something at the same time, but if these objects are mutable, then allowing you to set all the parameters of an Ellipse for your Circle object will allow you to create a Circle object that is not actually a Circle.
So really, the point is to avoid inheritance unless you actually want to inherit and possibly expand on the full API of the class you're inheriting from. If I want a mutable Circle object, I shouldn't be inheriting from a mutable Ellipse - in fact in terms of API it may well make more sense for my Ellipse to inherit the API of a Circle.
You'll often find those kinds of inverted relationships in OO. Some types of OO can handle this - e.g. prototype based inheritance would allow you to simply remove methods that make no sense in the subclass; alternatively you can trigger exceptions etc., but this violates what is very often a central contract of inheritance that users tend to rely on: that you can treat an object as if it is an object of its superclass in most contexts. It's better then to aim for subclassing to strictly superset the API.
There are some good reasons to use this. A typical thing is to have a descriptor header followed by multiple types of data. The descriptor header says what kind of data is following. You can obviously just use a void pointer in the descriptor that points to a separate data struct, but having them both in the same memory block means fewer allocations, and fewer cache misses, giving better performance.
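Here is a minimal sketch of the header-plus-payload layout, assuming a made-up `descriptor` header and two made-up payload kinds. The text variant uses a C99 flexible array member so the header and the bytes live in a single allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum payload_kind { PAYLOAD_INT, PAYLOAD_TEXT };   /* hypothetical kinds */

struct descriptor {
    enum payload_kind kind;  /* says what follows the header */
    size_t size;             /* payload size in bytes */
};

struct int_msg {
    struct descriptor hdr;   /* first member */
    int value;
};

struct text_msg {
    struct descriptor hdr;   /* first member */
    char text[];             /* flexible array member, stored right after the header */
};

struct descriptor *int_msg_new(int value)
{
    struct int_msg *m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->hdr.kind = PAYLOAD_INT;
    m->hdr.size = sizeof m->value;
    m->value = value;
    return &m->hdr;
}

struct descriptor *text_msg_new(const char *s)
{
    size_t len = strlen(s) + 1;                   /* include the terminator */
    struct text_msg *m = malloc(sizeof *m + len); /* one block for header and text */
    if (m == NULL)
        return NULL;
    m->hdr.kind = PAYLOAD_TEXT;
    m->hdr.size = len;
    memcpy(m->text, s, len);
    return &m->hdr;
}

/* Readers inspect the header before touching the payload. */
int msg_int_value(const struct descriptor *d)
{
    assert(d->kind == PAYLOAD_INT);
    return ((const struct int_msg *)d)->value;
}

const char *msg_text(const struct descriptor *d)
{
    assert(d->kind == PAYLOAD_TEXT);
    return ((const struct text_msg *)d)->text;
}
```

Because the header is the first member, the `struct descriptor *` returned by either constructor is the same address `malloc` returned, so a single `free()` on it releases header and payload together.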
I agree. While it can occasionally work all right (as with the examples in the article), I recommend a very light touch with this stuff. If you get to the point where you are implementing casts, RTTI, etc., stop and rethink. I say this as someone who has gone down this path and later regretted it.
In a professional context I couldn't agree more. However for a hobby project I couldn't disagree more. Implementing safe down-casting (up is trivial), dynamic dispatch, encapsulation, and other OO(ish) features is an absolutely wonderful learning experience. Like most first attempts, it will probably be a mess as you say, but there can be a joy in making a mess of things so long as you're not inflicting it on the unsuspecting.
See also: http://www.catb.org/esr/structure-packing/