Object oriented programming with ANSI-C (1993) [pdf]

yawaramin · on Oct 20, 2017

The first chapter, on abstraction and information hiding, is probably the most important. Every programmer should know how to do it. It's one of C's main strengths that it allows abstract data types so easily. ADTs in fact allow better binary compatibility since you only expose pointers as your public API, not structures of different shapes and sizes.

I'm continually surprised by how often people insist on putting struct definitions right in their header files, defeating the whole idea of type abstraction.

sparkie · on Oct 20, 2017

You need to define a struct in a header to pass by value, which is usually cheaper than having to dereference a pointer (often resulting in a cache miss). The style presented in the paper has several pointer dereferences, for example each type has another pointer to a class header which contains its constructor and some other function pointers.

I personally do not like the style in this paper as it makes APIs difficult to read - you need to resort to documentation to understand anything. In the first set example:

    #ifndef SET_H
    #define SET_H
    extern const void * Set;
    void * add (void * set, const void * element);
    void * find (const void * set, const void * element);
    void * drop (void * set, const void * element);
    int contains (const void * set, const void * element);
    #endif

What type is add expecting and what is it going to return? We have no clue, other than to continue reading the docs.

A "better" approach to an opaque pointer type is to simply declare a struct in the header, but only define it in the implementation file. The above API becomes this, where it's pretty obvious from the arguments and return types (it also avoids having to do an explicit manual cast in each of the methods):

    #ifndef SET_H
    #define SET_H
    #include "Object.h"
    typedef struct set_t Set;
    Object* set_add (Set*, const Object* element);
    Object* set_find (const Set*, const Object* element);
    Object* set_drop (Set*, const Object* element);
    int set_contains (const Set*, const Object* element);
    Set* set_alloc(void);
    void set_free(Set*);
    #endif

Obviously, this doesn't play well with the full OOP approach this paper takes, but IMO, it's better to just compose structs using the style above, and leave memory management up to the type rather than trying to have a fancy global "new" and pointers to constructors.

yawaramin · on Oct 20, 2017

Yeah, when you can't afford the cache misses I understand that you would pass by value. But often, these things are done as premature optimisations. We can always expose a struct's definition later; but we can't hide it once it's exposed.

I agree with your API redesign, in fact if we forget about trying to do OOP we would get IMO even nicer design:

    typedef struct set_t* set;
    typedef void* set_elem;

    set set_new(void);
    void set_free(set);
    set_elem set_add(set, set_elem);
    set_elem set_find(set, set_elem);
    set_elem set_drop(set, set_elem);
    int set_contains(set, set_elem);

Edit: hiding the pointer in the typdef because if we later decide to expose the set struct, we minimise the required changes to the API.

kgabis · on Oct 20, 2017

Hiding pointer behind a typedef is a terrible idea (like non-const reference arguments in C++). You never know if something you pass into functions could be modified by them or not. I'd much rather change every place in code where set was used than deal with code that hides such important information.

yawaramin · on Oct 21, 2017

Ha, I knew someone would say this ;-) You never know, that is, unless you just follow the type to its declaration and find out in like a couple of seconds. A modern IDE should just show the typedef on hover.

Too · on Oct 22, 2017

With that argument you might as well redeclare everything with unexpected macros, your ide will do macro expansion on hover anyway.

Or why write descriptive function names at all? Ctrl-click to view the implementation of the function is just one click away...

Typedef pointer is a huge anti pattern.

attractivechaos · on Oct 21, 2017

In addition to the use of opaque pointer type, the PDF adds pointers and function pointers to struct only for the sake of "object oriented". They waste space and again obfuscate code.

The PDF represents a bad practice we should stop populating. I have only seen such coding style once and the developer said it was just a toy project he himself wouldn't use in practice.

dualogy · on Oct 21, 2017

> You need to define a struct in a header to pass by value, which is usually cheaper than having to dereference a pointer (often resulting in a cache miss).

If one wanted to do this, what kind of struct size would be the cut-off point (on current-gen machines) where passing the reference instead of a full struct copy becomes advisable? (For funcs where both would work equally fine)

comex · on Oct 21, 2017

ABIs tend to implicitly pass by reference anyway for structs larger than two words or so. (For example, in the AMD64 ABI, two 64-bit words is the cutoff for both arguments and return values. On 64-bit Windows, the limit is one word for both, and on 32-bit ARM, it's four for arguments but one for return values.)

ericbb · on Oct 21, 2017

Haven't you misinterpreted "pass by reference" as "pass on the stack"?

(In the following program, I'd say that x is passed on the stack but not passed by reference.)

    #include <stdio.h>
    #include <stdlib.h>

    struct example { void *a, *b, *c; };

    void foo(struct example x)
    {
        x.a = &x;
        if (x.a != NULL) puts("In foo(), x.a is not NULL.");
    }

    int
    main(void)
    {
        struct example x = { .a = NULL };
        printf("sizeof(x): %lu bytes (%lu bits).\n", sizeof(x), 8 * sizeof(x));
        foo(x);
        if (x.a == NULL) puts("In main(), x.a is NULL.");
        return 0;
    }

tom_mellior · on Oct 21, 2017

What the parent is saying is that no, the struct will be loaded into registers and passed there. Here is a simpler example: https://godbolt.org/g/DcHgMr

As you can see, structs used to be on the passed stack on x86-32. Most other/newer architectures use ABIs that pass the initial struct elements in registers.

comex · on Oct 21, 2017

There are two orthogonal things going on. And I was actually mistaken as to certain specifics. So if only to clear up my own confusion, let me go into a somewhat unnecessary level of detail. TL;DR, though:

- To answer ericbb's concern, the compiler will make copies as necessary to preserve pass-by-value semantics; it just sometimes looks like pass-by-reference at the assembly level. (But "pass on the stack" isn't quite the right wording either; I was referring to a special case of that.)

- Only very small structs are worth passing or returning by value, with the exact limit depending on the platform (and often unnecessarily low).

Anyway.

First, as you (tom_mellior) mention, most ABIs have a list of registers used for function arguments; if there are too many arguments (or on 32-bit x86, any arguments), the remainder are placed on the stack, starting at a fixed offset from where the stack pointer points at function entry.

Second, a struct argument can be passed to a function in three different ways.

The fastest way is to pass each struct field in order as if it were a separate function argument.[1] Thus each field could go in a register or, if all the registers have been used, on the stack. As I said, this is often done only for structs under a certain size: 1 word on Windows x86-64, 2 words on ARM64 and non-Windows x86-64. But actually, some ABIs have no limit, like 32-bit ARM and some less popular architectures. (Thus I shouldn't have said ARM's limit was 4. 4 is the number of registers available for arguments in total, but that's different. The ABI doesn't change behavior for individual arguments based on the struct size. And even a huge struct can have the first few fields passed in registers.)

Another way is to pass a pointer to the struct in place of the struct itself. This is what happens on Windows x86-64 (for structs larger than 1 word) and ARM64 (for structs larger than 2 words). The semantics are still pass-by-value, so the caller generally needs to make a copy of the struct on the stack (at least if the original value is used afterward), but that's its own business; the callee just sees a pointer.

This is what I meant by "implicitly pass by reference", at least for arguments (see below about returns). At best, it's identical at the assembly level to using a real pointer argument, and thus equally efficient; at worst, it's less efficient, since the compiler may emit a copy when it could have just passed a pointer to an existing copy. This might be because the language semantics don't guarantee the existing copy won't be modified, but it can happen even if they do: struct arguments/returns aren't that popular in C and C++, so even modern compilers don't always optimize them as well as they could. This is a problem for Rust.

The third way - which I've only seen on non-Windows x86-64, and which I had confused with the previous one - is to pass the struct like excess arguments, except forced to be on the stack, skipping over any registers that may be available. Thus, if you have

    void func(int x, struct foo foo, int y);

then 'x' and 'y' would be put in the first two registers, while 'foo' would go on the stack, at the aforementioned fixed offset. On the other hand, if there were a lot of preceding arguments that already filled up all the registers:

    void func(int a, int b, int c, int d, int e, int f, int x, struct foo foo, int y);

...then 'foo' would have to be on the stack anyway, so the two-word limit makes no difference. The stack (starting at the fixed offset) will contain 'x', then 'foo', then 'y'.

Anyway, in this case too it's usually equally efficient to use a real pointer, though it's not identical at the assembly level. If you use a real pointer, and the pointer fits into a register, then to load a field of a struct, the function has to load from that register, which is one instruction:

    mov %rax, 0(%rdi)

If you pass by value, and it's forced onto the stack, then the function has to load from the stack, which is still one instruction:

    mov %rax, 8(%rsp)

Though if the pointer itself ends up on the stack, then of course it needs two instructions.

Separately, performance can also differ greatly due to the cache. In general, the top of the stack is almost certainly in L1 cache, while some random pointer might not be. But if the choice is between the callee loading the pointer versus the caller loading the same pointer to copy it to the stack, then that doesn't really matter.

...Then there's return values, which are a bit simpler, since each function has only one return value. Every ABI I know of uses uses either one or two registers for return values. (Even legacy x86 uses registers for returns, rather than the stack.) The limit for struct return values is zero, one, or two words. Larger return values are done by having the caller pass an out pointer, usually treated like an extra function argument. So here too, for large structs it's at most equally efficient to use an explicit out pointer, often more efficient.

Note that annoyingly, the limit is often smaller than you'd expect from the number of registers available. On 32-bit x86, struct return values always go by out pointer, even if the struct contains a single int. On 32-bit ARM, single-field structs are OK, but annoyingly, a struct containing two 32-bit values cannot be returned in two registers, even though two registers are used in other cases - namely, to return a single 64-bit value.

[1] Alternatively, the struct may have its original layout divided into words, and have those words passed as separate arguments. The difference comes when struct fields are smaller than machine words. If you have

    struct example { uint32_t a, b; };

then on non-Windows x86-64, passing 'struct example' as an argument will pack both 'a' and 'b' into a single 64-bit register (rdi), whereas if you have two uint32_ts as standalone arguments, they will be passed in separate registers (rdi and rsi).

ericbb · on Oct 22, 2017

Thanks for the detailed response!

My point was that the beginning of foo() is compiled to the following (by my compiler):

    push   rbp
    mov    rbp,rsp
    lea    rax,[rbp+0x10]
    mov    QWORD PTR [rbp+0x10],rax

Instead of passing the struct in registers (rdi, rsi, rdx) or passing a pointer to the object (pass by reference), the caller places the object on the stack and the callee knows where to find it just based on the type of the function (it's at [rbp+0x10]).

On the other hand, if you delete the c field from the struct type, then the beginning of foo() looks like this:

    push   rbp
    mov    rbp,rsp
    sub    rsp,0x10
    mov    rax,rdi
    mov    rcx,rsi
    mov    rdx,rcx
    mov    QWORD PTR [rbp-0x10],rax
    mov    QWORD PTR [rbp-0x8],rdx
    lea    rax,[rbp-0x10]
    mov    QWORD PTR [rbp-0x10],rax

In that case, the struct was passed in via registers rdi, rsi (similar to tom_mellior's example) and subsequently copied to the stack (at [rbp-0x10] this time).

So, in neither case would I consider it "pass by reference" because neither leads to a pointer to the x object being passed in as an argument.

It's interesting to learn that Windows does it differently and does in fact use what I'd call "pass by reference". The only ABI I'm (somewhat) familiar with is Linux x86-64.

Your point about the annoying limitations on return values is a good one too. I had the thought at one point that it'd be possible to uniformly use structs with just one field instead of typedefs but the limitations on return value calling conventions mean that the two alternatives do not lead to equivalent code, surprisingly.

smnplk · on Oct 21, 2017

>> What type is add expecting and what is it going to return?

Not only that. Can I keep the thing that is returned, can i save it for later, can i change it and not affect other parts of the program, is there someone looking at the same part of memory ? All kinds of worrisome questions spring to mind.

rmind · on Oct 20, 2017

> I'm continually surprised by how often people insist on putting struct definitions right in their header files, defeating the whole idea of type abstraction.

In general, I totally agree with you that it is best to keep the structure definition in the translation unit (the .c file) and opaque structure declaration in the header. However, there are some cases when it is not very practical, e.g. when you you want to embed the structure into another (for performance reasons or cases like on-disk structures, etc) or when you want to separate out some logic into another .c file (think of separation of concerns), instead of having one massive .c file.

When there is such need, I generally create two headers files, e.g. foo.h (for the public API) and foo_impl.h (structure definition and related stuff), with something like:

    #if !defined(__FOO_PRIVATE)
    #error "foo_impl.h should only be included by the internal foo modules"
    #endif

yawaramin · on Oct 20, 2017

Good points. It's at least possible to break up a massive .c file into multiple smaller .c files amalgamated with #includes, right?

The foo_impl.h idea is a good one also.

catnaroek · on Oct 20, 2017

Not that object-oriented languages are particularly good at it, but defining and using abstract data types in C is a pain:

(0) Forward declarations are the only way not to explicitly reveal the representation of types, but you have to pay an indirection tax for it, whether you actually need indirection or don't.

(1) Without heavily abusing the preprocessor, there is no way to express the fact that two abstract data types expose the same interface.

(2) Again, without heavily abusing the preprocessor, there is no way to make common algorithms work on abstract data types that have a common interface.

slrz · on Oct 21, 2017

> (0) Forward declarations are the only way not to explicitly reveal the representation of types, but you have to pay an indirection tax for it, whether you actually need indirection or don't.

If you don't want the indirection, you need to declare at least size and alignment requirements. So why not go ahead and just do that.

  struct typename {
      uchar xxx[THE_SIZE];
  } __attribute__((aligned($your_alignment)));

If your compiler doesn't support such alignment specification, there are ways to work around that.

> (2) Again, without heavily abusing the preprocessor, there is no way to make common algorithms work on abstract data types that have a common interface.

There are. Look at any bigger C project to discover them. A common approach is the use of "ops" structures containing function pointers that are filled depending on the type in use.

Yes, doing all of this manually tends to be more cumbersome than it might be using other more modern languages. Just as manual memory management tends to be more painful than working in a proper garbage-collected environment.

If you still pick C for writing your program, you probably don't mind it or have some other reason for sticking with C.

catnaroek · on Oct 22, 2017

> (horrible snippet)

If that's how you define abstract data types in C, then I'll stick to my current practice of not defining abstract data types in C, for obvious reasons.

> A common approach is the use of "ops" structures containing function pointers that are filled depending on the type in use.

Maybe you're confusing ADTs with objects? They aren't the same thing at all[0]. The point to ADTs is delimiting the scope where implementation details (e.g., whether a set data structure is implemented as a red-black tree or an AVL tree) are relevant. The point to objects is deferring until runtime the decision to use this or that implementation (e.g., callbacks). ADTs in general needn't pay an indirection tax. And using objects when ADTs suffice for your modularity purposes is a sign of being a very stupid programmer.

[0] http://www.cs.utexas.edu/~wcook/Drafts/2009/essay.pdf

> If you still pick C for writing your program, you probably don't mind it or have some other reason for sticking with C.

The latter.

jcelerier · on Oct 21, 2017

> Forward declarations are the only way not to explicitly reveal the representation of types, but you have to pay an indirection tax for it, whether you actually need indirection or don't.

How would you solve this while still allowing object files to compile like they do ?

e.g. if I have a .c file with a function

    void f(some_type t) { ... }

the compiler has to know at that point what is the size of t so that it can allocate the correct space on the stack at the beginning of the function when compiling it. You could fix this with some kind of module system but this would mean that you'd always need the whole module which may be multiple megabytes vs a .hpp which is generally small (or a forward-declaration which is negligible).

Regarding 3, C11 has some generics nowadays: http://en.cppreference.com/w/c/language/generic so for simple stuff it can be done (and for more complex stuff you'd be much better served by c++ templates)

catnaroek · on Oct 21, 2017

> You could fix this with some kind of module system but this would mean that you'd always need the whole module which may be multiple megabytes (...)

You don't need the whole module. You need the module's interface to compile anything that depends on the module. Of course, type sizes and alignments must be a part of the interface in a C-like language. But “exporting sizes and alignments” doesn't necessitate “exporting the entire internal representation of the type”, cf. Rust.

> (...) vs a .hpp which is generally small (or a forward-declaration which is negligible).

You can't forward-declare type sizes and alignments. This is precisely why you must pay the aforementioned indirection tax.

> Regarding (2), C11 has some generics nowadays

Which is just a compile-time switch statement on types. You still have to write separate implementations of generic algorithms for every concrete type on which you want to use them.

> (and for more complex stuff you'd be much better served by c++ templates)

The pain associated to using C++ templates is exactly what happens when you try to do generic programming in a language with no support for abstract type specifications.

yawaramin · on Oct 20, 2017

(0) True, but the default assumption really should be that the indirection tax is fine. There are plenty of non-realtime C apps. If you're really seeing a slowdown, then by all means trade off some of that abstraction power.

(1) & (2) You could use conversion functions, no? E.g.:

    typedef struct ordering_t* ordering;
    typedef struct person_t* person;
    typedef struct business_t* business;

    ordering ordering_of_person(person);
    ordering ordering_of_business(business);
    void* sort(void*, ordering);

catnaroek · on Oct 21, 2017

(0) The indirection tax is not fine. I want both nice abstractions and efficient implementations.

(1) and (2) Your proposal doesn't work, because different abstract data types are, well, different, even if they expose the same operations. (Perhaps you're confusing “abstract data types” with “abstract classes”?) If you don't believe me, try merging a binomial heap and a pairing heap.

yoklov · on Oct 20, 2017

Not including the struct in the header defeats the ability of the user to control where the type is allocated in memory, among other things. That’s a feature that many c programmers are willing to give up a certain degree of api/abi compatibility for. It’s just a different tradeoff, neither is the best choice in all cases.

Of course, if you’re willing to put in a lot of extra legwork, you can do what pthreads does, and define dummy types with the same size for this.

Koshkin · on Oct 20, 2017

> It's one of C's main strengths that it allows abstract data types so easily.

On the contrary, data type abstraction in C is a pain. This was one of the reasons for creation of "C with classes" (a.k.a. C++).

dangerbird2 · on Oct 20, 2017

The only things that are particularly painful about opaque types in C is that you can't easily use stack-based memory to allocate opaque objects. This of course is true for C++ opaque types, but people tend to rely on private fields of concrete types instead of "PIMPL" struct declarations, even though the latter is arguably better for binary compatability and information hiding.

ASalazarMX · on Oct 20, 2017

I prefer to call it "C Gone Wild".

catnaroek · on Oct 20, 2017

Not that C++ is particularly good at data abstraction either.

hawski · on Oct 20, 2017

I was thinking lately that C is more data-oriented than many. However I'm limited in knowledge mostly to imperative languages.

What I mean is that defining arrays of structs has very light syntax. For example in Python AFAIK you can't easily do this. You have to use dictionaries (which is cumbersome if you have many rows) or some kind of class/module that will make it even less clear. What Suckless [1] guys are doing where you can use C as configuration language is example of it. Like keybindings configuration [2] in dwm [3]:

  static Key keys[] = {
	/* modifier                     key        function        argument */
	{ MODKEY,                       XK_p,      spawn,          {.v = dmenucmd } },
	{ MODKEY|ShiftMask,             XK_Return, spawn,          {.v = termcmd } },
	{ MODKEY,                       XK_b,      togglebar,      {0} },
	{ MODKEY,                       XK_j,      focusstack,     {.i = +1 } },
	{ MODKEY,                       XK_k,      focusstack,     {.i = -1 } },
        /* ... */
  };

When I do something in Python I miss this. C++ most likely will require doing some playing around with constructors.

Also with C99 it's only sweeter. When you need you can use designated initializers [4] - putting names of fields or indices of array in initialization list. You can use compound literals [5] to obviate some need for constructors.

[1] https://suckless.org/

[2] https://git.suckless.org/dwm/tree/config.def.h

[3] https://dwm.suckless.org/

[4] https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html

[5] https://gcc.gnu.org/onlinedocs/gcc/Compound-Literals.html

geofft · on Oct 20, 2017

Take a look at https://github.com/python-attrs/attrs which will let you do something like

    import attr

    @attr.s
    class Key(object):
         modifier = attr.ib()
         key = attr.ib()
         function = attr.ib()
         argument = attr.ib()

    keys = [
        Key(MODKEY,           XK_p,      spawn, {'v': dmenucmd}),
        Key(MODKEY|ShiftMask, XK_Return, spawn, {'v': termcmd}),
        # ...
    ]

which gets you the concise and readable syntax of C structs, while keeping these things as actual attributes as if they were classes (e.g., [key.modifier for key in keys] will work).

And you get designated initializers via Python's usual keyword-argument syntax, the ability to specify attr.ib(default=...), and a few other things.

yawaramin · on Oct 21, 2017

In Python the equivalent is namedtuples, which IMO is better syntax anyway because you can use named arguments:

    import collections

    Key = collections.namedtuple('Key', ['modifier', 'key', 'function', 'argument'])

    keys = [
      Key(modifier=MODKEY,           key=XK_p,      function=spawn,      argument=dmenucmd),
      Key(modifier=MODKEY|ShiftMask, key=XK_Return, function=spawn,      argument=termcmd),
      Key(modifier=MODKEY,           key=XK_b,      function=togglebar,  argument=0),
      Key(modifier=MODKEY,           key=XK_j,      function=focusstack, argument=+1),
      Key(modifier=MODKEY,           key=XK_k,      function=focusstack, argument=-1)
    ]

tom_mellior · on Oct 21, 2017

That does add a lot of redundant noise on each line, though. I would write this particular case as normal tuples which I would then turn into Keys in a separate step:

    keys = [Key(*tuple) for tuple in keys]

Another possibility is to keep the Key constructor on each line, but to get rid of the keyword arguments, which are not mandatory:

    >>> Key('a', 'b', 'c', 'd')
    Key(modifier='a', key='b', function='c', argument='d')

anyfoo · on Oct 21, 2017

When I recently used python, after a long absence, I was pleased to discover named tuples: https://pymotw.com/2/collections/namedtuple.html

They basically give you the light syntax you were looking for.

Related, I was also very pleased to discover mypy for static type checking. I am convinced that it sped up development of my project by an order of magnitude or so. Especially since I was using the also new async/await syntax (something I was also very pleased to discover for my particular project, and it had nothing to do with the web). The use of async/await leads to quite elaborate types, with easily made mistakes that are very painful if only caught at runtime, instead of trivially at compile time.

Together, named tuples and static type checking brought in some much missed aspects of C into python.

thechao · on Oct 21, 2017

Or make it real data:

    #define KEY_TABLE(ENTRY) \
        ENTRY(MODKEY, key, XK_p, spawn, ...)\
        ENTRY(MODKEY|ShiftMask, key, XK_p, spawn, ...)\

Then:

    #ifdef KEY_TABLE_ENTRY
    #undef KEY_TABLE_ENTRY
    #endif//KEY_TABLE_ENTRY
    static Key keys[] = {
    #define KEY_TABLE_ENTRY(MOD, KEY, FUN, ARG) { MOD, KEY, FUN, ARG },
    KEY_TABLE(KEY_TABLE_ENTRY)
    #undef KEY_TABLE_ENTRY
    };

tom_mellior · on Oct 21, 2017

No need to guard #undef with #ifdef, it's a no-op if the macro is not already defined.

earenndil · on Oct 21, 2017

Most compilers will warn if you #undef something not #defined.

tom_mellior · on Oct 21, 2017

Can you name one and explain how to call it? I can get neither GCC nor Clang to warn about the following file:

    #undef THIS_VARIABLE_IS_NOT_DEFINED

earenndil · on Oct 21, 2017

Nevermind, you're right. The warning I was thinking of was redefinition.

megous · on Oct 20, 2017

> I'm continually surprised by how often people insist on putting struct definitions right in their header files, defeating the whole idea of type abstraction.

It also defeats the idea of easy struct embedding as is done for example in the Linux kernel. Different codebases, different needs.

sesteel · on Oct 20, 2017

This is one of my favorite reads. About 10 years ago I was developing a toy programming language that compiled to ANSI-C and this was a great resource to me at the time.

kuwze · on Oct 20, 2017

Reminded me of the C Object System by Laurent Deniau[0].

[0]: http://ldeniau.web.cern.ch/ldeniau/cos.html

pdfernhout · on Oct 21, 2017

I wrote an AI Expert article on a similar theme ("Simulating Intelligent Interacting Objects in C", AI Expert, January 1989) about using C for OO programming of a robot simulator -- where the first argument is a pointer to a data structure representing the object. I prefixed all the function names with the class name, like Robot_move(robot, x, y). I was inspired by IIRC Eric Schildt's previous article elsewhere on how OO is an attitude.

Looks like a very comprehensive book though compared to just some short articles (even if it does not cite those as previous work). In general, this idea of opaque handles was (and is) very common in C programming, like with file operations. So, the question is what conventions we use or what convenience superstructure we build around that basic idea of handles.

wybiral · on Oct 20, 2017

One thing that was really educational for me was reading how CPython implemented their object system. https://github.com/python/cpython/blob/master/Include/object...

Basically objects are structs that all share the same header which contains a pointer to the object type (ob_type) and the reference count (ob_refcnt) for memory management.

That way you can always cast back to a pointer of that header struct and access ob_type and ob_refcnt no matter what the actual implemented struct contains afterwards.

ob_type contains the name of the type and function pointers for common methods used by objects (getattr, setattr, hash, arithmetic, etc).

jacinabox · on Oct 21, 2017

For me the notion of domain is helpful in understanding OO. The definition of a domain is that everything inside it stands in determinate relations, whereas while multiple domains can interact they don't stand in determinate relations. Inheritance in classes, in adding more features to a class, can also have knock-on effects that interfere with the assumptions of other classes that are closely coupled to it, making classes an example of domain. Whole computers in a network are also domains; nodes can go down in which case no determinate expectations can be placed on them.

The existence of multiple domains is essentially postmodern as it creates, for practical purposes, multiple sources of truth. Because the expectations of one domain on another are weak, this naturally leads to a domain becoming robust against errors and inconsistencies from other domains, which leads to a stronger system. This type of robustness also helps components still be valid when they are placed into a different theory/environment that has additional properties. In some sense FP and OO are aiming at the same thing, that is making definitions that remain valid when the "arrows" of their environment change their meaning.

For a different perspective on this see Gerald Jay Sussman, "We Really Don't Know How to Compute!"

kasajian · on Oct 20, 2017

Another article related to this: https://www.codeproject.com/Articles/22139/Simply-Object-Ori...

I wrote it a while back just to see what I would come up with if I had to solve that problem for myself using only macros.

pakl · on Oct 21, 2017

I am a fan of libco2[1] which is a minimalist object-oriented layer on top of C.

[1] https://github.com/peterpaul/co2/tree/master/examples/my-obj...

Koshkin · on Oct 20, 2017

  > void * find (const void * _set, const void * _element)

Such an eye sore, and I think it goes strongly against the tradition of coding in C when someone uses the asterisk in the pointer context in such a way that makes it look like an infix operator, i.e. as if it was multiplication.

As to achieving a reasonably looking OO in C, this is why C++ was born...

4lch3m1st · on Oct 21, 2017

Anytime I write C code, I try to keep the asterisk close to the type:

void* foo(void* bar);

I know that the asterisk technically belongs with the variable name when declaring it, but it helps to wrap my head around them, as a reminder that they're a pointer type.

sigjuice · on Oct 21, 2017

The world mostly does it the other way.

A couple of code examples:

https://github.com/apache/httpd/blob/trunk/server/provider.c

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux....

Almost all documentation, references and literature: Here are a couple of man pages.

http://man7.org/linux/man-pages/man3/gethostbyname.3.html

https://developer.apple.com/library/mac/documentation/Darwin...

Here is a draft of the ISO C standard:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf

ngcc_hk · on Oct 21, 2017

I agree you based on fact that more in code we use

    void *s = then use it as 
    s=

But I agree more that the

    void* s = then use it as 
    s=

Is more understandable as the first set confuse me when I learn pointer. It still does. void* is a type like statement. And void is like a return type statement.

jcelerier · on Oct 21, 2017

The only thing that sucks hard is when you do :

    int* s1, s2;

here s2 is an int, not an int* and you have to do

    int* s1, *s2;

this is braindead imho

megaman22 · on Oct 21, 2017

As long as you're not using prehistoric versions if C, you don't have to define all variables up front, so there is much less justifiable reasons for that kind of multiple declaration.

Koshkin · on Oct 21, 2017

Except nothing will save you from the "one true way to write C" when you try to write a declaration of a pointer to a function, for example.

sigjuice · on Oct 21, 2017

You can adjust your function pointer typedefs if you like. e.g. signal(2) http://man7.org/linux/man-pages/man2/signal.2.html

  typedef void (*sighandler_t)(int);
  sighandler_t signal(int signum, sighandler_t handler);

can be rewritten as

  typedef void sighandler_t(int);
  sighandler_t *signal(int signum, sighandler_t *handler);

jcelerier · on Oct 21, 2017

yep, I prefer the c++ way by far : using foo = void(int, float);

millstone · on Oct 21, 2017

This totally destroyed me when I was learning C. The theory was:

    int *foo;

makes more sense, because to access the int, you write:

   *foo = ...;

But that's not an int, it's a crash! If you want an int, you have to make one.

Placing the asterisk next to the type made it all click for me.

Y_Y · on Oct 21, 2017

    double foo;
    compute(foo);

That's not even a crash, just silent garbage.

I don't know how you expect to do better with an uninitialised variable (except ints that default to 0).

foo dereferences to an int, but what could you hope to happen if you never gave it some memory to point to?