
Here is sample code that illustrates how it works.

The approach to using inline functions in C is a bit counterintuitive. Here is how you might define a normal function in C:

    // file.h
    int add_one(int x);

    // file.c
    #include "file.h"
    int add_one(int x) {
        return x + 1;
    }

Here is how you might define it inline:

    // file.h
    inline int add_one(int x) {
        return x + 1;
    }

    // file.c
    #include "file.h"
    int add_one(int x);

It seems backwards to have the definition in the header and the prototype in the implementation file. But the prototype is "extern" (the default, so the keyword can be omitted), and the declaration and definition merge in certain ways (extern + inline = extern inline), so you end up with an "extern inline" definition, which means "emit the code in this translation unit, please." If LLVM's behavior is counterintuitive, it's probably because it's there to support counterintuitive parts of the C language.

This frees you from making decisions in a sense, because the compiler can decide whether to use the inline or extern definition at any given call site. Again, the syntax for it is counterintuitive.
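For anyone who wants to try this without setting up two files, here is a minimal single-file sketch of the same pattern. It assumes a C99-or-later compiler; `add_two` and `check_add_one` are hypothetical names added only for illustration. In a real project the inline definition would sit in file.h and the extern declaration in file.c.

```c
#include <assert.h>

// Would normally live in file.h: an inline definition, visible to callers.
inline int add_one(int x) {
    return x + 1;
}

// Would normally live in file.c: a declaration without "inline" turns the
// definition above into the external definition for this translation unit.
extern int add_one(int x);

// Hypothetical caller: the compiler may expand add_one in place here, or
// emit a call to the external definition. Either way the result is the same.
int add_two(int x) {
    return add_one(add_one(x));
}

// Hypothetical self-check for illustration.
void check_add_one(void) {
    assert(add_one(1) == 2);
    assert(add_two(40) == 42);
}
```

Without the `extern` line, a build that declines to inline the call could fail at link time with an undefined reference to `add_one`, since no translation unit would provide the external definition.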

The other way is:

    // file.h
    static inline int add_one(int x) {
        return x + 1;
    }

This may give you multiple copies of the code (emitted in potentially any translation unit it appears in), but it's less typing.

Omitting any discussion about C extensions here.




Any guesses as to why C chose such crazy semantics for inline functions? "The compiler can decide to use the inline or extern definition" is begging for trouble.

"inline functions cannot have their address taken and cannot have static variables" seems so natural and obvious.


> "The compiler can decide to use the inline or extern definition" is begging for trouble.

IMO this behavior seems like the obvious choice, but I've been writing C long enough that I'm sure my perspective is distorted and my sense of what is obvious is completely out of whack.

C++ made the choice that linkers should be able to coalesce duplicates, C made the choice that linkers do not need this feature. If you want to inline, you want definitions in every translation unit. If you don't want to have to inline, you want to pick a translation unit which gets the code.

The compiler is what chooses whether a function is inline, and programmers should not think of the "inline" keyword as affecting that choice.

> "inline functions cannot have their address taken and cannot have static variables" seems so natural and obvious.

Yes, obvious that they should not have static variables. There's nothing wrong with taking the address of an inline function, though. It behaves just like a normal function. Pointers to a function will compare equal if and only if they point to the same function--a function with external linkage is the same function in every translation unit, a function with internal linkage is a different function in every translation unit. Whether or not the function is inline does not matter (why would it?)
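A rough sketch of the pointer-equality point, assuming a C99-or-later compiler. It is squeezed into one file, so it cannot show the cross-translation-unit case; the accessor names `get_add_one_a`, `get_add_one_b`, and `same_function` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

// Inline definition plus an extern declaration: add_one has external
// linkage and this translation unit provides its external definition.
inline int add_one(int x) {
    return x + 1;
}
extern int add_one(int x);

typedef int (*int_fn)(int);

// Two hypothetical accessors that each take the function's address.
int_fn get_add_one_a(void) { return add_one; }
int_fn get_add_one_b(void) { return &add_one; }

// Both pointers designate the same function, so they compare equal.
bool same_function(void) {
    return get_add_one_a() == get_add_one_b();
}

// Hypothetical self-check for illustration.
void check_pointers(void) {
    assert(same_function());
    assert(get_add_one_a()(41) == 42);
}
```

With a `static inline` function in a header, the equivalent accessors in two different translation units would return pointers to two different functions.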

This means that you can take functions in an existing C code base, and you can make them inline, without any change in functionality. This should always work, barring... you know... those weird unexpected consequences. Isn't that nice though? Whether a function is inline is a detail that you don't have to care about, unless you're defining the function.

Remember that "inline" does not mean, "this function will be inlined". Instead it means, "the compiler may choose to inline this, or not, at its discretion." Which really just means, "the definition for this function is visible in the translation unit where it is called", because the compiler can generally choose to inline any function if the definition is there.

So, the keyword could be renamed to reflect what it means. Instead of:

    // file.h
    inline int add_one(int x) {
        return x + 1;
    }

    // file.c
    #include "file.h"
    int add_one(int x);

You could choose a different name for the keyword:

    // file.h
    do_not_emit_code_in_this_translation_unit_by_default
    int add_one(int x) {
        return x + 1;
    }

    // file.c
    #include "file.h"
    override_default_and_emit_code_in_this_translation_unit
    int add_one(int x);

I suspect "inline" was considered easier to type.


> C++ made the choice that linkers should be able to coalesce duplicates

Yeah and shared libraries are where it gets ugly. For example, libstdc++ has an empty string singleton `_S_empty_rep_storage`. Its headers compare a std::string's storage against this, by address, so the dynamic linker is on the hook to coalesce the empty string symbols. It doesn't always work, and then things break mysteriously when you have two copies of the same variable.

> There's nothing wrong with taking the address of an inline function, though. It behaves just like a normal function.

I think you're proposing that inline functions should have a single definition? This breaks header-only libraries, and also gets sticky with shared libs; whose symbol wins?

If everything is one big static link then it's probably fine. If your compiler is deciding to emit code now, or link to future code, dependent on optimization level...it's painful!


> I think you're proposing that inline functions should have a single definition? This breaks header-only libraries, and also gets sticky with shared libs; whose symbol wins?

You can have the definition in as many translation units as you like; it's just that only one of them can be the external definition. This works right now, today, both with header-only libraries and across shared library boundaries. If you take the address of an inline function defined this way, you just get the address of the external definition. The C compiler will do the right thing, as long as you keep in mind the limitations:

- Anything static will get duplicated in each translation unit. (Regardless of whether something is inline--the "inline" keyword isn't relevant here.)

- No two translation units have an extern definition for the same symbol. (Again, regardless of whether the symbol is an inline function.)

All "inline" lets you do is two things:

- You can define a function without creating an extern linkage symbol.

- Static inline functions do not cause warnings if not used.
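To illustrate the second point, here is a small sketch. It assumes GCC, whose documentation for -Wunused-function describes the warning as applying to unused non-inline static functions; other compilers may behave differently, particularly for functions defined in the main source file rather than a header. The names `unused_helper`, `add_two`, and `check_helpers` are hypothetical.

```c
#include <assert.h>

// Never called in this file. Because it is "static inline", GCC does not
// report it under -Wunused-function; a plain "static" function would be.
static inline int unused_helper(int x) {
    return x * 2;
}

// Called below, so there is nothing to warn about either way.
static inline int add_one(int x) {
    return x + 1;
}

int add_two(int x) {
    return add_one(add_one(x));
}

// Hypothetical self-check for illustration.
void check_helpers(void) {
    assert(add_two(40) == 42);
}
```

This is what makes it practical to put a batch of small `static inline` helpers in a header even when most files that include it use only a few of them.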

So, with your shared library, you have to pick one translation unit in one library which gets the extern definition. This is the same restriction you have with non-inline functions--if you are linking a static library into two different shared libraries, and then combining these shared libraries, you will run into problems regardless of "inline".

Or put another way... none of these issues are related to the "inline" keyword in C. The "inline" keyword does not create any new problems.


I believe the standard practice is:

    // file.h
    #ifndef FILE_DECLSPEC
    #define FILE_DECLSPEC inline
    #endif

    FILE_DECLSPEC int add_one(int x) {
        return x + 1;
    }

    // file.c
    #define FILE_DECLSPEC
    #include "file.h"

In this way you don't keep the declarations in both files.


I'm confused. You would presumably have header guards for file.h. Why do you care about inline, then? The function is included verbatim in every compilation unit that includes file.h.


In case of static inline, can't the linker detect the duplicate copies and eliminate them accordingly?


This subverts the notion of what "static" means here. "Static" means "private to this translation unit".

Usage of the linker in modern C has been moving in the opposite direction, IMO, towards keeping the interface between the compiler and linker simple. For example, it looks like the trend is towards eliminating the use of "common" variables--GCC now defaults to -fno-common.

You can still get all sorts of fancy stuff with LTO turned on. But if you want no duplicates, you can express that intent by choosing a specific translation unit to contain the single external definition.


My understanding is that `static` makes a symbol local to a translation unit, making it invisible to the linker.


Yes, that's true... but often, visible symbols don't have enough information to get deduplicated anyway. Often, the symbol is just an address within a section in the object file. The section contains other code, and you can't remove things from it... by default, on most systems.

E.g. if you have file.o, the linker will see something like this:

    section .text: [...16kb of data follows...]
    section .data: [...2kb of data follows...]

    my_function = .text + 0x1f3a

This is simplified, but it illustrates the core of what an object file looks like during linking.

It's just not enough information to go on, if you want to deduplicate a function. C runs on weird embedded systems. You might think, "Just use LTO" and well, those weird systems don't always have LTO. You might think, "If you care about code size, don't use inline functions!" and well, sometimes, inlining a function results in smaller code!



