
Calling C from Rust can be quite simple. You just declare the external function and call it. For example, straight out of the Rust book https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#usin... :

  extern "C" {
      fn abs(input: i32) -> i32;
  }

  fn main() {
      unsafe {
          println!("Absolute value of -3 according to C: {}", abs(-3));
      }
  }
Now, if you have a complex library and don't want to write all of the declarations by hand, you can use a tool like bindgen to automatically generate those extern declarations from a C header file: https://github.com/rust-lang/rust-bindgen

There's an argument to be made that something like bindgen could be included in Rust, not requiring a third party dependency and setting up build.rs to invoke it, but that's not really the issue at hand in this article.

The issue is not the low-level bindings, but higher level wrappers that are more idiomatic in Rust. There's no way you're going to be able to have a general tool that can automatically do that from arbitrary C code.
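To make the "higher level wrapper" point concrete, here is a minimal sketch of a hand-written safe wrapper around the raw `abs` binding from the example above. The name `checked_c_abs` and the choice of `Option` are mine, purely for illustration. C's `abs` has undefined behavior for `INT_MIN`, and that is exactly the kind of contract a generator cannot infer from a header. The block is spelled `unsafe extern "C"`, which the 2024 edition requires and which Rust 1.82 and later accept in every edition.

```rust
use std::ffi::c_int;

// Raw binding to `abs` from the C standard library.
unsafe extern "C" {
    fn abs(input: c_int) -> c_int;
}

/// Hypothetical safe wrapper: rejects the one input for which
/// C's `abs` is undefined (INT_MIN has no positive counterpart).
pub fn checked_c_abs(input: c_int) -> Option<c_int> {
    if input == c_int::MIN {
        return None;
    }
    // SAFETY: every input other than INT_MIN is valid for `abs`.
    Some(unsafe { abs(input) })
}

fn main() {
    println!("{:?}", checked_c_abs(-3));
    println!("{:?}", checked_c_abs(c_int::MIN));
}
```

The caller never writes `unsafe`, and the edge case shows up in the return type instead of in documentation.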




There's also cbindgen for going the other way around. https://github.com/mozilla/cbindgen


Passing integers around is easy; sharing structs, strings, and context pointers for callbacks that cross the language barrier is typically much harder.
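For a rough idea of what strings and callbacks look like from the Rust side, here is a small sketch that uses only `strlen` and `qsort` from the C standard library. The wrapper names `c_strlen` and `c_sort` are made up for this example. It assumes `size_t` matches `usize`, which holds on mainstream targets. `qsort` takes no context pointer, so that part of the problem is not shown.

```rust
use std::ffi::{c_char, c_int, c_void, CString};

unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
    fn qsort(
        base: *mut c_void,
        nmemb: usize,
        size: usize,
        compar: extern "C" fn(*const c_void, *const c_void) -> c_int,
    );
}

// The callback must use the C ABI because C code calls it.
extern "C" fn compare_i32(a: *const c_void, b: *const c_void) -> c_int {
    // SAFETY: qsort only hands us pointers to elements of the i32 slice.
    let (a, b) = unsafe { (*(a as *const i32), *(b as *const i32)) };
    a.cmp(&b) as c_int
}

/// Hypothetical wrapper: copies the string into a NUL-terminated buffer.
/// Returns None if the input contains an interior NUL byte.
pub fn c_strlen(s: &str) -> Option<usize> {
    let c_string = CString::new(s).ok()?;
    // SAFETY: `c_string` is a valid NUL-terminated string that outlives the call.
    Some(unsafe { strlen(c_string.as_ptr()) })
}

/// Hypothetical wrapper: sorts the slice in place through C's qsort.
pub fn c_sort(values: &mut [i32]) {
    // SAFETY: pointer, element count, and element size all describe `values`.
    unsafe {
        qsort(
            values.as_mut_ptr() as *mut c_void,
            values.len(),
            std::mem::size_of::<i32>(),
            compare_i32,
        );
    }
}

fn main() {
    let mut values = [3, -1, 2];
    c_sort(&mut values);
    println!("{:?} {:?}", values, c_strlen("hello"));
}
```

Even in this tiny case the wrapper has to decide who owns the string buffer and what happens with interior NUL bytes, which is where the extra effort goes.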


For Rust code calling C, sharing structs is doable with #[repr(C)]. See https://doc.rust-lang.org/reference/type-layout.html#reprc-s...
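A minimal sketch of that, using `div` from the C standard library, which returns a small struct by value. The names `DivResult` and `c_div` are invented for the example. It assumes the platform's `div_t` stores `quot` before `rem`. The C standard allows either order, but glibc, musl, macOS, and MSVC all use this one.

```rust
use std::ffi::c_int;

// Mirrors C's `div_t`. ASSUMPTION: `quot` comes before `rem` on this platform.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DivResult {
    pub quot: c_int,
    pub rem: c_int,
}

unsafe extern "C" {
    fn div(numer: c_int, denom: c_int) -> DivResult;
}

/// Hypothetical safe wrapper: rejects the inputs for which C's `div`
/// is undefined (division by zero and INT_MIN / -1).
pub fn c_div(numer: c_int, denom: c_int) -> Option<DivResult> {
    if denom == 0 || (numer == c_int::MIN && denom == -1) {
        return None;
    }
    // SAFETY: the undefined cases were excluded above.
    Some(unsafe { div(numer, denom) })
}

fn main() {
    println!("{:?}", c_div(7, 2));
    println!("{:?}", c_div(1, 0));
}
```

Without `#[repr(C)]` the compiler is free to reorder the fields, so the struct would not be guaranteed to match what the C side returns.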

(Nitpick: I don’t think it technically is correct to call this “The C representation”, as struct layout in C depends on the C compiler/ABI. I wouldn’t trust this to be good enough for serializing data between 32-bit and 64-bit systems, for example. For calling code on the same system, it’s good enough, though.)


That's not really "simple"; it's on par with C FFI in just about any other language (except C++), with the same drawbacks.


It's on par with C++, too. In C++ you need an `extern "C"`, because C++ linkage isn't guaranteed to be the same as C linkage. You can get away with wrapping the declarations in an `extern "C"` block behind a preprocessor conditional, but that's not all that much easier than Rust's bindgen.

A lot of C to C++ interop is actually done wrong without knowing it. Throwing a C++ static function as a callback into a C function usually works, but it's not technically correct because the linkage isn't guaranteed to be the same without an extern "C". In practice, it usually is the same, but this is implementation-defined, and C++ could use a different calling convention from C (e.g. cdecl vs. fastcall vs. stdcall). The Borland C++ compiler uses fastcall by default for C++ functions, which will make them illegal callbacks for C functions.

The major difference between the C interop of Objective-C and C++ and that of other languages is the preprocessor, which the other languages lack. Macros will just work in Objective-C and C++ because they use the same preprocessor. That's really not easy to paper over in other languages that can't speak the C preprocessor.


I think you're confusing some terms here.

> In C++ you need an `extern "C"`, because C++ linkage isn't guaranteed to be the same as C linkage.

`extern "C"` has nothing to do with linkage, all it does is disable name mangling, so you get the same symbol name as with a C compiler.

> Throwing a C++ static function as a callback into a C function usually works, but it's not technically correct because the linkage isn't guaranteed to be the same without an extern "C".

Again, linkage is not relevant here. Your C++ callbacks don't have to be declared as extern "C" either, because the symbol name doesn't matter. As you noted correctly, the calling conventions must match, but in practice this only matters on x86 Windows. (One notable example is passing callbacks to Win32 API functions, which use `stdcall` by default.) Fortunately, x86_64 and ARM did away with this madness and only have a single calling convention (per platform).


> `extern "C"` has nothing to do with linkage, all it does is disable name mangling, so you get the same symbol name as with a C compiler.

extern "C" also ensures that the C calling convention is used, which is relevant for callbacks. It's not just name mangling. This is the reason that extern "C" static functions exist. You can actually overload a C++ function by extern "C" vs extern "C++", and it will dispatch it appropriately based on whether the passed in function is declared with C or C++ linkage.

And I'm not sure the terms are confused, because that's how most documentation refers to it: https://learn.microsoft.com/en-us/cpp/cpp/extern-cpp?view=ms...

> In C++, when used with a string, extern specifies that the linkage conventions of another language are being used for the declarator(s). C functions and data can be accessed only if they're previously declared as having C linkage. However, they must be defined in a separately compiled translation unit.

And https://en.cppreference.com/w/cpp/language/language_linkage

The post you're replying to had it completely right. extern "C" is entirely about linkage, which includes calling convention and name mangling.

> As you noted correctly, the calling conventions must match, but in practice this only matters on x86 Windows.

Or if you want your program to actually be correct, instead of just incidentally working for most common cases, including on future systems.

If you're passing a callback to a C function from C++, it's wrong unless the callback is declared extern "C".


> extern "C" also ensures that the C calling convention is used, which is relevant for callbacks. It's not just name mangling.

I stand corrected. I didn't know that `extern "C"` enforces the C calling convention.

However, on modern platforms this doesn't really matter because, as I said, there is only a single calling convention (per platform). And I'm pretty sure that future platforms will keep it that way. Fortunately, if you try to pass a C++ callback of the wrong calling convention, you get a compiler error.

> If you're passing a callback to a C function from C++, it's wrong unless the callback is declared extern "C".

That's certainly not true because `extern "C"` is not the only way to specify the calling convention. In fact, you might need a different calling convention! As I mentioned, on x86 the Windows API uses stdcall for all API functions and callbacks, so `extern "C"` would be wrong. If you look at the Microsoft examples, you will see that they declare the callbacks as WINAPI (without `extern "C"`): https://learn.microsoft.com/en-us/windows/win32/procthread/c...

So I stand by my point that in practice you don't need `extern "C"` for passing C++ callbacks to C functions. You can pass a lambda function just fine, and when it doesn't work the compiler will tell you.


A couple big caveats here:

* cdecl is a platform specific calling convention. There is no standard C ABI. cdecl is a wintel thing, not the standard C calling convention. On Linux, for instance, the C calling convention is defined by the System V ABI. On Windows ARM, it's also not cdecl.

* Specifying calling convention at all is a compiler specific extension. There is no standard way of specifying a C calling convention without `extern`.

So specifying cdecl gets you the right calling convention on some platforms and ties your code to some specific compilers. The only portable way to specify C linkage in a C++ program is extern "C". You will always get the right ABI for your platform and it will work on every compiler.

> So I stand by my point that in practice you don't need `extern "C"` for passing C++ callbacks to C functions. You can pass a lambda function just fine, and when it doesn't work the compiler will tell you.

The compiler will very often not tell you. It will complain if the lambda can't be coerced to a function pointer (because it's a closure) or if the argument or return types are wrong. An incorrect ABI will usually be accepted and will just do the wrong thing or crash at runtime. The C++ standard says that language linkage is part of a function's type, but very few compilers actually support this.

Your position works sometimes for some compilers and some platforms. I assert that it's better to use standard C++ features and just work everywhere.


> * Specifying calling convention at all is a compiler specific extension.

Yes, because the calling conventions themselves are platform/compiler specific.

> There is no standard way of specifying a C calling convention without `extern`.

Well, on modern platforms you don't need to because there is only a single calling convention that is shared between C and C++. For legacy platforms with multiple calling conventions, you need compiler specific extensions by definition.

> The only portable way to specify C linkage in a C++ program is extern "C". You will always get the right ABI for your platform and it will work on every compiler.

Again, on platforms with several calling conventions `extern "C"` absolutely won't give you the appropriate calling convention all the time. See again my Win32 API example.

> The compiler will very often not tell you.

> An incorrect ABI will usually be accepted and will just do the wrong thing or crash at runtime.

That's absolutely not my experience! Functions with different calling conventions have different types, so a C++ compiler must reject such code. See https://godbolt.org/z/6EnncE5v5. (Note that for the lambda case MSVC is smart enough to automatically add __stdcall whereas MinGW refuses to compile. The free function is rejected by both compilers.)

Can you show me an actual example where a C++ compiler silently accepts a function with the wrong calling convention?

> Your position works sometimes for some compilers and some platforms.

It has always worked for me so far and I write software for many different platforms.


Ah, yeah, you're right. I was spacing on the fact that C, as well as C++, can have multiple calling conventions. I blame early morning brain.

As far as the wrong calling convention goes, I'm basing it on the fact that an extern "C++" function can be passed as a callback where an extern "C" is demanded. Even if they're the same calling convention, that should fail, but it doesn't. Looks like it doesn't fail at runtime, which is a small comfort, but given the different permissiveness of different compilers, it still makes me very nervous to pass a C++ function as a C callback and just hope that it works, given that it isn't guaranteed in the standard.


> Even if they're the same calling convention, that should fail, but it doesn't.

It's an interesting question. According to the standard, functions with different language linkage are indeed considered different types. As a consequence, <cstdlib> should declare two overloads for qsort() that only differ in the type of the sort function. However, modern compilers don't seem to care:

"The only modern compiler that differentiates function types with "C" and "C++" language linkages is Oracle Studio, others do not permit overloads that are only different in language linkage, including the overload sets required by the C++ standard"

https://en.cppreference.com/w/cpp/language/language_linkage

In practice, extern "C" does two things (as you correctly pointed out):

1. disable name mangling - This only affects the symbol name and is not relevant for callback functions

2. enforce the (default) C calling convention - On all (modern) platforms I know, C and C++ have the same default calling convention for free functions.

This means that from the view of a C++ compiler, pointers to `foo()` and `extern "C" foo()` have the exact same type.

Anyway, no need to be nervous. Even if the compiler treated these as different types, you would get a compiler error because C++ disallows implicit casts between different pointer types.


As long as I can't silently get wrong behavior or runtime crashes, I'm happy enough. Is it guaranteed that an incorrect calling convention will always cause a compiler error? I wasn't aware the calling convention was considered part of the pointer type.

Anyway, thanks for engaging with me so earnestly. I guess I had some assumptions about calling conventions that needed to be straightened out, which is important, as I'm doing work in this territory right now.


> Is it guaranteed that an incorrect calling convention will always cause a compiler error?

A standard-conforming C++ compiler must not allow implicit pointer casts, so yes!

> I wasn't aware the calling convention was considered part of the pointer type.

Some well-designed C APIs define a macro for the calling convention that they add to all API functions and function pointer declarations. The user can then use the same macro when supplying their callbacks, which guarantees that the calling conventions match. (On modern platforms, the macro would be typically empty.)

Here's an example: https://github.com/Celemony/ARA_API/blob/1f68fba7a374b14df19.... As you can see, it is part of the function pointer type: https://github.com/Celemony/ARA_API/blob/1f68fba7a374b14df19...

Another famous example is, of course, the WINAPI macro in the Win32 API.

That's also what I tend to do with my own C APIs.
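For the Rust side of the thread, the closest built-in analogue to such a macro is probably the `extern "system"` ABI string: it means stdcall on 32-bit x86 Windows and the platform's C convention elsewhere. A minimal sketch follows. `ApiCallback`, `double_it`, and `invoke` are hypothetical names, and `invoke` merely stands in for an API function that would normally be implemented in C.

```rust
// Hypothetical callback type of an imagined C API. The ABI string is part
// of the function pointer type, much like the macro in the C headers.
pub type ApiCallback = extern "system" fn(i32) -> i32;

// A callback declared with the same ABI string as the API expects.
extern "system" fn double_it(value: i32) -> i32 {
    value * 2
}

// Stand-in for the C API function that would receive the callback.
pub fn invoke(callback: ApiCallback, value: i32) -> i32 {
    callback(value)
}

// A plain Rust function has a different pointer type and is rejected:
//
//     fn plain(value: i32) -> i32 { value }
//     invoke(plain, 1); // error: mismatched types (expected "system" fn)

fn main() {
    println!("{}", invoke(double_it, 21));
}
```

Because the ABI is part of the type, Rust rejects a mismatched callback at compile time regardless of platform.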

> I guess I had some assumptions about calling conventions that needed to be straightened out

I also learned a few things in this discussion, so thanks for that!


How is that not simple? You just declare the function and then call it. I find it hard to imagine how it could be any more simple than that.


Now imagine a hundred or two functions, structures and callbacks, some of them exposed only as CPP macros over internal implementation. PJSIP low level API is one example.


But... that's what bindgen is for. Which I mentioned.

I said it "can be quite simple"; for simple use cases, just using extern and translating the declarations by hand is perfectly viable.

For more complex cases, you use bindgen.


Bindings generators exist in most other languages with the same limitations.

I would love to see how bindgen would handle a function call defined as a preprocessor macro that I mentioned. Because most likely it won't.


Can someone shed some light on why the parent comment (by varjag) is downvoted?


... And? Most languages make C interop simple.


They quickly become unwieldy on non-trivial APIs, with hundreds of definitions across dozens of files and with macros to boot. Naturally people would still get the job done, but it's far from simple.


That's what bindgen is for, as was mentioned in the original comment you replied to.


How well does it handle preprocessor macros in APIs?


I have used it successfully against header files for Win32 COM interfaces generated from IDL which include major parts of the infamous "windows.h". Almost every type is a macro.

This is an extremely well-understood space.

Just open the docs and do it.


Not types, functions. Where the macro is essentially a forward declaration but the implementation is deep inside the code and is not exposed via headers.



