An introduction to libuv (2016)

saghul · on Aug 16, 2018

We merged this guide into http://docs.libuv.org/en/v1.x/guide.html which got some updates to match API changes.

nosefrog · on Aug 16, 2018

I work with Nikhil. He's incredibly smart and a fantastic programmer. Didn't know he wrote a book on libuv though!

tapirl · on Aug 16, 2018

> This book and the code is based on libuv version v1.3.0.

libuv v1.3.0 was released on Jan 28, 2015. It is now at v1.22.0.

nsm · on Aug 16, 2018

Original author here. Yes this book was written a long time ago when I was into the node ecosystem.

A lot of the book was copied into the official docs later https://github.com/libuv/libuv/pull/1246.

The book definitely does not capture new things. I do not intend to update it, but I still look at pull requests.

saghul · on Aug 16, 2018

I merged it into the official docs, thanks a lot for all your work nsm! <3

ridiculous_fish · on Aug 16, 2018

libuv supports child processes via fork, such as in uv_spawn. It also uses multiple threads to support uv_fs_stat, etc.

How does it handle the well known incompatibility between fork and threads?

oconnor663 · on Aug 16, 2018

Which incompatibility do you mean? If you're forking to create a child process, usually you're just careful not to do anything that might acquire a lock (including allocating memory) in between the fork and the exec, and then I think you're good?

Forking for something else, though, god help you.

ridiculous_fish · on Aug 16, 2018

The tricky part of this approach is error handling. What if a system call fails in between fork and exec, or exec itself?

It looks like libuv creates a pipe for every new process, and uses that to send errno back on any failing system call. This has the disadvantage that you only get the error code, and none of the context. You get ENOENT, but you don't know which path was invalid.
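For readers who have not seen the pattern, here is a minimal sketch of the errno-over-a-pipe idea described above. It is not libuv's code: the helper name `spawn_report_errno` and its signature are made up for illustration, and it assumes a POSIX system. The write end of the pipe is marked close-on-exec, so a successful exec shows up in the parent as EOF, while a failed exec sends back a single int.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not libuv's implementation. Returns 0 and stores the
 * child's pid on success, or the errno value that stopped the launch. */
int spawn_report_errno(const char *path, char *const argv[], pid_t *pid_out) {
    int fds[2];
    if (pipe(fds) != 0)
        return errno;
    /* Close-on-exec: a successful exec closes the write end, so the parent
     * reads EOF instead of an error code. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) != 0) {
        int err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }

    pid_t pid = fork();
    if (pid < 0) {
        int err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }
    if (pid == 0) {
        /* Child: stick to async-signal-safe calls, no malloc, no stdio. */
        close(fds[0]);
        execv(path, argv);
        int err = errno; /* only reached if execv failed */
        if (write(fds[1], &err, sizeof err) < 0) {
            /* nothing more can be reported from here */
        }
        _exit(127);
    }

    /* Parent: either EOF (exec succeeded) or one int (the child's errno). */
    close(fds[1]);
    int child_errno = 0;
    ssize_t n;
    do {
        n = read(fds[0], &child_errno, sizeof child_errno);
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    if (n == (ssize_t)sizeof child_errno) {
        waitpid(pid, NULL, 0); /* reap the child that failed to exec */
        return child_errno;
    }
    *pid_out = pid;
    return 0;
}
```

Calling it with a path that does not exist returns ENOENT and nothing else, which is exactly the missing-context problem: the caller learns the code but not which step or which path produced it.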

geofft · on Aug 16, 2018

They could just write the entire error string to an fd, right? It's annoying in C to write a formatted string without doing dynamic allocations, but it's possible.

(Or, if you want to be weird, do a PTRACE_TRACEME after fork and have the parent trace the child process and only detach when it sees a successful return from execve. If ptrace is unavailable, fall back to less-useful errors)
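A rough sketch of the first suggestion, formatting an error message into a fixed buffer with no dynamic allocation so it could be written to an fd between fork and exec. The function names are hypothetical. The errno value is sent as a number on the assumption that the parent, which is free to call strerror, turns it into text.

```c
#include <stddef.h>

/* Append src to buf without overrunning cap; always leaves room for '\0'. */
static void append_str(char *buf, size_t cap, size_t *pos, const char *src) {
    while (*src != '\0' && *pos + 1 < cap)
        buf[(*pos)++] = *src++;
}

/* Append a decimal integer using only a small stack buffer. */
static void append_int(char *buf, size_t cap, size_t *pos, int value) {
    char digits[12]; /* enough for a 32-bit int plus sign */
    size_t n = 0;
    unsigned int v = value < 0 ? 0u - (unsigned int)value : (unsigned int)value;
    do {
        digits[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    if (value < 0)
        digits[n++] = '-';
    while (n > 0 && *pos + 1 < cap)
        buf[(*pos)++] = digits[--n];
}

/* Builds "<op>: <path>: errno <err>" in buf, truncating if it does not fit.
 * Returns the number of characters written, not counting the '\0'. */
size_t format_child_error(char *buf, size_t cap, const char *op,
                          const char *path, int err) {
    size_t pos = 0;
    if (cap == 0)
        return 0;
    append_str(buf, cap, &pos, op);
    append_str(buf, cap, &pos, ": ");
    append_str(buf, cap, &pos, path);
    append_str(buf, cap, &pos, ": errno ");
    append_int(buf, cap, &pos, err);
    buf[pos] = '\0';
    return pos;
}
```

The child would fill a stack buffer with this and pass it to write(); nothing here allocates or takes a lock.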

ridiculous_fish · on Aug 16, 2018

Piping back the error string is probably the best one can do.

noselasd · on Aug 16, 2018

Do you need to know? If you fork/exec and get ENOENT, there's only one path that could relate to: the path to the program you wanted to run.

majewsky · on Aug 16, 2018

Or the path to the interpreter (e.g. if you're trying to run an x86-32 binary on an x86-64 system without 32-bit libc, or if you're trying to run a binary linked against glibc on a system with musl).

noselasd · on Aug 22, 2018

Fair enough - but there's still no more info you could learn from the ENOENT error.

Keyframe · on Aug 16, 2018

Is libuv used anywhere else outside of node?

djs55 · on Aug 16, 2018

There's a very good OCaml binding: https://github.com/fdopen/uwt . It works really well in the networking component of the Docker for Mac and Windows desktop apps (see https://github.com/moby/vpnkit)

insertnickname · on Aug 16, 2018

https://github.com/libuv/libuv/wiki/Projects-that-use-libuv

piquadrat · on Aug 16, 2018

There's an event loop implementation for Python based on libuv: https://github.com/MagicStack/uvloop

It's compatible with the native asyncio event loop and can be used as a drop-in replacement.

bmease · on Aug 16, 2018

neovim uses libuv.

snarfy · on Aug 16, 2018

I'm pretty sure aspnetcore uses it.

sasmithjr · on Aug 16, 2018

The default for Kestrel has been moved off of libuv, but you can choose to use libuv if you'd like.

https://blogs.msdn.microsoft.com/webdev/2018/04/12/asp-net-c...

gajjanag · on Aug 16, 2018

The Julia programming language uses it.

newnewpdro · on Aug 16, 2018

Why is the hello world example [0] unnecessarily allocating and freeing the uv_loop_t? When I see this kind of crap right out of the gate I immediately begin suspecting this is probably a pile of awful code written by a novice C programmer then documented and published as if it's the best thing since sliced bread.

Much of the value in making a struct (and hence its size) public and supplying a pointer to the initializer is the potential to embed or avoid heap allocation of the thing altogether, and here in an example which would benefit from both the simplified code in addition to demonstrating the advantage, it's completely missed.

Fixed form:

  #include <stdio.h>
  #include <uv.h>

  int main(void) {
    uv_loop_t loop;  /* lives on the stack, no malloc/free needed */

    uv_loop_init(&loop);

    printf("Now quitting.\n");
    uv_run(&loop, UV_RUN_DEFAULT);

    uv_loop_close(&loop);

    return 0;
  }

[0] https://nikhilm.github.io/uvbook/basics.html#hello-world

CJefferson · on Aug 16, 2018

Generally, you only create one uv_loop_t, and it is vital it is never copied (which is easy to do accidentally if you stack allocate it).

mallocing is the entirely sensible thing to do here.

beached_whale · on Aug 16, 2018

I just assumed it was because if the memory location ever changes it would invalidate all the pointers to it. It has to have a fixed memory location
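To make the fixed-address point concrete, here is a small sketch using a self-pointing list head, a common way for a struct to end up holding pointers into itself. The `struct loop` below is a stand-in invented for the example; it makes no claim about what uv_loop_t actually contains.

```c
/* Intrusive circular list node; an empty list points at itself. */
struct queue {
    struct queue *next;
    struct queue *prev;
};

/* Hypothetical stand-in for an event loop, not libuv's definition. */
struct loop {
    struct queue handles;
};

void loop_init(struct loop *l) {
    l->handles.next = &l->handles;
    l->handles.prev = &l->handles;
}

/* "Empty" means the head still points at its own address. */
int loop_is_empty(const struct loop *l) {
    return l->handles.next == &l->handles;
}
```

After `struct loop b = a;` the copy's `next` and `prev` still point into `a`, so `loop_is_empty(&b)` is false even though nothing was ever added. Any struct like this has to stay at one address, wherever that address happens to be.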

int0x80 · on Aug 16, 2018

What do you mean by accidentally copy?

struct embbed { ... uv_loop_t loop; ... } e1, e2; ... e1 = e2; ?

newnewpdro · on Aug 16, 2018

Presumably they're referring to copy via simple assignment vs. requiring a memcpy or pointer dereference. It's a b.s. argument.

int0x80 · on Aug 16, 2018

Totally. Your comment was right.

newnewpdro · on Aug 16, 2018

Then uv_loop_t should be an opaque type so its size isn't known at all, and uv_loop_init() should instead be uv_loop_new() returning the heap-allocated uv_loop_t, and uv_loop_close() renamed to uv_loop_free().

As-is the API and the way its use is being demonstrated in that example appear amateur to say the least.
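A sketch of what that opaque-handle shape looks like in C. The names (`evloop_t`, `evloop_new`, `evloop_pending`, `evloop_free`) and the struct fields are hypothetical, chosen so they are not mistaken for libuv's real API.

```c
#include <stdlib.h>

/* What a public header would expose: an incomplete type and functions only.
 * Callers cannot take sizeof, embed it, stack allocate it, or copy it. */
typedef struct evloop evloop_t;
evloop_t *evloop_new(void);
unsigned evloop_pending(const evloop_t *loop);
void evloop_free(evloop_t *loop);

/* What stays private in the library's .c file. Fields can be added or
 * reordered in a later release without changing the ABI callers see. */
struct evloop {
    unsigned pending;
    int backend_fd;
};

evloop_t *evloop_new(void) {
    evloop_t *loop = calloc(1, sizeof *loop);
    if (loop != NULL)
        loop->backend_fd = -1; /* sentinel: no backend descriptor yet */
    return loop;
}

unsigned evloop_pending(const evloop_t *loop) {
    return loop->pending;
}

void evloop_free(evloop_t *loop) {
    free(loop); /* free(NULL) is a no-op, so NULL is accepted */
}
```

The cost is visible too: every loop now needs a heap allocation and a NULL check on the result.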

viraptor · on Aug 16, 2018

But then you wouldn't be able to stack allocate it. It's reasonable to make something recommended, but not forbid other approaches if someone really wants to do it that way. You don't have to go to either extreme.

megous · on Aug 16, 2018

Funny how most structs in the linux kernel are public, they must be all amateurs, too. Perhaps go read about container_of and struct embedding in C before calling anyone an amateur.

It's just a different programming approach in C.

Encapsulation is good only for making it harder for anyone to poke in the internals of your library, which has some benefits with binary compatibility, etc.

But it's strictly worse in all other metrics when coding in C. It limits you, it forces malloc calls where most of the time none would be necessary, it makes memory management more complicated, it forces explicit initialization, etc.
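For anyone unfamiliar with container_of, a minimal sketch of struct embedding follows. The macro here is a simplified userspace version built on offsetof, and `timer_handle`, `connection`, and `fire` are hypothetical stand-ins for a library handle, application state, and the library calling back.

```c
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to one member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for a library handle whose definition is public. */
struct timer_handle {
    void (*on_fire)(struct timer_handle *handle);
};

/* Application state embeds the handle by value: one object, one lifetime. */
struct connection {
    int id;
    int fired;
    struct timer_handle timeout;
};

/* The callback only receives the handle, yet reaches its owner without
 * a separate allocation or a void* user-data field. */
void on_timeout(struct timer_handle *handle) {
    struct connection *conn = container_of(handle, struct connection, timeout);
    conn->fired++;
}

/* Stand-in for the library invoking the registered callback. */
void fire(struct timer_handle *handle) {
    handle->on_fire(handle);
}
```

This only works because the size and layout of `timer_handle` are visible to the caller, which is the trade being argued about in this thread.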

newnewpdro · on Aug 16, 2018

What exactly is funny here? The linux kernel is not a userspace library with a public API - and in the case of libuv, judging from `apt-cache search libuv` output, one which intends to be dynamically linked as well. By exposing this performance-insensitive struct publicly, they're also unnecessarily increasing their ABI surface area. It just reeks of novice library authorship.

What makes sense for a monolithic kernel's code is quite different from what makes sense in a userspace library, particularly on its public API boundaries.

It's also worth noting that the Linux kernel developers have historically made choices which are actively hostile to kernel module API/ABI stability as a way to discourage third parties from distributing out-of-tree proprietary binary modules. There is literally a policy of not having a stable interface for module writers, it's the anti-library except for the system call interface.

dap · on Aug 16, 2018

If one already accepts that the object should be dynamically allocated because otherwise it's too easy to accidentally copy (which seems specious, but that was the suggestion), then all of those arguments apply to that recommendation, too. And if it's "vital" that the object not be copied, then the API could enforce that with an opaque type.

I've only worked with libuv in a few contexts, and maybe there are compelling reasons for exposing the struct. (I'd like to hear them! I would not have expected a uv_loop to be allocated in perf-critical code paths or code paths where failure isn't an option.) But I think critical analysis of C API design is an important topic. C gets a bad rap for being unsafe, which it obviously is in many ways, but as C developers, simple choices like this (that a novice might not even think much about) can make an API much safer -- or much less safe.

megous · on Aug 16, 2018

Performance is orthogonal to this. To me the less you juggle with pointers and malloc/free the safer your code will be from memory leaks, misuse of unallocated memory, NULL checking issues, and easier to inspect/reason about.

Struct embedding helps with this quite a bit. And it's not possible without exposing the struct definition.

Performance gains are possible, but that's secondary.

> If one already accepts that the object should be dynamically allocated

It can still be allocated on the heap, but in one contiguous chunk of memory as part of the larger struct. Hiding the struct definition would prevent this.

dap · on Aug 16, 2018

> Performance is orthogonal to this.

That's largely true, but it's often cited as a reason to avoid malloc/free (however dubious that is).

> To me the less you juggle with pointers and malloc/free the safer your code will be from memory leaks, misuse of unallocated memory, NULL checking issues, and easier to inspect/reason about.

I don't quite agree. I've worked mostly in code bases using the pattern described earlier (an opaque pointer, a $type_create(), and a $type_destroy() function). With that pattern, I find it much easier to be certain by code inspection that a particular object or transformation is valid because as long as the pointer was allocated correctly, the object can only be modified by functions that know the type (aside from memory corruption, but that's always possible). That's usually a small set of functions that know the struct details. This fact is useful both as a library author and as the author of a library consumer. By contrast, if the struct is exposed, it's harder to identify all the places that can modify the structure's details and to be sure that invariants are maintained in all those places.

Besides that, several other failure modes are much less likely with opaque structures, including copying a structure you shouldn't, miscopying a structure that's okay to copy (e.g., off-by-one while copying), or operating on a correctly-sized block of memory that's never been initialized. You can still use unallocated memory, of course, but that's fairly easy to make safer by initializing pointers to NULL. (The analogous option for stack-initialized structs -- initializing them to zero -- is not necessarily any safer than leaving them uninitialized -- particularly if the struct contains file descriptors.)
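A tiny sketch of the file descriptor point in the parenthetical above, using a hypothetical `struct conn`. Zero is a valid descriptor (normally stdin), so a zeroed struct looks like it owns an open fd, whereas an explicit -1 sentinel does not.

```c
#include <stddef.h>

/* Hypothetical exposed struct that owns a file descriptor. */
struct conn {
    int fd;
    size_t bytes_read;
};

/* Explicit initializer: -1 can never be a valid descriptor. */
#define CONN_INIT { .fd = -1, .bytes_read = 0 }

/* A cleanup routine would typically close the fd whenever this is true. */
int conn_is_open(const struct conn *c) {
    return c->fd >= 0;
}
```

With `struct conn c = {0};` the check reports an open connection on fd 0, and cleanup code that trusts it would close stdin. With `struct conn c = CONN_INIT;` it reports closed.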

There are tradeoffs to both approaches. To me, the ability to modify the struct in new versions of the library without breaking the ABI is a pretty major point in favor of using an opaque structure for a library's primary handle. For very simple ancillary structures that are very unlikely to change, and where the convenience of stack allocation is worthwhile, exposing the structure makes a lot of sense.

int0x80 · on Aug 16, 2018

The libuv API is not amateur; it is all about flexibility. The example usage, however, is not optimal; it is just confused.

The option to dynamically allocate uv_loop_t is not there to avoid "accidental copies" or any such thing. It is there for when you need to dynamically create loop contexts.

Linux does exactly the same thing. But you won't see unnecessary kmallocs of structs when static storage is enough. In such cases you will see the *_init functions used.

emteycz · on Aug 16, 2018

I wouldn't call a library that is the foundation of one of the most popular platforms in the world amateur.

newnewpdro · on Aug 16, 2018

Ever looked at the PHP implementation throughout its popular existence? Popularity is an awful metric for gauging implementation quality.

emteycz · on Aug 16, 2018

Agreed, however PHP was never praised for anything other than its ecosystem. Node is generally praised for its async IO, which is built on top of libuv.