You can have the same file mmap()ed at the same address in two separate processe...

Const-me · on Jan 7, 2022

> I'm not sure what you mean by "sharing one time only" or "sharing between processes one-to-one"

By one time only, I meant that you only want to pass the data once, as opposed to sharing the data and being able to update it on one side and see updates on the other one.

It’s not impossible to implement growable/shrinkable shared memory areas, but it’s hard in practice: the communicating processes need to unmap and remap the virtual memory regions.
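A minimal sketch of that remapping step, under some loud assumptions: a temporary file stands in for the memfd, and two mappings inside one process play the roles of the two communicating processes. The function name `grow_shared_region` is made up for illustration, and the code is Unix-oriented.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

def grow_shared_region():
    # Assumption: a temporary file stands in for a memfd.
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        f.truncate(PAGE)

        # Two shared mappings of the same file act as "two processes".
        writer = mmap.mmap(fd, PAGE)
        reader = mmap.mmap(fd, PAGE)
        writer[:5] = b"hello"
        seen_before = bytes(reader[:5])

        # Growing the backing file does not grow the existing mappings.
        f.truncate(2 * PAGE)
        stale_len = len(reader)

        # Each side has to drop its mapping and map the region again.
        writer.close()
        reader.close()
        writer = mmap.mmap(fd, 2 * PAGE)
        reader = mmap.mmap(fd, 2 * PAGE)
        writer[PAGE:PAGE + 5] = b"world"
        seen_after = bytes(reader[PAGE:PAGE + 5])

        writer.close()
        reader.close()
        return seen_before, stale_len, seen_after
```

In a real multi-process setup each side would also need to be told when to remap, and the new mapping may land at a different address, which is where most of the practical difficulty comes from.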

It’s similar with more than 2 processes sharing the memory: technically doable but hard in practice. For instance, it’s easy to leak these memfd handles.

None of these issues are present for threads sharing the address space.

> you just have to allocate the bitmaps differently instead of using the fd_set type

When you’re really unlucky, these bitmaps waste kilobytes of memory to represent a set with only a couple of handles. RAM is usually cheap at that scale; the problem is the performance overhead of searching for the set bits.
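To put rough numbers on that, here is a back-of-the-envelope sketch. The helper names are hypothetical, and the 8-byte `struct pollfd` size is an assumption that holds on typical Linux ABIs (one `int` plus two `short`s); real bitmaps are also rounded up to a word boundary.

```python
# Assumption: sizeof(struct pollfd) == 8 (int fd + short events + short revents).
POLLFD_SIZE = 8

def select_bitmap_bytes(fds):
    # A select()-style bitmap needs one bit for every descriptor number
    # from 0 up to the largest one in the set, used or not.
    return max(fds) // 8 + 1

def pollfd_array_bytes(fds):
    # A poll()-style array needs one entry per descriptor actually watched.
    return len(fds) * POLLFD_SIZE

fds = [3, 50000]  # two handles, one of which happens to be numbered high
bitmap_bytes = select_bitmap_bytes(fds)
array_bytes = pollfd_array_bytes(fds)
```

For this made-up set the bitmap takes 6251 bytes against 16 bytes for the array, and the kernel has to scan all of those bits to find the two that are set.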

> epoll is a faster alternative to select() (or, as you say, poll()), isn't it?

Right, but for I/O-bound things, epoll() + multithreading is even faster. Modern network cards are aware of multi-core CPUs on the other side of the PCIe bus. Faster cards, such as 10 Gbit/second ones, have multiple transmit/receive queues designed to interact with different CPU cores.
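A toy sketch of the shape of that design, not a benchmark: each worker thread owns its own readiness-notification instance and services only its own sockets. It uses `selectors.DefaultSelector`, which picks epoll on Linux; the socket pairs stand in for network connections, and `worker` and `run_workers` are hypothetical names.

```python
import selectors
import socket
import threading

def worker(sock, results, index):
    # Each worker has a private selector (epoll on Linux), so no
    # dispatcher thread sits between the kernel and the handler.
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    try:
        for key, _mask in sel.select(timeout=5.0):
            results[index] = key.fileobj.recv(64)
    finally:
        sel.close()

def run_workers(n=4):
    # Socket pairs stand in for real connections (assumption).
    pairs = [socket.socketpair() for _ in range(n)]
    results = [None] * n
    threads = [
        threading.Thread(target=worker, args=(pairs[i][1], results, i))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for i, (sender, _receiver) in enumerate(pairs):
        sender.sendall(b"msg %d" % i)
    for t in threads:
        t.join()
    for sender, receiver in pairs:
        sender.close()
        receiver.close()
    return results
```

In Python this only illustrates the structure; the throughput argument applies to native code, where each thread can run on the core that its NIC queue interrupts.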

OTOH, when things aren’t that I/O bound, the single dispatcher thread is adequate, and the API overhead of poll() is reasonable too, given the usability win.

> Where can I find out more?

Yeah, you’re right about that: according to the specs, accept() never does that. However, note the F_DUPFD command of fcntl: "duplicate the file descriptor using the lowest-numbered available file descriptor greater than or equal to arg". This alone means that for libraries or long-lived projects, relying on file handles being small numbers is fundamentally unreliable.
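A quick way to see F_DUPFD hand out a large descriptor number, as a sketch. The minimum of 200 is an arbitrary value picked to stay under common per-process descriptor limits, and `dup_demo` is a hypothetical name.

```python
import fcntl
import os

def dup_demo(minimum=200):
    r, w = os.pipe()
    high = None
    try:
        # Ask for the lowest free descriptor that is >= minimum.
        high = fcntl.fcntl(r, fcntl.F_DUPFD, minimum)
        # The duplicate refers to the same pipe as the original.
        os.write(w, b"x")
        data = os.read(high, 1)
        return high, data
    finally:
        for fd in (r, w, high):
            if fd is not None:
                os.close(fd)
```

Any code in the same process that assumed descriptors fit a fixed-size bitmap would now be handed a number it did not expect, even though only a handful of descriptors are open.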

kragen · on Jan 7, 2022

Typically when many processes share memory it is with a fixed-size buffer for things like a database page cache or the Apache scoreboard. Or sometimes one process creates a data file and then one or many processes mmap it read-only (your "one-time" case, I think). It's true that sharing dynamically growing and shrinking data structures becomes complicated, but multithreading doesn't make it less complicated, it just moves the complication somewhere else.
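The "create a data file once, then map it read-only" case can be sketched like this. It runs in one process for brevity; in practice the readers would be separate processes opening the same path. `publish_and_read` is a hypothetical helper name.

```python
import mmap
import os
import tempfile

def publish_and_read(payload):
    # "Writer": create the data file once. The payload must be non-empty,
    # because an empty file cannot be mapped.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
        path = f.name
    try:
        # "Reader": map the whole file read-only (length 0 means all of it).
        with open(path, "rb") as g:
            view = mmap.mmap(g.fileno(), 0, access=mmap.ACCESS_READ)
            try:
                data = bytes(view[:])
                try:
                    view[0:1] = b"X"
                    writable = True
                except TypeError:
                    # Python rejects writes to a read-only mapping.
                    writable = False
            finally:
                view.close()
    finally:
        os.unlink(path)
    return data, writable
```

Because the size is fixed and nobody writes after publication, none of the remapping or synchronization problems of growable regions come up.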

Probably you shouldn't use select() or poll() in a library unless it's something like libevent. Libraries that want to own the event loop are impossible to use together unless you're willing to resort to multithreading.

Yes, it's true that you can use F_DUPFD (or dup2()) to request a high-numbered file descriptor. As I see it, that's not really a matter of luck. The only reason I can think of to do that is for some sort of art project or side-channel communication where your file descriptor numbers spell out a message or something.