You can have the same file mmap()ed at the same address in two separate processes, so putting pointers in it is a valid thing to do. In most languages doing dynamic allocation in shared memory is a world of pain, but maybe the STL allocator approach makes it a reasonable thing to do in C++? Sharing a complex linked data structure between different threads is a somewhat tricky thing to do in any case, though in some cases it may be the only reasonable solution. In many cases something like FlatBuffers may be a better solution.
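If the mapping address cannot be guaranteed to match, one common workaround is to link nodes by offset from the start of the region instead of by pointer, which is the general idea that formats like FlatBuffers rely on. Here is a minimal sketch of that idea, not anything FlatBuffers-specific; the node layout, `write_list`, and `read_list` are made up for illustration, and an anonymous mapping stands in for the shared file.

```python
import mmap
import struct

# Hypothetical node layout: (value, offset of next node); offset 0 means "end".
NODE = struct.Struct("<iI")

def write_list(buf, values, start=8):
    """Store values as a singly linked list inside buf, linking nodes by offset."""
    offset = start
    for i, value in enumerate(values):
        is_last = i == len(values) - 1
        next_offset = 0 if is_last else offset + NODE.size
        NODE.pack_into(buf, offset, value, next_offset)
        offset += NODE.size
    return start if values else 0

def read_list(buf, head):
    """Walk the list starting at the node stored at offset head."""
    out = []
    while head != 0:
        value, head = NODE.unpack_from(buf, head)
        out.append(value)
    return out

# An anonymous mapping stands in for a file mapped by two processes.
shared = mmap.mmap(-1, 4096)
head = write_list(shared, [10, 20, 30])
result = read_list(shared, head)
```

Because nothing stored in the buffer depends on where it is mapped, a second process could map the same bytes at any address and walk the list the same way.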
I'm not sure what you mean by "sharing one time only" or "sharing between processes one-to-one"; can you elaborate?
It isn't really true that select() can only monitor file descriptor numbers less than FD_SETSIZE; you just have to allocate the bitmaps differently instead of using the fd_set type. Libevent does this, for example.
io_uring isn't really only for I/O; you can use it for madvise(), poll(), and timers as well. Normally I'd be comfortable with saying that's "I/O" but in this context poll() and timers are the things you're implicitly opposing to "I/O".
Yes, kqueue and iocp are not available on Linux. But io_uring and epoll aren't available on other operating systems, so I thought it was worthwhile mentioning the corresponding facilities there.
Why do you say you need multithreading to truly benefit from epoll? If you have an I/O-bound process that has a lot of file descriptors open, running in an event loop, epoll is a faster alternative to select() (or, as you say, poll()), isn't it?
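To make the single-threaded case concrete, here is a rough sketch of one iteration of an epoll event loop using Python's `select.epoll` wrapper (Linux only). The `run_once` helper and the socket pairs are assumptions made for the example; a real server would register accepted connections instead.

```python
import select
import socket

def run_once(epoll, sockets, timeout=1.0):
    """Wait for ready descriptors and read from each one; returns {fd: bytes}."""
    received = {}
    for fd, events in epoll.poll(timeout):
        if events & select.EPOLLIN:
            received[fd] = sockets[fd].recv(4096)
    return received

# Two connected socket pairs stand in for a larger set of client connections.
pairs = [socket.socketpair() for _ in range(2)]
sockets = {reader.fileno(): reader for reader, _ in pairs}
first_fd = pairs[0][0].fileno()

epoll = select.epoll()
for fd in sockets:
    epoll.register(fd, select.EPOLLIN)

# Only the first writer sends, so only one descriptor should be reported ready.
pairs[0][1].sendall(b"ping")
ready = run_once(epoll, sockets)

epoll.close()
for reader, writer in pairs:
    reader.close()
    writer.close()
```

The kernel hands back only the descriptors that are ready, so the loop never has to scan the idle ones.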
This thing about nonsequential socket file descriptors is news to me; I hadn't heard anything about it, though I admit I haven't been paying attention. Where can I find out more?
> I'm not sure what you mean by "sharing one time only" or "sharing between processes one-to-one"
By one time only, I meant that you only want to pass the data once, as opposed to sharing the data and being able to update it on one side and see updates on the other one.
It’s not impossible, but it is hard in practice to implement growable/shrinkable shared memory areas, because the communicating processes need to unmap and remap the virtual memory regions.
It’s similar with more than two processes sharing the memory: technically doable but hard in practice. For instance, it’s easy to leak these memfd handles.
None of these issues are present for threads sharing the address space.
> you just have to allocate the bitmaps differently instead of using the fd_set type
When really unlucky, with these bitmaps you will need to waste kilobytes of memory to represent a set with only a couple of handles. RAM is usually cheap at that scale; the problem is the performance overhead of searching for the set bits.
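A back-of-the-envelope sketch of that overhead, using a made-up set of two descriptors. The helpers below are illustrative only and round to whole bytes; real implementations typically round the bitmap up to machine words.

```python
def bitmap_bytes(highest_fd):
    """Bytes needed for a bitmap with one bit per descriptor 0..highest_fd."""
    return (highest_fd + 8) // 8

def make_bitmap(fds):
    """Build a select()-style bitmap holding the given descriptor numbers."""
    bits = bytearray(bitmap_bytes(max(fds)))
    for fd in fds:
        bits[fd // 8] |= 1 << (fd % 8)
    return bits

def scan(bits):
    """Recover the set members by testing every bit, as a naive scan would."""
    return [i for i in range(len(bits) * 8) if bits[i // 8] & (1 << (i % 8))]

# Hypothetical set: two descriptors, one of them with a large number.
fds = [5, 50000]
bits = make_bitmap(fds)
size = len(bits)       # bytes used to hold two members
members = scan(bits)   # walks every bit position to find them
```

Here `size` comes out to 6251 bytes for two members, and the naive scan tests about fifty thousand bit positions to find them.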
> epoll is a faster alternative to select() (or, as you say, poll()), isn't it?
Right, but for I/O-bound things, epoll() + multithreading is even faster. Modern network cards are aware of multi-core CPUs on the other side of the PCIe bus. Faster cards, such as 10 Gbit/s ones, have multiple transmit/receive queues designed to interact with different CPU cores.
OTOH, when things aren’t that I/O bound, the single dispatcher thread is adequate, and the API overhead of poll() is reasonable too, given the usability win.
> Where can I find out more?
Yeah, you’re right about that: accept() never does that, according to the specs. However, note the F_DUPFD command of fcntl(): "duplicate the file descriptor using the lowest-numbered available file descriptor greater than or equal to arg." This alone means that for libraries or long-lived projects, relying on file handles being small numbers is fundamentally unreliable.
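A small sketch of the F_DUPFD behaviour, via Python's `fcntl` module on a Unix-like system. The threshold of 100 is an arbitrary value picked to stay under the usual per-process descriptor limit.

```python
import fcntl
import os

read_end, write_end = os.pipe()

# Ask for a duplicate numbered 100 or higher; 100 is an arbitrary value chosen
# to stay below the usual per-process descriptor limit.
high_fd = fcntl.fcntl(write_end, fcntl.F_DUPFD, 100)

# The duplicate refers to the same pipe, so writes through it are readable.
os.write(high_fd, b"x")
data = os.read(read_end, 1)

for fd in (read_end, write_end, high_fd):
    os.close(fd)
```

Any code in the process can do this, so a library cannot assume the descriptors it is handed are small.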
Typically when many processes share memory it is with a fixed-size buffer for things like a database page cache or the Apache scoreboard. Or sometimes one process creates a data file and then one or many processes mmap it read-only (your "one-time" case, I think). It's true that sharing dynamically growing and shrinking data structures becomes complicated, but multithreading doesn't make it less complicated, it just moves the complication somewhere else.
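The "create once, map read-only" pattern might look roughly like this. The file name and contents are made up, and both roles run in one process here purely for illustration.

```python
import mmap
import os
import tempfile

# "Producer": write the data file once. The path and contents are made up.
path = os.path.join(tempfile.mkdtemp(), "table.bin")
with open(path, "wb") as f:
    f.write(b"precomputed lookup table")

# "Consumer": map the file read-only; any number of processes could do the same.
with open(path, "rb") as f:
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

contents = bytes(view[:11])

# Attempts to modify a read-only mapping are rejected.
try:
    view[0:1] = b"X"
    write_rejected = False
except TypeError:
    write_rejected = True

view.close()
os.remove(path)
```

Since nobody writes after the file is created, there is no growth or synchronization to coordinate between the readers.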
Probably you shouldn't use select() or poll() in a library unless it's something like libevent. Libraries that want to own the event loop are impossible to use together unless you're willing to resort to multithreading.
Yes, it's true that you can use F_DUPFD (or dup2()) to request a high-numbered file descriptor. As I see it, that's not really a matter of luck. The only reason I can think of to do that is for some sort of art project or side-channel communication where your file descriptor numbers spell out a message or something.