When reading along: Keep in mind that I'm not a core developer and thus are not ...

When reading along: Keep in mind that I'm not a core developer and thus are not directly involved in development, design decisions, or roadmap. I have some understanding of the internals and the associated challenges based on my contributions and discussions on the mailing list, but the following might not be entirely correct.

> I wonder if a disk-writing process could be spun out before dropping privileges?

I mean … it sure can and that appears the plan based on the last comment in that issue. However the “no disk access” policy is also useful for security. HAProxy can chroot itself to an empty directory to reduce the blast radius and that is done in the default configuration on at least Debian.

> but I don't see why disk accesses need to block requests

My understanding is that historically Linux disk IO was inherently blocking. A non-blocking interface (io_uring) only became available fairly recently: https://stackoverflow.com/a/57451551/782822. And even then it's a operating system specific interface. For the BSD's you need a different solution.

If your process is blocked for even one millisecond when handling two million of requests per second (https://www.haproxy.com/de/blog/haproxy-forwards-over-2-mill...) then you drop 2k requests or increase latency.

> or why they have to occur in the same thread as requests?

“have” is a strong word, of course nothing “has” to be. One thing to keep in mind is that HAProxy is 20 years old and apart from possibly doing Let's Encrypt there was no real need for it to have disk access. HAProxy is a reverse proxy / load balancer, not a web server.

Inter-thread communication comes with its own set of challenges and building something reliable for a narrow use case is not necessarily worth it, because you likely need to sacrifice something else.

As an example at scale you can't even let your operating system schedule out one of the worker threads to schedule in the “disk writer” thread, because that will effectively result in a reduced processing capacity for some fractions of a second which will result in dropped requests or increased latency. This becomes even worse if the worker holds an important lock.