Moving to an async model adds a new set of operational challenges as well as some interesting failure scenarios. (Edit: also, in practice you would need at least one more system to enqueue requests into the broker, as the latter would typically not be exposed to the outside world.)
Request/response on the other hand is much simpler to configure and operate.
The LB puts a request on the queue carrying a reply_to (ip:port) on which it waits (blocking) for a response from whoever picked up the request; it just does not know who that is until a reply comes back.
As an example of a failure scenario: how does your system distinguish between a request timeout, a response that didn't get sent back because of a network failure, and the consumer crashing and losing the message?
for {
    select {
    case reply := <-replyChannel:
        // Correlate by UUID; a reply meant for some other request is
        // ignored and we keep waiting.
        if reply.Uuid == r.message.Uuid {
            return reply
        }
    case <-timeout:
        return makeError(r.message, MessageType_ERROR_CONSUMER_TIMEOUT, "consumer timed out")
    }
}
Not much different from what you do with normal HTTP timeouts: you send a request, sometimes a response comes and sometimes it doesn't; it's up to the load balancer to decide whether it wants to retry or error out.
Also, a queue does not mean an async model, it means a queue. There are many queues involved in HTTP request/response handling (e.g. the listen(2) backlog queue itself) and that does not make it async :)
“Message queues implement an asynchronous communication pattern between two or more processes/threads whereby the sending and receiving party do not need to interact with the message queue at the same time.”
If we're not talking about an async model then the suggestion is much less drastic than it sounded at first. In that case the crux of your desire is simply allowing the hosts to signal readiness more directly.
You would almost never actually wait for host machines to dial in. You would keep a list of hosts marked ready or not ready, and they would almost always be ready for more. You want to assume readiness (as this lowers latency) and feed the fire hose.
But in this interpretation, in a world where an LB would be using an existing connection to host machines with HTTP/3, we're basically already there. I suppose it's trivial and standard to signal unreadiness to the LB from the host with a 429 Too Many Requests response code.
Off the top of my head I'm trying to think how a host could actively signal to an LB that it's ready for more requests... I suppose it's trivial and common to use a health check. Is it even a change to say that these need to be updated to achieve your goal of host-to-LB pulling?
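Both signals are easy to sketch on the host side. This is a hedged illustration, not a reference implementation: maxInFlight, the handler shapes, and the /healthz path are all assumptions, and real LBs differ in how they treat 429 vs a failing health check.

```go
package main

import (
	"net/http"
	"sync/atomic"
)

const maxInFlight = 100 // assumed capacity threshold

var inFlight atomic.Int64

// admit reserves a slot for one request; false means "at capacity".
func admit() bool {
	if inFlight.Add(1) > maxInFlight {
		inFlight.Add(-1)
		return false
	}
	return true
}

// done releases the slot when the request finishes.
func done() { inFlight.Add(-1) }

// ready reports whether this host has headroom for more requests.
func ready() bool { return inFlight.Load() < maxInFlight }

func main() {
	// Passive signal: refuse with 429 when full; the LB can retry elsewhere.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if !admit() {
			http.Error(w, "at capacity", http.StatusTooManyRequests)
			return
		}
		defer done()
		w.Write([]byte("ok"))
	})
	// Active signal: the LB's health checker polls this and pulls the
	// host out of rotation while it reports 503.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		if !ready() {
			http.Error(w, "busy", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	http.ListenAndServe(":8080", nil)
}
```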
...that's just the push model. "Signalling" via "well, the load balancer has a max of 10 sessions per server" is enough.
The pull model just adds an unnecessary RTT.
> Off the top of my head I'm trying to think how a host could actively signal to an LB that it's ready for more requests... I suppose it's trivial and common to use a health check. Is it even a change to say that these need to be updated to achieve your goal of host-to-LB pulling?
Like this.
There is rarely a case where you decide not to serve the next request after serving the previous one, so push is optimal for short requests. And if a host doesn't want more, it can just signal that via the health check.
Pull makes more sense for latency-insensitive jobs like "take a task from the queue, do it, and put the results back", e.g. if you build a video-encoding service that dynamically scales itself in the background, where "just do one encode and exit" is commonplace.
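That worker loop is trivially small. A sketch, using channels as a stand-in for a real broker; Task, Result, and the encode placeholder are all made up for the example:

```go
package main

import "fmt"

// Task and Result are assumed shapes for a queue-backed job.
type Task struct {
	ID    int
	Input string
}

type Result struct {
	ID     int
	Output string
}

// encode is a placeholder for the actual work (e.g. video encoding).
func encode(in string) string { return "encoded(" + in + ")" }

// worker pulls at its own pace: it only takes the next task once the
// previous one is finished, so it can never be pushed into overload.
// When the queue closes it simply exits ("do one batch and exit").
func worker(tasks <-chan Task, results chan<- Result) {
	for t := range tasks {
		results <- Result{ID: t.ID, Output: encode(t.Input)}
	}
}

func main() {
	tasks := make(chan Task, 3)
	results := make(chan Result, 3)
	for i := 1; i <= 3; i++ {
		tasks <- Task{ID: i, Input: fmt.Sprintf("clip-%d", i)}
	}
	close(tasks)
	worker(tasks, results)
	close(results)
	for r := range results {
		fmt.Println(r.ID, r.Output)
	}
}
```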
it is not possible for a remote destination host to signal to a sending host, in a reliable way, that it is ready (or not ready) for more requests
readiness is not knowable by a receiver; it is a function of many variables, some of which are only knowable to a sender. one obvious example is a network fault between sender and receiver, and there are many more
even the concept of "load" reported by a receiving application isn't particularly relevant; what matters is the latency (and other) properties of requests sent to that application, as observed by the sender
health is fundamentally a property that is relative to each sender, not something that is objective for a given receiver
That's the push model when per-server maxconn is full tho.
The biggest benefit of the pull model is not having to update the backend server list every time you add/remove one, but outside of that it isn't really all that beneficial.
You also get added latency, unless each backend server is actively listening and connected; but if it is, you're just wasting an extra RTT to say "hey, there is a request in the queue, do you want it?"