
This was a great and readable explanation!

It was also a great example of why you should never use “random” as your load balancing algorithm unless you plan to always have 1/3 extra capacity.

Or conversely why you should always have 1/3 extra capacity if you must use random.
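For intuition, here's a quick balls-into-bins sketch (my own back-of-the-envelope, not from the article; the server count is arbitrary): throw N requests at N servers uniformly at random and count how many servers never get hit.

    import random

    N = 10_000        # servers, and an equal number of simultaneous requests (arbitrary size)
    TRIALS = 20
    idle = 0.0

    for _ in range(TRIALS):
        hits = [0] * N
        for _ in range(N):
            hits[random.randrange(N)] += 1   # each request picks a server uniformly at random
        idle += sum(1 for h in hits if h == 0) / N

    print(f"average idle fraction: {idle / TRIALS:.1%}")   # ~36.8%, i.e. about 1/e

About 1/e ≈ 37% of the servers sit idle while the rest queue up, so roughly a third of your capacity is wasted at any instant — which, as I read it, is where the 1/3 figure comes from.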




I thought the example in the article was a little artificial at first. Like why would you only have 12 requests if you have 12 backends?

Reframing it in terms of capacity cleared that up. If the rate of incoming requests is higher than the total rate your backends can process, your queues will grow without bound!

So something like an average of 12 incoming requests per second with each backend capable of processing 1 request per second is actually fairly realistic. And I think the math still works out the same there.
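As a rough sketch of that setup (my own simulation; the only numbers taken from the comment are 12 requests/second, 12 backends, 1 request/second each): route each incoming request to a random backend, let each backend finish at most 1 request per second, and watch the total backlog.

    import random

    SERVERS = 12
    RATE = 12                      # incoming requests per second, equal to total capacity
    queues = [0] * SERVERS

    for second in range(1, 10_001):
        for _ in range(RATE):
            queues[random.randrange(SERVERS)] += 1   # each request picks a backend at random
        for i in range(SERVERS):
            if queues[i] > 0:
                queues[i] -= 1                       # each backend finishes at most 1 req/sec
        if second % 2_000 == 0:
            print(second, sum(queues))               # total backlog keeps creeping upward

Even though the incoming rate exactly equals total capacity, the backlog drifts upward over time, because at any given second the random routing leaves some backends idle while others accumulate a queue.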


That condition is only met if you send N requests to N routers. If you send 1,000,000*N requests to N routers, they will almost always end up very close to evenly distributed.
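A quick sketch of how fast it evens out (my own illustrative numbers): the relative spread shrinks roughly like 1/sqrt(requests per router).

    import random

    N = 12                                   # routers
    for per_router in (1, 100, 10_000):
        load = [0] * N
        for _ in range(per_router * N):
            load[random.randrange(N)] += 1   # each request picks a router at random
        print(per_router, min(load), max(load))

With 1 request per router some routers get nothing; with 10,000 per router every router lands within a couple of percent of the mean.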


But then you’re under capacity. The assumption is that it takes N servers to service N requests simultaneously.



