> Or to a manual load balancing strategy. What I have in mind is being able to easily re-direct users from one server to another using a simple CLI utility. This won't entirely eliminate downtime issues but it will (hopefully) mitigate effects to a manageable level.
In practice, this means that each time a server goes down, someone has to be on call to run a command that redirects its users. It also breaks your utilization-based sharding scheme.
> In the era of cloud computing, downtime has become less of an issue.
Well, yes and no. Service-level downtime is less frequent because of robust, automated load balancing. Individual servers, though, whether VMs, containers, or whatever you prefer, are far less reliable. That's intentional. It's cheap commodity hardware, designed to die, and 'cloud native' applications are supposed to handle that properly via things like automated load balancing.
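To illustrate what "handle that properly" can look like, here is a minimal sketch of a round-robin balancer that skips backends marked unhealthy. The `Balancer` class, its method names, and the backend names are all hypothetical, and in a real setup the `mark_down` call would come from an automated health check rather than from a person at a terminal.

```python
class Balancer:
    """Toy round-robin balancer that routes only to healthy backends."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        # Assumption: called by an automated health check, not a human.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, skipping unhealthy ones.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")


lb = Balancer(["a", "b", "c"])
before = [lb.pick() for _ in range(3)]
lb.mark_down("b")  # server "b" dies; nobody has to be paged
after = [lb.pick() for _ in range(3)]
```

The point is that traffic moves off the dead server as soon as the health check notices, with no manual redirect step.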
> I like attempting to simplify supposedly complicated issues ... It's true that all requests are not equal but statistically, over time, with all servers being similar, the differences will tend to balance out.
This is one of those simplifications that seems intuitive but just doesn't work. It is completely normal for request cost to differ by multiple orders of magnitude, both between customers and at different times. Even if you assume that your application is static (which it hopefully isn't), customer workloads are not. Their behavior will change, which means your sharding needs to change with it. This is already solved by the existing load balancing algorithms described in the linked article.
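To make the cost-skew point concrete, here is a toy comparison, assuming made-up per-customer costs that span three orders of magnitude. It pins customers to servers by index (static sharding) and compares that with a simple least-loaded policy. The least-loaded policy is only a stand-in for load-aware balancing in general, not necessarily one of the algorithms from the linked article, and the cost list is deliberately adversarial.

```python
import heapq


def static_sharding(costs, n_servers):
    """Pin customer i to server i % n_servers, ignoring cost."""
    load = [0] * n_servers
    for i, cost in enumerate(costs):
        load[i % n_servers] += cost
    return load


def least_loaded(costs, n_servers):
    """Assign each customer to the server with the least load so far."""
    heap = [(0, server) for server in range(n_servers)]
    heapq.heapify(heap)
    for cost in costs:
        load, server = heapq.heappop(heap)
        heapq.heappush(heap, (load + cost, server))
    # Return loads ordered by server index.
    return [load for load, _ in sorted(heap, key=lambda entry: entry[1])]


# Hypothetical costs: a few heavy customers among many light ones.
costs = [1000, 1, 1, 1000, 1, 1, 100, 1, 1, 100, 1, 1]

static_load = static_sharding(costs, 3)  # [2200, 4, 4]
greedy_load = least_loaded(costs, 3)     # [1000, 1001, 207]
```

Same total work in both cases, but the static scheme puts every heavy customer on one server while the other two sit idle. When the heavy customers change, the static mapping has to be redone by hand.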
What problem do you see with existing solutions that you're trying to solve?