
I have a simple alternative method in mind for SaaS (software as a service) apps.

Manual/statistical load balancing --- assign users to a specific server based on their login credentials. A statistical model of server utilization can be maintained and users assigned or re-assigned as needed. Latency can be reduced to zero by simply forwarding the connection to the proper server once the login is complete.
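
A minimal sketch of the login/redirect idea in Node/TypeScript (the lookup table, names, and authentication stub are all hypothetical; a real table would live somewhere shared, like a database):

    import { createServer, IncomingMessage } from "node:http";

    // Hypothetical user -> app-server table, maintained by the
    // statistical model and adjusted as servers fill up.
    const assignments: Record<string, string> = {
      alice: "https://app-1.example.com",
      bob: "https://app-2.example.com",
    };

    // Stand-in for real credential checking.
    function authenticate(req: IncomingMessage): string | null {
      return (req.headers["x-user"] as string) ?? null;
    }

    // Login server: verify credentials, then redirect the client to
    // its assigned server so later requests never touch this process.
    createServer((req, res) => {
      const user = authenticate(req);
      const target = user ? assignments[user] : undefined;
      if (!target) {
        res.writeHead(401).end("login failed");
        return;
      }
      res.writeHead(302, { Location: target }).end();
    }).listen(8080);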

The obvious downside is that a custom load balancer implementation is required.

Does anyone have any experience using NodeJS as a load balancer for something like this?





That's very common with stateful applications. Lots of people use HAProxy or other application-aware LBs to keep the sessions "sticky" to a single app server.

https://www.haproxy.com/blog/enable-sticky-sessions-in-hapro... https://www.haproxy.com/blog/load-balancing-affinity-persist...
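
The canonical cookie-based version looks roughly like this (backend name, addresses, and cookie name are placeholders):

    backend app
        balance roundrobin
        cookie SERVERID insert indirect nocache
        server app1 10.0.0.1:8080 check cookie app1
        server app2 10.0.0.2:8080 check cookie app2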


Keeping a session "sticky" once it is assigned to a server is a somewhat different issue than controlling the assignment.

"Persistence" is more what I want but digging this out of HAProxy configuration and making changes on the fly looks troublesome.


As others have mentioned, this is a kind of sharding strategy. It doesn't get rid of the need for load balancing, and it's really going in the opposite direction of what we know to be reliable.

> Manual/statistical load balancing --- assign users to a specific server based on their login credentials.

What happens when that specific server goes down? Needs an upgrade/deployment? You'll have to fail over to a different server, which brings you back to an automatic load balancing strategy.

> Latency can be reduced to zero by simply forwarding the connection to the proper server once the login is complete.

Modern load balancers add a negligible amount of latency per request. If you're truly forwarding it in the networking sense, then a load balancer/reverse proxy is still involved.

If you mean something like redirecting them to an endpoint that points directly at an individual server, you get back to the first problem. What happens when that server goes down?

> A statistical model of server utilization can be maintained and users assigned or re-assigned as needed.

This is one of those things that sounds _very simple_, but in practice is incredibly complicated.


> What happens when that specific server goes down?

Good point.

> You'll have to fail over to a different server, which brings you back to an automatic load balancing strategy.

Or to a manual load balancing strategy. What I have in mind is being able to easily re-direct users from one server to another using a simple CLI utility. This won't entirely eliminate downtime issues but it will (hopefully) mitigate effects to a manageable level.
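
As a sketch, the CLI could be as small as this (assuming the assignment table lives in a JSON file the login server re-reads on change; everything here is hypothetical):

    // reassign.ts -- usage: tsx reassign.ts alice https://app-2.example.com
    import { readFileSync, writeFileSync } from "node:fs";

    const [user, server] = process.argv.slice(2);
    const table = JSON.parse(readFileSync("assignments.json", "utf8"));
    table[user] = server;  // move the user to the new server
    writeFileSync("assignments.json", JSON.stringify(table, null, 2));
    console.log(`${user} -> ${server}`);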

In the era of cloud computing, downtime has become less of an issue.

> This is one of those things that sounds _very simple_, but in practice is incredibly complicated.

I like attempting to simplify supposedly complicated issues. What I have in mind is simply counting the requests each server handles and using this as a simple measure to compare utilization. It's true that all requests are not equal but statistically, over time, with all servers being similar, the differences will tend to balance out.
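
Concretely, the "statistical model" could start as nothing more than a per-server request counter (server names hypothetical):

    // Assign each new user to whichever server has handled the
    // fewest requests so far; counts are incremented per request.
    const requestCounts = new Map<string, number>([
      ["app-1.example.com", 0],
      ["app-2.example.com", 0],
    ]);

    function recordRequest(server: string): void {
      requestCounts.set(server, (requestCounts.get(server) ?? 0) + 1);
    }

    function leastLoadedServer(): string {
      // Lowest running count wins the next user assignment.
      return [...requestCounts.entries()].sort((a, b) => a[1] - b[1])[0][0];
    }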


> Or to a manual load balancing strategy. What I have in mind is being able to easily re-direct users from one server to another using a simple CLI utility. This won't entirely eliminate downtime issues but it will (hopefully) mitigate effects to a manageable level.

In practice, this means that each time a server goes down someone has to be on-call to run a command to redirect them. It's also breaking your utilization-based sharding scheme.

> In the era of cloud computing, downtime has become less of an issue.

Well, yes and no. Downtime is less frequent because of robust, automated load balancing. Individual servers, whether VMs or containers or whatever you prefer, are far less reliable. That's intentional. It's cheap commodity hardware, designed to die, and 'cloud native' applications are supposed to handle that properly via things like automated load balancing.

> I like attempting to simplify supposedly complicated issues ... It's true that all requests are not equal but statistically, over time, with all servers being similar, the differences will tend to balance out.

This is an example of one of those simplifications that seems intuitive but just doesn't work. It is completely normal for request cost to vary by multiple orders of magnitude between customers and over time. Even if you assume that your application is static (which it hopefully isn't), customer workloads are not. Their behavior will change, which means your sharding needs to change. This is already solved by existing load balancing algorithms described in the linked article.

What problem do you see with existing solutions that you're trying to solve?


If I'm understanding the ask, you can certainly do this with Caddy. You could use the `forward_auth` directive to proxy to one upstream to authenticate the connection (by looking at the request headers contents) and possibly ask it to give you an upstream address as a response header, then you can use `reverse_proxy` to proxy the original request to that address from the auth request's header. You could also implement your own dynamic upstreams module (in Go) to do custom upstream selection logic. And Caddy has plenty of load balancing policy options to choose from (and a few more improvements in PRs I opened last week).
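
A rough Caddyfile sketch of that first approach (the auth service address and header name are hypothetical; it assumes the auth service returns the chosen upstream address in a response header):

    example.com {
        forward_auth auth.internal:9000 {
            uri /verify
            # copy the auth service's chosen upstream into the request
            copy_headers X-Upstream
        }
        reverse_proxy {header.X-Upstream}
    }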


There are lots of good replies already, but I'm curious why you would even want to do this. Web applications have been moving away from stateful servers for ages. Ideally your server instances are completely disposable. If you need to persist and share state between them, there are great mechanisms: the database, a queue, cookies, etc.

The 12-factor app describes this well: https://12factor.net/processes


Sounds like sticky sessions


"Sticky" applies after a session has been assigned to a server. I'm interested in control over the assignment algorithm.


Definitely doable with HAProxy + Lua. I've used it extensively for load balancing stateful apps.


I'll definitely take a look at this.

What I have in mind isn't really a "proxy" but more of a login/redirection server.

A "proxy" is middleware which directs all communication through a single server which adds to latency.

What I have in mind will run logins through a single server. But once the login is complete, any further communication is redirected to the proper work server to continue without any proxy middleware involved.

This won't entirely eliminate downtime issues but it does limit the effects to a reasonable level while offering increased efficiency and decreased latency.


Start here:

https://github.com/haproxy/spoa-example

https://www.haproxy.com/blog/extending-haproxy-with-the-stre...

We used that to make an SSO login site that works independently of what is on the backend. The logic was basically:

* if there is no/invalid SSO cookie, SPOA sets a flag which makes haproxy redirect to the SSO app
* if there is a valid cookie, decode it and send the data (usually just the logged-in user name) to the app in a header

Once the cookie is valid it doesn't need the SSO server, so it's pretty fast for users who have already logged in.

It can also be used for blocking requests based on an external engine; it's pretty flexible overall.

https://docs.fastly.com/signalsciences/install-guides/other-...


HAProxy can also do redirects


Did this 20 years ago by having the name of the server as part of the user's profile.

Users 1 through 50 (light users) log in and their profile says they go to app-1.myapp.com. Users 51 through 60 (heavy users) log in and their profile says they go to app-2.myapp.com.

A specific user may pay extra to have a non-shared environment, and this supports that as well.


> Did this 20 years ago by having the name of the server as part of the user's profile.

I had a simple lookup table in mind so it can be easily changed and adjusted as required without affecting the user's profile.


I did something similar way back when: the server identifier was part of the session cookie, IIRC. Whichever server behind the LB started the session got all the requests for that session. It was more like a user load balancer than a request load balancer.


That's called sharding; this can be achieved with any load balancer worth its salt.


This is called sharding, but the question is more about the assignment algorithm(s) supported by the sharding implementations, which are not always flexible enough to support the GP's suggestion.



