You bring up a good point about disabling signup on the main server.
I've recently been wondering why the Matrix team wouldn't want to lose some of the control they have over the Matrix universe.
They could solve their scalability problem by bringing up other homeservers (why not a paid riot.im server even?) or promoting other open servers hosted by the community.
Seems like Mastodon folks are favoring horizontal scalability by making people use various servers, but the Matrix team wants to keep mostly everyone on matrix.org.
(I hope this doesn't sound too negative.. I know it's not like Matrix folks want to centralize things.. It's just, I wonder why they don't promote other homeservers more)
The reason we haven't disabled signup is to try to provide a server which we can point people to over which we have some level of control and quality (even if that quality is pretty crap right now).
Once we have account migration sorted out (hopefully coming sooner than expected thanks to work being done for GDPR), then the situation should be much better as folks can flee off the overloaded server onto one of their choice... but first-time newbies don't need to make the complicated and confusing call on picking an alt server.
GP brings up an interesting point, though - couldn't you run multiple smaller homeservers rather than one huge one? I suppose it depends on how much of a challenge that is operationally, but Matrix federation as a whole seems to scale much better than individual homeservers do at this point.
(Users who have already signed up on the matrix.org HS are stuck there until account migration is a thing, of course.)
Splitting matrix.org into smaller homeservers wouldn't necessarily help (even if we had account migration), as you'd just end up switching client<->server traffic for server<->server traffic. As New Vector (the startup we set up to support Matrix), we're also working on providing a homeservers-as-a-service offering, though, which should help.
That's of course true for the massive rooms, such as #matrix:matrix.org, but if there are enough 1:1 conversations to contribute significantly to the overloadedness of matrix.org, splitting the servers up and 'randomly' assigning a server to users would decrease the load.
The problem is that the 1:1 conversations are a negligible weight relative to the massive 15,000 user rooms like #matrix:matrix.org - and spreading those 15,000 users over 15,000 servers rather than 1000 is going to just swap a client<->server API overload problem with a server<->server API overload...
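To put rough numbers on that trade-off, here is a toy back-of-the-envelope model, not a description of how Synapse actually schedules traffic. It assumes users are spread evenly across servers, that the originating server sends one federation transaction per remote server per event, and one client delivery per local user. The function and field names are made up for illustration.

```python
from typing import NamedTuple


class OriginLoad(NamedTuple):
    client_deliveries: int   # client<->server sends by the origin server
    federation_sends: int    # server<->server sends by the origin server


def origin_load_per_event(users: int, servers: int) -> OriginLoad:
    """Toy estimate of the work one event causes on the originating server.

    Assumptions (illustrative only): users are spread evenly over servers,
    and the origin pushes the event once to each other participating server.
    """
    if not 1 <= servers <= users:
        raise ValueError("need 1 <= servers <= users")
    local_users = users // servers       # users the origin serves directly
    remote_servers = servers - 1         # every other server in the room
    return OriginLoad(local_users, remote_servers)


# One big server, the 1000-server case, and the one-server-per-user case.
for s in (1, 1000, 15000):
    print(s, origin_load_per_event(15000, s))
```

Under these assumptions, going from 1000 servers to 15,000 takes the origin's client deliveries from 15 down to 1 but its federation sends from 999 up to 14,999, which is the swap described above.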
Do you know what exactly the bottleneck is on Synapse? Is it syncing everything to users, or the actual room processing? I wonder how much of an impact JSON (de)serialization has, assuming you don't cache serialized requests.
We do. The main bottleneck is merge resolution when unifying your copy of your room with everyone else’s. If the room starts to fragment due to netsplits or unreliable servers then this can get incredibly resource intensive. https://github.com/matrix-org/synapse/pull/3122 is the fix which switches the algorithm from roughly O(N) to O(1).
For context, a typical Synapse actually only uses around 300MB of RAM. It only spikes up to 1-2GB when trying to resolve state on big rooms like Matrix HQ, and then Python doesn’t relinquish the RAM.
We do cache responses in JSON to avoid serialisation overheads.
Am I reading this right that this is the algorithmic fix, but it still needs a concrete implementation - and then the main server should be back down to "vertically scalable"?
Without going and reading the doc (yet); does this relate to:?
It would seem that a simpler, deterministic merge algo would be possible to parallelize - but I'm not sure if it's easy to match Matrix's idea of merges with what's discussed in that post/paper?
The algorithm is already being implemented in Synapse (but got delayed by dealing with some security bugs). There's also a Rust test jig for playing with algorithms around merge resolution at https://github.com/erikjohnston/rust-matrix-state/.
It's not directly related to the Categorical Theory of Patches paper - the merge resolution here is much simpler than reasoning about VCS patches, although the approach of taking a formal mathematical approach is similar :)
I'm on the public Matrix.org server and I recommend new folks to sign up on the public server unless they're 100% sure they're willing to commit to the devops burden of maintaining their own homeserver.
I also know at least a half dozen people who immediately said "yes" to the devops burden of maintaining their own homeserver (and reportedly, it's actually really easy).
Different strokes for different folks. If self-hosting were a pre-req for using it, I'd still be saying to myself "yeah, I'll do that riiiight after I get my closet k8s cluster just the way I want it", and, well, I know myself better than that, so I use the public one, and I'm happy.
Mastodon gets away with just telling people to go find a server to register with; you don’t have to self-host to get some reasonable level of decentralisation.