Why are these Twitter clones always so resource intensive and finicky? Twitter i...

sigmar · 2024-10-30T04:33:01 1730262781

>extremely competent and high spirited developers are giving up like this.

I'm pretty sure the median IRC server runs for much less than 7.5 years. I don't think anyone expects volunteers to dedicate decades of their life to admin duty. And it seems fine and healthy for the ecosystem that he is telling people they have a few months to move their bots to a different server.

lanstin · 2024-10-30T09:47:48 1730281668

I didn't read TFA but was there a way to find the migrated bots? honestly, for some quirky reason, the bots are a big chunk of my enjoyment of social media (from Opposum's every hour to randomly generated 3 body simulations of suns in 3d to flight tracking to weather alerts to CO_2 levels), and I have so much enjoyed botsinspace bots since joining mastodon (which is by far the most enjoyable/least addictive/least evil social media I've found).

viraptor · 2024-10-30T05:55:56 1730267756

The post shows where the cost is - storage and bandwidth. With IRC servers you're not expected to serve all the history forever, with a website around it, persistent subscriptions, outbound queued notifications, etc. On IRC people also pretty much expect missed messages and splits from time to time. Those are very different services.

kraftman · 2024-10-30T08:28:26 1730276906

Yeah but are those valid reasons? Bandwidth is unlimited for most dedicated servers, and 190 GB for 7 years of data isn't a lot; it could fit on my phone 5 times.
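For a sense of scale, here is a back-of-envelope sketch of the average growth rate those figures imply. It assumes decimal gigabytes and a flat 7 years, and `average_growth` is just an illustrative helper name. It works out to roughly 27 GB a year, or about 74 MB a day.

```python
# Figures quoted in this thread; decimal gigabytes (10**9 bytes) are an assumption.
DB_BYTES = 190 * 10**9
YEARS = 7


def average_growth(total_bytes: int, years: float) -> dict:
    """Return average growth rates for a store that reached total_bytes over `years`."""
    days = years * 365  # ignores leap days; fine for a rough estimate
    return {
        "gb_per_year": total_bytes / years / 10**9,
        "mb_per_day": total_bytes / days / 10**6,
        "bytes_per_second": total_bytes / (days * 86_400),
    }


if __name__ == "__main__":
    for name, value in average_growth(DB_BYTES, YEARS).items():
        print(f"{name}: {value:.1f}")
```

This is only an average; real growth is presumably bursty and skewed toward recent years.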

viraptor · 2024-10-30T08:48:53 1730278133

190GB in the database. That didn't include media. I'm assuming the media part is not served from the same host since that can easily overshadow other traffic.

koito17 · 2024-10-30T04:29:04 1730262544

Not related to Mastodon, but in the case of Matrix, the server software ranges from "runs on a raspberry pi with zero issues" (Conduit) to "even with 16 GB of RAM, federating with a large enough room will exhaust Python's heap" (Synapse).

In the case of Conduit, a Matrix server with a few private rooms and users consumed only 32 MB of RAM, using RocksDB for storage. The equivalent on Synapse required about 5x as much memory, despite using SQLite. In practice, Synapse instances will use Postgres since many appservice plugins specifically require Postgres and don't support SQLite. Not to mention, SQLite isn't optimized for frequent, concurrent writes.

I do sincerely think the choice of Rails, and the fact Ruby only got a compiler people use recently, means that most Ruby programs require fairly beefy processors and plenty of memory in order to keep up with a few hundred clients.

Of course, I am extrapolating based off my experience running Synapse (a Matrix server) with Postgres. There is a chance Mastodon scales much better.

robobro · 2024-10-30T04:20:08 1730262008

From what I understand, IRC doesn't hold messages for extended periods of time or allow media uploads, while fediverse does, so that's one big difference.

numpad0 · 2024-10-30T13:14:03 1730294043

Do they have to? Twitter does and that's noble, but can't the server, say, rotate logs per user, sign them with the webserver cert, and then email them, force a download, or push them to the HTML localStorage thing when the user is on desktop, and then forget about them?

r14c · 2024-10-30T06:18:35 1730269115

it really, really varies by implementation. mastodon is popular (for some reason), but far from the most efficient activitypub server. akkoma derivatives are more limited by postgresql's IO performance than the phoenix app itself. unfortunately, what people know is a really slow rails app.

i haven't personally operated a misskey derivative, but based on my experience writing network servers on node.js it probably performs better than rails XD

the same applies for clients. there are nice native apps and some pretty efficient web clients, but they aren't the default on the most popular server software so nobody uses them.

sureglymop · 2024-10-30T06:07:49 1730268469

Read the post. This specific server was created as a "playground for bots", run by one hobbyist volunteer. Nothing in the post is surprising or says anything about the architecture of Mastodon-like software.

strken · 2024-10-30T04:31:54 1730262714

I think there are issues with hotspots. The most popular tweets are seen by a big chunk of the userbase, which means the service has to operate on a fanout model where each tweet is pushed to the author's individual followers.

I believe IRC doesn't operate like that. The messages delivered to each user don't need to be retained, and I assume the size of the largest channels is in the tens or hundreds of thousands.
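To make the fanout point concrete, here is a minimal in-memory sketch of fan-out on write. It is not how Twitter or Mastodon actually implement it; `FanoutTimeline`, its method names, and the `max_len` cap are all made up for illustration. The thing to notice is that one post costs one write per follower, so a popular account is a hotspot.

```python
from collections import defaultdict, deque


class FanoutTimeline:
    """Toy fan-out-on-write store: each post is copied into every follower's inbox."""

    def __init__(self, max_len: int = 800):
        # author id -> set of follower ids
        self.followers = defaultdict(set)
        # user id -> bounded inbox, newest first (max_len is an arbitrary cap)
        self.inboxes = defaultdict(lambda: deque(maxlen=max_len))

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)

    def post(self, author: str, text: str) -> int:
        """Push the post to every follower and return the number of writes done."""
        targets = self.followers[author]
        for follower in targets:
            self.inboxes[follower].appendleft((author, text))
        return len(targets)

    def timeline(self, user: str) -> list:
        # Reading is cheap: the timeline was already assembled at write time.
        return list(self.inboxes[user])


if __name__ == "__main__":
    t = FanoutTimeline()
    for i in range(1000):
        t.follow(f"user{i}", "celebrity")
    print(t.post("celebrity", "hello"), "inbox writes for a single post")
```

An IRC channel, by contrast, relays a message to whoever is connected and keeps nothing, so there is no per-follower inbox to maintain.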

mardifoufs · 2024-10-30T04:27:35 1730262455

Because twitter isn't just IRC in reverse?

goodpoint · 2024-10-30T08:54:35 1730278475

Mastodon is not a twitter clone.

Unfortunately it's written in ruby and it has no quotas on the amount of images and videos uploaded or downloaded per-account. Also it is not designed to scale horizontally or leverage any form of p2p.

neonsunset · 2024-10-30T07:55:40 1730274940

These clones are often written in unimaginably inefficient languages like Ruby or Python.

If I’m not mistaken, Twitter uses Scala. That would have been a good start. For all the indie-ness, one of these clones could have been written in hand-tuned Rust or C# or Kotlin to respect the resources of people who would run them out of their own pocket. But sadly this has not happened yet.

lifthrasiir · 2024-10-30T06:04:25 1730268265

Do you really think those "extremely competent and high spirited developers" haven't tried? Not only have there been numerous attempts to extend or replace IRC, but those attempts generally understood what is fundamentally different between an ephemeral room-based chat protocol and a protocol that allows efficient traversal, aggregation and streaming of a possibly large social graph and its interactions.

noduerme · 2024-10-30T06:25:00 1730269500

Very incisive. Your post got me thinking: Rather than a federated system like Mastodon, what sort of protocol could (a) function as a temporary, room-based, privately hosted chat, that also (b) encoded the social graph and aggregated interactions in a distributed way that could be polled by any client? It seems like the past 30 years have designed either for the decentralized, chat-first model, or else the centralized social-first model (federated or not). I'm thinking of what an LLM could do in terms of summarizing and compressing both at the client level, so large aggregate searches would know where to look in a decentralized universe of chat rooms to more or less emulate the data-retrieval functionality of a massive centralized social network...

lifthrasiir · 2024-10-30T06:57:07 1730271427

While technically different, relays in the ATProto protocol serve a similar purpose; as far as I understand, a relay can be thought of as a materialized view in an RDB. So if ATProto proves to be successful in the future, extensions to relays might make that possible transparently. (One big limitation of relays right now is that they have to consume the entire repository at once, making it hard for individuals to host their own relays.)

noduerme · 2024-10-30T07:23:05 1730272985

What about just readable fragments of materialized views that are encoded into the messages themselves? That way a sharp local context and a blurrier larger context could be reconstructed from any given message, sort of like a mipmap. And with maybe 10% of the messages in a thread you could reconstruct a fairly accurate representation of the whole thread, at least good enough to run a search on. Every client could serve as a relay that stored its own threads and a constellation of associated mipmaps, and, if some were missing messages, it would be obvious which other clients needed to be checked for the missing portions the next time they logged on. Old/archaic data could be warehoused by clients that chose to do so. No central servers at all, you just crawl client to client looking for the connections, and build your own graph based on what you're looking for.

mplewis · 2024-10-30T04:38:16 1730263096

Scrollback.