The Flux terminology is confusing to me; it looks like the Observer pattern:
* Stores contain Observables
* Components (or Views) contain Observers
* Actions are Proxies
So the article is basically saying the Observer pattern is scalable, but uses the buzz-phrase "Full Stack Flux" instead. To make it even worse, it is only a theoretical application of the pattern.
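Concretely, the mapping looks something like this (plain TypeScript, no Flux library; the names are mine, so treat it as a rough sketch rather than how any particular implementation does it):

    // A Store is an Observable: it owns state and notifies subscribers when it changes.
    type Listener<T> = (state: T) => void;

    class Store<T> {
      private listeners: Listener<T>[] = [];
      constructor(private state: T) {}
      subscribe(listener: Listener<T>) { this.listeners.push(listener); } // Observers register here
      getState() { return this.state; }
      update(next: T) {                          // called by the dispatcher, never by views
        this.state = next;
        this.listeners.forEach(l => l(this.state));
      }
    }

    // A Component is an Observer: it re-renders whenever the Store notifies it.
    const counterStore = new Store({ count: 0 });
    counterStore.subscribe(s => console.log(`render: count=${s.count}`));

    // An Action is the layer of indirection (the "Proxy" above): callers describe
    // what happened, and the dispatcher decides how the Store changes.
    function dispatch(action: { type: "INCREMENT" }) {
      if (action.type === "INCREMENT") {
        counterStore.update({ count: counterStore.getState().count + 1 });
      }
    }

    dispatch({ type: "INCREMENT" }); // -> render: count=1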
I suppose there is some value to this as a thought experiment, but pretty much every tier in that architecture has breaking flaws. The most relevant is that anything with an audience of 1 million users cannot run on an architecture that has single points of failure.
Hardware fails. This architecture has single points of failure, which means that if one of those instances fails, the entire system goes down. That is generally not acceptable for environments with 1 million active users (which implies high business value).
AFAICT, Facebook engineered their client-side code around Flux in order to eliminate two-way data binding in their user interface code, which leads to all sorts of issues. I don't imagine they push the pattern into the server. Their Relay stuff still relies on a smart data tier which understands a query language called GraphQL.
I definitely don't think they considered implementing a dumb dispatcher and store layer on the database server using stored procedures. (This seems terrifying to me, I don't see the upside.)
It's an interesting experiment but I think this might be an example of being too aggressive in trying to generally apply a design pattern that was motivated by a specific problem.
Great post, but has this design been implemented to effectively handle something close to a million users (or even 100k, showing that no part is overheating)?
As with every new and complicated design, I'm a bit skeptical of rule-of-thumb calculations. You never know what the wrong latency issue in the wrong place can do...
The design is almost comically bad (a single source of truth for a "scalable" chat app?!). This is either an attempt at parody or this guy must be suffering from a rather severe case of second-system effect...
Oh, you're absolutely right, 10M/s is a mistake, which I've corrected.
A few remarks, though:
- the corrected figure is still way above the number of actions we're talking about here (~100k/s)
- since redis is only used as a generic MQ and not as a store, it can be sharded at the app level without the pain usually associated with redis clustering (a rough sketch of what I mean follows this list)
- I've deployed a similar (but less performant) design for the player of a gaming website; it has been in production for more than a year and works like a charm (we're talking ~5-50k users per channel on a daily basis). This is definitely a "second-system" pattern, but I try to avoid the associated pitfalls :)
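To illustrate the app-level sharding point above, here's a rough sketch of what I mean, assuming ioredis and a trivial hash of the channel name (the host names and shard list are made up):

    import Redis from "ioredis";

    // One plain redis instance per shard; no redis cluster involved.
    const shards = [
      new Redis(6379, "redis-0"),
      new Redis(6379, "redis-1"),
      new Redis(6379, "redis-2"),
    ];

    // Pick a shard deterministically from the channel name.
    function shardFor(channel: string): Redis {
      let h = 0;
      for (const c of channel) h = (h * 31 + c.charCodeAt(0)) >>> 0;
      return shards[h % shards.length];
    }

    export function publish(channel: string, payload: unknown) {
      return shardFor(channel).publish(channel, JSON.stringify(payload));
    }

    export function subscribe(channel: string, onMessage: (msg: string) => void) {
      // Dedicated connection: a redis connection in subscribe mode can't run other commands.
      const sub = shardFor(channel).duplicate();
      sub.subscribe(channel);
      sub.on("message", (_ch, msg) => onMessage(msg));
      return sub;
    }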
Have you ever used redis MQ at scale? I'm guessing no; the redis server is not your bottleneck. The fact that every server proc has to parse every message puts a hard ceiling on the amount of traffic you can handle. Intelligent routing is, I believe, the answer here. I've spoken with antirez on this and he agrees that at the scales you're talking about, redis MQ doesn't cut it.
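To make "intelligent routing" concrete (a sketch of the general idea, not of any particular system, assuming ioredis): each process should only subscribe to the channels for the rooms it actually serves, instead of one firehose channel that every process has to parse.

    import Redis from "ioredis";

    const sub = new Redis();

    // Firehose approach (what hits the ceiling): every process parses every message.
    // sub.subscribe("global-actions");

    // Routed approach: subscribe only to rooms with websocket clients on this process.
    const localRooms = new Set<string>();

    export async function joinRoom(roomId: string) {
      if (!localRooms.has(roomId)) {
        localRooms.add(roomId);
        await sub.subscribe(`room:${roomId}`); // parsing cost is paid only for local rooms
      }
    }

    sub.on("message", (channel, raw) => {
      const action = JSON.parse(raw);
      // forward the action to the sockets joined to this room on this process
      console.log(`deliver to ${channel}:`, action);
    });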
How about just not making wild claims about byzantine fantasy designs that you never tested under any kind of load?
There has been a lot of research into messaging architectures, and some of the best message brokers are free. As it happens, none of them bear any resemblance to your proposed design.
RabbitMQ has been benchmarked[1] to 1 million messages/sec on 30 servers and works very well for many people.
I think I may have failed to express my point, though. I'm not building a message queue, as it is certainly a very hard problem that has been engineered for years by people way smarter than me :) I'm merely leveraging the goodness of their implementations (in my case redis, but RabbitMQ is also an option I've considered explicitly in my post).
The chat is a contrived example to show that even under high load, full-scale flux over the wire is a reasonable option. As for "any kind of serious load", well, maybe my example fails to meet the requirements, but unless I'm building Facebook, I think I've faced something serious enough to be able to think about my next step.
If you're building a large scale chat service you are implicitly also building a message queue.
And as for high load: you haven't actually experienced it until you put this into production with a million users.
To make that clearer: you can design a system for any number of users; the only relevant question is how it holds up in practice, and as long as you haven't had a million concurrent users you just don't know (and it probably won't).
That may be the kernel of the problem here; you built a subset of a message queue without realising it.
RabbitMQ has a websocket plugin[1]. Just make your javascript connect directly to a RabbitMQ cluster and you have a solid, scalable foundation - almost for free.
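For reference, connecting the browser straight to the broker is roughly this much code. This is a sketch assuming the rabbitmq_web_stomp plugin on its default endpoint and the @stomp/stompjs client (adjust the URL and credentials for a real cluster):

    import { Client } from "@stomp/stompjs";

    const client = new Client({
      brokerURL: "ws://localhost:15674/ws",          // default Web STOMP endpoint
      connectHeaders: { login: "guest", passcode: "guest" },
      reconnectDelay: 2000,                          // auto-reconnect if a node goes away
    });

    client.onConnect = () => {
      // Topic destinations give you fan-out: every subscriber to the room gets each message.
      client.subscribe("/topic/chat.room42", (frame) => {
        console.log("received:", frame.body);
      });

      client.publish({
        destination: "/topic/chat.room42",
        body: JSON.stringify({ user: "alice", text: "hello" }),
      });
    };

    client.activate();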
I was under the impression that the design was indeed never tested, but at least the author had some real life experience of building a moderately sized chat service.
I can understand why people like you are pissed off by this kind of blog post, which reads a bit too much like an ad, but I think it's still good that people are trying to reinvent the wheel with completely new technologies, because sometimes it leads to surprising results.
Maybe the OP should add some warnings to the blog, saying that it's a highly experimental design that people shouldn't try to use for their own projects at the moment...
I agree. "Number of messages a single Node broker can handle: ~10k per second without JSON stringification memoization, ~100k per second with JSON stringification memoization (Nexus Flux socket.io does that for you)." <- Can't be possible. The numbers should be lower with JSON stringification/parsing instead of being 10x.
GameRanger has six million users, tens of thousands to hundreds of thousands of whom will be active simultaneously. The chat problem involved has moderate fan-out.
Scott Kevill does it on a single machine (last time I checked) with hand-rolled C++ and close attention to the details of how the Linux networking stack works.
This could be the inspiration for a great open source project, and become something that could easily be deployed to a cloud hosting platform. Basically it's the same as Firebase, but with some of the React and Flux goodness like server-side rendering. Or somebody could package this as a product.
One problem I have with using postgres LISTEN/NOTIFY as a general-purpose message queue is that it requires polling (at least that was the case when I last looked at it). Of course you can use a blocking wrapper around the polling code, but it still causes unnecessary roundtrips.
Your database connection is just a socket, so you can add that file descriptor to the set of file descriptors you are waiting on for IO, if you are using a classic select/poll based system. See an example in the psycopg2 docs here: http://initd.org/psycopg/docs/advanced.html#asynchronous-not...
Once that FD is active, you call the poll() method and your notify payload becomes available to you.
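The same no-polling behaviour is available from Node, for what it's worth. A minimal sketch assuming the node-postgres ("pg") package, which watches the connection's socket and surfaces NOTIFY payloads as events:

    import { Client } from "pg";

    const client = new Client({ connectionString: "postgres://localhost/mydb" });

    async function main() {
      await client.connect();

      // The driver waits on the connection's file descriptor; no polling loop needed.
      client.on("notification", (msg) => {
        console.log(`NOTIFY on ${msg.channel}:`, msg.payload);
      });

      await client.query("LISTEN chat_events");
    }

    main().catch(console.error);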
You are right. It seems that this was an issue with older versions of libpq: "In prior releases of libpq, the only way to ensure timely receipt of NOTIFY messages was to constantly submit commands"
http://www.postgresql.org/docs/9.4/static/libpq-notify.html