This is basically not true, is my point. There is no meaningful "problem" with throwing up a Redis instance in AWS; it just doesn't mesh with my experience of reality.
Speaking from experience, you can get away with not understanding Redis for a while. Then one day you'll wake up and everything will be on fire, because you used Redis wrong, and now your main and replica are in a death spiral DoS'ing each other trying to pass a 1TB replication log back and forth.
You don't need to learn your tools for doing simple stuff under normal circumstances. You need to learn them to do bespoke surgery on live data while everything is on fire and the customers are threatening to fire your company. Or better yet, so that you can avoid doing that altogether by anticipating limitations during design.
That being said, doing everything in Postgres is also going to bite you if you have moderate scale. This is really the same mistake again. Postgres looks like a big truck you can just load up and load up, until you wake up one day and there are cascading failures across all services that touch that database, because you wrote a dumb query that took a lock for 5 entire minutes while it did an HTTP request. Its robustness will lull you into thinking something is working well when it's actually barely working.
(Before you object, yes, it is a better idea not to have multiple services talk to the same database, I hear you. And no, you shouldn't ever hold a database lock while doing an HTTP request, believe me I know. These things can happen.)
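To make that last parenthetical concrete, here's a minimal sketch of the lock-held-across-HTTP anti-pattern, assuming Python with psycopg2 and requests; the table, endpoint, and ids are all made up:

```python
# Illustrative only: holding a Postgres row lock across a slow HTTP call keeps
# every other transaction touching that row waiting for the whole request.
import psycopg2
import requests

conn = psycopg2.connect("dbname=app")  # hypothetical connection string

# BAD: the row stays locked while the HTTP request runs (possibly minutes).
with conn:
    with conn.cursor() as cur:
        cur.execute("SELECT payload FROM orders WHERE id = %s FOR UPDATE", (42,))
        payload = cur.fetchone()[0]
        requests.post("https://billing.example.com/charge", json={"p": payload}, timeout=300)
        cur.execute("UPDATE orders SET status = %s WHERE id = %s", ("charged", 42))

# BETTER: read, release the lock, make the call, then write in a short second transaction.
with conn, conn.cursor() as cur:
    cur.execute("SELECT payload FROM orders WHERE id = %s", (42,))
    payload = cur.fetchone()[0]
requests.post("https://billing.example.com/charge", json={"p": payload}, timeout=30)
with conn, conn.cursor() as cur:
    cur.execute("UPDATE orders SET status = %s WHERE id = %s", ("charged", 42))
```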
Er, I really feel like you’re not understanding how simple AWS makes managing a Redis instance.
I’ve been using Redis for nearly 10 years and it’s been a seamless and pleasant experience. Honestly it sounds like you’re taking your specific experiences and overgeneralizing.
Partly I'm sharing war stories because that's a fun thing to do, I'm not being entirely serious. Partly it's that you said that wasn't your experience of reality, so I broke off a little piece of mine and offered it to you. I'd suggest we're both generalizing, and as long as it's not taken too seriously, that's fine; it's how shop talk works.
I don't know the nature of the applications you've been working on those last 10 years, but it was more or less the main database for a high bandwidth, low latency service I was working on, also using Elasticache.
Problem spaces vary. If you're using it as a cache with modest load and consistency requirements, maybe you never need to understand it. But those sorts of requirements often creep & change out from under you.
So if you're saying, Elasticache did a good job of abstracting Redis, sure, I agree. If you're saying, there is no additional cognitive load to adopting a new service in your data path, because you don't even need to understand it - that puts a shiver up my spine, and makes me hear Pagerduty alerts in my head.
Keep in mind the context of this thread. It's some shitty blog suggesting you do every fucking thing on postgres. Myself and the other guy are simply suggesting that running a single-node instance of redis is an infinitely better and simpler choice than implementing a cache or a job queue on an RDBMS.
I don't feel designing for a guaranteed high-availability application was part of the discussion at all.
If you don't find it applicable, feel free to discard what I'm saying, I take no offense. But I just don't quite understand your perspective. Maybe I have oncall firefighter brain rot, but a distributed job queue is exactly the sort of thing I'd want to be available, and a cache is something I regard as being very dangerous and requiring utmost care.
Yes, mostly. There are a few things to take into account though:
- No multiple queues
- No priorities
- Practically no scheduling (the delivery delay is capped at 15 minutes)
- Creating and tearing down a queue takes a lot of time and the number of queues is subject to AWS account limits
- The FIFO/LIFO semantics (remember: no priorities) will bite you when you least expect it
Unlike Redis, though, it does have great durability, and it will scale to much, much larger queues far more easily.
You can make good job queues out of this, combined with sharding or consistent hashing, for low(ish) latency applications. Each shard has a stream, they operate on data stored in Redis, and you pass them the key to this data over their stream.
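A rough sketch of that shape, assuming Python with redis-py; the key names, shard count, and plain hash-mod sharding (standing in for consistent hashing) are all illustrative:

```python
# Job payloads live in plain Redis keys; workers learn which key to process
# via a per-shard stream.
import json
import zlib
import redis

r = redis.Redis()
NUM_SHARDS = 4  # one worker (or worker group) per shard

def enqueue(job_id: str, payload: dict) -> None:
    data_key = f"job:{job_id}"
    r.set(data_key, json.dumps(payload))
    shard = zlib.crc32(job_id.encode()) % NUM_SHARDS  # swap for consistent hashing if shards move
    r.xadd(f"jobs:stream:{shard}", {"key": data_key})

def worker(shard: int) -> None:
    last_id = "0"
    while True:
        # Block up to 5s waiting for new entries on this shard's stream.
        for _stream, entries in r.xread({f"jobs:stream:{shard}": last_id}, block=5000, count=10) or []:
            for entry_id, fields in entries:
                payload = json.loads(r.get(fields[b"key"]))
                ...  # do the work
                last_id = entry_id
```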
But SQS is great, and a great rebuttal to the article. Totally easier to prototype a job queue that way than with pg, and you probably won't need to move off of it.
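For the prototyping point, the whole thing really is a handful of calls, e.g. with boto3 (queue name and message shape invented for illustration):

```python
# Minimal SQS job-queue sketch: enqueue, long-poll, delete on success.
import json
import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="background-jobs")["QueueUrl"]

# Producer: one call to enqueue a job.
sqs.send_message(QueueUrl=queue_url,
                 MessageBody=json.dumps({"task": "send_email", "user_id": 42}))

# Worker: long-poll, process, delete on success; failed messages simply
# reappear after the visibility timeout.
while True:
    resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        job = json.loads(msg["Body"])
        ...  # do the work
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```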
I have used both SQS and PG-based queues and for the smaller workloads/smaller systems (read: "not very very large systems") I now prefer the latter. There is also a non-trivial amount of stuff that we turned out to need for operating SQS at scale on the application side, basically to compensate for the things SQS does not have. It is great it doesn't have those things, but if you have a smaller application you might want to have those things instead and sacrifice a bit of scalability.
The advantage being that you could sort things to implement priorities and such? Did you use listen()/notify() at all?
ETA: seeing your list of missing features now, that all makes a lot of sense. In my mind the biggest advantage of SQS is that it glues together all the other AWS offerings, so you can go SQS -> Lambda for an instant job queue (with concurrency limits, etc. so you don't blow your hand off - perhaps undermining the simplicity argument). But everything you're saying makes sense if your job queue needs any degree of sophistication.
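For what it's worth, here's roughly what those priorities plus listen()/notify() look like on a PG-backed queue - a sketch assuming psycopg2 and a made-up `jobs(id, state, priority, created_at, payload)` table, not any particular library's real API:

```python
import select
import psycopg2

conn = psycopg2.connect("dbname=app")
conn.autocommit = True  # each statement is its own transaction; fine for this sketch

def claim_one():
    # Priorities are just an ORDER BY; SKIP LOCKED lets many workers pull concurrently.
    with conn.cursor() as cur:
        cur.execute("""
            UPDATE jobs SET state = 'running'
            WHERE id = (
                SELECT id FROM jobs
                WHERE state = 'queued'
                ORDER BY priority DESC, created_at
                FOR UPDATE SKIP LOCKED
                LIMIT 1
            )
            RETURNING id, payload
        """)
        return cur.fetchone()  # None when nothing is queued

def wait_for_work(timeout=30):
    # Producers run `NOTIFY jobs_channel` after inserting; workers sleep on the
    # socket instead of polling in a tight loop.
    with conn.cursor() as cur:
        cur.execute("LISTEN jobs_channel")
    select.select([conn], [], [], timeout)
    conn.poll()
    del conn.notifies[:]
```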
Redis isn’t just a cache. That’s memcached. Also SQS absolutely sucks for a job queue as soon as you want to do anything like control concurrency or have job priorities, but if your needs are simply “I need a background job queue” then SQS is likely a great choice.
The overhead is actually the conversation we're having right now about whether postgres or Redis is better. It's not that postgres is hard to use or less performant, but that there's mental overhead in "Here's how you use Postgres for session management, here's how you use it for application building". "Use Redis for this, Postgres for that" is easier to grok.
That's not the scenario they're describing: their postgres has most likely already been designed and tuned to scale to their workload, and using postgres means you don't have to replicate that work for another system.
As long as it works the first time, everyone on the team is fine installing a local Redis, and the code doesn't make assumptions about read-after-write consistency for jobs, there is very little problem. There will be "problems" (or, rather, things you will find out you haven't accounted for) when, for example, an improper URL is used and your Redis fails over. Or you don't have a replica configured (someone decided "let's save some budget for team XYZ; is this really necessary, it's transient after all"). Or you start using something that saturates your Redis. Or you haven't configured alerting on Redis metrics...
It's all normal stuff, and not the end of the world by a long shot, but it is stuff you need to do, it is more stuff, and it can bite you if you come unprepared after having "just clicked a few instances into existence last year".
Chiming in to concur. Redis is amazing and simple software. You can use a managed service like Elasticache or install the binary on a VM instance. Folks using a relational database for a job queue or a cache when an infinitely simpler and more appropriate tool is available are just making poor technical decisions.
But it's not simpler when you consider all the things you've got to do around and after installing that binary on a VM instance. Consider the overhead of managing it - monitoring it, updating it.
Failover when the node dies. Clustering for high availability?
Backups? For a cache, probably not, for a job queue broker, probably necessary.
Making sure your app only enqueues into Redis when the surrounding database transaction commits, and not when it rolls back.
Getting up and running can be fairly painless, staying running on all edge cases and handling partial failures is what gets you.
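On the transactions point specifically, one common shape is to enqueue only after the commit - a sketch assuming psycopg2 and redis-py, with invented table and key names:

```python
import json
import psycopg2
import redis

conn = psycopg2.connect("dbname=app")
r = redis.Redis()

def place_order_and_enqueue(order):
    with conn:  # commits on clean exit, rolls back (and raises) on error
        with conn.cursor() as cur:
            cur.execute("INSERT INTO orders (payload) VALUES (%s) RETURNING id",
                        (json.dumps(order),))
            order_id = cur.fetchone()[0]
    # Only reached after a successful commit; a rollback never enqueues anything.
    r.lpush("jobs", json.dumps({"type": "send_receipt", "order_id": order_id}))
    # The opposite failure is still possible: the commit succeeds, the lpush
    # fails, and the job is lost. Closing that gap is exactly the "more stuff"
    # being described here.
```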
I'm confused by the scenario where a) a singular postgres install that does everything is acceptable, yet b) as soon as redis comes into the picture, suddenly you need HA, monitoring, and apparently transactions with full ACID integrity?
It's just a nonsensical and unfair comparison. You can run a single Redis instance with normal RDB disk syncs and not update it for years on end without issue. Is that guaranteed resilient? Absolutely not, but that's not the scenario in discussion. We're talking about the context of a bootstrap/MVP scenario, not an enterprise setup.
I'd take a single-node redis job queue every time over an HA citus/postgres cluster improperly acting as a queue.
I think the point is they have a Postgres server running anyway as the datastore, and having the job queue in Postgres gives you HA, backups, and transactions for free. I think Redis in particular won't give you transactions, right?
Needing transactional semantics for jobs alongside an application operation makes a lot of the simpler queue/tool choices difficult.
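For contrast, this is roughly what "transactions for free" buys you with the queue in Postgres - a sketch (psycopg2, made-up tables) where the job row commits or rolls back together with the application write:

```python
import json
import psycopg2

conn = psycopg2.connect("dbname=app")

with conn:  # a single transaction
    with conn.cursor() as cur:
        cur.execute("INSERT INTO orders (payload) VALUES (%s) RETURNING id",
                    (json.dumps({"sku": "abc"}),))
        order_id = cur.fetchone()[0]
        # Enqueued atomically with the write above: no order without its job,
        # and no job without its order.
        cur.execute("INSERT INTO jobs (priority, payload) VALUES (%s, %s)",
                    (5, json.dumps({"type": "send_receipt", "order_id": order_id})))
```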