Hacker News

The major benefit of putting your queue in your RDBMS, which isn't commonly brought up in these discussions, is that it lets you protect your jobs with the same ACID guarantees as the rest of your data. This is very valuable for some use cases.

I have a Postgres-based job queue that uses advisory locks to get around some of the drawbacks he mentions (job lock queries don't incur writes or block one another like SELECT FOR UPDATE would). Feedback is welcome: https://github.com/chanks/que




You can use messaging to distribute the work, but let all workers access the same RDBMS; that way you pretty much get the same ACID properties you are used to.


We are currently building a system that does this, but it feels inefficient.

First, write the job to the DB, then put a message on the queue to notify the job processor(s) that there's work to be done. The processor then updates the DB to indicate that the work is done.

The goal is, of course, to prevent polling the DB via repeated queries for jobs. But wedging in an entirely new layer/API/messaging server just feels like overkill for the simple task of initiating a job.
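For anyone following along, the flow described above (write the job, notify, mark done) can be sketched roughly as follows. This is only an illustration under assumptions: sqlite3 stands in for the RDBMS, an in-process queue.Queue stands in for the messaging server, and the names submit_job and process_next are made up for the example.

```python
import queue
import sqlite3


def make_db():
    # In-memory SQLite database standing in for the shared RDBMS (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, "
        "payload TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def submit_job(conn, notifications, payload):
    # Step 1: write the job to the DB; "with conn" commits on success.
    with conn:
        cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    job_id = cur.lastrowid
    # Step 2: notify only after the commit, so a worker never sees an
    # id that is not yet visible in the database.
    notifications.put(job_id)
    return job_id


def process_next(conn, notifications, handler):
    # Take one notification if there is one; no polling of the jobs table.
    try:
        job_id = notifications.get_nowait()
    except queue.Empty:
        return None
    row = conn.execute(
        "SELECT payload, done FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
    if row is None or row[1]:
        return None  # unknown job, or already finished
    handler(row[0])
    # Step 3: record in the DB that the work is done.
    with conn:
        conn.execute("UPDATE jobs SET done = 1 WHERE id = ?", (job_id,))
    return job_id
```

Note that the message only carries the job id; the DB stays the source of truth. If the process dies between the commit and the put, the job exists with no notification, so a real system would still want an occasional sweep for unfinished rows.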


I have two answers to that.

First, scaling often involves some things that feel unnecessary or inefficient.

Second, once you have a messaging system, you find lots of good use cases for it (other kinds of notifications, logs, statistics, integration with other systems, ...).


ACID can be done with messaging systems as well, within the realm of messaging that is. The two rules are "don't use a messaging system as a database" and "don't use a database as a messaging system." In a product like WebSphere MQ (I don't have a lot of experience with the open source messaging systems), not losing messages, atomicity, etc. are important use cases.


My point is that the only way to wrap your jobs in the same transactions as the rest of your data is to have your job queue in your RDBMS. If you don't have that, you can't guarantee that they are consistent, that your backups have snapshots of each at the same time, etc.

Inconsistency between jobs and the rest of your data isn't a problem for many (or even most) use cases, but there are certainly times when you need them to be consistent.
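To make that concrete, here is a minimal sketch of a job being enqueued in the same transaction as the data it depends on. It uses sqlite3 purely so the example is self-contained; the orders and jobs tables and the place_order function are hypothetical, and the same idea applies to any transactional RDBMS.

```python
import sqlite3


def make_db():
    # In-memory SQLite database; table names are made up for this example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, "
        "order_id INTEGER NOT NULL, "
        "kind TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def place_order(conn, item, fail=False):
    # "with conn" commits if the block succeeds and rolls back if an
    # exception escapes, so the order and its job land together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        conn.execute(
            "INSERT INTO jobs (order_id, kind) VALUES (?, ?)",
            (cur.lastrowid, "send_receipt"),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")


def counts(conn):
    # Returns (number of orders, number of jobs).
    orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    jobs = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
    return orders, jobs
```

A failed place_order leaves neither an order without a job nor a job without an order, which is exactly the guarantee you lose once the queue lives in a separate system.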


Use a two-phase commit to go into and out of your database when your message needs to be turned into data. There is no reason your data in the database can't be consistent with your messaging system.
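As a rough illustration of the shape of a two-phase commit, here is a toy in-memory coordinator. Everything in it is an assumption made for the sketch: the Participant class and its prepare/commit/rollback methods are hypothetical and are not the API of any real database or messaging product, and a real implementation would also need durable logs and crash recovery.

```python
class Participant:
    """Toy resource (e.g. a database or a message queue) for illustration."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # set to False to simulate a "no" vote
        self.pending = None
        self.committed = []

    def prepare(self, value):
        # Phase 1: stage the change and vote yes, or vote no.
        if not self.can_prepare:
            return False
        self.pending = value
        return True

    def commit(self):
        # Phase 2 (success): make the staged change permanent.
        self.committed.append(self.pending)
        self.pending = None

    def rollback(self):
        # Phase 2 (failure): discard the staged change.
        self.pending = None


def two_phase_commit(participants, value):
    prepared = []
    # Phase 1: ask every participant to prepare.
    for p in participants:
        if p.prepare(value):
            prepared.append(p)
        else:
            # Any "no" vote aborts everyone that already prepared.
            for q in prepared:
                q.rollback()
            return False
    # Phase 2: all voted yes, so commit everywhere.
    for p in participants:
        p.commit()
    return True
```

With a database and a queue both passed in as participants, the message and the row either both become visible or neither does, at the cost of an extra round trip and a coordinator that has to survive failures.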


You can provide the ACID guarantees by using something like Redis.


It can only be consistent with the rest of your data if the rest of your data is also in Redis.


i'd recommend zookeeper or consul for orchestration, distributed locks, two- and three-phase commit state, etc. instead of redis if you want to have your transaction state highly available. consul uses raft and zookeeper uses zab, both of which are proven, instead of a home-grown, mostly-works algorithm (see aphyr's work with jepsen).



