Hacker News
Don't Use an RDBMS for Messaging (functionwhatwhat.com)
92 points by DrJokepu on Sept 27, 2014 | hide | past | favorite | 86 comments



The major benefit of putting your queue in your RDBMS, which isn't commonly brought up in these discussions, is that it lets you protect your jobs with the same ACID guarantees as the rest of your data. This is very valuable for some use cases.

I have a Postgres-based job queue that uses advisory locks to get around some of the drawbacks he mentions (job lock queries don't incur writes or block one another like SELECT FOR UPDATE would). Feedback is welcome: https://github.com/chanks/que
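A rough Python analogue of that advisory-lock idea (purely illustrative, not que's actual code): each job gets a try-lock, and a worker that can't acquire it skips the job instead of blocking, much like `pg_try_advisory_lock(job_id)`.

```python
# Simulate non-blocking advisory-lock job claiming with threading locks.
import threading

jobs = {1: "resize photo", 2: "send email"}
job_locks = {job_id: threading.Lock() for job_id in jobs}

def claim_next_job():
    """Return the first unclaimed job id, or None. Never blocks --
    acquire(blocking=False) plays the role of pg_try_advisory_lock."""
    for job_id, lock in job_locks.items():
        if lock.acquire(blocking=False):
            return job_id
    return None

a = claim_next_job()  # claims job 1
b = claim_next_job()  # claims job 2 without waiting on job 1's lock
c = claim_next_job()  # nothing left unclaimed
```

The point is that a second worker never queues up behind the first the way it would with SELECT FOR UPDATE.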


You can use messaging to distribute the work, but let all workers access the same RDBMS; that way you pretty much get the same ACID properties you are used to.


We are currently building a system that does this, but it feels inefficient.

First, write the job to the DB, then put a message on the queue to notify the job processor(s) that there's work to be done. The processor then updates the DB to indicate that the work is done.

The goal is, of course, to prevent polling the DB via repeated queries for jobs. But wedging in an entirely new layer/API/messaging server just feels like overkill for the simple task of initiating a job.
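That flow can be sketched in a few lines (sqlite and an in-process queue standing in for the real DB and broker; all names illustrative): the job row is the source of truth, and the queue message only wakes the worker.

```python
import sqlite3, queue

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
wakeups = queue.Queue()  # stands in for the message broker

def enqueue_job():
    cur = db.execute("INSERT INTO jobs (status) VALUES ('pending')")
    db.commit()                 # the job row is durable before anyone is notified
    wakeups.put(cur.lastrowid)  # the message says only "there's work to be done"
    return cur.lastrowid

def process_one():
    job_id = wakeups.get()      # worker wakes on the message...
    db.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
    db.commit()                 # ...and records completion back in the DB
    return job_id

job_id = enqueue_job()
done_id = process_one()
status = db.execute("SELECT status FROM jobs WHERE id = ?", (job_id,)).fetchone()[0]
```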


I have two answers to that.

First, scaling often involves some things that feel unnecessary or inefficient.

Second, once you have a messaging system, you find lots of good use cases for it (other kinds of notifications, logs, statistics, integration with other systems, ...).


ACID can be done with messaging systems as well, within the realm of messaging that is. The two rules are "don't use a messaging system as a database" and "don't use a database as a messaging system." In a product like WebSphere MQ (I don't have a lot of experience with the open source messaging systems), not losing messages, atomicity, etc. are important use cases.


My point is that the only way to wrap your jobs in the same transactions as the rest of your data is to have your job queue in your RDBMS. If you don't have that, you can't guarantee that they are consistent, that your backups have snapshots of each at the same time, etc.

Inconsistency between jobs and the rest of your data isn't a problem for many (or even most) use cases, but there are certainly times when you need it.


Use a two-phase commit to go into and out of your database when your message needs to be turned into data. There is no reason your data in the database can't be consistent with your messaging system.


You can provide the ACID guarantees by using something like Redis.


It can only be consistent with the rest of your data if the rest of your data is also in Redis.


i'd recommend zookeeper or consul for orchestration, distributed locks, two- and three-phase commit state, etc. instead of redis if you want your transaction state highly available. they use proven consensus protocols (raft for consul, zab for zookeeper) instead of a home-grown mostly-works algorithm (see aphyr's jepsen work).


My rule of thumb is: First build it with a database, then break things off using NoSQL/Queues/Memfs/etc. after scale increases to the point where it becomes a pain. It's never a good idea to start something built in the most optimized way possible (in terms of performance). In my experience, 99% of the time actual performance numbers are way lower than what you expected anyway.


I don't consider using a messaging system if you need messaging premature optimization at all. That's just using the right tool for the job.


"Messaging" and "messaging system" are too nebulous.

Why are you not using a mail server?

I'm totally serious. Been there, done that (used qmail as a message bus). It works great for certain types of requirements.

Sometimes you badly need persistence. Sometimes it doesn't matter. Sometimes you need massive scalability. Sometimes you need reporting. Sometimes you need to be able to provide a dashboard to visualize pending messages. Sometimes you need to be able to publish to large numbers of listeners at once. Sometimes a small number of listeners needs to aggregate events from thousands or millions of sources. Sometimes you need a message to only be processed exactly once.

Often you can nail down some of those issues from the outset. As well as scale. But often you won't know in advance.

More importantly: Your system will almost certainly use a database server. Unless you have clear indications that your needs can't be met by the database server, adding a "messaging system" is one more dependency which you don't yet know whether or not you'll actually need. One more thing that can fail.


> Why are you not using a mail server?

If that's the right tool for the job, then I wouldn't argue with you. But my messaging needs are typically solved by tools like WebSphere MQ, RabbitMQ, etc.


I don't see how this applies to messaging systems at all.

It's actually harder to implement a real-time messaging system with a database or key/value store than it is to use something like RabbitMQ (messaging) or Beanstalkd (for jobs).


There is an interesting post about locking and performance when implementing queues in PostgreSQL by one of the PostgreSQL contributors here: http://johtopg.blogspot.se/2010/12/queues-in-sql.html


This is a great blog post. Anyone implementing this sort of thing in Pg needs to read it.


Link doesn't work for me. Has it been moved?



Two reasons why I have ended up using an RDBMS for messaging: 1) Transactions can easily cover both DB and messaging operations (distributed transactions are not easy/possible on all platforms). 2) The database is there anyway; no need to add another critical component to the stack.


I think a better reason is that jobs and messages can be relational with the other tables in the system, and so can provide the same guarantees that the rest of the db offers.


This is a topic that I'm actively researching on for my startup (www.vacationlabs.com) and would love to know what other experienced people think.

Our Rails app has grown over time and we ended up doing everything in a monolithic way to get stuff out faster. However, now the app is so big and tightly coupled that newer members of the team find it very hard to contribute. Therefore we are thinking of breaking it down into smaller apps which talk to each other using some sort of APIs -- either JSON/HTTP or MQ.

The MQ use case being considered is this -- the core transactional part of the system handles just that - maintaining tour availability and keeping track of payments. Everything else will be hived off to a mini-app which is notified of booking/payment related changes via a pub/sub model. So the email notification system will subscribe to the new-booking, booking-cancelled, payment-received, etc events and will send out emails appropriately. The billing system will subscribe to another set of events and will bill the customer appropriately. And so on.

Is this the right use case for an MQ? Is using an MQ worth the additional dev-ops complexity that it brings along? In our case message delivery needs to be guaranteed, or extremely important business functions will not work. How do we deal with MQ unavailability? In normal cases, if the DB is unavailable your system is down. Is this how the MQ should also be treated? If not, how do you deal with the situation where the core DB transaction is complete, but for some reason you're unable to publish an event to the MQ? If the pub/sub system were built on top of the DB itself this problem would not arise, because publishing an event would be part of your DB transaction itself.

Is there a sane way to build a pub/sub architecture on top of a DB, especially Postgres, which is what we're using? If not, any recommendations for which MQ we should be using for a guaranteed-delivery pub/sub model?
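The transaction-coupled idea in that last question is often called the "transactional outbox" pattern. A minimal sketch (sqlite and all names purely illustrative): the event is written to an outbox table in the same transaction as the booking, and a separate relay later pushes unpublished rows to the MQ.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bookings (id INTEGER PRIMARY KEY, tour TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, event TEXT, published INTEGER DEFAULT 0)")

def create_booking(tour):
    with db:  # one transaction: booking and its event commit or roll back together
        db.execute("INSERT INTO bookings (tour) VALUES (?)", (tour,))
        db.execute("INSERT INTO outbox (event) VALUES (?)", ("new-booking:" + tour,))

def relay_unpublished():
    """A relay process would push these rows to the MQ, then mark them published."""
    rows = db.execute("SELECT id, event FROM outbox WHERE published = 0").fetchall()
    for event_id, _ in rows:
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
    db.commit()
    return [event for _, event in rows]

create_booking("goa")
events = relay_unpublished()
```

If the MQ is down, the events simply wait in the outbox; nothing is lost as long as the DB transaction committed.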


While it sounds like it could be a case for a queue, the fact that your workflow is mission-critical is itself a reason not to use one. In particular, a reason not to use RabbitMQ.

Some message queues are more reliable than others. RabbitMQ is designed to be clustered, and its handling of partition tolerance has been shown [1] to be very bad, something that I have experienced personally in a production system. Don't ever use it if losing messages will be a problem; and never use it with automatic acking (you'll want messages to be retried if your workers die mid-stream). RabbitMQ can be reliable if your boxes are all on a native (not cloud VM-based) LAN that is stable, and your machines don't occasionally get so overloaded that it impacts network connectivity.

One possibility is to use a message queue purely for signaling, not for state -- instead, use databases and APIs to transmit actual state. For example, let's say you want to shoot off an email every time there is a new booking. The "emailer" app listens to events published by the "booking" app. But the event doesn't contain any information about the booking; instead, the event simply says that there was a booking. When the "emailer" app receives this event, it asks the "booking" app for new bookings that it doesn't know about; it processes each booking, first recording a (booking_id, email_id) row, then firing off the email, then committing that row.

This makes every participant in the workflow idempotent, because they can run the same piece of logic many times and still produce the same result. If you ever have a problem with the queue going down, you can simply execute the exact same code: You don't need to replay any missing events. You only need to worry about multiple listeners (multiple "emailer" workers) waking up from the same notification and doing the emailing for the same bookings. This is why you must update your email log table transactionally. You don't necessarily need database locks, but such a system needs some kind of locking to be absolutely atomic.

The nice thing about this workflow is that you can make it extra bulletproof by making the "check for bookings to email about" logic run, say, every ten minutes -- in addition to responding to the message queue events. If the queue isn't working, your app will still be doing the emailing, just a little slower. In other words, the queue simply becomes a trigger mechanism.

[1] http://aphyr.com/posts/315-call-me-maybe-rabbitmq
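The signal-only pattern above, reduced to its skeleton (in-memory structures and all names illustrative): the handler is safe to run repeatedly, from the queue or from a cron fallback, because it only acts on bookings with no log entry yet.

```python
bookings = [101, 102, 103]  # state owned by the "booking" app
email_log = set()           # per-booking log rows owned by the "emailer" app
sent = []                   # emails actually fired

def handle_new_booking_event():
    """The event carries no state; we pull what we haven't seen and
    record each send so duplicate wake-ups are harmless."""
    for booking_id in bookings:
        if booking_id not in email_log:
            email_log.add(booking_id)  # record the log row
            sent.append(booking_id)    # then fire the email

handle_new_booking_event()
handle_new_booking_event()  # duplicate event: no double emails
```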


While the article isn't wrong, it's increasingly my belief that "messaging" is something of a misnomer, because it's used to refer to two rather different concepts: Events and jobs.

We've been using RabbitMQ for messaging for a long time. Some messages are purely reactive: An object was changed, so some other subscribing application needs to react by updating its data structures to reflect the change. This is an event.

Other messages represent jobs. Jobs are tasks that are queued up for processing. Processing photos, for example, is a classic job.

Jobs don't fall that well into the RabbitMQ/AMQP framework because it's designed for messaging and is, despite support for things like durability and dead-letter exchanges, largely about ephemeral data. RabbitMQ is pretty good with events; its excellent routing support, for example, makes it trivial to build a simple pub/sub architecture, and it's pretty fast.

But one big difference between events and jobs is that events are "fan out" (every logical type of listener gets routed one copy of each event, because they may be subscribing for different reasons), whereas jobs are directed (there is only one processor that can do the job, and there is always just one processor working on a single job at any given time).

Another difference is that a job can be in various states. With events, you don't care about completion: A consumed event has been consumed. But with jobs it's terribly useful to track the final status of everything. Jobs should be pausable and long-running jobs should be able to report on their progress ("34% of subjects mutated") along the way. Jobs should arguably be capable of being re-run, even if they succeeded. The performance of jobs -- how long they took, how long they had to wait, their failure rate, the rate of their various progress metrics (516 photos processed), etc. -- is something you want to track and aggregate.
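Those job states amount to a small state machine; a minimal sketch (states and transitions are illustrative, not from any particular system):

```python
# Which states a job may move to from each state; "failed" and even
# "succeeded" can go back to "queued" for a re-run, as argued above.
VALID_TRANSITIONS = {
    "queued":    {"running", "cancelled"},
    "running":   {"paused", "succeeded", "failed"},
    "paused":    {"running", "cancelled"},
    "failed":    {"queued"},
    "succeeded": {"queued"},
    "cancelled": set(),
}

class Job:
    def __init__(self):
        self.state = "queued"
        self.progress = 0.0  # e.g. "34% of subjects mutated"

    def transition(self, new_state):
        if new_state not in VALID_TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

job = Job()
job.transition("running")
job.progress = 0.34
job.transition("succeeded")
```

Stored as rows, each transition can also be logged with a timestamp, which gives you the latency and failure-rate metrics for free.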

But as it happens, relational databases excel at the sort of stuff you need for jobs. I haven't made any decisions yet, but I'm really itching to create a job management app on top of Postgres, unless I can find an existing system that is equally stable. I've played with the idea of a hybrid system: Register jobs in Postgres, use RabbitMQ as the trigger mechanism. But this puts a burden on the processor, which now has to use two APIs.

(On a related note, a big pet peeve of mine is how flaky RabbitMQ is. Being used to working with Postgres, which is famously stable, RabbitMQ is easily the least reliable piece of software in our application stack. We lose messages every week due to network instability and RabbitMQ bugs. There's something to be said for Postgres' master/slave design here. It may not be a cool clustered Erlang project, but it just works.)


We used an rdbms for our workflow engine (what you are calling jobs) and it works great. We needed to react to state changes and process workflows within seconds, with dozens of workers executing different jobs in parallel. This is well within the limits of any rdbms.

The workflow system itself took a week or so to build along with a client library and our first actual workflow. Trying to add a messaging system on top for push notification would have been crazy. Polling every second for work in the db works just fine for us. I'm sure we could lower that polling interval if we needed to (we might never get it as low as RabbitMQ, but that is fine).

So if your tps and latency needs fit in an rdbms, I would recommend just operating one system instead of two.


Sounds like a pragmatic solution. If you're using Postgres you can use "listen" and "notify" to avoid polling, though.


The hard part of distributing work from a central dispatcher is that you wind up with a central controller needing a complete model of the overall system.

I work on a component of the Cloud Foundry PaaS. The current generation works more or less as you describe -- there's a cloud controller backed by a PostgreSQL database. It sends start and stop commands to a pool of execution agents over a messaging bus.

The next generation of that runtime system ("Diego") is actually inverting this control. Execution agents no longer receive instructions to perform tasks or long-running processes. Instead they bid on them in an auction.

The controller is radically simplified and the overall system converges on a satisficing solution for distributing work efficiently (and they ensure it does so with extensive simulations based on the actual code). It's actually more robust because it doesn't rely on heavy efforts to ensure that the controller is always up to date. And it doesn't fall into inconsistent states if the controller vanishes.

Onsi Fakhouri did a brief presentation which is worth watching: https://www.youtube.com/watch?v=1OkmVTFhfLY


I would be interested in a debate over these strategies between the cattle.io creator and one of the cf guys..


Essentially it's based on experiences of scaling up CF installations for customers, and for the public service version (Pivotal Web Services).

A high-intelligence controller turns into a giant "god object". It attracts new functionality like a magnet. Everything depends on it and it has to know everything about everything.

In practice it begins to be both the leading cause of failure and the main scaling bottleneck.

The auction-based model allows any number of auctioneers and executors to participate without any one component needing global state.

Essentially, we rediscovered the economic calculation problem. And we're solving it by distributing knowledge across the system and letting agents solve their own problems locally.


Creator of cattle.io here. There are far too many topics to address here, but I’ll give you a brief overview of how cattle.io works. The high level orchestration system’s role is to manage the state of resources and their state changes. When states are changed events are fired that may or may not be required to be handled to ensure that that state transition can be completed. The important thing here is that the high level orchestrator does not actually perform the key logic to complete an operation. That logic is delegated to an agent or microservice. The scheduling of which agent or microservice handles each event is specific to the nature of the operation (long running, targeted to hypervisor, etc).


Thanks for the reply Darren. I had actually read all of the cattle.io docs and have started poking through the source code. My initial reaction was that I really liked how the orchestration of resources was modelled as a type of state machine. It really clicked for me and I could see how something like this would have value on its own as a sort of run-deck or the like. It also simplifies the agents.. I can see how creating a .Net agent would be dead simple, and I'm a firm believer that if you're orchestrating on Windows you'll end up wishing you were using .Net eventually if you aren't already.

But to the topic at hand, I was a bit confused about how the bidding comes into play with Diego. After reading the Diego design notes, https://github.com/cloudfoundry-incubator/diego-design-notes , I believe this mostly has to do with scheduling. So it seems to be an alternative to the omega-style serialized scheduler of Flynn, or however Fleet handles this. My current understanding is that stampede.io relies on Fleet to handle scheduling; I'm not sure how stock cattle.io goes about it.. It would seem CF still has high-level orchestration (going from staging, then running, etc.) and that a comparison between the two systems would be more involved than I had thought (that and cattle.io concerns itself with IaaS as well) :|


Fleet is used only to bootstrap the system. It has very little to do with the larger architecture of stampede or cattle. I view orchestration and scheduling as two completely different topics. When looking at IaaS the scheduling needs are initially quite simple. As such I don't have a mature scheduler and really don't intend on building one. As your requirements mature, the need for more complex scheduling grows. I've gone down that path many times and it snowballs so quickly into a very difficult problem.

With cattle I’ve purposely decided to defer the scheduling needs to something like Mesos or possibly YARN. I don’t intend to try to reinvent that wheel. I have yet to do that integration.

The CF Diego approach is one that seems very common today. In general all of the Docker “orchestration” platforms out there today are mostly just Docker plus a scheduler. This approach is great, as long as you don’t need complex orchestration. As the CF folks point out, complex orchestration is hard. If you can remove the need to orchestrate and build a system that needs only a scheduler it is generally much simpler and easier to scale.

The problem I have is that stampede focuses heavily on real world application loads and boring practical issues that exist in crufty old applications that exist in the wild. These apps have requirements that necessitate complex orchestration. As such, with stampede I tackle orchestration head on. I try to find a sane and scalable approach to it. Which is really quite difficult, but it is my area of expertise. I feel if I have a platform that can excel at both orchestration (from the native capabilities of cattle) and scheduling (by leveraging a mature scheduler framework) I’ll have an extremely capable base to build the next generation of infrastructure.


Thanks for the clarification and insight :) I was mistakenly under the impression that stampede was utilizing Fleet's scheduling abilities.. Assumptions :[


State machines and protocols are an excellent way to model the problem of managing the lifecycle of a single task or LRP; they don't solve the problem of the overall system. It's the latter which is actually much harder.

Consider the classic termite-and-wood simulation[0]. In this simulation I have a population of termites, randomly scattered on a grid. And flakes of wood, also randomly distributed. I want to gather this wood.

A current-generation approach would be to, for example, divide the grid into subgrids, assign local termites to perform exhaustive search, progressively aggregating into larger piles in successive rounds. This would require substantial engineering to ensure that my central wood-gathering algorithm has checkpointing, failover, robust message passing, reruns etc etc.

Alternatively, I can give each termite 3 simple rules:

1. Walk randomly.

2. If you don't have a piece of wood in your jaw and you come across one, pick it up.

3. If you have a piece of wood in your jaw and you come across another piece of wood, drop the wood in your jaw.

This latter solution doesn't have difficult -- in some cases impossible -- engineering requirements. And it works very well.

Most systems at this scale eventually wind up pushing intelligence out to the nodes, because (forgive my wankery here) it's impossible in a relativistic universe to have perfect simultaneous knowledge of phenomena separated by space and time. You need to subdivide the problem so that agents can solve their own parts locally and rely on emergent behaviour in the overall system.

[0] http://ccl.northwestern.edu/netlogo/models/Termites
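The three termite rules are short enough to simulate directly; a single-termite sketch (grid size, wood count, and the neighbour-drop detail are illustrative choices, not from the NetLogo model):

```python
import random

random.seed(1)
SIZE = 10
wood = set()
while len(wood) < 20:
    wood.add((random.randrange(SIZE), random.randrange(SIZE)))
carrying = False
pos = (0, 0)
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def step():
    global carrying, pos
    dx, dy = random.choice(MOVES)                       # rule 1: walk randomly
    pos = ((pos[0] + dx) % SIZE, (pos[1] + dy) % SIZE)
    if not carrying and pos in wood:
        wood.discard(pos)                               # rule 2: pick the wood up
        carrying = True
    elif carrying and pos in wood:
        for ndx, ndy in MOVES:                          # rule 3: drop yours beside the pile
            spot = ((pos[0] + ndx) % SIZE, (pos[1] + ndy) % SIZE)
            if spot not in wood:
                wood.add(spot)
                carrying = False
                break

for _ in range(10000):
    step()
```

No step ever creates or destroys wood; the piles simply consolidate over time with no coordinator tracking global state.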


This makes a lot of sense for certain problems in the domain. I've thought a few times about how to introduce the right bit of agency into actors in order to have them accomplish a larger goal without a central authority.

I created a couchbase cluster joiner that would have the nodes converge into a single initial cluster based on ec2 tags. I thought about how people would group up on their own when walking into a room and came up with:

  Join a group
  Judge value of group(is it the largest + tie breaker)
  Leave and join group of highest value
The problem wasn't as hard as the termites, but it removed the need for a central controller.. And I was proud of myself :)
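Those three rules, sketched for a handful of nodes (illustrative only, not the actual Couchbase joiner): every node judges each group by size and jumps to the most valuable one, and membership converges without a central controller.

```python
def best_group(groups):
    """Value = size of the group, ties broken by lowest group id."""
    return max(groups, key=lambda g: (len(groups[g]), -g))

def step(membership):
    """One round: each node leaves its group for the one it values most."""
    groups = {}
    for node, group in membership.items():
        groups.setdefault(group, set()).add(node)
    target = best_group(groups)
    return {node: target for node in membership}

membership = {"a": 1, "b": 1, "c": 2, "d": 3}
membership = step(membership)  # everyone converges on the largest group
```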


> I've played with the idea of a hybrid system: Register jobs in Postgres, use RabbitMQ

That is similar to what I've been using at $DAY_JOB for a low-volume work queue that really, really needs to function exactly on time and never, ever have an issue. The only difference is I use MariaDB/Galera and dropped the MQ entirely, as it could handle the volume easily [it peaks @ 100 r/s for hours ... of course, when you're only allowed 100 r/s against an API...].

The only reason I did it was because of time-based rate limiting on external APIs where the vendor refuses to increase our bucket size.

Tbh, I'd only consider it for a low volume message queue. If you want something that operates on the master/slave model I'd use Redis as the queue.

https://github.com/resque/resque https://github.com/blog/542-introducing-resque

If it is good enough for github, it's probably good enough for you. ;)


There is not much that is relational about jobs; you just need a good persistence mechanism. Have you seen Gearman (http://gearman.org) ? It is designed explicitly as a job manager.


There's lots of stuff that is relational about jobs the moment you have a large number of them and need to be able to search and filter by different statuses, different users, different sources, date and time, type of job etc., or run reports over them (what percentage of jobs are in which states? average latency to start processing? average latency to completion?)

You can work around that by putting metadata about the jobs into an RDBMS, or collating it separately, so you can certainly make do without an RDBMS...

Or you could just put them in the RDBMS in the first place and place a few indexes and optionally triggers to create log entries on state changes.
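That kind of reporting falls out of a plain jobs table with a couple of queries; a sqlite sketch (schema and sample rows are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE jobs (
    id INTEGER PRIMARY KEY, type TEXT, state TEXT,
    queued_at REAL, started_at REAL, finished_at REAL)""")
db.executemany(
    "INSERT INTO jobs (type, state, queued_at, started_at, finished_at) VALUES (?, ?, ?, ?, ?)",
    [
        ("photo", "done",    0.0, 1.0, 5.0),
        ("photo", "done",    0.0, 2.0, 4.0),
        ("email", "failed",  0.0, 0.5, 0.6),
        ("email", "pending", 0.0, None, None),
    ])

# What percentage of jobs are in which states?
by_state = dict(db.execute(
    "SELECT state, 100.0 * COUNT(*) / (SELECT COUNT(*) FROM jobs) FROM jobs GROUP BY state"))

# Average latency to start processing (for jobs that have started).
avg_start = db.execute(
    "SELECT AVG(started_at - queued_at) FROM jobs WHERE started_at IS NOT NULL").fetchone()[0]
```

Add an index on (state, type) and a trigger logging state changes and you have the audit trail too.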


These all sound like attributes of a job definition - not separate entities to which a job is related. You are correct they would need to be indexed.


Gearman looks decent, though it does not have persistent queues, and relies on an external storage such as an RDBMS to provide that. Not sure if there is that much gained. Still makes the database a SPOF, for example.


It depends on the database back-end you use; it can use libmemcached which opens up the possibility to use a CouchBase cluster if you really need to go that far.


Forgot to say that a decent job manager should be sufficiently composable that you can integrate a GUI and high-level monitoring into it by exploiting its API.

I consider the persistence layer a private implementation detail of a system. If Gearman controls an RDBMS schema, for example, I don't want to bypass Gearman to mess with that directly, ever. But as far as I can see, Gearman doesn't have an API to access anything in the queue.


Indeed. And we know that Queues Are Databases

    http://research.microsoft.com/pubs/69641/tr-95-56.pdf
But the issue here, as you've identified, is different concepts being dealt with at different levels.

You can make the query more efficient by using a different schema. You can move the polling function to a single poller which then triggers the event out. And you can even have the database do the trigger - for most databases; certainly Oracle and Postgres.


I hear you. I had to build a job queue system back in 2005 at one job - pretty much what you said - insert as either ready or pending, start to consume as in-process (important to have a transaction around the find-and-mark operation, of course), mark complete at end of job. Using a DB allowed easy querying to find out what was going on recently, as well.
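That find-and-mark claim can be sketched with sqlite (illustrative; in Postgres you'd wrap the same thing in a transaction with SELECT ... FOR UPDATE):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
db.executemany("INSERT INTO jobs (status) VALUES (?)", [("ready",), ("ready",)])
db.commit()

def claim_job(worker):
    """Atomically find the next ready job and mark it as this worker's."""
    with db:  # transaction around the find-and-mark, so two workers can't claim one job
        row = db.execute(
            "SELECT id FROM jobs WHERE status = 'ready' ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return None
        db.execute("UPDATE jobs SET status = ? WHERE id = ? AND status = 'ready'",
                   ("claimed-by-" + worker, row[0]))
        return row[0]

first = claim_job("w1")   # claims job 1
second = claim_job("w2")  # claims job 2, not job 1
third = claim_job("w3")   # nothing ready
```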

On the other hand, last year I worked on some JMS stuff that was essentially also just a job processing system. To me, the JMS stuff seemed like a total waste and unneeded complication vs DB queue-tables, other than integrating with other software that wanted to use JMS. On the other hand, I can't see the "enterprise" minions that developed that debacle reasoning about concurrency well enough to do the job with a DB. E.g. - using event driven listeners to dequeue and acknowledge the messages, and then placing them in a list (i.e. - another queue) in memory, even though they had not been processed yet!

Yet another example of resume padding, where it's important to have 3 to 5 years experience with some tool, since the intelligence to reason about what the tool actually must do, and why, won't be recognized.


When I did billing at yahoo, about a decade ago, the European billing system (Yahoo had about ten billing systems at the time, thanks to a combination of turf wars and acquisitions) was built almost entirely around a job management system in an RDBMS, where every little thing that was done (issue an invoice; charge a credit card; refund etc.) would be a result of a process picking up the next applicable job, look at the state, decide what the next step was, process, log what was done, and update state.

Nothing fancy anywhere, which is how it should be (it was billing; we were processing millions of dollars of payments).

The nice thing about a proper job management system was that each state transition on a job was logged, and every state transition was the result of executing a quite small amount of code with a quite small amount of input data, so if/when anything went wrong, it was a matter of going through the audit trail until something didn't match up, and it means we knew what component had done something wrong, and what data it had been processing.

Wherever you can afford that overhead (or can't afford not to take that overhead), it makes things so nice to deal with.


Replying to myself because I forgot two aspects that messaging services handle poorly: Scheduling and prioritization.

Scheduling is important for jobs, but not events. Prioritization is useful for both. But RabbitMQ doesn't even support assigning priorities to messages.

Scheduling can be done with RabbitMQ by having a cron job that publishes a job message. But that means you're statically tied to a cron configuration. A more flexible configuration would let clients post future jobs to a special queue (or queues, perhaps organized by "daily", "weekly" etc.) that is read by an external scheduler app that reschedules the messages into the proper worker queue. But this stuff gets (unnecessarily) complicated fast.
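The external-scheduler idea above, in miniature (all names illustrative): future jobs wait in a schedule structure, and a scheduler loop moves any that are due into the proper worker queue.

```python
import heapq, queue

schedule = []                 # (run_at, job) min-heap: the "special queue" of future jobs
worker_queue = queue.Queue()  # the proper worker queue

def post_future_job(run_at, job):
    heapq.heappush(schedule, (run_at, job))

def scheduler_tick(now):
    """Move every job whose time has come into the worker queue."""
    while schedule and schedule[0][0] <= now:
        _, job = heapq.heappop(schedule)
        worker_queue.put(job)

post_future_job(10, "daily-report")
post_future_job(99, "weekly-digest")
scheduler_tick(now=50)  # only the daily report is due yet
```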


I'm pretty curious as to your statement 'Register jobs in Postgres, use RabbitMQ as the trigger mechanism. But this puts a burden on the processor, which now has to use two APIs.'

Does that mean you know of a way to do it with just a DB and without a MQ?

If you store the jobs in the database and don't have a parallel MQ system, exactly what triggers the job to be run?

For example, a job might be to process a batch of pictures. To create the job, you would insert a row into the database. But when you insert the row, how would another machine know to wake up and start processing it? Is that other machine just polling? If so, how do you ensure two machines on the cluster don't pick up the same job?


Postgres has "listen"/"notify", which lets you easily signal between concurrent clients; one notification is delivered to exactly one listener. You can't pass along any information, though you can use the signal name as a way to target listeners.


Actually, when you NOTIFY a channel in Postgres it gets delivered to all clients who have previously run LISTEN on that channel. And you can send an optional payload (up to 8000 bytes):

http://www.postgresql.org/docs/9.3/static/sql-notify.html


Oops, my mistake. That's actually disappointing, since it means that for every message, N workers will all wake up and needlessly try to grab the message.

You can get around that by using an advisory lock to make access exclusive, but they will still all wake up and will still have to issue a select.


Why not have N channels for N listeners? These are very lightweight (no persistence/retry, etc. -- essentially just a way to avoid SELECT polling by consumers).


That ties the channel names to the number of listeners, which opens you up to race conditions when the pool of workers changes. And you would have to keep track of which channels aren't busy at any given time, in order to avoid waking up busy workers that can't accept a job. Not impossible, just more stuff that could go wrong.


I think you're right, that would need to be built into whatever job management logic you implemented before the notification was sent. This type of architecture is more suitable for a workflow or job management system rather than fanning out data processing to N workers.


Completely agree. I also think projects should draw a clearer distinction between queues and messaging. ZeroMQ, for example, is not a queue; it's a toolkit for building a message bus!


> I've played with the idea of a hybrid system: Register jobs in Postgres, use RabbitMQ as the trigger mechanism. But this puts a burden on the processor, which now has to use two APIs.

I've had exactly the same idea for my next project. Seriously, I'm quite surprised that something like that doesn't seem to exist already.


I built a similar mechanism using SQS and DynamoDB instead of RabbitMQ and PostgreSQL, respectively. The nice part is it requires no servers for job management. (Just workers obviously.)

https://github.com/gfodor/skyrunner

(It's still a bit half baked, but could be useful as a starting point.)


Nice, thank you. I've been looking for exactly this---SQS and DynamoDB seem like a good combination.


You may not care, but I personally avoid SQS and Dynamo for lock-in purposes. I'd rather run my own systems and have the option of moving to GCE or something later.


I've found in the past that moving code between different queue and job coordination systems is pretty straightforward. Operationally of course you need to learn the new infrastructure but that is unrelated to fears about vendor lock-in.


Dynamo is more of a beef than SQS. I've found that Dynamo usage tends to grow. Like a fungus. Applying sufficient fire to remove it later seems beyond many shops' ability to marshal.

(Also, SQS is really weird about latency and places pretty ugly limits on responsiveness.)


Out of interest, what do you think the benefits of such a system would be? I'm concerned that it would only add complexity.

For my part I plan to hide the queue processing in a central app (we're all SOA), so that workers don't need to know anything about databases, serialization formats or AMQP. They would just wait on a socket to get new jobs.


A message queue stores, well, messages, so it is really good for events. But I also need to persist state machines that run for months. They must be queryable and so on. A db is good at this. On the other hand, I've had a very unpleasant experience with the database-as-IPC-channel pattern and do not wish to ever go near that again. So using an RDBMS to persist state and a message queue for the events seems like the conservative approach. Am I overlooking something obvious here?


Not at all, just wondering about your thinking. Note that I think using a database for IPC is a bad idea if you're actually using tables and such to coordinate; Postgres' listen/notify IPC is much better.

That said, it has issues, as pointed out elsewhere in this thread; when you notify, all listeners wake up, which is not what you want with a one-job-per-worker dispatch system.

If you abstracted the listening to a central job manager app, however, you could use "listen" to wake up just the job manager, and then use that wake-up to dispatch the next job to an available worker.
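Roughly the shape I mean, sketched with Python's stdlib `threading` and `queue` standing in for LISTEN/NOTIFY and the worker pool (names like `manager`/`worker` are just illustrative, not a real library):

```python
import queue
import threading

# Sketch of the "central job manager" shape: one dispatcher hands each job to
# exactly one worker via a shared queue, so workers never contend over a
# broadcast notification the way N LISTENing backends would.

jobs = queue.Queue()      # stands in for the single notification channel
results = queue.Queue()

def worker(worker_id: int) -> None:
    while True:
        job = jobs.get()
        if job is None:          # shutdown sentinel
            break
        results.put((worker_id, job * 2))   # pretend to process the job

def manager(payloads, n_workers: int = 3):
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for p in payloads:           # dispatch each job to whichever worker is free
        jobs.put(p)
    for _ in threads:            # one sentinel per worker
        jobs.put(None)
    for t in threads:
        t.join()
    return sorted(r for _, r in [results.get() for _ in payloads])

print(manager([1, 2, 3, 4]))    # -> [2, 4, 6, 8]
```

The point is just that only the manager ever "wakes up"; the workers block on a private hand-off rather than a broadcast.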


RabbitMQ's consistent hash exchange is pretty useful to route jobs to where you need them.

https://github.com/rabbitmq/rabbitmq-consistent-hash-exchang...
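The idea underneath it is plain consistent hashing. A toy ring in stdlib Python (not RabbitMQ's exact algorithm or weighting, just the concept) looks something like:

```python
import bisect
import hashlib

# Toy consistent-hash ring: each worker owns several points on a ring, and a
# job routes to the worker owning the first point at or after the job's hash.
# Adding/removing a worker only remaps the keys near its points.

def _h(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, workers, points_per_worker=64):
        self._ring = sorted(
            (_h(f"{w}:{i}"), w)
            for w in workers
            for i in range(points_per_worker)
        )
        self._keys = [k for k, _ in self._ring]

    def route(self, job_key: str) -> str:
        i = bisect.bisect(self._keys, _h(job_key)) % len(self._keys)
        return self._ring[i][1]

ring = Ring(["worker-a", "worker-b", "worker-c"])
# The same job key always routes to the same worker:
assert ring.route("picture-123") == ring.route("picture-123")
```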


So basically you're saying that a pub/sub system is (obviously) hard to model with an RDBMS, while a job queue, because jobs have state, is a perfect use case for an RDBMS, right?


Also because you want persistence (durability) for jobs; overhead and latency is okay for jobs, whereas events want super low latency and don't need persistence.


Just for your radar: I've used PgQ from Skype for queueing in PostgreSQL a couple of times, and liked it.

http://wiki.postgresql.org/wiki/Londiste_Tutorial_(Skytools_...

Once was implicitly, when I used Londiste for replication; more recently it was for queueing, from Lisp.


I have seen it before, never used it. Thanks for the reminder.


One could replace locking with optimistic concurrency. Off the top of my head (MS SQL):

  -- SETUP

  drop table dbo.myqueue

  create table dbo.myqueue (
    itemId int identity primary key clustered,
    worker varchar(32) null,
    myguid uniqueidentifier null,
    work_timeout datetime null,
    picture_to_resize nvarchar(128) not null
  )

  -- ENQUEUE WORK

  insert dbo.myqueue(picture_to_resize) values('Kitten.jpg')
  insert dbo.myqueue(picture_to_resize) values('Doggy.jpg')

  -- PROCESS WORK FROM QUEUE

  declare @worker varchar(32)  = 'worker-21'
  declare @guid uniqueidentifier = newid()
  update t set 
    t.worker = @worker, 
    t.work_timeout = DATEADD(minute, 30, getdate()),
    t.myguid = @guid
  from 
    (select top 1 * from dbo.myqueue 
      where work_timeout is null or work_timeout<getdate()
      order by itemId asc
    ) t

  select * from dbo.myqueue where myguid=@guid
  
  -- If nothing found, client should retry...

  -- UPDATE TIMEOUT AS PROGRESS IS MADE
  -- Don't update if another worker stole our job!

  update t set t.work_timeout =DATEADD(minute, 30, getdate())
    from dbo.myqueue t where myguid=@guid and t.worker=@worker


  -- WHEN WORK IS DONE, ITEM SHOULD BE DELETED/ARCHIVED...
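For anyone who wants to poke at the pattern without a SQL Server instance, here's roughly the same claim-by-GUID dance against SQLite from Python (table and column names mirror the sketch above; the 30-minute timeout is expressed as a Julian-day fraction, and SQLite serializes the writes that T-SQL would race on):

```python
import sqlite3
import uuid

# Optimistic claim: a worker stamps one unclaimed (or timed-out) row with a
# fresh GUID, then reads it back. If another worker won the race, the
# read-back finds nothing and the caller retries.

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE myqueue (
        itemId INTEGER PRIMARY KEY AUTOINCREMENT,
        worker TEXT,
        myguid TEXT,
        work_timeout REAL,
        picture_to_resize TEXT NOT NULL
    )""")
conn.executemany(
    "INSERT INTO myqueue (picture_to_resize) VALUES (?)",
    [("Kitten.jpg",), ("Doggy.jpg",)],
)

def claim_job(conn, worker):
    guid = str(uuid.uuid4())
    conn.execute("""
        UPDATE myqueue
           SET worker = ?, myguid = ?,
               work_timeout = julianday('now') + 30.0 / 1440
         WHERE itemId = (SELECT itemId FROM myqueue
                          WHERE work_timeout IS NULL
                             OR work_timeout < julianday('now')
                          ORDER BY itemId LIMIT 1)""",
        (worker, guid))
    row = conn.execute(
        "SELECT itemId, picture_to_resize FROM myqueue WHERE myguid = ?",
        (guid,)).fetchone()
    return row  # None means another worker stole the row: retry

print(claim_job(conn, "worker-21"))   # -> (1, 'Kitten.jpg')
print(claim_job(conn, "worker-22"))   # -> (2, 'Doggy.jpg')
```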


I worked at a mass notification company that used a similar approach, and it worked well: a handful of workers pulling in batches could push out 300 to 500 messages per second and could queue a quarter million messages in a few seconds. There wasn't really a need to optimize past that, since the throughput of the rest of the system was bottlenecked by outside components.

The problem with queueing in {insert new hot queueing system} is that persistence and failover support is often weak. They are by default more efficient and simpler than an RDBMS. If you're running Snapchat, Redis or RabbitMQ or whatever is entirely reasonable.


Probably unnecessary. You can have concurrency and locking without using any tricks, because SQL Server has this nice little feature - the 'READPAST' hint. It was added just to make queue implementations efficient, and AFAIK only Oracle has something similar.


Coming from the WebSphere MQ world, there is a saying that has been around forever. The saying is "Don't use a messaging system as a database." :p Sometimes people have to be reminded of the obvious I guess!


With regards to the index - did they consider a partial index on (queue, date_sent) WHERE consumed = false? So that already-consumed messages aren't in the index.
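For what it's worth, the idea is easy to try locally - SQLite supports partial indexes too. A rough sketch (hypothetical table and index names):

```python
import sqlite3

# Partial-index idea: index only unconsumed messages, so already-consumed
# rows never enter (or bloat) the index. The query must repeat the index's
# WHERE term for the planner to consider it.

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        queue TEXT NOT NULL,
        date_sent REAL NOT NULL,
        consumed INTEGER NOT NULL DEFAULT 0
    )""")
conn.execute("""
    CREATE INDEX pending_idx
        ON messages (queue, date_sent)
     WHERE consumed = 0""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT id FROM messages
     WHERE queue = 'emails' AND consumed = 0
     ORDER BY date_sent""").fetchall()
print(plan)  # the plan should mention pending_idx
```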


My recollection is that Oracle supports notifications, which is essentially an event system.

I'm surprised that Postgres has not yet implemented something like this. Would it not make sense to have a notification mechanism integrated into the RDBMS to avoid the necessity of polling?


Postgres does have notifications. And the article mentions this:

> Notifications are not part of the SQL standard and most relational databases do not support them. There are some exceptions, however. For example, PostgreSQL has very decent support for notifications.


OK, thanks for that information.

I scanned the documentation TOC and found no mention of them. I guess I would have thought such a fundamental feature would show up in the TOC but it turns out you have to dig into the SQL documentation to find the NOTIFY and LISTEN statements.

If I had used the search feature on the word "notification" I would have found this but it's not necessarily obvious that they would actually be called that (ie. they could be called events or something else).


Oracle has Data Change Notification, which fires a callback when the results of a query change, and AQ which is a pub-sub mechanism that you can invoke from a trigger.


The way to do this is to have a field in the message record that says which consumer it is assigned to, so the consumers set that, process and then set the consumed flag. Then the db can be coherent, and show which consumer processed what.
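Roughly, in Python against SQLite (hypothetical schema), the two phases look like this - the claim is made exclusive by refusing rows that already have a consumer:

```python
import sqlite3

# Two-phase pattern: a consumer first writes its own id into the row (the
# claim), then flips the consumed flag when done, so the table records which
# consumer processed each message.

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        body TEXT NOT NULL,
        consumer TEXT,
        consumed INTEGER NOT NULL DEFAULT 0
    )""")
conn.execute("INSERT INTO messages (body) VALUES ('hello')")

def try_claim(conn, msg_id, consumer):
    # Only one consumer can win: the WHERE clause refuses already-claimed rows.
    cur = conn.execute(
        "UPDATE messages SET consumer = ? WHERE id = ? AND consumer IS NULL",
        (consumer, msg_id))
    return cur.rowcount == 1

def mark_done(conn, msg_id, consumer):
    conn.execute(
        "UPDATE messages SET consumed = 1 WHERE id = ? AND consumer = ?",
        (msg_id, consumer))

assert try_claim(conn, 1, "consumer-a")       # first claim wins
assert not try_claim(conn, 1, "consumer-b")   # second claim is refused
mark_done(conn, 1, "consumer-a")
```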


The restriction on delivery also implies that a message will not be delivered twice.


I'd say: use an RDBMS for messaging wherever you can. Messages are data, so keep them where the data belongs. If you have all your data together, it's much easier to do backups/restores or any other maintenance tasks. Also, you want to be transactional, and you will be if sending/receiving messages is just part of a normal database transaction (you don't even need distributed transactions). RDBMSes are inherently transactional and they've done this right; why replace that with something half-baked and unreliable? And your admins will be happy too - they don't want to support any exotic messaging technology. Just make sure you know what you're doing: an RDBMS can be quite a powerful message queue if you do it right.
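A tiny sketch of the transactional point, with Python's sqlite3 standing in for the RDBMS (the orders/outbox tables are hypothetical): when the queue lives in the same database as the data, the business write and the enqueue commit or roll back together, with no distributed commit.

```python
import sqlite3

# One ordinary transaction covers both the state change and the enqueued
# message; a failure before commit leaves neither behind.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT)")

try:
    with conn:  # one transaction for both writes
        conn.execute("INSERT INTO orders (status) VALUES ('placed')")
        conn.execute("INSERT INTO outbox (payload) VALUES ('send-confirmation')")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass

# Neither the order nor the message survived; there is no half-done state:
print(conn.execute("SELECT count(*) FROM orders").fetchone()[0])   # -> 0
print(conn.execute("SELECT count(*) FROM outbox").fetchone()[0])   # -> 0
```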


Have you read Enterprise Integration Patterns (Hohpe and Woolf, in Martin Fowler's signature series)? I think it might change your mind on this topic. Messaging isn't the same thing as static data.


I think nobody ever read that book. It's more used as a reference. I tried reading it, but it was the most boring book I've ever encountered (even worse than Structure and Interpretation of Computer Programs, which is also incredibly boring).


I read it. It's one of my favourite technical books.


I don't remember M.Fowler saying that db table can't be used as a message queue.


Don't get this post. He is comparing RabbitMQ to PostgreSQL and states that RabbitMQ is better for messaging out of the box because you don't have to build anything else.

Duh?


I use MongoDB for everything. It's web scale.



