
There is not very much that is relational about jobs; you just need a good persistence mechanism. Have you seen Gearman (http://gearman.org)? It is designed explicitly as a job manager.



There's lots that is relational about jobs the moment you have a large number of them and need to search and filter by status, user, source, date and time, job type and so on, or run reports over them (what percentage of jobs are in which states? average latency to start processing? average latency to completion?)

You can work around that by putting metadata about the jobs into an RDBMS on the side, or by collating it separately, so you can certainly make do without an RDBMS as the primary job store...

Or you could just put the jobs in the RDBMS in the first place, add a few indexes, and optionally use triggers to create log entries on state changes.
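
To make the "just put them in the RDBMS" option concrete, here is a minimal sketch using Python's sqlite3 module (the table, column and index names are made up for illustration; any RDBMS with triggers would work the same way): a jobs table, indexes on the columns you would filter on, a trigger that logs every state change, and the kind of reporting queries mentioned upthread.

    import sqlite3

    conn = sqlite3.connect(":memory:")  # sqlite3 keeps the sketch self-contained; any RDBMS would do
    conn.executescript("""
    CREATE TABLE jobs (
        id           INTEGER PRIMARY KEY,
        type         TEXT NOT NULL,
        user         TEXT,
        source       TEXT,
        state        TEXT NOT NULL DEFAULT 'queued',  -- queued / running / done / failed
        created_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        started_at   TIMESTAMP,
        finished_at  TIMESTAMP,
        payload      TEXT
    );

    -- indexes on the columns you actually search and filter by
    CREATE INDEX jobs_state_idx   ON jobs (state);
    CREATE INDEX jobs_type_idx    ON jobs (type, state);
    CREATE INDEX jobs_user_idx    ON jobs (user);
    CREATE INDEX jobs_created_idx ON jobs (created_at);

    -- state-change log filled in by a trigger, so every transition is recorded
    CREATE TABLE job_state_log (
        job_id     INTEGER NOT NULL REFERENCES jobs (id),
        old_state  TEXT,
        new_state  TEXT,
        changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );

    CREATE TRIGGER jobs_state_change AFTER UPDATE OF state ON jobs
    WHEN old.state <> new.state
    BEGIN
        INSERT INTO job_state_log (job_id, old_state, new_state)
        VALUES (new.id, old.state, new.state);
    END;
    """)

    # the reporting queries from the comment above
    per_state = conn.execute("""
        SELECT state, COUNT(*) * 100.0 / (SELECT COUNT(*) FROM jobs) AS pct
        FROM jobs GROUP BY state
    """).fetchall()

    latencies = conn.execute("""
        SELECT AVG(julianday(started_at)  - julianday(created_at)) * 86400 AS start_latency_s,
               AVG(julianday(finished_at) - julianday(created_at)) * 86400 AS completion_latency_s
        FROM jobs WHERE finished_at IS NOT NULL
    """).fetchone()

The trigger gives you an audit trail of state transitions for free, which is otherwise something you would have to collate out of worker logs.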


These all sound like attributes of a job definition - not separate entities to which a job is related. You are correct they would need to be indexed.


Gearman looks decent, though it does not have persistent queues of its own and relies on external storage such as an RDBMS to provide that. Not sure how much is gained then; it still makes the database a SPOF, for example.


It depends on the database back end you use; Gearman can use libmemcached, which opens up the possibility of using a Couchbase cluster if you really need to go that far.


Forgot to say that a decent job manager should be sufficiently composable that you can integrate a GUI and high-level monitoring into it by exploiting its API.

I consider the persistence layer a private implementation detail of a system. If Gearman controls an RDBMS schema, for example, I don't want to bypass Gearman and mess with that directly, ever. But as far as I can see, Gearman doesn't have an API to access anything in the queue.
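
For what it's worth, the closest thing gearmand does expose is its text-based admin protocol on the job server port (4730 by default): commands such as "status" and "workers" return aggregate per-function counts rather than the individual entries in the queue, which is consistent with the above. A rough sketch, assuming a job server on localhost and that the "status" response is the usual tab-separated FUNCTION / TOTAL / RUNNING / AVAILABLE_WORKERS lines terminated by a lone "." (the helper name and dictionary keys below are mine):

    import socket

    def gearman_status(host="localhost", port=4730):
        # Query gearmand's text admin protocol; "status" returns one line per
        # registered function: NAME\tTOTAL\tRUNNING\tAVAILABLE_WORKERS,
        # terminated by a line containing only ".".
        status = {}
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(b"status\n")
            with sock.makefile("r") as reader:
                for line in reader:
                    line = line.rstrip("\n")
                    if line == ".":  # end of response
                        break
                    name, total, running, workers = line.split("\t")
                    status[name] = {"total": int(total),
                                    "running": int(running),
                                    "available_workers": int(workers)}
        return status

    if __name__ == "__main__":
        for func, counts in gearman_status().items():
            print(func, counts)

So you can build high-level monitoring on top of that, but anything per-job still has to come out of whatever backing store you give it.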



