Even if you don’t find the tool useful, the implementation under the hood is very clever. The dependency graph between the jobs is stored as a bunch of live processes flock()ing on each other’s log files. Once the files are unlocked, the process execs the requested job. This makes the scheduling and error handling code much simpler. Also, right now the dependency graph is dense (a job depends on all previous jobs) but it would be straightforward to thin it out to allow parallel jobs.
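A toy sketch of the idea with flock(1), just to illustrate (this is not nq's actual code; nq does it in C, picks the log file names itself, and handles the races this version ignores):

    # job 1 holds an exclusive lock on its log file for as long as it runs
    flock job1.log -c 'long_job_1 >job1.log 2>&1' &

    # job 2 blocks until job 1's lock is released, then runs under its own lock
    ( flock job1.log -c true
      flock job2.log -c 'long_job_2 >job2.log 2>&1' ) &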
Agreed! I think the general principle of doing UNIXy stuff is "use the capabilities of processes and the filesystem as your units of abstraction". It comes out quite nicely sometimes. Another example is SLURM, where processes are used as context managers for keeping track of allocations.
Yes, but it's too bad it creates litter wherever you run it. I'd rather it stored all those logs under ~/.cache/ like other well-behaved utilities, either in text files as it does now or in a searchable SQLite database.
> NQDIR   Directory where lock files/job output resides.
> Each NQDIR can be considered a separate queue.
Whether or not the default should be $XDG_CACHE_HOME/nq/<something> is a different question. For my own use cases with nq I like the current directory being used, but it would obviously be just as easy for me to set `NQDIR=.`.
This is addressed directly in the first section of the README:
> By default, job ids are per-directory, but you can set $NQDIR to put them elsewhere. Creating nq wrappers setting $NQDIR to provide different queues for different purposes is encouraged.
The AIX print spooler can be used as a generic queueing system - I remember setting up queues with shells as backends. Jobs sent to those queues would then be executed in sequence, and could be cancelled/reprioritized/etc like any print job.
Can't claim any credit for the idea though, it was actually suggested in the documentation...
It might be possible on other unix systems too, but I never had to try it.
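If memory serves, on AIX it came down to a couple of stanzas in /etc/qconfig along these lines (the attribute details may well be off after all these years, so treat this as a sketch rather than a working config):

    * a queue whose "device" is just a shell, so spooled job files
    * get executed one at a time
    batchq:
            device = batchdev
    batchdev:
            backend = /usr/bin/sh

Jobs would then be submitted with something like qprt -P batchq job.sh.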
I once wrote a backend for what I think of as "the SysV print spooler" (hpux7/8, d-nix, and maybe AIX, but IIRC there were some subtle differences with AIX) to run rayshade on the files fed in and drop the resulting image back somewhere (this was ~30 years ago, I don't exactly recall the details).
I recall considering doing it for the SunOS box we had, but lpd/lpr (what I think of as "the BSD print spooler") was sufficiently different that while it would have been possible, it wasn't worth the extra hassle to get one more machine to be able to render on.
I don't think we ever set it up to shuffle render spool jobs from one host to another, though. And, no, "doing raytracing" was not at ALL what the machines were supposed to do, they just weren't very loaded from their normal workloads, so some of the staff played around with it for fun.
I second the recommendation for task spooler, which seems to cover similar ground to nq. It's a really nice way to queue simple long-running tasks without having to write code or set up infrastructure.
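For those who haven't tried it, day-to-day usage is roughly this (the binary is ts, or tsp on Debian/Ubuntu, where the name clashes with moreutils):

    ts wget http://example.com/foo.json   # enqueue a job; prints its id
    ts -l                                 # list queued/running/finished jobs
    ts -t 0                               # tail the output of job 0
    ts -w 0                               # block until job 0 finishes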
I wrote an answer on Super User [0] on how to use this neat tool. There are use cases where it really comes in handy, for example copying a lot of files around to various folders that sit on the same disk, where you can assume the copies won't run efficiently in parallel.
Come to think of it, even lftp has queue and jobs commands. But "task spooler" is the generic tool you can use for more general use cases than file copying.
Consider trying redo[0]. It's an idea from D. J. Bernstein (a.k.a. djb), which alone could be good advertising.
Your problem can be solved with make, as others have pointed out, but I see this as a wonderful example where redo's target files show pretty clearly what redo can do.
redo's target files are usually shell scripts, but they can be whatever you want, as long as the kernel can execute it. `redo-ifchange file1` is a command that waits until file1 has been rebuilt, or in other words, waits until file1's target file has been executed, if that's required.
There are 4 target files that show how to solve your problem, downloading and merging two files:
curl "http://example.com/foo.json" # redo guarantees that any errors won't update foo.json as it can happen in make world.
bar.json.do file is
curl "http://example.com/bar.json"
After creating these files you can run `redo all` (or just `redo`) and it will build the dependency graph and execute it in parallel: foo.json and bar.json will be downloading at the same time.
I'd recommend getting started with a Go implementation of redo, goredo[1] by stargrave. There are also links to documentation, an FAQ, and other implementations on its website.
I'll note that nq's author also has a redo implementation¹. Being generally redo-curious, I've wondered a few times why their other projects (nq/mblaze/etc.) don't use redo, but never actually asked.
What you're looking for is in the class of tools known as batch schedulers. Most commonly these are used on HPC clusters, but you can use them on any size machine.
There are a number of tools in this category, and like others have mentioned, my first try would be Make, if that is an option for you. However, I normally work on HPC clusters, so submitting jobs is incredibly common for me. To keep with that workflow without needing to install SLURM or SGE on my laptop (which I've done before!?!?), my entry into this mix is here: https://github.com/compgen-io/sbs. It is a single-file Python3 script.
My version is set up to run on only one node, but you can have as many worker threads as you need. For what you asked for, you'd run something like this:
$ sbs submit -- wget http://example.com/foo.json
1
$ sbs submit -- wget http://example.com/bar.json
2
$ sbs submit -afterok 1:2 -cmd jq -s '.[0] * .[1]' foo.json bar.json
$ sbs run -maxprocs 2
This isn't heavily tested code, but works for the use-case I had (having a single-file batch scheduler for when I'm not on an HPC cluster, and testing my pipeline definition language). Normally, it assumes the parameters (CPUs, Memory, etc) are present as comments in a submitted bash script (as is the norm in HPC-land). However, I also added an option to directly submit a command. stdout/stderr are all captured and stored.
The job runner also has a daemon mode, so you can keep it running in the background if you'd like to have things running on demand.
Installation is as simple as copying the sbs script someplace in your $PATH (with Python3 installed). You should also set the ENV var $SBSHOME, if you'd like to have a common place for jobs.
The usage is very similar to many HPC schedulers...
I've used (and installed) PBS, SGE, and SLURM [1]. Most of the clusters I've used recently have all migrated to SLURM. Even though it's pretty feature-packed, I've found it "easy enough" to install for a cluster.
What is the sales pitch for OAR? Any particularly compelling features?
I imagine in theory Snakemake, which handles dependency graph resolution , could be used to compute dependencies, and its flexible scheduler could then call nq.
OTOH, if just working on one node, skip nq and use Snakemake as the scheduler as well.
I guess some slight tweaks for task persistence and a CLI wrapper for it could let you achieve this (although I don't leverage Ractors, so no true parallelism yet).
Anyway, it still does not have an "official" release, nor a stable API, although the code works well and it's fully tested, as far as I can tell. I might consider providing such a wrapper myself in the future, as I can definitely see its utility, but time is short nowadays.
Having skimmed the README, I failed to grasp what it is. I would appreciate it if the README featured a clear one-sentence summary of what the tool does. Like...
> cat - concatenate files and print on the standard output
> nohup - run a command immune to hangups, with output to a non-tty
> at — execute commands at a later time
Anyway, it's good to see new conceptual tools being developed for the Unix Toolbox, keep it up!
> Build targets clean, depends, all, without occupying the terminal
This accommodates a workflow where you'd rather not just open up another terminal to do other stuff (maybe you're not in a graphical environment, maybe there's no tmux, etc.). With just shell features, you could do `make ... && make ... && make ... &`, but that would produce bothersome output that prevents you from working effectively in the same terminal. You could redirect the output of those commands, but then you lose it, or you have to think about where to collect it. nq provides an alternative where you can background the jobs and still have convenient access to their output when you need it.
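Roughly, the queued version of that build looks like the following: each command returns immediately, the jobs run one at a time in order, and each job's output lands in its own log file in the queue directory, which you can read later or follow live with the bundled fq tool.

    nq make clean
    nq make depends
    nq make all
    fq            # follow the queue's output; Ctrl-C just stops watching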
It also offers a better history UX. For example, if you issued,
make .. && make foobar && make ..
and you wanted to just run 'make foobar' again, you'd have to search your history for the last foobar invocation and do some editing. With nq this wouldn't be necessary.
It's also not clear to me if nq has '&&' or ';' semantics in the event a command fails. I suspect it's ';'.
It probably isn't as user friendly as you'd want, but you can reload the playlist by firing commands at a socket. See the "JSON IPC" section in the mpv man page, and specifically the load* commands. jo¹ and socat² are probably the simplest way to use it if you're not looking for heavy scripting.
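As a hypothetical example (assuming mpv was started with --input-ipc-server=/tmp/mpvsocket), appending a track to the running playlist looks something like:

    jo command="$(jo -a loadfile /path/to/track.mp3 append-play)" \
        | socat - /tmp/mpvsocket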
The scriptability of mpv is really nice if you're the sort of person who likes building your perfect environment, and also a huge pit of addictive time sinks. </warning>
"Lightweight" adds nothing to the description and "job" is ambiguous. I think "nq - queues the execution of background processes" would be a better description.
How is it in practice different to something like:
cmd && cmd2 &
edit: From reading other comments, I guess the difference is not mainly that it keeps running if the shell dies, but that it allows you to easily append new stuff to the end of the queue.
Using ; will not work if you ran set -e before (e.g. if you're using it in a script). Perhaps more importantly, cmd1; cmd2 & will not work either, since the & only backgrounds cmd2; you'll need a subshell. So you need at least ( set +e; cmd1; cmd2 ) &. And there are probably other caveats. And we haven't even gotten to output redirection yet.
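So in practice the shell-only version ends up as something like the first form below, while the nq version stays one command per job and can be extended after the fact:

    ( set +e; cmd1; cmd2 ) >queue.log 2>&1 &

    nq cmd1
    nq cmd2
    nq cmd3    # appended later; runs after the first two finish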
From reading the text, it's a serialized job queue for arbitrary commands submitted from the command line. It's like "cmd1 & cmd2 & cmd3 &", except that while all the jobs are put in the background, they run serially, one after another. And new commands are automatically added to the end of the queue.
This is a neat little command. It's kind of nice because it doesn't require a daemon, unlike at or ts (task spooler), and it handles dependencies, unlike a lot of informal queue techniques and the job control built into the shell:
> nq tries hard (but does not guarantee) to ensure the log file of the currently running job has +x bit set. Thus you can use ls -F to get a quick overview of the state of your queue.
Is that a typo? Why would you need {ugo}+x to tail a file?
[edit -- sorry, it's not tailing; somehow I read `tail -F` instead of `ls -F`.]
"ls", not "tail". I think the idea is that "ls" will visually mark files with the execute bit set, which makes them easy to pick out in a directory listing.
Ah! Thanks. I guess if they need a second color, they could mark the file `setuid` as well. :P This is clever, but I wouldn't trust any tool that marks not-intentionally-executable things as executable.
It supports a single queue per directory. Unless you're using an ancient filesystem, you should be able to create more than enough queues to handle anything you need.
It's off-topic, but man pages should really adopt a new format (e.g. markdown) instead of roff.
roff is a terrible way to write a document. Its format is ancient and not well documented. Its behavior is not consistent across different implementations. Worst of all, no proper i18n support.
There are tools like roff2html, but again they're pretty sketchy in terms of reliability and i18n support. I wrote my own converter when I was translating OpenBSD man pages [1], but I hope more people recognize this issue.
[1] https://github.com/euske/openssh-jman/blob/master/roff2html....
The scdoc sources have the extension .md because Microsoft's GitHub thinks .sc means SuperCollider files, whatever that is. The .1 files are "compiled" from the .1.md sources.
It happens to be a convenient way to ensure that things are mostly used in a personal context. Most lawyercats will steer clear of software under these terms, and use under such terms is typically disallowed in any megacorp.
I thought it was the opposite; there are some weird legal subtleties to placing things in the "public domain" in some parts of the world, so I thought people wrote statements like in the GP comment to avoid having to use that language at all, and share their code in the least encumbered way possible.
I wonder whether there's a better way to express it. E.g. "if you see this code, I automatically grant to you a nonexclusive, irrevocable, royalty-free license to do whatever you want with it, including re-mixing it, re-distributing it under any license you want, or anything else you might think of." Or is this _worse_? Any lawyers in the room? :)
This is my understanding as well. IIRC, copyright attaches automatically upon creation, and there's no way to disclaim all intellectual property rights under the Berne Convention.
I suppose one could theoretically release something anonymously, but that technically would just be a copyrighted work with an unknown author.
As far as code goes, what's wrong with the MIT license?
Although the WTFPL may have been a joke at first, it is substantially similar to your suggestion. It says, in full:
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004
Copyright (C) 2004 Sam Hocevar <sam@hocevar.net>
Everyone is permitted to copy and distribute verbatim
or modified copies of this license document, and
changing it is allowed as long as the name is changed.
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. You just DO WHAT THE FUCK YOU WANT TO.
The WTFPL is "recognized" in the sense that several Linux distributions ship software released under it. I don't think it has been "recognized" in boring legal terms by any of the organisations that we normally trust to do such things (FSF or OSI). Apparently Bradley Kuhn of the FSF made an unofficial remark that it's a free software license (and I would agree), but clearly being a free license in spirit and philosophy is not the same as being recognized in a legal context. Have any of the paranoid companies like Google and Microsoft used WTFPL software for anything? I wouldn't recommend using WTFPL unless the point is specifically to make life harder for corporations (which I think is a totally valid thing to do).
> I wouldn't recommend using WTFPL unless the point is specifically to make life harder for corporations (which I think is a totally valid thing to do).
The same effect seems to be obtained by using the AGPL. Funnily enough, the two licenses at the "extreme ends" of free software elicit exactly the same kind of corporate panic. I always hesitate between these two for my works.
Why's that? I thought megacorps would love software with no strings attached: at least, they seem to love MIT- and BSD-licensed codebases, which give up pretty much everything but attribution, if I understand correctly.
Is it because it's unclear whether such a statement would really waive all copyright?
I'd imagine it is a statement on the author's views on software copyright. Or perhaps they wanted to offer something to the world, no strings attached. I don't think the word slavishly applies here at all.
Ok, that sheds light on the kind of workflow this might be useful in, like calling stuff from a launcher instead of a terminal or something like that and having it go through nq automatically. Personally, I would just keep the shell open, but options are cool.
You can use e.g. the "screen" utility to detach from a shell and later re-attach. I use this often: start something at work, detach, then go home, re-attach.
You can have multiple sessions, give them names, etcetera.
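For anyone who hasn't used it, the basic flow is:

    screen -S build     # start a session named "build"
    # ...start the long job, then press Ctrl-a d to detach...
    screen -ls          # later (e.g. from home, over ssh): list sessions
    screen -r build     # re-attach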
I was thinking an advantage is that you can halt the queue, and/or the queue will pick up where it left off (on the next item in the queue) if power is cut. Not sure that this does that, though.
A quick read makes me think it does not do that. Instead when one process exits it looks for more queued job files.
[edit]
I suppose if you ran nq on boot, it would find the existing queued job files and you would only lose the one running job? Or maybe it would rerun the one that was in progress. Not sure.
1. at runs jobs at a given time. batch runs jobs "when system load levels permit" [1]. nq runs jobs in sequence with no regard to the system's load average.
2. at and batch have 52 built-in queues: a-z and A-Z. Any directory can be a queue for nq.
3. You can follow the output of an nq queue tail-style with fq.
4. The syntax is different. at and batch take whole scripts from the standard input or a file; nq takes a single command as its command-line arguments (see the example after this list).
5. nq doesn't rely on a daemon. It's an admirably simple design. Jobs are just flock()-ed output files in a directory.
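To make point 4 concrete:

    # batch reads a job script on stdin and runs it when the load average allows
    echo 'wget http://example.com/foo.json' | batch

    # nq takes the command directly as arguments and appends it to the queue
    # in $NQDIR (or the current directory)
    nq wget http://example.com/foo.json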
Microsoft offers an IBM Job Control Language system for Azure.[1] It's intended for migration from mainframes, but it does the job described here. Not that you'd want to take that route by choice.
Looks pretty cool. I would love it if it weren't dependent on filesystem locking. Wondering about network locking now. If it's already forking a daemon, might as well store that lock data in memory and communicate it over a network socket. There's still the question of resolving the network service, but that's a flaw in the TCP/IP stack. I bet it could be solved by a simple wrapper command to a service resolver. Custom protocol syntax for locking per dir should be trivial. Heck, maybe move the network logic out to the wrapper, and have the locking be some other interaction between the wrapper and nq.
Slurm is the go-to free software workload manager / task scheduler. It scales well from a single machine to the largest clusters in existence. Simple things are simple with it while you can also do complex things if you put in more effort.
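For instance, the download-and-merge example from this thread is just three submissions, even on a single machine (job ids captured with --parsable; file names as upthread):

    j1=$(sbatch --parsable --wrap 'wget http://example.com/foo.json')
    j2=$(sbatch --parsable --wrap 'wget http://example.com/bar.json')
    sbatch --dependency=afterok:$j1:$j2 \
           --wrap "jq -s '.[0] * .[1]' foo.json bar.json > merged.json"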
Line 2:
while pgrep $1 >/dev/null; do sleep 2; done && $*
^-- SC2086: Double quote to prevent globbing and word splitting.
^-- SC2048: Use "$@" (with quotes) to prevent whitespace problems.
Also, how do you pick the right job to wait for if several programs match $1? (E.g. if it's a bash script, $1 will be "bash", which on my system matches lots of things.)
Also, a 2s sleep will slow things down if you want to use it for a whole lot of jobs.
Also, this won't run your second command if the first command you wait for has already exited!
Also, shouldn't $* be `shift; "$@"` or do you only ever queue jobs of the same command?
Overall, a clever trick I intend to steal.
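A slightly more robust variant that addresses those points (quote the arguments, wait on a specific PID rather than a name match, and still run the command if the process is already gone) could look like:

    # usage: run_after <pid> <command> [args...]
    run_after() {
      pid=$1; shift
      while kill -0 "$pid" 2>/dev/null; do sleep 2; done
      "$@"
    }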