This is missing something that I highly recommend:
"Start with supporting stdin/stdout as the only input and output. This ensures that it is composable with other utilities. You may find you never need anything else"
There are two cases for specifying files on the command line: when the utility actually does something with the file itself (e.g. moves it), or when it operates on multiple files and/or directories (e.g. a backup application).
Otherwise it might still be useful, but in general it's unnecessary: it adds logic to your application that it doesn't need, which violates "do one thing, and one thing very well" (albeit only a very small violation).
ISTM that you're arguing from a perspective of intrinsic necessity: anyone can cat a file into my utility if they need that functionality.
Sure. However, GNU sort, for example, doesn't work that way. Most utilities don't: they accept a file as a source of input, and most of those don't act on the inode. That's the status quo.
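For instance, these two invocations of sort produce identical output; the only difference is whether sort itself opens the file:

```shell
#!/bin/sh
# Both invocations produce the same sorted output; the only
# difference is whether sort opens the file or the shell does.
f=$(mktemp)
printf 'b\na\n' > "$f"
sort "$f"        # file name as an argument
sort < "$f"      # stdin redirection
rm -f "$f"
```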
This has bitten me occasionally, even though I know the workarounds (tempfiles or pipe-consumers like sponge(1)).
I'm wondering if there's any practical use for the behaviour, or if it's worth hacking a shell such that it produces a warning/interactive confirm prompt for it (or transparently buffers to DWIM maybe?)
It's pretty common to `set -o noclobber` in beginner dotfiles; doing deferred-open-if-exists is an interesting idea that would probably get a lot of resistance :-)
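For context, a small illustration of what `noclobber` changes in practice (POSIX shells; `>|` is the explicit override):

```shell
#!/bin/sh
# With noclobber set, '>' refuses to overwrite an existing file;
# '>|' explicitly bypasses the check.  (POSIX sh, bash, zsh, ksh)
set -C                           # same as: set -o noclobber
f=$(mktemp -u)
echo first > "$f"                # creates the file
echo second > "$f" 2>/dev/null \
    && echo "overwrote" || echo "refused"
echo third >| "$f"               # '>|' bypasses noclobber
cat "$f"
rm -f "$f"
```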
A third is "it uses multiple files (and the behavior can't be replicated by concatenating them)."
Picking one input to still be consumed from stdin can make sense, though. A special case of this is config files, which are almost never read from stdin (they usually have a default location or several).
Outputting to stdout gives the control and power to the user - if they want it in a file which they want to name, they can do that; if they want to pass the output as input to another application - they can do that too.
There are a few usage scenarios where such behavior isn't enough, fsck for example, but even there this paradigm is flexible enough to work: such applications could be split into an analyzer and a repair program. And there is nothing stopping one from outputting a binary data stream on stdout; lots of applications on UNIX do exactly that, compressors come to mind.
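Compressors are a good example: gzip writes compressed binary to stdout and gunzip reads it back from stdin, so the whole round trip happens over pipes with no named files at all:

```shell
#!/bin/sh
# gzip compresses stdin to binary on stdout; gunzip reverses it.
# The data never touches a named file.
printf 'hello, world\n' | gzip | gunzip
```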
What use case do you have where stdout/stdin/stderr isn't enough, on an operating system family where the core usage paradigm is that everything is a binary stream of data?
> Outputting to stdout gives the control and power to the user - if they want it in a file which they want to name, they can do that; if they want to pass the output as input to another application - they can do that too.
Please note that we already agree here.
> What use case do you have where stdout/stdin/stderr isn't enough, on an operating system family where the core usage paradigm is that everything is a binary stream of data?
My argument isn't that I have a use case where it 'isn't enough.' My argument is that many people come to a cli utility expecting that <foo /path/to/myfile> simply works.
So are you for or against that? If you're for the `< /path/to/my/file` approach, then we are in complete agreement. A command line application should read from stdin where that applies.
I want stdin to work. I also want specifying a file name to work. I think supporting both is good UI design, and follows the principle of least surprise.
Though, you should still provide a way to read from a file and write to a file. I prefer passing "-" or /dev/stdin (/dev/stdout) for that purpose.
If only for the sake of debuggability: someone may want to debug your script with a debugger, debug prints, or debug reads.
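A minimal sketch of that convention in a shell script; `process` is a made-up name here, not any real utility:

```shell
#!/bin/sh
# Hypothetical utility that upper-cases its input.  A missing
# argument or '-' means stdin; anything else is a file name.
process() {
    file=${1--}                     # default to '-' if no argument
    if [ "$file" = "-" ]; then
        tr 'a-z' 'A-Z'              # read from stdin
    else
        tr 'a-z' 'A-Z' < "$file"    # read from the named file
    fi
}

echo hello | process                # implicit stdin
echo hello | process -              # explicit '-'
```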
Why not both? Many of my favorite utilities are stdin/out by default, have file parameters, and will accept '-' to go back to stdin/out.
This is highly useful if you want to allow composition into scripts without forcing users to dynamically build parameter lists. They can write, e.g., `FILE_NAME=${FILE_NAME:-"-"}; some-util --output "$FILE_NAME"` and not have to decide whether or not to pass the --output parameter in their script.
If processing the list of files is equivalent to processing the concatenation of the list of files, then you can `ls | xargs cat | my-util`. If it is not, then as I argued in a cousin comment I think that's another good case to support accepting file names.
> The problem with that example is parsing `ls` output.
Yeah, I left that alone to keep things simpler and was assuming `ls` was simply being used as short-hand for "some unspecified approach to outputting file names". In a context where hard-to-handle filenames are possible, you'd of course need to do something a little more robust.
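For what it's worth, a NUL-separated pipeline is one such robust variant; this sketch assumes `find -print0` and `xargs -0` (GNU or BSD extensions, not strict POSIX), and uses `wc -l` as a stand-in for my-util:

```shell
#!/bin/sh
# NUL-separated file names survive spaces, quotes, and newlines,
# unlike parsed `ls` output.  `wc -l` stands in for my-util.
dir=$(mktemp -d)
printf 'a\n' > "$dir/file one"
printf 'b\n' > "$dir/file two"
find "$dir" -type f -print0 | xargs -0 cat | wc -l
rm -rf "$dir"
```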
"Start with supporting stdin/stdout as the only input and output. This ensures that it is composable with other utilities. You may find you never need anything else"
Need to read a file? `my-cli < file`
Need to write a file? `my-cli > file`
How about read from a URL? `curl url | my-cli`