"When opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device."
This is a nice paradigm when it works; however, for programs that are often piped to other programs, I believe this can run into issues with the flush on stdout blocking. Am I wrong here?
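For what it's worth, here's a quick way to see the quoted behavior (a made-up example of mine, not from the article). Run it on a terminal and the first line typically shows up right away; pipe it through cat and nothing shows up until the program exits and the buffer gets flushed:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        printf("progress: starting\n");   /* visible immediately on a tty,
                                             where stdout is not fully buffered */
        sleep(10);
        printf("progress: done\n");       /* when piped, both lines appear only
                                             after exit flushes the buffer */
        return 0;
    }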
I looked up the man page for C's fflush(). One of the error codes is EPIPE: "An attempt is made to write to a pipe or FIFO that is not open for reading by any process. A SIGPIPE signal shall also be sent to the thread."
So, at least in C, I think flushing is intelligent enough not to block if no one's receiving the output.
I am not sure what you are referring to, but the case where EPIPE is useful is when you do something like:
grep foo *.txt | head -n10
Here head will close its stdin after having read 10 lines. If grep writes more, it will get an EPIPE error on the write (which may be called indirectly by fflush).
This will also send the SIGPIPE signal, which by default kills the process, so grep is automatically terminated once no more of its output will be read.
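To make that concrete, here's a minimal sketch (mine, not from the man page) of a writer that ignores SIGPIPE so it can see the EPIPE itself. Pipe it into head -n10 and it stops as soon as head closes its end of the pipe:

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>

    int main(void) {
        signal(SIGPIPE, SIG_IGN);          /* otherwise SIGPIPE kills us before
                                              we ever get to see EPIPE */
        for (int i = 0; ; i++) {
            printf("line %d\n", i);
            if (fflush(stdout) == EOF) {   /* the flush does the actual write() */
                if (errno == EPIPE)
                    fprintf(stderr, "reader went away, stopping\n");
                return 1;
            }
        }
    }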
I'm not talking about no one receiving output. I'm talking about this case:
You write a program called "thefirstprogram". Who cares what it does. However, a user is pushing its (standard) output into a second program called "asecondprogram". Now imagine "asecondprogram" has melted, but not terminated. It's basically in a screwed up state.
So the run command was:
bash-2.04> thefirstprogram | asecondprogram
I believe the flushing scheme advocated in the article should be considered harmful, as you will in fact never see an error message if asecondprogram gets into a bad state and stops reading from stdin (and the buffers fill up).
I do agree the flushing makes for nicer ordering of error messages, but it is more fragile.
Only use flushed output when you really need it; otherwise you get less resilience in the case of a meltdown of cooperating programs.
If the second program is messed up, it makes no difference if you flush or not. You'll still never see an error message.
Its stdin will fill up, flush or not; the program will block and then sit there - no error message.
You could maybe detect that the write is blocked and say something on stderr - but if you want to do that, it makes no difference whatsoever whether you flush stdout or not.
You don't understand. Once stdout blocks, the whole program blocks, so it makes no difference that stderr is not piped. The program is stopped; it won't be producing any errors.
Obviously you could make some complicated buffering scheme - but you can do that and flush things anyway. Flush or no flush, it makes no difference.
You're picking nits, frankly. The scheme can be useful. But it does deadlock and/or cause delays if your downstream program stops reading from the pipe, temporarily or permanently. The deadlocked second program was just a simple example of when this happens.
This sort of behavior is a rare case. It is far from rare, though, in a less degenerate form. In the very simple case of displaying text on the command line you will fill your output buffer all the time, but with the flushes you'll vastly slow the program down to the speed of text display in your terminal, because every piece of text is pushed out immediately rather than only when the buffer fills and is handed over in bulk.
Which do you think will run faster: cat without those flushes, or cat with those flushes?
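To put it concretely, think of a cat-style copy loop like the hypothetical one below (FLUSH_EVERY_LINE is a name I made up). Build it with and without -DFLUSH_EVERY_LINE and compare throughput on a big file written to a terminal:

    #include <stdio.h>

    int main(void) {
        char line[4096];

        while (fgets(line, sizeof line, stdin) != NULL) {
            fputs(line, stdout);
    #ifdef FLUSH_EVERY_LINE
            fflush(stdout);   /* one small write() per line: the terminal
                                 becomes the bottleneck */
    #endif
        }
        return 0;
    }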
This technique is not universally applicable, as the article says. It does not fit all cases. It has costs.
It does have its applications, but it shouldn't be suggested as a universal technique.
You can do other things... like, say, print on stderr that there is a problem after a certain timeout, email technical support, or restart the second program.
If the process you're piping to never reads stdin and your process writes to stdout with a blocking write, you will eventually be blocked. But just use an event loop for stdout, and all will be well. (That way your program is in control of its own buffering and blocking. select is probably good enough in this case.)
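Something along these lines, say (a rough sketch assuming a POSIX system and small writes; the message text and the 5-second timeout are arbitrary). It checks stdout for writability with select() before each write, so a stuck reader turns into a warning on stderr instead of a silent hang, and it uses write() directly so the program owns its buffering:

    #include <stdio.h>
    #include <string.h>
    #include <sys/select.h>
    #include <unistd.h>

    /* 1 if stdout can take data within `seconds`, 0 on timeout, -1 on error. */
    static int stdout_writable(int seconds) {
        fd_set wfds;
        struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

        FD_ZERO(&wfds);
        FD_SET(STDOUT_FILENO, &wfds);
        return select(STDOUT_FILENO + 1, NULL, &wfds, NULL, &tv);
    }

    int main(void) {
        const char *msg = "some output\n";

        for (;;) {
            int r = stdout_writable(5);
            if (r == 0) {
                /* Downstream stopped reading: complain instead of hanging. */
                fprintf(stderr, "warning: stdout blocked for 5 seconds\n");
                continue;
            }
            if (r < 0 || write(STDOUT_FILENO, msg, strlen(msg)) < 0)
                return 1;
        }
    }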
Anyway, if you want to see freakin' echo block because of this, try:
The potential problem comes when stdout and stderr are going to different places. For instance stdout is going to the next stage in a pipeline while stderr is going to the console. Then the error message winds up waiting for the buffering program.
Agreed, and using a standard logging library like log4* gives your users a nice way to analyze their logs, along with a timestamp of when things happened.
"(Note that programs that run other programs need to do more than just this; they need to flush stdout and perhaps stderr before they start another program.)"
Because if they start another program without flushing their buffers, you'll see the output from the new program before the output from the old one, despite the fact that chronologically the old program's output was generated before the new program started.
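A tiny illustration (my own example, not the article's): redirect this into a file or pipe and, without the fflush, the "about to run" line shows up after the child's output. With fork() it's even worse, since the unflushed buffer is duplicated into the child and can get printed twice.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        printf("parent: about to run the child\n");
        fflush(stdout);                 /* without this, the line above is still
                                           sitting in the parent's buffer when
                                           the child writes its output */
        system("echo child: running");
        printf("parent: child finished\n");
        return 0;
    }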
A good practice is not to combine stdout and stderr into the same output when the ordering matters. You'll get a different result depending on the shell you are using.
I'm in the habit of doing a setbuf(stdout, NULL) at the beginning of all my console apps anyway. Buffered console I/O is something that made sense 20 years ago, but certainly not anymore.
If you did a setbuf(stderr,NULL) as well as setbuf(stdout,NULL), you'd achieve the same effect that the author's going for, with no need to remember to flush.
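For the record, the habit looks like this (trivial sketch; the messages are made up):

    #include <stdio.h>

    int main(void) {
        setbuf(stdout, NULL);   /* every stdio write goes straight out */
        setbuf(stderr, NULL);   /* stderr is normally unbuffered already */

        printf("this reaches the pipe or terminal immediately\n");
        fprintf(stderr, "and so does this\n");
        return 0;
    }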
It really depends on how much data you need to write to stdout. If you're writing a single character at a time with no buffering, many terminal programs (GNOME terminal, etc.) will struggle to put out more than a few dozen lines per second, which can significantly slow down your program (I've known people who redirect their cc's stdout to /dev/null to avoid terminal latency slowing the build).
Standard Error is unbuffered: http://www.opengroup.org/onlinepubs/009695399/functions/stdi...
"When opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device."