"If a process doesn't respond reasonably to SIGTERM, then you should consider removing it from your filesystem."
That's an overreaction. I've seen programs fail to respond to SIGKILL for reasons outside their control, like getting stuck in a transaction on NFS or some other "unusual" filesystem, with the kernel unable to process the SIGKILL. "Listing a directory" is not exactly a crazy thing to do.
And no wiggling out by saying "well, that's the kernel, not the process", the process never gets a chance to "handle" SIGKILL, so it can't really screw it up, either. Arguably, all such failures are the kernel's fault: the kernel should not expose any sequence of calls that causes SIGKILL to fail. Over time these bugs tend to get fixed (I haven't seen one on my modern Linux machine in a long time, even playing with some funny stuff), but it has happened and will probably continue to happen as new stuff comes out.
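If you want to check whether a process that is ignoring SIGKILL is actually stuck inside the kernel, one rough approach on Linux is to read its state from /proc. A minimal sketch, assuming a Linux /proc filesystem; the function names are mine, not from any library. State "D" means uninterruptible sleep, which is where a process blocked on something like a hung NFS mount usually ends up, and a pending SIGKILL is not acted on until the process leaves that state.

```python
def parse_state(stat_line: str) -> str:
    """Extract the one-letter state field from the contents of /proc/<pid>/stat.

    The format is "pid (comm) state ...". The comm field may itself contain
    spaces or parentheses, so split on the LAST ')' instead of on whitespace.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_state(pid: int) -> str:
    """Read the current state letter of a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_state(f.read())


def is_uninterruptible(pid: int) -> bool:
    """True if the process is in uninterruptible sleep ("D")."""
    return process_state(pid) == "D"
```

A process that sits in "D" for minutes is waiting on the kernel, not misbehaving on its own; sending it more signals won't help.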
>And no wiggling out by saying "well, that's the kernel, not the process", the process never gets a chance to "handle" SIGKILL, so it can't really screw it up, either.
Technically, you are correct: the process doesn't even know it's been SIGKILL'd. However, there are other things it could do to gracefully handle the scenario upon next start.
Is there an already-existing lock file? Prompt for its removal.
An already-existing PID file? Again, ask what to do. Include tools to fix records that may have been left in an inconsistent state.
So on and so forth.
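For the PID file case, a startup check could look something like this. It's a sketch under assumptions: a POSIX system, a PID file containing just the decimal PID, and names (`pid_is_alive`, `check_pid_file`) I made up for illustration. What to do with the answer (prompt, clean up, refuse to start) is left to the caller.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Check whether a process with this PID currently exists (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence/permission
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True


def check_pid_file(path: str) -> str:
    """Classify a PID file as 'absent', 'stale', or 'running'."""
    try:
        with open(path) as f:
            text = f.read().strip()
    except FileNotFoundError:
        return "absent"
    try:
        pid = int(text)
    except ValueError:
        return "stale"   # empty or half-written file, e.g. killed mid-write
    if pid <= 0:
        return "stale"   # never probe pid <= 0: kill() treats those as process groups
    return "running" if pid_is_alive(pid) else "stale"
```

On "stale", the program can prompt before removing the file; on "running", it should refuse to start a second copy. Note that PIDs get reused, so "running" only means some process has that PID, not necessarily the previous instance. That's one reason to ask rather than decide silently.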
That said, I've never had to do a SIGKILL. However, I still expect my programs to sanely recover from power outages, clumsy interns, RAID failures, and other "acts of god" that may suddenly cause a program to end before it has a chance to clean up after itself. It's part of making robust programs.