Unix Programming - Question re: The "kill -9" problem

Author Question re: The "kill -9" problem
Kenny McCormack

2006-10-21, 1:31 pm

What this boils down to is: Yes, despite what some people say, "kill -9"
is necessary more often than it ought to be.

Here's the thing: we all know that "kill -9" should be used only as a
last resort and that "kill -[1,2,3,15]" (one or more of these) should
be tried first. In years past, Randal Schwartz (I think it was Schwartz)
used to post over and over on two topics: UUOC and UUOKm9.

The point, of course, is that if an app has installed a signal handler
(for the signals that can be caught), then it is better for you, the
user, to try to kill it via one of those signals, so that the app has a
chance to tidy up and exit cleanly. But that leads to the following
conclusion: If either of the following is true, then using (say) "kill -1"
and "kill -9" should be equivalent. But, unfortunately, in practice,
it doesn't work out like that; frequently, I find that none of the
"nice" kills do anything, but the sledge hammer (-9) works. And, yes, I
believe that one or both of the assumptions below are met, so I am surprised
that -9 still works better:

     1) The app hasn't installed any signal handlers
(or) 2) The app is correctly written and there is no bug or loop in
        its signal handler(s). E.g., if there is a signal handler
        for signal 1 (HUP), it does only safe things and then exits.

Yes, I realize that assumption 2 is giving the app writer a lot of
credit. But let's go with that assumption for now.

jmcgill

2006-10-22, 1:20 am

Kenny McCormack wrote:
> What this boils down to is: Yes, despite what some people say, "kill -9"
> is necessary more often than it ought to be.
>
> Here's the thing: we all know that "kill -9" should be used only as a
> last resort


It should be avoided as a first resort, but the signal is available for
good reasons. The problem is that too many people get in the habit of
going to kill -9 as an alias for kill; it is a habit and it never occurs
to them to try anything softer.
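
A minimal sketch of the "try something softer first" habit, written in Python purely for illustration. The helper name stop_process, the order of the polite signals, and the grace period are all assumptions, not anything prescribed in this thread.

```python
import signal
import subprocess
import sys


def stop_process(proc, polite=(signal.SIGTERM, signal.SIGINT, signal.SIGHUP), grace=2.0):
    """Send each polite signal in turn; use SIGKILL only if none of them work.

    `proc` is a subprocess.Popen object. Returns the signal that was last
    sent before the process was seen to exit.
    """
    for sig in polite:
        proc.send_signal(sig)
        try:
            proc.wait(timeout=grace)  # give the app time to tidy up
            return sig
        except subprocess.TimeoutExpired:
            continue  # still running, escalate to the next signal
    proc.kill()  # SIGKILL: cannot be caught or ignored
    proc.wait()
    return signal.SIGKILL


if __name__ == "__main__":
    # Hypothetical victim: a child that installs no handlers at all.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("stopped by", stop_process(child).name)
```

The point of the loop is only that -9 is reached after the catchable signals have had a fair chance, so a well-behaved app still gets to clean up.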

The only really severe consequence I've ever seen from the habit was a
certain ODBC driver that left statements and connections open on the
remote database. They would not time out for something like 48 hours,
meaning you could kill the remote database just by leaking connections.

Gordon Burditt

2006-10-22, 1:20 am

>What this boils down to is: Yes, despite what some people say, "kill -9"
>is necessary more often than it ought to be.
>
>Here's the thing: we all know that "kill -9" should be used only as a
>last resort and that "kill -[1,2,3,15]" (one or more of these) should
>be tried first. In years past, Randal Schwartz (I think it was Schwartz)
>used to post over and over on two topics:


UUOC and UUOKm9.
^^^^     ^^^^^^

Wt t fk ds tt mn?

Sometimes "kill -9" is needed due to an escalating war between the
programmer who believes that his baby never needs to be killed and
the user who really needs to get control of his session back and
who is sometimes willing to use the house main breaker to get his way.
(I recall one programmer who routinely ignored signals 1 thru 65535
whether or not they existed, then complained when SIGCLD/SIGCHLD
acted weird in child processes not expecting it to be ignored.)

>The point, of course, is that if an app has installed a signal handler
>(for the signals that can be caught), then it is better for you, the
>user, to try to kill it via one of those signals, so that the app has a
>chance to tidy up and exit cleanly.


Except occasionally when you REALLY NEED to prevent the app from
tidying itself and the system into oblivion, either for debugging
using what was left in the temporary files, or to not lose or destroy
important data. (I recall one program that did the equivalent of
"rm -rf" on its temporary directory as cleanup, except when it was
unable to create that directory in a filesystem prone to running
out of disk space, in which case it managed to use / as a default.
This could get messy even though it wasn't running as root.)
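
For contrast, a small sketch of cleanup that cannot fall back to / in this way: the program only ever deletes a directory it created itself, and a failed creation yields no path at all. It is in Python for illustration; the function names and the "scratch-" prefix are made up for the example.

```python
import os
import shutil
import tempfile


def make_scratch_dir(parent):
    """Create a private scratch directory under `parent`.

    Returns its path, or None if it could not be created (disk full,
    missing parent, ...). There is deliberately no default path.
    """
    try:
        return tempfile.mkdtemp(prefix="scratch-", dir=parent)
    except OSError:
        return None


def cleanup_scratch_dir(path):
    """Remove the scratch directory, but only if we really created one."""
    if path is None:
        return False  # nothing of ours to remove, so remove nothing
    shutil.rmtree(path)
    return True


if __name__ == "__main__":
    scratch = make_scratch_dir(tempfile.gettempdir())
    print("created:", scratch)
    print("removed:", cleanup_scratch_dir(scratch))
```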

>But that leads to the following
>conclusion: If either of the following is true, then using (say) "kill -1"
>and "kill -9" should be equivalent. But, unfortunately, in practice,
>it doesn't work out like that; frequently, I find that none of the
>"nice" kills do anything, but the sledge hammer (-9) works. And, yes, I
>believe that one or both of the assumptions below are met, so I am surprised
>that -9 still works better:
>
>     1) The app hasn't installed any signal handlers
>(or) 2) The app is correctly written and there is no bug or loop in
>        its signal handler(s). E.g., if there is a signal handler
>        for signal 1 (HUP), it does only safe things and then exits.


Often, the apps will try to catch signals and *continue* a command
loop (either by setting a flag that hopefully gets the main program
out of its loop, or *ick* attempting to longjmp out of a signal
handler, which usually seems to result in intermittent crashes
later). Why this is useful for SIGILL, SIGBUS, and SIGSEGV is
debatable. For SIGINT it's often useful.
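
The flag-setting variant can be sketched roughly as follows (Python, for illustration only; the names note_interrupt and run_command are invented here). The handler records the signal and nothing more, and the main program polls the flag between units of work, so SIGINT abandons the current command but not the program.

```python
import os
import signal

interrupted = False  # set by the handler, polled by the main program


def note_interrupt(signum, frame):
    """Handler: only record that the signal arrived, nothing else."""
    global interrupted
    interrupted = True


def run_command(steps):
    """Run callables from `steps` until finished or until SIGINT was noted.

    Returns how many steps completed; the caller's command loop carries on.
    """
    global interrupted
    interrupted = False
    done = 0
    for step in steps:
        if interrupted:
            break  # abandon this command only, not the whole program
        step()
        done += 1
    return done


if __name__ == "__main__":
    signal.signal(signal.SIGINT, note_interrupt)
    # Hypothetical workload: the second step simulates the user pressing Ctrl-C.
    steps = [lambda: None, lambda: os.kill(os.getpid(), signal.SIGINT)] + [lambda: None] * 20
    print("steps completed before interrupt:", run_command(steps))
    print("still alive, back at the prompt")
```

An app written like this will shrug off the signal it catches by design, which is one ordinary reason a "nice" kill appears to do nothing.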

Occasionally, I will want to use "kill -9,dammit" (unfortunately,
it doesn't exist) which means I really, REALLY want the process to
go away RIGHT NOW in spite of some kind of lock or driver bug that's
causing the process to hang in "D" state. Occasionally this will
be needed when using NFS filesystems and the network or the NFS
server has gone down for some reason.
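
On Linux, one way to tell whether a process is stuck like this is to read the state letter from /proc/<pid>/stat. The sketch below (Python, Linux-specific, helper names invented for the example) pulls out that letter. A process in uninterruptible sleep generally does not act on signals until it wakes, which is why even -9 seems to do nothing in that state.

```python
def process_state(stat_text):
    """Extract the state letter from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')' characters, so the fields are split after the LAST ')'.
    """
    after_comm = stat_text[stat_text.rindex(")") + 1:]
    return after_comm.split()[0]


def is_uninterruptible(pid):
    """True if the process is currently in "D" state (Linux /proc only)."""
    with open(f"/proc/{pid}/stat") as f:
        return process_state(f.read()) == "D"
```

If is_uninterruptible() keeps returning True for a process, the fix is more likely to be found in the NFS server, network, or driver than in any signal number.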

Do not confuse "kill -9,dammit" with SIGNUKE, which means that the
system has detected a nuclear detonation nearby and the system has
a few milliseconds to run before it gets vaporized.

>Yes, I realize that assumption 2 is giving the app writer a lot of
>credit. But let's go with that assumption for now.
>



Bjorn Reese

2006-10-22, 1:16 pm

jmcgill wrote:

> The only really severe consequence I've ever seen from the habit was a
> certain ODBC driver that left statements and connections open on the
> remote database. They would not time out for something like 48 hours,
> meaning you could kill the remote database just by leaking connections.


Another severe consequence is for applications that use resources that
are not automatically removed by the OS on exit (e.g. semaphores). Such
resources are typically limited, and after a while they are used up.
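
A rough demonstration of the leak, in Python. A plain marker file stands in for the semaphore (an assumption made to keep the example self-contained), and the names CHILD and marker_survives are invented here. The child removes the marker in its SIGTERM handler; when it is stopped with SIGKILL the handler never runs and the marker is left behind.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Child program: acquires a "resource" (a marker file) and releases it
# in a SIGTERM handler before exiting.
CHILD = r"""
import os, signal, sys, time
marker = sys.argv[1]
def tidy(signum, frame):
    os.remove(marker)
    sys.exit(0)
signal.signal(signal.SIGTERM, tidy)
open(marker, "w").close()
print("ready", flush=True)
while True:
    time.sleep(1)
"""


def marker_survives(sig):
    """Start the child, stop it with `sig`, report whether the marker is left over."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "resource.lock")
        child = subprocess.Popen([sys.executable, "-c", CHILD, marker],
                                 stdout=subprocess.PIPE, text=True)
        child.stdout.readline()  # wait until the handler is installed
        child.send_signal(sig)
        child.wait()
        child.stdout.close()
        return os.path.exists(marker)


if __name__ == "__main__":
    print("left over after SIGTERM:", marker_survives(signal.SIGTERM))
    print("left over after SIGKILL:", marker_survives(signal.SIGKILL))
```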

--
mail1dotstofanetdotdk

sjdevnull@yahoo.com

2006-10-22, 1:16 pm

Gordon Burditt wrote:
>
> UUOC and UUOKm9.
> ^^^^     ^^^^^^
>
> Wt t fk ds tt mn?


Useless use of cat, useless use of kill minus 9. The former is (or
used to be) very common usage on comp.unix.programmer.

> Occasionally, I will want to use "kill -9,dammit" (unfortunately,
> it doesn't exist) which means I really, REALLY want the process to
> go away RIGHT NOW in spite of some kind of lock or driver bug that's
> causing the process to hang in "D" state. Occasionally this will
> be needed when using NFS filesystems and the network or the NFS
> server has gone down for some reason.


Some OSes (e.g. Linux) allow you to mount NFS partitions with an "intr"
(or, less desirably, "soft") mount option that avoids a lot of these
"process stuck in D state" problems regarding NFS.

John Gordon

2006-10-23, 1:16 pm

In <12jm11u83jadc5c@corp.supernews.com> gordonb.3atgq@burditt.org (Gordon Burditt) writes:

> UUOC
> ^^^^


> Wt t fk ds tt mn?


UUOC = Useless Use of Cat. Using the "cat" command when you don't need to:

    cat textfile | some_program

when you could just as easily do without the cat:

    some_program < textfile

--
John Gordon "It's certainly uncontaminated by cheese."
gordon@panix.com
