03-16-07 12:24 AM
Sunil Varma <sunil.s51@gmail.com> wrote:
> I'm running a process.
> In that I call fork().
> In the fork() I call exec() to another binary.
> In the child process I catch all the signals and respective log
> messages will get printed, except SIGKILL and SIGSTOP.
> The child process gets killed by itself after 17 hours or so.
> I've checked for any memory leaks, but no memory leaks found in
> valgrind.
> So, what I feel is the child is getting a SIGKILL where in which the
> process terminates immediately.
> Actually I've opened a file and closing and deleting it in any
> occurrence of a signal whose default action is to terminate.
> Even that file is not getting deleted from the system.
> And also I've registered a function with atexit() to check if the
> process is getting terminated on reaching the end of main().
> Even this is also not happening.
> The most important problem in debugging is the time to reproduce the
> issue.
Can you wait(2) or waitpid(2) for the child process? In that case
you could check the exit status of the child for an indication of
whether it really got killed by a signal: use the WIFSIGNALED()
macro on the exit status to see if it was killed by an uncaught
signal, and then use WTERMSIG(), also on the exit status, to find
out which signal it received. Since you write that the child
process is dead, one can probably rule out SIGSTOP, since that
only stops the process and doesn't kill it.
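
Something along the following lines in the parent would show what
happened to the child. This is only a rough sketch: the function name
run_and_report() is made up for illustration, and I'm assuming the
parent can afford to block in waitpid() until the child is gone.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec argv[0] in the child and wait for it.
 * Returns the number of the signal that killed the child, 0 if the
 * child terminated by itself (exit() or return from main()), and -1
 * on errors. */
int run_and_report(char *const argv[])
{
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: replace the process image with the other binary */
        execvp(argv[0], argv);
        perror("execvp");      /* only reached if exec failed */
        _exit(127);
    }

    /* Parent: wait for the child, retrying if waitpid() itself gets
     * interrupted by a signal handler running in the parent */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }

    if (WIFSIGNALED(status)) {
        int sig = WTERMSIG(status);
        fprintf(stderr, "child %ld was killed by signal %d%s\n",
                (long) pid, sig, sig == SIGKILL ? " (SIGKILL)" : "");
        return sig;
    }

    if (WIFEXITED(status))
        fprintf(stderr, "child %ld exited with status %d\n",
                (long) pid, WEXITSTATUS(status));

    return 0;
}
```

Called with an argv-style array, e.g. { "./child_binary", NULL }
(the name is a placeholder), a return value equal to SIGKILL would
confirm the suspicion, while 0 would mean the child terminated on
its own.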
The obvious question, of course, is why the process should get a
SIGKILL signal at all. Is the process perhaps running out of a
resource, e.g. because there's a limit on the amount of CPU time
a process can use and the process has used it all up? Or is
memory overcommitment perhaps switched on, and the process
allocates a huge amount of memory at the start but only gets
around to using it in increasing amounts the longer it runs, so
that the OOM killer eventually shoots it down?
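
To check the CPU time theory you could have the child print its own
limits right after it starts. Again just a sketch, with a made-up
function name. As far as I know, on Linux a process that exceeds the
soft RLIMIT_CPU limit is sent SIGXCPU (which you say you catch), and
when it reaches the hard limit it is sent SIGKILL.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Print one limit value, taking care of the "unlimited" case */
static void print_limit(const char *name, rlim_t value)
{
    if (value == RLIM_INFINITY)
        fprintf(stderr, "%s CPU time limit: unlimited\n", name);
    else
        fprintf(stderr, "%s CPU time limit: %llu seconds\n",
                name, (unsigned long long) value);
}

/* Hypothetical helper: reports the CPU time limits of the calling
 * process. Returns 1 if a finite soft or hard limit is set, 0 if
 * both are unlimited, and -1 on errors. */
int cpu_limit_is_set(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CPU, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }

    print_limit("soft", rl.rlim_cur);
    print_limit("hard", rl.rlim_max);

    return rl.rlim_cur != RLIM_INFINITY || rl.rlim_max != RLIM_INFINITY;
}
```

If this reports a finite limit somewhere in the region of the CPU
time the child has used up after those 17 hours, that would be a
strong hint. For the OOM killer theory the kernel log is probably
the first place to look.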
Regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de