|
Home > Archive > Unix Programming > April 2007 > fork freeze
|
|
| m4r3ck 2007-04-11, 1:21 pm |
| Hi,
I have a problem with fork() freezing in my application, and I would
appreciate some help.
I've got a process that forks a reader process and some performer
processes; they communicate through sockets (read and write functions).
The main process works like a "man-in-the-middle" (the child processes
send it data, and it passes the data on).
I have a signal handler for the child processes: if one of them crashes,
the main process forks it again ... until all the data has been
processed (remembering the bad data and excluding it).
Everything works fine (the socket communication, and the child processes
work as they should). But when a performer process has crashed and I try
to fork it again, the parent process freezes on this fork call, as if
there were a deadlock (=> it stops managing communication from the other
running child processes).
The crashed child process is terminated by signal 8 (floating-point
exception).
I cannot figure out what the blocking element could be; maybe something
with the sockets?
Has anyone had the same problem?
| |
| Gianni Mariani 2007-04-11, 1:21 pm |
| m4r3ck wrote:
> But when a performer process has crashed and I try to fork it again,
> the parent process freezes on this fork call, as if there were a
> deadlock (=> it stops managing communication from the other running
> child processes).
>
> The crashed child process is terminated by signal 8 (floating-point
> exception).
Did you try strace? It would tell you which system call your process
is freezing on.
| |
| m4r3ck 2007-04-12, 7:21 am |
| On Apr 11, 4:00 pm, Gianni Mariani <gi3nos...@mariani.ws> wrote:
> Did you try to strace ? It would help you to know which system call
> your process is freezing on.
Yes, it was the system call fork().
| |
| Maxim Yegorushkin 2007-04-12, 7:21 am |
| On Apr 11, 1:45 pm, "m4r3ck" <Marek.Bujnov...@gmail.com> wrote:
> I have a signal handler for the child processes: if one of them
> crashes, the main process forks it again ...
>
> But when a performer process has crashed and I try to fork it again,
> the parent process freezes on this fork call, as if there were a
> deadlock.
Are you calling fork() from a signal handler? If so, it might be
pthread_atfork() handlers calling functions that are not
async-signal-safe, or something like that.
http://www.opengroup.org/onlinepubs...tions/fork.html
<q>
A process shall be created with a single thread. If a multi-threaded
process calls fork(), the new process shall contain a replica of the
calling thread and its entire address space, possibly including the
states of mutexes and other resources. Consequently, to avoid errors,
the child process may only execute async-signal-safe operations until
such time as one of the exec functions is called.
[THR] [Option Start] Fork handlers may be established by means of the
pthread_atfork() function in order to maintain application invariants
across fork() calls. [Option End]
When the application calls fork() from a signal handler and any of the
fork handlers registered by pthread_atfork() calls a function that is
not asynch-signal-safe, the behavior is undefined.
</q>
| |
| William Ahern 2007-04-12, 7:21 am |
| m4r3ck <Marek.Bujnovsky@gmail.com> wrote:
<snip>
This is assuredly a bug in your code; the next feasible--but very
slight--possibility would be an issue w/ how your execution environment is
set up (weird kernel modules, etc.).
> yes, it was the kernel call: fork()
More often than not, when a process locks up on *me* it's actually simply
stuck in a loop. strace only shows system calls, so it gives no coverage of
the vast majority of your code, and leaving aside mitigating factors,
statistically the bugs are more likely to appear in your own code. The fact
that fork() was the _last_ syscall doesn't say much by itself (is it really
pending, as in strace doesn't show the return value?).
In this particular case you should try to attach to the frozen process with
GDB (if possible), or another debugger of your choice. For gdb:
gdb -p <PID OF FROZEN PROCESS>
Then just get a backtrace using the `bt' command. That will give you a
better place to begin looking. I would be surprised if it shows that you're
_still_ at the fork() call.
After you resolve that issue, or maybe while resolving it, you should
definitely employ valgrind (or purify). I bet it will find at least a few
interesting bugs.
| |
| m4r3ck 2007-04-12, 7:21 am |
| On Apr 12, 9:55 am, "Maxim Yegorushkin" <maxim.yegorush...@gmail.com>
wrote:
> Are you calling fork() from a signal handler? If so, it might be
> pthread_atfork() handlers calling functions that are not
> async-signal-safe, or something like that.
Oh, maybe I used the wrong terms.
By a signal handler I mean that the parent process makes this call:
signal(SIGCHLD, signal_handler_func)
I don't use threads explicitly at all.
Progress info:
I've tried replacing the fork call with vfork. It didn't freeze, but I
have a problem sharing some data between the parent and child processes
(through a message queue).
Thanks for the response.
| |
| Maxim Yegorushkin 2007-04-12, 7:21 am |
| On Apr 12, 9:54 am, "m4r3ck" <Marek.Bujnov...@gmail.com> wrote:
> Oh, maybe I used the wrong terms.
> By a signal handler I mean that the parent process makes this call:
> signal(SIGCHLD, signal_handler_func)
> I don't use threads explicitly at all.
Do you call fork() from the signal handler, that is, from
signal_handler_func?
| |
| m4r3ck 2007-04-12, 7:21 am |
| On Apr 12, 11:25 am, "Maxim Yegorushkin" <maxim.yegorush...@gmail.com>
wrote:
> Do you call fork() from the signal handler, that is, from
> signal_handler_func?
yes
| |
| David Schwartz 2007-04-13, 1:24 am |
| On Apr 12, 4:18 am, "m4r3ck" <Marek.Bujnov...@gmail.com> wrote:
> yes
That is a recipe for disaster. If you need to get the same effect as
this, either set a flag in the signal handler and act on it in normal
code once the handler has returned, or block a thread on the signal and
handle the signal synchronously. It requires significant expertise to do
any "real work" from a signal handler that was invoked asynchronously.
DS
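A minimal sketch of the first suggestion (set a flag in the handler, do the work afterwards), assuming a single-threaded parent. The names got_sigchld, install_sigchld_handler, spawn_performer and reap_children are hypothetical, and the crashing performer is simulated with raise(SIGFPE); this is not the original poster's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Written by the handler, read by the main loop. */
static volatile sig_atomic_t got_sigchld = 0;

/* The handler only records that SIGCHLD arrived; it does no real work. */
static void on_sigchld(int signo)
{
    (void)signo;
    got_sigchld = 1;
}

/* Install the handler with sigaction(). Returns 0 on success, -1 on error. */
int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* no SA_RESTART: a blocking read() fails with EINTR,
                        so the main loop gets a chance to check the flag */
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical performer: the child either crashes with SIGFPE or exits
   cleanly. Called from normal code, never from the handler. */
pid_t spawn_performer(int crash)
{
    pid_t pid = fork();

    if (pid == 0) {
        if (crash)
            raise(SIGFPE); /* default action terminates the child */
        _exit(0);
    }
    return pid; /* child's pid in the parent, -1 if fork() failed */
}

/* Called from the main loop when got_sigchld is set. Reaps every finished
   child and returns how many of them were killed by a signal. */
int reap_children(void)
{
    int status;
    int crashed = 0;

    got_sigchld = 0; /* clear first so a SIGCHLD arriving now is not lost */
    while (waitpid(-1, &status, WNOHANG) > 0) {
        if (WIFSIGNALED(status))
            crashed++;
    }
    return crashed;
}
```

In the parent's main loop this would be used roughly as: if got_sigchld is set, call reap_children() and then call spawn_performer() once per crashed child. If the loop blocks in select() or read(), there is still a small window between checking the flag and blocking; pselect() or a self-pipe is the usual way to close it.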
| |
| Rainer Weikusat 2007-04-13, 7:16 am |
| "David Schwartz" <davids@webmaster.com> writes:
> That is a recipe for disaster.
fork is supposed to be async-signal safe.
| |
| m4r3ck 2007-04-13, 7:16 am |
| OK, I've solved it by using the vfork call and replacing the message queue
with something else.
Thanks for the help.
| |
| Maxim Yegorushkin 2007-04-13, 7:16 am |
| On 13 Apr, 11:15, "m4r3ck" <Marek.Bujnov...@gmail.com> wrote:
> OK, I've solved it by using the vfork call and replacing the message queue
> with something else.
Just in case: the only thing the child can safely do after vfork is call
one of the exec functions (or _exit).
http://www.opengroup.org/onlinepubs...ions/vfork.html
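A minimal sketch of that restriction, assuming the goal is simply to start another program: the vfork() child calls only execv() and, if that fails, _exit(). The helper name run_program is hypothetical. Because the child may share the parent's memory until it execs or exits, this pattern cannot be used to run the performer's own code in the child.

```c
#define _DEFAULT_SOURCE 1 /* glibc feature-test macro so vfork() is declared */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start the program at `path` with argument vector `argv` and wait for it.
   Returns the wait status, or -1 if vfork() or waitpid() failed. */
int run_program(const char *path, char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: nothing but exec, and _exit() if exec fails. */
        execv(path, argv);
        _exit(127);
    }
    /* Parent: continues once the child has called exec or _exit. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

The caller can inspect the result with WIFEXITED() and WEXITSTATUS(); in this sketch an exit status of 127 is used to signal that the exec itself failed.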
| |
| Bin Chen 2007-04-13, 7:16 am |
| On Apr 13, 4:00 pm, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
> fork is supposed to be async-signal safe.
Maxim has made the reason clear by quoting the POSIX standard; see his
reply on the 12th.
| |
| Rainer Weikusat 2007-04-13, 1:21 pm |
| "Bin Chen" <binary.chen@gmail.com> writes:
> Maxim has made the reason clear by quoting the POSIX standard; see his
> reply on the 12th.
What makes you think I was replying to that, considering that I
didn't?
| |
| David Schwartz 2007-04-13, 7:18 pm |
| On Apr 13, 1:00 am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
> fork is supposed to be async-signal safe.
That's not the problem. The problem is that the child process is in a
context in which only async-signal safe functions can be used.
DS
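One way to avoid creating the child in that restricted context is the synchronous approach suggested earlier in the thread: block SIGCHLD and collect it with sigwait(), so that fork() is only ever called from ordinary code. A minimal sketch, assuming a single-threaded parent; the names block_sigchld, spawn_performer and wait_and_reap are hypothetical. The no-op handler is a precaution, since POSIX leaves it unspecified whether a blocked signal whose action is to ignore stays pending.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Does nothing; installed only so SIGCHLD is not left at its default
   (ignore) action while it is blocked. */
static void noop_handler(int signo)
{
    (void)signo;
}

/* Block SIGCHLD so it stays pending until sigwait() collects it.
   Fills *set with the signal set to pass to wait_and_reap(). */
int block_sigchld(sigset_t *set)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) != 0)
        return -1;

    sigemptyset(set);
    sigaddset(set, SIGCHLD);
    return sigprocmask(SIG_BLOCK, set, NULL);
}

/* Hypothetical performer: the child either crashes with SIGFPE or exits
   cleanly. */
pid_t spawn_performer(int crash)
{
    pid_t pid = fork();

    if (pid == 0) {
        if (crash)
            raise(SIGFPE); /* default action terminates the child */
        _exit(0);
    }
    return pid;
}

/* Wait synchronously for SIGCHLD, then reap every finished child.
   Returns how many were killed by a signal, or -1 if sigwait() failed. */
int wait_and_reap(const sigset_t *set)
{
    int signo;
    int status;
    int crashed = 0;

    if (sigwait(set, &signo) != 0)
        return -1;
    while (waitpid(-1, &status, WNOHANG) > 0) {
        if (WIFSIGNALED(status))
            crashed++;
    }
    return crashed;
}
```

Since wait_and_reap() returns in normal program context, the caller is free to call spawn_performer() again for each crashed child without any async-signal-safety concerns.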
|