Unix Programming - How to pass network requests to existing child processes?

Jan Stap

2005-08-24, 6:11 pm

Hello all,

I am looking for a scheme where a UNIX process listening on a
TCP-socket can pass incoming network requests to existing child
processes (so as not to have to fork for each request).

My first try was to let the child processes also call accept() on
the listen socket (after finishing their initial job). Then I tried a
variant where all processes (parent and children) do select(), followed
by accept(). On Solaris 2.7 and 2.9, however, neither approach gives an
even division of work; on Linux (kernel 2.2.15) it does. But I
want my program to be portable (and use a documented way of working).

So I looked at file descriptor (fd) passing (parent sends fd returned
by accept() to given child process). A pipe to each child does not
scale well (2n fd's for n child processes), so I'm looking at UNIX
sockets.

Using UNIX sockets, I wonder whether a single socket file would suit
all processes (less files needed), but that would probably give me the
same trouble as I had in my multi-accept() scheme. And: even with
SO_REUSEADDR on, bind() simply fails if the socket file already exists.
A bind()/fork() scheme would circumvent that, but this bind() behavior
suggests that this is maybe not something one would want to use.
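
A common way around the bind() failure is to remove a stale socket file just before binding. The following is only a minimal sketch under that assumption: `listen_unix` and its parameters are names invented here, and the function deletes whatever sits at `path`, so it is appropriate only when that path belongs to this program.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a listening UNIX-domain stream socket at
 * `path`, removing a leftover socket file first. Returns the listening
 * descriptor, or -1 with errno set. */
int listen_unix(const char *path, int backlog)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (strlen(path) >= sizeof addr.sun_path) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);   /* length was checked above */

    /* A file left over from an earlier run makes bind() fail with
     * EADDRINUSE, so remove it; a missing file is not an error. */
    if (unlink(path) < 0 && errno != ENOENT) {
        close(fd);
        return -1;
    }
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling it twice with the same path succeeds both times, because the second call unlinks the file that the first one left behind.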

So I end up with the one-UNIX-socket-per-process setup. Detailing that,
it looks a bit expensive to create and delete a socket file with each
child being born and exiting (possible slowdown due to disk I/O). So I
guess I should reuse socket files as much as possible?

Advice on this or suggestions for better approaches are very welcome!

Thanks,

Jan Stap
Eindhoven, The Netherlands

moi

2005-08-24, 6:11 pm

Jan Stap wrote:
> Hello all,
>
> I am looking for a scheme where a UNIX process listening on a
> TCP-socket can pass incoming network requests to existing child
> processes (so as not to have to fork for each request).
>
> My first try was to let the child processes also call accept() on
> the listen socket (after finishing their initial job). Then I tried a
> variant where all processes (parent and children) do select(), followed
> by accept(). On Solaris 2.7 and 2.9, however, neither approach gives an
> even division of work; on Linux (kernel 2.2.15) it does. But I
> want my program to be portable (and use a documented way of working).
>
> So I looked at file descriptor (fd) passing (parent sends fd returned
> by accept() to given child process). A pipe to each child does not
> scale well (2n fd's for n child processes), so I'm looking at UNIX
> sockets.
>
> Using UNIX sockets, I wonder whether a single socket file would suit


[snip]

You could take a look at how apache handles this.
Apache is extremely portable. They seem to have several schemes to do
this, but fd-passing + pre-forked children is certainly one of them.

HTH,
AvK

Ulrich Hobelmann

2005-08-24, 6:11 pm

Jan Stap wrote:
> Hello all,
>
> I am looking for a scheme where a UNIX process listening on a
> TCP-socket can pass incoming network requests to existing child
> processes (so as not to have to fork for each request).


If you want to use processes, not threads, you can use sendmsg(2) to
pass the file descriptor to the child. But I wonder how you'd determine
which child is idle and which one is still busy processing requests.
Overall it seems like you'd need a good deal of IPC. Maybe forking
wouldn't be that much slower (did you measure the overhead? is it
really too much?).

(socketpair() before forking should work, I guess)
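
A minimal sketch of that combination, assuming a socketpair() created before fork() and the SCM_RIGHTS control message of sendmsg(2)/recvmsg(2). `send_fd` and `recv_fd` are hypothetical helper names. Older systems may need extra feature-test macros, or may expose `msg_accrights` instead of `msg_control`, and this sketch does not handle that.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Ancillary-data buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd_to_send over the UNIX-domain socket chan. Returns 0 or -1. */
int send_fd(int chan, int fd_to_send)
{
    char dummy = 'F';   /* one byte of ordinary data carries the message */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd_to_send >= 0);
    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from chan. Returns the new descriptor (valid
 * in the receiving process) or -1. */
int recv_fd(int chan)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(chan, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
    return fd;
}
```

The parent would call send_fd() on its end of the pair with the descriptor returned by accept() and then close its own copy; the child blocks in recv_fd() on the other end.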

> My first try was to let the child processes also call accept() on
> the listen socket (after finishing their initial job). Then I tried a
> variant where all processes (parent and children) do select(), followed
> by accept(). On Solaris 2.7 and 2.9, however, neither approach gives an
> even division of work; on Linux (kernel 2.2.15) it does. But I
> want my program to be portable (and use a documented way of working).


Maybe the equal workload distribution isn't important (or is it); only
that an available child quickly gets the connection.

> Advice on this or suggestions for better approaches are very welcome!


Hm, pthreads might work, but some people don't like them (I do). Just
have a structure where child threads can wait and let the parent post a
condition when a connection is ready (setting a slot in the structure).
Fast and simple. Avoid sharing anything else between threads; use
messaging like in the process case.

--
I believe in Karma. That means I can do bad things to people
all day long and I assume they deserve it.
Dogbert

Maxim Yegorushkin

2005-08-25, 7:49 am


Jan Stap wrote:

> My first try was to let the child processes also call accept() on
> the listen socket (after finishing their initial job).


This is what Apache prior to 2.x does. Does Apache work on your
Solaris?

Jan Stap

2005-08-25, 7:49 am


moi wrote:
> Jan Stap wrote:
> [snip]
>
> You could take a look at how apache handles this.
> Apache is extremely portable. They seem to have several schemes to do
> this, but fd-passing + pre-forked children is certainly one of them.
>
> HTH,
> AvK

I'll have a look, thanks!

Jan Stap

Jan Stap

2005-08-25, 7:49 am

Ulrich Hobelmann wrote:
> Jan Stap wrote:
>
> If you want to use processes, not threads, you can use sendmsg(2) to
> pass the file descriptor to the child. But I wonder how you'd determine
> which child is idle and which one is still busy processing requests.
> Overall it seems like you'd need a good deal of IPC. Maybe forking
> wouldn't be that much slower (did you measure the overhead? is it
> really too much?).
>
> (socketpair() before forking should work, I guess)


Yep, sendmsg() is what I want to use for transporting a file descriptor
over sockets. The child status I keep in an array of structures in shared
memory. Initially I planned to pass the file descriptor also via this
array, until I realized that the receiving side can do little with
that... Using socketpair() would give me n file descriptors on the
parent process side for n children, which I don't really like.

The overhead of 1 session per child is little for my application: the
program (an SMTP mail filter) is running on a Sun Fire V240 with 2G of
mem with Solaris 2.9; the filter is in loopback mode with Postfix
("after-queue" setup), which uses one SMTP session per email currently.
Mail load is about 15,000 messages/hour, thus giving about 4
connections/second to the filter. System load is <1.0 and about 1.5G of
memory is free. Ok, it's a big machine, but still.

But for usage on smaller machines it may pay off to pass more work to a
given child process. And I find it more elegant. Using it specifically
with Postfix, one could enable connection caching in Postfix (do >1
mail per SMTP session) and leave the filter at 1 session per child.

>
> Maybe the equal workload distribution isn't important (or is it); only
> that an available child quickly gets the connection.


Maybe a little bit: the more evenly the workload is spread, the fewer
fork() calls are needed in general (given a set idle timeout per child).

>
> Hm, pthreads might work, but some people don't like them (I do). Just
> have a structure where child threads can wait and let the parent post a
> condition when a connection is ready (setting a slot in the structure).
> Fast and simple. Avoid sharing anything else between threads; use
> messaging like in the process case.


I will try that at some point in my free time, but cannot spend the
number of hours needed for the conversion on my boss's time :-)

Thanks for your answer!

> --
> I believe in Karma. That means I can do bad things to people
> all day long and I assume they deserve it.
> Dogbert


Jan Stap

2005-08-25, 7:49 am


Maxim Yegorushkin wrote:
> Jan Stap wrote:
>
>
> This is what Apache prior to 2.x does. Does Apache work on your
> Solaris?


Are you sure? I did an strace (Linux, kernel 2.4.20) on the apache
processes (Apache 1.3.26; hardly any web traffic) and found:

- main process: loop with select() just for a one-second timeout,
  followed by a wait4(-1, .., WNOHANG, NULL); so just wait for a
  child to die as it seems
- one child process: select(31, [28 29 30], NULL, NULL, NULL)
  handle a connection?
- the other child processes: semop(..) (same semaphore for all)

I need to check the code to get more clarity. Btw. Solaris runs Apache
1.x ok (on a Solaris 2.8 box over here, but for which I don't have root
rights).

But thanks for your answer anyway!

Jan Stap

Maxim Yegorushkin

2005-08-25, 7:49 am


Jan Stap wrote:
> Maxim Yegorushkin wrote:
>
> Are you sure? I did an strace (Linux, kernel 2.4.20) on the apache
> processes (Apache 1.3.26; hardly any web traffic) and found:


Pretty sure, although it's been quite a while since I last browsed the
code.

> - main process: loop with select() just for a one-second timeout,
>   followed by a wait4(-1, .., WNOHANG, NULL); so just wait for a
>   child to die as it seems
> - one child process: select(31, [28 29 30], NULL, NULL, NULL)
>   handle a connection?
> - the other child processes: semop(..) (same semaphore for all)


IIRC, they protected the accept() call with a mutex/semaphore lock to
avoid the thundering herd problem on old kernels.
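
A rough sketch of that kind of serialization, assuming an fcntl() record lock on a file opened before fork() stands in for the semaphore seen in the strace output. `serialized_accept` and `set_lock` are illustrative names, not Apache functions, and this is not a copy of Apache's code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Acquire (type = F_WRLCK) or release (type = F_UNLCK) a whole-file
 * record lock on lock_fd, waiting if another process holds it. */
static int set_lock(int lock_fd, short type)
{
    struct flock fl = { .l_type = type, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };
    int rc;

    do {
        rc = fcntl(lock_fd, F_SETLKW, &fl);
    } while (rc < 0 && errno == EINTR);
    return rc;
}

/* Accept one connection while holding the lock, so that at most one
 * child process sits in accept() at any time. Returns the connected
 * descriptor, or -1 with errno set. */
int serialized_accept(int listen_fd, int lock_fd)
{
    int conn, saved;

    assert(listen_fd >= 0 && lock_fd >= 0);
    if (set_lock(lock_fd, F_WRLCK) < 0)
        return -1;

    do {
        conn = accept(listen_fd, NULL, NULL);
    } while (conn < 0 && errno == EINTR);

    saved = errno;                 /* keep accept()'s errno across unlock */
    set_lock(lock_fd, F_UNLCK);
    errno = saved;
    return conn;
}
```

Each pre-forked child would loop on serialized_accept(), serve the connection, close it, and come back for the next one. Record locks belong to the process, so children can share the descriptor they inherited for the lock file.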

Ulrich Hobelmann

2005-08-25, 6:04 pm

Jan Stap wrote:
> Yep, sendmsg() I want to use when using sockets for transporting a file
> descriptor. The child status I keep in an array of structures in shared
> memory. Initially I planned to pass the file descriptor also via this


shared memory? I find that much more ugly than pthreads, but YMMV.

> array, until I realized that the receiving side can do little with
> that... Using socketpair() would give me n file descriptors on the
> parent process side for n children, which I don't really like.


Then use pthreads ;)

>
> I will try that at some point in my free time, but cannot spend the
> number of hours needed for the conversion on my boss's time :-)


You just use pthread_create with a function, instead of fork and a
function (or exec). Instead of the shared memory (aaah!) you can use
something like this:

    struct Message {
        pthread_mutex_t mutex;
        pthread_cond_t messageArrived;
        /* and the message */
    };

and do simple setmessage and getmessage operations on that. Much faster
than IPC over pipes or sockets (at least on my system), no
de/serializing, and cleaner (IMHO) than shared memory.
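
The setmessage and getmessage operations could be sketched as follows. This is only one possible reading: it assumes the message is a single connected descriptor, adds a `full` flag and a `message_init` helper that are not in the post, and uses -1 as a made-up shutdown signal for the example worker.

```c
#include <assert.h>
#include <pthread.h>

struct Message {
    pthread_mutex_t mutex;
    pthread_cond_t messageArrived;  /* also signalled when the slot empties */
    int full;                       /* nonzero while fd is unclaimed */
    int fd;                         /* the message: a connected socket */
};

void message_init(struct Message *m)
{
    assert(m != NULL);
    pthread_mutex_init(&m->mutex, NULL);
    pthread_cond_init(&m->messageArrived, NULL);
    m->full = 0;
    m->fd = -1;
}

/* Producer side: wait until the slot is free, store fd, wake waiters. */
void setmessage(struct Message *m, int fd)
{
    pthread_mutex_lock(&m->mutex);
    while (m->full)
        pthread_cond_wait(&m->messageArrived, &m->mutex);
    m->fd = fd;
    m->full = 1;
    pthread_cond_broadcast(&m->messageArrived);
    pthread_mutex_unlock(&m->mutex);
}

/* Consumer side: wait until a message is present, take it, free the slot. */
int getmessage(struct Message *m)
{
    int fd;

    pthread_mutex_lock(&m->mutex);
    while (!m->full)
        pthread_cond_wait(&m->messageArrived, &m->mutex);
    fd = m->fd;
    m->full = 0;
    pthread_cond_broadcast(&m->messageArrived);
    pthread_mutex_unlock(&m->mutex);
    return fd;
}

struct Worker {
    struct Message *inbox;
    int handled;                    /* connections served so far */
};

/* Thread start routine for pthread_create(): serve until -1 arrives. */
void *worker_main(void *arg)
{
    struct Worker *w = arg;

    for (;;) {
        int fd = getmessage(w->inbox);
        if (fd < 0)
            break;                  /* shutdown signal of this sketch */
        /* ... serve the connection on fd, then close(fd) ... */
        w->handled++;
    }
    return NULL;
}
```

The accepting thread would call setmessage() with each descriptor returned by accept(), and each worker would be started with pthread_create() on worker_main. A single condition variable with broadcast keeps the sketch short; separate "not empty" and "not full" conditions would avoid some spurious wakeups.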

Also, if you use fork without exec right now, you don't have to close
all the parent's file descriptors, you don't have to free the parent's
memory and other resources, because it's only one address space. If the
parent frees some of its memory, it will go back to the OS, because
there are no other processes holding on to it (after forking you inherit
all the parent's crap).

--
I believe in Karma. That means I can do bad things to people
all day long and I assume they deserve it.
Dogbert

Jan Stap

2005-08-25, 6:04 pm

Ulrich Hobelmann wrote:

> Jan Stap wrote:
>
> shared memory? I find that much more ugly than pthreads, but YMMV.


I hear more people voicing their dislike of shared memory. Of
course one has to set the protection bits to prevent security leaks,
but what is so ugly about it?

[snip]

> You just use pthread_create with a function, instead of fork and a
> function (or exec). Instead of the shared memory (aaah!) you can use
> something like this:
> struct Message {
> pthread_mutex_t mutex;
> pthread_cond_t messageArrived;
> /* and the message */
> };
>
> and do simple setmessage and getmessage operations on that. Much faster
> than IPC over pipes or sockets (at least on my system), no
> de/serializing, and cleaner (IMHO) than shared memory.
>
> Also, if you use fork without exec right now, you don't have to close
> all the parent's file descriptors, you don't have to free the parent's
> memory and other resources, because it's only one address space. If the
> parent frees some of its memory, it will go back to the OS, because
> there are no other processes holding on to it (after forking you inherit
> all the parent's crap).


Threads seemed more complex and more kernel-dependent to me at first,
which led me to use the fork() scheme (I'm fairly new to UNIX
programming). But I want to learn, so I will try a threaded version,
once I find the time to do the conversion. For now, I need to extend
the fork()-based version that's now in production. But complexity with
pthreads seems limited, judging from what you write.

Thanks!

Jan Stap

> --
> I believe in Karma. That means I can do bad things to people
> all day long and I assume they deserve it.
> Dogbert


Ulrich Hobelmann

2005-08-26, 7:55 am

Jan Stap wrote:
> Ulrich Hobelmann wrote:
>
>
> I hear more people speaking out their dislike of shared memory. Of
> course one has to set the protection bits to prevent security leaks,
> but what is so ugly about it?


I can't speak for the others, but to me the point of using processes
instead of threads is isolation. You pay a small price for that
(slightly higher context switching; nothing to worry about). I've never
really used shm, but I think shm segments even persist if your processes
die, so there's the cleanup problem; sharing memory between threads just
means that you pass some address to both (or more) threads before
creating them, so they know about it (or you use global variables).

Then there's the security that you mention; threads don't share their
memory with anything outside the process.

To me shm feels like circumventing everything a process was designed for.

> Threads seemed more complex and more kernel-dependent to me at first,


True. The BSDs didn't have thread support until rather recently.
That's why I always hated them, too ;)

> which led me to use the fork() scheme (I'm fairly new to UNIX
> programming). But I want to learn, so I will try a threaded version,


In general it's good to know how to create and communicate between
processes. Maybe too many people don't, and often it would do a program
good to separate it into several processes (even programs). To me a
(web) server isn't one of those cases. YMMV. Both fork and threads are
basically fine, with their individual tradeoffs.

> once I find the time to do the conversion. For now, I need to extend
> the fork()-based version that's now in production. But complexity with
> pthreads seems limited, judging from what you write.


Only if you take care (control all communication between threads). So
if the forking version works fine, just use that one.

--
I believe in Karma. That means I can do bad things to people
all day long and I assume they deserve it.
Dogbert

Jan Stap

2005-08-26, 7:55 am

FYI: a separate thread on this topic is running at:
http://www.developerweb.net/forum/s...17930#post17930

Two interesting approaches came up there, one of which has a
variant in which no separate parent process is required and all
processes behave identically. Interesting stuff!

Cheers,

Jan Stap

Jan Stap

2005-08-26, 7:55 am


Ulrich Hobelmann wrote:

[snip]
>
> I can't speak for the others, but to me the point of using processes
> instead of threads is isolation. You pay a small price for that
> (slightly higher context switching; nothing to worry about).


That's a good point.

> I've never
> really used shm, but I think shm segments even persist if your processes
> die, so there's the cleanup problem; sharing memory between threads just
> means that you pass some address to both (or more) threads before
> creating them, so they know about it (or you use global variables).


Cleaning up is not so much of an issue: after creating the shm segment,
shmctl(shmid, IPC_RMID, ...) marks it for deletion, i.e. it is deleted
after the last reference to it is removed.
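
That sequence could look roughly like this, assuming a private System V segment created by the parent before fork() so that the children inherit the attachment. `shm_alloc_autoremove` is a made-up helper name, and mode 0600 is just one choice for the protection bits.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: create a private segment of `size` bytes, attach
 * it, and immediately mark it for removal. Returns the mapped address,
 * or NULL on failure. */
void *shm_alloc_autoremove(size_t size)
{
    int shmid;
    void *addr;

    assert(size > 0);
    shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid < 0)
        return NULL;

    /* Attach first: not every system allows new attaches once the
     * segment has been marked for removal. */
    addr = shmat(shmid, NULL, 0);

    /* Mark for deletion in either case. The segment is destroyed once
     * the last attached process detaches or exits, so nothing is left
     * behind if the processes die. */
    shmctl(shmid, IPC_RMID, NULL);

    if (addr == (void *)-1)
        return NULL;
    return addr;
}
```

Children forked after this call see the same memory at the same address; the parent (or any child) can drop its reference early with shmdt().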

> Then there's the security that you mention; threads don't share their
> memory with anything outside the process.
>
> To me shm feels like circumventing everything a process was designed for


Indeed there is no kernel-provided protocol on how to share this
memory, apart from the user-group-others protection bits set on it.
This in contrast to semaphores and message queues, which do not open up
process memory space.

[snip]
> In general it's good to know how to create and communicate between
> processes. Maybe too many people don't, and often it would do a program
> good to separate it into several processes (even programs). To me a
> (web) server isn't one of those cases. YMMV. Both fork and threads are
> basically fine, with their individual tradeoffs.


I think Postfix is a good example of splitting functionality into
separate processes for security's sake (use small components and each
component has no more system rights than it needs).

>
> Only if you take care (control all communication between threads). So
> if the forking version works fine, just use that one.


I see what I will do; it's still a good test case to get experience
with threads.

Cheers,

Jan Stap
