Apache Mod-Python - Server Shutdown and register_cleanup


David Fraser

2006-08-24, 1:16 pm

David Fraser wrote:
> Graham Dumpleton wrote:
> So you are saying:
> 1) There is a mechanism for cleaning up code
> 2) This mechanism is not reliable
> 3) Since databases have to assume clients are not reliable, they clean
> up for them anyway
> 4) Therefore we should not even try to clean up
>
> I'm with you on points 1, 2, and 3, but I think point 4 is taking it a
> bit too far...
> Surely there must be *some* value in trying to clean up behind
> yourself, sometimes?

Hi

I thought it would be good to take this across to python-dev. I've read
through
https://issues.apache.org/jira/brow...ON-109?page=all
and the discussion in
http://www.modpython.org/pipermail/...ary/019865.html
http://www.modpython.org/pipermail/...ary/019866.html and
http://www.modpython.org/pipermail/...ary/019870.html
again, and I'm just not sure about this.

Basically, Apache seems to provide some sort of mechanism for child
processes to clean themselves up, and for modules to clean up their
resources in a particular child.

The argument to remove the ability to clean up Python objects seems to
be that:
A) The finalize method was being called in an awkward place (from inside
a signal handler) and other code may be running and have the GIL, so it
may not be called at all, even in a graceful shutdown.
B) A normal restart will just send a TERM signal, which doesn't give
proper opportunity for cleanup
C) If the graceful shutdown doesn't work or respond quickly, Apache will
just kill the process anyway, so we may as well live with being killed
(talk about mixed metaphors...)
D) Since databases etc have to deal with the client process being
killed, they generally will handle this

I accept that problem A with the finalizing methods is a real problem,
but wonder if there are alternate solutions that can be provided to
allow cleanups to be attempted.
I don't think that B or C is a good argument - in that case why would
Apache be providing the hooks to clean up anyway? It feels like throwing
in the towel...
And D just seems impolite - if we can try and clean up we should.
Of course, if we can't manage to call finalize methods even in a
graceful shutdown none of this may be possible...

Trying to find relevant info on this from the Apache docs and other
module documentation:
http://httpd.apache.org/docs/2.2/st...ml#gracefulstop
talks about advising children to exit after their current request. In
this case it would seem the cleanup methods should get called at the end
of the request processing, and thus shouldn't be in a signal handler
(and there should be no other Python code executing...)
http://www.apachetutor.org/dev/pools
talks about using pools to allocate/deallocate resources other than
memory - could we provide a way to register Python objects that need
cleanup using this mechanism?

Am I barking up the wrong tree or is this worth investigating further?
David

Jim Gallacher

2006-08-24, 1:16 pm

David Fraser wrote:
> David Fraser wrote:
> Hi
>
> I thought it would be good to take this across to python-dev. I've read
> through
> https://issues.apache.org/jira/brow...ON-109?page=all
> and the discussion in
> http://www.modpython.org/pipermail/...ary/019865.html
> http://www.modpython.org/pipermail/...ary/019866.html and
> http://www.modpython.org/pipermail/...ary/019870.html
> again, and I'm just not sure about this.
>
> Basically, Apache seems to provide some sort of mechanism for child
> processes to clean themselves up, and for modules to clean up their
> resources in a particular child.
>
> The argument to remove the ability to clean up Python objects seems to
> be that:
> A) The finalize method was being called in an awkward place (from inside
> a signal handler) and other code may be running and have the GIL, so it
> may not be called at all, even in a graceful shutdown.
> B) A normal restart will just send a TERM signal, which doesn't give
> proper opportunity for cleanup
> C) If the graceful shutdown doesn't work or respond quickly, Apache will
> just kill the process anyway, so we may as well live with being killed
> (talk about mixed metaphors...)
> D) Since databases etc have to deal with the client process being
> killed, they generally will handle this
>
> I accept that problem A with the finalizing methods is a real problem,
> but wonder if there are alternate solutions that can be provided to
> allow cleanups to be attempted.
> I don't think that B or C is a good argument - in that case why would
> Apache be providing the hooks to clean up anyway? It feels like throwing
> in the towel...
> And D just seems impolite - if we can try and clean up we should.
> Of course, if we can't manage to call finalize methods even in a
> graceful shutdown none of this may be possible...
>
> Trying to find relevant info on this from the Apache docs and other
> module documentation:
> http://httpd.apache.org/docs/2.2/st...ml#gracefulstop
> talks about advising children to exit after their current request. In
> this case it would seem the cleanup methods should get called at the end
> of the request processing, and thus shouldn't be in a signal handler
> (and there should be no other Python code executing...)


Except that the parent "advises" its children by sending a signal,
doesn't it?

> http://www.apachetutor.org/dev/pools
> talks about using pools to allocate/deallocate resources other than
> memory - could we provide a way to register Python objects that need
> cleanup using this mechanism?


That *is* the mechanism that mod_python uses to register cleanups.
req.register_cleanup uses the request pool, and apache.register_cleanup
uses the server pool (child_init_pool).
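
To make the pool idea concrete, here is a toy model in plain Python. It is
not the APR or mod_python API: the Pool class, its method names and the
most-recent-first ordering are all assumptions of this sketch. It only shows
the shape of the mechanism: callbacks are registered against a pool and run
when that pool is destroyed, with the request pool dying long before the
child pool does.

```python
class Pool:
    """Toy stand-in for an APR memory pool; not the real APR or mod_python API."""

    def __init__(self, name):
        self.name = name
        self._cleanups = []

    def register_cleanup(self, func, data=None):
        # Remember the callback; nothing runs until the pool is destroyed.
        self._cleanups.append((func, data))

    def destroy(self):
        # Run callbacks most recent first (an assumption of this sketch),
        # each at most once.
        while self._cleanups:
            func, data = self._cleanups.pop()
            func(data)


log = []
server_pool = Pool("child")      # lives as long as the child process
request_pool = Pool("request")   # lives for a single request

server_pool.register_cleanup(log.append, "close database connection")
request_pool.register_cleanup(log.append, "delete temp file")
request_pool.register_cleanup(log.append, "release lock")

request_pool.destroy()   # end of request
server_pool.destroy()    # child shutdown
print(log)
```

The request-level callbacks run in ordinary request-handling code. The
server-level ones only run when the child pool is torn down, which is the
part this thread is worried about.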

> Am I barking up the wrong tree or is this worth investigating further?
> David


It's worth investigating. There may be a solution, but we just can't see
it. I don't think anyone would dispute that the current proposal to drop
the server cleanup is sub-optimal, but the current implementation is
worse than having no cleanup at all.

Really though, isn't this whole discussion actually about database
connection pooling? Doesn't that cover 99% of the cases people care
about? If so maybe our energies would be better focused on what may be
required to support mod_dbd within mod_python.

http://httpd.apache.org/docs/2.2/mod/mod_dbd.html
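
For comparison, this is roughly what pooling looks like when it is done
purely on the Python side with a DB-API driver. It is a minimal sketch, not
mod_dbd and not anything mod_python ships: ConnectionPool and its methods are
made-up names, and sqlite3 is used only because it is in the standard
library. The point is that close_all() is exactly the kind of call one would
want to hang off a child-exit cleanup.

```python
import queue
import sqlite3


class ConnectionPool:
    """Minimal DB-API connection pool; sqlite3 is only a stand-in driver."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        self._all = []
        for _ in range(size):
            conn = connect()
            self._all.append(conn)
            self._idle.put(conn)

    def acquire(self):
        # Blocks until a connection is free.
        return self._idle.get()

    def release(self, conn):
        self._idle.put(conn)

    def close_all(self):
        # Intended to be called once, from a child-exit cleanup.
        for conn in self._all:
            conn.close()
        self._all.clear()


def connect():
    # check_same_thread=False lets a pooled connection move between threads.
    return sqlite3.connect(":memory:", check_same_thread=False)


pool = ConnectionPool(connect, size=2)

conn = pool.acquire()
try:
    cur = conn.cursor()
    cur.execute("SELECT 1 + 1")
    rows = cur.fetchall()
finally:
    pool.release(conn)

pool.close_all()
print(rows)
```

If the child is killed before close_all() runs, the database server has to
notice the dropped connections on its own, which is point D from earlier in
the thread.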

Jim

David Fraser

2006-08-24, 1:16 pm

Jim Gallacher wrote:
> David Fraser wrote:
> Except that the parent "advises" its children by sending a signal,
> doesn't it?

On Unix it does, but I'm not sure about Win32. Anyway, if the exit is not
actually done from the signal handler, but the signal handler is simply
flagging that an exit should be done after the current request, then the
cleanup could be done alongside the exit and outside of the signal
handler...
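
The pattern being described can be sketched in plain Python as follows. This
is only an illustration of the idea, not how Apache or mod_python is
structured: serve(), register_cleanup() and install() are hypothetical names,
and whether the child process really behaves this way is the open question
in this thread. The handler does nothing but set a flag; the cleanups run
later, in ordinary code, once the current request has finished.

```python
import signal

shutdown_requested = False
cleanups = []


def request_shutdown(signum, frame):
    # Signal handler: only records that a shutdown was asked for.
    global shutdown_requested
    shutdown_requested = True


def register_cleanup(func, *args):
    cleanups.append((func, args))


def serve(requests, handle):
    """Handle requests until shutdown is flagged, then run the cleanups."""
    handled = 0
    for request in requests:
        handle(request)
        handled += 1
        if shutdown_requested:
            break  # the current request is finished, so stop here
    # Ordinary code path, outside any signal handler.
    while cleanups:
        func, args = cleanups.pop()
        func(*args)
    return handled


def install():
    # Hypothetical wiring; a real child would do this once at startup.
    signal.signal(signal.SIGTERM, request_shutdown)
```

Because the cleanups run from serve() rather than from the handler, no
other request code is executing in that process when they are called.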
> That *is* the mechanism that mod_python uses to register cleanups.
> req.register_cleanup uses the request pool, and
> apache.register_cleanup uses the server pool (child_init_pool).

Good then :-)
> It's worth investigating. There may be a solution, but we just can't
> see it. I don't think anyone would dispute that the current proposal to
> drop the server cleanup is sub-optimal, but the current implementation
> is worse than having no cleanup at all.

OK great that's reassuring. I forgot to mention in the above email the
mod_perl documentation that seems to indicate that mod_perl does this:
http://modperlbook.org/html/ch05_03.html
http://162.105.203.19/apache-doc/24.htm#BIN67
http://162.105.203.19/apache-doc/79.htm#BIN172
> Really though, isn't this whole discussion actually about database
> connection pooling? Doesn't that cover 99% of the cases people care
> about? If so maybe our energies would be better focused on what may be
> required to support mod_dbd within mod_python.

Database connection pooling does seem a large amount of it, but we also
do other things from within Apache like launching separate index
processes or running things like Excel via COM. At the moment the
indexing process watches the parent process and exits when it does, but
it might be quite nice to be able to tell the child process it should
exit explicitly.
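
As a rough sketch of the explicit version: the function below is what a
server cleanup might call to stop a helper process. The helper here is just
a Python child that sleeps, standing in for the real indexer, and
start_indexer() and stop_indexer() are hypothetical names.

```python
import subprocess
import sys


def start_indexer():
    # Stand-in for the real indexing process: a child that just sleeps.
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])


def stop_indexer(proc, timeout=5.0):
    """Ask the helper to exit, and kill it if it does not."""
    if proc.poll() is None:
        proc.terminate()
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return proc.returncode


indexer = start_indexer()
exit_code = stop_indexer(indexer)
```

Keeping the "watch the parent" behaviour in the helper as well still makes
sense, since the cleanup is not guaranteed to run.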
> http://httpd.apache.org/docs/2.2/mod/mod_dbd.html

Although it may be awkward to use mod_dbd with its limited set of
database drivers and functionality, when there is the Python DB-API ...
I've looked at the mod_dbd documentation before - how do you even
execute a SQL statement and retrieve the rows? Maybe I'm missing how it
works...
But if we could get mod_dbd to manage Python DB-API connections and pool
them, now that would be cool as it would require minimal changes to
existing Python code...

David

Jim Gallacher

2006-08-24, 7:12 pm

David Fraser wrote:
> Jim Gallacher wrote:
> On Unix it does, but I'm not sure about Win32.


I'm not sure about Win32 either, since it doesn't have any child
processes...

> Anyway, if the exit is not
> actually done from the signal handler, but the signal handler is simply
> flagging that an exit should be done after the current request, then the
> cleanup could be done alongside the exit and outside of the signal
> handler...
> Good then :-)
> OK great that's reassuring. I forgot to mention in the above email the
> mod_perl documentation that seems to indicate that mod_perl does this:
> http://modperlbook.org/html/ch05_03.html


Interesting, in as much as it touches on the problem we are trying to
solve here. See section 5.3.2.

> http://162.105.203.19/apache-doc/24.htm#BIN67
> http://162.105.203.19/apache-doc/79.htm#BIN172


I've been reading this book, "Writing Apache Modules with Perl and C",
the last couple of days. It's a darn good yarn, even if I did figure
out who-done-it by the end of the first chapter. As I'm reading I keep
having a recurring fantasy... "wouldn't it be great to have this kind of
resource for mod_python"? I think I need to get out more.

What you need to realize is that mod_python is not doing anything
exotic. We are all playing in the same sandbox by the same rules imposed
by apache. Callbacks for things like child initialization and exit, or
any other phase get triggered the same way in any module. What we are
bumping into with this particular bug is a limitation of the python
interpreter, and the whole GIL problem.

> Database connection pooling does seem a large amount of it, but we also
> do other things from within Apache like launching separate index
> processes or running things like Excel via COM. At the moment the
> indexing process watches the parent process and exits when it does, but
> it might be quite nice to be able to tell the child process it should
> exit explicitly.
> Although it may be awkward to use mod_dbd with its limited set of
> database drivers and functionality, when there is the Python DB-API ...
> I've looked at the mod_dbd documentation before - how do you even
> execute a SQL statement and retrieve the rows? Maybe I'm missing how it
> works...
> But if we could get mod_dbd to manage Python DB-API connections and pool
> them, now that would be cool as it would require minimal changes to
> existing Python code...


I've only taken a cursory glance at mod_dbd and the underlying apr_dbd,
and only in a few stolen moments during the day today, but my gut tells
me that it may not be as simple as I had hoped this morning. I have a
feeling that we might actually need to write a Python DB-API wrapper
around the apr_dbd_* calls. This would certainly be a non-trivial thing,
but would be kinda cool. Having this wrapper would allow us to add sql
database functionality (I'm thinking about the often discussed
SQL-Session subclass) without worrying about any particular database
dependency, and would likely be a real boon for mod_python.

My guess is we won't see a Python DB-API wrapper magically appearing
from outside of the mod_python community, so if we want it... start
hacking. It might be something to consider for 3.4. It could also
make a nice Google Summer of Code project for next year if nothing
happens in the interim.

Jim

Graham Dumpleton

2006-08-25, 7:13 am

David Fraser wrote ..
> Trying to find relevant info on this from the Apache docs and other
> module documentation:
> http://httpd.apache.org/docs/2.2/st...ml#gracefulstop
> talks about advising children to exit after their current request. In
> this case it would seem the cleanup methods should get called at the end
> of the request processing, and thus shouldn't be in a signal handler
> (and there should be no other Python code executing...)
> http://www.apachetutor.org/dev/pools
> talks about using pools to allocate/deallocate resources other than
> memory - could we provide a way to register Python objects that need
> cleanup using this mechanism?
>
> Am I barking up the wrong tree or is this worth investigating further?
> David


If I remember correctly, it is only in the parent Apache process that doing
stuff inside of the signal handler is avoided. In terms of waiting until a
request is finished, all this means is that the parent process waits until the
child process finishes any requests it may be handling before it sends the
child process the TERM signal. The TERM signal in the child process still
results in a call to a signal handler whose action is to destroy the child
process memory pool resulting in complex code associated with Python being
called within the context of the signal handler.

The end result is that using a graceful restart as opposed to a plain restart
may increase your chances of that complex Python code being able to
execute successfully, as no requests should be executing at the same time,
but it is still not a guarantee that it will work.

I still could be wrong about parts of this as working out how it all works
is hard because of the one code file being used for both child and parent
implementations. You also have the different MPMs to consider.

Anyway, I'll look over the code again. One thing I have noticed is that although
it says (at least for worker):

apr_signal(SIGTERM, just_die);
child_main(slot);

clean_child_exit(0);

where just_die() being called is the problem, later in the code it does:

unblock_signal(SIGTERM);
apr_signal(SIGTERM, dummy_signal_handler);

It only does this though when not running as one process. It is entirely
possible when debugging this previously that I was interpreting things
wrongly as I was running in single process mode in order to run gdb.
This may have resulted in the behaviour of the process being changed
so that the SIGTERM was doing what we don't want, when in fact when
run normally it doesn't.

All we know is that something is causing crashes and hangs on shutdown
and if it wasn't the signal, it must be something else.

Graham


David Fraser

2006-08-25, 7:13 am

Jim Gallacher wrote:
> David Fraser wrote:
> I'm not sure about Win32 either, since it doesn't have any child
> processes...
> Interesting, in as much as it touches on the problem we are trying to
> solve here. See section 5.3.2.
> I've been reading this book, "Writing Apache Modules with Perl and C",
> the last couple of days. It's a darn good yarn, even if I did
> figure out who-done-it by the end of the first chapter. As I'm reading
> I keep having a recurring fantasy... "wouldn't it be great to have
> this kind of resource for mod_python"? I think I need to get out more.
>
> What you need to realize is that mod_python is not doing anything
> exotic. We are all playing in the same sandbox by the same rules
> imposed by apache. Callbacks for things like child initialization and
> exit, or any other phase get triggered the same way in any module.
> What we are bumping into with this particular bug is a limitation of
> the Python interpreter, and the whole GIL problem.

Right, I've understood that now. Will try to see if I can get anything done
on it, but at least I understand the position more clearly now. Thanks
for the feedback.

David

Jim Gallacher

2006-08-25, 7:12 pm

There was a question on the mod_python list regarding mpcp, which
provides a mod_python handler for cherrypy. Out of curiosity I took a
look at mpcp. Lo and behold, they use req.server.register_cleanup to
stop the cherrypy server.

I'm becoming increasingly concerned about dropping
server.register_cleanup. I suspect that we may end up breaking code all
over the place. MODPYTHON-109 could end up being a bigger PITA than it
already is.

Jim


Jim Gallacher

2006-08-25, 7:12 pm

Jim Gallacher wrote:
> There was a question on the mod_python list regarding mpcp, which
> provides a mod_python handler for cherrypy. Out of curiosity I took a
> look at mpcp. Lo and behold, they use req.server.register_cleanup to
> stop the cherrypy server.
>
> I'm becoming increasingly concerned about dropping
> server.register_cleanup. I suspect that we may end up breaking code all
> over the place. MODPYTHON-109 could end up being a bigger PITA than it
> already is.
>
> Jim
>
>


How about this as a kludge. Make server.register_cleanup an option
either at compile time or at startup via a PythonOption. (I haven't
looked at the code, but I'm assuming that it is possible).

Forcing people that *really* want to use the cleanup to explicitly turn
it on will at least give them fair warning that it might cause some
ugly problems. Something like this could be used:

PythonOption mod_python.enable_server_cleanup "yes, I did read the docs"
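
A sketch of what the check behind such an option could look like. The option
name and phrase are the ones proposed above and do not exist in mod_python;
the function names are invented for illustration. The options argument is
any dict-like mapping, and in a handler it would presumably come from
something like req.get_options().

```python
ENABLE_KEY = "mod_python.enable_server_cleanup"   # proposed name, hypothetical
ENABLE_VALUE = "yes, I did read the docs"


def server_cleanup_enabled(options):
    """Return True only if the opt-in option is set to the exact phrase."""
    return options.get(ENABLE_KEY) == ENABLE_VALUE


def register_server_cleanup(options, registry, func):
    # Refuse loudly rather than silently ignoring the request.
    if not server_cleanup_enabled(options):
        raise RuntimeError(
            "server cleanup is disabled; set PythonOption " + ENABLE_KEY
        )
    registry.append(func)
```

Raising instead of silently skipping the registration means code such as
mpcp would fail visibly on servers where the option is off.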

Jim

Graham Dumpleton

2006-08-26, 1:14 am


On 26/08/2006, at 6:22 AM, Jim Gallacher wrote:

> There was a question on the mod_python list regarding mpcp, which
> provides a mod_python handler for cherrypy. Out of curiosity I took
> a look at mpcp. Lo and behold, they use
> req.server.register_cleanup to stop the cherrypy server.
>
> I'm becoming increasingly concerned about dropping
> server.register_cleanup. I suspect that we may end up breaking code
> all over the place. MODPYTHON-109 could end up being a bigger PITA
> than it already is.


Let us for the time being then simply work out for the various MPMs under
what circumstances it may or may not work and document the various cases in
the mod_python documentation. It may be the case that the worst of the
problems have in fact already been solved by at least not calling
Py_Finalize() when Apache is being shutdown. It may also be the case that
perhaps problems are not as prevalent or do not exist in newer versions of
Apache.

Anyway, I will try and do a proper analysis of what actually happens by
putting some debug in the httpd source code and tracking through what gets
called when.

Graham
