Apache Mod-Python - Auto updating of req.finfo when req.filename changed.

This is Interesting: Free IT Magazines  
Home > Archive > Apache Mod-Python > March 2006 > Auto updating of req.finfo when req.filename changed.





You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

Author Auto updating of req.finfo when req.filename changed.
Graham Dumpleton

2006-03-27, 7:50 am

Now the mailing list is a bit quiet, I would like to see if I can get
some explicit feedback on some issues related to the inability to update
the req.finfo attribute.

Grisha, would be nice if you could respond on this issue and give some
guidance else I fear I'll never be able to progress a solution to this
issue. :-(

As explained in:

http://issues.apache.org/jira/browse/MODPYTHON-128

although it is possible to assign a new value to req.filename, there is
no way to update req.finfo to the file stats associated with that new
value of req.filename. If one had access to the low level C API this
would normally be achieved using:

apr_stat(&r->finfo, r->filename, APR_FINFO_MIN, r->pool);

In mod_python though, there is no way to access the function and affect
that outcome.

In mod_perl 1.0 they implemented the behaviour whereby the "finfo"
attribute was automatically updated when the "filename" attribute was
updated by a handler.

In mod_perl 2.0 they dropped this though, as they wanted to preserve
the idea that in mod_perl everything behaved exactly like the C API
they were trying to provide a 1 to 1 mapping for. Thus in mod_perl
2.0 you need to write:

use Apache2::RequestRec ();
use APR::Finfo ();
use APR::Const -compile => qw(FINFO_NORM);
$r->filename($newfile);
$r->finfo(APR::Finfo::stat($newfile, APR::Const::FINFO_NORM,
$r->pool));

As mod_python isn't attempting to provide a strict 1 to 1 mapping, it
might be argued that it could do what mod_perl 1.0 did and automatically
updated the "finfo" attribute when "filename" is updated.

The only other alternative is to add a new method to the Python request
object for which there isn't strictly speaking a direct equivalent to
in the Apache C API. That is, a method that calls apr_stat() but which
only performs it in relation to the "filename" and "finfo" attributes
in the request object itself and is not a generic routine.

Since it isn't likely that mod_python will ever provide a lower level
API for use of finfo related structures and functions and even if it
did they most likely would be distinct to the request object, the name
of the function added to the request object could still be called
"stat()".

Thus mod_python equivalent to what mod_perl 2.0 does would be:

req.filename = newfile
req.stat()

This though doesn't really convey a sense of what it occurring. Thus a
more descriptive name would probably be more appropriate. For example:

req.filename = newfile
req.update_finfo()

There is no ap_update_finfo() function now, but if they did later
implement one, this would shadow it and prevent it being added to the
request object if it was pertinent for that be done.

The next problem is that apr_stat() actually takes an argument
indicating
what fields in the "finfo" attribute should be updated. In mod_perl 1.0
the value used when the automatic update was done was APR_FINFO_MIN
which results in type, mtime, ctime, atime, size being updated. In
the documentation for mod_perl 2.0 it suggests use of APR_FINFO_NORM
instead which is described as intended to be used when an atomic unix
apr_stat() is required whatever that means.

Important to note though is that is that the ap_directory_walk()
function in Apache which is used to map a URL against a file in the
filesystem uses APR_FINFO_MIN.

Now if a function were to be provided, it seems to make sense that it
have a default whereby it uses APR_FINO_MIN, much as would be the case
if the "finfo" attribute were updated automatically when "filename" is
updated.

Should though a function if provided allow the ability to supply an
alternate for this value so as to be selective as to what attributes
of "finfo" are updated?

If it were allowed, have the problem that there are already attributes
in mod_python for:

FINFO_MODE = 0
FINFO_INO = 1
FINFO_DEV = 2
FINFO_NLINK = 3
FINFO_UID = 4
FINFO_GID = 5
FINFO_SIZE = 6
FINFO_ATIME = 7
FINFO_MTIME = 8
FINFO_CTIME = 9
FINFO_FNAME = 10
FINFO_NAME = 11
FINFO_FILETYPE = 12

Rather than these being equivalents to the APR constants:

#define APR_FINFO_LINK 0x00000001
#define APR_FINFO_MTIME 0x00000010
#define APR_FINFO_CTIME 0x00000020
#define APR_FINFO_ATIME 0x00000040
#define APR_FINFO_SIZE 0x00000100
#define APR_FINFO_CSIZE 0x00000200
#define APR_FINFO_DEV 0x00001000
#define APR_FINFO_INODE 0x00002000
#define APR_FINFO_NLINK 0x00004000
#define APR_FINFO_TYPE 0x00008000
#define APR_FINFO_USER 0x00010000
#define APR_FINFO_GROUP 0x00020000
#define APR_FINFO_UPROT 0x00100000
#define APR_FINFO_GPROT 0x00200000
#define APR_FINFO_WPROT 0x00400000
#define APR_FINFO_ICASE 0x01000000
#define APR_FINFO_NAME 0x02000000
#define APR_FINFO_MIN 0x00008170
#define APR_FINFO_IDENT 0x00003000
#define APR_FINFO_OWNER 0x00030000
#define APR_FINFO_PROT 0x00700000
#define APR_FINFO_NORM 0x0073b170
#define APR_FINFO_DIRENT 0x02000000

which can be bit wise or'd together as the argument to the apr_stat()
function, they are used as positional indexes into the tuple which
mod_python provides as req.finfo. Thus there is a clash on the names.

Thus the whole issue gets very messy. :-(

Overall, my feeling is that following what mod_perl 1.0 did of updating
the "finfo" attribute when the "filename" attribute is a reasonable
solution for mod_python given its much higher level API. At the moment
I haven't been able to for-see any problems that this might cause.
The question though is whether that it is hidden is too magic?

Whatever is done, this missing ability of not being able to update
req.finfo needs to be added. One use for it that I already have is to
get around the DirectoryIndex problems in mod_python caused by Apache's
use of the ap_internal_fast_redirect() function to implement that
feature. The specifics of this particular issue are documented under:

http://issues.apache.org/jira/browse/MODPYTHON-146

One solution I already provided for this was:

def fixuphandler(req):
if req.finfo[apache.FINFO_FILETYPE] == apache.APR_DIR:
if req.uri[-1] == '/':
uri = req.uri + 'index.html'
if req.args: uri += '?' + req.args
req.internal_redirect(uri)
return apache.DONE
return apache.OK

Although, I believe now it is probably better done as
HeaderParserHandler
phase. That is before any access, authentication and authorization
checks. Those access checks would instead then only be done once in
the sub request instead of multiple times.

If req.finfo were automatically updated when req.filename is updated
then the alternative is to use:

def headerparserhandler(req):
if req.finfo[apache.FINFO_FILETYPE] == apache.APR_DIR:
if req.uri[-1] == '/':
req.filename = posixpath.join(req.filename, 'index.html')
return apache.OK

This is possibly better as it avoids the need to perform an internal
redirect and thus is slightly more efficient. It will only work though
if req.finfo is updated as that will result in the file type changing
from that of a directory to that of a regular file, that value being
later consulted by type handler of mod_mime when setting up the content
type.

Another area where it will be important to be able to update req.finfo
is when a map to storage hook is implemented as described in:

http://issues.apache.org/jira/browse/MODPYTHON-123

Without the ability, there would be no point providing the new hook as
it wouldn't be possibly to correctly set up the request object for
latter
phases without it.

Anyway got anything to contribute as to an opinion or otherwise?

To summarise the problem, we need a way of updating req.finfo when
req.filename changes. How should this be done?

I would like to get this sorted sooner rather than later. This also will
not be the last complicated little issue a decision has to be made on,
and I would rather not see a lack of consensus on these issues pushing
delivery of 3.3 further and further into the future. I also would rather
not see us putting them in to the too hard basket for a later release.

Feedback much appreciated.

Graham



Jim Gallacher

2006-03-27, 7:50 am

Hi Graham,

+1 auto update req.finfo when req.filename changed.

Is there a use case where the user might change filename but not want
finfo to change? I can't think of one, so let's save the user some work
and make their code more robust to boot.

A point I'd like to address is your concern about mod_python differing
from the Apache C api in implementing certain features. Personally I
think our general concern about this is somewhat misguided. I suspect
that the majority of our users are more concerned about the
"python-ness" of mod_python rather than its "apache-ness", and most
would never notice if we deviate slightly from the apache C api. Adding
a method or attribute to the request object for example, if it's useful
to *our* users, shouldn't be rejected out of hand just because it does
not exist in the underlying api.

Jim

Graham Dumpleton wrote:
> Now the mailing list is a bit quiet, I would like to see if I can get
> some explicit feedback on some issues related to the inability to update
> the req.finfo attribute.
>
> Grisha, would be nice if you could respond on this issue and give some
> guidance else I fear I'll never be able to progress a solution to this
> issue. :-(
>
> As explained in:
>
> http://issues.apache.org/jira/browse/MODPYTHON-128
>
> although it is possible to assign a new value to req.filename, there is
> no way to update req.finfo to the file stats associated with that new
> value of req.filename. If one had access to the low level C API this
> would normally be achieved using:
>
> apr_stat(&r->finfo, r->filename, APR_FINFO_MIN, r->pool);
>
> In mod_python though, there is no way to access the function and affect
> that outcome.
>
> In mod_perl 1.0 they implemented the behaviour whereby the "finfo"
> attribute was automatically updated when the "filename" attribute was
> updated by a handler.
>
> In mod_perl 2.0 they dropped this though, as they wanted to preserve
> the idea that in mod_perl everything behaved exactly like the C API
> they were trying to provide a 1 to 1 mapping for. Thus in mod_perl
> 2.0 you need to write:
>
> use Apache2::RequestRec ();
> use APR::Finfo ();
> use APR::Const -compile => qw(FINFO_NORM);
> $r->filename($newfile);
> $r->finfo(APR::Finfo::stat($newfile, APR::Const::FINFO_NORM, $r->pool));
>
> As mod_python isn't attempting to provide a strict 1 to 1 mapping, it
> might be argued that it could do what mod_perl 1.0 did and automatically
> updated the "finfo" attribute when "filename" is updated.
>
> The only other alternative is to add a new method to the Python request
> object for which there isn't strictly speaking a direct equivalent to
> in the Apache C API. That is, a method that calls apr_stat() but which
> only performs it in relation to the "filename" and "finfo" attributes
> in the request object itself and is not a generic routine.
>
> Since it isn't likely that mod_python will ever provide a lower level
> API for use of finfo related structures and functions and even if it
> did they most likely would be distinct to the request object, the name
> of the function added to the request object could still be called "stat()".
>
> Thus mod_python equivalent to what mod_perl 2.0 does would be:
>
> req.filename = newfile
> req.stat()
>
> This though doesn't really convey a sense of what it occurring. Thus a
> more descriptive name would probably be more appropriate. For example:
>
> req.filename = newfile
> req.update_finfo()
>
> There is no ap_update_finfo() function now, but if they did later
> implement one, this would shadow it and prevent it being added to the
> request object if it was pertinent for that be done.
>
> The next problem is that apr_stat() actually takes an argument indicating
> what fields in the "finfo" attribute should be updated. In mod_perl 1.0
> the value used when the automatic update was done was APR_FINFO_MIN
> which results in type, mtime, ctime, atime, size being updated. In
> the documentation for mod_perl 2.0 it suggests use of APR_FINFO_NORM
> instead which is described as intended to be used when an atomic unix
> apr_stat() is required whatever that means.
>
> Important to note though is that is that the ap_directory_walk()
> function in Apache which is used to map a URL against a file in the
> filesystem uses APR_FINFO_MIN.
>
> Now if a function were to be provided, it seems to make sense that it
> have a default whereby it uses APR_FINO_MIN, much as would be the case
> if the "finfo" attribute were updated automatically when "filename" is
> updated.
>
> Should though a function if provided allow the ability to supply an
> alternate for this value so as to be selective as to what attributes
> of "finfo" are updated?
>
> If it were allowed, have the problem that there are already attributes
> in mod_python for:
>
> FINFO_MODE = 0
> FINFO_INO = 1
> FINFO_DEV = 2
> FINFO_NLINK = 3
> FINFO_UID = 4
> FINFO_GID = 5
> FINFO_SIZE = 6
> FINFO_ATIME = 7
> FINFO_MTIME = 8
> FINFO_CTIME = 9
> FINFO_FNAME = 10
> FINFO_NAME = 11
> FINFO_FILETYPE = 12
>
> Rather than these being equivalents to the APR constants:
>
> #define APR_FINFO_LINK 0x00000001
> #define APR_FINFO_MTIME 0x00000010
> #define APR_FINFO_CTIME 0x00000020
> #define APR_FINFO_ATIME 0x00000040
> #define APR_FINFO_SIZE 0x00000100
> #define APR_FINFO_CSIZE 0x00000200
> #define APR_FINFO_DEV 0x00001000
> #define APR_FINFO_INODE 0x00002000
> #define APR_FINFO_NLINK 0x00004000
> #define APR_FINFO_TYPE 0x00008000
> #define APR_FINFO_USER 0x00010000
> #define APR_FINFO_GROUP 0x00020000
> #define APR_FINFO_UPROT 0x00100000
> #define APR_FINFO_GPROT 0x00200000
> #define APR_FINFO_WPROT 0x00400000
> #define APR_FINFO_ICASE 0x01000000
> #define APR_FINFO_NAME 0x02000000
> #define APR_FINFO_MIN 0x00008170
> #define APR_FINFO_IDENT 0x00003000
> #define APR_FINFO_OWNER 0x00030000
> #define APR_FINFO_PROT 0x00700000
> #define APR_FINFO_NORM 0x0073b170
> #define APR_FINFO_DIRENT 0x02000000
>
> which can be bit wise or'd together as the argument to the apr_stat()
> function, they are used as positional indexes into the tuple which
> mod_python provides as req.finfo. Thus there is a clash on the names.
>
> Thus the whole issue gets very messy. :-(
>
> Overall, my feeling is that following what mod_perl 1.0 did of updating
> the "finfo" attribute when the "filename" attribute is a reasonable
> solution for mod_python given its much higher level API. At the moment
> I haven't been able to for-see any problems that this might cause.
> The question though is whether that it is hidden is too magic?
>
> Whatever is done, this missing ability of not being able to update
> req.finfo needs to be added. One use for it that I already have is to
> get around the DirectoryIndex problems in mod_python caused by Apache's
> use of the ap_internal_fast_redirect() function to implement that
> feature. The specifics of this particular issue are documented under:
>
> http://issues.apache.org/jira/browse/MODPYTHON-146
>
> One solution I already provided for this was:
>
> def fixuphandler(req):
> if req.finfo[apache.FINFO_FILETYPE] == apache.APR_DIR:
> if req.uri[-1] == '/':
> uri = req.uri + 'index.html'
> if req.args: uri += '?' + req.args
> req.internal_redirect(uri)
> return apache.DONE
> return apache.OK
>
> Although, I believe now it is probably better done as HeaderParserHandler
> phase. That is before any access, authentication and authorization
> checks. Those access checks would instead then only be done once in
> the sub request instead of multiple times.
>
> If req.finfo were automatically updated when req.filename is updated
> then the alternative is to use:
>
> def headerparserhandler(req):
> if req.finfo[apache.FINFO_FILETYPE] == apache.APR_DIR:
> if req.uri[-1] == '/':
> req.filename = posixpath.join(req.filename, 'index.html')
> return apache.OK
>
> This is possibly better as it avoids the need to perform an internal
> redirect and thus is slightly more efficient. It will only work though
> if req.finfo is updated as that will result in the file type changing
> from that of a directory to that of a regular file, that value being
> later consulted by type handler of mod_mime when setting up the content
> type.
>
> Another area where it will be important to be able to update req.finfo
> is when a map to storage hook is implemented as described in:
>
> http://issues.apache.org/jira/browse/MODPYTHON-123
>
> Without the ability, there would be no point providing the new hook as
> it wouldn't be possibly to correctly set up the request object for latter
> phases without it.
>
> Anyway got anything to contribute as to an opinion or otherwise?
>
> To summarise the problem, we need a way of updating req.finfo when
> req.filename changes. How should this be done?
>
> I would like to get this sorted sooner rather than later. This also will
> not be the last complicated little issue a decision has to be made on,
> and I would rather not see a lack of consensus on these issues pushing
> delivery of 3.3 further and further into the future. I also would rather
> not see us putting them in to the too hard basket for a later release.
>
> Feedback much appreciated.
>
> Graham
>
>
>



Gregory (Grisha) Trubetskoy

2006-03-31, 12:03 am


I'm -1 on updating req.finfo when req.filename is updated. I don't have
the time to explain it in detail, but I think it flows from Graham's
explanation below.

Basically, httpd does a stat() for you, and its result is stored in finfo.
Why make it any more complicated and magical?

If you alter req.filename - should that imply another stat()? I don't
think so, because the stat() is the most expensive operations in the whole
request service process and unnecessary stats should be avoided.

I also don't see any good use case for it. If you need a stat - Python
provides way for you to do it, there is no need to get httpd involved in
this process.

I, of course, may be missing something obvious here, but this is my
initial reaction. Let httpd/apr behave the way they behave - there is no
need to "pythonize" them, especially when the Python standard library
provides the same functionality in a "pythonic" way.

Grisha

On Sun, 26 Mar 2006, Jim Gallacher wrote:

> Hi Graham,
>
> +1 auto update req.finfo when req.filename changed.
>
> Is there a use case where the user might change filename but not want finfo
> to change? I can't think of one, so let's save the user some work and make
> their code more robust to boot.
>
> A point I'd like to address is your concern about mod_python differing from
> the Apache C api in implementing certain features. Personally I think our
> general concern about this is somewhat misguided. I suspect that the majority
> of our users are more concerned about the "python-ness" of mod_python rather
> than its "apache-ness", and most would never notice if we deviate slightly
> from the apache C api. Adding a method or attribute to the request object for
> example, if it's useful to *our* users, shouldn't be rejected out of hand
> just because it does not exist in the underlying api.
>
> Jim
>
> Graham Dumpleton wrote:
>


Graham Dumpleton

2006-03-31, 12:03 am

Sorry, you have missed an important point. I am not suggesting this as
some sort of convoluted alternative to using os.stat() in Python for
when a Python based handlers wants to stat some alternative file.

I am specifically addressing the issue where request_rec->filename is
being updated to affect the outcome of a later handler phase which isn't
even written in Python but C or PERL and which is designed to look at
request_rec->filename and which correspondingly expects that
request_rec->finfo be for the current file mentioned in
request_rec->filename.

The example I gave was:

def headerparserhandler(req):
if req.finfo[apache.FINFO_FILETYPE] == apache.APR_DIR:
if req.uri[-1] == '/':

req.filename = posixpath.join(req.filename, 'index.html')

# XXX
# Need to somehow ensure that req.finfo (actually
# request_rec->finfo) is updated here to contain
# apr_stat() result for new value of req.filename.
# XXX

return apache.OK

The purpose of this handler is to get around the problem that the
DirectoryIndex directive is effectively broken for mod_python if you use
handlers prior to the content handler phase. Even though a workaround
for a particular issue, updating req.filename and req.finfo is still
a legitimate thing that can be done in C API modules to affect behaviour
of later handlers implemented by other modules. In other words, please
don't judge the example on the basis that it is a workaround for an
Apache problem and say the real change should be to fix Apache's
DirectoryIndex code.

Now as to the handler above, it needs to do two things. The first is to
update request_rec->filename. The new value of this field will then
be consulted by the type checker phase of mod_mime to update the
request_rec->content_type attribute.

The second is that it needs to somehow force the update of the
request_rec->finfo attribute to correspond to the new value of the
request_rec->filename attribute. This then ensures that the file stats
will reflect that it is a file and not a directory.

If this update to request_rec->finfo is not done and it still records
details of the original directory, when the fixup handler of mod_dir is
executed, then it will still think the target of the request is a directory
and will still trigger the broken DirectoryIndex code which is not what
is desired.

Thus, no matter how it is done, whether it be automatic or whether
an explicit function is provided to do it, there needs to be some way
of updating the underlying request_rec->finfo attribute. Saying to
use Python os.stat() is of no use as req.finfo isn't writable, is not the
same structure as what os.stat() returns and isn't a full reflection of
what the underlying request_rec->finfo attribute holds anyway. So,
you couldn't even fake up request_rec->finfo from the result of
os.stat() anyway.

There is therefore two things that need to worked out. The first is
a consensus that there needs to be a way of updating req.finfo to
correspond to new value of req.filename. Second is to work out how
that should be able to be done.

At the moment I get the feeling that you question the need for req.finfo
to be updatable in the first place. I also think saying a stat is expensive
is a red herring as we are talking about a whether specific case here.

In the majority of cases, no one would have a legitimate need to be
modifying req.filename in the first place, so no extra stat is going
to be done, so cost of the stat is not an issue. Where req.filename
is modified though, it would normally follow that a stat should be
done to update req.finfo.

In the above example to workaround DirectoryIndex problem the
outcome is actually more effecient than the alternative, which is to
use req.internal_redirect(). Thus, that one extra stat is performed
is nothing compared to what is saved in avoiding the redirect.

Now of all the code I have seen so far, the only time I have ever seen anyone
modify req.filename is in mod_python.publisher. It updates it so that
req.filename will properly yield the name of the Python code file that
the request mapped against.

Because mod_python.publisher partly uses it as a temporary variable
rather than use a true temporary variable and only assign to it once, an
automatic update of finfo would incur extra cost. But then, except for
one case, mod_python.publisher is doing its own stat of some form after
each assignment anyway, so it could eliminate those and use the
automatically updated finfo. The one case it isn't stating is when it
normalises the path for local OS, which strictly speaking it possibly
shouldn't do as I think Apache might always provide req.filename in
POSIX like format anyway and thus it is breaking that convention.

That mod_python.publisher updates req.filename but not req.finfo
also means that one can't use the mtime stored in req.finfo as input to
req.set_last_modified(). Ie.,set last modified header based on the
modification time of Python code file as stored in req.finfo. If one really
wanted to do it, one would have to stat the file once again to get the
mtime and call req.update_mtime() explicitly.

I could possibly drag up other cases, but hopefully this time I have been
able to more clearly show that there are some legitimate uses for being
able to update req.finfo to reflect newer value of req.filename. If this is
accepted, the question still remains is how to enable that ability.

Graham

Grisha wrote ..[vbcol=seagreen]
>
> I'm -1 on updating req.finfo when req.filename is updated. I don't have
> the time to explain it in detail, but I think it flows from Graham's
> explanation below.
>
> Basically, httpd does a stat() for you, and its result is stored in finfo.
> Why make it any more complicated and magical?
>
> If you alter req.filename - should that imply another stat()? I don't
> think so, because the stat() is the most expensive operations in the whole
> request service process and unnecessary stats should be avoided.
>
> I also don't see any good use case for it. If you need a stat - Python
> provides way for you to do it, there is no need to get httpd involved in
> this process.
>
> I, of course, may be missing something obvious here, but this is my
> initial reaction. Let httpd/apr behave the way they behave - there is no
> need to "pythonize" them, especially when the Python standard library
> provides the same functionality in a "pythonic" way.
>
> Grisha
>
> On Sun, 26 Mar 2006, Jim Gallacher wrote:
>
> finfo
> make
> from
> our
> majority
> rather
> for
> hand
> is
> a
> 1.0
> is
> as
> will

Gregory (Grisha) Trubetskoy

2006-03-31, 12:03 am


On Sun, 26 Mar 2006, Graham Dumpleton wrote:

> One use for it that I already have is to get around the DirectoryIndex
> problems in mod_python caused by Apache's use of the
> ap_internal_fast_redirect() function to implement that feature. The
> specifics of this particular issue are documented under:
>
> http://issues.apache.org/jira/browse/MODPYTHON-146



Could we zoom in this a little bit. I've read the description, but not
quite sure I understand it quite yet. Is "the problem" that if I set
req.notes['foo'] = 'bar' in a phase prior to fixup, by the time we get to
the content handler, it will be gone because notes would be overwritten by
mod_dir?

Grisha

Graham Dumpleton

2006-03-31, 12:03 am

Grisha wrote ..
>
> On Sun, 26 Mar 2006, Graham Dumpleton wrote:
>
>
>
> Could we zoom in this a little bit. I've read the description, but not
> quite sure I understand it quite yet. Is "the problem" that if I set
> req.notes['foo'] = 'bar' in a phase prior to fixup, by the time we get
> to
> the content handler, it will be gone because notes would be overwritten
> by
> mod_dir?


Fixup phase or earlier actually. In the case of req.notes though, it isn't
that the value in req.notes vanishes, it is that it gets duplicated.

Consider .htaccess file containing:

AddHandler mod_python .py
PythonHandler mod_python.publisher
PythonDebug On

DirectoryIndex index.py

PythonFixupHandler _fixup

In _fixup.py in the same directory, have:

from mod_python import apache
import time

def fixuphandler(req):
time.sleep(0.1)
req.notes['time'] = str(time.time())
return apache.OK

In index.py have:

def index(req):
return req.notes['time']

When I use a URL:

http://localhost:8080/~grahamd/fast_redirect/index.py

the result I get is:

1143667522.23

Ie., a single float value holding the time the request was made.

If I now instead access the directory using the URL:

http://localhost:8080/~grahamd/fast_redirect/

I instead get:

['1143667680.57', '1143667680.47']

In other words, instead of getting the single value I now get two values
contained in a list. It wouldn't matter if the the two values were the same
they would both still be included. Where a content handler was expecting
a single string value, it would die when it gets a list.

What is happening is that when the request is made against the directory
it runs through the phases up to and including the fixup handler phase.
As a consequence it runs _fixup::fixuphandler() with req.notes['time']
being set to be the time at that point.

At the end of the fixup phase a mod_dir handler kicks in and it sees
that the file type of request_rec->filename as indicated by
request_rec->finfo->filetype is APR_DIR. As a consequence it will apply
the DirectoryIndex directive, looping through listed files to find a
candidate it can redirect the request too.

In finding a candidate it reapplies phases up to and including the fixup
handler phase on the new candidate filename. This is done so that access
and authorisation checks etc are still performed on the candidate file.

Because it has run the fixup handlers on the candidate file, the
_fixup::fixuphandler() will be run again. This results in req.notes
being set. At that stage the req.notes is separate as it is in effect
run as a sub request to the main request against the directory.

If after checking through the candidates it finds one that matches, to
avoid having to run phases up to and including the fixup handler phase
on the candidate again, mod_dir tries to fake a redirect. This is what
ap_internal_fast_redirect() is being used for.

What the method does is to copy details from the request_rec structure
of the sub request for the candidate into the request_rec of the main
request. When the mod_dir fixup handler returns, the main request
then continues on to execute the content handler phase, with the
details of the sub request.

The problem with this is that rather than simply using req.notes from
the sub request, or overlapping the contents from the sub request onto
that of the main request, it merges them together. You therefore end up
with multiple entries for the 'time' value which was added.

To emphasise the problem, change the fixup handler to be:

from mod_python import apache

def fixuphandler(req):
req.notes['filename'] = req.filename
return apache.OK

and index.py to:

def index(req):
return req.notes['filename']

The result when using URL against the directory is used is:

['/Users/grahamd/Sites/fast_redirect/index.py', '/Users/grahamd/Sites/fast_redirect/']

Now it isn't just req.notes that is going to see this merging as the code
in ap_internal_fast_redirect() is:

r->notes = apr_table_overlay(r->pool, rr->notes, r->notes);
r->headers_out = apr_table_overlay(r->pool, rr->headers_out,
r->headers_out);
r->err_headers_out = apr_table_overlay(r->pool, rr->err_headers_out,
r->err_headers_out);
r->subprocess_env = apr_table_overlay(r->pool, rr->subprocess_env,
r->subprocess_env);

Thus, it also merges output headers and subprocess environment variables.
The merging of these could in themselves also cause problems.

This isn't the end of the problems though as ap_internal_fast_redirect()
doesn't do anything with:

/** Notes on *this* request */
struct ap_conf_vector_t *request_config;

This has two implications for mod_python.

The first is that it is the request_config that the Python request
object instance is cached in. Because the request_config is still that
of the main request, when the content handler phase is executed, it will
pick up the Python request object of the main request. Thus, any
attributes added direct to the Python request object by the sub request
will be missing.

To illustrate this, change the fixup handler to:

from mod_python import apache

def fixuphandler(req):
if req.finfo[apache.FINFO_FILETYPE] != apache.APR_DIR:
req.attribute = req.filename
return apache.OK

and index.py to:

def index(req):
return req.attribute

Then access index file directly using:

http://localhost:8080/~grahamd/fast_redirect/index.py

The result is:

/Users/grahamd/Sites/fast_redirect/index.py

Now access:

http://localhost:8080/~grahamd/fast_redirect/

The result is an exception:

AttributeError: 'mp_request' object has no attribute 'attribute'

Now, in the fixup handler I specifically checked for file type not equal
to a directory so that the attribute was only set when fixup handler for
index.py was run. So you don't think I am doing something wrong, you
could instead use a .htaccess file containing:

AddHandler mod_python .py
PythonHandler mod_python.publisher
PythonDebug On

DirectoryIndex index.py

PythonFixupHandler _fixup | .py

and fixup handler of:

from mod_python import apache

def fixuphandler(req):
req.attribute = req.filename
return apache.OK

The important things is that whether index.py is called direct or by
application of DirectoryIndex they should behave the same and they
aren't.

The second problem with request_rec not being copied is that details
of any Python based output filters registered from within the fixup
handler are also being lost.

Keeping the .htaccess file such that fixup handler only runs with a .py
extension, change fixup handler to be:

from mod_python import apache

def outputfilter(filter):
apache.log_error("outputfilter")
filter.pass_on()
return apache.OK

def fixuphandler(req):
req.register_output_filter("PASS", "_fixup::outputfilter")
req.add_output_filter("PASS")
return apache.OK

and index.py to:

def index(req):
return "HELLO"

Access index.py directly and you get:

HELLO

Check the Apache error log and you will see:

outputfilter

logged from the filter.

Access the directory and the browser gives an error saying it couldn't load
any data from that location. Look at the Apache error log and you will get
the error:

python_filter: Could not find registered filter.

This is because Apache had a callback in place to call mod_python for the
filter, but then mod_python could not find it, as the registration details
were still in the request_config of the sub request request_rec and weren't
copied into the main request.

Thus there are a series of problems because of how ap_internal_fast_redirect()
is implemented and used by mod_dir.

The main Apache httpd mailing list acknowledged that how merge of data
is done was wrong and that ap_internal_fast_redirect() was in general
causing problems for other Apache modules as well, such as mod_rewrite.
Some suggested that should avoid the fast redirect and do a full internal
redirect, but that such a change wouldn't be able to be done until Apache 2.4.
As this is of no help now, a workaround is required which is what my example
was one.

Note though that this whole issue of problems with the fast redirect is totally
distinct from whether req.finfo be able to be updated. It just so happened
that was wanting that ability to implement the workaround. I still contend
that there are other legitimate reasons for want to have req.finfo updated.

BTW, some examples above only work with 3.3 working version. Specifically
the output filter example.

Graham


Sponsored Links






Free braindumps | Software forum | Database administration forum

Copyright 2003 - 2008 webservertalk.com