04-05-07 06:13 PM
[ https://issues.apache.org/jira/brow...ls:all-tabpanel ]
Graham Dumpleton closed MODPYTHON-115.
--------------------------------------
> import_module() and multiple modules of same name.
> --------------------------------------------------
>
> Key: MODPYTHON-115
> URL: https://issues.apache.org/jira/browse/MODPYTHON-115
> Project: mod_python
> Issue Type: Bug
> Components: core
> Affects Versions: 3.1.4, 3.2.7
> Reporter: Graham Dumpleton
> Assigned To: Graham Dumpleton
> Fix For: 3.3
>
>
> The "apache.import_module()" function is a thin wrapper over the standard Python module importing system. This means that modules are still stored in "sys.modules". As modules in "sys.modules" are keyed by their module name, this in turn means that there can only be one active instance of a module for a specific name.
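The one-instance-per-name constraint can be seen with plain Python, independent of mod_python. This is a minimal sketch: the module name and file paths below are made up for illustration, and the module objects are created by hand rather than imported from disk.

```python
import importlib
import sys
import types

# Hypothetical module name, used only for this illustration.
NAME = "mp115_demo"

first = types.ModuleType(NAME)
first.__file__ = "/site/subdir-1/index.py"    # made-up path
second = types.ModuleType(NAME)
second.__file__ = "/site/subdir-2/index.py"   # made-up path

sys.modules[NAME] = first
sys.modules[NAME] = second   # same key, so the first entry is replaced

# Any later import of that name returns whatever is registered now.
active = importlib.import_module(NAME)
```

After this runs, "active" is the second object; nothing keyed by name can reach the first one any more.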
> The "import_module()" function tries to work around this by checking the path name of the location of a module against that being requested and, if it is different, will reload the correct module. This check of the path, though, only occurs when the "path" argument is actually supplied to the "import_module()" function. The "path" is only supplied in this way when mod_python.publisher makes use of the "import_module()" function; it is not supplied when the "Python*Handler" directives are used, because in that circumstance a module may actually be a system module and supplying "path" would prevent it from being found.
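The kind of check described above might look roughly like the following. This is a sketch only, not mod_python's actual code: the helper name "located_elsewhere" is hypothetical, and it assumes "path" is a list of directories, or None when the caller did not supply one.

```python
import os
import types


def located_elsewhere(module, path=None):
    """Return True if a cached module came from a directory not in `path`.

    Hypothetical helper for illustration; not part of mod_python.
    """
    if path is None:
        # No path supplied (the "Python*Handler" case): there is nothing
        # to compare against, so the cached module is used as-is.
        return False
    loaded_from = os.path.dirname(os.path.abspath(module.__file__))
    wanted = [os.path.abspath(p) for p in path]
    return loaded_from not in wanted


# A stand-in for a module already cached under the name "index".
cached = types.ModuleType("index")
cached.__file__ = "/site/subdir-1/index.py"   # made-up path
```

With this stand-in, a request for "/site/subdir-2" reports a mismatch, a request for "/site/subdir-1" does not, and a request with no path never triggers the check at all.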
> Even though mod_python.publisher supplies the "path" argument to the "import_module()" function, the check of the path has bugs, with modules possibly becoming inaccessible as documented in JIRA as MODPYTHON-9.
> The check by mod_python of the path name to the actual code file for a module to determine if it should be reloaded, can also cause a continual cycle of module reloading even though the modules on disk may not have changed. This will occur when successive requests alternate between URLs related to the distinct modules having the same name. This cyclic reloading is documented in JIRA as MODPYTHON-10.
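The reload cycle can be simulated without Apache. In this sketch a plain dict stands in for "sys.modules", and each "request" is just the directory the requested module lives in; the function name and the directory labels are assumptions made for the illustration.

```python
def count_reloads(requests, name="index"):
    """Count forced reloads for a sequence of requested directories."""
    cache = {}    # stands in for sys.modules: name -> directory loaded from
    reloads = 0
    for directory in requests:
        if name not in cache:
            cache[name] = directory    # first import, not a reload
        elif cache[name] != directory:
            cache[name] = directory    # path mismatch forces a reload
            reloads += 1
    return reloads


# Six requests alternating between two directories holding "index.py".
alternating = ["subdir-1", "subdir-2"] * 3
reload_count = count_reloads(alternating)
```

Every request after the first finds the "wrong" module cached, so the module is reloaded each time even though no file on disk changed. Requests that all go to one directory cause no reloads.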
> That a module is reloaded into the same object space as the existing module when two modules of the same name are in different locations, can also cause namespace pollution and security issues if one location for the module was public and the other private. This cross contamination of modules is documented in JIRA as MODPYTHON-11.
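The cross contamination follows from how reloading works in Python: the new code is executed in the existing module's namespace, and names the new code does not redefine are left in place. A minimal sketch, with two made-up source strings standing in for the private and public "index.py" files:

```python
import types

# Made-up sources standing in for two same-named modules.
PRIVATE_SOURCE = (
    "password = 'not-for-public-eyes'\n"
    "def handler():\n"
    "    return 'private'\n"
)
PUBLIC_SOURCE = (
    "def handler():\n"
    "    return 'public'\n"
)

module = types.ModuleType("index")
exec(PRIVATE_SOURCE, module.__dict__)   # first load: the private copy
exec(PUBLIC_SOURCE, module.__dict__)    # "reload": public copy, same namespace

# The public code never defined `password`, yet it is still reachable.
leaked = getattr(module, "password", None)
```

The handler now comes from the public source, but the attribute defined only by the private source survives in the shared namespace.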
> In respect of the "Python*Handler" directives where the "path" argument was never supplied to the "import_module()" function, the result would be that the first module loaded under the specified name would be used. Thus, any subsequent module of the same name referred to by a "Python*Handler" directive found in a different directory but within the same interpreter would in effect be ignored.
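The "first module wins" behaviour can be reproduced with the standard import system alone. This sketch uses Python 3's importlib rather than mod_python; the module name and the temporary directory layout are assumptions chosen for the demonstration.

```python
import importlib
import os
import sys
import tempfile

NAME = "mp115_first_wins"   # hypothetical module name for the demo


def make_dir(root, sub):
    """Create root/sub containing a module that records where it lives."""
    directory = os.path.join(root, sub)
    os.makedirs(directory)
    with open(os.path.join(directory, NAME + ".py"), "w") as f:
        f.write("WHERE = %r\n" % sub)
    return directory


root = tempfile.mkdtemp()
dir1 = make_dir(root, "subdir-1")
dir2 = make_dir(root, "subdir-2")

# First handler directory goes to the head of sys.path; module is imported.
sys.path.insert(0, dir1)
importlib.invalidate_caches()
first = importlib.import_module(NAME)

# Second handler directory is now at the head, but the name is already cached.
sys.path.insert(0, dir2)
importlib.invalidate_caches()
second = importlib.import_module(NAME)
```

Both imports return the same module object, loaded from "subdir-1"; the copy in "subdir-2" is never looked at.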
> A caveat to this though is that such a "Python*Handler" directive would result in that handler's directory being inserted at the head of "sys.path". If the first instance of the module loaded under that name were at some point modified, the module would be automatically reloaded, but it would load the version from the different directory.
> Now, although these problems as they relate to mod_python.publisher are addressed in mod_python 3.2.6, the underlying problems in "import_module()" are not. As the bug reports as they relate to mod_python.publisher have been closed off as resolved, I am creating this bug report so as to carry on a bug report for the underlying problem as it applies to the "Python*Handler" directives and explicit use of "import_module()".
> To illustrate the issue as it applies to "Python*Handler" directive, create two separate directories with a .htaccess file containing:
> AddHandler mod_python .py
> PythonHandler index
> PythonDebug On
> In the "index.py" file in each separate directory put:
> import os
> from mod_python import apache
> def handler(req):
>     req.content_type = 'text/plain'
>     print >> req, os.getpid(), __file__
>     return apache.OK
> Assuming these are accessed as:
> /~grahamd/mod_python_9/subdir-1/index.py
> /~grahamd/mod_python_9/subdir-2/index.py
> access the first URL, and the result will be:
> 10665 /Users/grahamd/Sites/mod_python_9/subdir-1/index.py
> now access the second URL and we get:
> 10665 /Users/grahamd/Sites/mod_python_9/subdir-1/index.py
> Note this assumes the same child process handled both requests, so configuring Apache to run only one child process is required for this test.
> As one can see, it doesn't actually use the "subdir-2/index.py" module at all and still uses the "subdir-1/index.py" module.
> If one modifies "subdir-1/index.py" so its timestamp is updated and loads the second URL again, we get:
> 10665 /Users/grahamd/Sites/mod_python_9/subdir-2/index.py
> This occurs because it detects the change in the first module loaded, but because the second handler directory was now at the head of sys.path, the reload picked up the latter.
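The same switch can be reproduced outside mod_python, because a reload searches "sys.path" again rather than going back to the original file. This is a sketch using Python 3's importlib.reload(); the reload is triggered explicitly here instead of by a timestamp check, and the module name and directories are assumptions for the demonstration.

```python
import importlib
import os
import sys
import tempfile

NAME = "mp115_reload_demo"   # hypothetical module name for the demo


def make_dir(root, sub):
    """Create root/sub containing a module that records where it lives."""
    directory = os.path.join(root, sub)
    os.makedirs(directory)
    with open(os.path.join(directory, NAME + ".py"), "w") as f:
        f.write("WHERE = %r\n" % sub)
    return directory


root = tempfile.mkdtemp()
dir1 = make_dir(root, "subdir-1")
dir2 = make_dir(root, "subdir-2")

# First request: subdir-1 at the head of sys.path, module loaded from there.
sys.path.insert(0, dir1)
importlib.invalidate_caches()
module = importlib.import_module(NAME)
before = module.WHERE

# Second request pushed subdir-2 to the head; the cached module is unchanged
# until something forces a reload.
sys.path.insert(0, dir2)
importlib.invalidate_caches()
importlib.reload(module)
after = module.WHERE
```

Before the reload the module reports "subdir-1"; after it, the same module object reports "subdir-2", because the search now finds the second directory first.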
> These issues with same-name modules in multiple locations are listed as ISSUE 14 in my list of module importer problems. See:
> http://www.dscpl.com.au/articles/modpython-003.html