quetz-mod_python-dev mailing list archives

From "Graham Dumpleton (JIRA)" <j...@apache.org>
Subject [jira] Work started: (MODPYTHON-115) import_module() and multiple modules of same name.
Date Sat, 01 Apr 2006 04:58:27 GMT
     [ http://issues.apache.org/jira/browse/MODPYTHON-115?page=all ]
Work on MODPYTHON-115 started by Graham Dumpleton

> import_module() and multiple modules of same name.
> --------------------------------------------------
>          Key: MODPYTHON-115
>          URL: http://issues.apache.org/jira/browse/MODPYTHON-115
>      Project: mod_python
>         Type: Bug
>   Components: core
>     Versions: 3.1.4, 3.2.7
>     Reporter: Graham Dumpleton
>     Assignee: Graham Dumpleton

> The "apache.import_module()" function is a thin wrapper over the standard Python module
importing system. This means that modules are still stored in "sys.modules". As modules in
"sys.modules" are keyed by their module name, this in turn means that there can only be one
active instance of a module for a specific name.
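
The one-instance-per-name behaviour can be reproduced with the standard import machinery alone. The following is a minimal sketch, assuming Python 3 and its standard "importlib" module rather than mod_python itself. The module name "samename_demo" and the temporary directory layout are made up for illustration.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: two directories, each holding a module of the same name.
root = tempfile.mkdtemp()
dir1 = os.path.join(root, "subdir-1")
dir2 = os.path.join(root, "subdir-2")
for directory, label in ((dir1, "subdir-1"), (dir2, "subdir-2")):
    os.mkdir(directory)
    with open(os.path.join(directory, "samename_demo.py"), "w") as f:
        f.write("WHERE = %r\n" % label)

# The first import caches the module in sys.modules under its bare name.
sys.path.insert(0, dir1)
importlib.invalidate_caches()
first = importlib.import_module("samename_demo")

# Putting the second directory ahead on sys.path does not help: the name is
# already present in sys.modules, so the cached module object is returned.
sys.path.insert(0, dir2)
second = importlib.import_module("samename_demo")

print(second is first)   # True
print(second.WHERE)      # subdir-1
```

The module from the second directory is never loaded here, because the lookup by name succeeds before "sys.path" is consulted.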
> The "import_module()" function tries to work around this by comparing the path name of
the module already loaded against that being requested and, if they differ, reloading the
correct module. This path check only occurs, though, when the "path" argument is actually
supplied to the "import_module()" function. The "path" is only supplied in this way when mod_python.publisher
makes use of the "import_module()" function. It is not supplied when the "Python*Handler"
directives are used, because in that circumstance a module may actually be a system module
and supplying "path" would prevent it from being found.
> Even though mod_python.publisher supplies the "path" argument to the "import_module()"
function, the check of the path has bugs, with modules possibly becoming inaccessible as documented
> The check by mod_python of the path name to the actual code file for a module, made to determine
whether it should be reloaded, can also cause a continual cycle of module reloading even though
the modules on disk have not changed. This occurs when successive requests alternate
between URLs that map to distinct modules having the same name. This cyclic reloading is
documented in JIRA as MODPYTHON-10.
> Because a module is reloaded into the same object space as the existing module when two
modules of the same name are in different locations, namespace pollution and security issues
can also result if one location for the module is public and the other private. This cross
contamination of modules is documented in JIRA as MODPYTHON-11.
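
The cross contamination can be illustrated without mod_python. This is a minimal sketch, assuming Python 3 and "importlib.reload()" as a stand-in for whatever reload mechanism mod_python uses. The module name "shared_demo", the "private" and "public" directories, and the attribute names are hypothetical.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: a private and a public module sharing one name.
root = tempfile.mkdtemp()
private_dir = os.path.join(root, "private")
public_dir = os.path.join(root, "public")
sources = (
    (private_dir, "SECRET = 'db-password'\n"),
    (public_dir, "GREETING = 'hello'\n"),
)
for directory, body in sources:
    os.mkdir(directory)
    with open(os.path.join(directory, "shared_demo.py"), "w") as f:
        f.write(body)

# Load the private module first.
sys.path.insert(0, private_dir)
importlib.invalidate_caches()
module = importlib.import_module("shared_demo")

# Reload with the public directory at the head of sys.path. The public
# source is executed inside the existing module object, so names defined
# by the private version are not removed.
sys.path.insert(0, public_dir)
importlib.invalidate_caches()
reloaded = importlib.reload(module)

print(reloaded is module)            # True
print(hasattr(reloaded, "GREETING")) # True
print(hasattr(reloaded, "SECRET"))   # True
```

After the reload the module reports the public file as its origin yet still exposes "SECRET", which is the kind of leak described above.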
> In respect of the "Python*Handler" directives where the "path" argument was never supplied
to the "import_module()" function, the result would be that the first module loaded under
the specified name would be used. Thus, any subsequent module of the same name referred to
by a "Python*Handler" directive found in a different directory but within the same interpreter
would in effect be ignored.
> A caveat to this, though, is that such a "Python*Handler" directive would result in that
handler's directory being inserted at the head of "sys.path". If the first instance of the
module loaded under that name were at some point modified, the module would be automatically
reloaded, but it would load the version from the different directory.
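
The sequence described in the last two paragraphs can be modelled roughly as follows. This is only a simplified sketch under stated assumptions: "load_handler()" is a hypothetical stand-in written for this example, not the mod_python implementation, and it uses Python 3's "importlib". The module name "handler_demo" and the directory names are likewise invented.

```python
import importlib
import os
import sys
import tempfile

_mtimes = {}  # module name -> source mtime recorded at last load


def load_handler(directory, name):
    """Hypothetical model: put the handler directory at the head of
    sys.path, reuse any cached module, reload it if its file changed."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()
    module = sys.modules.get(name)
    if module is None:
        module = importlib.import_module(name)
    elif os.path.getmtime(module.__file__) > _mtimes[name]:
        module = importlib.reload(module)
    _mtimes[name] = os.path.getmtime(module.__file__)
    return module


def origin(module):
    # Name of the directory the module currently claims to come from.
    return os.path.basename(os.path.dirname(module.__file__))


root = tempfile.mkdtemp()
dirs = [os.path.join(root, d) for d in ("subdir-1", "subdir-2")]
for directory in dirs:
    os.mkdir(directory)
    with open(os.path.join(directory, "handler_demo.py"), "w") as f:
        f.write("VALUE = 1\n")

seen = []
seen.append(origin(load_handler(dirs[0], "handler_demo")))  # first load
seen.append(origin(load_handler(dirs[1], "handler_demo")))  # cached copy wins

# Update the timestamp of the first module's file, then request again.
first_file = os.path.join(dirs[0], "handler_demo.py")
stamp = os.path.getmtime(first_file) + 10
os.utime(first_file, (stamp, stamp))
seen.append(origin(load_handler(dirs[1], "handler_demo")))  # reload finds subdir-2

print(seen)  # ['subdir-1', 'subdir-1', 'subdir-2']
```

In this model the second directory's module is ignored until the first module's file changes, at which point the reload resolves the name through "sys.path" and picks up the other directory's copy.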
> Now, although these problems as they relate to mod_python.publisher are addressed in mod_python
3.2.6, the underlying problems in "import_module()" are not. As the bug reports relating
to mod_python.publisher have been closed off as resolved, I am creating this bug report to
track the underlying problem as it applies to the "Python*Handler" directives
and to explicit use of "import_module()".
> To illustrate the issue as it applies to the "Python*Handler" directives, create two separate
directories, each with a .htaccess file containing:
>   AddHandler mod_python .py
>   PythonHandler index
>   PythonDebug On
> In the "index.py" file in each separate directory put:
>   import os
>   from mod_python import apache
>   def handler(req):
>     req.content_type = 'text/plain'
>     print >> req, os.getpid(), __file__
>     return apache.OK
> Assuming these are accessed as:
>   /~grahamd/mod_python_9/subdir-1/index.py
>   /~grahamd/mod_python_9/subdir-2/index.py
> access the first URL, and the result will be:
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-1/index.py
> now access the second URL and we get:
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-1/index.py
> Note this assumes the same child process handled both requests, so Apache must be configured
to run a single child process for this test.
> As one can see, it doesn't actually use the "subdir-2/index.py" module at all and still
uses the "subdir-1/index.py" module.
> If one modifies "subdir-1/index.py" so its timestamp is updated and loads the second URL
again, we get:
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-2/index.py
> This occurs because it detects the change in the first module loaded, but because sys.path
had the second handler directory at the head of sys.path now, when reloaded it picked up the
> These issues with modules of the same name in multiple locations are listed as ISSUE 14 in my
list of module importer problems. See:
>   http://www.dscpl.com.au/articles/modpython-003.html

This message is automatically generated by JIRA.