quetz-mod_python-dev mailing list archives

From Sterling Hughes <sterl...@bumblebury.com>
Subject Talking about PSP: Internals
Date Wed, 09 Apr 2003 21:19:25 GMT
Hi,

With the integration of PSP and mod_python, and some of the future
directions for PSP to take, I wanted to bring up some of my ideas on
list, to hear what people think.  I'll start with some of the internal
stuff first, and then move onto the API stuff.

One of the first things I want to do with PSP is make sure it works
under Apache's threaded model.  Currently PSP uses a C lexer based on
flex, so global variables are used both when interfacing with flex
itself (extern FILE *yyin, for example) and when parsing the document
(the rewritten PSP document is accumulated in a global variable).

The "easy" solution (and the one I know) is simply to rewrite the lexer
in C++.  I can define a class that extends the yyFlexLexer class and
has local storage.  This class is then used for all parsing; it's a
relatively elegant solution.  The main problem with this, however, is
that mod_python is a C project, and I'm wondering if it's OK to mix the
two (most systems *should* have a C++ compiler).
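
To illustrate the idea of moving scanner state out of globals and into
an object, here is a minimal sketch in Python rather than C++.  The
class name PSPScanner, its attributes, and the "<% ... %>" delimiters
are assumptions made for the example; this is not the actual PSP lexer,
only the shape of a scanner whose state is per-instance.

```python
class PSPScanner:
    """Hypothetical scanner: all state lives on the instance, none is global."""

    def __init__(self, source):
        self.source = source   # plays the role of flex's global yyin
        self.pos = 0           # current read position
        self.output = []       # plays the role of the global rewrite buffer

    def scan(self):
        # Split the document into ("text", ...) and ("code", ...) chunks.
        # The "<%" / "%>" delimiters are assumed for this example.
        while self.pos < len(self.source):
            start = self.source.find("<%", self.pos)
            if start == -1:
                # No more code blocks: the rest is plain text.
                self.output.append(("text", self.source[self.pos:]))
                self.pos = len(self.source)
                break
            if start > self.pos:
                self.output.append(("text", self.source[self.pos:start]))
            end = self.source.find("%>", start + 2)
            if end == -1:
                raise ValueError("unterminated <% block")
            self.output.append(("code", self.source[start + 2:end].strip()))
            self.pos = end + 2
        return self.output
```

Because each scanner owns its own position and output buffer, two
threads can each create a PSPScanner and scan different documents
without sharing any state.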

I'm open to other solutions.  I know re2c supports developing
thread-safe scanners (although I'm not really familiar with it), so I
guess that's an option (re2c scanners are a bit faster than flex
scanners in most cases).

The second, less desirable option is to use mutexes.  I would prefer
just a big old mutex around the parse loop, as PSP scripts will rarely
be reparsed and will mostly be accessed from a hashtable cache.  Either
that, or we could use thread-local storage (TLS), which is slow, and
store the context there.

That leads me to the second area where thread safety is an issue.  I
currently maintain a global cache of compiled PSP files (PyObject*'s).
This needs to either be mutexed, or, again, stored in TLS.  Does Apache
2 currently have an API for storing global data (perhaps a table
allocated from r->server->pool)?

My second internal issue with PSP is the hashtable of precompiled
files.  This idea was taken from the PHP world, where we have external
accelerators, like the Zend Cache or the ionCube encoder.  These caches
deep-copy objects into shared memory, and then access them from a
(size-configurable) pool.

PSP is still young, so for quickness I decided to just store the data
in process memory.  Storing the data in process memory is faster (both
to implement and at run time).  However, the current implementation has
a drawback: if you use large files, all of that gets tacked on to the
process's memory.  Therefore, with larger scripts it's possible to get
enormous httpd processes lying around.

The first step is adding a quick option to httpd.conf that says "max
hash size, then purge."  This is something I'll definitely implement in
the short run, but I'm interested in what people think regarding shared
memory storage.  Do you think it's worth it to add a SHM cache, and does
httpd provide an API for cleanly storing data in shared memory across
multiple systems?
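
The "max hash size, then purge" behaviour could look roughly like the
sketch below.  The class name PurgingCache and the max_entries
parameter are hypothetical, and the size is counted in entries here
purely as an assumption; a real limit might be measured in bytes
instead.

```python
class PurgingCache:
    """Hypothetical cache that empties itself once it reaches a size limit."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = {}

    def get(self, key):
        # Returns None on a miss.
        return self._entries.get(key)

    def put(self, key, value):
        # Adding a new key to a full cache purges everything first,
        # which bounds how much memory the cache can hold on to.
        if key not in self._entries and len(self._entries) >= self.max_entries:
            self._entries.clear()
        self._entries[key] = value

    def __len__(self):
        return len(self._entries)
```

A full purge is crude compared with evicting individual entries, but it
is simple and keeps the per-process cache from growing without bound.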

Ok, well, that's what I'm thinking about at the moment regarding PSP
internals.  I'm interested to hear all your thoughts on this. :)

-Sterling
-- 
"A business that makes nothing but money is a poor kind of business." 
    - Henry Ford

