quetz-mod_python-dev mailing list archives

From Sterling Hughes <sterl...@bumblebury.com>
Subject Re: Talking about PSP: Internals
Date Thu, 10 Apr 2003 18:13:41 GMT
On Thu, 2003-04-10 at 14:14, David Fraser wrote:
> Sterling Hughes wrote:
> >On Thu, 2003-04-10 at 02:48, David Fraser wrote:
> >
> >>Sterling Hughes wrote:
> >>
> >>>On Wed, 2003-04-09 at 20:30, Jack Diederich wrote:
> >>>
> >>>Few problems I have with a "pure" python implementation:
> >>>
> >>>- Direct memory control, more succinctly, the ability to check when
> >>>we're caching too much.
> >>>
> >>I think you'll find both of these issues would be better handled by 
> >>simple Python code ...
> >>You could easily keep a set of cached objects with last-used timestamps 
> >>and expire if desired, but I suspect that in the general case this 
> >>wouldn't even be needed...
> >>
> >Well, that depends.  I'm basing this on my experience as a developer on
> >the PHP project.  Remember, there are two things to consider in this case:
> >
> >1) large code bases
> >2) hosting companies
> >
> >Especially with apache2, where some virtual hosts will have a very large
> >number of processes, you can't afford even the possibility of rampant
> >process growth.  Is it possible to check allocated memory (what the
> >cache is taking up), and prune an LRU with python?   If not, at least a
> >portion of the scaffolding needs to be in C.
> >
> Hmmmm ... the .pyc compilation Grisha suggested should sort out a lot of 
> this.
> You can prune a tree, not sure about checking the memory... Don't have 
> too much else to add here.
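
A minimal sketch of the question above (checking what the cache is taking
up and pruning an LRU from Python). It is only an illustration, not code
from either implementation discussed in this thread. The class name
BoundedLRUCache and its methods are made up here, and the byte accounting
is an assumption: sys.getsizeof is shallow, so a caller caching a parsed
tree would have to pass its own size estimate.

```python
import sys
from collections import OrderedDict


class BoundedLRUCache:
    """Hypothetical LRU cache that prunes itself to an approximate byte budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> (value, size), oldest first

    def get(self, key, default=None):
        try:
            value, _size = self._entries[key]
        except KeyError:
            return default
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value, size=None):
        # Assumption: sys.getsizeof only measures the outer object, so
        # callers should pass a better estimate for nested structures.
        if size is None:
            size = sys.getsizeof(value)
        if key in self._entries:
            self.used_bytes -= self._entries.pop(key)[1]
        self._entries[key] = (value, size)
        self.used_bytes += size
        # Evict least recently used entries until we are within budget.
        # An entry larger than the whole budget is evicted immediately.
        while self.used_bytes > self.max_bytes and self._entries:
            _key, (_value, old_size) = self._entries.popitem(last=False)
            self.used_bytes -= old_size

    def __contains__(self, key):
        return key in self._entries

    def __len__(self):
        return len(self._entries)
```

Note that this only bounds one process's cache; it does nothing about the
per-process duplication raised elsewhere in the thread.
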
> >>>- Better internal manipulation, should it be necessary to do a direct
> >>>copy.  It seems shared memory extensions (at least that I've seen) only
> >>>serialize.
> >>>
> >>Not sure what you mean by "Better internal manipulation"? Why does the 
> >>parser need shared memory?
> >>
> >The parser doesn't, but the cache does (and if we were to implement the
> >parser in pure python, we would almost certainly need a cache).  As you
> >probably know, Apache has multiple processes serving the same content. 
> >Therefore, there are multiple versions of the same document cached.  It
> >allows us to be much more aggressive about caching a parsed document tree
> >if we can avoid duplication (you also only need to poison and reparse
> >once).
> >
> However, when writing mod_python code you generally don't have to be 
> aware of this - you just use one Python interpreter. So if the cache 
> were implemented in Python you wouldn't have to worry either ... Maybe 
> Grisha can comment here. In any case I think his idea of compiling to 
> .pyc files is the best way forward here...

.pyc is good, but it's at least a full order of magnitude slower than
shared memory.
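
A rough sketch of what sharing a compiled document between processes could
look like, for illustration only. It assumes multiprocessing.shared_memory
(Python 3.8 and later, so not something available when this thread was
written), and the helper names publish_code and load_code are made up.
Note that it still serializes through marshal, so it illustrates the
"only serialize" limitation rather than a direct copy, and it makes no
claim about relative speed.

```python
import marshal
import struct
from multiprocessing import shared_memory

# 8-byte little-endian length prefix; the OS may round the segment size
# up, so the payload length has to be stored explicitly.
_HEADER = struct.Struct("<Q")


def publish_code(source, filename="<template>"):
    """Compile source and copy the marshalled code object into a new
    shared memory segment. The caller must close() and unlink() it."""
    payload = marshal.dumps(compile(source, filename, "exec"))
    shm = shared_memory.SharedMemory(
        create=True, size=_HEADER.size + len(payload)
    )
    _HEADER.pack_into(shm.buf, 0, len(payload))
    shm.buf[_HEADER.size:_HEADER.size + len(payload)] = payload
    return shm


def load_code(name):
    """Attach to the named segment and rebuild the code object."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        (length,) = _HEADER.unpack_from(shm.buf, 0)
        payload = bytes(shm.buf[_HEADER.size:_HEADER.size + length])
    finally:
        shm.close()
    return marshal.loads(payload)
```

A worker process would call load_code(segment_name) and exec the result;
marshal data is only valid for the same Python version that produced it.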

> >
> >I'm going from the most popular web scripting language here (PHP(*)),
> >and *every* working compiler cache uses shared memory to store document
> >trees.
> >
> Hmmmm ... this confirms something I was thinking ... coming from a PHP 
> background, not a Python one, you're likely to approach Python as though 
> it were very much like PHP. I think it's important to realise that 
> Python is very much different to PHP.
> <attempt to avoid flame war on languages whilst still communicating>
> Note that I don't know that much about PHP, but I know a fair amount 
> about Python...

I like both languages a lot, so don't worry; I don't offend easily. :)

> 1) It's really easy to make modules for Python using C that can be 
> linked in.
> 2) A nice alternative (used by twistedmatrix.com, for example) is when 
> you do really need a C module, make a Python version of that module that 
> functions in the same way ... That way, you have the best of both worlds...

Why duplicate the code?  Perhaps I'm not understanding.  One of the
things I usually try to do when writing extensions in any language (I've
done it in Perl, Python and PHP, incidentally) is implement the core
"ugly" functionality in C, very fast and very feature-incomplete.  I then
wrap a native-language implementation around it that contains everything
that doesn't need to be in the core.

The current PSP implementation falls within that scope.
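
The two approaches above (a small C core wrapped in native-language code,
and a Python version of the same module) can be combined with the usual
import-fallback idiom. This is only a sketch: the module name _psp_core
and the function split_template are hypothetical, not the names used by
the actual PSP code, and the <% %> delimiters are assumed.

```python
import re

try:
    # Hypothetical C extension holding the fast, minimal core.
    from _psp_core import split_template
except ImportError:
    _TAG = re.compile(r"<%(.*?)%>", re.DOTALL)

    def split_template(text):
        """Pure-Python fallback with the same interface: split a template
        into ("text", ...) and ("code", ...) segments."""
        parts = []
        pos = 0
        for match in _TAG.finditer(text):
            if match.start() > pos:
                parts.append(("text", text[pos:match.start()]))
            parts.append(("code", match.group(1).strip()))
            pos = match.end()
        if pos < len(text):
            parts.append(("text", text[pos:]))
        return parts


def to_python_source(text):
    """Native-language wrapper: everything that need not be in the core.
    Handles flat statements only; no indentation tracking."""
    lines = []
    for kind, body in split_template(text):
        if kind == "text":
            lines.append("write(%r)" % body)
        else:
            lines.append(body)
    return "\n".join(lines)
```

Callers only ever import split_template, so the C version can be dropped
in or left out without changing the wrapper.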

> 3) Python syntax is very different from PHP. Therefore maybe other 
> approaches are worth looking for, though this one is clearly helpful.
> This, along with Michael C. Neel's comments has caused me to think I 
> should mention some other ideas, which I did in response to his mail

Great.  I didn't get that mail; perhaps the list is lagging?  I totally
agree with both you and Michael: there certainly should be other
features (I was going to bring that up in another mail, but we can
certainly talk about it on that thread).  PSP (the parser) is just a
small framework within which (or surrounding which) these other ideas
can exist.

> </attempt>
> >>>- There are more options in C.  Had I originally written this in python,
> >>>I might not be keen on porting it to C.  But since I have the parser
> >>>working quite well in C (reliably), it seems like less effort to just
> >>>flatten out thread issues, than port it to python, and deal with a whole
> >>>host of other issues.  Especially if in a year it turns out I need to
> >>>port it back to C for unforeseen reasons. :)
> >>>
> >>Of course, now that Jack has rewritten it in Python, this doesn't 
> >>necessarily hold ...
> >>This would actually be one of the major advantages of having it in 
> >>Python, the ease of making any changes.
> >>
> >Yeah - his lexer looks neat, I'll have to play around with it a bit
> >before I comment on it.  But, at this point, unless the other
> >scaffolding can be merged into the python code, I don't see the point of
> >moving just the lexer (imho).
> >
> >-Sterling 
> >
> Final point about Python: you can do just about anything in it.
> I think an approach where there is at least a Python alternative would 
> be really neat. Not that I'm expecting you to write it after you have 
> already written the C one...

Yeah.  Well, Jack has written one, so I think that can at least be made
available or linked to.  I have a few changes I want to make to the parser
before release (I want to change to <% %> for XML compat, and look into
a little bit of ASP compat, but using a standard set of objects; I plan
on discussing this first ;-).  But they are rather noninvasive, and since
Jack already has a Python version available, I see no reason it can't be
"Reductionists like to take things apart.  The rest of us are 
 just trying to get it together." 
    - Larry Wall, Programming Perl, 3rd Edition
