quetz-mod_python-dev mailing list archives

From Barry Pearce <barry.pea...@copyrightwitness.net>
Subject Re: [mod_python] Sessions performance and some numbers
Date Sat, 09 Apr 2005 08:09:03 GMT

Gregory (Grisha) Trubetskoy wrote:
> On Fri, 8 Apr 2005, Barry Pearce wrote:
>>> what's a .lck file? locking by definition *must* be an OS capability 
>>> or else it _is_ subject to race conditions.
>> errr. no. here is the code snippet:
>> persist = os.open(self.m_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 
>> 0660)
>> O_CREAT | O_EXCL ensures that any race is coped with by the operating 
>> system kernel
> you say "no" then say it is "coped with by the operating system" ;-)
> this is flock(2), BTW

No. It is implemented by open_namei() in /usr/src/linux/fs/namei.c,
which states that if the file already exists then the open fails. It's a
simple check. The file creation is protected by in-kernel semaphores.
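
For anyone following along, here is a minimal sketch of the O_CREAT | O_EXCL idea in present-day Python 3 (hence 0o660 rather than 0660). It is not the actual session code: the function names try_acquire and create_session_file, the ".lck" suffix and the use of the secrets module are all assumptions made for illustration.

```python
import os
import secrets


def try_acquire(path):
    """Try to create the lock file at `path`; return its fd, or None if it exists."""
    try:
        # O_CREAT | O_EXCL makes "check for existence + create" a single
        # operation: if the file is already there, the open fails.
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o660)
    except FileExistsError:
        return None


def create_session_file(directory, max_attempts=10):
    """Pick a random id and claim it; on a collision, pick another and retry."""
    for _ in range(max_attempts):
        session_id = secrets.token_hex(16)  # hypothetical id scheme
        fd = try_acquire(os.path.join(directory, session_id + ".lck"))
        if fd is not None:
            return session_id, fd
    raise RuntimeError("could not create a unique session file")
```

The loser of a race simply gets None back and tries again with a fresh random id; the caller is responsible for closing the returned descriptor and unlinking the file when done.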

>> - works equally well on mac/win32/unix/linux if two threads try it at 
>> precisely the same time one will win one will fail - the failed one 
>> simply re-evaluates the random and tries again. As yet though I have 
>> yet to see a collision on my code...I could only test by manufacturing 
>> the situation.
>>> Current locking uses the APR's global locking mechanism which decides 
>>> based on OS and the httpd config/compile options what the most 
>>> efficient locking mechanism is. There is no reason to reinvent the 
>>> wheel here - it's a complex problem that I trust the APR folks are 
>>> most qualified to solve.
>> Fair do. But it's not complex - given that part of the day job is linux 
>> kernel development - locking is not that bad!!! :)
> Well, maybe how the actual locking is done in the OS isn't complex 
> (though I believe it actually has to be supported in hardware).

The hardware generally doesn't feature in this - it's typically a VFS 
thing - certainly on UNIX, Linux and Mac (dunno about Windows). All 
locking is done very simply (in Linux it's just a linked list!!); see 
/usr/src/linux/fs/locks.c. But this means such locks come at a memory and 
performance cost - especially once you start talking about figures of 
10000. Every lock you allocate takes memory, and every time you check it 
you have to traverse a linked list - and if the one you want is at the 
bottom of the pile...

> The messy part is that different types of locks behave differently when 
> it comes to the same process or its children accessing the same lock, 
> and on top of that the definitions may differ across different OS's. 
> E.g. I don't think flock would be usable in a multi-threaded environment 

That's correct. Most work at pid level.

> since applying a new lock from the same process is a no-op. Also while 
> flock is a lock-per-file, an fcntl(2) lock can lock regions of a file... 
> Then there are SysV IPC semaphores which behave differently - finding 
> and correctly using the locks is the problem that the APR solves.

Right, but both flock (which is not used by the example above) and 
semaphores etc. have serious implications for the operating system when 
you start looking at 10000 of them. Semaphores typically don't get 
cleaned up when the process dies - bad news - you have used up an OS 
resource. File locks do. However, if you use a .lck file then 
technically you can support a million concurrently. By the time your 
system using flock/semaphores gets that far you have probably run out of 
memory on your server - so now I would be concerned about how long a 
session remains 'locked' for - otherwise it is easy to produce a DoS 
attack by simply starting up a large number of sessions within whatever 
the timeframe is...

With my file locks I can support a million, no problem... I haven't 
zapped precious system resources to make them work. They work across 
processes and threads on any platform. Should my apache process drop 
dead - yes, it will leave a lock behind - but I have an external python 
script that expires old session and lock files and removes them anyway. 
The problem with file-based sessions, as opposed to cookies, is that you 
only sometimes know when the client has finished - after all, with 
browser technology they can just close their browser at any point. Thus 
what you end up with when you do server-side sessions is a compromise - 
but it does mean that you have avoided cookies.
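
To give a rough idea of what such an external expiry script might look like (this is a sketch, not the script I actually run), something along these lines would do. The function name expire_stale_files, the ".lck"/".sess" suffixes and the choice of mtime as the age criterion are all assumptions for illustration.

```python
import os
import time


def expire_stale_files(directory, max_age_seconds, suffixes=(".lck", ".sess")):
    """Remove session/lock files not modified within max_age_seconds.

    Returns the names of the files that were removed.
    """
    now = time.time()
    removed = []
    for entry in os.scandir(directory):
        # Only touch regular files with one of the (hypothetical) suffixes.
        if not entry.is_file() or not entry.name.endswith(suffixes):
            continue
        try:
            if now - entry.stat().st_mtime > max_age_seconds:
                os.unlink(entry.path)
                removed.append(entry.name)
        except FileNotFoundError:
            # Another process removed the file first; nothing left to do.
            pass
    return removed
```

Run from cron with something like expire_stale_files("/path/to/sessions", 3600), where both the directory and the one-hour limit are placeholders; the right timeout is exactly the compromise discussed above.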

I assume that session expiry and subsequent cleanup have been thought 
of... so I won't go into any further detail on that tangent!

