quetz-mod_python-dev mailing list archives

From Jim Gallacher <jg.li...@sympatico.ca>
Subject Re: FileSession - a couple of small fixes
Date Fri, 15 Apr 2005 04:10:53 GMT
Nicolas Lehuen wrote:
> Anyway, I've made some tests like you, and it seems that unless I've
> done something stupid in the _global_trylock function, the
> implementation of trylock in the APR library is flaky on Win32.  With
> one single client, everything is OK, but as soon as I test with more
> clients (using ab -c 30 for example), I randomly get strange errors in
> the error log, once in a while :
> 

I tried a new locking scheme, attached for discussion purposes. Each 
session is locked and unlocked sequentially. I have session locking on 
by default, so the request handling the cleanup has its session locked. 
I thought I was being clever by making sure that the cleanup request did 
not try to lock itself.

For my tests, I had a large number of session files, but the cleanup 
code would only scan from 2 to 28 files before deadlocking. So much for 
being clever.

Much banging of head against monitor ensued. And then the light went on...

We only have (typically) 31 mutexes available and they are stored in an 
array, not the beloved Python dict. Each session, which is represented 
by an md5 hash, will be mapped to an integer between 1 and 31.

Here is the relevant bit of code for _global_lock and/or _global_trylock 
from _apachemodule.c

     // key is session id passed into the function.
     int hash = PyObject_Hash(key);
     hash = abs(hash);
     index = (hash % (glb->nlocks-1)+1);
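
To get a feel for how few slots that leaves, here is a rough Python 
sketch of the same arithmetic. It assumes glb->nlocks is 32 with slot 0 
reserved, uses Python's hash() in place of PyObject_Hash, and makes up 
session ids as md5 digests of a counter; make_session_ids and 
colliding_sessions are hypothetical helpers, not part of mod_python.

```python
import hashlib

NLOCKS = 32  # assumed value of glb->nlocks; slot 0 is assumed to be reserved


def lock_index(session_id, nlocks=NLOCKS):
    """Mirror of the C expression: (abs(hash) % (nlocks - 1)) + 1."""
    return abs(hash(session_id)) % (nlocks - 1) + 1


def make_session_ids(count):
    """Hypothetical session ids: md5 hex digests of a counter."""
    return [hashlib.md5(str(i).encode()).hexdigest() for i in range(count)]


def colliding_sessions(own_sid, other_sids, nlocks=NLOCKS):
    """Return the session ids that map to the same mutex as own_sid."""
    own = lock_index(own_sid, nlocks)
    return [sid for sid in other_sids if lock_index(sid, nlocks) == own]


if __name__ == "__main__":
    sids = make_session_ids(1000)
    shared = colliding_sessions(sids[0], sids[1:])
    print(f"{len(shared)} of {len(sids) - 1} other sessions share a mutex with the first one")
```

With only 31 possible indices, roughly one session in 31 should land on 
any given mutex, so any sizeable directory of session files is bound to 
contain several that share a mutex with the current request.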

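Scanning a large number of session files, with one of them already 
locked by the request running the cleanup, will almost certainly result 
in a deadlock: sooner or later another session id maps to the same mutex 
index as the cleanup request's own session, and the cleanup blocks on a 
mutex it already holds. When I unlocked the session of the cleanup 
request, the deadlock issue disappeared.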
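
A small Python sketch of that failure mode, for illustration only. It 
is not the attached code: plain threading.Lock objects stand in for the 
APR global mutexes, NLOCKS = 32 is an assumed value for glb->nlocks, and 
scan_until_stuck is a hypothetical name. The scan uses a non-blocking 
acquire so that it reports where a blocking lock would hang instead of 
actually hanging.

```python
import hashlib
import threading

NLOCKS = 32  # assumed glb->nlocks; index 0 is assumed to be reserved

# Stand-in for the global mutex array: plain, non-reentrant locks.
locks = [threading.Lock() for _ in range(NLOCKS)]


def lock_index(session_id):
    """Same arithmetic as the C snippet: (abs(hash) % (nlocks - 1)) + 1."""
    return abs(hash(session_id)) % (NLOCKS - 1) + 1


def scan_until_stuck(own_sid, sids, hold_own_lock):
    """Lock and unlock each session in turn, as the cleanup scan does.

    Returns how many sessions were scanned before one could not be
    locked, or len(sids) if the whole scan completed.
    """
    own = locks[lock_index(own_sid)]
    if hold_own_lock:
        own.acquire()
    try:
        for scanned, sid in enumerate(sids):
            lock = locks[lock_index(sid)]
            if not lock.acquire(blocking=False):
                return scanned  # a blocking acquire would never return here
            lock.release()
        return len(sids)
    finally:
        if hold_own_lock:
            own.release()


if __name__ == "__main__":
    # Hypothetical session ids: md5 hex digests of a counter.
    sids = [hashlib.md5(str(i).encode()).hexdigest() for i in range(1000)]
    print("own session locked:  ", scan_until_stuck("cleanup", sids, True))
    print("own session unlocked:", scan_until_stuck("cleanup", sids, False))
```

With the cleanup request's own session locked, the scan stops at the 
first session id that shares its mutex index; with it unlocked, the scan 
runs to the end.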

I don't know if this is related to the random errors you've seen with 
_global_trylock, but the fact that it happens when serving 30 concurrent 
requests sure feels like there is a relationship.

I would ask that you not merge the attached code until I've done some 
additional testing. Plus it is polluted with log_error calls :)

Regards,
Jim

