quetz-mod_python-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Fraser <dav...@sjsoft.com>
Subject Re: FileSession - a couple of small fixes
Date Thu, 21 Apr 2005 12:48:21 GMT
Hi Nicolas

120 seconds to process is *exactly* what I meant, sorry if that was not 
clear.
I often process lots of files etc in response to a request. I know this 
is a corner case and I'm not actually using mod_python's handling for 
these sessions, but I just thought it was important to point out. 
Usually with locking bugs there's no easy way to get around them :-)
Of course you may feel that this doesn't justify supporting, which is 
fine, as long as its documented so.

David

Nicolas Lehuen wrote:

>Hi
>
>You are describing a sequence of events that would happen if the
>request took longer than 120 to *process*. A 120 seconds processing
>time is not the same thing as a 120 second pause between two
>consecutive requests.
>
>Jim's solution does handle the 120 second pause between requests ; for
>a processing time of 120 second which happens at the end of a session,
>it will unduly erase the session (unless the session is saved before
>the long processing time), but I guess it is safe to forget about this
>possibility. That was not what you meant, anyway, wasn't it ?
>
>Regards,
>Nicolas
>
>On 4/21/05, David Fraser <davidf@sjsoft.com> wrote:
>  
>
>>Jim Gallacher wrote:
>>
>>    
>>
>>>Hi Nicolas,
>>>
>>>I've had a chance to review your analysis and I think there is a
>>>really simple solution. Is there an emoticon for a light bulb going on?
>>>
>>>Nicolas Lehuen wrote:
>>>
>>>      
>>>
>>>>The sequence of events that we don't want is :
>>>>
>>>>- process/thread A loads an old session S1 (older than the session
>>>>timeout) which has not been cleaned up yet. It begins processing the
>>>>request.
>>>>- process/thread B loads another session S2, the random test on
>>>>session cleanup pass, so a session cleanup is triggered. S1 is seen as
>>>>too old so it...
>>>>- process/thread A saves S1 to disk, with a brand new access time
>>>>- process/thread B ... erases the session file for S1 since it is too
>>>>old. Ooops, we've lost a perfectly valid session.
>>>>
>>>>There is another scenario which seems OK to me :
>>>>
>>>>- A loads an old session S1
>>>>- B loads another session S2, triggers cleanup, delete the old session
>>>>file for S1
>>>>- A goes on processing the request then eventually saves S1 to disk
>>>>
>>>>No problem, we've just resurrected the session. Well it's not exactly
>>>>true : what if a third process/thread C tries to access the session S1
>>>>between the delete and the save ? Fortunately, this is not possible
>>>>due to the individual session locking.
>>>>        
>>>>
>>>True unless the application code creates the session without locking.
>>>Another instance of the need for the documentation to address the
>>>importance of session locking.
>>>
>>>      
>>>
>>>>Let's try to do this methodically. There are 2 concurrent sequences of
>>>>events : in process A, we have L,P,S as Loading session, Processing
>>>>request and Saving session. In process B, we have C,D as Checking the
>>>>session timeout and Deleting the session because it's too old. Thus,
>>>>we have 10 possible schedulings of these events :
>>>>
>>>>01) CDLPS => OK, the old session is lost and a new session is built
>>>>and saved
>>>>02) CLDPS => the session is resurrected ; individual session locking
>>>>is needed
>>>>03) CLPDS => the session is resurrected ; individual session locking
>>>>is needed
>>>>04) CLPSD => this is the problem mentioned above
>>>>05) LCDPS => the session is resurrected ; individual session locking
>>>>is needed
>>>>06) LCPDS => the session is resurrected ; individual session locking
>>>>is needed
>>>>07) LCPSD => this is the problem mentioned above
>>>>08) LPCDS => the session is resurrected ; individual session locking
>>>>is needed
>>>>09) LPCSD => this is the problem mentioned above
>>>>10) LPSCD => this won't happen since when the session is saved, its
>>>>timeout goes down to 0
>>>>
>>>>First, let's notice that the problem of deleting a valid session must
>>>>be quite rare. You'll have to send a request with an session that
>>>>should have timed out, just as a random occuring session cleanup takes
>>>>place, and you have to fall into 3 of the 10 possible schedulings
>>>>mentioned above.
>>>>        
>>>>
>>>Yes, rare but possible.
>>>
>>>      
>>>
>>>>Anyway, if we really want to make this rock-solid, we'll have to find
>>>>a locking scheme that prevents the scheduling sequences 4,7 and 9 to
>>>>happen. I can see two solutions :
>>>>
>>>>1) individually acquire the session lock before checking for the
>>>>session and deleting it.
>>>>2) have a reader / writer lock (a kind of lock which can be shared for
>>>>reading but is exclusive for writing), so that sessions get a read
>>>>lock and the cleanup code tries to get a write lock.
>>>>
>>>>The problem in both solutions is that the lock can be held for quite a
>>>>long time, so you'll penalize either the cleanup (which runs into a
>>>>normal request processing thread/process, remember !) or the other
>>>>requests. I really don't think that preventing the quite unprobable
>>>>cases 4,7 and 9 is worth the price of slowing everyone.
>>>>        
>>>>
>>>So lets avoid all session locking. See below.
>>>
>>>      
>>>
>>>>So I suggest this : why not refuse to load a timed-out session ? If
>>>>the session is timed-out, then drop it AT LOAD TIME and create a new
>>>>one. This way, there won't be any race condition between the request
>>>>processing and the cleaning of sessions.
>>>>        
>>>>
>>>BaseSession.load(), which calls FileSession.do_load() already checks
>>>if the session has expired. Expired sessions will never get loaded. We
>>>still have the problem of cases 4,7 and 9 where the session save falls
>>>between the cleanup check and the cleanup delete. .
>>>
>>>And then the light came on...
>>>
>>>The important question that needs to asked - What is the purpose of
>>>running the cleanup?
>>>
>>>Answer: We don't want to consume all of our disk space with expired
>>>session files.
>>>
>>>WHEN the cleanup runs is really not all that relevant. As long as
>>>there is enough free disk space we could run it only once a year if we
>>>wanted. So why the obsession with the cleanup interfering with a
>>>session that is on the brink of expiring? Give the session a grace
>>>period before it is cleaned up and all will be well.
>>>
>>>Our cleanup code becomes something like this:
>>>
>>>    # Add a grace period of 120 seconds to the timeout
>>>    # in case a request occurs for a session occurs at approx
>>>    # the time it is about to expire
>>>
>>>    grace_period = 120  # seconds
>>>    mtime = os.stat(filename).st_mtime
>>>    if time.time() - mtime < (fast_timeout + grace_period):
>>>        continue
>>>    else:
>>>        # do the rest of the cleanup
>>>
>>>As long as grace_period is longer than the longest request the race
>>>condition expressed in cases 4,7 and 9 will never occur. At this point
>>>I think we can completely avoid session locking in the cleanup, and my
>>>concern about global lock collisions also disappears.
>>>
>>>I'll modify the code and do some testing tonight. If all goes well you
>>>should see the new FileSession tomorrow morning.
>>>      
>>>
>>I'm not so sure if this is always going to work. I quite often have HTTP
>>requests that stay open for several minutes.
>>So case 9 above could cause a problem: A Loads, A Processes, [> 120
>>seconds elapses], B Checks, A Saves, B Deletes
>>
>>Or am I misreading this?
>>
>>David
>>
>>    
>>
>
>  
>


Mime
View raw message