qpid-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Arnaud Simon <asi...@redhat.com>
Subject Re: MessageListenerMultiConsumerTest deadlock when run in 0-10 code path using ant
Date Mon, 03 Mar 2008 11:03:55 GMT

On Mon, 2008-03-03 at 10:22 +0000, Aidan Skinner wrote:
> On Fri, Feb 29, 2008 at 8:23 PM, Rafael Schloming <rafaels@redhat.com> wrote:
> 
> > The deadlock is between the session's _messageDeliveryLock and the
> >  Dispatcher's _lock. The dispatcher thread's main loop attempts to
> >  acquire _lock first and then _messageDeliveryLock. The main thread
> >  acquires _messageDeliveryLock first thing on close(...) and then
> >  subsequently many levels down the stack attempts to acquire _lock in
> >  order to call Dispatcher.rejectPending(...).
> 
> We found (and fixed) several similar problems on M2, it's probably
> worth having a look at that code to make sure there aren't any cases
> being missed.
> 
> - Aidan
> 

Rafael is absolutely right, Session.close is holding a lock on
_messageDeliveryLock only to prevent messages being delivered to
consumers when the session is closing (note that this lock should also
be held when consumer are closing, see Qpid-812). There is however no
reason for the dispatcher thread to keep a lock on _lock when
dispatching a message. _lock is used for stopping/starting the
dispatcher thread and _messageDeliveryLock for preventing message
delivery. So, we should change the run method of dispatcher for
releasing _lock and rollback and rejectPending for synchronizing on
_messageDeliveryLock. 
Note that this deadlock situation would not happen with 0.8 as an
external thread is calling rejectPending but (if the dispatcher is
trying to dispatch a message) this thread will need to wait until close
finishes and will create a new dispatcher thread that will not be
closed.
The way the dispatcher thread is closed should also be changed. The
close operation currently interrupts the thread this means that the
dispatcher does not have a chance to acquire a lock on
_messageDeliveryLock for correctly releasing the message it may be
waiting to dispatch. 

Arnaud

 


Mime
View raw message