qpid-dev mailing list archives

From Rafael Schloming <rafa...@redhat.com>
Subject Re: MessageListenerMultiConsumerTest deadlock when run in 0-10 code path using ant
Date Mon, 03 Mar 2008 13:15:18 GMT
Aidan Skinner wrote:
> On Fri, Feb 29, 2008 at 8:23 PM, Rafael Schloming <rafaels@redhat.com> wrote:
> 
>> The deadlock is between the session's _messageDeliveryLock and the
>>  Dispatcher's _lock. The dispatcher thread's main loop attempts to
>>  acquire _lock first and then _messageDeliveryLock. The main thread
>>  acquires _messageDeliveryLock first thing on close(...) and then
>>  subsequently many levels down the stack attempts to acquire _lock in
>>  order to call Dispatcher.rejectPending(...).
> 
> We found (and fixed) several similar problems on M2, it's probably
> worth having a look at that code to make sure there aren't any cases
> being missed.

I've just examined the M2.1 code, and I think the potential for the same 
deadlock exists there:

AMQSession.close(long timeout) acquires _messageDeliveryLock and then 
indirectly calls BasicMessageConsumer.close().

BasicMessageConsumer.close() calls syncWrite(BasicCancelOkBody.class). 
This will block until AMQSession.confirmConsumerCancelled() completes in 
another thread.

If the dispatcher thread happens to be holding _lock and waiting to
acquire _messageDeliveryLock, the confirmConsumerCancelled() call in the
other thread will never be able to acquire _lock, so close() never
returns and the same deadlock should occur.
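
To make the lock ordering concrete, here is a minimal standalone sketch of
the inversion. It is not Qpid code: the class LockOrderSketch, its methods,
and the two local locks are hypothetical stand-ins for _messageDeliveryLock
and the Dispatcher's _lock, and it models only the simple two-lock case, not
the extra hop through syncWrite() and confirmConsumerCancelled(). It uses
tryLock with a timeout so the demo reports the conflict instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names exist in Qpid.
class LockOrderSketch {

    // Takes `first`, waits until the other thread holds its own first lock,
    // then tries `second` with a timeout. Returns true if `second` was acquired.
    static boolean acquireInOrder(ReentrantLock first, ReentrantLock second,
                                  CountDownLatch bothHoldFirst,
                                  CountDownLatch bothTried,
                                  long timeoutMillis) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();          // both threads now hold one lock each
            boolean got = second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();              // keep `first` until both attempts are over
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    // Result[0]: dispatcher-style order (_lock, then _messageDeliveryLock).
    // Result[1]: close()-style order (_messageDeliveryLock, then _lock).
    static boolean[] run(long timeoutMillis) throws InterruptedException {
        ReentrantLock messageDeliveryLock = new ReentrantLock(); // stand-in
        ReentrantLock dispatcherLock = new ReentrantLock();      // stand-in for _lock
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        Thread dispatcher = new Thread(() -> gotSecond[0] = acquireInOrder(
                dispatcherLock, messageDeliveryLock,
                bothHoldFirst, bothTried, timeoutMillis));
        Thread closer = new Thread(() -> gotSecond[1] = acquireInOrder(
                messageDeliveryLock, dispatcherLock,
                bothHoldFirst, bothTried, timeoutMillis));

        dispatcher.start();
        closer.start();
        dispatcher.join();
        closer.join();
        return gotSecond;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] result = run(200);
        System.out.println("dispatcher got _messageDeliveryLock: " + result[0]);
        System.out.println("close() got _lock: " + result[1]);
    }
}
```

Both attempts time out, because each thread keeps its first lock until the
other has given up. With plain lock() calls in place of tryLock, the two
threads would block forever. Acquiring the two locks in the same order on
every path would be one way to avoid this, though whether that is practical
in the client code is a separate question.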

--Rafael

