qpid-dev mailing list archives

From Håkan Johansson (JIRA) <j...@apache.org>
Subject [jira] [Updated] (QPID-7051) Crash after reconnect with transactional session (with patch)
Date Thu, 02 Nov 2017 08:43:00 GMT

     [ https://issues.apache.org/jira/browse/QPID-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Håkan Johansson updated QPID-7051:
    Attachment: qpid-7051.patch

Patch for 1.36.0. This one works better than the previous one.

> Crash after reconnect with transactional session (with patch)
> -------------------------------------------------------------
>                 Key: QPID-7051
>                 URL: https://issues.apache.org/jira/browse/QPID-7051
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Client
>    Affects Versions: qpid-cpp-0.34
>         Environment: Red Hat Enterprise Linux Server release 6.7 (Santiago)
> The broker is ActiveMQ 5.13.0.
> The protocol used is AMQP 1.0.
>            Reporter: Håkan Johansson
>            Priority: Major
>         Attachments: consumer.cc, producer.cc, qpid-7051.patch, qpid-cpp.patch
> I have a test program (see the "consumer.cc" attachment) that creates a connection with "reconnect" enabled.
> It then creates a transactional session and a receiver to some queue from that session.
> It then reads all messages from the queue and prints out their content.
> A sleep is used between each read to make the test possible.
> While the broker is down the program will try to reconnect to it.
> As soon as it succeeds with that the fetch call throws an exception because the transaction has become invalid.
> The exception is caught and the read loop is broken out of.
> The test function then exits, causing the _Receiver_, _Session_, and _Connection_ objects to be destructed.
> The crash happens while destructing the _Connection_ object.
> It took some digging, but I managed to find the reason for the crash.
> When the _Connection_ object is destructed it automatically destructs its _ConnectionHandle_ object, which in turn destructs its _ConnectionContext_ object. Nothing strange here.
> The _ConnectionContext_ destructor makes a call to its own _close_ method, which tries to shut down all its sessions.
> The problem is that the session has been made invalid by the disconnect, which causes the call to _syncLH_ to throw an exception, which is not caught anywhere, indirectly causing the _ConnectionContext_ destructor to throw an exception. This is a big no-no in C++.
> A side effect of this is that the transport object is not closed before it is destructed, which means that it is still listening for events. The crash happens when the next pending event tries to use the destructed transport object.
> The solution, in my humble opinion, is to catch the exception thrown by the _syncLH_ call in the _ConnectionContext::close_ method.
> This way we can try to close all sessions even if one or more of them are invalidated for some reason.
> The rest of the cleanup process will also be done properly.
> How to run the test program:
> * Compile both "producer.cc" and "consumer.cc". They both need to be linked to the "qpidmessaging" library.
> * Run "producer" once. This will add ten messages to the "apa.bepa" queue on the broker.
> * Start "consumer".
> * When the consumer starts to print out the messages, shut down and restart the broker.
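
The fix proposed in the description (catching the exception from _syncLH_ inside _ConnectionContext::close_ so that the remaining sessions and the transport are still closed) can be illustrated with a small stand-alone sketch. This is not the attached patch and does not use any Qpid code: _MockSession_, _MockTransport_, _MockConnectionContext_ and _closeSurvivesInvalidSession_ are hypothetical names made up for the example, and _MockSession::sync_ merely stands in for the _syncLH_ call that throws on an invalidated session.

```cpp
#include <cassert>
#include <exception>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a session; not a Qpid class.
struct MockSession {
    std::string name;
    bool valid;    // false once a disconnect has invalidated the session
    bool closed;

    MockSession(const std::string& n, bool v) : name(n), valid(v), closed(false) {}

    // Stand-in for the sync step: throws if the session is invalid.
    void sync() {
        if (!valid) throw std::runtime_error("session " + name + " is invalid");
    }

    void close() {
        sync();          // may throw
        closed = true;   // only reached if sync() succeeded
    }
};

// Hypothetical stand-in for the transport that must always be closed.
struct MockTransport {
    bool closed;
    MockTransport() : closed(false) {}
    void close() { closed = true; }
};

class MockConnectionContext {
public:
    MockConnectionContext(std::vector<MockSession>& s, MockTransport& t)
        : sessions(s), transport(t) {}

    // The destructor can call close() safely because close() swallows
    // per-session failures instead of letting them propagate.
    ~MockConnectionContext() { close(); }

    void close() {
        for (MockSession& session : sessions) {
            try {
                session.close();
            } catch (const std::exception& e) {
                // One broken session must not stop the rest of the cleanup.
                std::cerr << "ignoring error while closing session: "
                          << e.what() << '\n';
            } catch (...) {
                std::cerr << "ignoring unknown error while closing session\n";
            }
        }
        transport.close();  // always reached, even if a session failed
    }

private:
    std::vector<MockSession>& sessions;
    MockTransport& transport;
};

// Three sessions, the middle one invalidated. Returns true if both valid
// sessions and the transport were closed when the context was destructed.
bool closeSurvivesInvalidSession() {
    std::vector<MockSession> sessions;
    sessions.push_back(MockSession("s1", true));
    sessions.push_back(MockSession("s2", false));
    sessions.push_back(MockSession("s3", true));
    MockTransport transport;
    {
        MockConnectionContext context(sessions, transport);
    }  // the destructor runs close() here
    return sessions[0].closed && !sessions[1].closed &&
           sessions[2].closed && transport.closed;
}
```

Without the try/catch inside the loop, the exception from the second session would leave the third session and the transport open, which corresponds to the state described above in which the destructed transport is still listening for events.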
