qpid-dev mailing list archives

From Håkan Johansson (JIRA) <j...@apache.org>
Subject [jira] [Created] (QPID-8056) qpid::messaging::ConnectionContext crash after network disconnect (with patch)
Date Tue, 05 Dec 2017 08:20:00 GMT
Håkan Johansson created QPID-8056:

             Summary: qpid::messaging::ConnectionContext crash after network disconnect (with
                 Key: QPID-8056
                 URL: https://issues.apache.org/jira/browse/QPID-8056
             Project: Qpid
          Issue Type: Bug
          Components: C++ Client
    Affects Versions: qpid-cpp-1.36.0
         Environment: RedHat Enterprise Linux 6
            Reporter: Håkan Johansson

When doing HA testing we found that our application often crashed inside the Qpid Messaging

Our test:
* One ActiveMQ broker.
* Two proxies connecting to the AMQP port on the broker. At the start, only one of the proxies
is running.
* Test program configured to use failover between the two proxies. Protocol is "amqp1.0".
It reads messages in a loop using a transactional session. On error it closes the connection
and opens a new one.
* Send some messages and let the test program process them.
* Stop proxy1 and start proxy2.
* Send some more messages and let the test program process them.
* Stop proxy2 and start proxy1.
* And so on...

After a couple of switches the test program crashes, but not every time; whether it crashes depends on timing.
A typical error message that we see before the crash:
Exception when trying to close the qpid connection: Transaction outcome unknown: transport

The reason for the crash is that the poller thread is still active when the connection is
being deleted. The destructor of the {{qpid::messaging::ConnectionContext}} class deletes
the {{TcpTransport}} instance at the same time as, or right before, the poller thread is calling
a callback on it ({{qpid::messaging::amqp::TcpTransport::disconnected}}).
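
To illustrate the ordering problem described above, here is a minimal sketch using only the C++ standard library. It is not the attached patch and not Qpid code: {{Transport}}, {{Context}} and {{exerciseContext}} are hypothetical stand-ins, and the busy polling loop only imitates a poller thread invoking a callback. The point it shows is that the owner stops and joins the polling thread before the transport is destroyed, so no callback can run on a deleted object.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical stand-in for the transport that receives poller callbacks.
class Transport {
public:
    void disconnected() { ++callbacks_; }
    int callbacks() const { return callbacks_.load(); }
private:
    std::atomic<int> callbacks_{0};
};

// Hypothetical stand-in for the context that owns the transport.
class Context {
public:
    Context()
        : transport_(new Transport), running_(true), poller_([this] { poll(); }) {}

    ~Context() {
        // Stop the poller and wait for it to finish first. Only after join()
        // returns are the members destroyed (in reverse declaration order),
        // so the transport is freed when no callback can be in flight.
        running_ = false;
        poller_.join();
    }

    int callbacksSeen() const { return transport_->callbacks(); }

private:
    // Imitates a poller thread that keeps calling back into the transport.
    void poll() {
        while (running_.load()) {
            assert(transport_ && "transport must outlive the poller");
            transport_->disconnected();
            std::this_thread::yield();
        }
    }

    // Declaration order matters: the thread is declared last, so it starts
    // after the transport exists and is destroyed before the transport.
    std::unique_ptr<Transport> transport_;
    std::atomic<bool> running_;
    std::thread poller_;
};

// Creates a context, waits until at least one callback has run, then lets
// the context go out of scope while the poller is still active.
int exerciseContext() {
    int seen = 0;
    {
        Context ctx;
        while (ctx.callbacksSeen() == 0) std::this_thread::yield();
        seen = ctx.callbacksSeen();
    }
    return seen;
}
```

If the destructor instead released the transport before joining the thread, the poller could call {{disconnected()}} on freed memory, which is the kind of crash reported here.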

I have attached a patch to solve the issue, at least for this use case.

I cannot test this on {{1.37.0}} because I cannot build that version on RHEL6: RHEL6 uses Python
2.6, which is no longer supported in {{1.37.0}}. The code in question is identical in {{1.36.0}}
and {{1.37.0}}, though.

