qpid-dev mailing list archives

From Håkan Johansson (Jira) <j...@apache.org>
Subject [jira] [Updated] (QPID-8438) qpid_messaging (amqp1.0) randomly hangs when closing the connection.
Date Tue, 28 Apr 2020 05:31:00 GMT

     [ https://issues.apache.org/jira/browse/QPID-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Håkan Johansson updated QPID-8438:
----------------------------------
    Description: 
This is basically QPID-8179 again.

We are using {{qpid_messaging}} to communicate with an ActiveMQ broker using the {{amqp1.0}}
protocol.

We have noticed, both in large-scale tests and in production, that our program sometimes
hangs while closing the connection to the broker. This is a very rare occurrence, but it
happens often enough to be annoying, especially when it happens in production.

While the design of the {{amqp1.0}} implementation is generally nice, I have noticed a lack
of lifecycle management for the helper objects, especially the ones that interact with background
threads. The background thread is shut down when the last remaining connection is closed, but
not enough care is taken to make sure that it is in a safe state to do so.

If an event is being processed when the connection is closed, a deadlock can occur,
as seen in this stack trace:
{noformat}
Thread 1 (Thread 0x7fbaa5857880 (LWP 47152)):
#0  0x00007fbaa1428965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fba7e872c72 in qpid::sys::Condition::wait(qpid::sys::Mutex&) () from /opt/jeppesen/jcms/lib/x86_64_linux/libqpidcommon.so.2
#2  0x00007fba7e878c0b in qpid::sys::TimerTask::cancel() () from /opt/jeppesen/jcms/lib/x86_64_linux/libqpidcommon.so.2
#3  0x00007fba7f2ad056 in qpid::messaging::amqp::ConnectionContext::close() () from /opt/jeppesen/jcms/lib/x86_64_linux/libqpidmessaging.so.2
{noformat}
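
To make the suspected hazard concrete, here is a minimal sketch of the lock-ordering pattern that the trace suggests. It is not the qpid code. It assumes that {{cancel()}} blocks until an in-flight callback has returned (consistent with frames #1 and #2), and that the callback needs a lock that the closing thread already holds; the second point is an assumption, the trace does not show it. {{SketchTask}}, {{cancelSucceeds}}, {{connectionLock}} and the 500 ms timeout are hypothetical names and values. A timed wait replaces the indefinite wait so that the sketch reports the hang instead of reproducing it.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a timer task; NOT the qpid::sys::TimerTask API.
class SketchTask {
public:
    // Run by the timer thread: marks the task busy around the callback.
    void fire(const std::function<void()>& callback) {
        {
            std::lock_guard<std::mutex> l(state_);
            if (cancelled_) return;
            running_ = true;
        }
        callback();
        {
            std::lock_guard<std::mutex> l(state_);
            running_ = false;
        }
        idle_.notify_all();
    }

    // Assumed behaviour: cancel() waits for an in-flight callback to return.
    // The timeout lets the sketch report a hang instead of hanging forever.
    bool cancel(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> l(state_);
        cancelled_ = true;
        return idle_.wait_for(l, timeout, [this] { return !running_; });
    }

private:
    std::mutex state_;
    std::condition_variable idle_;
    bool running_ = false;
    bool cancelled_ = false;
};

// Returns true if cancel() completed, false if it timed out (the "hang").
bool cancelSucceeds(bool holdLockDuringCancel) {
    std::mutex connectionLock;  // hypothetical lock guarding connection state
    SketchTask ticker;
    std::promise<void> started;
    std::future<void> inFlight = started.get_future();

    // The closing thread takes the connection lock first.
    std::unique_lock<std::mutex> held(connectionLock);

    std::thread timer([&] {
        ticker.fire([&] {
            started.set_value();  // the callback is now in flight
            std::lock_guard<std::mutex> l(connectionLock);  // needs the same lock
        });
    });

    inFlight.wait();  // wait until an event is being processed
    if (!holdLockDuringCancel) held.unlock();  // safer order: release, then cancel
    bool ok = ticker.cancel(std::chrono::milliseconds(500));

    if (held.owns_lock()) held.unlock();  // let the callback finish so join() returns
    timer.join();
    return ok;
}
```

In this sketch, {{cancelSucceeds(true)}} times out because the callback and the closing thread each wait for the other, while {{cancelSucceeds(false)}} completes because the lock is released before the wait. Whether the real hang follows exactly this pattern would need to be confirmed against the qpid sources.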

In addition, we often see this in our logs just before the hang:
{noformat}
ConnectionTicker couldn't setup next timer firing: -7495340780ns[7.5s]
{noformat}


  was:
This is basically QPID-8179 again.

We are using {{qpid_messaging}} to communicate with an ActiveMQ broker using the {{amqp1.0}}
protocol.

We have noticed, both in large-scale tests and in production, that our program sometimes
hangs while closing the connection to the broker. This is a very rare occurrence, but it
happens often enough to be annoying, especially when it happens in production.

While the design of the {{amqp1.0}} implementation is generally nice, I have noticed a lack
of lifecycle management for the helper objects, especially the ones that interact with background
threads. The background thread is shut down when the last remaining connection is closed, but
not enough care is taken to make sure that it is in a safe state to do so.

If an event is being processed when the connection is closed, a deadlock can occur,
as seen in this stack trace:
{noformat}
Thread 1 (Thread 0x7fbaa5857880 (LWP 47152)):
#0  0x00007fbaa1428965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fba7e872c72 in qpid::sys::Condition::wait(qpid::sys::Mutex&) () from /opt/jeppesen/jcms/lib/x86_64_linux/libqpidcommon.so.2
#2  0x00007fba7e878c0b in qpid::sys::TimerTask::cancel() () from /opt/jeppesen/jcms/lib/x86_64_linux/libqpidcommon.so.2
#3  0x00007fba7f2ad056 in qpid::messaging::amqp::ConnectionContext::close() () from /opt/jeppesen/jcms/lib/x86_64_linux/libqpidmessaging.so.2
{noformat}



> qpid_messaging (amqp1.0) randomly hangs when closing the connection.
> --------------------------------------------------------------------
>
>                 Key: QPID-8438
>                 URL: https://issues.apache.org/jira/browse/QPID-8438
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Client
>    Affects Versions: qpid-cpp-1.39.0
>            Reporter: Håkan Johansson
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org

