qpid-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alan Conway (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (QPID-7182) [C++ broker] high CPU usage on backup brokers following QPID-7149 scenario
Date Wed, 06 Apr 2016 13:35:25 GMT

    [ https://issues.apache.org/jira/browse/QPID-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15228244#comment-15228244

Alan Conway commented on QPID-7182:

I can't reproduce on present trunk: eb1e785 origin/trunk origin/HEAD QPID-7149: Fixed messaging
client shutdown.

Not I am using a different debug logging change to QueueReplicator.cpp, that only logs when
the size grows, not on every dequeue:
    static size_t max = 0;
    if (e.ids.size() > max) {
        QPID_LOG(critical, logPrefix << "FIXME e.size()=" << e.ids.size());
        max = e.ids.size();

2 theories to explain the reported observations:

1. The first bug report was from a build without "006496cf Bug 1318180 - HA memory leak in
primary broker when overwriting messages in a ring queue" OR was somehow a consequence of
"eb1e785 origin/trunk origin/HEAD QPID-7149: Fixed messaging client shutdown." 

2. The warning added to QueueReplicator.cpp in https://issues.apache.org/jira/browse/QPID-7182?focusedCommentId=15223921&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15223921
showing growth of the deque events. I suspect the extra logging (on backup only for every
dequeue event) slowed the backups down with respect to the primary. The primary began to accumulate
large numbers of un-sent dequeues because of the slow down, which in turn meant the backups
were getting huge dequeue records to process, which slowed them down even more and caused
them to burn CPU.

> [C++ broker] high CPU usage on backup brokers following QPID-7149 scenario
> --------------------------------------------------------------------------
>                 Key: QPID-7182
>                 URL: https://issues.apache.org/jira/browse/QPID-7182
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Clustering
>    Affects Versions: qpid-cpp-next
>            Reporter: Pavel Moravec
>            Assignee: Alan Conway
> Following scenario from QPID-7149 with --ha-replicate=all, with whatever patch fixing
it applied or not, CPU usage of backup brokers grow over the time.
> gdb shows one active thread always with backtrace:
> {noformat}
> #0  0x00007f9295fc9c98 in find (this=0x7f9270055840, data=<value optimized out>)
>     at /usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../include/c++/4.4.7/tr1_impl/hashtable:786
> #1  qpid::ha::QueueReplicator::dequeueEvent (this=0x7f9270055840, data=<value optimized
>     at /data_xfs/qpid/cpp/src/qpid/ha/QueueReplicator.cpp:306
> #2  0x00007f9295fca82b in operator() (this=0x7f9270055840, deliverable=<value optimized
>     at /usr/include/boost/function/function_template.hpp:1013
> #3  qpid::ha::QueueReplicator::route (this=0x7f9270055840, deliverable=<value optimized
>     at /data_xfs/qpid/cpp/src/qpid/ha/QueueReplicator.cpp:329
> #4  0x00007f9296b9b854 in qpid::broker::SemanticState::route (this=0x7f927001d088, msg=...,
>     at /data_xfs/qpid/cpp/src/qpid/broker/SemanticState.cpp:506
> #5  0x00007f9296bb8ab7 in qpid::broker::SessionState::handleContent (this=0x7f927001cec0,
frame=<value optimized out>)
>     at /data_xfs/qpid/cpp/src/qpid/broker/SessionState.cpp:233
> #6  0x00007f9296bb90a1 in qpid::broker::SessionState::handleIn (this=0x7f927001cec0,
>     at /data_xfs/qpid/cpp/src/qpid/broker/SessionState.cpp:293
> #7  0x00007f92965d4c31 in qpid::amqp_0_10::SessionHandler::handleIn (this=0x7f927002fbb0,
>     at /data_xfs/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:93
> #8  0x00007f9296b29a2b in operator() (this=0x7f9270002060, frame=...) at /data_xfs/qpid/cpp/src/qpid/framing/Handler.h:39
> #9  qpid::broker::ConnectionHandler::handle (this=0x7f9270002060, frame=...)
>     at /data_xfs/qpid/cpp/src/qpid/broker/ConnectionHandler.cpp:93
> #10 0x00007f9296b247e8 in qpid::broker::amqp_0_10::Connection::received (this=0x7f9270001e80,
>     at /data_xfs/qpid/cpp/src/qpid/broker/amqp_0_10/Connection.cpp:198
> #11 0x00007f9296ab2863 in qpid::amqp_0_10::Connection::decode (this=0x7f92700018b0, buffer=<value
optimized out>, 
>     size=<value optimized out>) at /data_xfs/qpid/cpp/src/qpid/amqp_0_10/Connection.cpp:59
> #12 0x00007f92965fdca0 in qpid::sys::AsynchIOHandler::readbuff (this=0x7f92700009b0,
>     at /data_xfs/qpid/cpp/src/qpid/sys/AsynchIOHandler.cpp:138
> #13 0x00007f929657be89 in operator() (this=0x7f9270000a50, h=...) at /usr/include/boost/function/function_template.hpp:1013
> #14 qpid::sys::posix::AsynchIO::readable (this=0x7f9270000a50, h=...) at /data_xfs/qpid/cpp/src/qpid/sys/posix/AsynchIO.cpp:453
> #15 0x00007f92966025b3 in boost::function1<void, qpid::sys::DispatchHandle&>::operator()
(this=<value optimized out>, 
>     a0=<value optimized out>) at /usr/include/boost/function/function_template.hpp:1013
> #16 0x00007f9296601246 in qpid::sys::DispatchHandle::processEvent (this=0x7f9270000a58,
>     at /data_xfs/qpid/cpp/src/qpid/sys/DispatchHandle.cpp:280
> #17 0x00007f92965a1d1d in process (this=0x7961c0) at /data_xfs/qpid/cpp/src/qpid/sys/Poller.h:131
> ..
> {noformat}
> or with:
> {noformat}
> #0  0x00000032c4c0a7b0 in pthread_mutex_unlock () from /lib64/libpthread.so.0
> #1  0x00007fb0958038fa in qpid::sys::Mutex::unlock (this=<value optimized out>)
at /data_ext4/qpid/cpp/src/qpid/sys/posix/Mutex.h:120
> #2  0x00007fb095840628 in ~ScopedLock (this=0x112cfd0, data=<value optimized out>)
at /data_ext4/qpid/cpp/src/qpid/sys/Mutex.h:34
> #3  qpid::ha::QueueReplicator::dequeueEvent (this=0x112cfd0, data=<value optimized
>     at /data_ext4/qpid/cpp/src/qpid/ha/QueueReplicator.cpp:308
> ..
> {noformat}
> Not sure where the busy loop origins.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org

View raw message