qpid-dev mailing list archives

From "Jason Dillaman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (QPID-4286) QMF queries for HA replication take too long to process
Date Wed, 19 Sep 2012 15:55:08 GMT

     [ https://issues.apache.org/jira/browse/QPID-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Dillaman updated QPID-4286:
---------------------------------

    Attachment: qpid-4286.patch

Quick patch against the 0.18 branch that uses a dedicated lock for v1 QMF events instead of
the standard 'userLock', and enqueues v2 QMF commands for asynchronous processing so that
they do not block all available worker threads.
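
For readers who have not opened the attachment, the two ideas in the comment above can be sketched roughly as follows. This is a toy model in Python, not the attached patch and not the broker's C++ code: the class name, `event_lock`, `submit_query`, and the `scan_delay` parameter are all hypothetical, and only the 'userLock' name comes from the comment.

```python
import queue
import threading
import time


class ManagementAgentSketch:
    """Toy model of a management agent (hypothetical, not Qpid's class)."""

    def __init__(self):
        self.user_lock = threading.Lock()   # guards the managed-object table
        self.event_lock = threading.Lock()  # dedicated lock used only for events
        self.events = []
        self.objects = {}
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def raise_event(self, event):
        # Takes only event_lock, so a long query holding user_lock cannot block it.
        with self.event_lock:
            self.events.append(event)

    def submit_query(self, on_result, scan_delay=0.0):
        # Called on a connection worker thread: enqueue and return immediately.
        self._pending.put((on_result, scan_delay))

    def _run(self):
        # Dedicated thread that performs the potentially slow queries.
        while True:
            on_result, scan_delay = self._pending.get()
            with self.user_lock:
                time.sleep(scan_delay)  # stand-in for walking ~12,000 queues
                snapshot = dict(self.objects)
            on_result(snapshot)
            self._pending.task_done()

    def wait_idle(self):
        # Block until every queued query has delivered its result.
        self._pending.join()
```

In this model the thread that calls `submit_query` is free to keep servicing heartbeats while the query runs, and `raise_event` never waits on the lock the query holds.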
                
> QMF queries for HA replication take too long to process
> -------------------------------------------------------
>
>                 Key: QPID-4286
>                 URL: https://issues.apache.org/jira/browse/QPID-4286
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker
>    Affects Versions: 0.18
>            Reporter: Jason Dillaman
>         Attachments: qpid-4286.patch
>
>
> In an HA broker with approximately 12,000 queues, it takes roughly 10-14 seconds for the
> first QMF response fragment to arrive. While the QMF management agent is collecting the
> response, all other QMF-related functionality is blocked, which in turn blocks any thread
> that raises a QMF event.
>
> Not only will this result in clients getting disconnected from the broker due to worker
> threads being blocked by QMF (either due to missed heartbeats in an extreme case or from
> the 2-second handshake timeout), it also results in the HA backup's federated link getting
> disconnected due to missed heartbeats when the link heartbeat interval is set to a low value.
>
> If the HA backup loses its connection, it only exacerbates the issue, since it will reconnect
> and re-query the QMF data that made it lose its connection in the first place.
>
> Recommend that QMF events not be blocked by a global management agent lock, and also
> recommend that potentially long-running QMF queries be separated from the worker thread
> that initiated them to prevent a heartbeat timeout.
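
To make the timing in the description concrete, here is a small back-of-the-envelope check of which timeouts a stalled worker thread would exceed. The 2-second handshake timeout and the 10-14 second stall come from the description; the rule that a peer gives up after two missed heartbeat intervals is an assumption made for this sketch, and the function name and the heartbeat values in the usage note are hypothetical.

```python
def tripped_timeouts(stall_s, heartbeat_s, handshake_timeout_s=2.0, missed_intervals=2):
    """Return the timeouts that a stall of stall_s seconds would exceed.

    missed_intervals is an assumed tolerance (peer gives up after this many
    heartbeat intervals without traffic); it is not taken from the issue.
    """
    tripped = []
    if stall_s > handshake_timeout_s:
        tripped.append("handshake")
    if stall_s > missed_intervals * heartbeat_s:
        tripped.append("heartbeat")
    return tripped
```

Under these assumptions, `tripped_timeouts(10, 2)` reports both timeouts, while `tripped_timeouts(10, 60)` reports only the handshake timeout, which matches the description's point that the heartbeat failure shows up when the interval is set to a low value.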


