qpid-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (QPID-5974) HA qpid-txtest2 can bring down a cluster (JERR_MAP_LOCKED)
Date Fri, 08 Aug 2014 09:27:12 GMT

    [ https://issues.apache.org/jira/browse/QPID-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090521#comment-14090521 ]

ASF subversion and git services commented on QPID-5974:
-------------------------------------------------------

Commit 1616703 from [~aconway] in branch 'qpid/trunk'
[ https://svn.apache.org/r1616703 ]

QPID-5974: HA qpid-txtest2 can bring down a cluster (JERR_MAP_LOCKED)

Problem: transactional dequeues can be sent via two paths: as part of the transaction and
via the normal queue replication. If a journal is involved, this can result in store errors
if the normal replication path attempts to dequeue before the transaction.

Solution: the same applies to enqueues, and code is already in place to skip replication
of tx enqueues via the normal route. The same logic is now applied to dequeues.
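
The idea can be illustrated with a small, self-contained C++ sketch. This is not the
committed code: ToyStore, DequeueEvent, its inTx flag, onDequeueEvent and replicate are all
hypothetical names, and the locking rule is only loosely modelled on the "Record ID locked
by a pending transaction" message in the logs. It shows a plain dequeue failing while a
transaction holds the record, and the failure going away once the normal path skips
transactional dequeues.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <stdexcept>

// Toy stand-in for a journal-backed message store (hypothetical, not linearstore).
class ToyStore {
  public:
    void enqueue(std::uint64_t id) { records_.insert(id); }

    // A transactional dequeue locks the record until the transaction commits.
    void txDequeue(std::uint64_t id) { locked_.insert(id); }

    // Committing removes every record the transaction dequeued.
    void txCommit() {
        for (std::uint64_t id : locked_) records_.erase(id);
        locked_.clear();
    }

    // A plain dequeue fails if a pending transaction holds the record.
    void dequeue(std::uint64_t id) {
        if (locked_.count(id))
            throw std::runtime_error("record locked by a pending transaction");
        records_.erase(id);
    }

    std::size_t size() const { return records_.size(); }

  private:
    std::set<std::uint64_t> records_;
    std::set<std::uint64_t> locked_;
};

// Dequeue event as seen on the normal replication path; inTx is an assumed flag.
struct DequeueEvent {
    std::uint64_t id;
    bool inTx;
};

// Normal-path handler: with skipTxDequeues set, transactional dequeues are
// left to the transaction path instead of being applied a second time.
void onDequeueEvent(ToyStore& store, const DequeueEvent& e, bool skipTxDequeues) {
    if (skipTxDequeues && e.inTx) return;
    store.dequeue(e.id);
}

// Runs one transactional dequeue that is also announced on the normal path.
// Returns true if the store ends up empty without a store error.
bool replicate(bool skipTxDequeues) {
    ToyStore store;
    store.enqueue(1);
    assert(store.size() == 1);  // sanity check: the message is stored
    store.txDequeue(1);         // transaction path, still pending
    try {
        onDequeueEvent(store, {1, true}, skipTxDequeues);  // normal path
    } catch (const std::runtime_error&) {
        return false;
    }
    store.txCommit();
    return store.size() == 0;
}
```

In this toy model, replicate(false) hits the locked record and reports failure, while
replicate(true) leaves the dequeue to the transaction and completes. Where the real check
lives in the broker is not stated in this message.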

> HA qpid-txtest2 can bring down a cluster (JERR_MAP_LOCKED)
> ----------------------------------------------------------
>
>                 Key: QPID-5974
>                 URL: https://issues.apache.org/jira/browse/QPID-5974
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Clustering
>    Affects Versions: 0.28
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>
> Description of problem:
> qpid-txtest2 AMQP0-10 transactional & durable transfer operation can bring down the whole qpid HA cluster. Note that no brokers were killed; only the txtest was run.
> To reproduce:
> 3 node cluster 
> while qpid-txtest2 -b 20.0.20.200 --tx-count 500 --queues 10 --messages-per-tx 10 --total-messages 1000 --durable 1
> Result: 
> Test fails. Broker logs show critical and error messages like this:
> {noformat}
> [root@dhcp-lab-A ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:33 [Protocol] error Connection qpid.192.168.6.246:5672-192.168.6.247:34210
timed out: closing
> [root@dhcp-lab-B ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1: Replication failed:
Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED:
Record ID locked by a pending transaction. (drid=0x6da3) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:23 [Protocol] error Connection qpid.ha.link.09e80392-0c79-4239-a1d0-ea5b53c71bd9
closed by error: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue()
failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by
a pending transaction. (drid=0x6da3) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed prematurely,
rollback
> 2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-1: Replication failed:
Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED:
Record ID locked by a pending transaction. (drid=0x7a42) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:38 [Protocol] error Connection qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835
closed by error: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue()
failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by
a pending transaction. (drid=0x7a42) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:38 [HA] critical Shutting down: Backup of tx-test2-10: Replication failed:
Queue tx-test2-10: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED:
Record ID locked by a pending transaction. (drid=0x7a43) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:38 [Protocol] error Connection qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835
closed by error: Backup of tx-test2-10: Replication failed: Queue tx-test2-10: async_dequeue()
failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by
a pending transaction. (drid=0x7a43) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:38 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:40 [HA] critical Shutting down: Backup of tx-test2-7: Replication failed:
Queue tx-test2-7: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED:
Record ID locked by a pending transaction. (drid=0x7a49) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:40 [Protocol] error Connection qpid.ha.link.0fc6bd3c-48c2-4b27-9db3-2742b3ddc835
closed by error: Backup of tx-test2-7: Replication failed: Queue tx-test2-7: async_dequeue()
failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by
a pending transaction. (drid=0x7a49) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:40 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:11:10 [HA] error Backup: Joining active cluster, cannot be promoted.
> [root@dhcp-lab-C ~]# grep -E 'error|critical' ~qpidd/qpidd.log
> 2014-07-24 14:10:23 [HA] critical Shutting down: Backup of tx-test2-1: Replication failed:
Queue tx-test2-1: async_dequeue() failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED:
Record ID locked by a pending transaction. (drid=0x53a3) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)
> 2014-07-24 14:10:23 [Protocol] error Connection qpid.ha.link.1bb57f0a-48db-460c-9260-0f5b353e4bd1
closed by error: Backup of tx-test2-1: Replication failed: Queue tx-test2-1: async_dequeue()
failed: jexception 0x0b02 wmgr::dequeue_check() threw JERR_MAP_LOCKED: Record ID locked by
a pending transaction. (drid=0x53a3) (/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/linearstore/MessageStoreImpl.cpp:1268)
(/builddir/build/BUILD/qpid-0.22/cpp/src/qpid/ha/QueueReplicator.cpp:315)(501)
> 2014-07-24 14:10:24 [Broker] error Could not find dequeued message on commit
> 2014-07-24 14:10:24 [HA] error Backup of transaction 00648954: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 2f556197: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5bd58ffe: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 5d34703c: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction 7e93a7ea: Destroyed prematurely,
rollback
> 2014-07-24 14:10:24 [HA] error Backup of transaction e8856f6f: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 243b4279: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 4f4a25df: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction 80cbe9af: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction a3ed917a: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction b7a4b9a0: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction b9ba9995: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction cbd0d6bf: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction e127288a: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction eb43e683: Destroyed prematurely,
rollback
> 2014-07-24 14:10:35 [HA] error Backup of transaction f29196c1: Destroyed prematurely,
rollback
> 2014-07-24 14:10:53 [HA] error Backup: Still catching up, cannot be promoted.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org
