qpid-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alan Conway" <acon...@redhat.com>
Subject Re: Review Request 17563: QPID-5528: HA Clean up error messages around rolled-back transactions.
Date Fri, 31 Jan 2014 22:23:14 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated Jan. 31, 2014, 10:23 p.m.)

Review request for qpid and Gordon Sim.


This is MUCH better. It cleans up (almost) all the bogus errors on HA backup sessions without
any effects on non-HA sessions. It does not introduce any new error messages, although it
does now have the broker _debug_ logging incoming execution exceptions in case this is ever
useful. There is one exception still showing up on the backup that I don't understand - I
don't remember seeing it before so I may have introduced it somehow. I'll track it down &
sort that out as well.

Bugs: QPID-5528

Repository: qpid

Description (updated)

QPID-5528: HA Clean up error messages around rolled-back transactions.

A simple transaction test on a 3 node cluster generates a lot of errors and
rollback notices in the broker logs even though the test code never rolls back a
transaction. E.g.

  qpid-cluster-benchmark -b -n1 -m 1000 -q3 -s2 -r2 --send-arg=--tx --send-arg=10
--receive-arg=--tx --receive-arg=10

The errors are caused by queues being deleted while backup brokers are using
them. This happens a lot in the transaction test because a transactional session
must create a new transaction when the previous one closes. When the session
closes the open transaction is rolled back automatically. Thus there is almost
always an empty transaction that is created then immediately rolled back at the
end of the session. Backup brokers may still be in the process of subscribing to
the transaction's replication queue at this point, causing (harmlesss) errors.

This commit takes the following steps to clean up the unwanted error and rollback messages:

HA TX messages cleaned up:
- Remove misleading "backup disconnected" message for cancelled transactions.
- Include TxReplicator destroy in QueueReplicator mutex. Idempotence check before any destroy.
- Remove spurious warning about ignored unreplicated dequeues.
- Remove log messages about rolling back/destroying empty transactions.

Allow HA to suppress/modify broker exception logging:
- Move broker exception logging into ErrorListener
  - Every SessionHandler has DefaultErrorListener that does the same logging as before.
  - Added SessionHandlerObserver to allow plugins to change the error listener.
- HA backup sets ErrorListeners on replication session bridges
  - Log exceptions that are not errors as HA debug messages, not broker errors.
- HA primary sets error listener on backup sessions via SessionHandlerObserver
  - Log exceptions as HA debug messages, not broker errors.

Unrelated cleanup:
- Remove unused DetachedCallback in Link.cpp
- Broker now logs "incoming execution exceptions" as debug messages rather than ignoring.

Backps are still logging this error message, not sure why:
  [Broker] error Channel exception: not-attached: session.detached from peer.
Will continue to investigate this, it looks like this should be suppressed by backup error

Diffs (updated)

  /trunk/qpid/cpp/src/qpid/Exception.cpp 1562539 
  /trunk/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp 1562539 
  /trunk/qpid/cpp/src/qpid/broker/Broker.h 1562539 
  /trunk/qpid/cpp/src/qpid/broker/Link.cpp 1562539 
  /trunk/qpid/cpp/src/qpid/broker/SessionHandler.cpp 1562539 
  /trunk/qpid/cpp/src/qpid/ha/BrokerReplicator.cpp 1562539 
  /trunk/qpid/cpp/src/qpid/ha/Primary.h 1562539 
  /trunk/qpid/cpp/src/qpid/ha/Primary.cpp 1562539 
  /trunk/qpid/cpp/src/qpid/ha/PrimaryTxObserver.h 1562539 
  /trunk/qpid/cpp/src/qpid/ha/PrimaryTxObserver.cpp 1562539 
  /trunk/qpid/cpp/src/qpid/ha/QueueReplicator.h 1562539 
  /trunk/qpid/cpp/src/qpid/ha/QueueReplicator.cpp 1562539 
  /trunk/qpid/cpp/src/qpid/ha/TxReplicator.h 1562539 
  /trunk/qpid/cpp/src/qpid/ha/TxReplicator.cpp 1562539 

Diff: https://reviews.apache.org/r/17563/diff/



Alan Conway

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message