qpid-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (QPIDJMS-365) failover.reconnectDelay not applied between attempts if peer remote-closes during connection
Date Mon, 02 Apr 2018 15:52:00 GMT

    [ https://issues.apache.org/jira/browse/QPIDJMS-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422650#comment-16422650 ]

ASF subversion and git services commented on QPIDJMS-365:
---------------------------------------------------------

Commit b22a1e7ad898992a60ffb8ef2c66f58735d8ba1e in qpid-broker-j's branch refs/heads/master
from [~alex.rufous]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-broker-j.git;h=b22a1e7 ]

NO-JIRA: [Broker-J] [System Tests] Remove workaround for qpid jms client not respecting reconnectDelay
(QPIDJMS-365)


> failover.reconnectDelay not applied between attempts if peer remote-closes during connection
> --------------------------------------------------------------------------------------------
>
>                 Key: QPIDJMS-365
>                 URL: https://issues.apache.org/jira/browse/QPIDJMS-365
>             Project: Qpid JMS
>          Issue Type: Bug
>          Components: qpid-jms-client
>            Reporter: Keith Wall
>            Assignee: Timothy Bish
>            Priority: Major
>             Fix For: 0.31.0
>
>
> When using Broker-J's High Availability feature, the client's failover abilities are
> used to allow the client to discover which Broker in the HA group has the master role.  The
> client tries each Broker on the failover list until one successfully responds to the
> {{Open}}, indicating that it is the current master.
>  
> When a node is not in the master role, it gracefully closes the AMQP connection by sending
> a {{Close}} performative.  During election periods, it is normal for all nodes in the HA group
> to respond with the {{Close}} until the election concludes.
> {noformat}
> Close{error=Error{condition=amqp:not-found, description='Virtual host 'localhost' is not active', info=null}}
> {noformat} 
> The Qpid JMS client's failover options include {{failover.reconnectDelay}}, which "Controls
> the delay between successive reconnection attempts".  However, it appears that the reconnection
> delay is only applied between attempts when a connection fails owing to a 'transport' level
> failure (connection refused etc).  If the connection fails at the AMQP level, the delay is
> not applied.
> This affects the HA use-case for Broker-J.  During periods of re-election, the client
> spins tightly in the reconnection loop, consuming excessive system resources.  It is also
> necessary to set {{failover.maxReconnectAttempts}} large enough to allow an election period
> to conclude successfully.  Whilst the user could use unlimited reconnection attempts, this is
> undesirable, as it means the system won't fail in the case where the election does not
> conclude within a reasonable time period.
> Extract of TRACE level logging from {{org.apache.qpid.jms.provider.failover.FailoverProvider}}
> for the case when an AMQP connection is closed gracefully ({{Close}} performative):
> {noformat}
> 2018-03-13 11:04:54,951 [lization thread] - DEBUG FailoverProvider               - Failover: the provider reports failure: Connection closed by external action [condition = amqp:connection:forced]
> 2018-03-13 11:04:54,951 [lization thread] - DEBUG FailoverProvider               - handling Provider failure: Connection closed by external action [condition = amqp:connection:forced]
> 2018-03-13 11:04:54,951 [lization thread] - TRACE FailoverProvider               - stack
> java.io.IOException: Connection closed by external action [condition = amqp:connection:forced]
> 	at org.apache.qpid.jms.util.IOExceptionSupport.create(IOExceptionSupport.java:45)
> 	at org.apache.qpid.jms.provider.amqp.AmqpProvider.fireProviderException(AmqpProvider.java:1086)
> 	at org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.closeResource(AmqpAbstractResource.java:182)
> 	at org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.processRemoteClose(AmqpAbstractResource.java:262)
> 	at org.apache.qpid.jms.provider.amqp.AmqpProvider.processUpdates(AmqpProvider.java:949)
> 	at org.apache.qpid.jms.provider.amqp.AmqpProvider.access$1900(AmqpProvider.java:104)
> 	at org.apache.qpid.jms.provider.amqp.AmqpProvider$17.run(AmqpProvider.java:831)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: javax.jms.JMSException: Connection closed by external action [condition = amqp:connection:forced]
> 	at org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToException(AmqpSupport.java:164)
> 	at org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToException(AmqpSupport.java:117)
> 	... 11 more
> 2018-03-13 11:04:54,954 [ connect thread] - DEBUG FailoverProvider               - Connection attempt:[1] to: amqp://localhost:5672 in-progress
> 2018-03-13 11:04:55,003 [lization thread] - DEBUG FailoverProvider               - Signalling connection recovery: AmqpProvider: localhost:5672
> 2018-03-13 11:04:55,007 [lization thread] - DEBUG FailoverProvider               - handling Provider failure: Virtual host 'localhost' is not active [condition = amqp:not-found]
> {noformat}
> Extract of TRACE level logging from {{org.apache.qpid.jms.provider.failover.FailoverProvider}}
> for the case when an AMQP connection is closed at the transport level (Connection Refused):
> {noformat}
> 2018-03-13 11:03:47,069 [lization thread] - TRACE FailoverProvider               - stack
> java.io.IOException: Transport connection remotely closed.
> 	at org.apache.qpid.jms.provider.amqp.AmqpProvider$20.run(AmqpProvider.java:901)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> 2018-03-13 11:03:47,089 [ connect thread] - DEBUG FailoverProvider               - Connection attempt:[1] to: amqp://localhost:5672 in-progress
> 2018-03-13 11:03:47,095 [ connect thread] - INFO  FailoverProvider               - Connection attempt:[1] to: amqp://localhost:5672 failed
> 2018-03-13 11:03:47,095 [ connect thread] - TRACE FailoverProvider               - Next reconnect attempt will be in 10000 milliseconds
> {noformat}
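
For readers following the issue description above, here is a minimal sketch of how the two options it names, failover.reconnectDelay and failover.maxReconnectAttempts, might be combined on a failover URI, together with a rough estimate of how long a client would keep trying if the delay were applied between every attempt. The class name, method names, host names and the figure of 30 attempts are hypothetical and used only for illustration; the 10000 ms delay matches the value seen in the second log extract. The estimate deliberately ignores any back-off or initial-delay options and the time spent on each connection attempt, so treat it as an approximation rather than the client's actual behaviour.

```java
import java.util.List;
import java.util.StringJoiner;

public class FailoverUriSketch {

    // Builds a failover URI of the form
    // failover:(uri1,uri2,...)?failover.reconnectDelay=...&failover.maxReconnectAttempts=...
    static String build(List<String> brokers, long reconnectDelayMs, int maxReconnectAttempts) {
        StringJoiner peers = new StringJoiner(",", "failover:(", ")");
        for (String broker : brokers) {
            peers.add(broker);
        }
        return peers
                + "?failover.reconnectDelay=" + reconnectDelayMs
                + "&failover.maxReconnectAttempts=" + maxReconnectAttempts;
    }

    // Rough upper bound on the time spent waiting between attempts, assuming a
    // fixed delay is applied after every failed attempt (an assumption of this
    // sketch; back-off and per-attempt connection time are not modelled).
    static long worstCaseWaitMs(long reconnectDelayMs, int maxReconnectAttempts) {
        if (reconnectDelayMs < 0 || maxReconnectAttempts < 0) {
            throw new IllegalArgumentException("delay and attempts must be non-negative");
        }
        return Math.multiplyExact(reconnectDelayMs, (long) maxReconnectAttempts);
    }

    public static void main(String[] args) {
        // Hypothetical HA group members.
        List<String> brokers = List.of("amqp://node1.example:5672", "amqp://node2.example:5672");
        System.out.println(build(brokers, 10000, 30));
        System.out.println("approximate wait budget (ms): " + worstCaseWaitMs(10000, 30));
    }
}
```

With these example values the budget comes to roughly five minutes. If the delay is skipped for AMQP-level closes, as the issue reports, the same 30 attempts could be used up almost immediately, which is why the attempt limit alone cannot be relied on to cover an election period.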




