qpid-dev mailing list archives

From "Keith Wall (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (QPID-7185) ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved fails sporadically on Apache CI
Date Tue, 05 Apr 2016 11:00:29 GMT

     [ https://issues.apache.org/jira/browse/QPID-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Wall updated QPID-7185:
-----------------------------
    Description: 
The test {{testReplicationGroupListenerHearsNodeRemoved}} failed in the following way on the
Apache CI host:

{noformat}
org.apache.qpid.server.store.StoreException: Exception on node removal from group
	at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426)
	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getException(ReplicationGroupAdmin.java:504)
	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:474)
	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.removeMember(ReplicationGroupAdmin.java:245)
	at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.removeNodeFromGroup(ReplicatedEnvironmentFacade.java:1284)
	at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved(ReplicatedEnvironmentFacadeTest.java:377)
{noformat}

The underlying exception was as follows:

{noformat}
2016-04-03 23:19:00,667 ERROR [main] o.a.q.s.u.ServerScopedRuntimeException Exception on node removal from group
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.104) (JE 5.0.104) Transaction -20 cannot execute write operations because this node is no longer a master UNEXPECTED_STATE: Unexpected internal state, may have side effects.
	at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426) ~[je-5.0.104.jar:5.0.104]
	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getException(ReplicationGroupAdmin.java:504) ~[je-5.0.104.jar:5.0.104]
	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:474) ~[je-5.0.104.jar:5.0.104]
	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.removeMember(ReplicationGroupAdmin.java:245) ~[je-5.0.104.jar:5.0.104]
	at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.removeNodeFromGroup(ReplicatedEnvironmentFacade.java:1284) ~[classes/:na]
	at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved(ReplicatedEnvironmentFacadeTest.java:377) [test-classes/:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_80]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_80]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_80]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_80]
	at junit.framework.TestCase.runTest(TestCase.java:176) [junit-4.11.jar:na]
	at org.apache.qpid.test.utils.QpidTestCase.runTest(QpidTestCase.java:171) [qpid-test-utils-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
	at junit.framework.TestCase.runBare(TestCase.java:141) [junit-4.11.jar:na]
	at junit.framework.TestResult$1.protect(TestResult.java:122) [junit-4.11.jar:na]
	at junit.framework.TestResult.runProtected(TestResult.java:142) [junit-4.11.jar:na]
	at junit.framework.TestResult.run(TestResult.java:125) [junit-4.11.jar:na]
	at junit.framework.TestCase.run(TestCase.java:129) [junit-4.11.jar:na]
	at org.apache.qpid.test.utils.QpidTestCase.run(QpidTestCase.java:156) [qpid-test-utils-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
	at junit.framework.TestSuite.runTest(TestSuite.java:255) [junit-4.11.jar:na]
	at junit.framework.TestSuite.run(TestSuite.java:250) [junit-4.11.jar:na]
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) [junit-4.11.jar:na]
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) [surefire-junit4-2.17.jar:2.17]
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) [surefire-junit4-2.17.jar:2.17]
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) [surefire-junit4-2.17.jar:2.17]
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) [surefire-booter-2.17.jar:2.17]
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) [surefire-booter-2.17.jar:2.17]
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) [surefire-booter-2.17.jar:2.17]
{noformat}

The node that was the target of the {{ReplicationGroupAdmin.removeMember}} call was, at that
moment, being restarted because majority had been lost.  This seems to have provoked an
unexpected exception from within JE.

The test is concerned with ensuring the listener fires correctly in response to changes in
group membership.  This test can avoid the possibility of a mastership loss simply by setting
designated primary to true.
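
A minimal sketch of why designated primary sidesteps the mastership loss, assuming a two-node
group (the issue does not state the group size).  This is a simplified model of the majority
rule, not JE's implementation; the class and method names are hypothetical.  In JE itself the
setting is, to my knowledge, exposed as {{ReplicationMutableConfig.setDesignatedPrimary(true)}}.

```java
// Simplified model of the majority rule; NOT JE's actual implementation.
// QuorumSketch and retainsMastership are hypothetical names used for illustration.
final class QuorumSketch {

    /**
     * Returns true if a master that can reach reachableNodes electable nodes
     * (itself included) out of groupSize would keep mastership in this model.
     */
    static boolean retainsMastership(int groupSize, int reachableNodes, boolean designatedPrimary) {
        if (groupSize < 1 || reachableNodes < 1 || reachableNodes > groupSize) {
            throw new IllegalArgumentException("need 1 <= reachableNodes <= groupSize");
        }
        // Assumption: designated primary only has an effect in a two-node group,
        // where it lets the master carry on without the other node.
        if (designatedPrimary && groupSize == 2) {
            return true;
        }
        // Otherwise a simple majority of the electable nodes is required.
        int majority = groupSize / 2 + 1;
        return reachableNodes >= majority;
    }

    public static void main(String[] args) {
        // Two-node group with the replica gone: majority is lost.
        System.out.println(retainsMastership(2, 1, false)); // false
        // Same situation with designated primary set: mastership is kept.
        System.out.println(retainsMastership(2, 1, true));  // true
    }
}
```

In this model the removal call would never race against a restart triggered by majority loss,
which is all the test needs, since it is only checking that the listener fires.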

As changing the consistency of a group whilst a production system is live would be an unusual
thing to do, the chances of this manifesting in production are small.  If it were to happen,
a node restart would be required to restore service.



> ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved fails sporadically on Apache CI
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: QPID-7185
>                 URL: https://issues.apache.org/jira/browse/QPID-7185
>             Project: Qpid
>          Issue Type: Bug
>          Components: Java Tests
>            Reporter: Keith Wall
>            Assignee: Keith Wall
>            Priority: Minor
>             Fix For: qpid-java-6.1
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org

