cassandra-commits mailing list archives

From "Joel Knighton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10089) NullPointerException in Gossip handleStateNormal
Date Thu, 08 Oct 2015 18:31:26 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949105#comment-14949105 ]

Joel Knighton commented on CASSANDRA-10089:
-------------------------------------------

I'm +1 on patch code quality and fixing the race condition.

I'd like to track down the cause of the missing {{handleStateNormal}}, but, like you, I spent
quite a while trying to reproduce it and could not.

At a minimum, I'm in favor of logging at ERROR level if {{handleStateNormal}} is called with
no tokens; while I also can't find any consequences of a node with no tokens in NORMAL state,
I'm not certain there aren't any.
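
A minimal sketch of the ERROR-level logging suggested above, not the actual patch. The class {{NormalStateGuard}}, the token map, and {{publishTokens}} are hypothetical stand-ins for gossip endpoint state, and {{java.util.logging}} (whose SEVERE level plays the role of ERROR) is used only to keep the example self-contained.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

public class NormalStateGuard {
    private static final Logger logger = Logger.getLogger(NormalStateGuard.class.getName());

    // Hypothetical stand-in for gossip endpoint state: endpoint -> tokens.
    private final Map<String, Collection<String>> tokensByEndpoint = new ConcurrentHashMap<>();

    // Records the tokens a (hypothetical) endpoint has gossiped.
    public void publishTokens(String endpoint, Collection<String> tokens) {
        tokensByEndpoint.put(endpoint, tokens);
    }

    // Null-safe lookup: a missing entry becomes an empty collection
    // instead of surfacing later as a NullPointerException.
    Collection<String> getTokensFor(String endpoint) {
        Collection<String> tokens = tokensByEndpoint.get(endpoint);
        return tokens == null ? Collections.<String>emptyList() : tokens;
    }

    // Returns true if the endpoint has tokens; otherwise logs loudly and returns false.
    public boolean handleStateNormal(String endpoint) {
        Collection<String> tokens = getTokensFor(endpoint);
        if (tokens.isEmpty()) {
            logger.log(Level.SEVERE, "Node {0} moved to NORMAL state with no tokens", endpoint);
            return false;
        }
        return true;
    }
}
```

Whether the caller should continue or bail out when this returns false is deliberately left open here, since the consequences of a token-less node in NORMAL state are unknown.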

This seems related to a set of issues around token metadata flushing and node startup (like
[10293|https://issues.apache.org/jira/browse/CASSANDRA-10293]). As part of that, I wonder if
it would make sense to force a blocking flush of the system keyspace in [updateTokens|https://github.com/apache/cassandra/blob/9515fca7692ed09aa0b3c6c12f038d6a459d87de/src/java/org/apache/cassandra/db/SystemKeyspace.java#L690].
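
To illustrate what "blocking flush" means here, a generic sketch follows. It does not use Cassandra's API: {{BlockingTokenUpdate}}, the in-memory and "durable" lists, and {{flushAsync}} are all hypothetical. The only point shown is that the update does not return until the flush it triggered has completed.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingTokenUpdate implements AutoCloseable {
    // Hypothetical single flush thread, standing in for a background flush writer.
    private final ExecutorService flushExecutor = Executors.newSingleThreadExecutor();

    // Hypothetical in-memory view and "on disk" view of the saved tokens.
    private volatile List<String> inMemoryTokens = Collections.emptyList();
    private volatile List<String> durableTokens = Collections.emptyList();

    // Schedules a flush that copies the in-memory state to the durable state.
    private Future<?> flushAsync() {
        return flushExecutor.submit(new Runnable() {
            @Override
            public void run() {
                durableTokens = inMemoryTokens;
            }
        });
    }

    // Updates the tokens and blocks until the flush has finished,
    // so a restart right after this call cannot observe stale tokens.
    public void updateTokens(Collection<String> tokens) {
        inMemoryTokens = Collections.unmodifiableList(new ArrayList<String>(tokens));
        try {
            flushAsync().get(); // block the caller until the flush completes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted while flushing tokens", e);
        } catch (ExecutionException e) {
            throw new RuntimeException("Token flush failed", e.getCause());
        }
    }

    public List<String> durableTokens() {
        return durableTokens;
    }

    @Override
    public void close() {
        flushExecutor.shutdown();
    }
}
```

The trade-off is latency: the caller pays for the flush on every token update, which is presumably acceptable only because token updates are rare.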

Anyway, I'm +1 on this patch fixing the NPE but do wonder if we have a larger issue to solve
here.

> NullPointerException in Gossip handleStateNormal
> ------------------------------------------------
>
>                 Key: CASSANDRA-10089
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10089
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 2.1.x, 2.2.x, 3.0.x
>
>
> Whilst comparing dtests for CASSANDRA-9970 I found [this failing dtest|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-9970-dtest/lastCompletedBuild/testReport/consistency_test/TestConsistency/short_read_test/] in 2.2:
> {code}
> Unexpected error in node1 node log: ['ERROR [GossipStage:1] 2015-08-14 15:39:57,873 CassandraDaemon.java:183 - Exception in thread Thread[GossipStage:1,5,main] java.lang.NullPointerException: null
> \tat org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1731) ~[main/:na]
> \tat org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1804) ~[main/:na]
> \tat org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1857) ~[main/:na]
> \tat org.apache.cassandra.service.StorageService.onChange(StorageService.java:1629) ~[main/:na]
> \tat org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2312) ~[main/:na]
> \tat org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1025) ~[main/:na]
> \tat org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1106) ~[main/:na]
> \tat org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na]
> \tat org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[main/:na]
> \tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_80]
> \tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_80]
> \tat java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_80]']
> {code}
> I wasn't able to find it on unpatched branches, but it is clearly not related to CASSANDRA-9970;
> if anything, it could have been a side effect of CASSANDRA-9871.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
