cassandra-commits mailing list archives

From "Brandon Williams (Jira)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-16364) Joining nodes simultaneously with auto_bootstrap:false can cause token collision
Date Wed, 18 Aug 2021 16:07:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-16364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401171#comment-17401171 ]

Brandon Williams commented on CASSANDRA-16364:
----------------------------------------------

bq. I have double checked, nuked the cluster; removed `auto_bootstrap:true` from cassandra.yaml
(so that a default is used); redeployed - and I'm seeing the same issue; i.e. I can reproduce
it every time

auto_bootstrap is not in the yaml by default; when it is absent, the default value of true
applies.  In any case, I don't think it's a surprise that removing it had no effect and that
you can still reproduce the issue.
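
To illustrate why removing the line changes nothing, here is a minimal Python sketch of a
"missing key falls back to the default" lookup. It is not Cassandra's config loader;
`DEFAULTS` and `effective_setting` are hypothetical names, and the only fact taken from
the comment above is that auto_bootstrap defaults to true.

```python
# Hypothetical defaults table; only auto_bootstrap=True comes from the comment above.
DEFAULTS = {"auto_bootstrap": True}


def effective_setting(yaml_settings, key):
    """Return the value from the parsed yaml, or the built-in default if the key is absent."""
    return yaml_settings.get(key, DEFAULTS[key])


# An explicit `auto_bootstrap: true` and an omitted key resolve to the same value.
explicit = effective_setting({"auto_bootstrap": True}, "auto_bootstrap")
omitted = effective_setting({}, "auto_bootstrap")
```

Under this assumption, `explicit` and `omitted` are both true, so deleting the line from
the yaml leaves the node's behavior unchanged.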

> Joining nodes simultaneously with auto_bootstrap:false can cause token collision
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16364
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16364
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Membership
>            Reporter: Paulo Motta
>            Priority: Normal
>             Fix For: 4.0.x
>
>
> While raising a 6-node ccm cluster to test 4.0-beta4, 2 nodes chose the same tokens
using the default {{allocate_tokens_for_local_rf}}. However, both bootstrapped successfully
with colliding tokens.
> We were familiar with this issue from CASSANDRA-13701 and CASSANDRA-16079, and the workaround
to fix this is to avoid parallel bootstrap when using {{allocate_tokens_for_local_rf}}.
> However, since this is the default behavior, we should try to detect and prevent this
situation when possible, since it can break users relying on parallel bootstrap behavior.
> I think we could prevent this as follows:
> 1. announce intent to bootstrap via gossip (i.e. add node on gossip without token information)
> 2. wait for gossip to settle for a longer period (i.e. ring delay)
> 3. allocate tokens (if multiple bootstrap attempts are detected, tie break via node-id)
> 4. broadcast tokens and move on with bootstrap
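
The four steps proposed above can be sketched as a toy simulation. This is an illustration
in Python under stated assumptions, not Cassandra code: `GossipState`, `allocate`,
`stale_view_choices`, and `bootstrap_all` are hypothetical names, the ring is reduced to
the integers 0..99, the allocator simply picks the lowest free tokens, and "waiting for
gossip to settle" is modelled by having every announcement visible before anyone allocates.

```python
from dataclasses import dataclass, field


@dataclass
class GossipState:
    """Toy stand-in for cluster-wide gossip state (hypothetical, not Cassandra's Gossiper)."""
    pending: set = field(default_factory=set)   # node ids that announced intent, no tokens yet
    tokens: dict = field(default_factory=dict)  # token -> owning node id

    def announce_intent(self, node_id):
        # Step 1: appear in gossip without token information.
        self.pending.add(node_id)

    def broadcast_tokens(self, node_id, chosen):
        # Step 4: publish tokens, refusing any token that is already owned.
        clash = [t for t in chosen if t in self.tokens]
        if clash:
            raise ValueError(f"token collision for {node_id}: {clash}")
        for t in chosen:
            self.tokens[t] = node_id
        self.pending.discard(node_id)


def allocate(taken, num_tokens, ring_size=100):
    """Hypothetical deterministic allocator: the lowest free tokens on a tiny ring."""
    free = [t for t in range(ring_size) if t not in taken]
    return free[:num_tokens]


def stale_view_choices(gossip, node_ids, num_tokens):
    """The failure mode: every joiner allocates from the same ring view."""
    view = set(gossip.tokens)
    return {n: allocate(view, num_tokens) for n in node_ids}


def bootstrap_all(gossip, node_ids, num_tokens):
    """Steps 1-4, with joiners serialized by node id as the tie break."""
    for n in node_ids:
        gossip.announce_intent(n)            # step 1
    # Step 2 (gossip settling) is assumed complete: all joiners are in gossip.pending.
    for n in sorted(gossip.pending):         # step 3: lowest node id allocates first
        chosen = allocate(set(gossip.tokens), num_tokens)
        gossip.broadcast_tokens(n, chosen)   # step 4
    return gossip.tokens


colliding = stale_view_choices(GossipState(), ["node-a", "node-b"], 2)
owners = bootstrap_all(GossipState(), ["node-b", "node-a"], 2)
```

With two joiners and two tokens each, `stale_view_choices` hands both nodes `[0, 1]`,
which is the collision described in this ticket, while `bootstrap_all` assigns `[0, 1]`
to `node-a` and `[2, 3]` to `node-b` because the second node allocates only after seeing
the first node's broadcast.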



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


