cassandra-commits mailing list archives

From "Benjamin Lerer (Jira)" <>
Subject [jira] [Updated] (CASSANDRA-13841) Allow specific sources during rebuild
Date Fri, 30 Jul 2021 13:14:00 GMT


Benjamin Lerer updated CASSANDRA-13841:
    Status: Patch Available  (was: Review In Progress)

> Allow specific sources during rebuild
> -------------------------------------
>                 Key: CASSANDRA-13841
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Streaming and Messaging
>            Reporter: Kurt Greaves
>            Assignee: Kurt Greaves
>            Priority: Low
>              Labels: 4.0-feature-freeze-review-requested
> CASSANDRA-10406 introduced the ability to rebuild specific ranges, and CASSANDRA-9875
extended that to allow specifying a set of hosts to stream from. It is not obvious
why you would want to stream only a subset of ranges, but one possible use case for this functionality
is rebuilding a node from targeted replicas.
> When doing a DC migration with racks == RF, you can ensure that you
rebuild from each copy of a replica in the source datacenter by specifying all the hosts
from a single rack as the sources for a single copy. Repeating this for each rack in the
new datacenter ensures that you have each copy of the replica from the source DC, and thus maintains
consistency through rebuilds.
> For example, with the following topology for DC A and B with an RF of A:3 and B:3
> ||A Node||A Rack||B Node||B Rack||
> |A1|rack1|B1|rack1|
> |A2|rack2|B2|rack2|
> |A3|rack3|B3|rack3|
> The following set of actions will result in B holding exactly one copy of every replica in
A, and B will be _at least_ as consistent as A.
> {code:java}
> Rebuild B1 from only A1
> Rebuild B2 from only A2
> Rebuild B3 from only A3
> {code}
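The rack-to-rack pairing above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything from Cassandra itself: the function name `plan_rack_rebuilds` and the plain dictionaries describing each datacenter are assumptions made for the example.

```python
def plan_rack_rebuilds(source_nodes, target_nodes):
    """Pair each target node with the source nodes that sit in the same-named rack.

    Both arguments map node name -> rack name (hypothetical input shape).
    """
    sources_by_rack = {}
    for node, rack in source_nodes.items():
        sources_by_rack.setdefault(rack, []).append(node)
    # A target node whose rack has no counterpart gets an empty source list.
    return {
        node: sorted(sources_by_rack.get(rack, []))
        for node, rack in target_nodes.items()
    }


# Topology from the example: three nodes and three racks per datacenter.
dc_a = {"A1": "rack1", "A2": "rack2", "A3": "rack3"}
dc_b = {"B1": "rack1", "B2": "rack2", "B3": "rack3"}

plan = plan_rack_rebuilds(dc_a, dc_b)
for target, sources in plan.items():
    print(f"Rebuild {target} from only {', '.join(sources)}")
```

With more than one node per rack, each target node would simply get every source node in the matching rack, which corresponds to "specifying all the hosts from a single rack".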
> Unfortunately, using this functionality is non-trivial at the moment, as you can only
specify sources together WITH the node's set of tokens to rebuild. To perform the above
with vnodes or a large cluster, you have to specify every token range in the -ts arg, which
quickly becomes unwieldy or impossible.
> A solution to this is to simply filter on sources first, before processing ranges.
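A minimal sketch of what "filter on sources first, before processing ranges" could look like, written in Python for brevity. It is not the actual patch or Cassandra's streaming code: the function name, the `replicas_by_range` mapping, and the tuple representation of token ranges are all assumptions made for illustration.

```python
def filter_sources_then_ranges(replicas_by_range, allowed_sources, wanted_ranges=None):
    """Return range -> candidate sources, filtering on sources before ranges.

    replicas_by_range: hypothetical mapping of (start, end) token range -> replica hosts.
    allowed_sources:   hosts the operator asked to stream from.
    wanted_ranges:     optional set of ranges; None means "all ranges".
    """
    allowed = set(allowed_sources)
    # Step 1: drop every replica that is not an allowed source.
    filtered = {
        rng: [host for host in hosts if host in allowed]
        for rng, hosts in replicas_by_range.items()
    }
    # Step 2: only then narrow down to explicit ranges, if any were given.
    if wanted_ranges is not None:
        filtered = {rng: hosts for rng, hosts in filtered.items() if rng in wanted_ranges}
    # Ranges left without any allowed source cannot be streamed at all.
    return {rng: hosts for rng, hosts in filtered.items() if hosts}


# With RF=3 and three nodes in DC A, every range is replicated on A1, A2 and A3.
replicas_by_range = {
    (0, 100): ["A1", "A2", "A3"],
    (100, 200): ["A1", "A2", "A3"],
    (200, 300): ["A1", "A2", "A3"],
}

# Rebuilding B1 from only A1 no longer requires listing any token ranges.
b1_sources = filter_sources_then_ranges(replicas_by_range, ["A1"])
```

Because the source filter runs first, the range list becomes optional: omitting it keeps every range that still has at least one allowed source.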

This message was sent by Atlassian Jira
