cassandra-commits mailing list archives

From "Jacek Lewandowski (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5972) Reduce the amount of data to be transferred during repair
Date Tue, 03 Sep 2013 10:39:51 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756497#comment-13756497
] 

Jacek Lewandowski commented on CASSANDRA-5972:
----------------------------------------------

If it worked as you described, node A wouldn't be repaired. However, I meant slightly
different behavior. In the first step, all the nodes share their Merkle trees.

0. A compares its Merkle tree with the tree of B
1. A --> R1 --> B
2. B recomputes the hash for R1, as it could have been updated, and shares it with the previous
node (A)
3. B compares its updated Merkle tree with the tree of C
4. B --> R2 --> C
5. C recomputes the hash for R2, as it could have been updated, and shares it with the previous
node (B)
6. C compares its updated Merkle tree with the tree of A
7. C --> R1, R2 --> A
8. A recomputes the hashes for R1 and R2, as they could have been updated (in the second pass, we
do not have to share the recomputed tree with the previous node)
9. A compares its updated Merkle tree with the updated tree of B and learns that it needs
to send R2 to it
10. A --> R2 --> B
11. The end
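
The steps above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not Cassandra code: each replica is modeled as a dict from a range name to a set of rows, a "Merkle tree" is reduced to one hash per range, and receiving a range means taking the union of the rows. The names `range_hash`, `tree` and `sequential_repair` are hypothetical.

```python
import hashlib
from typing import Dict, List, Set, Tuple

# Toy model (assumption): a replica maps a range name to the set of rows it holds.
Replica = Dict[str, Set[str]]


def range_hash(rows: Set[str]) -> str:
    """Deterministic hash of the rows in one range (stand-in for a Merkle leaf)."""
    return hashlib.sha256("\x00".join(sorted(rows)).encode("utf-8")).hexdigest()


def tree(replica: Replica) -> Dict[str, str]:
    """One hash per range; a real Merkle tree is hierarchical, this is flattened."""
    return {rng: range_hash(rows) for rng, rows in replica.items()}


def sequential_repair(replicas: List[Replica]) -> List[Tuple[int, int, List[str]]]:
    """Run 2*(n-1) one-directional transfers around the ring.

    Every replica is assumed to hold the same set of range names.
    Returns a log of (sender index, receiver index, ranges streamed).
    """
    n = len(replicas)
    log = []
    for step in range(2 * (n - 1)):
        src, dst = step % n, (step + 1) % n
        # Hashes are recomputed at every step, since earlier transfers
        # may have changed the receiver's data.
        src_tree, dst_tree = tree(replicas[src]), tree(replicas[dst])
        differing = sorted(r for r in src_tree if src_tree[r] != dst_tree[r])
        for r in differing:
            replicas[dst][r] |= replicas[src][r]  # receiver merges streamed rows
        log.append((src, dst, differing))
    return log


if __name__ == "__main__":
    # Made-up data chosen so that the transfers match the example: A, B, C.
    nodes = [
        {"R1": {"a"}, "R2": {"x"}},            # A
        {"R1": {"b"}, "R2": {"x"}},            # B
        {"R1": {"a", "b"}, "R2": {"x", "y"}},  # C
    ]
    for src, dst, ranges in sequential_repair(nodes):
        print("ABC"[src], "-->", ", ".join(ranges), "-->", "ABC"[dst])
```

With this made-up data the four transfers are A to B (R1), B to C (R2), C to A (R1, R2) and A to B (R2), after which all three replicas hold the same rows.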

                
> Reduce the amount of data to be transferred during repair
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-5972
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5972
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jacek Lewandowski
>            Priority: Minor
>
> Currently, when a validator finds a token range that differs in n replicas, data streams
> are initiated simultaneously between each possible pair of these n nodes, in both directions.
> This yields n*(n-1) data streams in total.
> It can be done in sequence instead: R(1) -> R(2), R(2) -> R(3), ... , R(n-1) -> R(n).
> After this process, the data in R(n) are up to date. Then, we continue: R(n) -> R(1), R(1)
> -> R(2), ... , R(n-2) -> R(n-1). The active repair is done after 2*(n-1) data transfers
> performed sequentially in 2*(n-1) steps.
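
To make the two counts in the description concrete, here is a minimal sketch that enumerates both transfer schedules for n replicas numbered 1..n. The function names `pairwise_streams` and `sequential_transfers` are hypothetical and exist only for this illustration.

```python
from itertools import permutations
from typing import List, Tuple


def pairwise_streams(n: int) -> List[Tuple[int, int]]:
    """Every ordered pair of distinct replicas: n*(n-1) streams."""
    return list(permutations(range(1, n + 1), 2))


def sequential_transfers(n: int) -> List[Tuple[int, int]]:
    """R(1)->R(2), ..., R(n-1)->R(n), then R(n)->R(1), R(1)->R(2), ..., R(n-2)->R(n-1)."""
    first_pass = [(i, i + 1) for i in range(1, n)]
    second_pass = [(n, 1)] + [(i, i + 1) for i in range(1, n - 1)]
    return first_pass + second_pass


if __name__ == "__main__":
    for n in (3, 5):
        print(n, len(pairwise_streams(n)), len(sequential_transfers(n)))
```

For n = 3 this gives 6 pairwise streams against 4 sequential transfers, and for n = 5 it gives 20 against 8.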

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
