cassandra-commits mailing list archives

From "Blake Eggleston (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13257) Add repair streaming preview
Date Thu, 30 Mar 2017 21:18:41 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949858#comment-15949858 ]

Blake Eggleston commented on CASSANDRA-13257:
---------------------------------------------

bq. I think even so streaming preview covers both full and incremental repair case, and other
streaming usage.

No, I’m afraid it doesn’t. Part of the confusion here is that my linked patch doesn’t
include the fix from CASSANDRA-13328, which corrects how sstables are selected for streaming
post CASSANDRA-9143. Sorry about that. The other part is that, post CASSANDRA-9143, incremental
repair does an anti-compaction before doing anything else, including validation or streaming.
Rewriting a bunch of sstables just so we can estimate the streaming that would happen if we
ran a repair for real is sort of a non-starter.

So, I still don’t see a way we can prevent StreamSession from having some notion of what
is being previewed. Previewing incremental repair streaming means that we need StreamSession
to know it should only include unrepaired sstables, instead of all sstables, as it would with
a full repair, since we won’t be including a pending repair id. After #13328, the isIncremental
flag in StreamSession is not doing anything, and I have a note to remove it before 4.0. We
could make the argument that we should leave it to support preview, but then why not just
have the preview enum, which has a much clearer purpose?
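
To make the enum idea concrete, here is a minimal sketch. It is not the code in the linked patch: PreviewSketch, PreviewKind, SSTable, and select are hypothetical names used only for illustration, and treating repairedAt == 0 as "unrepaired" is a simplifying assumption. It only shows how a preview kind could decide which sstables are considered when estimating streaming.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class PreviewSketch {
    /** Hypothetical stand-in for the repair metadata of an sstable. */
    public static final class SSTable {
        public final String name;
        public final long repairedAt; // assumption: 0 means the sstable is unrepaired

        public SSTable(String name, long repairedAt) {
            this.name = name;
            this.repairedAt = repairedAt;
        }
    }

    /** Hypothetical preview kinds; each one carries its own sstable filter. */
    public enum PreviewKind {
        ALL(s -> true),                        // preview of a full repair
        UNREPAIRED(s -> s.repairedAt == 0),    // preview of an incremental repair
        REPAIRED(s -> s.repairedAt > 0);       // check that repaired data is in sync

        private final Predicate<SSTable> filter;

        PreviewKind(Predicate<SSTable> filter) {
            this.filter = filter;
        }

        /** Returns the sstables that this kind of preview would consider. */
        public List<SSTable> select(List<SSTable> candidates) {
            return candidates.stream().filter(filter).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        List<SSTable> sstables = Arrays.asList(
                new SSTable("a", 0L),
                new SSTable("b", 0L),
                new SSTable("c", 1490908721000L));
        for (PreviewKind kind : PreviewKind.values()) {
            System.out.println(kind + " considers " + kind.select(sstables).size() + " sstable(s)");
        }
    }
}
```

The point of the sketch is that a single enum value states what is being previewed, whereas a boolean like isIncremental cannot distinguish the repaired-only case from the other two.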

Also, while knowing that there was a merkle tree mismatch is technically enough to validate
whether repaired data is in sync across nodes, having information about the related streaming
we expect does have value, which shouldn’t be dismissed just because it’s a bit abstract.
From the development side, it will provide clues about the cause of the mismatch (e.g., a
one-way transfer indicates that one node failed to promote an sstable). From the operational
side, knowing how much data needs to be streamed to fix the out-of-sync data is useful; it
also indicates the severity of the problem and the worst-case data loss risk in the case of
corruption. But we can't do this without StreamSession having some notion of what's being
previewed.

Rebased against trunk (and CASSANDRA-13325) here: https://github.com/bdeggleston/cassandra/tree/13257-squashed-trunk

> Add repair streaming preview
> ----------------------------
>
>                 Key: CASSANDRA-13257
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13257
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Streaming and Messaging
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>             Fix For: 4.0
>
>
> It would be useful to be able to estimate the amount of repair streaming that needs to
> be done, without actually doing any streaming. Our main motivation for having something
> like this is validating CASSANDRA-9143 in production, but I’d imagine it could also be a
> useful tool in troubleshooting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
