cassandra-commits mailing list archives

From "Joseph Lynch (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-14346) Scheduled Repair in Cassandra
Date Tue, 03 Apr 2018 15:29:00 GMT


Joseph Lynch commented on CASSANDRA-14346:

[~bdeggleston] I can certainly do the port with the sidecar in mind, although the only bit
of the interface that would be hard to re-implement is the per-table configuration (currently
in Priam we rely on a {{repair_config}} table to store what the design proposes should live
in table metadata). When the reliable IPC and the official sidecar happen I am happy to talk
about porting it over. I imagine that if we started a sidecar we'd start simple and then over
time move functionality from the main daemon into it, right?
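For concreteness, the kind of per-table configuration Priam keeps externally could be sketched as a CQL table along these lines. This is a hypothetical schema for illustration only, not Priam's actual {{repair_config}} definition, and the design document proposes storing these settings in table metadata instead:

```sql
-- Hypothetical sketch of a per-table repair configuration table.
-- Column names, types, and the keyspace are illustrative only; they are
-- not Priam's actual repair_config schema.
CREATE TABLE IF NOT EXISTS system_distributed.repair_config (
    keyspace_name  text,
    table_name     text,
    enabled        boolean,   -- opt this table in or out of scheduled repair
    interval_hours int,       -- target cadence between repair cycles
    parallelism    int,       -- how many token ranges to repair concurrently
    PRIMARY KEY ((keyspace_name, table_name))
);
```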

{quote}We've never coordinated system level maintenance tasks before, and just looking through
the state machine in your design document (thanks a bunch for taking the time to put that
together) makes me nervous about the amount of moving parts (basically what Blake Eggleston
pointed out above) that we'd be introducing.{quote}
Indeed, Cassandra has never been eventually consistent out of the box, and as a result DataStax
sold Repair Service to customers, The Last Pickle has worked on Reaper for use by customers
and the community, and Netflix (and others) have built a repair service internally because
the first two solutions wouldn't work (for us because of scale, complexity, and reliability).
Do you think it is good for the community that every user is re-inventing this (complex)
functionality again and again with different requirements on external tools? Furthermore, all
three approaches struggle to be robust because, as I said above, IPC is actually hard to do.
Our internal design has probably another 20 points in the resiliency section because the
repair is in a sidecar instead of in process.
{quote}I'm in the camp of relying on externalized coordination and control as being an easier
place to reason about what is happening in a repair session for now. There has been so much
excellent work on repair over the past year that I would really like to see some of that 'bake
in' to get people comfortable and trusting us again before we add a dimension of complexity.
I very much appreciate that you are running a version of this in production currently, but
there is just so much that can go wrong and it's a whole new paradigm for us to include in
the code base. We just can't afford to screw this up again.{quote}
I agree trust management with the community is a big issue here. I believe the devs decided
at NGCC 2017 to introduce the notion of _experimental_ for large functionality like this,
and I'm 100% on board with labeling this experimental, keeping it off by default, and allowing
users to incrementally buy into it table by table. If we marked it as explicitly experimental
and "may be removed in future releases", would that assuage your concerns? I just really want
us to actually solve this (hard) problem, and I fear that if we do not try we will end up where
we are today, with numerous bespoke solutions that don't fully work.
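As a sketch of what incremental, table-by-table opt-in might look like (the option name here is invented purely for illustration; the actual interface is whatever the design document settles on):

```sql
-- Hypothetical illustration only: a per-table opt-in to an experimental,
-- off-by-default scheduled repair. The 'scheduled_repair' option name is
-- invented here and is not an existing Cassandra table option.
ALTER TABLE my_keyspace.my_table
    WITH scheduled_repair = { 'enabled' : 'true', 'interval_hours' : '24' };
```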
{quote}Curious about what Stefan Podkowinski thinks here, as I agree that some of these ideas
might be much smoother to implement with CASSANDRA-12944 in place. As Blake suggested, maybe
we walk this back a bit and start from the control-plane/event loop and approach this as part
of refactoring management in general?{quote}
Shouldn't it be safe to proceed in parallel? If those additional IPC mechanisms and an official
sidecar surface, we can always port almost everything into them. The only thing I'm not sure
of is how the per-table control would work, but we could always figure that out (internally
we use a separate table).

> Scheduled Repair in Cassandra
> -----------------------------
>                 Key: CASSANDRA-14346
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Repair
>            Reporter: Joseph Lynch
>            Priority: Major
>              Labels: CommunityFeedbackRequested
>             Fix For: 4.0
>         Attachments: ScheduledRepairV1_20180327.pdf
> There have been many attempts to automate repair in Cassandra, which makes sense given
that it is necessary to give our users eventual consistency. Most recently CASSANDRA-10070,
CASSANDRA-8911 and CASSANDRA-13924 have all looked for ways to solve this problem.
> At Netflix we've built a scheduled repair service within Priam (our sidecar), which
we spoke about last year at NGCC. Given the positive feedback at NGCC we focussed on getting
it production ready and have now been using it in production to repair hundreds of clusters,
tens of thousands of nodes, and petabytes of data for the past six months. Also based on feedback
at NGCC we have invested effort in figuring out how to integrate this natively into Cassandra
rather than open sourcing it as an external service (e.g. in Priam).
> As such, [~vinaykumarcse] and I would like to re-work and merge our implementation into
Cassandra, and have created a [design document|]
showing how we plan to make it happen, including the user interface.
> As we work on the code migration from Priam to Cassandra, any feedback would be greatly
appreciated about the interface or v1 implementation features. I have tried to call out in
the document features which we explicitly consider future work (as well as a path forward
to implement them in the future) because I would very much like to get this done before the
4.0 merge window closes, and to do that I think aggressively pruning scope is going to be
a necessity.

This message was sent by Atlassian JIRA
