cassandra-commits mailing list archives

From "Benjamin Roth (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12888) Incremental repairs broken for MVs and CDC
Date Thu, 12 Jan 2017 15:09:10 GMT


Benjamin Roth commented on CASSANDRA-12888:

Hi Victor,

1. Performance:
Performance can be better with MVs than with batches, but this depends on the read performance
of the base table versus the overhead of batches, which in turn depends on the batch size and
the batchlog performance. An MV always performs a read before write, so MV performance largely
depends on that read. The final write of the MV update is fast, as it works like a regular
(local) write.
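To make the trade-off concrete, here is a minimal sketch of the two approaches; all table,
view, and column names are hypothetical, and the manually maintained table in option B is
assumed to exist:

```cql
-- Hypothetical base table
CREATE TABLE users (
    id uuid PRIMARY KEY,
    email text,
    name text
);

-- Option A: an MV. Every write to users first reads the affected base
-- row (read-before-write) so stale view rows can be cleaned up, then
-- performs a regular write to the view.
CREATE MATERIALIZED VIEW users_by_email AS
    SELECT * FROM users
    WHERE email IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (email, id);

-- Option B: manual denormalization with a logged batch. No read is
-- needed, but the batchlog adds overhead that grows with batch size.
BEGIN BATCH
    INSERT INTO users (id, email, name) VALUES (?, ?, ?);
    INSERT INTO users_by_email_manual (email, id, name) VALUES (?, ?, ?);
APPLY BATCH;
```

Which option wins depends on whether the base-table read in option A is cheaper than the
batchlog round trips in option B for the given workload.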

2. Partition Keys and remote MV updates
You are of course right that this may be a common use case. You have to use it carefully.
Maybe the situation has already improved through some bugfixes; the last time I tried was some
months ago. To be fair, I have to mention that back then there was a bug with a race condition
that could deadlock the whole mutation stage. With "remote MVs" we ran into this situation
very frequently during bootstraps, for example. This has to do with MV locks and probably
the much longer lock hold time when the MV update is remote, leading to more lock contention.
With remote MV updates, the current write request also depends on the performance of remote
nodes. This can lead to write timeouts much sooner, as long as the (remote) MV update is part
of the write request and not deferred. So again: maybe this situation has improved meanwhile,
but I personally didn't require it, so I was able to use normal tables to "twist" the PK. We
currently use MVs only to add a field to the primary key for sorting.
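That sorting-only pattern can be sketched as follows; the schema and names are hypothetical.
The point is that the view keeps the base table's partition key, so each view replica is
co-located with its base replica and the MV update stays node-local:

```cql
-- Hypothetical base table: posts per user
CREATE TABLE posts (
    user_id uuid,
    post_id timeuuid,
    score int,
    PRIMARY KEY (user_id, post_id)
);

-- The MV keeps user_id as the partition key and only promotes score
-- to a clustering column, giving per-user ordering by score without
-- any remote MV update.
CREATE MATERIALIZED VIEW posts_by_score AS
    SELECT * FROM posts
    WHERE user_id IS NOT NULL AND post_id IS NOT NULL AND score IS NOT NULL
    PRIMARY KEY (user_id, score, post_id)
WITH CLUSTERING ORDER BY (score DESC);
```

A view that changed the partition key (e.g. partitioning by score) would instead route each
update to a potentially remote replica, which is the case discussed above.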

> Incremental repairs broken for MVs and CDC
> ------------------------------------------
>                 Key: CASSANDRA-12888
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Stefan Podkowinski
>            Assignee: Benjamin Roth
>            Priority: Critical
>             Fix For: 3.0.x, 3.x
> SSTables streamed during the repair process will first be written locally and afterwards
either simply added to the pool of existing sstables or, in case of existing MVs or active
CDC, replayed on mutation basis:
> As described in {{StreamReceiveTask.OnCompletionRunnable}}:
> {quote}
> We have a special path for views and for CDC.
> For views, since the view requires cleaning up any pre-existing state, we must put all
partitions through the same write path as normal mutations. This also ensures any 2is are
also updated.
> For CDC-enabled tables, we want to ensure that the mutations are run through the CommitLog
so they can be archived by the CDC process on discard.
> {quote}
> Using the regular write path turns out to be an issue for incremental repairs, as we
lose the {{repaired_at}} state in the process. Eventually the streamed rows will end up in
the unrepaired set, in contrast to the rows on the sender side, which are moved to the repaired
set. The next repair run will stream the same data back again, causing rows to bounce back
and forth between nodes on each repair.
> See linked dtest on steps to reproduce. An example for reproducing this manually using
ccm can be found [here|]
