cassandra-commits mailing list archives

From Andrés de la Peña (JIRA) <>
Subject [jira] [Commented] (CASSANDRA-8272) 2ndary indexes can return stale data
Date Tue, 12 Jun 2018 10:38:00 GMT


Andrés de la Peña commented on CASSANDRA-8272:

I have rebased the patch for trunk [here|]. Rebased
dtests can be found [here|].

The main differences with the previous patch version are the removal of Thrift stuff (which
makes things easier) and the refactor of {{ReadCommand}}/{{ReadQuery}} introduced by CASSANDRA-7622.
For the latter, I have placed {{postReconciliationProcessing}} at {{ReadCommand}} level since
it is related to {{StorageProxy}} and reconciliation, whereas {{ReadQuery}} doesn't seem to
require this kind of reconciliation.

It is worth remembering that the patch doesn't support rolling upgrades, since not-yet-updated
coordinators won't discard the stale rows sent by updated replicas. I think we don't need
the patch for 3.11, which was only a refactor that left the consistency problem unsolved so as
not to break rolling upgrades in a non-major version.
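
To make the rolling-upgrade concern concrete, here is a minimal sketch in Python. It is a toy model, not the actual Cassandra code: {{Row}}, {{reconcile}}, {{coordinator_read}} and the {{post_filter}} flag are hypothetical names, and it assumes that an updated replica sends its newest version of a row even when that version no longer matches the indexed expression, leaving the final filtering to the coordinator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Row:
    k: int    # partition key
    v: str    # indexed column
    ts: int   # write timestamp; the highest timestamp wins on reconciliation


def reconcile(responses: Iterable[List[Row]]) -> Dict[int, Row]:
    """Merge replica responses, keeping the newest version of each key."""
    merged: Dict[int, Row] = {}
    for rows in responses:
        for row in rows:
            current = merged.get(row.k)
            if current is None or row.ts > current.ts:
                merged[row.k] = row
    return merged


def coordinator_read(responses: Iterable[List[Row]],
                     matches: Callable[[Row], bool],
                     post_filter: bool) -> List[Row]:
    """Reconcile first, then (only on an updated coordinator) re-apply the query filter."""
    rows = sorted(reconcile(responses).values(), key=lambda r: r.k)
    if post_filter:
        rows = [r for r in rows if matches(r)]
    return rows


# Query: SELECT * FROM test WHERE v = 'foo'
matches_foo = lambda r: r.v == "foo"

# Assumption: updated replica A returns its current version although it no longer matches.
from_a = [Row(0, "bar", 2)]
# Replica C has not applied the update yet.
from_c = [Row(0, "foo", 1)]

updated = coordinator_read([from_a, from_c], matches_foo, post_filter=True)
legacy = coordinator_read([from_a, from_c], matches_foo, post_filter=False)

print(updated)  # the reconciled row no longer matches, so it is dropped
print(legacy)   # a coordinator without the post-filter lets the non-matching row through
```

In this model the coordinator without the post-reconciliation step hands the client a row that does not satisfy the query, which is the reason mixed-version clusters are a problem.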

The patch doesn't update SASI to use the new mechanism, so it still behaves the old way. To
benefit from this fix, it would need to provide an [{{Index.getIndexQueryFilter}}|]
implementation able to deal with analyzed values. I think that we could do it in a separate
ticket to keep things simple.

I ran the updated patch on our internal CI. There are no failures in the unit tests, and
the failing dtests are not related to the change.

> 2ndary indexes can return stale data
> ------------------------------------
>                 Key: CASSANDRA-8272
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Andrés de la Peña
>            Priority: Major
>             Fix For: 3.0.x
> When replicas return 2ndary index results, it's possible for a single replica to return
a stale result, and that result will be sent back to the user, potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before having applied
the update, the now stale result will be returned (since C will return it and A or B
will return nothing).
> A potential solution would be that when we read a tombstone in the index (and provided
we make the index inherit the gcGrace of its parent CF), instead of skipping that tombstone,
we'd insert in the result a corresponding range tombstone.
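
The three-replica scenario from the issue description can be reproduced with a small toy model. This is only an illustrative sketch: {{local_index_query}} and {{quorum_read}} are hypothetical helpers, each replica is a plain dict, and it assumes the pre-fix behaviour in which every replica filters by the indexed value locally before answering.

```python
from typing import Dict, List, Tuple

Version = Tuple[str, int]  # (value of column v, write timestamp)


def local_index_query(replica: Dict[int, Version], value: str) -> Dict[int, Version]:
    """Pre-fix behaviour: a replica only returns rows whose local value matches."""
    return {k: ver for k, ver in replica.items() if ver[0] == value}


def quorum_read(replicas: List[Dict[int, Version]], value: str) -> Dict[int, Version]:
    """Merge the locally filtered answers of the contacted replicas, newest timestamp wins."""
    merged: Dict[int, Version] = {}
    for replica in replicas:
        for k, ver in local_index_query(replica, value).items():
            if k not in merged or ver[1] > merged[k][1]:
                merged[k] = ver
    return merged


# State after: INSERT (0, 'foo') everywhere, then UPDATE v = 'bar' acknowledged by A and B only.
replica_a = {0: ("bar", 2)}
replica_b = {0: ("bar", 2)}
replica_c = {0: ("foo", 1)}  # C has not applied the update yet

# SELECT * FROM test WHERE v = 'foo' at QUORUM, answered by two of the three replicas.
stale = quorum_read([replica_a, replica_c], "foo")
fresh = quorum_read([replica_a, replica_b], "foo")

print(stale)  # C's outdated row survives because A has nothing to reconcile it against
print(fresh)  # no replica matches, so nothing is returned
```

Because A returns nothing rather than its newer version of the row, the coordinator has no newer data to reconcile against C's answer, so the stale row reaches the client whenever C is part of the quorum.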
