cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Created] (CASSANDRA-8346) Paxos operation can use stale data during multiple range movements
Date Thu, 20 Nov 2014 10:34:34 GMT
Sylvain Lebresne created CASSANDRA-8346:

             Summary: Paxos operation can use stale data during multiple range movements
                 Key: CASSANDRA-8346
             Project: Cassandra
          Issue Type: Bug
            Reporter: Sylvain Lebresne
            Assignee: Sylvain Lebresne

Paxos operations correctly account for pending ranges for all operations pertaining to the
Paxos state, but those pending ranges are not taken into account when reading the data to
check the conditions or during a serial read. It's thus possible to break the LWT guarantees
by reading a stale value. This requires 2 node movements (on the same token range) to be a
problem though.

Basically, we have {{RF}} replicas + {{P}} pending nodes. For the Paxos prepare/propose phases,
the number of required participants (the "Paxos QUORUM") is {{(RF + P + 1) / 2}} ({{SP.getPaxosParticipants}}),
but the read done to check conditions or for serial reads is done at a "normal" QUORUM (or
LOCAL_QUORUM), and so a weaker {{(RF + 1) / 2}}. We have a problem if said read can be served
entirely by nodes that were not among the Paxos participants, that is, if:
"normal quorum" == (RF + 1) / 2 <= (RF + P) - ((RF + P + 1) / 2) == "participants considered - blocked for"
We're good if {{P = 0}} or {{P = 1}} since this inequality gives us respectively
{{RF + 1 <= RF - 1}} and {{RF + 1 <= RF}}, both of which are impossible. But at {{P = 2}} (2
pending nodes), this inequality is equivalent to {{RF <= RF}} and so we might read stale data.
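
The case analysis above can be checked mechanically. The following is only a sketch of the
inequality as written in this report, not of Cassandra's code: the function name
{{read_can_miss_participants}} is hypothetical, and division is treated as exact (via
{{Fraction}}), as the algebra above does, whereas real quorum sizes are integers.

```python
from fractions import Fraction


def read_can_miss_participants(rf: int, pending: int) -> bool:
    """Evaluate the inequality from the report with exact (non-integer) division.

    Hypothetical helper for illustration only; it mirrors the formulas quoted
    in the text and does not call or reproduce any Cassandra API.
    """
    # "Normal" QUORUM used for the condition read / serial read.
    normal_quorum = Fraction(rf + 1, 2)
    # "Paxos QUORUM" over replicas plus pending nodes.
    paxos_quorum = Fraction(rf + pending + 1, 2)
    # Participants considered minus participants blocked for.
    not_blocked_for = (rf + pending) - paxos_quorum
    # Stale reads are possible when a normal quorum fits entirely
    # inside the set of nodes Paxos did not wait for.
    return normal_quorum <= not_blocked_for


if __name__ == "__main__":
    for p in range(3):
        print(f"RF=3, P={p}: stale read possible = {read_can_miss_participants(3, p)}")
```

Under these assumptions the result depends only on {{P}}: the inequality reduces to
{{RF + 1 <= RF + P - 1}}, which holds exactly when {{P >= 2}}.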

This message was sent by Atlassian JIRA
