cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andres de la Peña (Jira) <>
Subject [jira] [Updated] (CASSANDRA-16807) Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison
Date Sat, 17 Jul 2021 11:41:00 GMT


Andres de la Peña updated CASSANDRA-16807:
    Reviewers: Andres de la Peña, Andres de la Peña  (was: Andres de la Peña)
               Andres de la Peña, Andres de la Peña
       Status: Review In Progress  (was: Patch Available)

> Weak visibility guarantees of Accumulator lead to failed assertions during digest comparison
> --------------------------------------------------------------------------------------------
>                 Key: CASSANDRA-16807
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.0.x, 4.x
>          Time Spent: 20m
>  Remaining Estimate: 0h
> This problem could manifest on all versions, beginning on at least 3.0, but I’ll focus
on the way it manifests in 4.0 here.
> In what now seems like a wise move, CASSANDRA-16097 added an assertion to {{DigestResolver#responseMatch()}}
that ensures the responses snapshot has at least one visible elements to compare (although
of course only one element trivially cannot generate a mismatch and short-circuits immediately).
However, at the point {{ReadCallback#onResponse()}} signals the waiting resolver, there is
no guarantee that the size of the generated snapshot of the responses {{Accumulator}} is non-zero,
or perhaps more worryingly, at least equal to the number of blocked-for responses. This seems
to be a consequence of the documented weak visibility guarantees on {{Accumulator#add()}}.
In short, if there are concurrent invocations of add(), is it not guaranteed that there is
any visible size change after any one of them return, but only after all complete.
> The particular exception looks something like this:
> {noformat}
> java.lang.AssertionError: Attempted response match comparison while no responses have
been received.
> 	at org.apache.cassandra.service.reads.DigestResolver.responsesMatch(
> 	at org.apache.cassandra.service.reads.AbstractReadExecutor.awaitResponses(
> 	at org.apache.cassandra.service.StorageProxy.fetchRows(
> 	at org.apache.cassandra.service.StorageProxy.readRegular(
> 	at
> 	at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(
> {noformat}
> It’s possible to reproduce this on simple single-partition reads without any short-read
protection or replica filtering protection. I’ve also been able to reproduce this synthetically
with [a unit test|] on {{ReadCallback}}.
> It seems like the most straightforward way to fix this would be to avoid signaling in
{{ReadCallback#onResponse()}} until the visible size of the accumulator is at least the number
of received responses. In most cases, this is trivially true, and our signaling behavior won’t
change at all. In the very rare case that there are two (or more) concurrent calls to {{onResponse()}},
the second (or last) will signal, and having one more response than we strictly need should
have no negative side effects. (We don’t seem to make any strict assertions about having
exactly the number of required responses, only that we have enough.)

This message was sent by Atlassian Jira

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message