cassandra-commits mailing list archives

From "Sam Tunnicliffe (Jira)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-16721) Repaired data tracking on a read coordinator is susceptible to races between local and remote requests
Date Wed, 18 Aug 2021 16:15:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401177#comment-17401177 ]

Sam Tunnicliffe commented on CASSANDRA-16721:
---------------------------------------------

Your approach is a great improvement; it makes a lot more sense for {{ReadExecutionController}}
to handle {{RepairedDataInfo}}. The only non-obvious thing I noticed is that the sub-controllers
for index reads should probably not inherit the tracking flag, and should just always set it to false.
It should be safe and not inefficient either way, but it just doesn't make sense for a read
of the index table to be set to track.
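
As a rough illustration of the index sub-controller suggestion, here is a minimal sketch. The class and method names ({{Controller}}, {{indexSubController}}, {{isTracking}}) are hypothetical stand-ins and are not taken from the actual patch or from {{ReadExecutionController}}.

```java
// Hypothetical sketch: a controller that owns the tracking flag and hands out
// index sub-controllers that never track, regardless of the parent's setting.
final class IndexControllerSketch {
    static final class Controller {
        private final boolean trackRepaired;

        Controller(boolean trackRepaired) {
            this.trackRepaired = trackRepaired;
        }

        boolean isTracking() {
            return trackRepaired;
        }

        // The sub-controller for an index table read does not inherit the
        // parent's flag; tracking is always false for it.
        Controller indexSubController() {
            return new Controller(false);
        }
    }
}
```

With this shape, a parent created with tracking enabled still produces an index sub-controller whose {{isTracking()}} is false.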

> Repaired data tracking on a read coordinator is susceptible to races between local and
remote requests
> ------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16721
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16721
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Sam Tunnicliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.0.x
>
>
> At read time on a coordinator which is also a replica, the local and remote reads can
race such that the remote responses are received while the local read is executing. If the
remote responses are mismatching, triggering a {{DigestMismatchException}} and subsequent
round of full data reads and read repair, the local runnable may find the {{isTrackingRepairedStatus}}
flag flipped mid-execution. If this happens after a certain point in execution, it would mean
that the {{RepairedDataInfo}} instance in use is the singleton null object {{RepairedDataInfo.NULL_REPAIRED_DATA_INFO}}.
If this happens, it can lead to an NPE when calling {{RepairedDataInfo::extend}} when the
local results are iterated.
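
The hazard described above can be sketched deterministically without threads. This is only an illustration under assumptions: {{Command}}, {{Info}}, {{racyRead}} and {{snapshotRead}} are hypothetical names, not Cassandra's classes, and an {{IllegalStateException}} stands in for the NPE. The {{betweenSteps}} callback simulates remote responses flipping the flag while the local read is executing.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class RepairedTrackingSketch {
    // Hypothetical stand-in for a shared read command whose flag another thread may flip.
    static final class Command {
        final AtomicBoolean trackRepaired = new AtomicBoolean(false);
    }

    // Hypothetical stand-in for RepairedDataInfo; NULL_INFO plays the null object.
    static final class Info {
        static final Info NULL_INFO = new Info(false);
        private final boolean usable;

        Info(boolean usable) {
            this.usable = usable;
        }

        void extend() {
            // Stands in for the NPE: the null object cannot be extended.
            if (!usable) {
                throw new IllegalStateException("extend() called on the null object");
            }
        }
    }

    // Racy shape: the flag is read once to choose the Info instance and again
    // later to decide whether to use it, so a flip in between is visible.
    static void racyRead(Command cmd, Runnable betweenSteps) {
        Info info = cmd.trackRepaired.get() ? new Info(true) : Info.NULL_INFO;
        betweenSteps.run(); // remote responses arrive mid-execution
        if (cmd.trackRepaired.get()) {
            info.extend();
        }
    }

    // Safer shape: the flag is captured once up front, so both decisions agree
    // for the lifetime of this read.
    static void snapshotRead(Command cmd, Runnable betweenSteps) {
        final boolean tracking = cmd.trackRepaired.get();
        Info info = tracking ? new Info(true) : Info.NULL_INFO;
        betweenSteps.run();
        if (tracking) {
            info.extend();
        }
    }
}
```

Calling {{racyRead}} with a callback that sets the flag to true ends up invoking {{extend()}} on the null object, while {{snapshotRead}} with the same callback completes normally because it never re-reads the flag.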



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

