cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13075) Indexer is not correctly invoked when building indexes over sstables
Date Wed, 18 Jan 2017 17:55:26 GMT


Alex Petrov commented on CASSANDRA-13075:

bq.  it is missing a test with range tombstones crossing pages.

I've tested it as far as I could with page sizes from {{1}} to {{5}} and with multiple tombstones,
although I haven't had a chance to dig in and write up an explanation of why it works the way it
does. We're using {{CQLCounter}} for counting, and it counts only live rows, so range tombstone
markers are guaranteed to be read together on the same page. You can refer to [this file|]
for more info.

The test with range tombstones checks that we fetch them correctly regardless of the page size.
Suppose we have:

[RT open marker] [RT close marker] | [ row ] | [RT open marker] [RT close marker]

If we do paging, even with a page size of 1, we'll get all three of them on the same page, since
paging counts only the row.
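
To illustrate the counting argument above, here is a minimal sketch. It is a toy model, not the actual {{CQLCounter}} code: the {{PagingSketch}} class, the {{Kind}} enum and the {{paginate}} method are hypothetical names made up for this example. It assumes a page is closed only when one more live row would exceed the page size, so markers never consume the page budget.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PagingSketch {
    // Hypothetical stand-in for the elements of an unfiltered stream.
    public enum Kind { ROW, RT_OPEN, RT_CLOSE }

    // Splits the stream into pages. Only live rows count towards pageSize;
    // range tombstone markers are appended to the current page for free.
    public static List<List<Kind>> paginate(List<Kind> stream, int pageSize) {
        List<List<Kind>> pages = new ArrayList<>();
        List<Kind> current = new ArrayList<>();
        int liveRows = 0;
        for (Kind k : stream) {
            // Close the page only when another live row would exceed the limit.
            if (k == Kind.ROW && liveRows == pageSize) {
                pages.add(current);
                current = new ArrayList<>();
                liveRows = 0;
            }
            current.add(k);
            if (k == Kind.ROW) {
                liveRows++;
            }
        }
        if (!current.isEmpty()) {
            pages.add(current);
        }
        return pages;
    }

    public static void main(String[] args) {
        // The layout from the comment: RT pair, one row, RT pair.
        List<Kind> stream = Arrays.asList(
            Kind.RT_OPEN, Kind.RT_CLOSE, Kind.ROW, Kind.RT_OPEN, Kind.RT_CLOSE);
        System.out.println(paginate(stream, 1).size() + " page(s)");
    }
}
```

Under these assumptions, the layout above ends up on a single page even with a page size of 1, and a second page starts only when a second live row shows up.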

I'll post an updated patch with two other nits fixed for all branches.

> Indexer is not correctly invoked when building indexes over sstables
> --------------------------------------------------------------------
>                 Key: CASSANDRA-13075
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sergio Bossa
>            Assignee: Alex Petrov
>            Priority: Critical
>         Attachments:
> Following CASSANDRA-12796, {{SecondaryIndexManager#indexPartition()}} calls each {{Indexer}}'s
{{begin}} and {{finish}} methods multiple times per partition (depending on the page size),
as {{PartitionIterators#getOnlyElement()}} returns an empty partition even when the iterator
is exhausted.
> This leads to bugs for {{Indexer}} implementations doing actual work in those methods,
but even worse, it provides the {{Indexer}} the same input of an empty partition containing
only a non-live partition deletion, as the {{Indexer#partitionDelete()}} method is *not* actually called.
> My proposed solution:
> 1) Stop the iteration before the empty partition is returned and ingested into the {{Indexer}}.
> 2) Actually call the {{Indexer#partitionDelete()}} method inside {{SecondaryIndexManager#indexPartition()}}
(which requires using a filtered iterator so that it actually contains the deletion info).
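
The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: the {{Indexer}} interface here is a simplified, hypothetical stand-in (only {{begin}}, {{finish}} and {{partitionDelete}} are named in this ticket; {{insertRow}}, the {{long}} deletion time and the page representation are assumptions made for the example).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class IndexPartitionSketch {
    // Simplified, hypothetical stand-in; not Cassandra's real Indexer interface.
    public interface Indexer {
        void begin();
        void partitionDelete(long deletionTime);
        void insertRow(String row);
        void finish();
    }

    // Indexes one partition that arrives as a sequence of pages of rows.
    public static void indexPartition(Iterator<List<String>> pages,
                                      long partitionDeletion,
                                      Indexer indexer) {
        indexer.begin();                            // once per partition, not once per page
        indexer.partitionDelete(partitionDeletion); // step 2: hand over the deletion info explicitly
        while (pages.hasNext()) {
            List<String> page = pages.next();
            if (page.isEmpty()) {
                break;                              // step 1: stop before the empty partition is ingested
            }
            for (String row : page) {
                indexer.insertRow(row);
            }
        }
        indexer.finish();                           // once per partition
    }

    // Test helper that records the order of calls it receives.
    public static class RecordingIndexer implements Indexer {
        public final List<String> calls = new ArrayList<>();
        @Override public void begin() { calls.add("begin"); }
        @Override public void partitionDelete(long deletionTime) { calls.add("partitionDelete"); }
        @Override public void insertRow(String row) { calls.add("insertRow:" + row); }
        @Override public void finish() { calls.add("finish"); }
    }
}
```

With this shape, {{begin}} and {{finish}} each run exactly once per partition no matter how many pages are read, and the trailing empty page never reaches the {{Indexer}}.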

This message was sent by Atlassian JIRA
