cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13075) Indexer is not correctly invoked when building indexes over sstables
Date Tue, 03 Jan 2017 10:48:58 GMT


Alex Petrov commented on CASSANDRA-13075:

Good find! 

I might be misunderstanding the issue, but as far as I can tell, we have multiple alternatives
if what we need is to ensure that {{Index.Indexer::begin}} and {{Index.Indexer::finish}} are
called just once per partition:

  * piggyback on the {{readStatic}} boolean, which makes sure we index the static row just once
  * call them outside of the loop (since essentially we have both {{partitionColumns}} and
{{writeGroup}} available before we have a partition page, so it's optional)

I may have misunderstood the part about {{PartitionIterators.getOnlyElement}}, but it also
seems to me that this behaviour will be just fine if we take the {{begin}} and {{finish}}
calls out of the loop, since {{insertRow}} for statics will be skipped on further iterations,
and for partition rows it will also be skipped, since the partition is empty once the iterator
is exhausted.
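
To make the second alternative concrete, here is a minimal sketch of calling {{begin}} and {{finish}} outside the paging loop. It is not the actual {{SecondaryIndexManager}} code: {{PagedIndexingSketch}}, {{RecordingIndexer}} and the list-of-pages model are hypothetical stand-ins, used only to show that an empty trailing page becomes harmless once the two calls are hoisted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PagedIndexingSketch {
    /** Hypothetical stand-in for Index.Indexer; it only records the calls it receives. */
    static class RecordingIndexer {
        final List<String> calls = new ArrayList<>();
        void begin() { calls.add("begin"); }
        void insertRow(String row) { calls.add("insertRow:" + row); }
        void finish() { calls.add("finish"); }
    }

    /** Indexes one partition delivered as pages of rows; the last page may be empty. */
    static void indexPartition(List<List<String>> pages, RecordingIndexer indexer) {
        indexer.begin();                  // once per partition, before any page is read
        for (List<String> page : pages) {
            for (String row : page) {     // an empty page contributes no insertRow calls
                indexer.insertRow(row);
            }
        }
        indexer.finish();                 // once per partition, after the last page
    }

    public static void main(String[] args) {
        // Two pages of rows followed by the empty page seen on an exhausted iterator.
        List<List<String>> pages = Arrays.asList(
                Arrays.asList("r1", "r2"),
                Arrays.asList("r3"),
                Collections.<String>emptyList());
        RecordingIndexer indexer = new RecordingIndexer();
        indexPartition(pages, indexer);
        System.out.println(indexer.calls);
        // prints [begin, insertRow:r1, insertRow:r2, insertRow:r3, finish]
    }
}
```

In this model the page size changes only how many times the outer loop runs, not how many times {{begin}} and {{finish}} are invoked.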

> Indexer is not correctly invoked when building indexes over sstables
> --------------------------------------------------------------------
>                 Key: CASSANDRA-13075
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sergio Bossa
>            Assignee: Alex Petrov
>            Priority: Critical
> Following CASSANDRA-12796, {{SecondaryIndexManager#indexPartition()}} calls each {{Indexer}}'s
> {{begin}} and {{finish}} methods multiple times per partition (depending on the page size),
> as {{PartitionIterators#getOnlyElement()}} returns an empty partition even when the iterator
> is exhausted.
> This leads to bugs for {{Indexer}} implementations doing actual work in those methods,
> but even worse, it provides the {{Indexer}} the same input of an empty partition containing
> only a non-live partition deletion, as the {{Indexer#partitionDelete()}} method is *not* actually
> called.
> My proposed solution:
> 1) Stop the iteration before the empty partition is returned and ingested into the {{Indexer}}.
> 2) Actually call the {{Indexer#partitionDelete()}} method inside {{SecondaryIndexManager#indexPartition()}}
> (which requires using a filtered iterator so it actually contains the deletion info).
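
The two proposed steps could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: {{PartitionDeleteSketch}}, {{RecordingIndexer}}, the {{LIVE}} sentinel and the timestamp parameter are all hypothetical, and whether {{partitionDelete}} should be skipped for a live deletion is a choice made here only for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class PartitionDeleteSketch {
    /** Hypothetical sentinel meaning "the partition carries no deletion". */
    static final long LIVE = Long.MIN_VALUE;

    /** Hypothetical stand-in for Index.Indexer; it only records the calls it receives. */
    static class RecordingIndexer {
        final List<String> calls = new ArrayList<>();
        void begin() { calls.add("begin"); }
        void partitionDelete(long markedForDeleteAt) { calls.add("partitionDelete:" + markedForDeleteAt); }
        void insertRow(String row) { calls.add("insertRow:" + row); }
        void finish() { calls.add("finish"); }
    }

    static void indexPartition(Iterator<List<String>> pages, long partitionDeletion, RecordingIndexer indexer) {
        indexer.begin();
        if (partitionDeletion != LIVE) {
            indexer.partitionDelete(partitionDeletion); // step 2: pass the deletion on explicitly
        }
        while (pages.hasNext()) {
            List<String> page = pages.next();
            if (page.isEmpty()) {
                break;                                  // step 1: stop before the empty page is ingested
            }
            for (String row : page) {
                indexer.insertRow(row);
            }
        }
        indexer.finish();
    }

    public static void main(String[] args) {
        // One page of rows followed by the empty page seen on an exhausted iterator.
        List<List<String>> pages = Arrays.asList(
                Arrays.asList("r1", "r2"),
                Collections.<String>emptyList());
        RecordingIndexer indexer = new RecordingIndexer();
        indexPartition(pages.iterator(), 42L, indexer);
        System.out.println(indexer.calls);
        // prints [begin, partitionDelete:42, insertRow:r1, insertRow:r2, finish]
    }
}
```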

This message was sent by Atlassian JIRA
