cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-3885) Support multiple ranges in SliceQueryFilter
Date Fri, 15 Jun 2012 08:30:44 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Sylvain Lebresne updated CASSANDRA-3885:
----------------------------------------

    Attachment: 3885-v2.txt

Ok, I like the last patch much more and I've convinced myself that this prefetched thing is
a good idea with multiple slices.

I believe this can be simpler a tiny bit further though. Typically the {{readBackward}} business
is a tad confusing and since only SimpleBlockFetcher really cares about it, it can be moved
there. I also think the switch of slice/lookup of index block can be cleaned by doing both
together. I'm attaching a v2 that does those things, a few other cleanups, adds a bunch of
comments and also adds two optimizations:
* Since we always know the order on which we read columns, we can remember when we've "entered"
a slice to avoid a bunch of comparisons until we "leave" it.
* When we switch to the next slice, we must reuse the binary search to locate where that new
slice starts rather than going to the next block, otherwise we may end up reading lots of
unnecessary blocks.

I'll note that imho, with those optimizations, SimpleSliceReader (not to be confused with
SimpleBlockFetcher) isn't really useful anymore, but we probably want to make sure by benchmarking
it.

Anyway, outside of IndexedSliceReader, there was a few problems (that are all solved by the
attached v2):
* We needed to deal with multiple slices for the memtable iterator too, otherwise we'll end
up returning the wrong columns.
* We needed to correctly serialize multiple slices for the inter-node protocol.

I'll note that with this patch, db.SerializationsTest is broken, but that is no mystery: it's
trying to read old messages using the current protocol version. So I could regenerate the
binary messages, but I'm confused on what SerializationsTest is actually testing. I though
it was making sure we don't break backward compatibility. If we regenerate the binary message
at each release we're not testing that at all.

ps: @david, it seems your editor is splitting lines where it shouldn't and is reordering imports
in a way that don't respect http://wiki.apache.org/cassandra/CodeStyle.

                
> Support multiple ranges in SliceQueryFilter
> -------------------------------------------
>
>                 Key: CASSANDRA-3885
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3885
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: David Alves
>             Fix For: 1.2
>
>         Attachments: 3885-v2.txt, CASSANDRA-3885.patch, CASSANDRA-3885.patch, CASSANDRA-3885.patch,
CASSANDRA-3885.patch, CASSANDRA-3885.patch
>
>
> This is logically a subtask of CASSANDRA-2710, but Jira doesn't allow sub-sub-tasks.
> We need to support multiple ranges in a SliceQueryFilter, and we want querying them to
be efficient, i.e., one pass through the row to get all of the ranges, rather than one pass
per range.
> Supercolumns are irrelevant since the goal is to replace them anyway.  Ignore supercolumn-related
code or rip it out, whichever is easier.
> This is ONLY dealing with the storage engine part, not the StorageProxy and Command intra-node
messages or the Thrift or CQL client APIs.  Thus, a unit test should be added to ColumnFamilyStoreTest
to demonstrate that it works.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message