cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11206) Support large partitions on the 3.0 sstable format
Date Thu, 17 Mar 2016 02:06:33 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198608#comment-15198608 ]

Stefania commented on CASSANDRA-11206:
--------------------------------------

bq. My understanding of {{UnfilteredRowIteratorWithLowerBound#getPartitionIndexLowerBound}}
is that it uses the IndexInfo objects that are already in the key-cache and will go to disk
if there is a key-cache miss.

Yes. The difference is that previously it had to do this anyway because of the partition
deletion, whereas now the partition deletion will be available but the full IndexInfo
objects will not.

bq. We could (in theory) add stuff to the partition summary or change the serialized index
- but unfortunately not in 3.x.

I think it's reasonable to wait until the new major version to improve on the optimization
of CASSANDRA-8180. So I'm happy with this compromise. Shall we open a ticket for this?

> Support large partitions on the 3.0 sstable format
> --------------------------------------------------
>
>                 Key: CASSANDRA-11206
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11206
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Robert Stupp
>             Fix For: 3.x
>
>
> Cassandra saves a sample of IndexInfo objects that store the offset within each partition
> of every 64KB (by default) range of rows.  To find a row, we binary search this sample, then
> scan the appropriate range of the partition.
> The problem is that this scales poorly as partitions grow: on a cache miss, we deserialize
> the entire set of IndexInfo, which both creates a lot of GC overhead (as noted in CASSANDRA-9754)
> and is also non-negligible i/o activity (relative to reading a single 64KB row range) as partitions
> get truly large.
> We introduced an "offset map" in CASSANDRA-10314 that allows us to perform the IndexInfo
> bsearch while only deserializing IndexInfo that we need to compare against, i.e. log(N) deserializations.
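
As a rough illustration of the offset-map idea described above, here is a minimal Python sketch. It is not Cassandra's code or its on-disk format: the entry layout (length-prefixed first and last keys plus a data offset), the string keys, and the names serialize_index, read_entry and find_block are all assumptions made up for this example. The point it shows is that, with a list of byte offsets into the serialized index, a binary search only has to deserialize the entries it actually compares against.

```python
import struct


def serialize_index(entries):
    """Pack (first_key, last_key, data_offset) entries into one byte string.

    Hypothetical layout, not Cassandra's: keys are length-prefixed, so
    entries are variable-length and cannot be located by index alone.
    The returned offsets list plays the role of the "offset map".
    """
    body = bytearray()
    offsets = []
    for first, last, data_offset in entries:
        offsets.append(len(body))  # where this entry starts in the body
        f, l = first.encode(), last.encode()
        body += struct.pack(">H", len(f)) + f
        body += struct.pack(">H", len(l)) + l
        body += struct.pack(">Q", data_offset)
    return bytes(body), offsets


def read_entry(body, pos):
    """Deserialize the single entry that starts at byte position pos."""
    (flen,) = struct.unpack_from(">H", body, pos)
    pos += 2
    first = body[pos:pos + flen].decode()
    pos += flen
    (llen,) = struct.unpack_from(">H", body, pos)
    pos += 2
    last = body[pos:pos + llen].decode()
    pos += llen
    (data_offset,) = struct.unpack_from(">Q", body, pos)
    return first, last, data_offset


def find_block(body, offsets, key):
    """Binary search over the offset map.

    Returns (data_offset, reads), where reads counts how many entries
    were deserialized. data_offset is None if no block covers the key.
    """
    lo, hi = 0, len(offsets) - 1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        first, last, data_offset = read_entry(body, offsets[mid])
        reads += 1
        if key < first:
            hi = mid - 1
        elif key > last:
            lo = mid + 1
        else:
            return data_offset, reads
    return None, reads
```

With 1024 index entries, find_block deserializes at most 11 of them per lookup, rather than all 1024 as a full deserialization on a cache miss would.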



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
