cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7282) Faster Memtable map
Date Wed, 04 Jul 2018 23:36:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16533127#comment-16533127
] 

Benedict edited comment on CASSANDRA-7282 at 7/4/18 11:35 PM:
--------------------------------------------------------------

Hi Michael,

I'll try to review this properly, as I find time.  A couple of things leap out to me though:
 # While we certainly need to handle out of range tokens, it seems to me that we should anyway
try hard to minimise their occurrence, by e.g. switching memtable on a token range change.
 Ideally we would switch before this new token range actually takes effect in the case of
an expansion; whether or not it makes sense to defer a token range removal probably needs
some thought.
 # A range tree is probably overkill.  Why not binary search over an array?


was (Author: benedict):
Hi Michael,

I'll try to review this properly, as I find time.  A couple of things leap out to me though:
 # While we certainly need to handle out of range tokens, we should try harder to minimise
their occurrence, by switching memtable on a token range change.  Ideally we would switch
before this new token range actually takes effect in the case of an expansion; whether or
not it makes sense to defer a token range removal probably needs some thought.
 # A range tree is probably overkill.  Why not binary search over an array?

> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Michael Burman
>            Priority: Major
>              Labels: performance
>             Fix For: 4.x
>
>         Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in our
memtables. Maintaining this is an O(lg(n)) operation; since the vast majority of users use
a hash partitioner, it occurs to me we could maintain a hybrid ordered list / hash map. The
list would impose the normal order on the collection, but a hash index would live alongside
as part of the same data structure, simply mapping into the list and permitting O(1) lookups
and inserts.
> I've chosen to implement this initial version as a linked-list node per item, but we
can optimise this in future by storing fatter nodes that permit a cache-line's worth of hashes
to be checked at once,  further reducing the constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message