cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10378) Make skipping more efficient
Date Wed, 07 Oct 2015 08:19:27 GMT


Sylvain Lebresne commented on CASSANDRA-10378:

Thanks. Pushed an additional commit to the branch to address those.

bq. A static method for deciding if we have an extension byte would be nice

There was already an {{isExtended()}} method, though I added a {{readExtendedFlags}} that does
the reading conditionally, as that's probably cleaner. Or did I misunderstand what you meant?
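
For readers without the branch at hand, here is a minimal sketch of what "reading conditionally" could look like. It is not the code from the branch: the class name {{ExtendedFlagsSketch}}, the {{EXTENSION_FLAG}} bit value and the use of a plain {{DataInputStream}} are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class ExtendedFlagsSketch {
    // Assumed bit meaning "an extension byte follows"; the real value is not stated in this message.
    static final int EXTENSION_FLAG = 0x80;

    // True if the flags byte announces a following extension byte.
    static boolean isExtended(int flags) {
        return (flags & EXTENSION_FLAG) != 0;
    }

    // Reads the extension byte only when the flags say it is present; otherwise consumes nothing.
    static int readExtendedFlags(DataInputStream in, int flags) throws IOException {
        return isExtended(flags) ? in.readUnsignedByte() : 0;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { (byte) 0x80, 0x03, 0x00 };
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));

        int flags = in.readUnsignedByte();               // 0x80: extension byte follows
        System.out.println(readExtendedFlags(in, flags)); // consumes the 0x03 byte

        flags = in.readUnsignedByte();                   // 0x00: no extension byte
        System.out.println(readExtendedFlags(in, flags)); // consumes nothing
    }
}
```

With a helper of this shape, callers no longer need to repeat the {{isExtended()}} check before every read.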

> Make skipping more efficient
> ----------------------------
>                 Key: CASSANDRA-10378
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0.0 rc2
> Following on from the impact of CASSANDRA-10322, we can improve the efficiency of our
calls to skipping methods. CASSANDRA-10326 shows our performance to be in roughly
the same ballpark except for seeks into the middle of a large partition, which suggests (possibly)
that the higher density of data we're storing may simply be resulting in a more significant
CPU burden, as we have more data to skip over (and since CASSANDRA-10322 improves performance
here really dramatically, further improvements are likely to be of similar benefit).
> I propose doing our best to flatten the skipping of macro data items into as few skip
invocations as possible. One way of doing this would be to introduce a special {{skipUnsignedVInts(int)}}
method that can efficiently skip a number of unsigned vints. Almost the entire body of a
cell and row now consists of vints, and each data component has its own special {{skipX}} method
that invokes {{readUnsignedVint}}. This would permit more efficient despatch.
> We could also potentially avoid the construction of a new {{Columns}} instance for each
row skip, since all we need is an iterator over the columns, and share the temporary space
used for storing them, which should further reduce the GC burden for skipping many rows.
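
As a rough illustration of the proposed {{skipUnsignedVInts(int)}}, the sketch below skips several unsigned vints in one call by inspecting only the first byte of each. It assumes an encoding in which the number of leading 1 bits in the first byte gives the number of bytes that follow, and it works on a plain {{ByteBuffer}}. The class {{VIntSkipSketch}} and its helper methods are hypothetical and are not the implementation from the branch.

```java
import java.nio.ByteBuffer;

final class VIntSkipSketch {
    // Number of bytes following the first byte: the count of leading 1 bits in that byte.
    static int extraBytes(int firstByte) {
        return Integer.numberOfLeadingZeros(~firstByte & 0xFF) - 24;
    }

    // Skips `count` unsigned vints without decoding their values.
    static void skipUnsignedVInts(ByteBuffer in, int count) {
        int pos = in.position();
        for (int i = 0; i < count; i++)
            pos += 1 + extraBytes(in.get(pos)); // absolute get: only the first byte is inspected
        in.position(pos);
    }

    // Encoder under the assumed format, included so the example is self-contained.
    static void writeUnsignedVInt(long value, ByteBuffer out) {
        int size = 1;
        // A vint of n < 9 bytes carries 7 * n payload bits; 9 bytes carry all 64.
        while (size < 9 && (value >>> (7 * size)) != 0)
            size++;
        int extra = size - 1;
        if (extra == 8) {
            out.put((byte) 0xFF);
            out.putLong(value);
            return;
        }
        int mask = (0xFF << (8 - extra)) & 0xFF; // `extra` leading 1 bits
        out.put((byte) (mask | (value >>> (8 * extra))));
        for (int i = extra - 1; i >= 0; i--)
            out.put((byte) (value >>> (8 * i)));
    }

    // Decoder under the same assumed format, used to check where the skip landed.
    static long readUnsignedVInt(ByteBuffer in) {
        int first = in.get() & 0xFF;
        int extra = extraBytes(first);
        long value = first & (0xFF >> (extra + 1)); // payload bits of the first byte
        for (int i = 0; i < extra; i++)
            value = (value << 8) | (in.get() & 0xFF);
        return value;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        for (long v : new long[] { 5L, 300L, 1L << 40, 42L })
            writeUnsignedVInt(v, buf);
        buf.flip();

        skipUnsignedVInts(buf, 3);                 // one call instead of three separate skips
        System.out.println(readUnsignedVInt(buf)); // decodes the fourth value
    }
}
```

The point of the sketch is that a skip only needs the first byte of each vint, so a run of vint-encoded fields can be passed over in a single loop rather than through one {{skipX}} call per field.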

