cassandra-commits mailing list archives

From "Jonathan Shook (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10489) arbitrary order by on partitions
Date Thu, 08 Oct 2015 22:14:27 GMT


Jonathan Shook commented on CASSANDRA-10489:

So, against a non-indexed field, the processing bound will be the size of the partition. If
you only hold a scoreboard of LIMIT items in memory and stream through the rest, replacing
items as better ones arrive, the memory requirements are lower, but the IO requirements could
be substantial. If you do this with RF>1 and CL>1, then you may need result-merging semantics
at the coordinator, but this should still be bounded by the result size and not the search space.
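
To make the scoreboard idea concrete, here is a minimal sketch in Python. It is not Cassandra code: the function names, the row shape (plain dicts) and the `row_id` callable are all assumptions made for illustration, and replica reconciliation (timestamps, tombstones) is ignored. It shows a descending sort with a limit that never holds more than `limit` rows per node, plus a coordinator-side merge whose input is at most `limit` rows per replica.

```python
import heapq


def scoreboard_top_k(rows, sort_key, limit):
    """Return the `limit` rows with the largest sort_key, in descending order.

    Streams through `rows` once and keeps at most `limit` entries in memory.
    """
    if limit <= 0:
        return []
    # Min-heap of (key, seq, row): heap[0] is the weakest row still on the board.
    # `seq` is unique, so rows themselves are never compared.
    heap = []
    for seq, row in enumerate(rows):
        entry = (sort_key(row), seq, row)
        if len(heap) < limit:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:
            # New row beats the weakest row on the board: replace it.
            heapq.heapreplace(heap, entry)
    return [row for _, _, row in sorted(heap, reverse=True)]


def merge_replica_results(replica_results, sort_key, row_id, limit):
    """Hypothetical coordinator merge: each replica sends at most `limit` rows.

    Rows reported by several replicas are counted once (first copy wins).
    Work and memory are bounded by replicas * limit, not by partition size.
    """
    seen = set()

    def unique_rows():
        for result in replica_results:
            for row in result:
                rid = row_id(row)
                if rid not in seen:
                    seen.add(rid)
                    yield row

    return scoreboard_top_k(unique_rows(), sort_key, limit)
```

Each replica would run `scoreboard_top_k` over its copy of the partition (reading every row, hence the IO cost), and the coordinator would run `merge_replica_results` over the small per-replica results.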

I would like for us to consider this operation for indexed fields and non-indexed fields as
separate features, possibly putting the non-indexed version behind a warning or such. I'm
sure some will absolutely try to sort 10^9 items with limit 10. At least they should know
that it has a completely different op cost.
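
A rough way to see the difference in op cost is to count rows examined. The sketch below is an illustration only, and it assumes that an index on the field can be iterated in sorted order, which the ticket does not establish. The function names are hypothetical.

```python
import heapq
from itertools import islice


def scan_cost(partition, sort_key, limit):
    """Non-indexed case: every row in the partition must be examined."""
    examined = 0

    def counted():
        nonlocal examined
        for row in partition:
            examined += 1
            yield row

    result = heapq.nlargest(limit, counted(), key=sort_key)
    return result, examined


def ordered_index_cost(index_desc, limit):
    """Indexed case, ASSUMING the index yields rows already sorted descending:
    only the first `limit` entries are examined."""
    result = list(islice(index_desc, limit))
    return result, len(result)
```

With a partition of N rows and LIMIT k, the first function examines N rows and the second examines k, so the 10^9-row case with limit 10 differs by eight orders of magnitude in rows read under this assumption.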

> arbitrary order by on partitions
> --------------------------------
>                 Key: CASSANDRA-10489
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jon Haddad
>            Priority: Minor
> We've got aggregations, we might as well allow sorting rows within a partition on arbitrary
> fields.  Currently the advice is "do it client side", but when combined with a LIMIT clause
> it makes sense to do this server side.

This message was sent by Atlassian JIRA
