cassandra-commits mailing list archives

From "Benjamin Lerer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-14344) Support filtering using IN restrictions
Date Tue, 19 Jun 2018 08:10:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516774#comment-16516774 ]

Benjamin Lerer commented on CASSANDRA-14344:
--------------------------------------------

{quote}I agree with the comment "approach force the deserialization of all the list elements
and of the value for each check", but as per my knowledge, this evaluation happens as part
of iterator with no additional status/context, making it difficult to reuse deserialized values
across partitions. This approach is used by other operators too. Is there a better way?{quote}

You are right, we cannot reuse deserialized values. I just believe that the code can be more
efficient.

Deserializing the value and the list elements for each check can generate a lot of garbage
that the GC will have to collect. If we use {{AbstractType.compareForCQL}} instead of deserializing
and then comparing the objects, we do not generate that garbage. I also have the
feeling that it might actually perform better than the other approach,
as in most cases you would end up comparing only a subset of the bytes.
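
As a rough illustration of the idea, and not the attached patch or actual Cassandra code, the sketch below checks membership by comparing serialized values byte by byte instead of deserializing them. The class {{SerializedInFilter}} and its methods are hypothetical names. The unsigned lexicographic comparison is an assumption that is only adequate for equality checks; a real type-aware comparator would define the ordering.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper: membership test on serialized values, no deserialization.
final class SerializedInFilter {

    // Compares the remaining bytes of two buffers as unsigned values.
    // Uses absolute reads, so buffer positions are left untouched and
    // no intermediate objects are allocated.
    static int compareUnsigned(ByteBuffer a, ByteBuffer b) {
        int i = a.position();
        int j = b.position();
        while (i < a.limit() && j < b.limit()) {
            int cmp = Integer.compare(a.get(i++) & 0xFF, b.get(j++) & 0xFF);
            if (cmp != 0)
                return cmp; // stop at the first differing byte
        }
        // All shared bytes are equal: the shorter buffer sorts first.
        return Integer.compare(a.remaining(), b.remaining());
    }

    // Returns true if columnValue equals one of the serialized IN values.
    static boolean containsSerialized(List<ByteBuffer> inValues, ByteBuffer columnValue) {
        for (ByteBuffer candidate : inValues) {
            if (compareUnsigned(candidate, columnValue) == 0)
                return true;
        }
        return false;
    }
}
```

Because the comparison returns at the first differing byte, a mismatch is usually detected without reading the whole value.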

{{ArrayList.contains}} checks the values of the list sequentially. If we pre-sort
the list elements when the {{RowFilter}} is built, we can take advantage of that to stop as
soon as possible and save some CPU. Later on, we could even switch
to binary search when the list is bigger than a specific size.
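
A minimal sketch of the pre-sorting idea, assuming a comparator over serialized values is available. {{SortedInValues}} and {{BINARY_SEARCH_THRESHOLD}} are hypothetical names, and the threshold value is arbitrary; it would have to come from benchmarks.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for IN values, sorted once when the filter is built.
final class SortedInValues {

    // Arbitrary illustrative cut-off, not a measured value.
    private static final int BINARY_SEARCH_THRESHOLD = 16;

    private final List<ByteBuffer> sorted;
    private final Comparator<ByteBuffer> comparator;

    SortedInValues(List<ByteBuffer> values, Comparator<ByteBuffer> comparator) {
        this.comparator = comparator;
        this.sorted = new ArrayList<>(values); // copy so the caller's list is untouched
        this.sorted.sort(comparator);          // pay the sorting cost once, at build time
    }

    boolean contains(ByteBuffer value) {
        if (sorted.size() > BINARY_SEARCH_THRESHOLD)
            return Collections.binarySearch(sorted, value, comparator) >= 0;

        for (ByteBuffer candidate : sorted) {
            int cmp = comparator.compare(candidate, value);
            if (cmp == 0)
                return true;
            if (cmp > 0)
                return false; // every remaining element is larger: stop early
        }
        return false;
    }
}
```

The early exit only helps when the value being checked sorts before some of the list elements; a value larger than all of them still scans the whole list, which is one reason the binary-search branch may be worth measuring.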

For now, these are just ideas. Personally, I would run some JMH benchmarks to compare the different
implementations with different data sets.

{quote}I will add additional unit test cases.{quote}


It would be nice if you could also add a test for the {{counter}} type. If I am not mistaken
your code should fail for that type.

> Support filtering using IN restrictions
> ---------------------------------------
>
>                 Key: CASSANDRA-14344
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Dikang Gu
>            Assignee: Venkata Harikrishna Nukala
>            Priority: Major
>         Attachments: 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -----+------+------+-------
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff0000}InvalidRequest: Error from server: code=2200 [Invalid query] message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>


