cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10436) Index selection should be weighted in favour of custom expressions
Date Fri, 02 Oct 2015 13:43:26 GMT


Sylvain Lebresne commented on CASSANDRA-10436:

For some reason I was convinced this was what we ended up doing in CASSANDRA-10217 so +1 on
both the principle and the patch.

> Index selection should be weighted in favour of custom expressions
> ------------------------------------------------------------------
>                 Key: CASSANDRA-10436
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sam Tunnicliffe
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.0.0 rc2
> If a SELECT contains a custom index expression (CASSANDRA-10217), that should always
be chosen as the primary expression during query execution. Should the statement contain other
expressions which can be satisfied by a built-in index, we don't currently have the ability
to apply the custom expression as a filter. What's more, the method of selecting which index
to use is fairly primitive (and cannot be overridden until CASSANDRA-10214), so we should
ensure that a custom expression, if present, is always chosen. 
> Suppose we have a custom index implementation which provides prefix matching on text:
> {code}
> CREATE TABLE ks.t (k int, v1 int, v2 text, PRIMARY KEY(k));
> CREATE INDEX v1_idx ON ks.t(v1);
> CREATE CUSTOM INDEX v2_idx ON ks.t(v2) USING 'com.example.CustomIndex';
> INSERT INTO ks.t(k, v1, v2) VALUES(0, 0, 'abc');
> INSERT INTO ks.t(k, v1, v2) VALUES(1, 1, 'def');
> SELECT * FROM ks.t WHERE v1=0 AND expr(v2_idx, 'd*') ALLOW FILTERING;
> {code}
> In the above example the expected result would contain no rows, which would be the case
if {{v2_idx}} is selected as the primary (i.e. most selective) index during query execution.
However, if {{v1_idx}} is chosen instead, the results of its lookup will have no further filter
applied and so an incorrect result will be returned.  
> Note: this has always been something of an issue for custom indexes as the expressions
they support may not be natively filterable by C*. For example, with the full text search
syntax used by Stratio & DSE Search, if the custom index isn't selected the filtering
will erroneously remove all rows as the value of the dummy column does not match the Lucene/Solr
search expression literal. It's probably a fairly minor concern as in most cases a query using
a custom index will not include other expressions (usually because custom indexes are per-row
indexes, and so can support multi-field expression syntax). Also, an index implementation
can return a very low estimated result count to try to ensure it is selected; custom
expressions just provide an opportunity to improve the situation.
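
The selection problem described in the issue can be illustrated with a small simulation. This is only a sketch in Python, not Cassandra's actual code: the names Expression, select_primary and execute, the prefer_custom flag, and the estimated_rows values are all hypothetical. It assumes that the primary expression drives the lookup, that built-in expressions can be applied afterwards as filters, and that custom expressions cannot be filtered and are skipped when they are not the primary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]

# The two rows inserted in the example from the issue description.
ROWS: List[Row] = [
    {"k": 0, "v1": 0, "v2": "abc"},
    {"k": 1, "v1": 1, "v2": "def"},
]


@dataclass
class Expression:
    index_name: str
    matches: Callable[[Row], bool]
    is_custom: bool      # True for expr(index, '...') style expressions
    estimated_rows: int  # hypothetical estimate reported by the index


def select_primary(expressions: List[Expression], prefer_custom: bool) -> Expression:
    """Pick the expression whose index drives the lookup."""
    if prefer_custom:
        custom = [e for e in expressions if e.is_custom]
        if custom:
            return custom[0]
    # Otherwise fall back to the lowest estimated result count.
    return min(expressions, key=lambda e: e.estimated_rows)


def execute(expressions: List[Expression], prefer_custom: bool) -> List[Row]:
    primary = select_primary(expressions, prefer_custom)
    rows = [r for r in ROWS if primary.matches(r)]
    for e in expressions:
        # Assumption: a custom expression cannot be applied as a filter,
        # so it is silently skipped when it is not the primary.
        if e is not primary and not e.is_custom:
            rows = [r for r in rows if e.matches(r)]
    return rows


# WHERE v1=0 AND expr(v2_idx, 'd*')
v1_eq = Expression("v1_idx", lambda r: r["v1"] == 0, False, estimated_rows=1)
v2_prefix = Expression("v2_idx", lambda r: str(r["v2"]).startswith("d"), True, estimated_rows=5)
query = [v1_eq, v2_prefix]

wrong = execute(query, prefer_custom=False)  # v1_idx chosen, prefix never applied
right = execute(query, prefer_custom=True)   # v2_idx chosen, v1=0 applied as a filter

# The workaround mentioned in the note: report a very low estimate.
low_estimate = Expression("v2_idx", v2_prefix.matches, True, estimated_rows=0)
workaround = execute([v1_eq, low_estimate], prefer_custom=False)
```

Under these assumptions, selecting v1_idx returns the row with k=0 even though its v2 value does not match the prefix, while selecting v2_idx returns no rows, which is the expected result. Reporting a very low estimate gets the same outcome, but only as long as no other index reports a lower one.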

This message was sent by Atlassian JIRA