jackrabbit-oak-issues mailing list archives

From "Alex Parvulescu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-446) Query: implement an intersecting index
Date Wed, 11 Dec 2013 12:42:12 GMT

     [ https://issues.apache.org/jira/browse/OAK-446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated OAK-446:
--------------------------------

    Fix Version/s:     (was: 0.14)

> Query: implement an intersecting index
> --------------------------------------
>
>                 Key: OAK-446
>                 URL: https://issues.apache.org/jira/browse/OAK-446
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core, query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Minor
>
> In order to run queries with multiple conditions efficiently, it is currently necessary
> to create a single index covering all of those conditions. For example, the query:
> {code}
> where lastName = 'x' and firstName = 'y'
> {code}
> will only run efficiently (assuming there are many nodes with the same lastName and many
> nodes with the same firstName) if there is a combined index on both lastName _and_ firstName.
> If there are two indexes, one just on lastName and the other just on firstName, then only
> one of those indexes is used, not both.
> The problem applies not only to properties but also to node types. A query
> of the form
> {code}
> select * from [acme:Page] where [x] = 'y'
> {code}
> will use either an index on the node type, or an index on 'x', but not both. It seems
> such queries are quite important in JCR.
> To speed up such queries, I suggest we implement a (virtual) 'intersecting index' that
> internally merges the results from multiple (two or more) indexes. To do that, the indexes
> need to have a common property, for example the path.
> For example, suppose the first index is on lastName and path, and the second index is on
> firstName and path. The intersecting index would then query the first index with
> lastName = 'x', and then query the second index with firstName = 'y' and path >= '...'
> (the value returned by the first index). This would go back and forth until a row is found
> that satisfies both conditions (the intersection could be empty, of course).
> To make this work, index implementations should support path lookup.
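
The back-and-forth lookup described above can be sketched roughly as follows. This is only an illustration in Python, not Oak code: `SortedPathIndex`, `seek` and `intersect` are hypothetical names, and the sketch assumes each index can return the smallest path greater than or equal to a given path for a given property value (the "path lookup" mentioned above).

```python
from bisect import bisect_left


class SortedPathIndex:
    """Hypothetical single-property index: value -> sorted list of node paths."""

    def __init__(self, entries):
        # entries: iterable of (value, path) pairs
        self._paths = {}
        for value, path in entries:
            self._paths.setdefault(value, []).append(path)
        for paths in self._paths.values():
            paths.sort()

    def seek(self, value, min_path):
        """Return the smallest path >= min_path stored under value, or None."""
        paths = self._paths.get(value, [])
        i = bisect_left(paths, min_path)
        return paths[i] if i < len(paths) else None


def intersect(first, first_value, second, second_value):
    """Yield paths found in both indexes, alternating lookups between them."""
    candidate = first.seek(first_value, "")
    while candidate is not None:
        match = second.seek(second_value, candidate)
        if match is None:
            return  # second index has nothing at or after candidate
        if match == candidate:
            yield candidate
            # Move strictly past the match: candidate + "\0" is the smallest
            # string that sorts after candidate.
            candidate = first.seek(first_value, candidate + "\0")
        else:
            # The second index skipped ahead; let the first index catch up.
            candidate = first.seek(first_value, match)


# Made-up sample data for the lastName = 'x' and firstName = 'y' query.
last_name = SortedPathIndex([("x", "/a"), ("x", "/b"), ("x", "/d"), ("z", "/c")])
first_name = SortedPathIndex([("y", "/b"), ("y", "/c"), ("y", "/d")])
print(list(intersect(last_name, "x", first_name, "y")))
```

Each lookup lets one index skip every path the other index has already ruled out, so neither result set has to be read in full when the intersection is small.
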
> To speed up cost calculation for the intersecting index, it may be necessary to extend
> the QueryIndex interface to return the list of property restrictions an index supports.
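
One way to picture that extension, purely as an assumption: if each index reported the properties it can restrict on, a planner could pick the indexes to intersect without asking every index for a cost first. `IndexInfo` and `choose_indexes` below are hypothetical names and do not reflect the actual QueryIndex interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical descriptor; not part of Oak's QueryIndex interface."""
    name: str
    supported_properties: frozenset


def choose_indexes(conditions, indexes):
    """Map each condition property to the first index that reports supporting it."""
    chosen = {}
    for prop in conditions:
        for index in indexes:
            if prop in index.supported_properties:
                chosen[prop] = index.name
                break
    return chosen


# Made-up index names for illustration only.
indexes = [
    IndexInfo("lastNameIndex", frozenset({"lastName"})),
    IndexInfo("firstNameIndex", frozenset({"firstName"})),
]
print(choose_indexes(["lastName", "firstName", "age"], indexes))
```

Properties with no supporting index (here `age`) are simply left out of the result and would have to be checked against the candidate rows instead.
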
> I don't currently see this as a very high priority because we have not yet run into big
> performance problems here, and the Lucene index will probably not benefit from such a feature.
> But I would like to keep the issue open so we have a plan in case we do run into performance
> problems.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
