jackrabbit-oak-issues mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-622) Improve QueryIndex interface
Date Mon, 03 Mar 2014 15:41:21 GMT

     [ https://issues.apache.org/jira/browse/OAK-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-622:

    Attachment: OAK-622.patch

An updated patch (WIP). The advanced index is an optional interface (the current QueryIndex
interface is still supported and used by the QueryEngine).

> Improve QueryIndex interface
> ----------------------------
>                 Key: OAK-622
>                 URL: https://issues.apache.org/jira/browse/OAK-622
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: query
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Minor
>             Fix For: 1.0
>         Attachments: OAK-622.patch
> The current QueryIndex interface is quite simple, but it doesn't address some required
> features and more advanced optimizations that are possible:
> - For fulltext queries, it doesn't address the case where the index implementation has
>   a different understanding of the fulltext condition than what is described in the JCR spec
>   (the basic features).
> - For queries with "order by", it would be good to know whether the index supports returning
>   the data in sorted order and, if so, how much slower that would be (if it is slower). So
>   an index might have multiple strategies with different costs.
> - It's quite easy to misunderstand what getCost is supposed to do exactly. The new API
>   should have a clearer solution here.
> - Even if the query doesn't have "order by", the index might return the data in sorted
>   order, which might help improve query performance (using a merge join).
> - The cost is currently a single value. It might be better to estimate the number of
>   nodes, the cost to run a query, and the cost per node. That way we could optimize to quickly
>   return the first few nodes (versus optimizing for throughput).
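
To illustrate the last point, here is a minimal Java sketch of a three-part cost estimate (a fixed cost to run the query, a cost per node, and an estimated node count). All names here (IndexPlanSketch, Plan, costPerExecution, costPerEntry, estimatedEntryCount, cheapest) are hypothetical and chosen for illustration only. They are not taken from the attached patch or from the existing QueryIndex interface.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class IndexPlanSketch {

    /** Hypothetical plan: one strategy an index could offer for a query. */
    static final class Plan {
        final String name;
        final double costPerExecution;  // fixed cost to start the query
        final double costPerEntry;      // cost to read one node
        final long estimatedEntryCount; // nodes the index expects to return

        Plan(String name, double costPerExecution, double costPerEntry,
                long estimatedEntryCount) {
            this.name = name;
            this.costPerExecution = costPerExecution;
            this.costPerEntry = costPerEntry;
            this.estimatedEntryCount = estimatedEntryCount;
        }

        /** Estimated cost of reading at most `limit` nodes with this plan. */
        double costFor(long limit) {
            long entries = Math.min(limit, estimatedEntryCount);
            return costPerExecution + costPerEntry * entries;
        }
    }

    /** Picks the plan with the lowest estimated cost for the given limit. */
    static Optional<Plan> cheapest(List<Plan> plans, long limit) {
        return plans.stream()
                .min(Comparator.comparingDouble((Plan p) -> p.costFor(limit)));
    }

    public static void main(String[] args) {
        List<Plan> plans = Arrays.asList(
                new Plan("lowStartup", 1.0, 2.0, 1000),
                new Plan("lowPerNode", 50.0, 1.0, 1000));

        // First 10 nodes: 1 + 2*10 = 21 versus 50 + 1*10 = 60
        System.out.println(cheapest(plans, 10).get().name);   // prints "lowStartup"

        // All 1000 nodes: 1 + 2*1000 = 2001 versus 50 + 1*1000 = 1050
        System.out.println(cheapest(plans, 1000).get().name); // prints "lowPerNode"
    }
}
```

With a single cost value, the two plans above could not be told apart for the two cases. Splitting the estimate lets the caller prefer a low startup cost when only the first few nodes are needed, and a low per-node cost when the full result is read.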
