jackrabbit-oak-issues mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-5899) PropertyDefinitions should allow for some tweakability to declare usefulness
Date Thu, 09 Mar 2017 14:17:38 GMT

    [ https://issues.apache.org/jira/browse/OAK-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903107#comment-15903107 ]

Thomas Mueller commented on OAK-5899:

> to me that we should be able to get a histogram of num-doc-per-term-per-field in lucene

I'm afraid I don't know how easy it would be to get that information from a Lucene index;
maybe [~teofili] knows a way? An alternative might be to calculate it when updating the index,
but I'm not convinced that's easier (without impacting performance too much). What I would
try instead, or at least first, is to use what we already have, that is, field info (number of
terms or so), possibly combined with the index size. That won't give you a histogram, though,
only selectivity at best.
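
To illustrate the "use what we have" idea, here is a minimal sketch in plain Java. It assumes three counts are already available per field: the number of documents that have the field, the number of distinct terms, and the total number of indexed documents (used here as a stand-in for "index size"). In Lucene these might come from statistics such as Terms.getDocCount() and Terms.size(), which can also return -1 when the codec does not track them. The class and method names are hypothetical and not part of Oak or Lucene.

```java
/** Hypothetical helper, not part of Oak or Lucene. */
final class SelectivityEstimate {

    private SelectivityEstimate() {
    }

    /**
     * Average number of documents per distinct term of a field.
     * Assumes terms are spread evenly over documents, which is exactly
     * the information a real histogram would correct.
     * Returns -1 if either input is unknown (negative) or there are no terms.
     */
    static double averageDocsPerTerm(long docsWithField, long distinctTerms) {
        if (docsWithField < 0 || distinctTerms <= 0) {
            return -1;
        }
        return (double) docsWithField / distinctTerms;
    }

    /**
     * Estimated fraction of all indexed documents matched by an equality
     * restriction on the field, capped at 1.0.
     * Returns -1 if the estimate cannot be computed.
     */
    static double selectivity(long docsWithField, long distinctTerms, long totalDocs) {
        double avg = averageDocsPerTerm(docsWithField, distinctTerms);
        if (avg < 0 || totalDocs <= 0) {
            return -1;
        }
        return Math.min(1.0, avg / totalDocs);
    }
}
```

With 1000 documents carrying the field, 50 distinct terms and 10000 documents overall, this yields 20 documents per term on average. A skewed field (one very common value, many rare ones) would produce the same number, which is why this gives a single selectivity figure rather than a histogram.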

> PropertyDefinitions should allow for some tweakability to declare usefulness
> ----------------------------------------------------------------------------
>                 Key: OAK-5899
>                 URL: https://issues.apache.org/jira/browse/OAK-5899
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Vikas Saurabh
>            Priority: Minor
>             Fix For: 1.8
> At times, we have property definitions which are added to support dense results right
> out of the index (e.g. {{contains(\*, 'foo') AND \[bar]='baz'}}).
> In such cases, the added property definition "might" not be the best one to answer queries
> which only have the property restriction (e.g. only {{\[bar]='baz'}}).
> There should be a way for a property definition to declare this. Maybe there are cases
> of some spectrum too, i.e. not only a boolean usable-or-not, but some kind of scale of how
> usable it is.

This message was sent by Atlassian JIRA
