jackrabbit-oak-issues mailing list archives

From "Dirk Rudolph (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (OAK-7109) rep:facet returns wrong results for complex queries
Date Wed, 03 Jan 2018 12:13:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309559#comment-16309559 ]

Dirk Rudolph edited comment on OAK-7109 at 1/3/18 12:12 PM:
------------------------------------------------------------

Here is an example where constraints get lost in the filter:

{code}
select * from [nt:base]
where ([propa] = 'true' and [propb] in('foo','bar'))
   or ([propa] = 'false' and not([propb] in('foo','bar')))
{code}

It implements a kind of whitelisting/blacklisting along the lines of "if a is set to true, b
must be in a configured set; otherwise, b must not be in that set." The resulting plan is:

{code}
[nt:base] as [nt:base] /* lucene:test2(/oak:index/test2) propa:[* TO *] where [nt:base].[propa]
is not null */
{code}

The plan contains no constraint on propb at all, so facet counting will be wrong in this case as well.

As you can see, the query is in DNF, and querying with each of its disjuncts individually
works well. I attached a unit test that demonstrates this specific example.
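
To illustrate the effect outside of Oak, here is a minimal sketch in Python. It assumes plain dicts as stand-ins for nodes; the names `full_query`, `reduced_filter` and the four sample nodes are made up for illustration and are not part of Oak or the attached test. Every sample node has propb set, so the sketch says nothing about how a missing property is treated.

```python
# Toy model of the example query; plain dicts stand in for JCR nodes.
# All names here are illustrative and not part of the Oak API.

ALLOWED = {"foo", "bar"}


def first_disjunct(node):
    """[propa] = 'true' and [propb] in ('foo','bar')"""
    return node["propa"] == "true" and node["propb"] in ALLOWED


def second_disjunct(node):
    """[propa] = 'false' and not([propb] in ('foo','bar'))"""
    return node["propa"] == "false" and node["propb"] not in ALLOWED


def full_query(node):
    """The original WHERE clause: the OR of both disjuncts."""
    return first_disjunct(node) or second_disjunct(node)


def reduced_filter(node):
    """What the plan above hands to the index: propa is not null."""
    return node.get("propa") is not None


nodes = [
    {"path": "/n1", "propa": "true", "propb": "foo"},   # first disjunct
    {"path": "/n2", "propa": "true", "propb": "baz"},   # neither
    {"path": "/n3", "propa": "false", "propb": "baz"},  # second disjunct
    {"path": "/n4", "propa": "false", "propb": "bar"},  # neither
]

expected = [n["path"] for n in nodes if full_query(n)]
raw_index = [n["path"] for n in nodes if reduced_filter(n)]

# Running each disjunct as its own query and merging the paths
# reproduces the expected result.
per_disjunct = sorted(
    {n["path"] for n in nodes if first_disjunct(n)}
    | {n["path"] for n in nodes if second_disjunct(n)}
)

print("expected:    ", expected)
print("raw index:   ", raw_index)
print("per disjunct:", per_disjunct)
```

In this toy data set the reduced filter admits all four nodes while the full query matches only two, so anything counted on the raw index result (such as facets) is too high.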




> rep:facet returns wrong results for complex queries
> ---------------------------------------------------
>
>                 Key: OAK-7109
>                 URL: https://issues.apache.org/jira/browse/OAK-7109
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>    Affects Versions: 1.6.7
>            Reporter: Dirk Rudolph
>              Labels: facet
>         Attachments: facetsInMultipleRoots.patch, restrictionPropagationTest.patch
>
>
> Complex queries in this case are queries that are passed to Lucene without all of their
> original constraints, for example queries with multiple path restrictions like:
> {code}
> select [rep:facet(simple/tags)] from [nt:base] as a where contains(a.[*], 'ipsum') and (isdescendantnode(a,'/content1') or isdescendantnode(a,'/content2'))
> {code}
> In that particular case the index planner passes only ":fulltext:ipsum" to Lucene, even though
> the index supports evaluating path constraints.
> As facet counting happens on the raw Lucene result, the returned facets are incorrect.
> For example, given the following content
> {code}
> /content1/test/foo
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> /content2/test/bar
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> /content3/test/bar
>  + text = lorem ipsum
>  - simple/
>   + tags = tag1, tag2
> {code}
> the expected result for the dimensions of simple/tags and the query above is 
> - tag1: 2
> - tag2: 2
> as the result set contains 2 results and all documents are identical. The actual result
> is
> - tag1: 3
> - tag2: 3
> as the path constraint is not handled by lucene.
> To work around that, the only solution that came to my mind is building the [disjunctive
> normal form|https://en.wikipedia.org/wiki/Disjunctive_normal_form] of my complex query and
> executing a query for each of its disjuncts. As this expands exponentially,
> it's only a theoretical solution, nothing for production.
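
The counts in the quoted description can be reproduced with a minimal Python sketch. It assumes the three nodes listed there, modelled as plain dicts; the helpers `fulltext`, `is_descendant` and `facet_counts` are hypothetical stand-ins and not Oak or Lucene API. The last part models the per-disjunct workaround, with the additional assumption that results are merged by path before counting so that overlapping disjuncts are not counted twice.

```python
from collections import Counter

# Toy stand-ins for the three nodes from the description (not Oak API).
docs = [
    {"path": "/content1/test/foo", "text": "lorem ipsum", "tags": ["tag1", "tag2"]},
    {"path": "/content2/test/bar", "text": "lorem ipsum", "tags": ["tag1", "tag2"]},
    {"path": "/content3/test/bar", "text": "lorem ipsum", "tags": ["tag1", "tag2"]},
]


def fulltext(doc, term):
    """Rough stand-in for contains(a.[*], term)."""
    return term in doc["text"].split()


def is_descendant(doc, root):
    """Rough stand-in for isdescendantnode(a, root)."""
    return doc["path"].startswith(root.rstrip("/") + "/")


def facet_counts(results):
    """Count every tag value over the given result documents."""
    return Counter(tag for doc in results for tag in doc["tags"])


roots = ["/content1", "/content2"]

# What the index evaluates: only the fulltext constraint.
raw = [d for d in docs if fulltext(d, "ipsum")]
# What the query returns once the path restrictions are applied.
filtered = [d for d in raw if any(is_descendant(d, r) for r in roots)]

actual = facet_counts(raw)         # counted on the raw index result
expected = facet_counts(filtered)  # counted on the real result set

# Workaround: one query per disjunct, merged by path before counting.
by_path = {}
for root in roots:
    for d in docs:
        if fulltext(d, "ipsum") and is_descendant(d, root):
            by_path[d["path"]] = d
workaround = facet_counts(by_path.values())

print("actual:    ", dict(actual))
print("expected:  ", dict(expected))
print("workaround:", dict(workaround))
```

The workaround needs one query per disjunct, and a conjunction of n two-way OR clauses expands into 2^n disjuncts, which is why it does not scale.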



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
