jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Created] (OAK-2568) Ignore redundant IS NOT NULL constraints
Date Tue, 03 Mar 2015 11:15:04 GMT
Chetan Mehrotra created OAK-2568:
------------------------------------

             Summary: Ignore redundant IS NOT NULL constraints 
                 Key: OAK-2568
                 URL: https://issues.apache.org/jira/browse/OAK-2568
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: oak-lucene
            Reporter: Chetan Mehrotra
            Assignee: Chetan Mehrotra
             Fix For: 1.0.12


Queries like the one below can at times take a long time to evaluate with LucenePropertyIndex:

{code}
SELECT * FROM [nt:unstructured] as content WHERE ISDESCENDANTNODE('/content/dam/en/us')
and(
    content.[tags] = 'Products:A'
    or content.[tags] = 'Products:A/B'
    or content.[tags] = 'Products:A/B'
    or content.[tags] = 'Products:A'
)
and(
    content.[tags] = 'DocTypes:A'
    or content.[tags] = 'DocTypes:B'
    or content.[tags] = 'DocTypes:C'
    or content.[tags] = 'ProblemType:A'
)
and(
    content.[hasRendition] IS NULL
    or content.[hasRendition] = 'false'
)
{code}

The above SQL query translates to the following plans:

*Plan on 1.0 branch*
{noformat}
[nt:unstructured] as [content] /* lucene:test1(/oak:index/test1) +tags:[* TO *] +(tags:Products:A tags:Products:A/B tags:Products:A/B tags:Products:A) +(tags:DocTypes:A tags:DocTypes:B tags:DocTypes:C tags:ProblemType:A)
  where ((((isdescendantnode([content], [/content/dam/en/us]))
  and ([content].[tags] is not null))
  and ([content].[tags] in(cast('Products:A' as string), cast('Products:A/B' as string), cast('Products:A/B' as string), cast('Products:A' as string))))
  and ([content].[tags] is not null))
  and ([content].[tags] in(cast('DocTypes:A' as string), cast('DocTypes:B' as string), cast('DocTypes:C' as string), cast('ProblemType:A' as string))) */
{noformat}

Note the extra not-null property restriction, which translates in Lucene to {{+tags:\[* TO *\]}}.

*Plan on trunk*
{noformat}
[nt:unstructured] as [content] /* lucene:test1(/oak:index/test1) +(tags:Products:A tags:Products:A/B) +(tags:DocTypes:A tags:DocTypes:B tags:DocTypes:C tags:ProblemType:A)
  where (isdescendantnode([content], [/content/dam/en/us]))
  and ([content].[tags] in('Products:A', 'Products:A/B'))
  and ([content].[tags] in('DocTypes:A', 'DocTypes:B', 'DocTypes:C', 'ProblemType:A')) */
{noformat}

{color:brown}This one does not have the extra not null constraint{color}

The query performs slower with Lucene because the property existence query, i.e. the not-null
constraint, is currently evaluated as a range query in Lucene, which appears to be somewhat
expensive to evaluate.

As shown above, on trunk the QueryEngine appears to perform this optimization
on its own (possibly done with [1610723|http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/query/ast/OrImpl.java?r1=1610723&r2=1610722&pathrev=1610723]
as part of OAK-1965). This change is not present in the branch.

Given that the change in OAK-1965 was quite big, it would be better to perform this optimization
in {{LucenePropertyIndex}} itself.
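
The idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual {{LucenePropertyIndex}} code: {{Restriction}} and {{RedundantNotNullFilter}} are hypothetical stand-ins rather than Oak classes, all restrictions are assumed to be ANDed together, and a restriction with no values models {{IS NOT NULL}}. A not-null restriction is dropped whenever the same property also carries a value restriction, since matching a value already implies the property exists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a property restriction; not Oak's own class.
final class Restriction {
    final String property;
    final List<String> values; // empty list models "IS NOT NULL"

    Restriction(String property, String... values) {
        this.property = property;
        this.values = Arrays.asList(values);
    }

    boolean isNotNullOnly() {
        return values.isEmpty();
    }
}

// Hypothetical helper illustrating the proposed optimization.
final class RedundantNotNullFilter {

    // Assumes all restrictions are ANDed. Drops "IS NOT NULL" restrictions on
    // any property that also has a value restriction (equality or IN list).
    static List<Restriction> prune(List<Restriction> restrictions) {
        // First pass: collect properties constrained by at least one value restriction.
        Set<String> constrained = new HashSet<>();
        for (Restriction r : restrictions) {
            if (!r.isNotNullOnly()) {
                constrained.add(r.property);
            }
        }
        // Second pass: keep everything except the redundant not-null restrictions.
        List<Restriction> result = new ArrayList<>();
        for (Restriction r : restrictions) {
            if (r.isNotNullOnly() && constrained.contains(r.property)) {
                continue; // redundant: a value match already implies existence
            }
            result.add(r);
        }
        return result;
    }

    public static void main(String[] args) {
        // Restrictions mirroring the plan on the 1.0 branch.
        List<Restriction> plan = Arrays.asList(
                new Restriction("tags"),
                new Restriction("tags", "Products:A", "Products:A/B"),
                new Restriction("tags"),
                new Restriction("tags", "DocTypes:A", "DocTypes:B", "DocTypes:C", "ProblemType:A"));
        for (Restriction r : prune(plan)) {
            System.out.println(r.property + " in " + r.values);
        }
    }
}
```

With the restrictions from the 1.0 plan, only the two value restrictions on {{tags}} remain, so no {{+tags:\[* TO *\]}} clause would need to be generated. A not-null restriction on a property that has no other restriction is kept, because there it is the only constraint.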



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
