jackrabbit-oak-issues mailing list archives

From "Vikas Saurabh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-6735) Lucene Index: improved cost estimation by using document count per field
Date Thu, 02 Nov 2017 08:58:08 GMT

    [ https://issues.apache.org/jira/browse/OAK-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235420#comment-16235420 ]

Vikas Saurabh commented on OAK-6735:

bq. Also, it looks like properties with names ending with "_facet" have a special meaning.
What if a customer uses such property names... don't we have a good escape mechanism (for
example using the ":" prefix)?
Unfortunately, we don't have a good escape mechanism for naming fields, and changing it
now would have migration impacts. That's why we're still tolerating it :-/.

bq. is "IndexStatistics.failReadingFieldJcrTitle" just used for testing the "fail reading
Yes. I tried several ways to build a wrapping IndexReader that would fail on demand,
but each attempt required a lot of code, which seemed wrong to me. That said,
I'd much prefer to use an on-demand-failing index reader somehow.
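
For illustration, the "fail on demand" idea can be sketched as a small decorator. This is a simplified sketch, not the Lucene IndexReader API: FieldDocCounter and FailingOnDemandCounter are hypothetical names invented for this example, and a real wrapping IndexReader would need to delegate far more methods, which is the cost mentioned above.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the small part of a reader that returns per-field statistics.
interface FieldDocCounter {
    int docCount(String field) throws IOException;
}

// Decorator that delegates normally, but throws for fields registered via failOn().
final class FailingOnDemandCounter implements FieldDocCounter {
    private final FieldDocCounter delegate;
    private final Set<String> failingFields = new HashSet<>();

    FailingOnDemandCounter(FieldDocCounter delegate) {
        this.delegate = delegate;
    }

    // Mark a field so that the next read of its statistics fails.
    void failOn(String field) {
        failingFields.add(field);
    }

    @Override
    public int docCount(String field) throws IOException {
        if (failingFields.contains(field)) {
            throw new IOException("synthetic failure reading field " + field);
        }
        return delegate.docCount(field);
    }
}
```

A test would wrap the real counter, call failOn(...) for one field, and check that the caller handles the IOException.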

bq. In that case, I would clearly mark this as a facility to simplify testing... As it is
now, it is misleading.
So, maybe failing for a field name like "synthetically-falliable-field" would be ok?

> Lucene Index: improved cost estimation by using document count per field
> ------------------------------------------------------------------------
>                 Key: OAK-6735
>                 URL: https://issues.apache.org/jira/browse/OAK-6735
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene, query
>    Affects Versions: 1.7.4
>            Reporter: Thomas Mueller
>            Assignee: Vikas Saurabh
>            Priority: Major
>             Fix For: 1.8, 1.7.11
>         Attachments: IndexReadPattern.txt, LuceneIndexReadPattern.java, OAK-6735.patch
> The cost estimation of the Lucene index is somewhat inaccurate because it
> just uses the number of documents in the index (as of Oak 1.7.4 by default, due to OAK-6333).
> Instead, it should use the number of documents for the given fields (the minimum, if
> there are multiple fields with restrictions),
> divided by the number of restrictions (as we do now already).
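
The estimate described above can be sketched as follows. This is a minimal illustration and not Oak's implementation: the class and method names are hypothetical, and falling back to the total document count for fields without statistics (or when there are no restrictions) is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LuceneCostSketch {

    // Estimated cost = min(document count over restricted fields) / number of restrictions.
    // Assumption: a field with no known count contributes the total document count.
    static long estimateCost(long totalDocs,
                             Map<String, Long> docCountPerField,
                             List<String> restrictedFields) {
        if (restrictedFields.isEmpty()) {
            return totalDocs; // assumption: no restrictions means the whole index
        }
        long min = totalDocs;
        for (String field : restrictedFields) {
            Long count = docCountPerField.get(field);
            if (count != null) {
                min = Math.min(min, count);
            }
        }
        return min / restrictedFields.size();
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("jcr:title", 100L);
        counts.put("jcr:description", 40L);

        // Two restrictions: min(100, 40) / 2
        System.out.println(estimateCost(1000, counts,
                Arrays.asList("jcr:title", "jcr:description"))); // prints 20
        // One restriction: 100 / 1
        System.out.println(estimateCost(1000, counts,
                Arrays.asList("jcr:title"))); // prints 100
    }
}
```

With only the total document count, both queries above would be estimated from 1000 documents, which is the inaccuracy the issue describes.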

This message was sent by Atlassian JIRA
