jackrabbit-oak-issues mailing list archives

From "Vikas Saurabh (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-8184) With statistical mode, facet count seems having higher error rate than expected
Date Tue, 02 Apr 2019 21:46:00 GMT

     [ https://issues.apache.org/jira/browse/OAK-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikas Saurabh updated OAK-8184:
-------------------------------
    Component/s:     (was: indexing)

> With statistical mode, facet count seems having higher error rate than expected
> -------------------------------------------------------------------------------
>
>                 Key: OAK-8184
>                 URL: https://issues.apache.org/jira/browse/OAK-8184
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: query, search
>    Affects Versions: 1.6.16
>            Reporter: Kelvin Xu
>            Priority: Major
>         Attachments: image-2019-03-29-10-59-03-699.png, image-2019-03-29-10-59-17-163.png,
image-2019-03-29-11-00-11-094.png, image-2019-03-29-11-00-16-305.png
>
>
> Facet counts drift from the actual values, and the drift is most obvious on small counts.
Usually a count is off by 1, but we have also seen differences as large as 20 or 30. Here is one
example. Consider this query, run by a non-admin user:
> {code:java}
> 1_group.propertyvalues.extractFacet=true
> 1_group.propertyvalues.property=jcr:content/metadata/msft:associatedCampaign
> 2_group.0_path=/content/dam/microsoft/rad/
> 2_group.p.or=true
> orderby=jcr:content/jcr:lastModified
> orderby.sort=desc
> p.facetStrategy=oak
> p.facets=true
> p.guessTotal=250
> p.limit=-1
> p.offset=0
> property=jcr:content/metadata/msft:lifecycleStatus
> property.10_value=microsoft:studios/lifecycleStatus/Created
> property.1_value=Created
> property.2_value=Under Review
> property.3_value=Rejected
> property.4_value=Approved
> property.5_value=Published
> property.6_value=microsoft:search-marketing/lifecycleStatus/Approved
> property.7_value=microsoft:search-marketing/lifecycleStatus/Created
> property.8_value=microsoft:studios/lifecycleStatus/Approved
> property.9_value=microsoft:studios/lifecycleStatus/UnderReview
> type=dam:Asset
> {code}
> This is what it returns. Notice that one of the facets, `/content/dam/microsoft/rad/public-campaign`,
has a count of 1.
> !image-2019-03-29-10-59-17-163.png!
> If we add this facet value as one of the query conditions, like this:
> {code:java}
> 5_group.1_propertyvalues.0_values=/content/dam/microsoft/rad/public-campaign
> 5_group.1_propertyvalues.extractFacet=true
> 5_group.1_propertyvalues.property=jcr:content/metadata/msft:associatedCampaign
> 2_group.0_path=/content/dam/microsoft/rad/
> 2_group.p.or=true
> orderby=jcr:content/jcr:lastModified
> orderby.sort=desc
> p.facetStrategy=oak
> p.facets=true
> p.guessTotal=250
> p.limit=-1
> p.offset=0
> property=jcr:content/metadata/msft:lifecycleStatus
> property.10_value=microsoft:studios/lifecycleStatus/Created
> property.1_value=Created
> property.2_value=Under Review
> property.3_value=Rejected
> property.4_value=Approved
> property.5_value=Published
> property.6_value=microsoft:search-marketing/lifecycleStatus/Approved
> property.7_value=microsoft:search-marketing/lifecycleStatus/Created
> property.8_value=microsoft:studios/lifecycleStatus/Approved
> property.9_value=microsoft:studios/lifecycleStatus/UnderReview
> type=dam:Asset
> {code}
> We get this result. As you can see, the actual count is 2.
> !image-2019-03-29-11-00-16-305.png!
> Is this expected behavior? We are even seeing counts being off on large result sets. This
makes for a poor user experience, and we expected the error rate to be much lower than that.
>  
>  
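
For illustration only, here is a minimal sketch of one way a small facet count could come out lower than the real value under a sampling-based estimate. It is not Oak's code. It assumes a simplified model in which the unfiltered facet count is scaled by the fraction of sampled documents the user may read; the class name, method name, and the numbers (a sample of 1000 with 900 accessible) are hypothetical.

```java
public class FacetEstimateSketch {

    // Hypothetical model (an assumption, not Oak's implementation):
    // scale the unfiltered facet count by accessibleInSample / sampleSize.
    // Integer division truncates, which hits small counts the hardest.
    static long estimate(long rawCount, int accessibleInSample, int sampleSize) {
        if (sampleSize <= 0) {
            throw new IllegalArgumentException("sampleSize must be positive");
        }
        return rawCount * accessibleInSample / sampleSize;
    }

    public static void main(String[] args) {
        int sampleSize = 1000;  // assumed sample size
        int accessible = 900;   // assumed number of readable documents in the sample

        // A facet whose raw count is 2 is reported as 1 (2 * 900 / 1000 truncates to 1).
        System.out.println(estimate(2, accessible, sampleSize));

        // A facet whose raw count is 300 is reported as 270.
        System.out.println(estimate(300, accessible, sampleSize));
    }
}
```

Under this assumed model, a facet whose two documents are both readable would still be shown with a count of 1, because the scaling ratio comes from the whole sample and not from that facet's own documents. Whether this matches the drift reported here would need to be confirmed against the Oak source.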



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
