lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-6351) Let Stats Hang off of Pivots (via 'tag')
Date Tue, 04 Nov 2014 01:42:34 GMT

     [ https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Hoss Man updated SOLR-6351:
---------------------------
    Attachment: SOLR-6351.patch


bq. ...i can't understand how/why DistributedFacetPivotSmallAdvancedTest.doTestTopStatsWithRefinement
passes in this patch, and that scares me (the refinement call should pollute the top level
stats, double counting at least one shard) so i really want to get to the bottom of it.

After doing a bunch of manual testing to confirm that the bug really did exist, i started
pouring over the test logs and confirmed there was a stupid bug in the test itself...

{code}
// i had...
ModifiableSolrParams facetForceRefineParams = new ModifiableSolrParams(coreParams);

// should have been...
ModifiableSolrParams facetForceRefineParams = new ModifiableSolrParams(facetParams);
{code}

...so in the request where i was trying to force facet refinement by adding {{FacetParams.FACET_OVERREQUEST_COUNT=0}}
to the params, i wasn't even faceting at all.

Once that silly mistake was fixed, the test started failing as expected.

I then beefed up Vitaliy's new DistributedFacetPivotWhiteBoxTest to include some asserts on
the top level stats, and confirmed those also failed because we're re-computing them on both
the initial request, and the subsequent refinement request.

Once i had those tests failing in a satisfactory way, finding a solution was fairly straight
forward:  stop including {{stats=true}} in pivot refinement requests, and instead make PivotFacetProcessor
recognise when it needs to build up a StatsInfo object to handle refinement requests (rather
then relying on StatsComponent to do it for us).

I think this is basically ready to commit - i'm going to give the patch & issue comments
another going over tomorrow to see if anything pops out at me while my laptop hammers away
at the tests, but if no one has any other feedback i'll commit to trunk & start backporting.

{panel:title=summary of changes}
* DistributedFacetPivotSmallAdvancedTest
** fixed broken test of top level stats
** added some assertion msg strings
* StatsComponent
** removed the last remaining "nocommit" hack that set stats=true on pivot refinements
* PivotFacetProcessor
** changed process() method:
*** init a {{StatsInfo}} object on demand in cases where it's a refinement request & there
is a stats local param
*** call getTaggedStatsFields() to build up the {{List<StatsField>}} once per facet.pivot
param (was being redundently called once per refinement value in old code)
** changed processSingle() to take in the {{List<StatsField>}} as a method param
** tweaked getTaggedStatsFields() method to take in just hte stats local param instead of
the full set of localparams -- simplified the parsing & error handling 
* DistributedFacetPivotWhiteBoxTest
** beefed up test to check top level stats
** tweaked refine params to set {{stats=false}} to match what the code now does
* TestCloudPivotFacet
** fixed javadoc (lint error)
* PivotField
** fixed javadoc (lint error)
{panel}



> Let Stats Hang off of Pivots (via 'tag')
> ----------------------------------------
>
>                 Key: SOLR-6351
>                 URL: https://issues.apache.org/jira/browse/SOLR-6351
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch,
SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch,
SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch,
SOLR-6351.patch, SOLR-6351.patch
>
>
> he goal here is basically flip the notion of "stats.facet" on it's head, so that instead
of asking the stats component to also do some faceting (something that's never worked well
with the variety of field types and has never worked in distributed mode) we instead ask the
PivotFacet code to compute some stats X for each leaf in a pivot.  We'll do this with the
existing {{stats.field}} params, but we'll leverage the {{tag}} local param of the {{stats.field}}
instances to be able to associate which stats we want hanging off of which {{facet.pivot}}
> Example...
> {noformat}
> facet.pivot={!stats=s1}category,manufacturer
> stats.field={!key=avg_price tag=s1 mean=true}price
> stats.field={!tag=s1 min=true max=true}user_rating
> {noformat}
> ...with the request above, in addition to computing the min/max user_rating and mean
price (labeled "avg_price") over the entire result set, the PivotFacet component will also
include those stats for every node of the tree it builds up when generating a pivot of the
fields "category,manufacturer"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message