lucene-solr-user mailing list archives

From Jack Krupansky <jack.krupan...@gmail.com>
Subject Re: Distributed mode for stats component?
Date Wed, 14 Jan 2015 20:57:31 GMT
Thanks, Chris. I just needed to stare at the code I already knew about more
intently to see what was really going on. It's super convoluted and super
confusing. The keys were the handleResponses method in the main component
class and the AbstractStatsValues class that is hidden in the
StatsValuesFactory source file. Oddly, the StatsValues source file doesn't
contain the classes that implement that interface - they're in the
"factory" source file!

BTW, we should have some doc notes on the limitations and performance
implications of the stats component. Although, admittedly, it's moot if
stats is eventually to be superseded by the analytics component.

-- Jack Krupansky

On Wed, Jan 14, 2015 at 12:26 PM, Chris Hostetter <hossman_lucene@fucit.org>
wrote:

>
> : Does anybody know for sure whether the stats component fully supports
> : distributed mode? It is listed in the doc as supporting distributed mode
>
> it's been supported for as long as i can remember -- since Day 1 of the
> StatsComponent i believe.
>
> : (at least for old, non-SolrCloud distrib mode), but... I don't see any code
> : that actually does that. Nor any tests, unless they are hidden somewhere I
> : didn't look.
>
> just like any other SearchComponent: look at StatsComponent.prepare,
> StatsComponent.process, ...distributedProcess, ....modifyRequest,
> ...handleResponses, ...finishStage, etc...
>
>
> : In particular, I am interested in the "countdistinct" parameter which would
> : need to retrieve all distinct values from all other shards to detect
> : whether any of the distinct values overlap between shards.
>
> yep -- that's exactly what it does ... totally naive and not a good idea
> at all for fields with non-trivial cardinality, which is why you have to
> explicitly turn it on with "calcDistinct" and why i want to replace it
> with HyperLogLog approximations...
>
> https://issues.apache.org/jira/browse/SOLR-6968
>
> -Hoss
> http://www.lucidworks.com/
>
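
To illustrate the point quoted above about distinct counts in distributed mode, here is a minimal sketch of the naive merge: each shard ships its full set of distinct values, and the coordinator unions them before counting. It is not Solr's actual code. The class and method names (DistinctMergeSketch, ShardStats, MergedStats, merge) are hypothetical and exist only for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctMergeSketch {

    /** Hypothetical per-shard partial result; not a Solr class. */
    static final class ShardStats {
        final long count;
        final Set<String> distinctValues;

        ShardStats(long count, Set<String> distinctValues) {
            this.count = count;
            this.distinctValues = distinctValues;
        }
    }

    /** Hypothetical coordinator-side accumulator; not a Solr class. */
    static final class MergedStats {
        long count;
        final Set<String> distinctValues = new HashSet<>();

        void accumulate(ShardStats shard) {
            // Plain counts can simply be summed across shards.
            count += shard.count;
            // Distinct counts cannot: the same value may occur on several
            // shards, so the full value sets have to be unioned.
            distinctValues.addAll(shard.distinctValues);
        }

        long countDistinct() {
            return distinctValues.size();
        }
    }

    static MergedStats merge(List<ShardStats> shards) {
        MergedStats merged = new MergedStats();
        for (ShardStats shard : shards) {
            merged.accumulate(shard);
        }
        return merged;
    }

    public static void main(String[] args) {
        ShardStats shard1 = new ShardStats(3, new HashSet<>(Arrays.asList("a", "b", "c")));
        ShardStats shard2 = new ShardStats(2, new HashSet<>(Arrays.asList("b", "d")));
        MergedStats merged = merge(Arrays.asList(shard1, shard2));
        // "b" occurs on both shards, so adding the per-shard distinct
        // counts (3 + 2) would overcount; the union has 4 values.
        System.out.println("count=" + merged.count
                + " countDistinct=" + merged.countDistinct());
    }
}
```

Every shard has to send every distinct value to the coordinator, so memory and network cost grow with field cardinality. That cost is the motivation for the HyperLogLog approximation proposed in SOLR-6968.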
