lucene-solr-user mailing list archives

From jim ferenczi <jim.feren...@gmail.com>
Subject Re: Filter cache pollution during sharded edismax queries
Date Wed, 01 Oct 2014 08:55:50 GMT
I think you should test with facet.shard.limit=-1. This removes the facet
limit on the shards and removes the need for facet refinements. I bet that
returning every facet with a count greater than 0 on internal queries is
cheaper than using the filter cache to handle a lot of refinements.
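
For anyone who wants to try this, here is a minimal sketch of how such a test request could be built. The host, collection name, query text and field name are all hypothetical, and facet.shard.limit is simply the parameter name as I wrote it above; check it against your Solr version before relying on it.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace host and collection with your own.
BASE_URL = "http://localhost:8983/solr/collection1/select"


def build_facet_query(text, facet_field, shard_limit=None):
    """Return a select URL for an edismax query faceting on one field.

    shard_limit is passed through as facet.shard.limit, the parameter name
    used in this message; this sketch does not verify that a given Solr
    version honours it.
    """
    params = [
        ("q", text),
        ("defType", "edismax"),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("rows", "0"),  # only the facet counts matter for this test
    ]
    if shard_limit is not None:
        params.append(("facet.shard.limit", str(shard_limit)))
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Baseline request, then the same request with the shard limit lifted.
    print(build_facet_query("ipod", "category"))
    print(build_facet_query("ipod", "category", shard_limit=-1))
```

Running both URLs against the same index and comparing response times and filter cache statistics should show whether skipping the refinement step helps.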

Jim

2014-10-01 10:24 GMT+02:00 Charlie Hull <charlie@flax.co.uk>:

> On 30/09/2014 22:25, Erick Erickson wrote:
>
>> Just from a 20,000 ft. view, using the filterCache this way seems...odd.
>>
>> +1 for using a different cache, though I say that being quite unfamiliar
>> with the code.
>>
>
> Here's a quick update:
>
> 1. LFUCache performed worse, so we returned to LRUCache.
> 2. Making the cache smaller than the default 512 reduced performance.
> 3. Raising the cache size to 2048 didn't seem to have a significant effect
> on performance but did reduce CPU load significantly. This may help our
> client, as they can reduce their system spec considerably.
>
> We're continuing to test with our client, but the upshot is that even if
> you think you don't need the filter cache, if you're doing distributed
> faceting you probably do, and you should size it based on experimentation.
> In our case there is a single filter but the cache needs to be considerably
> larger than that!
>
> Cheers
>
> Charlie
>
>
>
>> On Tue, Sep 30, 2014 at 1:53 PM, Alan Woodward <alan@flax.co.uk> wrote:
>>
>>
>>>
>>>>> Once all the facets have been gathered, the co-ordinating node then
>>>>> asks the subnodes for an exact count for the final top-N facets,
>>>>>
>>>>
>>>>
>>>> What's the point of refining these counts? I thought that it makes sense
>>>> only for requests with facet.limit set. Is that correct? Can those who
>>>> suffer from low performance just remove the facet.limit to avoid that
>>>> distributed hop?
>>>>
>>>
>>> Presumably yes, but if you've got a sufficiently high cardinality field
>>> then any gains made by missing out the hop will probably be offset by
>>> having to stream all the return values back again.
>>>
>>> Alan
>>>
>>>
>>>> --
>>>> Sincerely yours
>>>> Mikhail Khludnev
>>>> Principal Engineer,
>>>> Grid Dynamics
>>>>
>>>> <http://www.griddynamics.com>
>>>> <mkhludnev@griddynamics.com>
>>>>
>>>
>>>
>>>
>>
>
> --
> Charlie Hull
> Flax - Open Source Enterprise Search
>
> tel/fax: +44 (0)8700 118334
> mobile:  +44 (0)7767 825828
> web: www.flax.co.uk
>
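
Charlie's point above about sizing the filter cache by experiment can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it assumes each refined facet term occupies one cache entry, that the same 1000 terms are refined repeatedly in the same order, and it uses a hand-written LRU rather than Solr's own LRUCache. The numbers are made up for illustration and are not measurements from the thread.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache that only tracks hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key):
        if key in self.entries:
            # Mark the entry as most recently used.
            self.entries.move_to_end(key)
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = True
            if len(self.entries) > self.capacity:
                # Evict the least recently used entry.
                self.entries.popitem(last=False)


def hit_ratio(capacity, distinct_terms, rounds):
    """Replay `rounds` passes over the same refinement terms."""
    cache = LRUCache(capacity)
    for _ in range(rounds):
        for term in range(distinct_terms):
            cache.get_or_compute(("facet_field", term))
    return cache.hits / (cache.hits + cache.misses)


if __name__ == "__main__":
    for size in (256, 512, 2048):
        print(size, hit_ratio(size, distinct_terms=1000, rounds=10))
```

In this toy model a cache smaller than the set of refined terms never gets a hit, because each entry is evicted before it is asked for again, while a cache larger than the set misses only on the first pass. Real traffic is less regular, which is why measuring on your own queries is the only reliable way to pick a size.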
