lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr Cloud has lower performance with more servers
Date Thu, 09 Oct 2014 15:40:35 GMT
Just to check: your index is NOT sharded, correct?

Assuming not sharded, is it SolrCloud? If not SolrCloud, how are the
indexes kept in synch? Master/slave? Manual copy?

But for an unchanging index, this is definitely odd.

Best,
Erick

On Thu, Oct 9, 2014 at 7:40 AM, Walter Underwood <wunder@wunderwood.org> wrote:
> Is this a production log of queries, with lots of repeats? If so, you may be seeing the
> normal effect of lower cache hit rates.
>
> Check the hit rate for the query result cache in the two setups. With a single machine,
> the second occurrence of a query will be a cache hit. With two machines, it will not be if
> the two queries are routed to different machines.
>
> I was running some benchmarks here. With one machine, the query cache had a 50% hit rate.
> With eight machines, it was 20%.
>
> You can address this with a reverse proxy HTTP cache in front of the cluster, something
> like Varnish.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/
>
>
> On Oct 9, 2014, at 7:21 AM, Yannick <yann1806@yahoo.com.INVALID> wrote:
>
>> Hi Toke,
>>
>> thanks for your suggestion - definitely an interesting idea. But unfortunately no,
>> no indexing job is running; those are static indexes being queried. The execution time is
>> also very consistent in each condition; I ran quite a few tests.
>>
>> Yann
>>
>>
>> On Thursday, October 9, 2014 3:56 PM, Toke Eskildsen <te@statsbiblioteket.dk> wrote:
>>
>>
>>
>> On Thu, 2014-10-09 at 15:06 +0200, Yannick wrote:
>>
>>
>>> I created a group of 2 Solr servers with a load-balancer in front
>>> (Haproxy). I have a batch client that sends requests (read-only)
>>> continuously to the load-balancer. The problem is: the performance is
>>> slower with 2 servers than it is with a single server (still via the
>>> load-balancer, with the second server down, so it's not the
>>> load-balancer itself causing the slowdown).
>>
>> (speculating a lot here:)
>>
>> Is another job updating the indexes while you are batch-searching?
>> If so, the slowdown could be explained by the servers' disk caches being
>> flushed by the indexing job. When a request arrives, some cache is
>> reclaimed, but it will be a battle between the update and the search
>> jobs. With more machines, there will be fewer requests per machine, so the
>> search cache has a lower chance of being used again before it is
>> reclaimed by the updater.
>>
>> Still, worse performance for 2 machines sounds pretty bad.
>>
>> - Toke Eskildsen, State and University Library, Denmark
>
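
Walter's cache argument above can be illustrated with a small simulation. This is
only a sketch under stated assumptions, not a model of Solr itself: the query log
is synthetic (a skewed distribution with many repeats), the load balancer is
assumed to be plain round-robin, and each server's cache is assumed large enough
that nothing is evicted. The names hit_rate and log are made up for the example.

```python
import random
from itertools import cycle


def hit_rate(queries, num_servers):
    """Fraction of requests answered from a per-server cache.

    Assumptions (not from the thread): round-robin routing, and caches
    that never evict, so a miss only happens the first time a given
    server sees a given query.
    """
    caches = [set() for _ in range(num_servers)]
    hits = 0
    # cycle() hands out the servers in round-robin order.
    for query, cache in zip(queries, cycle(caches)):
        if query in cache:
            hits += 1
        else:
            cache.add(query)
    return hits / len(queries)


# Synthetic log: 5000 requests over 2000 distinct queries, where a few
# queries are very popular and most are rare (weights fall off as 1/rank).
rng = random.Random(42)
distinct = [f"q{i}" for i in range(2000)]
weights = [1.0 / (i + 1) for i in range(2000)]
log = rng.choices(distinct, weights=weights, k=5000)

for n in (1, 2, 8):
    print(f"{n} server(s): hit rate {hit_rate(log, n):.1%}")
```

With the same log, every server has to miss once on each query it sees, so the
overall hit rate drops as servers are added. The exact percentages depend on the
made-up distribution and say nothing about the numbers in the benchmarks quoted
above.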
