lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr Query Performance benchmarking
Date Fri, 28 Apr 2017 16:02:08 GMT
re: the q vs. fq question. My claim (not verified) is that the fastest
of all would be q=*:*&fq={!cache=false}<your clause>. That would bypass
the scoring that putting the clause in "q" would entail, as well as
bypass the filter cache.
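
To make the three variants concrete, here is a minimal sketch of how
the requests could be built for a side-by-side timing run. The base
URL, collection name and the clause "category:books" are hypothetical
placeholders, not taken from Suresh's setup; only the q, fq and
{!cache=false} syntax comes from the discussion above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, adjust host and collection to your own cluster.
BASE_URL = "http://localhost:8983/solr/mycollection/select"


def build_params(clause):
    """Return the request parameters for the three ways of sending one clause."""
    return {
        # Clause is scored as part of the main query.
        "q_only": {"q": clause},
        # Clause is a filter; the result may be stored in the filterCache.
        "cached_fq": {"q": "*:*", "fq": clause},
        # Clause is a filter, but the filterCache is bypassed.
        "uncached_fq": {"q": "*:*", "fq": "{!cache=false}" + clause},
    }


def to_url(params, base_url=BASE_URL):
    """Percent-encode the parameters and append them to the select handler URL."""
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    for name, params in build_params("category:books").items():
        print(name, to_url(params))
```

Timing each variant with the caches sized down (see below) is what
would actually settle the claim; the sketch only shows the request
shapes.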

But I have to agree with Walter, this is very suspicious IMO. Here's
what I'd do:

Change my solrconfig so that both queryResultCache and filterCache
have a size significantly smaller than the number of distinct queries
I was cycling through in my stress test. If you really want a
worst-case scenario, set the sizes to zero. If that _still_ gives you
responses in the 30-40ms range, you're in great shape. I suspect
Walter and I would be on the same side of a bet that this won't be
true.
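
A toy simulation shows why the cache size relative to the query set
matters so much. This is only a sketch that assumes a plain LRU
eviction policy and a test harness that cycles through its queries in
a fixed order; it is not Solr's cache implementation, and the function
name and numbers are made up for illustration.

```python
from collections import OrderedDict


def hit_rate(num_distinct_queries, cache_size, num_requests):
    """Replay a cyclic query stream against an LRU cache; return the hit fraction."""
    cache = OrderedDict()
    hits = 0
    for i in range(num_requests):
        key = i % num_distinct_queries  # cycle through the test queries in order
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        elif cache_size > 0:
            cache[key] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / num_requests


if __name__ == "__main__":
    # Query set fits in the cache: everything after the first pass is a hit.
    print(hit_rate(num_distinct_queries=100, cache_size=512, num_requests=10_000))
    # Query set is larger than the cache: a cyclic stream never hits under LRU.
    print(hit_rate(num_distinct_queries=100, cache_size=50, num_requests=10_000))
    # Caches disabled: the worst-case scenario described above.
    print(hit_rate(num_distinct_queries=100, cache_size=0, num_requests=10_000))
```

A strictly cyclic stream is the extreme case for LRU; a harness that
picks queries at random would see a partial hit rate instead, which is
why the query set should be much larger than the cache and not just
slightly larger.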

I once worked with a client who was thrilled that their QTimes were
3ms. They were firing the same query over and over.... Which
reinforces Walter's point.

Best,
Erick

On Fri, Apr 28, 2017 at 7:43 AM, Walter Underwood <wunder@wunderwood.org> wrote:
> More “unrealistic” than “amazing”. I bet the set of test queries is smaller than
> the query result cache size.
>
> Results from cache are about 2 ms, but network communication to the shards would add
> enough overhead to reach 40 ms.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
>> On Apr 28, 2017, at 5:59 AM, Shawn Heisey <apache@elyograg.org> wrote:
>>
>> On 4/27/2017 5:20 PM, Suresh Pendap wrote:
>>> Max throughput that I get: 12000 to 12500 reqs/sec
>>> 95 percentile query latency: 30 to 40 msec
>>
>> These numbers are *amazing* ... far better than I would have expected to
>> see on a 27GB index, even in a situation where it fits entirely into
>> available memory.  I would only expect to see a few hundred requests per
>> second, maybe as much as several hundred.  Congratulations are definitely
>> deserved.
>>
>> Adding more shards as Toke suggested *might* help, but it might also
>> lower performance.  More shards means that a single query from the
>> user's perspective becomes more queries in the background.  Unless you
>> add servers to the cloud to handle the additional shards, more shards
>> will usually slow things down on an index with a high query rate.  On
>> indexes with a very low query rate, more shards on the same hardware is
>> likely to be faster, because there will be plenty of idle CPU capacity.
>>
>> What Toke said about filter queries is right on the money.  Uncached
>> filter queries are pretty expensive.  Once a filter gets cached, it is
>> SUPER fast ... but if you are constantly changing the filter query, then
>> it is unlikely that new filters will be cached.
>>
>> When a particular query does not appear in either the queryResultCache
>> or the filterCache, running it as a clause on the q parameter will
>> usually be faster than running it as an fq parameter.  If that exact
>> query text will be used a LOT, then it makes sense to put it into a
>> filter, where it will become very fast once it is cached.
>>
>> Thanks,
>> Shawn
>>
>
