lucene-solr-user mailing list archives

From Walter Underwood <wun...@wunderwood.org>
Subject Re: Solr performance on EC2 linux
Date Wed, 03 May 2017 23:41:47 GMT
Already have a Jira issue for next week. I have a script to run prod logs against a cluster.
I’ll be testing a four shard by two replica cluster with 17 million docs and very long queries.
We are working on getting the 95th percentile under one second, so we should exercise the
timeAllowed feature.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)
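
For the 95th-percentile target mentioned in the message above, here is a minimal sketch of how replayed query latencies might be summarized. The class name, method name, and sample latencies are hypothetical, and nearest-rank is only one of several percentile definitions; this is not the script referred to in the message.

```java
import java.util.Arrays;

public class LatencyPercentile {

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    public static long percentile(long[] latenciesMs, double p) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = latenciesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Hypothetical query times in milliseconds, e.g. from a log replay.
        long[] sample = {120, 340, 95, 1800, 410, 230, 760, 150, 980, 60};
        long p95 = percentile(sample, 95);
        System.out.println("p95 = " + p95 + " ms, under one second: " + (p95 < 1000));
    }
}
```

Running the same summary over one replay with timeAllowed set and one without would give the with/without comparison proposed further down the thread.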


> On May 3, 2017, at 3:53 PM, Rick Leir <rleir@leirtech.com> wrote:
> 
> +Walter test it
> 
> Jeff,
> How much CPU does the EC2 hypervisor use? I have heard 5%, but that is for a normal workload,
> and it is mostly consumed during system calls or context switches. So it is quite understandable
> that frequent time calls would take a bigger bite in the AWS cloud compared to bare metal.
> Sorry, my words are mostly conjecture so please ignore. Cheers -- Rick
> 
> On May 3, 2017 2:35:33 PM EDT, Jeff Wartes <jwartes@whitepages.com> wrote:
>> 
>> It’s presumably not a small degradation - this guy very recently
>> suggested it’s 77% slower:
>> https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/
>> 
>> The other reason that blog post is interesting to me is that his
>> benchmark utility showed the work of entering the kernel as high system
>> time, which is also what I was seeing.
>> 
>> I really want to go back and try some more tests, including (now)
>> disabling the timeAllowed param in my query corpus. 
>> I think I’m still a few weeks of higher priority issues away from that
>> though.
>> 
>> 
>> On 5/2/17, 1:45 PM, "Tomás Fernández Löbbe" <tomasflobbe@gmail.com> wrote:
>> 
>> I remember seeing some performance impact (even when not using it) and it
>> was attributed to the calls to System.nanoTime. See SOLR-7875 and SOLR-7876
>> (fixed for 5.3 and 5.4). Those two Jiras fix the impact when timeAllowed is
>> not used, but I don't know if there were more changes to improve the
>> performance of the feature itself. The problem was that System.nanoTime may
>> be called too many times on indices with many different terms. If this is
>> the problem Jeff is seeing, a small degradation of System.nanoTime could
>> have a big impact.
>> 
>>   Tomás
>> 
>> On Tue, May 2, 2017 at 10:23 AM, Walter Underwood <wunder@wunderwood.org> wrote:
>> 
>>> Hmm, has anyone measured the overhead of timeAllowed? We use it all the
>>> time.
>>> 
>>> If nobody has, I’ll run a benchmark with and without it.
>>> 
>>> wunder
>>> Walter Underwood
>>> wunder@wunderwood.org
>>> 
>>> http://observer.wunderwood.org/  (my blog)
>>> 
>>> 
>>>> On May 2, 2017, at 9:52 AM, Chris Hostetter <hossman_lucene@fucit.org> wrote:
>>>> 
>>>> 
>>>> : I specify a timeout on all queries, ....
>>>> 
>>>> Ah -- ok, yeah -- you mean using "timeAllowed" correct?
>>>> 
>>>> If the root issue you were seeing is in fact clocksource related,
>>>> then using timeAllowed would probably be a significant compounding
>>>> factor there since it would involve a lot of time checks in a single
>>>> request (even w/o any debugging enabled)
>>>> 
>>>> (did your coworker's experiments with ES use any sort of equivalent
>>>> timeout feature?)
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -Hoss
>>>> 
>>>> http://www.lucidworks.com/
>>> 
>>> 
>> 
> 
> -- 
> Sorry for being brief. Alternate email is rickleir at yahoo dot com
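
The concern running through this thread is that a per-request timeout means many clock reads, so any slowdown in System.nanoTime gets multiplied. A rough sketch of that effect follows. It assumes nothing about Solr's actual timeAllowed code: the class and method names are hypothetical, the loop merely stands in for per-term work, and the timing is a crude estimate, not a proper benchmark (a harness such as JMH would be more trustworthy).

```java
public class TimeCheckCost {

    /** Crude estimate of the average cost, in nanoseconds, of one System.nanoTime() call. */
    public static double nanosPerCall(int calls) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += System.nanoTime(); // accumulate so the call is not optimized away
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.println(sink); // keeps 'sink' observable
        }
        return (double) elapsed / calls;
    }

    /**
     * Simulates a loop with a deadline, reading the clock only once every
     * 'checkEvery' iterations. Returns how many clock reads were made.
     */
    public static long countClockReads(int iterations, int checkEvery, long budgetNanos) {
        long deadline = System.nanoTime() + budgetNanos;
        long reads = 0;
        for (int i = 0; i < iterations; i++) {
            if (i % checkEvery == 0) {
                reads++;
                // Overflow-safe comparison of two nanoTime values.
                if (System.nanoTime() - deadline > 0) {
                    break; // time budget exhausted
                }
            }
            // ... per-item work would go here ...
        }
        return reads;
    }

    public static void main(String[] args) {
        nanosPerCall(1_000_000); // warm-up pass so the JIT has compiled the loop
        double cost = nanosPerCall(1_000_000);
        long budget = 1_000_000_000L; // one second, in nanoseconds
        long everyItem = countClockReads(10_000_000, 1, budget);
        long amortized = countClockReads(10_000_000, 1024, budget);
        System.out.printf("about %.1f ns per nanoTime call%n", cost);
        System.out.printf("clock reads: %d (every item) vs %d (every 1024 items)%n",
                everyItem, amortized);
    }
}
```

Running nanosPerCall on an EC2 instance and on bare metal would give a first, rough version of the comparison discussed above. Multiplying the per-call cost by the number of clock reads shows why a small per-call slowdown matters when the check runs on every item.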

