lucene-solr-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: Query takes a long time Solr 6.1.0
Date Mon, 10 Jun 2019 14:30:58 GMT
On 6/10/2019 3:24 AM, vishal patel wrote:
> We have 27 collections and each collection has many schema fields and in live too many
> search and index create&update requests come and most of the searching requests are sorting,
> faceting, grouping, and long query.
> So approx average 40GB heap are used so we gave 80GB memory.

Unless you've been watching an actual *graph* of heap usage over a 
significant amount of time, you can't learn anything useful from it.

And it's very possible that you can't get anything useful even from a 
graph, unless that graph is generated by analyzing a lengthy garbage 
collection log.

> our directory in solrconfig.xml
> <directoryFactory name="DirectoryFactory"
>                      class="${solr.directoryFactory:solr.MMapDirectoryFactory}">
> </directoryFactory>

When using MMAP, one of the memory columns should show a total that's 
approximately equal to the max heap plus the size of all indexes being 
handled by Solr.  Based on what you said about the index size, I would 
expect to see a number over 400GB, but none of the columns in your 
Resource Monitor memory screenshot show anything that large.

MMapDirectoryFactory is a decent choice, but Solr's default of 
NRTCachingDirectoryFactory is probably better.  Switching to NRT will 
not help whatever is causing your performance problems, though.

> Here our schema file and solrconfig XML and GC log, please verify it. is it anything
> wrong or suggestions for improvement?
> https://drive.google.com/drive/folders/1wV9bdQ5-pP4s4yc8jrYNz77YYVRmT7FG

That GC log covers a grand total of three and a half minutes.  It's 
useless.  Heap usage is nearly constant for the full time at about 30GB.  
Without a much more comprehensive log, I cannot offer any useful 
advice.  I'm looking for a log that covers several hours, and a few DAYS 
would be better.

Your caches are commented out, so they are not contributing to heap 
usage.  Another reason to drop the heap size, maybe.

> 2019-06-06T11:55:53.456+0100: 1053797.556: Total time for which application threads were
> stopped: 42.4594545 seconds, Stopping threads took: 26.7301882 seconds

Part of the problem here is that stopping threads took 26 seconds.  I 
have never seen anything that high before.  It should only take a 
*small* fraction of a second to stop all threads.  Something seems to be 
going very wrong here.  One thing that it *might* be is something called 
"the four month bug", which is fixed by adding -XX:+PerfDisableSharedMem 
to the JVM options.  Here's a link to the blog post about that problem:

https://www.evanjones.ca/jvm-mmap-pause.html

It's not clear whether the 42 seconds *includes* the 26 seconds, or 
whether there was 42 seconds of pause AFTER the threads were stopped.  I 
would imagine that the larger number includes the smaller number.  Might 
need to ask Oracle engineers.  Pause times like this do not surprise me 
with a heap this big, but 26 seconds to stop threads sounds like a major 
issue, and I am not sure about what might be causing it.  My guess about 
the four month bug above is a shot in the dark that might be completely 
wrong.
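
If it helps to go through a longer log for lines like the one quoted 
above, here is a rough sketch in Python.  It assumes every line of 
interest has exactly the same format as the quoted one (what the JVM 
writes with -XX:+PrintGCApplicationStoppedTime and date stamps enabled). 
The function names and the one second cutoff for a "slow" thread stop 
are my own choices for illustration, not anything from Solr or the JVM.

```python
import re
from datetime import datetime

# Matches safepoint lines in the format quoted above. The format is assumed
# to be identical for every line of interest in the log.
STOPPED_RE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}): "
    r"(?P<uptime>\d+\.\d+): "
    r"Total time for which application threads were stopped: "
    r"(?P<total>\d+\.\d+) seconds, "
    r"Stopping threads took: (?P<stopping>\d+\.\d+) seconds"
)


def parse_stopped_lines(lines):
    """Return one dict per matching line; all other lines are ignored."""
    events = []
    for line in lines:
        match = STOPPED_RE.match(line.strip())
        if match is None:
            continue
        events.append({
            "stamp": datetime.strptime(
                match.group("stamp"), "%Y-%m-%dT%H:%M:%S.%f%z"),
            "uptime": float(match.group("uptime")),      # JVM uptime, seconds
            "total": float(match.group("total")),        # reported stopped time
            "stopping": float(match.group("stopping")),  # time to stop threads
        })
    return events


def summarize(events, slow_stop_seconds=1.0):
    """Summarize log coverage and pauses.

    slow_stop_seconds is an arbitrary cutoff chosen for this sketch.
    """
    if not events:
        return {"events": 0, "covered_seconds": 0.0,
                "longest_pause": 0.0, "slow_stops": 0}
    uptimes = [e["uptime"] for e in events]
    return {
        "events": len(events),
        # Time between the first and last matching line, by JVM uptime.
        "covered_seconds": max(uptimes) - min(uptimes),
        "longest_pause": max(e["total"] for e in events),
        "slow_stops": sum(
            1 for e in events if e["stopping"] > slow_stop_seconds),
    }


if __name__ == "__main__":
    # The single line quoted earlier in this message, joined onto one line.
    sample = [
        "2019-06-06T11:55:53.456+0100: 1053797.556: Total time for which "
        "application threads were stopped: 42.4594545 seconds, "
        "Stopping threads took: 26.7301882 seconds",
    ]
    print(summarize(parse_stopped_lines(sample)))
```

To run it against a real log, pass the open file to parse_stopped_lines() 
in place of the sample list.  The covered_seconds value shows how much 
time the log spans, and slow_stops counts the pauses where stopping the 
threads took longer than the cutoff.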

Thanks,
Shawn
