lucene-solr-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: solr multicore vs sharding vs 1 big collection
Date Sun, 02 Aug 2015 17:19:44 GMT
On 8/2/2015 8:29 AM, Jay Potharaju wrote:
> The document contains around 30 fields and have stored set to true for
> almost 15 of them. And these stored fields are queried and updated all the
> time. You will notice that the deleted documents is almost 30% of the
> docs.  And it has stayed around that percent and has not come down.
> I did try optimize but that was disruptive as it caused search errors.
> I have been playing with merge factor to see if that helps with deleted
> documents or not. It is currently set to 5.
> 
> The server has 24 GB of memory out of which memory consumption is around 23
> GB normally and the jvm is set to 6 GB. And have noticed that the available
> memory on the server goes to 100 MB at times during a day.
> All the updates are run through DIH.

Using all available memory is completely normal operation for ANY
operating system.  If you hold up Windows as an example of one that
doesn't ... it lies to you about "available" memory.  All modern
operating systems will use any memory that is not explicitly allocated
to programs for the OS disk cache.

The disk cache will instantly give up any of the memory it is using for
programs that request it.  Linux doesn't try to hide the disk cache from
you, but older versions of Windows do.  In the newer versions of Windows
that have the Resource Monitor, you can go there to see the actual
memory usage including the cache.
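
To make that concrete on Linux, here is a rough sketch that reads
/proc/meminfo-style text and shows how much of the "used" memory is
really disk cache.  The function names and the SAMPLE numbers are made
up for illustration (they are not from Jay's server), and counting
Buffers + Cached as reclaimable is an approximation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    values = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def cache_summary(info):
    """Return (free_kb, cache_kb, free_plus_cache_kb).

    Treats Buffers + Cached as disk cache.  This is an approximation:
    a small part of Cached (shared memory, for example) cannot be
    handed back to programs on demand.
    """
    free = info.get("MemFree", 0)
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    return free, cache, free + cache


# Illustrative numbers only, NOT taken from the server in this thread.
SAMPLE = """\
MemTotal:       24000000 kB
MemFree:          100000 kB
Buffers:          200000 kB
Cached:         17000000 kB
"""

if __name__ == "__main__":
    free, cache, total = cache_summary(parse_meminfo(SAMPLE))
    print("free: %d kB, disk cache: %d kB, free + cache: %d kB"
          % (free, cache, total))
```

On a real Linux box you would pass in the contents of /proc/meminfo
instead of SAMPLE.  A machine showing only 100 MB "free" can still have
many gigabytes that the OS will hand back as soon as a program asks.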

> Every day at least once i see the following error, which result in search
> errors on the front end of the site.
> 
> ERROR org.apache.solr.servlet.SolrDispatchFilter -
> null:org.eclipse.jetty.io.EofException
> 
> From what I have read these are mainly due to timeout and my timeout is set
> to 30 seconds and cant set it to a higher number. I was thinking maybe due
> to high memory usage, sometimes it leads to bad performance/errors.

Although this error can be caused by timeouts, it has a specific
meaning.  It means that the client disconnected before Solr responded to
the request, so when Solr tried to respond (through jetty), it found a
closed TCP connection.

Client timeouts need to either be completely removed, or set to a value
much longer than any request will take.  Five minutes is a good starting
value.

If all your client timeouts are set to 30 seconds and you are seeing
EofExceptions, that means that your requests are taking longer than 30
seconds, and you likely have some performance issues.  It's also
possible that some of your client timeouts are set a lot shorter than 30
seconds.
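
As a rough illustration of a generous client timeout, here is a minimal
sketch using only the Python standard library.  I don't know what
client you actually use, so the base URL, the core name "mycore", and
the helper names are all assumptions; the only part that matters is the
timeout value.

```python
import urllib.parse
import urllib.request

# Five minutes, the starting value suggested above.
CLIENT_TIMEOUT_SECONDS = 300


def build_select_url(base_url, core, params):
    """Build a /select URL for a Solr core (names are hypothetical)."""
    query = urllib.parse.urlencode(params)
    return "%s/%s/select?%s" % (base_url.rstrip("/"), core, query)


def query_solr(url, timeout=CLIENT_TIMEOUT_SECONDS):
    """Send the request and return the raw response body.

    The timeout applies on the client side.  If it is shorter than the
    time Solr needs, the client closes the connection first and Solr
    logs an EofException when it tries to write the response.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Assumed host, port, and core name; adjust for your install.
    url = build_select_url("http://localhost:8983/solr", "mycore",
                           {"q": "*:*", "wt": "json"})
    print(url)
```

Calling query_solr(url) would then wait up to five minutes before giving
up.  If requests still fail with that much headroom, the problem is
performance, not the timeout.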

> My objective is to stop the errors, adding more memory to the server is not
> a good scaling strategy. That is why i was thinking maybe there is a issue
> with the way things are set up and need to be revisited.

You're right that adding more memory to the servers is not a good
scaling strategy for the general case ... but in this situation, I think
it might be prudent.  For your index and heap sizes, I would want the
company to pay for at least 32GB of RAM.

Having said that ... I've seen Solr installs work well with a LOT less
memory than the ideal.  I don't know that adding more memory is
necessary, unless your system (CPU, storage, and memory speeds) is
particularly slow.  Based on your document count and index size, your
documents are quite small, so I think your memory size is probably good
-- if the CPU, memory bus, and storage are very fast.  If one or more of
those subsystems isn't fast, then make up the difference with lots of
memory.

Some light reading, where you will learn why I think 32GB is an ideal
memory size for your system:

https://wiki.apache.org/solr/SolrPerformanceProblems

It is possible that your 6GB heap is not quite big enough for good
performance, or that your GC is not well-tuned.  These topics are also
discussed on that wiki page.  If you increase your heap size, then the
likelihood of needing more memory in the system becomes greater, because
there will be less memory available for the disk cache.
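
The tradeoff is simple arithmetic.  A rough sketch, where the function
name is made up and the calculation deliberately ignores JVM overhead
beyond the heap and anything else running on the box unless you pass it
in:

```python
def disk_cache_budget_gb(total_ram_gb, heap_gb, other_processes_gb=0.0):
    """Estimate how much RAM is left for the OS disk cache, in GB.

    Simplifying assumption: everything not claimed by the Java heap or
    other processes is available to cache the index.
    """
    remaining = total_ram_gb - heap_gb - other_processes_gb
    return max(remaining, 0.0)


if __name__ == "__main__":
    # 24 GB server with a 6 GB heap, as described in this thread.
    print(disk_cache_budget_gb(24, 6))
    # Same server with a hypothetical 8 GB heap.
    print(disk_cache_budget_gb(24, 8))
    # A 32 GB server with the current 6 GB heap.
    print(disk_cache_budget_gb(32, 6))
```

With the current setup that leaves about 18 GB for the disk cache.
Every gigabyte added to the heap comes straight out of that number
unless the server gets more RAM.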

Thanks,
Shawn

