lucene-solr-user mailing list archives

From "Simon Fairey" <>
Subject RE: Solr configuration, memory usage and MMapDirectory
Date Wed, 08 Oct 2014 10:02:07 GMT

I'm currently setting up jconsole, but as I have to monitor remotely (no GUI
capability on the server) I have to wait before I can restart Solr with a JMX
port configured. In the meantime I looked at top; given the calculations you
described for your own top output, here is the top view of my java process on
the node that handles the querying (the indexing node has a similar memory
profile):
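
For the record, the JMX options I'm planning to restart with are along these
lines (port and hostname are placeholders, and authentication is off here only
because this is an internal box):

```shell
# JVM flags to expose JMX for remote jconsole (values are placeholders)
JMX_OPTS="-Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=18983 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Djava.rmi.server.hostname=solr-node-1.example.com"
java $JMX_OPTS -jar start.jar
```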

It would seem I need a monstrously large heap in the 60GB region?

We do use a lot of navigators/filters, so I have set the caches quite large
for these; are the caches what's using up the memory?
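
To illustrate what I mean by quite large, the relevant solrconfig.xml entry
looks something like this (sizes here are illustrative, not our exact values):

```xml
<!-- Each filterCache entry is roughly a bitset over maxDoc: at 25M docs
     that is about 3MB per entry, so size="512" alone can pin ~1.6GB of
     heap once the cache fills. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>
```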



-----Original Message-----
From: Shawn Heisey [] 
Sent: 06 October 2014 16:56
Subject: Re: Solr configuration, memory usage and MMapDirectory

On 10/6/2014 9:24 AM, Simon Fairey wrote:
> I've inherited a Solr config and am doing some sanity checks before 
> making some updates, I'm concerned about the memory settings.
> System has 1 index in 2 shards split across 2 Ubuntu 64 bit nodes, 
> each node has 32 CPU cores and 132GB RAM, we index around 500k files a 
> day spread out over the day in batches every 10 minutes, a portion of 
> these are updates to existing content, maybe 5-10%. Currently 
> MergeFactor is set to 2 and commit settings are:
> <autoCommit>
>     <maxTime>60000</maxTime>
>     <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>     <maxTime>900000</maxTime>
> </autoSoftCommit>
> Currently each node has around 25M docs in with an index size of 45GB, 
> we prune the data every few weeks so it never gets much above 35M docs 
> per node.
> On reading I've seen a recommendation that we should be using 
> MMapDirectory, currently it's set to NRTCachingDirectoryFactory.
> However currently the JVM is configured with -Xmx131072m, and for 
> MMapDirectory I've read you should use less memory for the JVM so 
> there is more available for the OS caching.
> Looking at the dashboard in the JVM memory usage I see:
> [screenshot of the JVM memory gauge from the Solr admin dashboard]
> Not sure I understand the 3 bands, assume 127.81 is Max, dark grey is 
> in use at the moment and the light grey is allocated as it was used 
> previously but not been cleaned up yet?
> I'm trying to understand if this will help me know how much would be a 
> good value to change Xmx to, i.e. say 64GB based on light grey?
> Additionally, once I've changed the max heap size, is it a simple case
> of changing the config to use MMapDirectory, or are there things I need
> to watch out for?

NRTCachingDirectoryFactory is a wrapper Directory implementation: the wrapped
Directory implementation is used with some code between it and the consumer
(Solr in this case) that does caching for NRT indexing. The wrapped
implementation is MMapDirectory, so you do not need to switch; you ARE using
MMap.
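
For reference, the stock example solrconfig.xml uses something like the entry
below; on 64-bit Linux the factory it delegates to is MMapDirectory, so this
already gives you mmap behaviour:

```xml
<!-- NRTCachingDirectoryFactory wraps the OS-appropriate implementation
     (MMapDirectory on 64-bit systems); no change needed to use mmap. -->
<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
```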

Attachments rarely make it to the list, and that has happened in this case, so
I cannot see any of your pictures.  Instead, look at my numbers, and the
output of a command from the same machine, running Solr 4.7.2 with Oracle
Java 7:

[root@idxa1 ~]# du -sh /index/solr4/data/
64G     /index/solr4/data/

I've got 64GB of index data on this machine, used by about 56 million documents.  I've also
got 64GB of RAM.  The solr process shows a virtual memory size of 54GB, a resident size of
16GB, and a shared size of 11GB.  My max heap on this process is 6GB.  If you deduct the shared
memory size from the resident size, you get about 5GB.  The admin dashboard for this machine
says the current max heap size is 5.75GB, so that 5GB is pretty close to that, and probably
matches up really well when you consider that the resident size may be considerably more than
16GB and the shared size may be just barely over 11GB.
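
If you want to reproduce that resident-minus-shared arithmetic on your own
box, the same figures are in /proc/<pid>/statm on Linux (the 4KB page size is
an assumption; check getconf PAGE_SIZE). The echo below stands in for a real
process's statm, using the 16GB resident / 11GB shared figures above:

```shell
# /proc/<pid>/statm fields: size resident shared ... (all in pages).
# On a live system, substitute: cat /proc/<solr-pid>/statm
echo "14155776 4194304 2883584" | awk '{
  page = 4096                       # assumed 4KB pages
  printf "resident=%.0fGB shared=%.0fGB heap-ish=%.0fGB\n",
         $2 * page / 2^30, $3 * page / 2^30, ($2 - $3) * page / 2^30
}'
```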

My system has well over 9GB free memory and 44GB is being used for the OS disk cache.  This
system is NOT facing memory pressure.  The index is well-cached and there is even memory that
is not used *at all*.

With an index size of 45GB and 132GB of RAM, you're unlikely to be having
problems with memory unless your heap size is *ENORMOUS*.  You *should* have
your garbage collection highly tuned, especially if your max heap is larger
than 2 or 3GB.  I would guess that a 4 to 6GB heap is probably enough for your
needs, unless you're doing a lot with facets, sorting, or Solr's caches, in
which case you may need more.  Here's some info about heap requirements,
followed by information about garbage collection tuning:
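
As one assumed starting point only (not my exact settings, and not a
recommendation for your workload without measuring), GC options on a Java 7
Solr JVM might look like:

```shell
# Illustrative GC options for a Java 7-era Solr JVM; tune against GC logs
JAVA_OPTS="-Xms6g -Xmx6g \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=250 \
  -verbose:gc -Xloggc:/var/log/solr-gc.log -XX:+PrintGCDetails"
```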

Your automatic commit settings do not raise any red flags with me. 
Those are sensible settings.

