lucene-solr-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: yet another optimize question
Date Sat, 15 Jun 2013 12:52:09 GMT
Hi Robi,

I'm going to guess you are seeing a smaller heap simply because you
restarted the JVM recently (hmm, you don't say you restarted, so maybe
I'm making this up). If you are indeed indexing continuously then you
shouldn't optimize -- Lucene will merge segments itself. A lower
mergeFactor will force it to merge more often (which means slower
indexing, a bigger IO hit while segments are merged, more per-segment
data that Lucene/Solr need to read from each segment for faceting and
such, etc.), so maybe you shouldn't mess with that.

Do you know what your caches look like in terms of size, hit rate, and
evictions? We've recently seen people set those to a few hundred
thousand entries or even higher, which can eat a lot of heap. We've
had luck with G1 recently, too. Maybe you can run jstat, see which of
the memory pools fill up, and change/increase the appropriate JVM
parameter based on that? How many fields do you index, facet, or group
on?
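For reference, the caches in question are configured in solrconfig.xml. A sketch of the relevant section for Solr 3.x is below -- the sizes shown are illustrative defaults, not a recommendation; the point is that setting them to a few hundred thousand entries each is where the heap goes:

```xml
<!-- solrconfig.xml (Solr 3.x), <query> section -- sizes here are illustrative -->
<query>
  <!-- caches filter query results as doc sets; very large sizes eat a lot of heap -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <!-- caches ordered doc-id lists for query results -->
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <!-- caches stored fields for documents -->
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
</query>
```

Hit ratio and eviction counts for each cache are visible on the admin stats page, which is the place to look before growing or shrinking these.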

Otis
--
Performance Monitoring - http://sematext.com/spm/index.html
Solr & ElasticSearch Support -- http://sematext.com/
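The jstat suggestion above boils down to something like this (`<pid>` is a placeholder for the Solr JVM's process id; `-gcutil` prints per-pool utilization percentages on Java 7: survivor spaces, eden, old gen, perm gen):

```
# find the pid of the Solr JVM
jps -l

# then sample pool utilization every 5 seconds:
# columns S0/S1 = survivor, E = eden, O = old gen, P = perm gen
jstat -gcutil <pid> 5000
```

If the old generation (O) is pinned near 100% between collections, that points at heap-resident data like caches rather than GC tuning.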





On Fri, Jun 14, 2013 at 8:04 PM, Petersen, Robert
<robert.petersen@mail.rakuten.com> wrote:
> Hi guys,
>
> We're on Solr 3.6.1 and I've read the discussions about whether to
> optimize or not to optimize. I decided to try not optimizing our index
> as was recommended. We have a little over 15 million docs in our
> biggest index and a 32GB heap for our JVM. Without the optimizes, the
> index folder seemed to grow in size and number of files. There seemed
> to be an upper limit, but eventually it hit 300 files consuming 26GB
> of space, and that seemed to push our slave farm over the edge and we
> started getting the dreaded OOMs.
>
> We have continuous indexing activity, so I stopped the indexer and
> manually ran an optimize, which brought the index down to 9 files
> consuming 15GB of space, and our slave farm went back to acceptable
> memory usage. Our merge factor is 10 and we're on Java 7. Before
> optimizing, I tried going to the latest JVM on one slave machine and
> switching from the CMS GC to G1GC, but it hit the OOM condition even
> faster. So it seems like I have to keep scheduling a regular optimize.
> Right now it has been a couple of days since running the optimize and
> the index is slowly growing, now up to a bit over 19GB. What do you
> guys think? Did I miss something that would let us run without doing
> an optimize?
>
> Robert (Robi) Petersen
> Senior Software Engineer
> Search Department
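For anyone following the thread, the GC switch Robi describes is a JVM flag change along these lines. The heap size matches the 32GB mentioned above; the launcher shown is the stock Jetty start.jar that ships with the Solr 3.x example, and the flag sets are minimal sketches rather than a tuned configuration:

```
# CMS collector (the setup the slaves were running)
java -Xms32g -Xmx32g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -jar start.jar

# G1 collector (the experiment that hit OOM even faster)
java -Xms32g -Xmx32g -XX:+UseG1GC -jar start.jar
```

Either way, a collector change only shifts when and how the heap is reclaimed; if the live set itself (index data structures plus caches) approaches the heap size, OOMs follow under any collector.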
