lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Atita Arora <atitaar...@gmail.com>
Subject Re: High CPU utilization on Upgrading to Solr Version 6.3
Date Thu, 03 Aug 2017 06:38:45 GMT
Hi All ,

Just thought of giving quick update on this.
So we were able to *knock down this issue by using jvisualvm* which comes
with java .
So , we enabled monitoring  through jmx and the CPU profiling showed (as
attached in one of my previous emails) *Highlighting taking maximum
processing.*
Mysteriously , this was happening in highlighting-> merge which was invoked
through when we enabled *mergecontiguous=true* I'm still surprised as to
turning this only property false, resolved the issue and we happily went
live last week.

Later , as I found the code for this particular property is causing endless
recursions as I traced.

Please guide / share if you may have any other thoughts.

Thanks,
Atita



On Fri, Jul 28, 2017 at 7:18 PM, Shawn Heisey <apache@elyograg.org> wrote:

> On 7/27/2017 1:30 AM, Atita Arora wrote:
> > What OS is Solr running on?  I'm only asking because some additional
> > information I'm after has different gathering methods depending on OS.
> > Other questions:
> >
> > /*OpenJDK 64-Bit Server VM (25.141-b16) for linux-amd64 JRE
> > (1.8.0_141-b16), built on Jul 20 2017 21:47:59 by "mockbuild" with gcc
> > 4.4.7 20120313 (Red Hat 4.4.7-18)*/
> > /*Memory: 4k page, physical 264477520k(92198808k free), swap 0k(0k
> free)*/
>
> Linux is the easiest to get good information from.  Run the "top"
> program in a commandline session.  Press shift-M to sort by memory size,
> and grab a screenshot.  Share that screenshot with a file sharing site
> and give us the URL.
>
> > Is there only one Solr process per machine, or more than one?
> > /*On an average yes , one solr process per machine , however , we do
> > have a machine (where this log is taken) has two solr processes
> > (master and slave)*/
>
> Running a master and a slave on one machine does nothing for
> redundancy.  They need to be on separate machines for that to really
> help.  As for multiple processes per machine, tou can have many indexes
> in one Solr instance -- you don't need more than one in most cases.
>
> > How many total documents are managed by one machine?
> > */About 220945 per machine ( and double for this machine as it has
> > instance of master as well as other slave)/*
> >
> > How big is all the index data managed by one machine?
> > */The index is about 4G./*
>
> If less than a quarter of a million documents results in a 4GB index,
> those documents must be ENORMOUS, or else there is something strange
> going on.
>
> > What is the max heap on each Solr process?
> > */Max heap is 25G for each Solr Process. (Xms 25g Xmx 25g)/*
> > */
> > /*
> > The reason of choosing RAMDirectory was that it was used in the
> > similar manner while the production Solr was on Version 4.3.2, so no
> > particular reason but just replicated how it was working , never
> > thought this may give troubles.
>
> Set up the slaves just like the masters, with
> NRTCachingDirectoryFactory.  For a couple hundred thousand docs, you
> probably only need a 2GB heap, possibly even less.
>
> > I had included a pastebin of GC snapshot (the complete log was too big
> > to be included in the pastebin , so pasted a sampler)
>
> I asked for the full log because that's what I need to look deeper.  A
> sampler won't be enough.  There are file sharing websites for sharing
> larger content, and if you compress the file before uploading it, you
> should be able to achieve a fairly impressive compression ratio.
> Dropbox is generally a good choice for sharing fairly large content.
> Dropbox also works for image data, like the "top" screenshot I asked for
> above.
>
> > Another thing is as we observed the CPU cycles yesterday in high load
> > condition we observed that the Highlighter component was taking
> > longest , is there anything in particular we forgot to include that
> > highlighting doesn't gives a performance hit .
> > Attached is the snapshot taken from jvisualvm.
>
> Attachments rarely make it through the mailing list.  Yours didn't, so I
> cannot see that snapshot.
>
> I do not know anything about highlighting, so I cannot comment on how
> much CPU it takes.  I've never used the feature.
>
> My best idea about why your CPU is so high is problems with garbage
> collection.  To look into that, I need to have the full GC log.  The
> rest of the information I've asked for will help focus my efforts.
>
> Thanks,
> Shawn
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message