lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_37) - Build # 3421 - Failure!
Date Wed, 26 Dec 2012 11:36:50 GMT
I started jhat on the machine:

http://jenkins.sd-datasolutions.de:7000/

You can inspect the heap dump with it. The Jenkins build was made sticky, so it stays alive
until I delete it. It is also nice to look at the heap dump with VisualVM (shipped with the JDK;
run "jvisualvm <heapfile>"). After downloading the heap dump, you should use the same bitness
and version of the JDK (32bit/jdk1.6.0_37) as was used for this build:
http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/3421/artifact/heapdumps/java_pid13141.hprof

Unfortunately I did not find a good tool to inspect the permgen heap only (it contains loaded
classes and interned strings). I checked the heap dump: we have no strange classloaders involved,
all classes seem to be loaded by the standard app classloader of the JDK, and there are no
duplicates (same class loaded multiple times by different class loaders). So SolrResourceLoader
is not the bad guy, contrary to what Robert and I expected as a first guess. Interestingly, the
dump has millions of java.lang.String objects (which makes me wonder: I thought Lucene 4.x no
longer uses Strings? But Solr does). 90% of all strings look like this:
http://jenkins.sd-datasolutions.de:7000/object/0xdbf3d938
Their contents are similar to
"org.apache.solr.handler:type=RequestHandlerBase,scope=metrics-scope-22344,name=numTimeouts".
The parent objects are some huge HashMaps of com.yammer.metrics.core.MetricName instances.

Looking at the MBean mess: the whole VM is filled with MBean statistics (20% of the total heap!!!).
It looks like the MBean server is not shut down correctly when the Solr instance shuts down,
so the data accumulates while the tests run: every new Solr instance adds new statistics to the
huge MBean maps, eating all the heap (and possibly permgen, because most strings may be interned)!
This is a huge leak and we should fix it (or disable the whole useless MBean shit completely,
at least for tests). Was this strange, never-seen package com.yammer.metrics, which seems to be
related to MBeans, introduced recently? Or is ZooKeeper the bad guy?
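
To make the cleanup idea concrete, here is a minimal sketch using only the standard
javax.management API. It is not taken from Solr or com.yammer.metrics: the class
MBeanCleanupSketch, the domain "example.sketch.handler" and the helpers register and
unregisterScope are all made up for illustration. It assumes each instance registers its
MBeans under its own "scope" key (as the strings in the dump suggest) and removes everything
for that scope when it shuts down.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: none of these names come from Solr or com.yammer.metrics.
class MBeanCleanupSketch {

    /** Standard MBean interface; JMX requires it to be public and named <Class>MBean. */
    public interface TimeoutCounterMBean {
        long getNumTimeouts();
    }

    /** Trivial statistic standing in for one per-handler metric. */
    public static class TimeoutCounter implements TimeoutCounterMBean {
        @Override
        public long getNumTimeouts() {
            return 0L;
        }
    }

    /** Made-up JMX domain used only by this sketch. */
    static final String DOMAIN = "example.sketch.handler";

    /** Registers one MBean whose name carries the scope of the owning instance. */
    static ObjectName register(MBeanServer server, String scope, String name)
            throws JMException {
        ObjectName objectName = new ObjectName(
                DOMAIN + ":type=RequestHandlerBase,scope=" + scope + ",name=" + name);
        server.registerMBean(new TimeoutCounter(), objectName);
        return objectName;
    }

    /** Unregisters every MBean of the given scope and returns how many were removed. */
    static int unregisterScope(MBeanServer server, String scope) throws JMException {
        // Property-list pattern: matches any name in DOMAIN that has scope=<scope>.
        ObjectName pattern = new ObjectName(DOMAIN + ":scope=" + scope + ",*");
        Set<ObjectName> names = server.queryNames(pattern, null);
        for (ObjectName objectName : names) {
            server.unregisterMBean(objectName);
        }
        return names.size();
    }

    public static void main(String[] args) throws JMException {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        register(server, "metrics-scope-1", "numTimeouts");
        register(server, "metrics-scope-2", "numTimeouts");
        // Simulate the shutdown of the first instance: only its MBeans go away.
        int removed = unregisterScope(server, "metrics-scope-1");
        System.out.println("removed for scope 1: " + removed);
        // Clean up the second instance as well so nothing is left behind.
        unregisterScope(server, "metrics-scope-2");
    }
}
```

If something like unregisterScope ran whenever an instance shuts down, the maps could not
grow from test to test. Whether the real leak sits in the platform MBean server or in the
metrics registry itself still needs to be checked against the dump.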

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com]
> Sent: Wednesday, December 26, 2012 3:22 AM
> To: dev@lucene.apache.org
> Subject: Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_37) - Build #
> 3421 - Failure!
> 
> Is this one a nightly build?
> 
> I can run it and look at it closely tomorrow.
> 
> - Mark
> 
> On Dec 25, 2012, at 6:04 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> 
> > Can we add a try/catch block that catches permgen errors and calls
> > Runtime.halt (not exit)? I could add another extra allowance to the security
> > manager, disallowing exits.
> >
> > But we should try to find the issue in the tests, maybe Mark has an idea.
> > We have the heap dump readily available, but I don't have the tools to
> > inspect it.
> >
> > Uwe
> >
> >
> >
> > Dawid Weiss <dawid.weiss@cs.put.poznan.pl> wrote:
> > > the test framework crashes somehow and does not respond anymore.
> >
> > I think I know exactly how it crashes -- there's not much mystery
> > about this: once the permgen is exhausted OOM errors are thrown from
> > tests; what happens then is these errors are caught and an attempt is
> > made to serialize these errors to the master node. Unfortunately this
> > process involves loading some classes that are not yet loaded and,
> > since the permgen is already exhausted, everything goes insane (the
> > thread apparently just silently quits; there are finally blocks that
> > are never reached).
> >
> > Like I said -- I'll see what I can do about it but I don't have any
> > optimistic feelings. This is really riding a critical edge and short
> > of preallocating static data structures I don't see any way of
> > implementing a clean solution for the problem.
> >
> > Dawid
> >
> >
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > additional commands, e-mail: dev-help@lucene.apache.org
> >
> >
> > --
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, 28213 Bremen
> > http://www.thetaphi.de
> 
> 



