lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alan Woodward <a...@flax.co.uk>
Subject Re: Several builds hanging pecause of permgen
Date Tue, 09 Jul 2013 09:16:46 GMT
SOLR-4948 changed the test's default CoreContainer from using a hardcoded JulWatcher to whatever
would be picked up in the standard config, which will be log4j if that's what's configured
for tests.  Could that be the cause?  When did the permgen crashes start?

FWIW, I'm not seeing any permgen errors locally (MacOS, Java 7).

Alan Woodward
www.flax.co.uk


On 9 Jul 2013, at 08:57, Uwe Schindler wrote:

> I digged more:
> 
> In most heap dumps, the failing thread looks like that, so it looks like Log4J is the
bad guy - I looked at the code, it does tricky stuff to get stack traces but I found no String.intern()
anywhere.
> 
> My question: When did we change tests to use log4j? Was this before/after the hadoop
change? Maybe it's not related to hadoop, and is more a bug in Log4J. I find similar stack
traces all around the internet always linked to Log4J.
> 
> "qtp1522342989-5342" prio=5 tid=5342 RUNNABLE
> 	at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:25)
> 	at java.lang.Throwable.getStackTraceElement(Native Method)
> 	at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
> 	at java.lang.Throwable.getStackTrace(Throwable.java:582)
> 	at sun.reflect.GeneratedMethodAccessor2.invoke(<unknown string>)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.log4j.spi.LocationInfo.<init>(LocationInfo.java:139)
> 	   Local Variable: org.apache.log4j.spi.LocationInfo#52
> 	   Local Variable: java.lang.String#28067
> 	   Local Variable: java.lang.Throwable#1
> 	at org.apache.log4j.spi.LoggingEvent.getLocationInformation(LoggingEvent.java:253)
> 	at org.apache.solr.util.SolrLogLayout._format(SolrLogLayout.java:122)
> 	   Local Variable: java.lang.StringBuilder#32
> 	   Local Variable: java.lang.String#26042
> 	   Local Variable: org.apache.solr.util.SolrLogLayout#1
> 	at org.apache.solr.util.SolrLogLayout.format(SolrLogLayout.java:110)
> 	at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
> 	   Local Variable: org.apache.log4j.ConsoleAppender#1
> 	at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
> 	at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
> 	at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
> 	   Local Variable: org.apache.log4j.helpers.AppenderAttachableImpl#1
> 	at org.apache.log4j.Category.callAppenders(Category.java:206)
> 	   Local Variable: org.apache.log4j.spi.LoggingEvent#52
> 	   Local Variable: org.apache.log4j.Logger#52
> 	at org.apache.log4j.Category.forcedLog(Category.java:391)
> 	at org.apache.log4j.Category.log(Category.java:856)
> 	at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:305)
> 	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1909)
> 	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
> 	   Local Variable: org.apache.solr.handler.StandardRequestHandler#4
> 	   Local Variable: org.apache.solr.core.SolrCore#3
> 	   Local Variable: java.lang.String#25968
> 	   Local Variable: org.apache.solr.response.SolrQueryResponse#3
> 	   Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#15
> 	   Local Variable: org.apache.solr.servlet.cache.Method#3
> 	   Local Variable: org.apache.solr.servlet.SolrRequestParsers$1#3
> 	   Local Variable: org.apache.solr.servlet.SolrDispatchFilter#2
> 	   Local Variable: org.apache.solr.core.CoreContainer#3
> 	   Local Variable: org.apache.solr.servlet.SolrRequestParsers#3
> 	   Local Variable: org.apache.solr.core.SolrConfig#3
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
> 	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> 	at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
> 	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> 	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> 	   Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#13
> 	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> 	at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
> 	   Local Variable: org.eclipse.jetty.server.handler.GzipHandler#2
> 	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
> 	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> 	   Local Variable: org.eclipse.jetty.servlet.ServletHolder#2
> 	   Local Variable: org.eclipse.jetty.servlet.ServletHandler#2
> 	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> 	   Local Variable: org.eclipse.jetty.server.session.SessionHandler#2
> 	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> 	   Local Variable: java.lang.String#26053
> 	   Local Variable: org.eclipse.jetty.servlet.ServletContextHandler#2
> 	   Local Variable: org.eclipse.jetty.server.Request#4
> 	   Local Variable: javax.servlet.DispatcherType#1
> 	   Local Variable: org.eclipse.jetty.server.Response#4
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> 	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> 	at org.eclipse.jetty.server.Server.handle(Server.java:368)
> 	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> 	   Local Variable: org.eclipse.jetty.server.Server#2
> 	   Local Variable: java.lang.String#25761
> 	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
> 	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
> 	   Local Variable: org.eclipse.jetty.io.View#5
> 	   Local Variable: org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler#4
> 	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
> 	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
> 	   Local Variable: org.eclipse.jetty.http.HttpParser#4
> 	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
> 	   Local Variable: org.eclipse.jetty.server.AsyncHttpConnection#4
> 	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
> 	   Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint#4
> 	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
> 	   Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint$1#4
> 	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> 	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> 	at java.lang.Thread.run(Thread.java:662)
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
>> -----Original Message-----
>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>> Sent: Tuesday, July 09, 2013 9:32 AM
>> To: dev@lucene.apache.org
>> Subject: RE: Several builds hanging pecause of permgen
>> 
>> I inspected the heap dumps already, 2 things are interesting:
>> 
>> - The heap dump file is quite small (only approx. 50 MB), so the permgen
>> issues start when not even -Xmx is used completely.
>> - There are lots of (250) reflection class loaders, but no leaks. All classloaders
>> exist one time, just the reflection-based sun.relect.DelegatingClassLoader
>> with the stubs for speeding up reflection shows up, but only with one
>> class/loader. Which is perfectly fine.
>> - ~2500 classes loaded, which is not much.
>> 
>> How do I get the number of interned Strings from jvisualvm?
>> 
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@thetaphi.de
>> 
>> 
>>> -----Original Message-----
>>> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf
>>> Of Dawid Weiss
>>> Sent: Tuesday, July 09, 2013 9:13 AM
>>> To: dev@lucene.apache.org
>>> Subject: Re: Several builds hanging pecause of permgen
>>> 
>>> What's the core of those permgen leaks, guys? Is it the number of
>>> classes (via different classloaders)? Number of interned strings?
>>> Maybe something is going on that we should reflect on before we raise
>>> the default permgen limit? Our typical permgen headaches were due to
>>> repeated classloader leaks -- is Hadoop using something like that?
>>> 
>>> Dawid
>>> 
>>> On Tue, Jul 9, 2013 at 8:25 AM, Uwe Schindler <uwe@thetaphi.de> wrote:
>>>> Hi Mark,
>>>> 
>>>> The problem with raising permgen is:
>>>> - It's Hotspot specific only, so does not work with other JVMs
>>>> - Its no longer available in Java 8
>>>> 
>>>> I would really prefer to maybe tune the tests and maybe not create
>>>> so
>>> many nodes in the cloud tests. It looks like the bug happens more
>>> often with higher test multiplier (-Dtests.multiplier=3), so maybe we
>>> can really tune that.
>>>> If we want to raise permgen, we have to do it in a similar way like
>>>> we do enable the heap dumps - with lots of <condition/> tasks in ANT...
>>>> :(
>>>> 
>>>> Uwe
>>>> 
>>>> -----
>>>> Uwe Schindler
>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>> eMail: uwe@thetaphi.de
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: Mark Miller [mailto:markrmiller@gmail.com]
>>>>> Sent: Tuesday, July 09, 2013 5:20 AM
>>>>> To: dev@lucene.apache.org
>>>>> Subject: Re: Several builds hanging pecause of permgen
>>>>> 
>>>>> Looks like we currently don't set the max perm gen for tests, so you
>>>>> get the default - I think we want to change that regardless - we
>>>>> don't want it to vary IMO - it should work like Xmx.
>>>>> 
>>>>> I think we should just set it to 128 mb, and these tests should have
>>>>> plenty of room to run.
>>>>> 
>>>>> - Mark
>>>>> 
>>>>> On Jul 8, 2013, at 11:06 PM, Mark Miller <markrmiller@gmail.com>
>> wrote:
>>>>> 
>>>>>> 
>>>>>> On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uwe@thetaphi.de>
wrote:
>>>>>> 
>>>>>>> Next one:
>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console
>>>>>>> 
>>>>>>> I will vote all releases of 4.4 with -1 until this is fixed!
>>>>>> 
>>>>>> You can't veto a release, so it's kind of a hollow threat ;)
>>>>>> 
>>>>>>> It hangs on my local computer, too! Tests pass only ½ of the
time,
>>>>>>> the
>>>>> remaining time it hangs with permgen errors.
>>>>>> 
>>>>>> Depending on the host OS and java version, I have had to raise the
>>>>>> max
>>>>> perm gem a bit higher for heavy SolrCloud tests that also start up
>>>>> hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply
>>>>> a test resource issue - those tests are very heavy, which is why most
>>>>> are set to run nightly - I've seen the code run heavily without any
>>>>> perm gen issues though - and on some of my machines, I don't have to
>>> raise the perm gen at all.
>>>>>> 
>>>>>> - Mark
>>>>>> 
>>>>>>> 
>>>>>>> Uwe
>>>>>>> 
>>>>>>> -----
>>>>>>> Uwe Schindler
>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>> eMail: uwe@thetaphi.de
>>>>>>> 
>>>>>>> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On
>>>>> Behalf
>>>>>>> Of Dawid Weiss
>>>>>>> Sent: Monday, July 08, 2013 2:16 PM
>>>>>>> To: dev@lucene.apache.org
>>>>>>> Subject: Re: Several builds hanging pecause of permgen
>>>>>>> 
>>>>>>> 
>>>>>>> Not much I can do from my side about permgen errors. There is
>>>>>>> really no
>>>>> way to deal with these from within Java (the same process) -- you
>>>>> cannot effectively handle anything because your own classes may not
>>> load at all.
>>>>>>> 
>>>>>>> Dawid
>>>>>>> 
>>>>>>> On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uwe@thetaphi.de>
>>>>> wrote:
>>>>>>> Another one, this time on OSX:
>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/
>>>>>>> 
>>>>>>> -----
>>>>>>> Uwe Schindler
>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>> eMail: uwe@thetaphi.de
>>>>>>> 
>>>>>>> From: Robert Muir [mailto:rcmuir@gmail.com]
>>>>>>> Sent: Sunday, July 07, 2013 11:15 PM
>>>>>>> To: dev@lucene.apache.org
>>>>>>> Subject: Re: Several builds hanging pecause of permgen
>>>>>>> 
>>>>>>> When there were leaks from static classes, we added a checker
to
>>>>> LuceneTestCase that looks for RAM > N and fails with debugging
>>> information.
>>>>>>> 
>>>>>>> I wonder if some similar check is possible for this case (to
make
>>>>>>> it easier
>>>>> than going thru heapdumps, and to find issues before crash-time)...
>>>>>>> 
>>>>>>> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uwe@thetaphi.de>
>>> wrote:
>>>>>>> Another one:
>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-
>> Linux/6385/testRepo
>>>>>>> rt/
>>>>>>> 
>>>>> 
>>> junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT
>>>>>>> est/
>>>>>>> 
>>>>>>> Had to be killed with kill -9
>>>>>>> 
>>>>>>> -----
>>>>>>> Uwe Schindler
>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>> eMail: uwe@thetaphi.de
>>>>>>> 
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>>>>>>>> Sent: Saturday, July 06, 2013 10:16 PM
>>>>>>>> To: dev@lucene.apache.org
>>>>>>>> Subject: RE: Several builds hanging pecause of permgen
>>>>>>>> 
>>>>>>>> Another one:
>>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
>>>>>>>> 
>>>>>>>> I was only able to kill the JVM with kill -9 I am sure, it's
horrible
>>>>> slowdoop!
>>>>>>>> 
>>>>>>>> -----
>>>>>>>> Uwe Schindler
>>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>>> eMail: uwe@thetaphi.de
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>>>>>>>>> Sent: Friday, July 05, 2013 3:59 PM
>>>>>>>>> To: dev@lucene.apache.org
>>>>>>>>> Subject: Several builds hanging pecause of permgen
>>>>>>>>> 
>>>>>>>>> Several Jenkins builds now hang because of permgen. The
runner
>>> JVM
>>>>>>>>> is dead (can only be killed by -9), last example:
>>>>>>>>> 
>>>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-
>> Linux/6360/console
>>>>>>>>> 
>>>>>>>>> -----
>>>>>>>>> Uwe Schindler
>>>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>>>> eMail: uwe@thetaphi.de
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>> -- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For
>>>>>>>>> additional commands, e-mail: dev-help@lucene.apache.org
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --------------------------------------------------------------------
>>>>>>>> - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For
>>>>>>>> additional commands, e-mail: dev-help@lucene.apache.org
>>>>>>> 
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>>>>>>> additional commands, e-mail: dev-help@lucene.apache.org
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>>> additional
>>>>> commands, e-mail: dev-help@lucene.apache.org
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 


Mime
View raw message