lucene-solr-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Solr and Garbage Collection
Date Sat, 03 Oct 2009 18:51:50 GMT
Ah, yes - thanks for the clarification. Didn't pay attention to how
ambiguously I was using "supported" there :)

Bill Au wrote:
> Sun has recently clarified the issue regarding "unsupported unless you pay"
> for the G1 garbage collector. Here are the updated release notes for Java 6
> update 14:
> http://java.sun.com/javase/6/webnotes/6u14.html
>
>
> G1 will be part of Java 7, fully supported without payment.  The version
> included in Java 6 update 14 is a beta release.  Since it is a beta, Sun does
> not recommend using it unless you have a support contract because, as with
> any beta software, there will be bugs.  Non-paying customers may very well
> have to wait for the official version in Java 7 for bug fixes.
>
> Here is more info on the G1 garbage collector:
>
> http://java.sun.com/javase/technologies/hotspot/gc/g1_intro.jsp
>
>
> Bill
>
> On Sat, Oct 3, 2009 at 1:28 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>   
>> Another option, of course, if you're using a recent version of Java 6:
>>
>> try out the beta-ish, unsupported-unless-you-pay G1 garbage collector.
>> I've only recently started playing with it, but it's supposed to be much
>> better than CMS. It supposedly has much better throughput, it's much
>> better at dealing with fragmentation issues (CMS is actually pretty bad
>> with fragmentation, come to find out), and overall it's just supposed to
>> be a very nice leap ahead in GC. Haven't had a chance to play with it
>> much myself, but it's supposed to be fantastic. A whole new approach to
>> generational collection for Sun, and much closer to the "real time" GCs
>> available from some other vendors.
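
If you do try G1, it helps to confirm which collectors the JVM actually picked up. As far as I recall from the 6u14 release notes, it takes both -XX:+UnlockExperimentalVMOptions and -XX:+UseG1GC to turn it on. The sketch below only lists whatever collectors the running JVM reports through the standard java.lang.management API. The class name ActiveCollectors is made up for illustration, and the names printed depend on the JVM and its flags, so no expected output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Solr or the JDK.
final class ActiveCollectors {

    /** Returns the names of the collectors registered by the running JVM. */
    static List<String> names() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print each collector with its cumulative count and time so far.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

This reports on the JVM it runs in, so to check a Tomcat instance the same calls would have to run inside that JVM (or over remote JMX, which is what jconsole does).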
>>
>> Mark Miller wrote:
>>     
>>> siping liu wrote:
>>>
>>>       
>>>> Hi,
>>>>
>>>> I read pretty much all posts on this thread (before and after this one).
>>>> Looks like the main suggestion from you and others is to keep max heap
>>>> size (-Xmx) as small as possible (as long as you don't see OOM
>>>> exception). This brings more questions than answers (for me at least.
>>>> I'm new to Solr).
>>>>
>>>> First, our environment and problem encountered: Solr1.4 (nightly build,
>>>> downloaded about 2 months ago), Sun JDK1.6, Tomcat 5.5, running on
>>>> Solaris(multi-cpu/cores). The cache setting is from the default
>>>> solrconfig.xml (looks very small). At first we used minimum JAVA_OPTS
>>>> and quickly ran into a problem similar to the one the original poster
>>>> reported -- long pauses (seconds to minutes) under load test. jconsole
>>>> showed that it pauses on GC. So more JAVA_OPTS got added:
>>>> "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=8
>>>> -XX:SurvivorRatio=2 -XX:NewSize=128m -XX:MaxNewSize=512m
>>>> -XX:MaxGCPauseMillis=200", the thinking being that with multiple
>>>> CPUs/cores we can get GC over with as quickly as possible. With the new
>>>> setup, it works fine until Tomcat reaches the heap size, then it blocks
>>>> and takes minutes on "full GC" to get more space from the "tenured
>>>> generation". We tried different Xmx (from very small to large), no
>>>> difference in long GC time. We never ran into OOM.
>>>>
>>> MaxGCPauseMillis doesn't work with UseConcMarkSweepGC - it's for use with
>>> the Parallel collector. That also doesn't look like a good SurvivorRatio.
>>>
>>>       
>>>> Questions:
>>>>
>>>> * In general, caching is good for performance. We have more RAM to use
>>>> and want to use more caching to boost performance, so isn't your
>>>> suggestion (of lowering the heap limit) going against that?
>>>>
>>> Leaving RAM for the FileSystem cache is also very important. But you
>>> should also have enough RAM for your Solr caches of course.
>>>
>>>       
>>>> * Looks like Solr caching made its way into the tenured generation on
>>>> heap, that's good. But why do they get GC'ed eventually?? I did a quick
>>>> check of Solr code (Solr 1.3, not 1.4), and see a single instance of
>>>> using WeakReference. Is that what is causing all this? This seems to
>>>> suggest a design flaw in Solr's memory management strategy (or just my
>>>> ignorance about Solr?). I mean, wouldn't this be the "right" way of
>>>> doing it -- you allow the user to specify the cache size in
>>>> solrconfig.xml, then the user can set up the heap limit in JAVA_OPTS
>>>> accordingly, and no need to use WeakReference (BTW, why not
>>>> SoftReference)??
>>>>
>>> Do you see "concurrent mode failure" when looking at your GC logs? For example:
>>>
>>> 174.445: [GC 174.446: [ParNew: 66408K->66408K(66416K), 0.0000618
>>> secs]174.446: [CMS (concurrent mode failure): 161928K->162118K(175104K),
>>> 4.0975124 secs] 228336K->162118K(241520K)
>>>
>>> That means you are still getting major collections with CMS, and you
>>> don't want that. You might try kicking GC off earlier with something
>>> like: -XX:CMSInitiatingOccupancyFraction=50
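
A quick way to check for this is to scan the GC log for the "concurrent mode failure" marker. This is only a minimal sketch: it assumes the log was written with -XX:+PrintGCDetails and that each event sits on a single line shaped like the sample above. The class name CmsFailureScan and the regex are my own, not anything shipped with the JVM.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class CmsFailureScan {

    // Matches e.g. "(concurrent mode failure): 161928K->162118K(175104K), 4.0975124 secs"
    private static final Pattern FAILURE = Pattern.compile(
            "\\(concurrent mode failure\\): (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs");

    /** Returns the pause in seconds if the line reports a concurrent mode failure, else -1. */
    static double failurePauseSeconds(String line) {
        Matcher m = FAILURE.matcher(line);
        return m.find() ? Double.parseDouble(m.group(4)) : -1.0;
    }

    public static void main(String[] args) throws IOException {
        // Read the GC log from standard input, one event per line.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        int failures = 0;
        String line;
        while ((line = in.readLine()) != null) {
            double pause = failurePauseSeconds(line);
            if (pause >= 0) {
                failures++;
                System.out.println("concurrent mode failure, " + pause + " s pause: " + line);
            }
        }
        System.out.println(failures + " concurrent mode failure(s) found");
    }
}
```

Something like "java CmsFailureScan < gc.log" would run it, where gc.log stands for wherever your GC log ends up. If failures keep showing up after lowering CMSInitiatingOccupancyFraction, the collector is still starting too late for your allocation rate.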
>>>
>>>       
>>>> * Right now I have a single Tomcat hosting Solr and other applications.
>>>> I guess now it's better to have Solr on its own Tomcat, given that it's
>>>> tricky to adjust the java options.
>>>>
>>>> thanks.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>         
>>>>> From: wunder@wunderwood.org
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: RE: Solr and Garbage Collection
>>>>> Date: Fri, 25 Sep 2009 09:51:29 -0700
>>>>>
>>>>> 30ms is not better or worse than 1s until you look at the service
>>>>> requirements. For many applications, it is worth dedicating 10% of your
>>>>> processing time to GC if that makes the worst-case pause short.
>>>>>
>>>>> On the other hand, my experience with the IBM JVM was that the maximum
>>>>> query rate was 2-3X better with the concurrent generational GC compared
>>>>> to any of their other GC algorithms, so we got the best throughput
>>>>> along with the shortest pauses.
>>>>>
>>>>> Solr garbage generation (for queries) seems to have two major
>>>>> components: per-request garbage and cache evictions. With a
>>>>> generational collector, these two are handled by separate parts of the
>>>>> collector. Per-request garbage should completely fit in the short-term
>>>>> heap (nursery), so that it can be collected rapidly and returned to use
>>>>> for further requests. If the nursery is too small, the per-request
>>>>> allocations will be made in tenured space and sit there until the next
>>>>> major GC. Cache evictions are almost always in long-term storage
>>>>> (tenured space) because an LRU algorithm guarantees that the garbage
>>>>> will be old.
>>>>>
>>>>> Check the growth rate of tenured space (under constant load, of course)
>>>>> while increasing the size of the nursery. That rate should drop when
>>>>> the nursery gets big enough, then not drop much further as it is
>>>>> increased more. After that, reduce the size of tenured space until
>>>>> major GCs start happening "too often" (a judgment call). A bigger
>>>>> tenured space means longer major GCs and thus longer pauses, so you
>>>>> don't want it oversized by too much.
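
The growth-rate check described above can be roughed out with the standard memory pool MXBeans. This is a sketch under a few assumptions: the class name TenuredGrowth is made up, and picking the pool by looking for "Old Gen" or "Tenured" in its name is a guess, since pool names differ by collector and JVM. A major GC between samples will make the number go negative, so read it over a window with no major collection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

// Hypothetical helper class for illustration only.
final class TenuredGrowth {

    /** Average growth in bytes per second between the first and last sample. */
    static double bytesPerSecond(long[] timesMillis, long[] usedBytes) {
        if (timesMillis.length != usedBytes.length || timesMillis.length < 2) {
            throw new IllegalArgumentException("need at least two matching samples");
        }
        int last = timesMillis.length - 1;
        long elapsedMillis = timesMillis[last] - timesMillis[0];
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("samples must span a positive interval");
        }
        return (usedBytes[last] - usedBytes[0]) * 1000.0 / elapsedMillis;
    }

    /** Returns the first heap pool whose name looks like tenured space, or null. */
    static MemoryPoolMXBean findTenuredPool() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: name matching is a heuristic, not a guaranteed convention.
            if (pool.getType() == MemoryType.HEAP
                    && (name.contains("Old Gen") || name.contains("Tenured"))) {
                return pool;
            }
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        MemoryPoolMXBean tenured = findTenuredPool();
        if (tenured == null) {
            System.err.println("no pool that looks like tenured space was found");
            return;
        }
        int samples = 6;
        long[] times = new long[samples];
        long[] used = new long[samples];
        for (int i = 0; i < samples; i++) {
            times[i] = System.currentTimeMillis();
            used[i] = tenured.getUsage().getUsed();
            if (i < samples - 1) {
                Thread.sleep(10000); // sample every 10 seconds
            }
        }
        System.out.println(tenured.getName() + " growth: "
                + bytesPerSecond(times, used) + " bytes/sec");
    }
}
```

As written this samples its own JVM, so for Solr the same readings would have to come from inside the servlet container or over remote JMX. Repeat the measurement at each nursery size under the same load and compare the rates.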
>>>>>
>>>>> Also check the hit rates of your caches. If the hit rate is low, say
>>>>> 20% or less, make that cache much bigger or set it to zero. Either one
>>>>> will reduce the number of cache evictions. If you have an HTTP cache in
>>>>> front of Solr, zero may be the right choice, since the HTTP cache is
>>>>> cherry-picking the easily cacheable requests.
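
The 20% rule of thumb above is simple arithmetic on the hit and lookup counters for each cache. Here is a small sketch of it; the class and method names are hypothetical, and the counts are whatever you read off the cache statistics yourself.

```java
// Hypothetical helper class for illustration only.
final class CacheHitRate {

    /** Threshold taken from the "say 20% or less" rule of thumb. */
    static final double LOW_HIT_RATE = 0.20;

    /** Fraction of lookups served from the cache; 0 when there were no lookups. */
    static double hitRate(long hits, long lookups) {
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    /** True when the cache saw traffic but its hit rate is at or below the threshold. */
    static boolean needsAttention(long hits, long lookups) {
        return lookups > 0 && hitRate(hits, lookups) <= LOW_HIT_RATE;
    }

    public static void main(String[] args) {
        // Made-up counts, only to show the calculation.
        long hits = 150;
        long lookups = 1000;
        System.out.println("hit rate " + hitRate(hits, lookups)
                + ", needs attention: " + needsAttention(hits, lookups));
    }
}
```

A cache flagged this way is a candidate for either a much larger size or a size of zero, as described above. Which one is right depends on whether something in front of Solr is already absorbing the easy hits.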
>>>>>
>>>>> Note that a commit nearly doubles the memory required, because you have
>>>>> two live Searcher objects with all their caches. Make sure you have
>>>>> headroom for a commit.
>>>>>
>>>>> If you want to test the tenured space usage, you must test with real
>>>>> world queries. Those are the only way to get accurate cache eviction
>>>>> rates.
>>>>>
>>>>> wunder
>>>>>
>>>>>
>>>>>           
>>     
>>>>         
>>>
>>>       
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>>
>>
>>
>>
>>     
>
>   


-- 
- Mark

http://www.lucidimagination.com



