lucene-solr-user mailing list archives

From Bill Au <bill.w...@gmail.com>
Subject Re: Solr and Garbage Collection
Date Sat, 03 Oct 2009 17:39:39 GMT
Sun has recently clarified the issue regarding "unsupported unless you pay"
for the G1 garbage collector. Here are the updated release notes for Java 6
update 14:
http://java.sun.com/javase/6/webnotes/6u14.html


G1 will be part of Java 7, fully supported without pay.  The version
included in Java 6 update 14 is a beta release.  Since it is beta, Sun does
not recommend using it unless you have a support contract because, as with
any beta software, there will be bugs.  Non-paying customers may very well
have to wait for the official version in Java 7 for bug fixes.

Here is more info on the G1 garbage collector:

http://java.sun.com/javase/technologies/hotspot/gc/g1_intro.jsp


Bill

On Sat, Oct 3, 2009 at 1:28 PM, Mark Miller <markrmiller@gmail.com> wrote:

> Another option of course, if you're using a recent version of Java 6:
>
> try out the beta-ish, unsupported unless you pay, G1 garbage collector.
> I've only recently started playing with it, but it's supposed to be much
> better than CMS. It's supposedly got much better throughput, it's much
> better at dealing with fragmentation issues (CMS is actually pretty bad
> with fragmentation come to find out), and overall it's just supposed to
> be a very nice leap ahead in GC. Haven't had a chance to play with it
> much myself, but it's supposed to be fantastic. A whole new approach to
> generational collection for Sun, and much closer to the "real time" GCs
> available from some other vendors.
>
> Mark Miller wrote:
> > siping liu wrote:
> >
> >> Hi,
> >>
> >> I read pretty much all posts on this thread (before and after this
> >> one). Looks like the main suggestion from you and others is to keep
> >> max heap size (-Xmx) as small as possible (as long as you don't see
> >> OOM exception). This brings more questions than answers (for me at
> >> least. I'm new to Solr).
> >>
> >>
> >>
> >> First, our environment and problem encountered: Solr 1.4 (nightly
> >> build, downloaded about 2 months ago), Sun JDK 1.6, Tomcat 5.5, running
> >> on Solaris (multi-cpu/cores). The cache setting is from the default
> >> solrconfig.xml (looks very small). At first we used minimum JAVA_OPTS
> >> and quickly ran into a problem similar to the one the original poster
> >> reported -- long pauses (seconds to minutes) under load test. jconsole
> >> showed that it pauses on GC. So more JAVA_OPTS got added:
> >> "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=8
> >> -XX:SurvivorRatio=2 -XX:NewSize=128m -XX:MaxNewSize=512m
> >> -XX:MaxGCPauseMillis=200", the thinking being that with multi-cpu/cores
> >> we can get GC over with as quickly as possible. With the new setup, it
> >> works fine until Tomcat reaches heap size, then it blocks and takes
> >> minutes on "full GC" to get more space from "tenure generation". We
> >> tried different Xmx (from very small to large), no difference in long
> >> GC time. We never run into OOM.
> >>
> >>
> > MaxGCPauseMillis doesn't work with UseConcMarkSweepGC - it's for use with
> > the Parallel collector. That also doesn't look like a good SurvivorRatio.
> >
> >>
> >>
> >> Questions:
> >>
> >> * In general various cachings are good for performance, we have more
> >> RAM to use and want to use more caching to boost performance, isn't
> >> your suggestion (of lowering heap limit) going against that?
> >>
> >>
> > Leaving RAM for the FileSystem cache is also very important. But you
> > should also have enough RAM for your Solr caches of course.
> >
> >> * Looks like Solr caching made its way into tenure-generation on heap,
> >> that's good. But why do they get GC'ed eventually?? I did a quick check
> >> of Solr code (Solr 1.3, not 1.4), and see a single instance of using
> >> WeakReference. Is that what is causing all this? This seems to suggest
> >> a design flaw in Solr's memory management strategy (or just my
> >> ignorance about Solr?). I mean, wouldn't this be the "right" way of
> >> doing it -- you allow the user to specify the cache size in
> >> solrconfig.xml, then the user can set up the heap limit in JAVA_OPTS
> >> accordingly, and no need to use WeakReference (BTW, why not
> >> SoftReference)??
> >>
> >>
> > Do you see concurrent mode failure when looking at your gc logs? i.e.:
> >
> > 174.445: [GC 174.446: [ParNew: 66408K->66408K(66416K), 0.0000618
> > secs]174.446: [CMS (concurrent mode failure): 161928K->162118K(175104K),
> > 4.0975124 secs] 228336K->162118K(241520K)
> >
> > That means you are still getting major collections with CMS, and you
> > don't want that. You might try kicking GC off earlier with something
> > like: -XX:CMSInitiatingOccupancyFraction=50
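
A minimal sketch for spotting such entries in a GC log, assuming the ParNew/CMS log layout shown in the sample above. The function name and the returned field names are made up for illustration and are not part of any JVM tool.

```python
import re

# Matches the CMS part of a log entry that reports a concurrent mode failure,
# e.g. "[CMS (concurrent mode failure): 161928K->162118K(175104K), 4.0975124 secs]".
# Assumption: the log uses the layout shown in the sample entry above.
CMF_PATTERN = re.compile(
    r"\(concurrent mode failure\):\s*"
    r"(\d+)K->(\d+)K\((\d+)K\),\s*"
    r"(\d+\.\d+)\s*secs"
)


def find_concurrent_mode_failures(log_text):
    """Return one dict per concurrent mode failure found in log_text."""
    # A single entry can wrap across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    failures = []
    for match in CMF_PATTERN.finditer(flat):
        before_kb, after_kb, capacity_kb = (int(g) for g in match.group(1, 2, 3))
        failures.append({
            "tenured_before_kb": before_kb,
            "tenured_after_kb": after_kb,
            "tenured_capacity_kb": capacity_kb,
            "pause_secs": float(match.group(4)),
        })
    return failures
```

Feeding it the contents of a GC log file gives a list you can count or sort by pause_secs. An empty list only means no entry matched this particular layout, so check the pattern against your own log format first.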
> >
> >> * Right now I have a single Tomcat hosting Solr and other applications.
> >> I guess now it's better to have Solr on its own Tomcat, given that it's
> >> tricky to adjust the java options.
> >>
> >>
> >>
> >> thanks.
> >>
> >>
> >>
> >>
> >>
> >>> From: wunder@wunderwood.org
> >>> To: solr-user@lucene.apache.org
> >>> Subject: RE: Solr and Garbage Collection
> >>> Date: Fri, 25 Sep 2009 09:51:29 -0700
> >>>
> >>> 30ms is not better or worse than 1s until you look at the service
> >>> requirements. For many applications, it is worth dedicating 10% of your
> >>> processing time to GC if that makes the worst-case pause short.
> >>>
> >>> On the other hand, my experience with the IBM JVM was that the maximum
> >>> query rate was 2-3X better with the concurrent generational GC compared
> >>> to any of their other GC algorithms, so we got the best throughput
> >>> along with the shortest pauses.
> >>>
> >>> Solr garbage generation (for queries) seems to have two major
> >>> components: per-request garbage and cache evictions. With a
> >>> generational collector, these two are handled by separate parts of the
> >>> collector. Per-request garbage should completely fit in the short-term
> >>> heap (nursery), so that it can be collected rapidly and returned to use
> >>> for further requests. If the nursery is too small, the per-request
> >>> allocations will be made in tenured space and sit there until the next
> >>> major GC. Cache evictions are almost always in long-term storage
> >>> (tenured space) because an LRU algorithm guarantees that the garbage
> >>> will be old.
> >>>
> >>> Check the growth rate of tenured space (under constant load, of course)
> >>> while increasing the size of the nursery. That rate should drop when
> >>> the nursery gets big enough, then not drop much further as it is
> >>> increased more.
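
One way to turn that procedure into numbers, as a rough sketch only. It assumes you have sampled tenured-space usage over time for each nursery size you tried (from jconsole or the GC log, for example). Both function names and the 10% tolerance are assumptions chosen for illustration.

```python
def tenured_growth_kb_per_sec(samples):
    """Average growth rate of tenured usage between the first and last sample.

    samples: list of (elapsed_seconds, tenured_used_kb) pairs taken under
    constant load, all between two major GCs.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (used1 - used0) / (t1 - t0)


def pick_nursery_size(rate_by_nursery_mb, tolerance=0.10):
    """Smallest nursery size beyond which the growth rate stops dropping much.

    rate_by_nursery_mb: dict mapping nursery size (MB) to the measured
    tenured growth rate for that size. The tolerance is an arbitrary cutoff:
    a relative drop smaller than this counts as "not much further".
    """
    sizes = sorted(rate_by_nursery_mb)
    for smaller, larger in zip(sizes, sizes[1:]):
        rate = rate_by_nursery_mb[smaller]
        if rate <= 0:
            return smaller  # tenured space is not growing at all
        drop = (rate - rate_by_nursery_mb[larger]) / rate
        if drop < tolerance:
            return smaller
    return sizes[-1]  # rate was still dropping at the largest size tested
```

If the largest size you tested comes back as the answer, the rate had not levelled off yet and it is worth testing a bigger nursery.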
> >>>
> >>> After that, reduce the size of tenured space until major GCs start
> >>> happening "too often" (a judgment call). A bigger tenured space means
> >>> longer major GCs and thus longer pauses, so you don't want it oversized
> >>> by too much.
> >>>
> >>> Also check the hit rates of your caches. If the hit rate is low, say
> >>> 20% or less, make that cache much bigger or set it to zero. Either one
> >>> will reduce the number of cache evictions. If you have an HTTP cache in
> >>> front of Solr, zero may be the right choice, since the HTTP cache is
> >>> cherry-picking the easily cacheable requests.
> >>>
> >>> Note that a commit nearly doubles the memory required, because you have
> >>> two live Searcher objects with all their caches. Make sure you have
> >>> headroom for a commit.
> >>>
> >>> If you want to test the tenured space usage, you must test with real
> >>> world queries. Those are the only way to get accurate cache eviction
> >>> rates.
> >>>
> >>> wunder
> >>>
> >>>
> >>
> >>
> >>
> >
> >
> >
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
>
