lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Commented] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
Date Sun, 01 May 2016 03:17:13 GMT


David Smiley commented on LUCENE-7258:

It appears to me that caching bitsets is a much easier task than almost any other cache I've
seen: there is no key, and most (half on average?) bitsets in the cache will be long enough
to be reused by some subsequent lookup.  RE knowing when an instance can be recycled: if
the cache worked in conjunction with the QueryCache, then on eviction the bitset could be put
into a bitset cache.  RE knowing when bitsets should be evicted, especially for a library:
make that configurable, with an option to disable it?  _My_ main concern with such a cache is
its overall code impact, i.e. how many places would be touched by it.  Perhaps a lot, but maybe
not too bad?  And of course, for what measurable benefit?  I imagine some of the GC cost of the
current situation can be addressed with GC tuning, say by raising the young generation size
via {{-Xmn}}.
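
To make the "no key" idea concrete, here is a minimal sketch of such a pool. It is only an illustration under stated assumptions: {{BitSetPool}}, {{acquire}}, {{release}} and {{maxPooled}} are hypothetical names, not Lucene APIs, and a plain {{long[]}} stands in for the backing words of a FixedBitSet. Any pooled array that is long enough satisfies a request, and a capacity of 0 disables pooling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical keyless pool of bitset backing arrays (not a Lucene class).
final class BitSetPool {
    private final int maxPooled;                 // 0 disables pooling entirely
    private final List<long[]> pool = new ArrayList<>();

    BitSetPool(int maxPooled) {
        this.maxPooled = maxPooled;
    }

    // Returns a zeroed array with room for at least numBits bits.
    synchronized long[] acquire(int numBits) {
        int wordsNeeded = (int) ((numBits + 63L) >>> 6);
        for (int i = 0; i < pool.size(); i++) {
            if (pool.get(i).length >= wordsNeeded) {
                // No key lookup: the first array that is long enough is reused.
                long[] words = pool.remove(i);
                Arrays.fill(words, 0L);
                return words;
            }
        }
        return new long[wordsNeeded];
    }

    // Hands an array back, e.g. when a query cache evicts the entry that used it.
    synchronized void release(long[] words) {
        if (pool.size() < maxPooled) {
            pool.add(words);
        }
        // Otherwise the array is dropped and left to the garbage collector.
    }

    synchronized int size() {
        return pool.size();
    }
}
```

Note that clearing a recycled array still costs a pass over it, so the sketch trades allocation for a fill. Whether that is a net win is exactly the "measurable benefit" question.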

> Tune DocIdSetBuilder allocation rate
> ------------------------------------
>                 Key: LUCENE-7258
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spatial
>            Reporter: Jeff Wartes
>         Attachments: LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch,
> LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but didn't actually
> reduce garbage generation for my Solr index.
> Since something like 40% of my garbage (by space) is now attributed to DocIdSetBuilder.growBuffer,
> I charted a few different allocation strategies to see if I could tune things more.
> See here: 
> The jump-then-flatline at the right would be where DocIdSetBuilder gives up and allocates
> a FixedBitSet for a 100M-doc index. (The 1M-doc index curve/cutoff looked similar.)
> Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is terrible from
> an allocation standpoint if you're doing a lot of expansions, and is especially terrible when
> used to build a short-lived data structure like this one.
> By the time it goes with the FBS, it's allocated around twice as much memory for the
> buffer as it would have needed for just the FBS.
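
The allocation cost of a small growth factor can be estimated with a short simulation. The sketch below is a simplified model, not the actual ArrayUtil.oversize or DocIdSetBuilder code: {{GrowthCost}} and {{totalAllocated}} are hypothetical names, and the model ignores alignment and element size. It sums the size of every intermediate array allocated while a buffer grows to a target capacity. With geometric growth, that sum approaches roughly nine times the final size for 1/8th growth, versus roughly two times for doubling.

```java
// Simplified model of cumulative allocation under different growth factors.
final class GrowthCost {

    // Total elements allocated across all arrays while growing from `initial`
    // to at least `target`. Each step grows by size >> shift:
    // shift = 3 models 1/8th growth, shift = 0 models doubling.
    static long totalAllocated(long initial, long target, int shift) {
        long size = initial;
        long total = size;                          // the first array counts too
        while (size < target) {
            long extra = Math.max(1, size >> shift); // always grow by at least 1
            size += extra;
            total += size;                          // every regrowth allocates a fresh array
        }
        return total;
    }

    public static void main(String[] args) {
        long target = 1_000_000;
        System.out.println("1/8th growth: " + totalAllocated(16, target, 3));
        System.out.println("doubling:     " + totalAllocated(16, target, 0));
    }
}
```

Every array except the last becomes garbage as soon as it is copied, which is why the growth factor matters so much for a short-lived structure.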
