lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Sequence IDs for NRT deletes
Date Wed, 21 Jul 2010 10:12:46 GMT
On Tue, Jul 20, 2010 at 4:21 PM, Jason Rutherglen
<jason.rutherglen@gmail.com> wrote:
>> Right, much less GC if app frequently reopens.  But a 32X increase in
>> RAM usage is not trivial; I think we shouldn't enable it by default?
>
> Right, the RAM usage is quite high!  Is there a more compact
> representation we could use?  Ah well, either way for good RT
> performance, there are some users who may want to use this option.

Well, packed ints are more compact, but the decode cost would probably
be catastrophic :)
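
To make the decode cost concrete, here is a rough sketch of what reading one packed value involves. It is not Lucene's packed ints code; the class name PackedSeqIds and its layout (values packed back to back into a long[]) are assumptions made up for illustration. Every lookup pays a multiply, two shifts, a mask and sometimes a second array read, versus a single load from an int[].

```java
/** Hypothetical sketch: fixed-width values packed back to back into a long[]. */
final class PackedSeqIds {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    PackedSeqIds(int count, int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 63) {
            throw new IllegalArgumentException("bitsPerValue must be in 1..63");
        }
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) count * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
    }

    /** Decode: locate the block, shift, maybe stitch in the next block, mask. */
    long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // The value straddles two longs; pull the high bits from the next one.
            value |= blocks[block + 1] << bitsInFirst;
        }
        return value & mask;
    }

    /** Encode: clear the target bits, then OR in the new value. */
    void set(int index, long value) {
        value &= mask;
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst))
                    | (value >>> bitsInFirst);
        }
    }
}
```

Whether that extra work is really "catastrophic" per hit would have to be measured; the sketch only shows where the cost comes from.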

Maybe you could also use a smaller type (byte[], short[]) for sequence
ids, but you'd then have to handle wraparound/overflow.  (In fact,
even with int[] you have to handle wraparound?  long[] is probably safe
:) )  E.g., on overflow, you'd have to allocate all new (zeroed) arrays
for the next reopened reader?
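
Roughly what I have in mind for the overflow case, as a sketch only. The names (SeqIdDeletes, Snapshot, rollOver) are hypothetical and nothing here is existing Lucene API. I'm assuming 0 means "not deleted", that a reader opened at sequence id S treats a doc as deleted when 0 < id <= S, and that on overflow the existing deletes are carried into the fresh array as id 1 (a plain zeroed array would forget them). Readers opened earlier keep the old array. It also ignores cross-thread visibility for readers, which a real version would have to deal with.

```java
/** Hypothetical sketch: per-document delete sequence ids held in a short[]. */
final class SeqIdDeletes {
    // 0 means "live"; n > 0 means "deleted at sequence id n".
    private short[] delSeq;
    private short currentSeq = 0;

    SeqIdDeletes(int maxDoc) {
        delSeq = new short[maxDoc];
    }

    /** Records a delete and returns the sequence id assigned to it. */
    synchronized short delete(int docId) {
        if (currentSeq == Short.MAX_VALUE) {
            rollOver();
        }
        currentSeq++;
        if (delSeq[docId] == 0) {
            delSeq[docId] = currentSeq; // keep the earliest id if already deleted
        }
        return currentSeq;
    }

    /** On overflow: new array, old deletes collapsed to id 1, counter restarts. */
    private void rollOver() {
        short[] fresh = new short[delSeq.length];
        for (int i = 0; i < delSeq.length; i++) {
            if (delSeq[i] != 0) {
                fresh[i] = 1;
            }
        }
        delSeq = fresh; // readers opened earlier still hold the old array
        currentSeq = 1;
    }

    /** A reopened reader captures the current array and sequence id. */
    synchronized Snapshot openReader() {
        return new Snapshot(delSeq, currentSeq);
    }

    static final class Snapshot {
        private final short[] delSeq;
        private final short seq;

        Snapshot(short[] delSeq, short seq) {
            this.delSeq = delSeq;
            this.seq = seq;
        }

        /** One array read and two compares per document. */
        boolean isDeleted(int docId) {
            short s = delSeq[docId];
            return s != 0 && s <= seq;
        }
    }
}
```

With short[] the rollover happens every 32767 deletes, so the array copy is not rare; with int[] it is the same logic but about 2 billion deletes apart.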

>> Have you tested?
>
> The test would be a basic benchmark of queries against BV vs. an int[]
> of deletes?

Yes, in a normal reader (i.e., not testing NRT -- just testing the cost
of applying deletes via an int compare instead of a BV lookup).
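
Something along these lines would be a first cut, though it is only a crude standalone sketch: java.util.BitSet stands in for Lucene's BitVector, the class and method names are made up, and a hand-rolled nanoTime loop is no substitute for a proper harness or for running real queries against a real index.

```java
import java.util.BitSet;
import java.util.function.IntSupplier;

/** Hypothetical micro-benchmark: bit lookup vs. int compare for delete checks. */
final class DeleteCheckBench {
    static int sink; // keeps results reachable so the loops are not optimized away

    /** Counts live docs using one bit lookup per doc. */
    static int countLiveBitSet(BitSet deleted, int maxDoc) {
        int live = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (!deleted.get(doc)) {
                live++;
            }
        }
        return live;
    }

    /** Counts live docs using sequence ids: 0 = live, else deleted at that id. */
    static int countLiveSeqIds(int[] delSeq, int readerSeq) {
        int live = 0;
        for (int doc = 0; doc < delSeq.length; doc++) {
            int s = delSeq[doc];
            if (s == 0 || s > readerSeq) {
                live++;
            }
        }
        return live;
    }

    /** Runs the body several times and returns the fastest round in nanoseconds. */
    static long bestNanos(IntSupplier body, int rounds) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink += body.getAsInt();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        int maxDoc = 1_000_000;
        int readerSeq = 1;
        BitSet deleted = new BitSet(maxDoc);
        int[] delSeq = new int[maxDoc];
        for (int doc = 0; doc < maxDoc; doc += 10) { // delete 10% of the docs
            deleted.set(doc);
            delSeq[doc] = 1;
        }
        long bitSetNanos = bestNanos(() -> countLiveBitSet(deleted, maxDoc), 20);
        long seqIdNanos = bestNanos(() -> countLiveSeqIds(delSeq, readerSeq), 20);
        System.out.println("BitSet lookup:  " + bitSetNanos + " ns");
        System.out.println("int[] compare:  " + seqIdNanos + " ns");
    }
}
```

A sequential scan like this flatters both approaches; the real test is inside query scoring, where doc ids arrive sparsely and the 32X larger array may cost more in cache misses than the compare saves.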

Mike
