lucene-dev mailing list archives

From Jason Rutherglen <>
Subject Re: Sequence IDs for NRT deletes
Date Wed, 21 Jul 2010 21:41:11 GMT
> long[] is probably safe

Yeah it's safe for most things...

> short[]

That could be a much better option for minimizing RAM usage, as long as
we then implement wraparound.
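
Here is a rough sketch of what short[] sequence ids with wraparound might
look like. Everything in it is hypothetical (SeqIdDeletes, Snapshot, the
maxSeq parameter); none of it is existing Lucene API. It assumes 0 means
"live", that a reader only honors deletes stamped at or before its own
sequence id, and that on overflow the writer swaps in a fresh array where
older deletes are collapsed to id 1, while already-open readers keep the
old array. A short[] costs 16 bits per doc instead of 32 for int[].

```java
/** Hypothetical sketch of per-document delete sequence ids; not Lucene API. */
class SeqIdDeletes {
    private final int maxSeq;   // largest id a slot may hold (at most 0xFFFF)
    private short[] deleteSeq;  // 0 = live, otherwise id at which doc was deleted
    private int nextSeq = 1;    // next id to hand out

    SeqIdDeletes(int maxDoc, int maxSeq) {
        if (maxSeq < 2 || maxSeq > 0xFFFF) {
            throw new IllegalArgumentException("maxSeq must be in [2, 65535]");
        }
        this.maxSeq = maxSeq;
        this.deleteSeq = new short[maxDoc];
    }

    /** Stamps the doc with the next sequence id, rolling over first if needed. */
    void delete(int docId) {
        if (deleteSeq[docId] != 0) {
            return; // already deleted, keep the original stamp
        }
        if (nextSeq > maxSeq) {
            rollover();
        }
        deleteSeq[docId] = (short) nextSeq++;
    }

    /** Wraparound: new array, every earlier delete collapsed to id 1. */
    private void rollover() {
        short[] fresh = new short[deleteSeq.length];
        for (int i = 0; i < fresh.length; i++) {
            if (deleteSeq[i] != 0) {
                fresh[i] = 1;
            }
        }
        deleteSeq = fresh; // open snapshots still reference the old array
        nextSeq = 2;
    }

    /** A reopened reader sees every delete made so far and none made later. */
    Snapshot reopen() {
        return new Snapshot(deleteSeq, nextSeq - 1);
    }

    static final class Snapshot {
        private final short[] seqs;
        private final int readerSeq;

        Snapshot(short[] seqs, int readerSeq) {
            this.seqs = seqs;
            this.readerSeq = readerSeq;
        }

        /** One array read and an int compare, treating the short as unsigned. */
        boolean isDeleted(int docId) {
            int s = seqs[docId] & 0xFFFF;
            return s != 0 && s <= readerSeq;
        }
    }
}
```

The sketch is single-threaded. A real version would need to publish the
array swap safely to concurrent readers, and the cost of isDeleted would
still need to be benchmarked against the BV lookup.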

On Wed, Jul 21, 2010 at 3:12 AM, Michael McCandless
<> wrote:
> On Tue, Jul 20, 2010 at 4:21 PM, Jason Rutherglen
> <> wrote:
>>> Right, much less GC if app frequently reopens.  But a 32X increase in
>>> RAM usage is not trivial; I think we shouldn't enable it by default?
>> Right, the RAM usage is quite high!  Is there a more compact
>> representation we could use?  Ah well, either way for good RT
>> performance, there are some users who may want to use this option.
> Well, packed ints are more compact, but the decode cost would probably
> be catastrophic :)
> Maybe you could also use a smaller type (byte[], short[]) for sequence
> ids, but, you'd then have to handle wraparound/overflow.  (In fact
> even w/ int[] you have to handle wraparound?  long[] is probably safe
> :) )  EG on overflow, you'd have to allocate all new (zero'd) arrays
> for the next re-opened reader?
>>> Have you tested?
>> The test would be a basic benchmark of queries against BV vs. an int[]
>> of deletes?
> Yes, in a normal reader  (ie, not testing NRT -- just testing cost of
> applying deletes via int cmp instead of BV lookup).
> Mike
