lucene-dev mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: Sequence IDs for NRT deletes
Date Wed, 21 Jul 2010 21:41:11 GMT
> long[] is probably safe

Yeah it's safe for most things...

> short[]

That could be a much better option for minimizing RAM usage, as long as
we then implement wraparound.
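
To make the wraparound idea concrete, here is a rough sketch of how short[] sequence ids might work. It is only an illustration, not Lucene code: the class ShortSeqDeletes, its Snapshot type, and the maxSeq parameter are hypothetical names, and it assumes a single writer thread. A slot value of 0 means "not deleted". When the counter hits the maximum, the next delete allocates a fresh zeroed array (as Mike suggested) and carries existing deletes over at sequence id 1, while readers that are already open keep their old array.

```java
// Hypothetical sketch, not Lucene code. Assumes a single writer thread.
final class ShortSeqDeletes {

    /** Point-in-time view: pins the array and the sequence id it was opened at. */
    static final class Snapshot {
        private final short[] delSeq;
        private final short seq;

        private Snapshot(short[] delSeq, short seq) {
            this.delSeq = delSeq;
            this.seq = seq;
        }

        /** A doc is deleted for this view if it was deleted at or before seq. */
        boolean isDeleted(int doc) {
            short s = delSeq[doc];
            return s != 0 && s <= seq;
        }
    }

    private final int maxSeq;   // largest sequence id before we wrap
    private short[] delSeq;     // 0 = live, otherwise sequence id of the delete
    private int currentSeq = 0; // kept as int so the overflow check is simple

    ShortSeqDeletes(int maxDoc, int maxSeq) {
        if (maxSeq < 1 || maxSeq > Short.MAX_VALUE) {
            throw new IllegalArgumentException("maxSeq must be in [1, 32767]");
        }
        this.maxSeq = maxSeq;
        this.delSeq = new short[maxDoc];
    }

    /** Records a delete and returns the sequence id assigned to it. */
    int delete(int doc) {
        if (currentSeq == maxSeq) {
            rollover();
        }
        currentSeq++;
        if (delSeq[doc] == 0) {          // keep the earliest delete
            delSeq[doc] = (short) currentSeq;
        }
        return currentSeq;
    }

    /** Opens a view that sees every delete recorded so far, and none after. */
    Snapshot openReader() {
        return new Snapshot(delSeq, (short) currentSeq);
    }

    /** On overflow: new zeroed array, existing deletes collapsed to id 1. */
    private void rollover() {
        short[] fresh = new short[delSeq.length];
        for (int i = 0; i < delSeq.length; i++) {
            if (delSeq[i] != 0) {
                fresh[i] = 1;
            }
        }
        delSeq = fresh;
        currentSeq = 1;
    }
}
```

The trade-off is that a short[] costs 16 bits per document instead of 32 for an int[], at the price of one full array copy every time the counter wraps.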

On Wed, Jul 21, 2010 at 3:12 AM, Michael McCandless
<lucene@mikemccandless.com> wrote:
> On Tue, Jul 20, 2010 at 4:21 PM, Jason Rutherglen
> <jason.rutherglen@gmail.com> wrote:
>>> Right, much less GC if app frequently reopens.  But a 32X increase in
>>> RAM usage is not trivial; I think we shouldn't enable it by default?
>>
>> Right, the RAM usage is quite high!  Is there a more compact
>> representation we could use?  Ah well, either way for good RT
>> performance, there are some users who may want to use this option.
>
> Well, packed ints are more compact, but the decode cost would probably
> be catastrophic :)
>
> Maybe you could also use a smaller type (byte[], short[]) for sequence
> ids, but, you'd then have to handle wraparound/overflow.  (In fact
> even w/ int[] you have to handle wraparound?  long[] is probably safe
> :) )  EG on overflow, you'd have to allocate all new (zero'd) arrays
> for the next re-opened reader?
>
>>> Have you tested?
>>
>> The test would be a basic benchmark of queries against BV vs. an int[]
>> of deletes?
>
> Yes, in a normal reader  (ie, not testing NRT -- just testing cost of
> applying deletes via int cmp instead of BV lookup).
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>
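
For reference, the comparison Mike describes (checking deletes with an int compare instead of a bit lookup) could be sketched roughly as below. This is an assumption-laden illustration rather than the actual benchmark: java.util.BitSet stands in for Lucene's BitVector, and the class and method names are made up. Both methods count live documents so the two code paths can be checked against each other before timing them.

```java
import java.util.BitSet;

// Hypothetical sketch: BitSet is a stand-in for Lucene's BitVector.
final class DeleteCheckComparison {

    /** Bit lookup: one bit per doc, set means deleted. */
    static int countLiveBitSet(BitSet deleted, int maxDoc) {
        int live = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (!deleted.get(doc)) {
                live++;
            }
        }
        return live;
    }

    /**
     * Int compare: one int per doc holding the sequence id of the delete
     * (0 = never deleted). A doc is live for this reader if it was never
     * deleted or was deleted after the reader's sequence id.
     */
    static int countLiveSeq(int[] delSeq, int readerSeq) {
        int live = 0;
        for (int doc = 0; doc < delSeq.length; doc++) {
            int s = delSeq[doc];
            if (s == 0 || s > readerSeq) {
                live++;
            }
        }
        return live;
    }
}
```

The int[] uses 32 bits per document where the bit set uses 1, which is the 32X RAM increase discussed above. Timing these loops meaningfully would need warm-up and a proper harness such as JMH; the sketch only shows the two checks being compared.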
