lucene-java-user mailing list archives

From Benson Margulies <ben...@basistech.com>
Subject Re: Exploiting a whole lot of memory
Date Tue, 08 Oct 2013 23:43:23 GMT
Oh, drat, I left out an 's'. I got it now.


On Tue, Oct 8, 2013 at 7:40 PM, Benson Margulies <benson@basistech.com> wrote:

> Mike, where do I find DirectPostingFormat?
>
>
> On Tue, Oct 8, 2013 at 5:50 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
>
>> DirectPostingsFormat?
>>
>> It stores all terms + postings as simple java arrays, uncompressed.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Oct 8, 2013 at 5:45 PM, Benson Margulies <benson@basistech.com> wrote:
>> > Consider a Lucene index consisting of 10m documents with a total disk
>> > footprint of 3G. Consider an application that treats this index as
>> > read-only, and runs very complex queries over it: queries with many
>> > terms, some of them 'fuzzy' and 'should' terms, and a dismax. And,
>> > finally, consider doing all this on a box with over 100G of physical
>> > memory, some cores, and nothing else to do with its time.
>> >
>> > I should probably just stop here and see what thoughts come back, but
>> > I'll go out on a limb and type the word 'codec'. The MMapDirectory, of
>> > course, cheerfully gets to keep every single bit in memory. And then
>> > each query runs, exercising the codec, building up a flurry of Java
>> > objects, all of which turn into garbage, and we start all over. So, I
>> > find myself wondering, is there some sort of an opportunity for a
>> > codec-that-caches in here? In other words, I'd like to sell some of my
>> > space to buy some time.
>>
>
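
To make the space-for-time trade in this thread concrete, here is a minimal sketch in plain Java. It is not Lucene code and does not reflect how DirectPostingsFormat is actually implemented; the class name DirectPostingsSketch, the encode/decode helpers, and the sample term are all made up for illustration. It contrasts a delta-plus-variable-byte encoded postings list, which has to be decoded on every read, with the same list decoded once and kept as an uncompressed int[].

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a toy model of compressed vs. uncompressed postings.
public class DirectPostingsSketch {

    // Compressed form: store the gaps between sorted doc IDs,
    // each gap written as a variable-length sequence of 7-bit groups.
    public static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int doc : sortedDocIds) {
            int gap = doc - prev;
            prev = doc;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80); // high bit set: more bytes follow
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    // Decoding cost that a compressed format pays each time the list is read.
    public static int[] decode(byte[] bytes, int count) {
        int[] docs = new int[count];
        int pos = 0;
        int prev = 0;
        for (int i = 0; i < count; i++) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += gap;
            docs[i] = prev;
        }
        return docs;
    }

    public static void main(String[] args) {
        int[] docs = {3, 17, 18, 1000, 70000};
        byte[] packed = encode(docs);

        // "Direct" form: decode once, then keep plain arrays keyed by term.
        Map<String, int[]> direct = new HashMap<>();
        direct.put("lucene", decode(packed, docs.length));

        System.out.println("compressed bytes:   " + packed.length);
        System.out.println("uncompressed bytes: " + docs.length * Integer.BYTES);
        System.out.println("docs for 'lucene':  " + direct.get("lucene").length);
    }
}
```

Every later lookup in the direct map is a plain array read with no decoding and no short-lived objects, at the price of more heap per term. That is the same kind of bargain the original question asks about.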
