lucene-java-user mailing list archives

Subject Re: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)
Date Mon, 14 Dec 2020 23:35:59 GMT
Thanks Robert.

I think these valuable comments need to be placed on javadocs for future 

I think I now have enough information to make a decision:

I will use MMapDirectory without setPreload, and I hope my index will fit
into RAM.
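
For reference, here is a minimal sketch of the "open with MMapDirectory and run some sample queries" approach suggested in the quoted reply below. It assumes a Lucene 8.x-era API; the class name WarmUpSketch, the helper openAndWarm, the field name "body", and the sample terms are all hypothetical and would need to be replaced with queries that resemble real traffic.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.MMapDirectory;

public final class WarmUpSketch {

    /**
     * Opens the index through mmap and runs the given sample queries once.
     * setPreload(true) is deliberately NOT called, so only the pages the
     * queries actually read are pulled into the OS cache.
     */
    public static IndexSearcher openAndWarm(Path indexPath, List<Query> sampleQueries)
            throws IOException {
        MMapDirectory dir = new MMapDirectory(indexPath);
        DirectoryReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        for (Query q : sampleQueries) {
            // Results are discarded; the point is the side effect of reading.
            searcher.search(q, 10);
        }
        return searcher;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical field name and terms, for illustration only.
        List<Query> samples = Arrays.asList(
                new TermQuery(new Term("body", "lucene")),
                new TermQuery(new Term("body", "index")));
        IndexSearcher searcher = openAndWarm(Paths.get(args[0]), samples);
        System.out.println("Warmed; maxDoc=" + searcher.getIndexReader().maxDoc());
        searcher.getIndexReader().close();
    }
}
```

The caller keeps the returned searcher for real traffic and closes its reader when done. How well this warms the cache depends entirely on how representative the sample queries are.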

I plan to post a blog entry on my findings.

Best regards

On 12/14/20 5:52 PM, Robert Muir wrote:
> On Mon, Dec 14, 2020 at 1:59 PM Uwe Schindler wrote:
>> Hi,
>> as writer of the original blog post, here are my comments:
>> Yes, MMapDirectory.setPreload() is the feature mentioned in my blog post
>> to load everything into memory - but that does not guarantee anything!
>> Still, I would not recommend using that function, because all it does is
>> touch every page of the file, so the Linux kernel puts it into the OS cache
>> - nothing more; IMHO very ineffective, as it slows down opening the index
>> with a stupid for-each-page-touch loop. It will do this with EVERY page,
>> whether it is later used or not! So this may take some time until it is
>> done. Later on, Lucene still needs to open index files, initialize its own
>> data structures,...
>> In general it is much better to open the index with MMapDirectory and
>> execute some "sample" queries. This will do exactly the same as the preload
>> function, but it is more "selective". Parts of the index which are not used
>> won't be touched, and on top, it will also load ALL the required index
>> structures to heap.
> The main purpose of this thing is a fast warming option for random
> access files such as "i want to warm all my norms in RAM" or "i want
> to warm all my docvalues in RAM"... really it should only be used with
> the FileSwitchDirectory for a targeted purpose such as that: it is
> definitely a waste to set it for your entire index. It is just
> exposing the
> which first calls madvise(MADV_WILLNEED) and then touches every page.
> If you want to "warm" an ENTIRE very specific file for a reason like
> this (e.g. per-doc scoring value, ensuring it will be hot for all
> docs), it is hard to be more efficient than that.
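
To make the "touch every page" behaviour described above concrete, here is a rough sketch using plain java.nio rather than Lucene. It assumes the JDK's MappedByteBuffer.load() as the page-loading call; that is an assumption on my part, since the reference in the quoted message did not survive. The class name PageTouchSketch and the 1 GiB chunk size are likewise hypothetical.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public final class PageTouchSketch {

    // A single mapping is limited to Integer.MAX_VALUE bytes, so map in chunks.
    private static final long CHUNK = 1L << 30;

    /**
     * Maps the whole file read-only, chunk by chunk, and asks the JDK to
     * bring each mapping into physical memory (best effort, no guarantee).
     * Returns the number of bytes mapped.
     */
    public static long warm(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            long pos = 0;
            while (pos < size) {
                long len = Math.min(size - pos, CHUNK);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                buf.load(); // touches every page of this chunk, used later or not
                pos += len;
            }
            return size;
        }
    }
}
```

This loads every page of the file unconditionally, which is why it only pays off for a specific file that is expected to be hot for all documents, and not for an entire index.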
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
