lucene-solr-user mailing list archives

From "Fuad Efendi" <>
Subject RE: Too many open files
Date Sat, 24 Oct 2009 17:14:33 GMT

Hi Yonik,

I am still using pre-2.9 Lucene (taken from SOLR trunk two months ago).

2048 is the limit for documents, not for the array of pointers to documents. And
especially for the new "uninverted" SOLR features, plus non-tokenized stored
fields, we need 1GB to store only 1MB of a simple field (size of field: 1000

Maybe it would break... frankly, I started with 8GB, then for some reason I
set it to 2GB (a month ago), I don't remember why... I had hardware problems
and I didn't want to lose the RAM buffer frequently...
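For reference, the buffer size under discussion is set in solrconfig.xml; a minimal fragment (the 2000 value here just follows the "stay under 2048" advice later in this thread, and the surrounding element layout may differ between Solr versions):

```xml
<indexDefaults>
  <!-- Flush the in-memory index buffer to disk once it reaches this size.
       Keep it comfortably below 2048 MB (the int-addressing ceiling). -->
  <ramBufferSizeMB>2000</ramBufferSizeMB>
</indexDefaults>
```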

But again: why would it break? Because an "int" has only 2048M different values?!!

This is extremely strange. My understanding is that the "buffer" stores
processed data such as "term -> document_id" values, in _per_field_array(s!!!);
so 2048M is the _absolute_maximum_ only in the case where your SOLR schema
consists of a _single_tokenized_field_only_. What about 10 fields? What about
plain text stored with the document, term vectors, "uninverted" values??? What
is the reason for putting such a check in Lucene? Array overflow?
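The "int" guess above is essentially the right arithmetic: a Java int can address at most Integer.MAX_VALUE (2^31 - 1) bytes, which is just under 2048 MB, so any buffer indexed with int offsets cannot usefully grow past that. A self-contained sketch of the arithmetic (not Lucene code, just the bound it implies):

```java
public class RamBufferLimit {
    public static void main(String[] args) {
        // An in-memory buffer addressed by a Java int can span at most
        // Integer.MAX_VALUE bytes (2^31 - 1 = 2,147,483,647).
        long maxBytes = Integer.MAX_VALUE;

        // Integer division: how many whole megabytes fit under that ceiling.
        long maxWholeMb = maxBytes / (1024 * 1024);

        System.out.println(maxWholeMb); // prints 2047
    }
}
```

So 2048 MB is exactly the first size that no longer fits in int byte addressing, which is why the javadoc advises staying comfortably below it.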


> -----Original Message-----
> From: [] On Behalf Of Yonik
> Sent: October-24-09 12:27 PM
> To:
> Subject: Re: Too many open files
> On Sat, Oct 24, 2009 at 12:18 PM, Fuad Efendi <> wrote:
> >
> > Mark, I don't understand this; of course it is use case specific, I
> > haven't seen any terrible behaviour with 8GB
> If you had gone over 2GB of actual buffer *usage*, it would have
> broken...  Guaranteed.
> We've now added a check in Lucene 2.9.1 that will throw an exception
> if you try to go over 2048MB.
> And as the javadoc says, to be on the safe side, you probably
> shouldn't go too near 2048 - perhaps 2000MB is a good practical limit.
> -Yonik
