lucene-solr-user mailing list archives

From "Fuad Efendi" <f...@efendi.ca>
Subject RE: Too many open files
Date Sat, 24 Oct 2009 16:25:01 GMT
This JavaDoc is incorrect, especially for SOLR, when you store a raw
(non-tokenized, non-indexed) "text" value with a document (which almost
everyone does). Try to store 1,000,000 documents with a 1000-byte
non-tokenized field: you will need 1 GB just for this array.
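
A rough back-of-the-envelope check of the figures above, as a sketch only. The class and method names are hypothetical, and it counts raw stored bytes; it does not model how Lucene actually lays out its RAM buffer.

```java
public class StoredFieldEstimate {

    // Hypothetical helper: raw bytes needed for `docs` stored values of
    // `bytesPerDoc` bytes each. multiplyExact throws on long overflow
    // instead of silently wrapping.
    public static long totalBytes(long docs, long bytesPerDoc) {
        return Math.multiplyExact(docs, bytesPerDoc);
    }

    // True if every byte of the total can still be addressed with a
    // non-negative Java int (ceiling is Integer.MAX_VALUE = 2^31 - 1).
    public static boolean fitsInIntOffsets(long totalBytes) {
        return totalBytes <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // 1,000,000 documents with a 1000-byte stored field each.
        long total = totalBytes(1_000_000L, 1_000L);
        System.out.println("stored bytes: " + total);
        System.out.println("int ceiling:  " + Integer.MAX_VALUE);
        System.out.println("fits in int offsets: " + fitsInIntOffsets(total));
    }
}
```

With these numbers the total is about 1 GB, which is still below the int ceiling of just under 2048 MB, but only by roughly a factor of two.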


> -----Original Message-----
> From: Fuad Efendi [mailto:fuad@efendi.ca]
> Sent: October-24-09 12:10 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Too many open files
> 
> Thanks for pointing to it, but it is so obvious:
> 
> 1. "Buffer" is used as a RAM storage for index updates
> 2. "int" has 2^32 distinct values (non-negative range up to 2^31 - 1, about 2G)
> 3. We can have _up_to_ 2Gb of _Documents_ (stored as key->value pairs,
> inverted index)
> 
> With the 5 fields I have, I need 5 arrays (up to 2 GB each) to store
> inverted pointers, so there is no theoretical limit:
> 
> > Also, from the javadoc in IndexWriter:
> >
> >    * <p> <b>NOTE</b>: because IndexWriter uses
> >    * <code>int</code>s when managing its internal storage,
> >    * the absolute maximum value for this setting is somewhat
> >    * less than 2048 MB.  The precise limit depends on
> >    * various factors, such as how large your documents are,
> >    * how many fields have norms, etc., so it's best to set
> >    * this value comfortably under 2048.</p>
> 
> 
> 
> Note also, I use norms etc...
> 
> 



