lucene-java-user mailing list archives

From ash nix <>
Subject Re: high memory usage by indexreader
Date Wed, 20 Mar 2013 18:21:15 GMT
Thanks Ian.

The index contains 381,153,828 documents.
The data set is 1.9 TB.
The index for this data set is 290 GB. It is a single index.
The following fields are indexed for each document:

1. Document id: a StoredField, generally around 128 characters or more.
2. Text: a TextField, not stored.
3. Title: a TextField, not stored.
4. Anchor: a TextField, not stored.
5. Timestamp: a DoubleDocValue field, not stored. This should actually
be a DoubleField, and I need to fix it.

Initializing the IndexReader at the start of the search takes approximately
4 minutes. After initialization, I execute a series of Boolean AND queries
of 2-3 terms each. Each search result is written to an output file along
with its score and doc id.
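
To make the query shape concrete: a Boolean AND over several terms matches only the documents that appear in every term's posting list. The sketch below intersects two sorted doc-id arrays to illustrate that idea. It is a simplified illustration, not Lucene's actual implementation, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of conjunctive (AND) matching; not Lucene code.
final class ConjunctionSketch {

    /** Returns the doc ids present in both inputs; both must be sorted ascending. */
    static int[] intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {        // doc contains both terms
                out.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {  // advance whichever list is behind
                i++;
            } else {
                j++;
            }
        }
        int[] result = new int[out.size()];
        for (int k = 0; k < result.length; k++) {
            result[k] = out.get(k);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] hits = intersect(new int[] {1, 3, 5, 7}, new int[] {3, 4, 7, 9});
        System.out.println(java.util.Arrays.toString(hits)); // prints [3, 7]
    }
}
```

A 3-term query would intersect the result with the third term's list in the same way.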

The resident size (RES) of the process is 1.7 GB.
The total virtual memory (VIRT) is 307 GB.
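
One possible reading of these two figures, assuming the index is opened through a memory-mapped directory (the message does not say which Directory implementation is in use): mapped index files count toward virtual address space rather than resident or heap memory, so a VIRT value close to the index size would be expected. The plain-JDK sketch below shows the same mechanism with `FileChannel.map`. The class and method names are hypothetical, and this is not how Lucene itself is invoked.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of memory-mapped reads using only the JDK.
final class MappedReadSketch {

    /**
     * Maps the whole file read-only and sums its bytes. The file contents are
     * paged in by the OS on access and are never copied into a Java heap array.
     * A single mapping is limited to Integer.MAX_VALUE bytes.
     */
    static long sumMappedBytes(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.get() & 0xFF; // treat each byte as unsigned
            }
            return sum;
        }
    }

    /** Writes four bytes to a temporary file and sums them through a mapping. */
    static long demo() {
        try {
            Path tmp = Files.createTempFile("mapped-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            return sumMappedBytes(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 10
    }
}
```

Under that assumption, RES is the more meaningful number for how much physical memory the process is actually holding.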

Do you think this is okay?
Do you think I should use Solr instead of Lucene core?

I have timestamps for the documents, so I could split the data into
multiple indexes ordered chronologically.
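
A minimal sketch of that chronological split, assuming each document's timestamp is a double and the shard boundaries are chosen up front. The class name, method name, and boundary values are hypothetical. Each shard would then be written as its own index.

```java
// Hypothetical illustration of routing documents to time-ordered shards.
final class ChronoShardSketch {

    /**
     * Returns the shard index for a timestamp. upperBounds holds the exclusive
     * upper bound of every shard except the last, in ascending order, so
     * n bounds define n + 1 shards.
     */
    static int shardFor(double timestamp, double[] upperBounds) {
        for (int i = 0; i < upperBounds.length; i++) {
            if (timestamp < upperBounds[i]) {
                return i;
            }
        }
        return upperBounds.length; // newer than every bound: last shard
    }

    public static void main(String[] args) {
        double[] bounds = {100.0, 200.0}; // three shards: <100, [100,200), >=200
        System.out.println(shardFor(50.0, bounds));  // prints 0
        System.out.println(shardFor(100.0, bounds)); // prints 1
        System.out.println(shardFor(250.0, bounds)); // prints 2
    }
}
```

At search time the per-shard readers could be searched separately, or combined behind a single reader (Lucene provides `MultiReader` for that purpose). Whether this helps depends on where the time is actually going.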


On Wed, Mar 20, 2013 at 1:43 PM, Ian Lea <> wrote:
> Searching doesn't usually use that much memory, even on large indexes.
> What version of lucene are you on?  How many docs in the index?  What
> does a slow query look like (q.toString()) and what search method are
> you calling?  Anything else relevant you forgot to tell us?
> Or google "lucene sharding" if you are determined to split the index.
> --
> Ian.
> On Wed, Mar 20, 2013 at 5:12 PM, ash nix <> wrote:
>> Hi Everybody,
>> I have created a single compound index which is 250 GB in size.
>> I open a single index reader to run simple boolean queries.
>> The process is consuming a lot of memory, and search is painfully slow.
>> It seems that I will have to create multiple indexes and have multiple
>> index readers.
>> Can anyone suggest a good blog or documentation on creating multiple
>> indexes and performing parallel search?
>> --
>> Thanks,
>> A
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:

