lucene-java-user mailing list archives

From Terence Lai <t...@trekspace.com>
Subject RE: Re: OutOfMemoryError
Date Wed, 18 Aug 2004 14:58:35 GMT
Hi Otis,

The reason I ran into this problem is that I partition my search documents into multiple
index directories ordered by document modified date. My application only returns the latest
500 documents that match the criteria. By partitioning the documents into different directories,
we get a huge performance gain. Suppose I have the following partitions:

- partition 1 (earliest documents are in this partition)
- partition 2
- partition 3
- partition 4 (latest documents are in this partition)

If I only need the latest 500 documents, I start searching from partition 4. If I get
500 matching documents, I don't need to search the remaining partitions. Otherwise, I
perform another search on partition 3, and so forth, until I have 500 documents or have gone
through all the partitions. I can also make use of the MultiSearcher and ParallelMultiSearcher in
my search.
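
The newest-first search with an early stop described above can be sketched roughly as follows. This is only an illustration: `PartitionSearcher` and `NewestFirstSearch` are hypothetical names, and the interface stands in for whatever wraps an IndexSearcher per partition. It assumes the partition list is ordered oldest first, as in the list above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one partition's searcher (e.g. a wrapper around an IndexSearcher). */
interface PartitionSearcher {
    /** Returns up to maxHits matching document ids, newest first. */
    List<String> search(String query, int maxHits);
}

final class NewestFirstSearch {
    /**
     * Searches partitions from newest to oldest and stops as soon as
     * `limit` hits have been collected. `partitions` is assumed to be
     * ordered oldest first.
     */
    static List<String> latest(List<PartitionSearcher> partitions, String query, int limit) {
        List<String> results = new ArrayList<>();
        for (int i = partitions.size() - 1; i >= 0 && results.size() < limit; i--) {
            int remaining = limit - results.size();
            List<String> hits = partitions.get(i).search(query, remaining);
            // Defensive cap in case an implementation returns more than requested.
            results.addAll(hits.subList(0, Math.min(hits.size(), remaining)));
        }
        return results;
    }
}
```

Older partitions are never touched when the newest ones already supply enough hits, which is where the performance gain comes from.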

Now, the problems I have if I keep the IndexSearcher open are the following:

1) As the number of documents increases, the number of index partition directories will
also increase, since I set an upper limit on the number of documents in each partition. When a
partition reaches the limit, I create a new partition. As the number of IndexSearchers increases,
the application will eventually run out of memory if I cannot close the IndexSearchers and
release the memory.
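
One possible way to bound the number of open searchers is a small least-recently-used cache that closes whatever it evicts. The sketch below is an assumption, not something from Lucene: `SearcherCache` and `Opener` are hypothetical names, and the cache is written against `java.io.Closeable` so that any searcher wrapper with a `close()` method could be plugged in.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Keeps at most maxOpen resources open; closes the least recently used one on overflow. */
final class SearcherCache<S extends Closeable> {
    /** Hypothetical factory that opens a searcher for a partition directory. */
    interface Opener<S> {
        S open(String partitionDir) throws IOException;
    }

    private final Opener<S> opener;
    private final LinkedHashMap<String, S> open;

    SearcherCache(final int maxOpen, Opener<S> opener) {
        this.opener = opener;
        // accessOrder=true keeps entries ordered least recently used first.
        this.open = new LinkedHashMap<String, S>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
                if (size() <= maxOpen) {
                    return false;
                }
                try {
                    eldest.getValue().close(); // release the evicted searcher's resources
                } catch (IOException e) {
                    // Sketch only: a real application would log this.
                }
                return true;
            }
        };
    }

    /** Returns the cached searcher for the partition, opening it on first use. */
    synchronized S get(String partitionDir) throws IOException {
        S searcher = open.get(partitionDir);
        if (searcher == null) {
            searcher = opener.open(partitionDir);
            open.put(partitionDir, searcher);
        }
        return searcher;
    }
}
```

Note that this sketch closes an evicted searcher immediately, so it is only safe if no other thread is still running a search on it; a multi-threaded application would need some form of reference counting on top.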

2) I have a background process that updates the index files. If I keep the IndexSearcher open,
I am not sure whether it will pick up the changes from the index updates done by the background
process.
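
A common pattern for this second concern is to keep one searcher open and reopen it only when the index has changed on disk. The sketch below assumes the index can report some change marker (a version number or last-modified time); `RefreshingSearcher` and its `Index` interface are hypothetical names, and how the marker is obtained from Lucene is deliberately left abstract.

```java
import java.io.Closeable;
import java.io.IOException;

/** Reuses one searcher and reopens it only when the index's change marker differs. */
final class RefreshingSearcher<S extends Closeable> {
    /** Hypothetical hooks for checking the index and opening a searcher on it. */
    interface Index<S> {
        /** Any value that changes whenever the index is updated. */
        long currentVersion() throws IOException;
        S open() throws IOException;
    }

    private final Index<S> index;
    private S searcher;
    private long openedVersion;

    RefreshingSearcher(Index<S> index) {
        this.index = index;
    }

    /** Returns a searcher that reflects the latest committed index changes. */
    synchronized S get() throws IOException {
        long version = index.currentVersion();
        if (searcher == null || version != openedVersion) {
            // Open the replacement first so a failure leaves the old searcher usable.
            S fresh = index.open();
            if (searcher != null) {
                searcher.close();
            }
            searcher = fresh;
            openedVersion = version;
        }
        return searcher;
    }
}
```

With this approach the cost of opening a searcher is paid once per index update rather than once per query.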

Any idea how I can work around this problem?

Thanks,
Terence
> Reuse your IndexSearcher! :)
> 
> Also, I think somebody has written some EJB stuff to work with Lucene. 
> The project is on SF.net.
> 
> Otis
> 
> --- Terence Lai <tlai@trekspace.com> wrote:
> 
> > Hi All,
> > 
> > I am getting an OutOfMemoryError when I deploy my EJB application. To
> > debug the problem, I wrote the following test program:
> > 
> >     public static void main(String[] args) {
> >         try {
> >             Query query = getQuery();
> > 
> >             for (int i=0; i<1000; i++) {
> >                 search(query);
> >                 
> >                 if ( i%50 == 0 ) {
> >                     System.out.println("Sleep...");
> >                     Thread.sleep(5000);
> >                     System.out.println("Wake up!");
> >                 }
> >             }            
> >         } catch (Exception e) {
> >             e.printStackTrace();
> >         }
> >     }
> > 
> >     private static void search(Query query) throws IOException {
> >         FSDirectory fsDir = null;
> >         IndexSearcher is = null;
> >         Hits hits = null;
> >         
> >         try {
> >             fsDir = FSDirectory.getDirectory("C:\\index", false);
> >             is = new IndexSearcher(fsDir);
> >             SortField sortField = new SortField("profile_modify_date",
> >                 SortField.STRING, true);
> > 
> >             hits = is.search(query, new Sort(sortField));
> >         } finally {
> >             if (is != null) {
> >                 try {
> >                     is.close();
> >                 } catch (Exception ex) {
> >                 }
> >             }
> >             
> >             if (fsDir != null) {
> >                 try {
> >                     fsDir.close();
> >                 } catch (Exception ex) {
> >                 }
> >             }
> >         }
> >         
> >     }
> > 
> > In the test program, I wrote a loop to keep calling the search
> > method. Every time it enters the search method, I instantiate
> > the IndexSearcher. Before I exit the method, I close the
> > IndexSearcher and FSDirectory. I also make the thread sleep for 5
> > seconds after every 50 searches. Hopefully, this gives the JVM
> > some time to do garbage collection. Unfortunately, when I observe
> > the memory usage of my process, it keeps increasing until I get the
> > java.lang.OutOfMemoryError.
> > 
> > Note that I invoke IndexSearcher.search(Query query, Sort sort)
> > to process the search. If I don't specify the Sort field (i.e. using
> > IndexSearcher.search(query)), I don't have this problem, and the
> > memory usage stays at a very stable level.
> > 
> > Has anyone experienced a similar problem? Did I do something wrong in
> > the test program? I thought that by closing the IndexSearcher and the
> > FSDirectory, the memory would be released during garbage
> > collection.
> > 
> > Thanks,
> > Terence
> > 
> > 
> > 
> > 
> > ----------------------------------------------------------
> > Get your free email account from http://www.trekspace.com
> >           Your Internet Virtual Desktop!
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org