nutch-dev mailing list archives

From "Frederic Ciminera" <ciminera.frede...@gmail.com>
Subject Issue with IndexSearcher initialization in NutchBean
Date Tue, 27 Nov 2007 17:10:16 GMT
Hi,



I'm using Nutch 0.9 with a Tomcat web server.



When Tomcat is launched the NutchBean is initialized and creates a new
IndexSearcher (org.apache.nutch.searcher.IndexSearcher) on the "index" path
defined in my searcher.dir configuration.



This IndexSearcher is then initialized and opens a Lucene IndexReader
(org.apache.lucene.index.IndexReader) on this index path:



/** Construct given a single merged index. */
public IndexSearcher(Path index,  Configuration conf)
    …
    init(IndexReader.open(getDirectory(index)), conf);
…



It also creates a Lucene IndexSearcher
(org.apache.lucene.search.IndexSearcher) with this IndexReader:



private void init(IndexReader reader, Configuration conf) throws IOException {
    this.reader = reader;
    this.luceneSearcher = new org.apache.lucene.search.IndexSearcher(reader);
    …



The issue is that the NutchBean is initialized only once, when the web server
is launched, so it uses the same IndexReader for every search.

This is a problem for me when I try to update the index while the server is
running.



For example, if I run the server without any index and then search using the
Nutch search form (http://localhost:8080/nutch-1.0-dev), I get no results
(no issue here).

But if I then generate a new index in the expected path without restarting
Tomcat, the search form still returns no results.

I need to restart Tomcat for the results to become visible.

Another issue is that once the IndexReader is opened it keeps a lock on the
index files, so they can't be deleted or modified while the server is running.



It seems that the Lucene IndexReader is working on a snapshot of the index.



A solution for this could be to initialize a new IndexSearcher (or
IndexReader) just before each search and close it once the search has
finished.
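
The open-per-search idea can be sketched as follows. This is only a minimal illustration, not Nutch or Lucene code: SearcherHandle, SearcherFactory, PerRequestSearch and countHits are hypothetical names, and the factory stands in for whatever opens a searcher on the index path. Checked I/O exceptions are left out to keep the sketch short.

```java
/** Hypothetical stand-in for an opened searcher; not a Nutch or Lucene type. */
interface SearcherHandle extends AutoCloseable {
    /** Returns the number of hits for the query in this handle's view of the index. */
    int countHits(String query);

    /** Releases the underlying resources (for example, open index files). */
    @Override
    void close();
}

/** Hypothetical factory; assumed to open a fresh view of the index on each call. */
interface SearcherFactory {
    SearcherHandle open();
}

/** Opens a new searcher for every query and closes it as soon as the query is done. */
final class PerRequestSearch {
    private final SearcherFactory factory;

    PerRequestSearch(SearcherFactory factory) {
        this.factory = factory;
    }

    int countHits(String query) {
        // try-with-resources closes the handle even if the search throws,
        // so no handle stays open between requests.
        try (SearcherHandle searcher = factory.open()) {
            return searcher.countHits(query);
        }
    }
}
```

With this shape, each search sees the index as it exists at that moment and nothing is held open between requests. Opening a reader for every query is likely to cost more than reusing one, so a variant that caches the searcher and reopens it only when the index has changed may be a better trade-off.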





Am I misunderstanding something or doing something wrong?

Thanks for your help,

Frederic
