lucene-general mailing list archives

From galford23 <>
Subject Lucene or Nutch???
Date Sun, 17 Feb 2008 20:14:05 GMT

Hi all, 

I am new to Lucene and Nutch.
I am doing a project on an archiving web portal which allows individual users
to index documents (from the file system) and to crawl websites and RSS feeds for

Looking at the above requirements, I thought Lucene would be able to achieve this;
however, I found out that Lucene does not have a crawler to crawl URLs.

Then I came across Nutch, which looks perfect for my latter requirement of fetching
websites and RSS feeds. I also realised that Nutch allows me to
crawl my file system as well...

Well then, in my case, I guess I should be using the API from Nutch instead of
From another discussion on Nabble:

There is this advice to use Lucene to index the same index files that Nutch
has created. But I thought that Nutch uses a webdb to store the returned
crawl results? Anyway, from the thread mentioned above... why would one use
Lucene if Nutch can perform all the local file system and web indexing and
search functions?

Please correct my limited understanding.

Steven (Singapore)
