lucene-java-user mailing list archives

From Paul Libbrecht <>
Subject Re: a complete solution for building a website search with lucene
Date Fri, 08 Jan 2010 08:27:37 GMT

Lucene is a back-end library. It is very useful for developers, but it
is not a complete site search engine.
Nutch is a Lucene-based site search engine, and it does crawl.
Solr also provides functions close to these, with a lot of thought put
into flexible integration; rather than crawling, it acquires content
through feeds or other import methods (see DIH, the DataImportHandler,
for example).
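
To make the division of labour concrete, here is a minimal sketch in
Python. It is not Lucene, Nutch, or Solr code: the names PAGES,
extract_text, build_index and search are hypothetical, and the toy
inverted index only stands in for what Lucene does far more thoroughly.
It assumes a crawler has already fetched the pages; the application
then extracts the text and hands it to the indexing back end.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects visible text from an HTML page, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth of enclosing <script>/<style> elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def extract_text(html):
    """Return the visible text of one HTML document."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def build_index(pages):
    """Map each lowercased term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for term in re.findall(r"[a-z0-9]+", extract_text(html).lower()):
            index[term].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every term of the query (AND semantics)."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical pages standing in for what a crawler would have fetched.
PAGES = {
    "/index.html": "<html><body><h1>Welcome</h1>"
                   "<p>Lucene search demo</p></body></html>",
    "/about.html": "<html><body><p>About this demo site</p>"
                   "<script>var search = 1;</script></body></html>",
}

INDEX = build_index(PAGES)
```

Calling search(INDEX, "demo site") returns only the URLs whose visible
text contains both words. Everything before build_index (fetching,
following links, extracting text) is the part that a crawler such as
Nutch takes care of and that a bare indexing library leaves to you.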


On 08 Jan 2010, at 08:08, <> wrote:

> Hi,
> I am new to Lucene.
> To build a web search function, I need a back-end indexing
> function. But before that, should I run a crawler? Because Lucene
> indexes HTML documents, while a crawler can turn the website's
> pages into HTML documents. Am I right?
> If so, could anyone suggest a crawler, such as Nutch?
> Thanks
> Zhou
