lucene-java-user mailing list archives

From "David Townsend" <>
Subject RE: indexing/searching a website
Date Thu, 27 Nov 2003 10:53:48 GMT
I would advise you to use the excellent articles listed here.

They contain some good examples, and by the end you should have a good
understanding of the major classes and their use.

-----Original Message-----
From: Michal S []
Sent: 27 November 2003 10:52
To: Lucene Users List
Subject: Re: indexing/searching a website

> Another option is to deploy your site and crawl it from the outside 
> (have a look at Nutch at sourceforge - or write your own using 
> HttpClient and some HTML parsing for hyperlinks).

I realize that it will be necessary to write an HTML parser or use an
existing one. I know that I need ... but I don't know what the whole
framework would look like (how to translate pages on the web server into
Lucene documents, how to index them, how to search them).
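
As a rough illustration of the first of those steps (turning a page into
something that can be indexed), here is a minimal sketch in plain Java. It
is not taken from Lucene or from any article: the class name PageToFields,
the field names url/title/contents, and the regex-based tag stripping are
all assumptions made for illustration, and a real HTML parser would be more
robust. Each map entry would become one field of a Lucene document; the
indexing and searching calls are left out because they depend on the Lucene
version in use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps one fetched HTML page to named text fields.
public class PageToFields {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    private static final Pattern TITLE =
        Pattern.compile("<title[^>]*>(.*?)</title>", FLAGS);
    private static final Pattern SCRIPT_OR_STYLE =
        Pattern.compile("<script[^>]*>.*?</script>|<style[^>]*>.*?</style>", FLAGS);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    private static final Pattern HREF =
        Pattern.compile("<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']*)[\"']", FLAGS);

    // Returns url, title and contents; the field names are an assumption.
    public static Map<String, String> toFields(String url, String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", url);

        Matcher title = TITLE.matcher(html);
        fields.put("title", title.find() ? collapse(title.group(1)) : "");

        // Drop script/style blocks and the title, then strip remaining tags.
        // Character entities such as &amp; are left undecoded in this sketch.
        String body = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        body = TITLE.matcher(body).replaceAll(" ");
        fields.put("contents", collapse(TAG.matcher(body).replaceAll(" ")));
        return fields;
    }

    // Collects href values of anchor tags so a crawler can follow them.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Collapses runs of whitespace into single spaces.
    private static String collapse(String s) {
        return s.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Home</title></head>"
            + "<body><p>Hello <a href=\"/about.jsp\">about</a></p></body></html>";
        System.out.println(toFields("http://localhost/index.jsp", html));
        System.out.println(extractLinks(html));
    }
}
```

The links returned by extractLinks could be fed back into a queue of pages
to fetch, which is the "crawl it from the outside" approach quoted above.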

The example on the Lucene home page is very simple and doesn't give me 
many answers.

> I would argue that content within the JSP is a bad thing given that you 
> want to index it - perhaps it makes more sense to put the content 
> somewhere easier to get at like a database?

You are absolutely right. But my client wants to edit the content as 
easily as possible (via Notepad or another text editor). If the content 
were in a database, it would be necessary to provide my client with some 
kind of application that would let him update the content. The project's 
budget is strictly limited, so I can't afford to allocate more 
developers to build a content editor.

Thanks for the reply.
