lucene-java-user mailing list archives

From Michael Wechner <michael.wech...@wyona.org>
Subject <no-index> or <index>
Date Thu, 30 Jan 2003 09:56:50 GMT
Hi

I am looking for an HTMLParser that skips text enclosed in <no-index>
tags, or something similar. That way I could exclude, for instance, a
"global navigation section" within the HTML from the index:

<no-index>
International<br>
Business<br>
Science<br>
...
</no-index>

It seems that the current demo/HTMLParser 
(http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.indexing&toc=faq#q11)
is not capable of doing something like that.
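
A minimal sketch of the behaviour described above, assuming the HTML is
available as a String and is pre-filtered before it reaches any parser.
The names NoIndexStripper and strip are hypothetical; they are not part of
Lucene or of the demo HTMLParser. The sketch uses a plain regular
expression rather than a real HTML parser, so it does not handle nested
<no-index> regions, and an unclosed <no-index> tag is left untouched.

```java
import java.util.regex.Pattern;

// Hypothetical helper: removes <no-index>...</no-index> regions from an
// HTML string so that their text never reaches the indexer.
public class NoIndexStripper {

    // Matches an opening <no-index> tag, everything up to the nearest
    // closing </no-index> tag, and that closing tag. DOTALL lets '.'
    // span line breaks; CASE_INSENSITIVE accepts <NO-INDEX> as well.
    private static final Pattern NO_INDEX = Pattern.compile(
            "<no-index\\b[^>]*>.*?</no-index\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the HTML with every <no-index> region replaced by a space. */
    public static String strip(String html) {
        return NO_INDEX.matcher(html).replaceAll(" ");
    }

    public static void main(String[] args) {
        String html = "<html><body>"
                + "<no-index>International<br>Business<br>Science<br></no-index>"
                + "<p>Article text that should be indexed.</p>"
                + "</body></html>";
        // The navigation block is dropped; the article paragraph remains.
        System.out.println(strip(html));
    }
}
```

The filtered string could then be handed to whatever HTML parser extracts
the text for indexing. Each region is replaced by a single space so that
the words on either side of a removed block do not run together.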

Any pointers are very welcome.

Thanks a lot

Michael

