lucene-java-user mailing list archives

From Kelvin Tan
Subject Re: <no-index> or <index>
Date Thu, 30 Jan 2003 12:42:07 GMT
My suggestion would be to modify HTMLParser to do the job. I don't think it's
very difficult. I'm unaware of any existing HTML parsers that support that.
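
If changing the parser itself is more than you want to take on, a rough
alternative is to strip the marked regions from the raw HTML before it ever
reaches the parser. The sketch below is only that, a sketch: the class name
NoIndexFilter and the method strip are made up for illustration, <no-index>
is the hypothetical marker tag from the question (not a standard HTML
element), and nothing here is part of Lucene or the demo HTMLParser. It
assumes the regions are properly closed and not nested.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: removes <no-index>...</no-index>
// regions from raw HTML so that whatever parser runs next never sees them.
final class NoIndexFilter {

    // Matches an opening <no-index> tag (attributes allowed), the shortest
    // possible run of content, and the closing tag. DOTALL lets the region
    // span several lines; CASE_INSENSITIVE also accepts <NO-INDEX>.
    private static final Pattern NO_INDEX = Pattern.compile(
            "<no-index\\b[^>]*>.*?</no-index\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    private NoIndexFilter() {
        // static helper only, no instances
    }

    // Returns the HTML with every closed <no-index> region replaced by a
    // single space, so the words on either side do not run together.
    // An unclosed <no-index> tag does not match and is left untouched.
    static String strip(String html) {
        return NO_INDEX.matcher(html).replaceAll(" ");
    }
}
```

You would call NoIndexFilter.strip(html) on the page source and hand the
result to the parser as usual. A regex is a blunt tool for HTML, so for
nested or unclosed markers the parser-level change is still the better fix.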



On Thu, 30 Jan 2003 10:56:50 +0100, Michael Wechner said:
>I am looking for an HTMLParser which skips text tagged by
><no-index> or something similar. This way I could exclude, for
>instance, a "global navigation section" within the HTML:
><no-index> International<br> Business<br> Science<br> ...
>It seems that the current demo/HTMLParser
>(...bin/faq/faqmanager.cgi?file=chapter.indexing&toc=faq#q11) is not
>capable of doing something like that.
>Any pointers are very welcome.
>Thanks a lot
