nutch-dev mailing list archives

From "Gal Nitzan" <>
Subject RE: 'RegexIndexingFilter'
Date Mon, 29 Jan 2007 21:18:54 GMT

Since the plug-in you are about to write is actually one filter in a chain of
filters, all you need to do is throw an exception from your implementation of
the filter interface, like so: throw new IndexingException("Doesn't comply to regex blah. Do not
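
To make the idea concrete, here is a minimal, self-contained sketch of a filter
that rejects documents by throwing an exception. It does not use the real Nutch
classes: IndexingException and SimpleIndexingFilter below are simplified
stand-ins written for this example (the actual Nutch interface takes more
arguments), and the regex and URLs are made up.

```java
import java.util.regex.Pattern;

// Stand-in for Nutch's IndexingException so this sketch compiles on its own.
class IndexingException extends Exception {
    IndexingException(String message) {
        super(message);
    }
}

// Hypothetical, simplified filter interface (an assumption for illustration;
// the real Nutch filter method has a different, longer signature).
interface SimpleIndexingFilter {
    String filter(String doc, String url) throws IndexingException;
}

// Passes a document through only if its URL matches the configured regex.
class RegexIndexingFilter implements SimpleIndexingFilter {
    private final Pattern allowed;

    RegexIndexingFilter(String regex) {
        this.allowed = Pattern.compile(regex);
    }

    @Override
    public String filter(String doc, String url) throws IndexingException {
        // find() looks for the pattern anywhere in the URL.
        if (!allowed.matcher(url).find()) {
            // Throwing signals the caller that this document should be skipped.
            throw new IndexingException(
                "URL does not match " + allowed.pattern() + ": " + url);
        }
        return doc;
    }
}

class RegexIndexingFilterDemo {
    // Mimics a caller that skips a document when a filter throws.
    static boolean isIndexed(SimpleIndexingFilter filter, String url) {
        try {
            filter.filter("document body", url);
            return true;
        } catch (IndexingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        SimpleIndexingFilter filter = new RegexIndexingFilter("\\.(pdf|html?)$");
        System.out.println(isIndexed(filter, "http://example.com/report.pdf"));
        System.out.println(isIndexed(filter, "http://example.com/photo.jpg"));
    }
}
```

In a real plug-in the same check would live inside the filter method that Nutch
calls for each document; how the indexer reacts to the exception depends on the
Nutch version in use.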



-----Original Message-----
From: Tobias Zahn
Sent: Monday, January 29, 2007 8:58 PM
Subject: 'RegexIndexingFilter'

Good evening!
I have found that it is not possible to index only specific file
types with Nutch. Since I need this feature, I thought of implementing a
'RegexIndexingFilter', if that would be the right thing to do.
I have read some of the source code, but I couldn't find out how to tell the
indexer that it shouldn't index a file.

I hope I am on the right track, and I would appreciate your opinions, ideas and
help.

Tobias Zahn
