nutch-dev mailing list archives

From "Alan Tanaman (JIRA)" <>
Subject [jira] Updated: (NUTCH-421) Allow predeterminate running order of index filters
Date Wed, 27 Dec 2006 15:11:22 GMT

Alan Tanaman updated NUTCH-421:

    Attachment: nutch-421.patch

> Allow predeterminate running order of index filters
> ---------------------------------------------------
>                 Key: NUTCH-421
>                 URL:
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 0.8.1
>         Environment: All
>            Reporter: Alan Tanaman
>            Priority: Minor
>         Attachments: nutch-421.patch
> I've tested a patch for org.apache.nutch.indexer.IndexingFilters, allowing the user to
> state in which order the indexing filters are to be run based on a new
> indexingfilter.order property. This is needed when a filter needs to rely on previously
> generated document fields as a source of input to generate further fields.
> As suggested elsewhere, I based this on the urlfilter.order functionality:
> <property>
>   <name>indexingfilter.order</name>
>   <value>org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter</value>
>   <description>The order by which index filters are applied.
>   If empty, all available index filters (as dictated by properties
>   plugin-includes and plugin-excludes above) are loaded and applied in system
>   defined order. If not empty, only named filters are loaded and applied
>   in given order. For example, if this property has value:
>   org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter
>   then BasicIndexingFilter is applied first, and MoreIndexingFilter second.
>   Filter ordering may affect the end result when one filter relies on
>   document fields generated by an earlier filter.
>   </description>
> </property>
> Patch will be attached to this issue by 29/12/06
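
For readers without the patch to hand, the lookup that the description implies can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in nutch-421.patch: the class OrderedFilters, its order method, and the choice to skip unknown names silently are all assumptions made for the example, and plain strings stand in for real filter instances.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Nutch): orders filters by a whitespace-separated property value. */
final class OrderedFilters {

    private OrderedFilters() {}

    /**
     * @param available     all loaded filters keyed by class name; the map's iteration
     *                      order stands in for the "system defined order"
     * @param orderProperty value of indexingfilter.order; may be null or blank
     * @return the filters to apply, in the order they should run
     */
    static <T> List<T> order(Map<String, T> available, String orderProperty) {
        // Empty property: apply every available filter in system-defined order.
        if (orderProperty == null || orderProperty.trim().isEmpty()) {
            return new ArrayList<>(available.values());
        }
        // Non-empty property: apply only the named filters, in the given order.
        List<T> ordered = new ArrayList<>();
        for (String name : orderProperty.trim().split("\\s+")) {
            T filter = available.get(name);
            if (filter != null) { // assumption: names with no loaded filter are skipped
                ordered.add(filter);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, String> available = new LinkedHashMap<>();
        available.put("org.apache.nutch.indexer.more.MoreIndexingFilter", "more");
        available.put("org.apache.nutch.indexer.basic.BasicIndexingFilter", "basic");

        String order = "org.apache.nutch.indexer.basic.BasicIndexingFilter "
                + "org.apache.nutch.indexer.more.MoreIndexingFilter";

        System.out.println(order(available, order)); // prints [basic, more]
    }
}
```

With the property set as in the example value, BasicIndexingFilter runs before MoreIndexingFilter even though it was loaded second, so a later filter can read fields that an earlier one has already added to the document.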

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators:
For more information on JIRA, see: