nutch-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Updated: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1
Date Wed, 04 Mar 2009 11:07:56 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated NUTCH-711:
------------------------------------

    Attachment: patch.txt

This patch instantiates IndexingFilters in IndexerOutputFormat, which fixes the issue.
If there are no objections, I will commit it shortly.

> Indexer failing after upgrade to Hadoop 0.19.1
> ----------------------------------------------
>
>                 Key: NUTCH-711
>                 URL: https://issues.apache.org/jira/browse/NUTCH-711
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>            Priority: Blocker
>             Fix For: 1.0.0
>
>         Attachments: patch.txt
>
>
> After the upgrade to Hadoop 0.19.1, the Reducer is initialized in a different order than before
> (see http://svn.apache.org/viewvc?view=rev&revision=736239). IndexingFilters populate the
> current JobConf with field options that are required for IndexerOutputFormat to function properly.
> However, the filters are instantiated in Reducer.configure(), which is now called after the
> OutputFormat is initialized, and not before as previously.
> The workaround for now is to instantiate IndexingFilters once again inside IndexerOutputFormat.
> This issue should be revisited before 1.1 in order to find a better solution.
> See this thread for more information: http://www.lucidimagination.com/search/document/7c62c625c7ea17fe/problem_with_crawling_using_the_latest_1_0_trunk
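
The ordering problem in the issue description can be illustrated with a minimal Java sketch. It does not use the real Hadoop or Nutch classes: Conf, Filters, Reducer, OutputFormat, and the "field.options" key are hypothetical stand-ins chosen for illustration, and the actual patch.txt may differ. The sketch only shows why a configuration side effect in Reducer.configure() becomes invisible once the output format is initialized first, and how instantiating the filters again inside the output format works around that.

```java
import java.util.HashMap;
import java.util.Map;

public class InitOrderSketch {

    /** Hypothetical stand-in for a job configuration (not Hadoop's JobConf). */
    static class Conf {
        private final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Stand-in for IndexingFilters: constructing it writes field options into the conf. */
    static class Filters {
        Filters(Conf conf) {
            // "field.options" is an invented key, used only for this sketch.
            conf.set("field.options", "title:stored,indexed");
        }
    }

    /** Stand-in for the reducer, whose configure() instantiates the filters. */
    static class Reducer {
        void configure(Conf conf) { new Filters(conf); }
    }

    /** Stand-in for the output format, which needs the field options at init time. */
    static class OutputFormat {
        private final boolean instantiateFilters;

        OutputFormat(boolean instantiateFilters) {
            this.instantiateFilters = instantiateFilters;
        }

        /** Returns the field options it sees, or "MISSING" if none were set. */
        String init(Conf conf) {
            if (instantiateFilters) {
                new Filters(conf); // workaround: populate the conf here as well
            }
            String options = conf.get("field.options");
            return options == null ? "MISSING" : options;
        }
    }

    /** Old order: reducer is configured before the output format is initialized. */
    public static String runOldOrder(boolean workaround) {
        Conf conf = new Conf();
        new Reducer().configure(conf);
        return new OutputFormat(workaround).init(conf);
    }

    /** New order: output format is initialized before the reducer is configured. */
    public static String runNewOrder(boolean workaround) {
        Conf conf = new Conf();
        String seen = new OutputFormat(workaround).init(conf);
        new Reducer().configure(conf);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("old order, no workaround:   " + runOldOrder(false));
        System.out.println("new order, no workaround:   " + runNewOrder(false));
        System.out.println("new order, with workaround: " + runNewOrder(true));
    }
}
```

In this model the filters run twice under the new order when the workaround is enabled, which matches the "once again" wording in the description and is harmless here because the constructor only writes the same value. Whether that duplication is acceptable in the real code is the question the issue defers to 1.1.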

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

