nutch-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Updated: (NUTCH-634) Patch - Nutch - Hadoop 0.17.1
Date Sat, 19 Jul 2008 07:36:31 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated NUTCH-634:
------------------------------------

    Attachment: hadoop-0.17.patch

Patch to upgrade to Hadoop 0.17.1. This builds upon the previous patches, but it also replaces
many uses of deprecated APIs. In addition, it uses the workaround discussed previously instead
of specialized InputFormat implementations.

> Patch - Nutch - Hadoop 0.17.1
> -----------------------------
>
>                 Key: NUTCH-634
>                 URL: https://issues.apache.org/jira/browse/NUTCH-634
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Michael Gottesman
>            Assignee: Andrzej Bialecki 
>             Fix For: 1.0.0
>
>         Attachments: diff, hadoop-0.17.patch, hadoop-0.17.patch, hadoop-0.17.patch
>
>
> This is a patch so that Nutch can be used with Hadoop 0.17.0. The patch is located at
> http://pastie.org/212001
> The patch compiles and passes all current Nutch unit tests.
> I have tested that the crawler side of Nutch (i.e. inject, generate, fetch, parse, merge
> w/crawldb) definitely works, but I have not tested the Lucene indexing part. It might work,
> but it might not.
> *NOTE* - the two main bugs that had to be overcome were not noticed by any of the unit
> tests. The bugs only came up during actual testing. The bugs were:
> 1. Changes to the Hadoop Iterator
> 2. Addition of Serialization to MapReduce Framework
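
The description above does not say what changed in the Hadoop iterator. One plausible reading,
offered here only as an assumption, is that the values iterator passed to a reducer started
handing back the same object instance on every call to next(), so code that stores references
instead of copies ends up with repeated data. The sketch below illustrates that pitfall in plain
Java without any Hadoop dependency; the class ReusedValueDemo, the Box type, and all method names
are hypothetical and do not come from Nutch or Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReusedValueDemo {

    /** Hypothetical mutable value type, standing in for a Writable-style object. */
    public static final class Box {
        String text;
        Box(String text) { this.text = text; }
        Box copy() { return new Box(text); }
    }

    /** Assumed behaviour: the iterator returns one shared Box, overwritten on each next(). */
    public static Iterator<Box> reusingIterator(List<String> data) {
        final Iterator<String> source = data.iterator();
        final Box shared = new Box(null);
        return new Iterator<Box>() {
            public boolean hasNext() { return source.hasNext(); }
            public Box next() { shared.text = source.next(); return shared; }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    /** Keeps references only, so every stored element is the same shared Box. */
    public static List<String> collectByReference(Iterator<Box> values) {
        List<Box> kept = new ArrayList<Box>();
        while (values.hasNext()) {
            kept.add(values.next());
        }
        return texts(kept);
    }

    /** Copies each value before storing it, so later calls to next() cannot change it. */
    public static List<String> collectByCopy(Iterator<Box> values) {
        List<Box> kept = new ArrayList<Box>();
        while (values.hasNext()) {
            kept.add(values.next().copy());
        }
        return texts(kept);
    }

    private static List<String> texts(List<Box> boxes) {
        List<String> out = new ArrayList<String>();
        for (Box b : boxes) {
            out.add(b.text);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        System.out.println(collectByReference(reusingIterator(data))); // prints [c, c, c]
        System.out.println(collectByCopy(reusingIterator(data)));      // prints [a, b, c]
    }
}
```

A unit test that only inspects one value at a time would pass with either collector, which is
consistent with the note that the bugs only came up during actual testing.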

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

