nutch-dev mailing list archives

From "Dennis Kubes (JIRA)" <>
Subject [jira] Updated: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20
Date Wed, 25 Nov 2009 21:34:39 GMT


Dennis Kubes updated NUTCH-768:

    Attachment: NUTCH-768-1-20091125.patch

I thought I was going to be able to do this without code changes.  No such luck.  

This upgrade introduces a large number of deprecation warnings.  Anything that used the old
Mapper and Reducer interfaces seems to have deprecated methods in it.  The NutchBean class
needed to implement the two RPC*Bean interfaces to handle changes in Hadoop RPC (this may be
a leftover from the 1.0 changes, but I don't think so).  There are also numerous changes
to the build scripts and to the Nutch bin script to support the different Hadoop jars.
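
To illustrate where the deprecations come from, here is a minimal sketch of the two mapper
shapes.  It is not taken from the patch and it does not use the Hadoop classes: Collector,
Context, OldStyleMapper, NewStyleMapper and both token mappers are hypothetical stand-ins
defined here so the example runs on its own.  The assumption it encodes is that the old API
is an interface whose map() receives the output collector as an argument, while the newer
API is a base class whose map() receives a single context object.

```java
import java.util.ArrayList;
import java.util.List;

public class MapperApiSketch {

    /** Stand-in for an old-style output collector (hypothetical, not Hadoop's type). */
    interface Collector {
        void collect(String key, int value);
    }

    /** Old shape: an interface whose map() is handed the collector directly. */
    interface OldStyleMapper {
        void map(long offset, String line, Collector output);
    }

    /** Stand-in for a new-style context that owns the output (hypothetical type). */
    static final class Context {
        final List<String> written = new ArrayList<>();

        void write(String key, int value) {
            written.add(key + "=" + value);
        }
    }

    /** New shape: a base class whose map() is handed one context object. */
    abstract static class NewStyleMapper {
        protected abstract void map(long offset, String line, Context context);
    }

    /** Whitespace tokenizer written against the old shape. */
    static final class OldTokenMapper implements OldStyleMapper {
        @Override
        public void map(long offset, String line, Collector output) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    output.collect(token, 1);
                }
            }
        }
    }

    /** The same logic ported to the new shape; only the signature and the write call change. */
    static final class NewTokenMapper extends NewStyleMapper {
        @Override
        protected void map(long offset, String line, Context context) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    context.write(token, 1);
                }
            }
        }
    }

    /** Runs the old-shape mapper on one line and returns the emitted pairs as text. */
    public static List<String> runOld(String line) {
        List<String> out = new ArrayList<>();
        new OldTokenMapper().map(0L, line, (key, value) -> out.add(key + "=" + value));
        return out;
    }

    /** Runs the new-shape mapper on one line and returns the emitted pairs as text. */
    public static List<String> runNew(String line) {
        Context context = new Context();
        new NewTokenMapper().map(0L, line, context);
        return context.written;
    }

    public static void main(String[] args) {
        System.out.println(runOld("fetch parse index"));
        System.out.println(runNew("fetch parse index"));
    }
}
```

In this sketch the body of map() is the same in both versions, which is why the port is
mostly mechanical per class but touches every tool that implements the old interface.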

There are also many new files in the conf directory, because Hadoop has split its existing
configuration into separate files and added new configuration files for new capabilities.

With all of the changes applied, I was able to run everything in local and pseudo-distributed
mode, and to test both local and distributed searching.  Everything seems to work fine.  After
we make this upgrade I would recommend going back and updating all of the tool interfaces to
the most recent APIs.

> Upgrade Nutch 1.0 to use Hadoop 0.20
> ------------------------------------
>                 Key: NUTCH-768
>                 URL:
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.1
>         Environment: All
>            Reporter: Dennis Kubes
>            Assignee: Dennis Kubes
>             Fix For: 1.1
>         Attachments: NUTCH-768-1-20091125.patch
> Upgrade Nutch 1.0 to use the Hadoop 0.20 release.  
