nutch-dev mailing list archives

From "Michael Gottesman (JIRA)" <j...@apache.org>
Subject [jira] Created: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0
Date Tue, 10 Jun 2008 04:41:45 GMT
Patch - Nutch - Hadoop 0.17.0
-----------------------------

                 Key: NUTCH-634
                 URL: https://issues.apache.org/jira/browse/NUTCH-634
             Project: Nutch
          Issue Type: Improvement
    Affects Versions: 0.9.0
            Reporter: Michael Gottesman
             Fix For: 0.9.0


This is a patch so that Nutch can be used with Hadoop 0.17.0. The patch is located at http://pastie.org/212001

The patch compiles and passes all current Nutch unit tests.

I have tested that the crawler side of Nutch (i.e. inject, generate, fetch, parse, and merge
with the crawldb) definitely works, but I have not tested the Lucene indexing part, so it may
or may not work.

*NOTE* - the two main bugs that had to be overcome were not caught by any of the unit tests;
they only came up during actual testing. The bugs were:

1. Changes to the Hadoop Iterator
2. Addition of Serialization to MapReduce Framework
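
The message does not say what changed in the Hadoop iterator. One plausible reading, stated here as an assumption, is that the iterator passed to a reducer started handing back the same object instance on every call, so code that stores references instead of copies ends up with every stored entry showing the last value. The sketch below reproduces that pitfall in plain Java with no Hadoop dependency. The names Box, reusingIterator, collectByReference and collectByCopy are hypothetical and do not come from the patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReusedValueIteratorSketch {

    /** Hypothetical mutable value, standing in for a Hadoop Writable. */
    static final class Box {
        int value;
        Box(int value) { this.value = value; }
        Box copy() { return new Box(value); }
    }

    /** Assumed behaviour: the iterator overwrites and returns one shared Box. */
    static Iterator<Box> reusingIterator(final int[] data) {
        return new Iterator<Box>() {
            private final Box shared = new Box(0);
            private int next = 0;

            public boolean hasNext() { return next < data.length; }

            public Box next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.value = data[next++];
                return shared; // same instance every time
            }

            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    /** Buggy pattern: keeps references, so all entries alias the shared Box. */
    static List<Integer> collectByReference(Iterator<Box> it) {
        List<Box> kept = new ArrayList<Box>();
        while (it.hasNext()) kept.add(it.next());
        List<Integer> out = new ArrayList<Integer>();
        for (Box b : kept) out.add(b.value);
        return out;
    }

    /** Safe pattern: copies each value before the iterator advances. */
    static List<Integer> collectByCopy(Iterator<Box> it) {
        List<Box> kept = new ArrayList<Box>();
        while (it.hasNext()) kept.add(it.next().copy());
        List<Integer> out = new ArrayList<Integer>();
        for (Box b : kept) out.add(b.value);
        return out;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(collectByReference(reusingIterator(data))); // prints [3, 3, 3]
        System.out.println(collectByCopy(reusingIterator(data)));      // prints [1, 2, 3]
    }
}
```

A unit test that only checks counts or a single value per key would pass in both cases, which may be why this kind of bug shows up only in a real crawl.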


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

