nutch-dev mailing list archives

From "Lewis John McGibbney (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.4 stable
Date Tue, 18 Aug 2015 01:02:46 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700547#comment-14700547 ]

Lewis John McGibbney edited comment on NUTCH-2049 at 8/18/15 1:01 AM:
----------------------------------------------------------------------

Update: tested on
* Amazon EMR's Hadoop 2.4.0,
* Apache Hadoop 2.4.0 running in pseudo-distributed mode, and
* Apache Hadoop 2.4.0 running Nutch in local mode.

All tests pass, all jobs are successful and I am able to complete full crawls on Hadoop 2.4.0.
Would be great if we could get further validation of this patch.
[~asitang] please note that this patch CANNOT be run with your parsefilter-naivebayes activated;
take a look at the patch to see that it has been deactivated. As I stated above, _hopefully_
this is addressed in NUTCH-1486. If not, then we need to look at making it work seamlessly.


was (Author: lewismc):
Update: tested on
* Amazon EMR's Hadoop 2.4.0,
* Apache Hadoop 2.4.0 running in pseudo-distributed mode, and
* Apache Hadoop 2.4.0 running Nutch in local mode.
All tests pass, all jobs are successful and I am able to complete full crawls on Hadoop 2.4.0.
Would be great if we could get further validation of this patch.
[~asitang] please note that this patch CANNOT be run with your parsefilter-naivebayes activated;
take a look at the patch to see that it has been deactivated. As I stated above, _hopefully_
this is addressed in NUTCH-1486. If not, then we need to look at making it work seamlessly.

> Upgrade Trunk to Hadoop > 2.4 stable
> ------------------------------------
>
>                 Key: NUTCH-2049
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2049
>             Project: Nutch
>          Issue Type: Improvement
>          Components: build
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.11
>
>         Attachments: NUTCH-2049.patch, NUTCH-2049v2.patch
>
>
> Convo here - http://www.mail-archive.com/dev%40nutch.apache.org/msg18225.html
> I am +1 for taking trunk (or a branch of trunk) to an explicit dependency on > Hadoop 2.6.
> We can run our tests, we can validate, we can fix.
> I will be doing validation on 2.X in parallel as this is what I use on my own projects.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
