nutch-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.4 stable
Date Tue, 18 Aug 2015 16:10:46 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701497#comment-14701497 ]

Chris A. Mattmann commented on NUTCH-2049:
------------------------------------------

Asitang, if you recall, we discussed simply figuring out the Hadoop cluster's server name;
there is nothing stopping us from running a Hadoop job inside a Hadoop job. I would suggest
you go down that path: detect the Hadoop TaskTracker host (via Context or other properties)
and pass it down to Mahout.
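
A minimal sketch of the host lookup suggested above, under stated assumptions. The class
ClusterHostSketch and the method resolveHost are hypothetical names, not Nutch or Mahout
code, and the plain Map stands in for the key/value view of a Hadoop Configuration (as a
task would obtain from its Context). The property names yarn.resourcemanager.hostname and
fs.defaultFS are standard Hadoop 2 keys; whether they are the right source for this use
case is an assumption to verify on the target cluster.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;
import java.util.Map;

public class ClusterHostSketch {

    // Hypothetical helper: derive a cluster host name from job properties.
    // "props" is a stand-in for the entries of a Hadoop Configuration.
    static String resolveHost(Map<String, String> props) {
        // Prefer the ResourceManager host, ignoring the unset/wildcard default.
        String rm = props.get("yarn.resourcemanager.hostname");
        if (rm != null && !rm.isEmpty() && !rm.equals("0.0.0.0")) {
            return rm;
        }
        // Otherwise take the host part of the default filesystem URI,
        // for example hdfs://namenode.example.org:8020.
        String fs = props.get("fs.defaultFS");
        if (fs != null) {
            String host = URI.create(fs).getHost();
            if (host != null) {
                return host;
            }
        }
        // Last resort: the name of the machine this code is running on.
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "localhost";
        }
    }
}
```

The returned name could then be set on whatever configuration is handed to the nested
job; how Mahout should receive it is left open here.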

I also think a good improvement would be to separate out the training tool.

Can you please work on both?

> Upgrade Trunk to Hadoop > 2.4 stable
> ------------------------------------
>
>                 Key: NUTCH-2049
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2049
>             Project: Nutch
>          Issue Type: Improvement
>          Components: build
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>              Labels: memex
>             Fix For: 1.11
>
>         Attachments: NUTCH-2049.patch, NUTCH-2049v2.patch
>
>
> Convo here - http://www.mail-archive.com/dev%40nutch.apache.org/msg18225.html
> I am +1 for taking trunk (or a branch of trunk) to explicit dependency on > Hadoop 2.6.
> We can run our tests, we can validate, we can fix.
> I will be doing validation on 2.X in parallel as this is what I use on my own projects.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
