nutch-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <>
Subject [jira] [Commented] (NUTCH-2049) Upgrade Trunk to Hadoop > 2.4 stable
Date Tue, 18 Aug 2015 16:10:46 GMT


Chris A. Mattmann commented on NUTCH-2049:

Asitang, if you recall, we discussed simply figuring out the Hadoop cluster's server name;
there is nothing stopping us from running a Hadoop job inside a Hadoop job. I would suggest you
try going down that path: detect the Hadoop TaskTracker host (via Context or other properties)
and pass that down to Mahout.

Also, I think a good improvement would be to separate out the training tool as well.

Can you please work on both?

> Upgrade Trunk to Hadoop > 2.4 stable
> ------------------------------------
>                 Key: NUTCH-2049
>                 URL:
>             Project: Nutch
>          Issue Type: Improvement
>          Components: build
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>              Labels: memex
>             Fix For: 1.11
>         Attachments: NUTCH-2049.patch, NUTCH-2049v2.patch
> Convo here -
> I am +1 for taking trunk (or a branch of trunk) to an explicit dependency on Hadoop > 2.4.
> We can run our tests, we can validate, we can fix.
> I will be doing validation on 2.X in parallel as this is what I use on my own projects.

This message was sent by Atlassian JIRA
