nutch-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-2327) Seeds injected in REST workflow must be ingested into HDFS
Date Thu, 20 Oct 2016 22:36:58 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593243#comment-15593243 ]

ASF GitHub Bot commented on NUTCH-2327:
---------------------------------------

GitHub user sujen1412 opened a pull request:

    https://github.com/apache/nutch/pull/155

    Fix for NUTCH-2327: Seeds injected in REST must be ingested into HDFS

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sujen1412/nutch NUTCH-2327

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/155.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #155
    
----
commit b66e2cae2f227ee9f2775f87ddf20fc4d9479aa4
Author: Sujen Shah <sujen1412@gmail.com>
Date:   2016-10-19T04:36:27Z

    Fix for NUTCH-2327: Seeds injected in REST must be ingested into HDFS

----


> Seeds injected in REST workflow must be ingested into HDFS
> ----------------------------------------------------------
>
>                 Key: NUTCH-2327
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2327
>             Project: Nutch
>          Issue Type: Improvement
>          Components: injector, REST_api
>    Affects Versions: 1.12
>            Reporter: Lewis John McGibbney
>            Assignee: Sujen Shah
>             Fix For: 1.13
>
>
> Right now, when one uses the REST POST /seed/create API, a directory is created within
> /var/some/path/here, which is fine if you are working locally with the Nutch server, e.g.
> on one machine. It is, however, not suitable for using the REST API in distributed deployments
> where seeds need to be present within HDFS. More documentation on this topic is available
> at
> https://wiki.apache.org/nutch/Nutch_1.X_RESTAPI#Seed_List_creation
> There are also various mailing list threads regarding use of the REST API, and the injector
> URL issue described above needs to be addressed.
> [~sujenshah] CC for context.
> http://www.mail-archive.com/user%40nutch.apache.org/msg14922.html
> http://www.mail-archive.com/user%40nutch.apache.org/msg14921.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
