nutch-dev mailing list archives

From "Dennis Kubes (JIRA)" <j...@apache.org>
Subject [jira] Updated: (NUTCH-667) Input Format for working with Content in Hadoop Streaming
Date Wed, 26 Nov 2008 16:04:44 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Kubes updated NUTCH-667:
-------------------------------

    Attachment: NUTCH-667-1-20081126.patch

Input format for working with Hadoop streaming.

> Input Format for working with Content in Hadoop Streaming
> ---------------------------------------------------------
>
>                 Key: NUTCH-667
>                 URL: https://issues.apache.org/jira/browse/NUTCH-667
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>         Environment: All
>            Reporter: Dennis Kubes
>            Assignee: Dennis Kubes
>            Priority: Minor
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-667-1-20081126.patch
>
>
> This is a ContentAsText input format that replaces line endings with spaces, which allows
> Nutch content to be used more effectively inside Hadoop streaming jobs. Streaming allows
> MapReduce jobs to be written in any language that can read from stdin and write to stdout.
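
As an illustration of why the line endings matter: a streaming mapper sees one record per line on stdin, so page content that spans several lines would be split into separate records. The sketch below is a minimal Python mapper that counts words per page. It assumes each record arrives as `url<TAB>content` on a single line (Hadoop streaming's default text layout); the exact fields produced by the patch are not described in this message, and the names `map_record` and `run` are hypothetical.

```python
import sys


def map_record(line):
    """Turn one streaming record into (url, word_count).

    Assumption: the record is `url<TAB>content` on one line, with the
    content's original line endings already replaced by spaces.
    """
    key, _, value = line.rstrip("\n").partition("\t")
    return key, len(value.split())


def run(stdin=sys.stdin, stdout=sys.stdout):
    """Read records from stdin and write `url<TAB>count` lines to stdout."""
    for line in stdin:
        if not line.strip():
            continue  # skip blank lines rather than emit empty keys
        url, count = map_record(line)
        stdout.write(f"{url}\t{count}\n")
```

To use this as a mapper, the script would end with a call to `run()` and be passed to the streaming jar with `-mapper`, with the input format class from the patch supplied through streaming's `-inputformat` option. The fully qualified class name is not given in this message.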

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

