nutch-dev mailing list archives

From "Daniel Kugel (JIRA)" <>
Subject [jira] [Commented] (NUTCH-1622) Create Outlinks with metadata
Date Sat, 10 May 2014 22:04:47 GMT


Daniel Kugel commented on NUTCH-1622:

I might have done something wrong, but from reading the Nutch 2.x code I was under the impression
that the only way to pass data between map/reduce jobs (the outlink data) is to append something
that can be stored in an HBase table and/or mapped by Gora, and for that the ByteBuffer was
There's a chance I misunderstood a key concept. That patch was a quick hack.
You seem to separate two concerns that I don't fully understand. I would be happy if you could
We can continue this talk on the mailing list if this is not the platform for this sort of

> Create Outlinks with metadata
> -----------------------------
>                 Key: NUTCH-1622
>                 URL:
>             Project: Nutch
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.7, 2.2.1
>            Reporter: Julien Nioche
>            Assignee: Julien Nioche
>             Fix For: 1.8, 2.4
>         Attachments: NUTCH-1622-2.x.patch, NUTCH-1622.patch
> Being able to specify metadata when creating an outlink is extremely useful,
as it allows information to be passed from a source page to the pages it links to. We use that
routinely within our custom parsers in combination with the url-meta plugin.
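
For illustration only (not taken from either attached patch): a minimal plain-Java sketch of the idea discussed in this thread, packing an outlink's key/value metadata into a ByteBuffer so it can be stored and carried between jobs, then unpacking it on the other side. The class OutlinkMetaSketch, its methods, and the length-prefixed layout are hypothetical and are not part of the Nutch or Gora API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a Nutch class: round-trips outlink metadata through a ByteBuffer.
public class OutlinkMetaSketch {

    // Layout (an assumption for this sketch): entry count, then for each entry
    // a length-prefixed UTF-8 key followed by a length-prefixed UTF-8 value.
    public static ByteBuffer encode(Map<String, String> meta) {
        byte[][] parts = new byte[meta.size() * 2][];
        int size = Integer.BYTES; // room for the entry count
        int i = 0;
        for (Map.Entry<String, String> e : meta.entrySet()) {
            parts[i] = e.getKey().getBytes(StandardCharsets.UTF_8);
            parts[i + 1] = e.getValue().getBytes(StandardCharsets.UTF_8);
            size += 2 * Integer.BYTES + parts[i].length + parts[i + 1].length;
            i += 2;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(meta.size());
        for (byte[] p : parts) {
            buf.putInt(p.length);
            buf.put(p);
        }
        buf.flip(); // switch from writing to reading
        return buf;
    }

    public static Map<String, String> decode(ByteBuffer buf) {
        ByteBuffer in = buf.duplicate(); // leave the caller's position untouched
        int n = in.getInt();
        Map<String, String> meta = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            String key = readString(in);
            String value = readString(in);
            meta.put(key, value);
        }
        return meta;
    }

    private static String readString(ByteBuffer in) {
        byte[] bytes = new byte[in.getInt()];
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("source.section", "news"); // example key/value, made up for the sketch
        ByteBuffer packed = encode(meta);
        System.out.println(decode(packed).equals(meta));
    }
}
```

A length-prefixed layout is used here only because it keeps the example self-contained; a real implementation would more likely rely on whatever serialization the storage layer already provides.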

