nutch-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (NUTCH-2441) ARG_SEGMENT usage
Date Wed, 29 Nov 2017 14:46:01 GMT


ASF GitHub Bot commented on NUTCH-2441:

okedoki opened a new pull request #250: fix for NUTCH-2441 ARG_SEGMENT fix for REST API
   In the REST API, the segment parameter is not used consistently: some endpoints treat it
as a single path, others as an array of paths. The fix unifies the usage across all
endpoints.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

> -----------------
>                 Key: NUTCH-2441
>                 URL:
>             Project: Nutch
>          Issue Type: Improvement
>          Components: metadata
>    Affects Versions: 1.13
>            Reporter: Semyon Semyonov
>             Fix For: 1.14
>         Attachments: metadataARG_SEGMENT.patch
> In the class metadata/  the constant public static final String ARG_SEGMENT = "segment" is
not used consistently. In some cases (Fetcher and ParseSegment) it is interpreted as a single segment,
in others (CrawlDb, LinkDb, IndexingJob) as an array of segments. This mismatch leads
to inconsistent usage of the parameter.
> After a discussion with [~wastl-nagel], the proposed solution is to allow the use of
both an array and a string in all cases. That gives an opportunity to not introduce the broken
> A patch is proposed.
> The remaining question is refactoring: all five components share the same code (two
versions of the same code, to be precise). Shouldn't we extract a method and reduce the duplication?
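
The idea of accepting both a string and an array, pulled into one shared method, could be sketched roughly as follows. This is only an illustration and not the code from the attached patch or the pull request: the class name SegmentArgs and the method toSegmentPaths are hypothetical, and plain strings stand in for the path type so the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of Nutch: normalizes the value stored under
// the "segment" argument into a list of segment paths.
final class SegmentArgs {
  // Same key as quoted in the issue description.
  static final String ARG_SEGMENT = "segment";

  private SegmentArgs() {
  }

  // Accepts a single String, an array, or a Collection, and always
  // returns a list, so every caller can treat the argument the same way.
  static List<String> toSegmentPaths(Object value) {
    List<String> paths = new ArrayList<>();
    if (value == null) {
      return paths; // argument not supplied: no segments
    }
    if (value instanceof String) {
      paths.add((String) value);
    } else if (value instanceof Object[]) {
      for (Object o : (Object[]) value) {
        paths.add(String.valueOf(o));
      }
    } else if (value instanceof Collection) {
      for (Object o : (Collection<?>) value) {
        paths.add(String.valueOf(o));
      }
    } else {
      throw new IllegalArgumentException(
          "Unsupported type for '" + ARG_SEGMENT + "': " + value.getClass().getName());
    }
    return paths;
  }
}
```

With a helper like this, a component that expects one segment would take the first element of the returned list, while components that expect several would iterate over it, so both call styles keep working.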

This message was sent by Atlassian JIRA
