nutch-dev mailing list archives

From "Jorge Luis Betancourt Gonzalez (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-1934) Refactor Fetcher in trunk
Date Mon, 20 Apr 2015 23:11:59 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503904#comment-14503904 ]

Jorge Luis Betancourt Gonzalez commented on NUTCH-1934:
-------------------------------------------------------

+1 to [~chrismattmann]'s comment.

If the tests pass without any problem, I think we can commit and then do some more testing.
The basic test that covers the monolithic fetcher right now is a great starting point, and of
course we should take it for a spin :) I plan on taking some time to prepare a midsize crawl
before and after the commit, if it helps.

> Refactor Fetcher in trunk
> -------------------------
>
>                 Key: NUTCH-1934
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1934
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 1.10
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>              Labels: memex
>             Fix For: 1.11
>
>         Attachments: NUTCH-1934-trunkv2.patch, NUTCH-1934.patch
>
>
> Put simply, [Fetcher|https://github.com/apache/nutch/blob/trunk/src/java/org/apache/nutch/fetcher/Fetcher.java]
> is too big.
> This is kinda strange, as the size of this file is unique (I think) among the classes
> within Nutch. The others are reasonably well modularized and split into constituent classes
> which make sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
