nutch-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Commented: (NUTCH-558) Need tool to retrieve domain statistics
Date Tue, 03 Feb 2009 13:25:59 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669951#action_12669951 ]

Andrzej Bialecki  commented on NUTCH-558:
-----------------------------------------

Part of this tool's functionality is already available in the DomainStats utility in trunk.
The remaining parts (especially the expression language) may become moot if Nutch moves to HBase.

> Need tool to retrieve domain statistics
> ---------------------------------------
>
>                 Key: NUTCH-558
>                 URL: https://issues.apache.org/jira/browse/NUTCH-558
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 1.1
>            Reporter: Chris Schneider
>            Assignee: Chris Schneider
>         Attachments: DomainStats.patch
>
>
> Several developers have expressed interest in a tool to retrieve statistics from a crawl
> on a domain basis (e.g., how many pages were successfully fetched from www.apache.org vs.
> apache.org, where the latter total would include the former).
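
To illustrate the kind of roll-up described above, here is a minimal sketch in Python. It is not the code from DomainStats.patch or the DomainStats utility in trunk, and it does not read a Nutch crawl database; it assumes the successfully fetched URLs are already available as a plain list. The names domain_suffixes and count_fetched_by_domain, and the sample URLs, are hypothetical and exist only for this example.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_suffixes(host):
    """Yield the host itself and every parent suffix.

    For "www.apache.org" this yields "www.apache.org", "apache.org", "org".
    """
    labels = host.lower().split(".")
    for i in range(len(labels)):
        yield ".".join(labels[i:])


def count_fetched_by_domain(urls):
    """Count fetched URLs per host and per parent domain (hypothetical helper).

    Each URL is counted once for its host and once for every parent suffix,
    so the total for "apache.org" includes the pages of "www.apache.org".
    """
    counts = Counter()
    for url in urls:
        host = urlsplit(url).hostname
        if not host:
            continue  # skip URLs without a host component
        for suffix in domain_suffixes(host):
            counts[suffix] += 1
    return counts


# Made-up sample input standing in for the successfully fetched pages.
fetched = [
    "http://www.apache.org/",
    "http://www.apache.org/foundation/",
    "http://lucene.apache.org/nutch/",
    "http://apache.org/",
]
counts = count_fetched_by_domain(fetched)
print(counts["www.apache.org"], counts["apache.org"])
```

In this sample, www.apache.org accounts for two pages while apache.org accounts for all four, because the parent total absorbs every subdomain. A real tool would presumably also filter by fetch status and run as a distributed job over the crawl data, which this sketch leaves out.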

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

