nutch-dev mailing list archives

From "Wallace Xia (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-2045) index-basic incorrect assignment of next fetch time (page.getFetchTime()) as page fetch time
Date Thu, 22 Dec 2016 06:28:58 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769226#comment-15769226 ]

Wallace Xia commented on NUTCH-2045:
------------------------------------

The attached patch did not resolve this problem: page.getPrevFetchTime() returns 0, so tstamp
gets the value "1970-01-01T00:00:00Z", which is what I see when indexing to Solr.
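
To illustrate the symptom, here is a minimal standalone sketch in plain Java. It does not use the Nutch API: the class TstampSketch and the methods toIsoInstant and chooseFetchTimestamp are hypothetical names made up for this example, and the fallback shown is only one possible way to guard against a zero value, not the fix adopted for this issue.

```java
import java.time.Instant;

// Hypothetical illustration only; none of these names exist in Nutch.
class TstampSketch {

    // Formats epoch milliseconds as an ISO-8601 UTC instant.
    static String toIsoInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    // Assumed guard: use prevFetchTime only when it has been set
    // (greater than 0); otherwise fall back to a caller-supplied value.
    static long chooseFetchTimestamp(long prevFetchTime, long fallbackMillis) {
        return prevFetchTime > 0 ? prevFetchTime : fallbackMillis;
    }

    public static void main(String[] args) {
        // A prevFetchTime of 0 is the Unix epoch, which matches the
        // tstamp value reported above.
        System.out.println(toIsoInstant(0L)); // prints 1970-01-01T00:00:00Z

        // With the guard, an unset prevFetchTime yields the fallback instead.
        long fallback = System.currentTimeMillis();
        System.out.println(toIsoInstant(chooseFetchTimestamp(0L, fallback)));
    }
}
```

What the fallback should be (for example, the time of indexing) is a design decision for the plugin and is left open here.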

> index-basic incorrect assignment of next fetch time (page.getFetchTime()) as page fetch time
> --------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-2045
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2045
>             Project: Nutch
>          Issue Type: Bug
>          Components: plugin
>    Affects Versions: 2.3
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 2.3.1
>
>         Attachments: NUTCH-2045.patch
>
>
> The issue here was flagged up when using the indexer-elastic plugin, where the page fetch time
> is incorrectly assigned as the NEXT fetch time as opposed to the time at which the page was
> actually fetched (prevFetchTime).
> The ML thread for this issue can be found below
> http://www.mail-archive.com/user%40nutch.apache.org/msg13661.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
